[Geany-Devel] Escaping replacement for placeholder in build commands
elextr at gmail.com
Mon Dec 17 01:23:08 UTC 2012
>> Hi Colomban,
>> Hehe, sorry, this email arrived just after another project just did
>> something similar to "sanitise" inputs and broke my setup, making me
>> very cranky, so I didn't want to have it happen to Geany :)
> Oh, ok, but we have you to make sure we don't brake anything, you're
> Geany's regression test suite by yourself ;)
Heh, don't bet on it ;)
>> I agree that the security risk exists, but then it needs a peculiar
>> filename, the only way to fix that would be to ban any shell
>> metacharacters in filenames, but some users won't like that.
> No, the solution is to escape or quote the replacements, so that any
> shell metacharacter in them wouldn't be interpreted by the shell.
Ok, yes if done correctly.
>> As for spaces in filenames, see below.
>>>> I think the idea is to just escape the filenames being replaced into %f
>>>> et al placeholders, not the whole command en masse. IIRC, the only thing
>>>> placed into the placeholders is filenames/paths, which should be able to
>>>> be safely escaped/sanitized without messing up the whole command.
>>> That's it. Another example like Matthew's one: what I want to do is
>>> only to make sure that e.g. "%f" is never replaced by something that
>>> will be interpreted as a command. Those placeholders (%f, %d, %e and
>>> %p) represent filenames or paths and never user-typed commands, so they
>>> should never be interpreted as a command.
>>> For example, the simple command:
>>> gcc "%f" -c -o "%e.o"
>>> Given the current file named:
>>> holy "crap.c
>>> would currently expand to:
>>> gcc "holy "crap.c" -c -o "holy "crap.o"
>>> Which, huh, doesn't mean what you want at all! The commands will be
>>> understood as:
>>> argv = gcc
>>> argv = holy crap.c -c -o holy
>>> argv = crap.o
>>> (if my shell unquoting skills are correct ;))
>> Understand, I think thats right.
>>> What the user actually expected was:
>>> argv = gcc
>>> argv = holy "crap.c
>>> argv = -c
>>> argv = -o
>>> argv = holy "crap.o
>>> Obviously, it's not the same :(
>> Thats obviously whats expected, and yes its not the same. Although
>> inconvenient the above erroneous input would of course just give an
>> error, "holy crap.c -c -o holy" not found. Well unless it is found :)
> It would probably just throw a "file not found" error in this case, but
> who know what's the filename and as you pointed out yourself, who knows
> what was the actual command.
>>> So again, what I want is to make sure the placeholders (%f & co.) are
>>> properly escaped/quoted/whatever so they never get misinterpreted like
>>> above. And as I said, to actually achieve this we have to somewhat
>>> understand the quoting in the user command because we can't escape the
>>> same inside or outside shell quotes -- e.g. %f (without quotes) cannot
>>> be escaped the same way as "%f" (with the quotes).
>> or %e.* for example?
> No, it's the same as unquoted %e, since it's outside a quote.
>>> But still nothing really clever trying to understand a meaning, just
>>> knowing the basics like '' and "" are quotes, and \ is an escape so we
>>> can know how to escape. Actually the implementation I proposed simply
>>> closes any open quote before replacing a quoted placeholder, and then
>>> re-opens it.
>>> I.e. for the above example, it would replace as:
>>> gcc ""'holy "crap.c'"" -c -o ""'holy "crap.o'""
>>> That's a bit ugly but you don't have to see it, and it's perfectly safe.
>>> Also, just to make sure you see what I mean, if the user command didn't
>>> quote the %f and %e, and the filename simply contained a space:
>>> command: gcc %f -c -o %e.o
>>> filename: file name.c
>>> current replacement: gcc file name.c -c -o file name.o
>>> fixed replacement: gcc 'file name.c' -c -o 'file name'.o
>>> You see? :)
>> Yeah, thats a normal command, but continuing my example from above,
>> what do you get if I have
>> cmd %e.*
> It would give:
> cmd 'xxx'.*
>> because if lets say the filename is xxx.c, if you get
>> cmd "xxx.*"
>> then the * won't be treated as a glob since its quoted, so my command
>> won't work. It needs to be
>> cmd "xxx."*
> Nope, it needs to be
> cmd 'xxx'.*
yeah, ok the . is stripped from the %e.
> since we don't have to know what a . means, we just quote the
> placeholder replacement.
>> So is g_shell_quote smart enough for all possibilities of shell
>> metacharacters? Including the full range of shell patterns, and
>> backquote command substitution and etc. Then again what if the
>> filename has a literal * in it, how does g_shell_quote know its
>> literal or if we meant a glob?
> `g_shell_quote()` escapes (or actually, quotes) the string we gives it,
> no matter what, we just don't pass it the user command. I think we have
> a misunderstanding here, so let's be clear and since we all are
> programmers, let's write code :) So here's a incredibly naive (and not
> correct enough, but see my real one for a less naive one) implementation
> in pseudo code:
> def replace_placeholders(doc, cmd):
> f = os.path.basename(doc.filename)
> e = os.path.splitext(f)
> d = os.path.dirname(doc.filename)
> p = app.project_path
> cmd.replace('%f', g_shell_quote(f))
> cmd.replace('%e', g_shell_quote(e))
> cmd.replace('%d', g_shell_quote(d))
> cmd.replace('%p', g_shell_quote(p))
> return cmd
> So you see that only the *replacement* is escaped/quoted, nothing else.
> So nobody will touch your * literal or whatever.
Ok, I didn't get that before. Pseudo code is good :)
So is it *allways* going to be correct to just quote part of the
command? (Apart from subcommands addressed below).
> OK, to chose appropriate quoting you may argue one needs to understand
> sub-command syntax. Expanding the command:
> cmd "$(cmd2 %f)"
> cmd "$(cmd2 "'xxx.c'")"
> is incorrect -- and that's what my patch would do, because it doesn't
> understand sub-shells. We may or may not want to make sure to support
> sub-shells -- that's not that hard, but may also not be that useful, but
If we don't support any particular shell capabilities we need to be
very careful to document it, just so we have something to point to
when people complain, having not read the manual. :)
We do need to support backquote subcommands since that is needed for
GTK compilation to do pkg-config.
>>>>>> So, how to fix it?
>>>>> Don't, its not broke :)
>>> Yes it is, although it's not really annoying because commands have
>>> quotes around placeholders and filenames generally don't include "
>>> literals. But try compiling a file whose name contains a " literal and
>>> you'll see it's broken ;)
>> Heh, ok, but so is using a " literal in a filename in that case :)
> Agreed :) However, I think one shouldn't need to explicitly quote the
> placeholders in the build commands (and I think we don't quote %e in
> `make %e.o` by default); so it also holds for spaces or any other shell
Heh, neither solution is always right, but the manual one is more
flexible, and uses much less code ;) for example you still have to add
code for the subshells and subcommands, but if its right then yes its
> Devel mailing list
> Devel at lists.geany.org
More information about the Devel