[Geany-Devel] Escaping replacement for placeholder in build commands

Sun Dec 16 00:47:24 UTC 2012

Le 16/12/2012 00:00, Lex Trotman a écrit :
> [...]
> 
>>>> I don't think Geany should be interpreting what a user meant by their
>>>> command, there are too many edge cases that don't conform to the
>>>> "usual" gcc or other rules.  That space may be meant to be there,
>>>> Geany has no way of knowing.  Also what if the command has globs in
>>>> it, quoting them stops them working :(
>>
>> Calm down Lex, I'm not trying to break your whole Geany.  Breathe.
>> Inspire; Expire.  Ok ;)
>>
> 
> Hi Colomban,
> 
> Hehe, sorry, this email arrived just after another project just did
> something similar to "sanitise" inputs and broke my setup, making me
> very cranky, so I didn't want to have it happen to Geany :)

Oh, ok, but we have you to make sure we don't brake anything,  you're
Geany's regression test suite by yourself ;)

> I agree that the security risk exists, but then it needs a peculiar
> filename, the only way to fix that would be to ban any shell
> metacharacters in filenames, but some users won't like that.

No, the solution is to escape or quote the replacements, so that any
shell metacharacter in them wouldn't be interpreted by the shell.

> As for spaces in filenames, see below.
> 
>>> I think the idea is to just escape the filenames being replaced into %f
>>> et al placeholders, not the whole command en masse. IIRC, the only thing
>>> placed into the placeholders is filenames/paths, which should be able to
>>> be safely escaped/sanitized without messing up the whole command.
>>
>> That's it.  Another example like Matthew's one:  what I want to do is
>> only to make sure that e.g. "%f" is never replaced by something that
>> will be interpreted as a command.  Those placeholders (%f, %d, %e and
>> %p) represent filenames or paths and never user-typed commands, so they
>> should never be interpreted as a command.
>>
>> For example, the simple command:
>>
>>         gcc "%f" -c -o "%e.o"
>>
>> Given the current file named:
>>
>>         holy "crap.c
>>
>> would currently expand to:
>>
>>         gcc "holy "crap.c" -c -o "holy "crap.o"
>>
>> Which, huh, doesn't mean what you want at all!  The commands will be
>> understood as:
>>
>> argv[0] = gcc
>> argv[1] = holy crap.c -c -o holy
>> argv[2] = crap.o
>>
>> (if my shell unquoting skills are correct ;))
> 
> Understand, I think thats right.
> 
>>
>> What the user actually expected was:
>>
>> argv[0] = gcc
>> argv[1] = holy "crap.c
>> argv[2] = -c
>> argv[3] = -o
>> argv[4] = holy "crap.o
>>
>> Obviously, it's not the same :(
> 
> Thats obviously whats expected, and yes its not the same.  Although
> inconvenient the above erroneous input would of course just give an
> error, "holy crap.c -c -o holy" not found.  Well unless it is found :)

It would probably just throw a "file not found" error in this case, but
who know what's the filename and as you pointed out yourself, who knows
what was the actual command.

>> So again, what I want is to make sure the placeholders (%f & co.) are
>> properly escaped/quoted/whatever so they never get misinterpreted like
>> above.  And as I said, to actually achieve this we have to somewhat
>> understand the quoting in the user command because we can't escape the
>> same inside or outside shell quotes -- e.g. %f (without quotes) cannot
>> be escaped the same way as "%f" (with the quotes).
> 
> or %e.* for example?

No, it's the same as unquoted %e, since it's outside a quote.

>> But still nothing really clever trying to understand a meaning, just
>> knowing the basics like '' and "" are quotes, and \ is an escape so we
>> can know how to escape.  Actually the implementation I proposed simply
>> closes any open quote before replacing a quoted placeholder, and then
>> re-opens it.
>>
>> I.e. for the above example, it would replace as:
>>
>>         gcc ""'holy "crap.c'"" -c -o ""'holy "crap.o'""
>>
>> That's a bit ugly but you don't have to see it, and it's perfectly safe.
>>  Also, just to make sure you see what I mean, if the user command didn't
>> quote the %f and %e, and the filename simply contained a space:
>>
>> command:                gcc %f -c -o %e.o
>> filename:               file name.c
>> current replacement:    gcc file name.c -c -o file name.o
>> fixed replacement:      gcc 'file name.c' -c -o 'file name'.o
>>
>> You see? :)
> 
> Yeah, thats a normal command, but continuing my example from above,
> what do you get if I have
> 
> cmd %e.*

It would give:

	cmd 'xxx'.*

> because if lets say the filename is xxx.c, if you get
> 
> cmd "xxx.*"
> 
> then the * won't be treated as a glob since its quoted, so my command
> won't work.  It needs to be
> 
> cmd "xxx."*

Nope, it needs to be

	cmd 'xxx'.*

since we don't have to know what a . means, we just quote the
placeholder replacement.

> So is g_shell_quote smart enough for all possibilities of shell
> metacharacters?  Including the full range of shell patterns, and
> backquote command substitution and etc. Then again what if the
> filename has a literal * in it, how does g_shell_quote know its
> literal or if we meant a glob?

`g_shell_quote()` escapes (or actually, quotes) the string we gives it,
no matter what, we just don't pass it the user command.  I think we have
a misunderstanding here, so let's be clear and since we all are
programmers, let's write code :)   So here's a incredibly naive (and not
correct enough, but see my real one for a less naive one) implementation
in pseudo code:

	def replace_placeholders(doc, cmd):
		f = os.path.basename(doc.filename)
		e = os.path.splitext(f)[0]
		d = os.path.dirname(doc.filename)
		p = app.project_path

		cmd.replace('%f', g_shell_quote(f))
		cmd.replace('%e', g_shell_quote(e))
		cmd.replace('%d', g_shell_quote(d))
		cmd.replace('%p', g_shell_quote(p))

		return cmd

So you see that only the *replacement* is escaped/quoted, nothing else.
 So nobody will touch your * literal or whatever.

OK, to chose appropriate quoting you may argue one needs to understand
sub-command syntax.  Expanding the command:

	cmd "$(cmd2 %f)"

to

	cmd "$(cmd2 "'xxx.c'")"

is incorrect -- and that's what my patch would do, because it doesn't
understand sub-shells.  We may or may not want to make sure to support
sub-shells -- that's not that hard, but may also not be that useful, but
whatever.

>>>>> So, how to fix it?
>>>>
>>>> Don't, its not broke :)
>>
>> Yes it is, although it's not really annoying because commands have
>> quotes around placeholders and filenames generally don't include "
>> literals.  But try compiling a file whose name contains a " literal and
>> you'll see it's broken ;)
> 
> Heh, ok, but so is using a " literal in a filename in that case :)

Agreed :)  However, I think one shouldn't need to explicitly quote the
placeholders in the build commands (and I think we don't quote %e in
`make %e.o` by default); so it also holds for spaces or any other shell
metacharacter.

Cheers,
Colomban