Hi,
To offload the discussion from PR#441 [1] from this partly off-topic discussion and give it more visibility, I'm moving it here to the ML. There already was a thread on the subject that got mostly forgotten, see [2].
I apologize for the long and complex email, but I don't know how to present that in a more concise way without losing important information or clarity.
Note: I will use *NIX-style command lines as examples. The ones for Windows are slightly different, but the problematic is basically the same (though we don't use the shell there so it's a bit simpler).
1. So, what are we talking about?
Some of our commands can contain placeholders. The most important ones are the build commands, but also e.g. the terminal tool has '%c'. These placeholders are replaced by Geany with various things, like file names, paths, line numbers etc. Some of the commands are executed by a shell (build commands) and some aren't (terminal command).
2. What is the problem?
The replacement for these placeholders can be anything, so they might contain characters that have a meaning in a command. The most obvious example is quotes: imagine a file named `foo "bar.c`. If we just replace the placeholder with the raw value (as we currently do) in a build command, e.g. `gcc -c -o %e.o "%f"`, it gives this:
gcc -c -o foo "bar.o "foo "bar.c"
which is obviously incorrect (see [1] or [2] if it's not obvious for you ;) We need to escape (or quote) replacements in some way or another.
3. So, how can we fix this?
Several solutions have been suggested (well, actually 3.5 is new!), each with pros and cons:
3.1. Insert quoted replacement as needed where they appear
This is what I first implemented: track the quotes in the user command, close them before inserting a quoted/escaped replacement, and reopen it right after. With the above command, it would give this:
gcc -c -o 'foo "bar'.o ""'foo "bar.c'""
which is valid and as we want it (again, see [1] and [2] if it's not clear that it's valid).
This is what the patch at [3] (and [2], but the version there is outdated) does.
3.1.1. Pros
* compatible with any current user commands (no breakage of the user configuration, and fixes replacement in them); * the user doesn't have to worry about the issue.
3.1.2. Cons
* requires understanding of the shell quoting rules (including sub-shell like `` and $()), which might be non-trivial. * if it gets something wrong, the users are mostly screwed (the only thing they could do would be try to adapt the command in a way for Geany to get it right)
3.2. Insert quoted replacements everywhere blindly
Replace blindly each placeholder with an escaped/quoted version of itself. This is like 3.1 but doesn't try to manage quotes in the input command. It would give:
gcc -c -o 'foo "bar'.o "'foo "bar.c'"
(which is incorrect, see the cons below)
3.2.1. Pros
* easy to implement; * simple, the user can easily know exactly what happens.
3.2.2. Cons:
* Incompatible with (some) current user commands: as the placeholder is replaced without care, the placeholders need to appear outside other quotes. E.g. the above example would have to be altered to remove quoting around the %f placeholder not to be quoted twice: `gcc -c -o %e.o %f`.
3.3. Provide format modifier performing specific quoting
Dimitar suggested introducing format modifiers like %"f or %'f that would quote in various ways (see https://github.com/geany/geany/pull/441#issuecomment-87272057).
3.3.1. Pros
* The users can quote in various ways as they see fit; * Explicit control on how things are quoted/escaped; * Could support recursive quoting (quoting twice or even more), so Lex could use it in a `python -c` or similar called by the shell ;) e.g. `python -c "print(%""f.replace(' ', '_'))"` could end up in `python -c "print("foo\ \"bar.c".replace(' ', '_'))"` -- assuming we implement the escaping Python requires. happy? :)
3.3.2. Cons
* it doesn't fix existing user commands (needs to make use of the new format modifiers); * requires to implement various kind of quoting (which requires knowing several shell escaping rules); * complex and uncommon format modifiers (the users have to pick the right one, which might not be obvious); * the users can screw themselves up if they don't use the appropriate format (could work with some replacement but not others).
3.4. Parse command as an argument vector and replace in each argument
Instead of working on the command itself, use g_shell_parse_argv() and perform replacements in argv[1:] (each argument separately).
3.4.1. Pros
* no need for quoting or escaping the replacement (as each argument is done on its own).
3.4.2. Cons
* doesn't work with shell constructs, so it's not a solution for build commands (as an argv is an argv, which can't support sub-shell, piping and whatnot).
3.5. Use environment variables, not placeholders
Change the whole logic and use the environment to pass what currently uses placeholders. E.g. a command could be `gcc -c -o "${GEANY_FILENAME_NOEXT}.o" "$GEANY_FILENAME"`.
3.5.1. Pros
* quite easy to implement (but see cons); * no need for quoting or escaping replacements; * works the same as any environment or shell variable; * environment variables can then also be used by the called commands themselves, not only on their command line.
3.5.2. Cons
* it doesn't fix existing user commands (needs to make use of the new variables); * requires separate logic (e.g. 3.4) when the command is not run in a shell (as something has to read the replace on the command line itself); * the users can screw themselves up if they don't use the appropriate quoting around the variables (could work with some replacement but not others).
4. So, what?
I like 3.1 ("smart" replacement) and 3.5 (environment variables) mostly. 3.1 is the only solution here that doesn't require changes in user commands, so it is seamless on the user. 3.5 leaves the actual replacement to the shell (so the limitations are the ones of the shell), and also gives the information to the children, grandchildren, etc.
Note that we perhaps could use something like 3.1 simply to upgrade user commands to the format of one of the other solutions. This way an error in the implementation of 3.1 can be fixed by the user.
What do you think?
Regards, Colomban
[1] https://github.com/geany/geany/pull/441 [2] http://lists.geany.org/pipermail/devel/2012-December/007329.html [3] https://github.com/geany/geany/pull/441#issuecomment-87336184 and https://gist.github.com/b4n/4c100d6f1defd4751217
On 30.3.2015 г. 20:18, Colomban Wendling wrote:
To offload the discussion from PR#441 [1] from this partly off-topic discussion and give it more visibility, I'm moving it here to the ML. [...]
One thing I forgot to mention is that it would be good to have some kind of OS-variable (single|double) quotes for the static text. Under Unix, one would normally use ' to avoid file name and variable expansion, but under Windows, the normal quotes are double (unless using bash.exe or something).
That is, if the user specifies a build or printing command containing, say, «cx$dat.tmp», the Unix text must be single-quoted to avoid the expansion of $dat. But under Win~1, the single quotes are literal. And "cx$dat.tmp" won't work either, \ under Win~1 will be literal[1].
Maybe %"cx$dat.tmp%", since we already use % as metacharacter?
In the case of solution 3.1,
3.1. Insert quoted replacement as needed where they appear
the variable quote replacement must be done before any placeholders.
[1] Win~1 escaping rules in short: " is escaped with , any literal -es before a " must be duplicated, all other -es are literal.
-- E-gards: Jimmy
Am 30.03.2015 um 19:18 schrieb Colomban Wendling:
Hi,
What do you think?
Is this a real problem (reported by someone) or just theoretical?
Unless there is actually a real problem caused by this for someone I'd vote for not doing anything. After all, somone naming his files `foo "bar.c` should expect to shoot himself in the foot.
Seriously, sounds like a possibility for over-engineering a non-existent problem.
Best regards
Le 30/03/2015 21:59, Thomas Martitz a écrit :
[…] Is this a real problem (reported by someone) or just theoretical?
Mostly theoretical, although we got a supposedly security-related mail about that issue (ref https://bugs.gentoo.org/show_bug.cgi?id=446986)
[…] After all, somone naming his files `foo "bar.c` should expect to shoot himself in the foot.
Agreed, but some people keep playing with guns and complain when they hurt themselves :)
So, well, yes to some extent it's an imaginary problem I agree, but the code does have a problem and it stares back at me each time I pass through build.c :) But no, it's not an issue everyone keep complaining about.
Cheers, Colomban
What do you think?
I have to totally agree with Thomas, its over-engineering a problem that doesn't really exist and risks introducing more problems than it fixes.
Perhaps we should be more explicit in the manual that on *ix build commands are run in the shell and the user is responsible for either quoting the substitutions correctly, or using paths that don't need quoting.
Cheers Lex
Regards, Colomban
[1] https://github.com/geany/geany/pull/441 [2] http://lists.geany.org/pipermail/devel/2012-December/007329.html [3] https://github.com/geany/geany/pull/441#issuecomment-87336184 and https://gist.github.com/b4n/4c100d6f1defd4751217 _______________________________________________ Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel
Le 31/03/2015 02:10, Lex Trotman a écrit :
[…]
Perhaps we should be more explicit in the manual that on *ix build commands are run in the shell and the user is responsible for either quoting the substitutions correctly, […]
The user currently *cannot* do it "correctly" so it works with any possible replacement, that's actually the problem :]
On 31 March 2015 at 11:14, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 31/03/2015 02:10, Lex Trotman a écrit :
[…]
Perhaps we should be more explicit in the manual that on *ix build commands are run in the shell and the user is responsible for either quoting the substitutions correctly, […]
The user currently *cannot* do it "correctly" so it works with any possible replacement, that's actually the problem :]
See the second part of the sentence which you elided :-D
Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel
On 31.3.2015 г. 03:16, Lex Trotman wrote:
On 31 March 2015 at 11:14, Colomban Wendlinglists.ban@herbesfolles.org wrote:
Le 31/03/2015 02:10, Lex Trotman a écrit :
[…]
Perhaps we should be more explicit in the manual that on *ix build commands are run in the shell and the user is responsible for either quoting the substitutions correctly, […]
The user currently *cannot* do it "correctly" so it works with any possible replacement, that's actually the problem :]
See the second part of the sentence which you elided :-D
There is 1 case where the user can't do it correctly: Context Action.
if (sci_has_selection(doc->editor->sci)) word = sci_get_selection_contents(doc->editor->sci); ... utils_str_replace_all(&command, "%s", word); spawn_async(... command, ...)
This can be fixed by removing the %s specified and passing the text as an argv after the command (the now spawn allows that), with the slight disadvantage that the selection will always be at the end.
--
In all other cases, except if somebody uses a file or directory name with ' under Unix for build or printing, the right thing *can* be done by quoting. But:
- In all default Geany commands, the placeholders are unquoted, meaning that any file name with space(s) currently breaks. Shame on us. :)
- The quotes are different under Unix/Win~1: single vs. double.
I think there is an absolute minimum we can do about it: quote the file/directory name placeholders natively, but if, and only if, the entire command does not contain any single or double quotes. I don't see a way this can fail, but maybe elextr can prove me wrong.
Of course, that's not perfect by any means. But if it'll unquestionably improve the most realistic scenarios (file / directory names with spaces), with a very small effort, I think it's worth.
-- E-gards: Jimmy
On 1 April 2015 at 04:45, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
On 31.3.2015 г. 03:16, Lex Trotman wrote:
On 31 March 2015 at 11:14, Colomban Wendlinglists.ban@herbesfolles.org wrote:
Le 31/03/2015 02:10, Lex Trotman a écrit :
[…]
Perhaps we should be more explicit in the manual that on *ix build commands are run in the shell and the user is responsible for either quoting the substitutions correctly, […]
The user currently *cannot* do it "correctly" so it works with any possible replacement, that's actually the problem :]
See the second part of the sentence which you elided :-D
There is 1 case where the user can't do it correctly: Context Action.
if (sci_has_selection(doc->editor->sci)) word = sci_get_selection_contents(doc->editor->sci); ... utils_str_replace_all(&command, "%s", word); spawn_async(... command, ...)
This can be fixed by removing the %s specified and passing the text as an argv after the command (the now spawn allows that), with the slight disadvantage that the selection will always be at the end.
Thats a nice way of making it content independent, but at the cost of allways being at the end and being unable to append/prepend anything to the replacement text. This means that for example adding an extension to a filename extracted from the selection would be impossible.
I don't use context actions much, so I can't say how common it would be.
--
In all other cases, except if somebody uses a file or directory name with ' under Unix for build or printing, the right thing *can* be done by quoting. But:
- In all default Geany commands, the placeholders are unquoted, meaning that
any file name with space(s) currently breaks. Shame on us. :)
Well some of the default commands are double quoted, but not all IIRC (filetypes.c uses all double quotes for eg).
- The quotes are different under Unix/Win~1: single vs. double.
So in fact we should have two sets of filetype files, one for linstalls and one for winstalls with different quotes on their commands.
I think there is an absolute minimum we can do about it: quote the file/directory name placeholders natively, but if, and only if, the entire command does not contain any single or double quotes. I don't see a way this can fail, but maybe elextr can prove me wrong.
Only for the commands run under the shell, if you mean "no quote character in the command *and* all the replacement texts"[1] then it would work on Linux but that makes its behaviour depend on the command text and the replacement text so I forsee much confusion. Better to be consistent.
And if we use double quotes on Linux, then $ and ` (at least) will have special powers[2] which you may not want in a filename, but may want elsewhere. So again we don't know if we should escape the special chars or not.
Of course, that's not perfect by any means. But if it'll unquestionably improve the most realistic scenarios (file / directory names with spaces), with a very small effort, I think it's worth.
I think its still "got too many flies on it" as we say here :)
Maybe we should throw away running under the shell, who's silly idea was it anyway :)
Cheers Lex
-- E-gards: Jimmy
[1] think of %f being "fred's data" inside single quotes [2] think of the %f being "$amounts", that will confuse the shell, and will be real fun if amounts is defined :)
On 01.4.2015 г. 04:33, Lex Trotman wrote:
On 1 April 2015 at 04:45, Dimitar Zhekovdimitar.zhekov@gmail.com wrote:
On 31.3.2015 г. 03:16, Lex Trotman wrote:
- In all default Geany commands, the placeholders are unquoted, meaning that
any file name with space(s) currently breaks. Shame on us. :)
Well some of the default commands are double quoted, but not all IIRC (filetypes.c uses all double quotes for eg).
Really... with double quotes, very nice for Windows. :)
- The quotes are different under Unix/Win~1: single vs. double.
So in fact we should have two sets of filetype files, one for linstalls and one for winstalls with different quotes on their commands.
Or remove them if we do minimal native quotation.
I think there is an absolute minimum we can do about it: quote the file/directory name placeholders natively, but if, and only if, the entire command does not contain any single or double quotes. I don't see a way this can fail, but maybe elextr can prove me wrong.
Only for the commands run under the shell, if you mean "no quote character in the command *and* all the replacement texts"[1] then it would work on Linux but that makes its behaviour depend on the command text and the replacement text so I forsee much confusion. Better to be consistent.
I should have clarified: if the original command, before any placeholder expansion, does not contain any quotes. If there is at least a single quote, we assume the user knows what (s)he is doing.
And if we use double quotes on Linux, then $ and ` (at least) will have special powers[2] which you may not want in a filename, but may want elsewhere. So again we don't know if we should escape the special chars or not.
Quote it *natively*. That is "name" under Windows (can not contain "), and 'name' under Unix, with ' escaped. For file and directory name placeholders only. I don't think any child program will want to receive file names with spaces split into several arguments.
Maybe we should throw away running under the shell, who's silly idea was it anyway :)
It'll be a good thing to do IMHO, but won't help in this case. We split the non-shell commands under Unix with glib's g_shell_parse_argv(), which uses the shell syntax. So quoting with 'name' and escaping ' is always right.
[1] think of %f being "fred's data" inside single quotes
'fred'''s data' (I guess you don't mean " as part of the name, but they won't be problematic either)
[2] think of the %f being "$amounts", that will confuse the shell, and will be real fun if amounts is defined :)
'$amounts'. Our current double-quoted placeholders make it "$amounts", so the users users of Geany can have fun.
-- E-gards: Jimmy
On 2 April 2015 at 04:50, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
On 01.4.2015 г. 04:33, Lex Trotman wrote:
On 1 April 2015 at 04:45, Dimitar Zhekovdimitar.zhekov@gmail.com wrote:
On 31.3.2015 г. 03:16, Lex Trotman wrote:
- In all default Geany commands, the placeholders are unquoted, meaning
that any file name with space(s) currently breaks. Shame on us. :)
Well some of the default commands are double quoted, but not all IIRC (filetypes.c uses all double quotes for eg).
Really... with double quotes, very nice for Windows. :)
Are you sure they don't? see http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/ever...
Says they work, no better than they work on sh though :)
- The quotes are different under Unix/Win~1: single vs. double.
So in fact we should have two sets of filetype files, one for linstalls and one for winstalls with different quotes on their commands.
Or remove them if we do minimal native quotation.
Indeed, but the link above says the rules are different for win and lin, sigh.
I think there is an absolute minimum we can do about it: quote the file/directory name placeholders natively, but if, and only if, the entire command does not contain any single or double quotes. I don't see a way this can fail, but maybe elextr can prove me wrong.
Only for the commands run under the shell, if you mean "no quote character in the command *and* all the replacement texts"[1] then it would work on Linux but that makes its behaviour depend on the command text and the replacement text so I forsee much confusion. Better to be consistent.
I should have clarified: if the original command, before any placeholder expansion, does not contain any quotes. If there is at least a single quote, we assume the user knows what (s)he is doing.
Well, its a simple rule, but on win, paths ending in backslash will break because you get prog "c:\dir with spaces" and the closing quote is escaped. So its no improvement on the current situation.
Note that the quotes on the C filetype build commands may be because of the system() hack, since it passes stuff through cmd before creating the process and cmd has *another* set of quotes, see the link above.
Also if I read the win rules right, quoting %d would prevent you doing "%d\filename with spaces" or "%d%f" to get absolute paths, since it doesn't do the catenation of separately quoted parts that sh does (as I read it). On sh you would do %d'\filename with spaces' or %d\%f.
And if we use double quotes on Linux, then $ and ` (at least) will have special powers[2] which you may not want in a filename, but may want elsewhere. So again we don't know if we should escape the special chars or not.
Quote it *natively*. That is "name" under Windows (can not contain "), and 'name' under Unix, with ' escaped.
For Posix sh you simply can't put single quotes inside single quotes because backslash is not an escape inside single quotes. You have to end the single quotes just before the single quote, double quote the single quote and then single quote the rest. eg cmd 'no $ \ ` special chars here except '"'"' or here'
For file and directory name placeholders only. I don't think any child program will want to receive file names with spaces split into several arguments.
Awww, thats no fun :)
Maybe we should throw away running under the shell, who's silly idea was it anyway :)
It'll be a good thing to do IMHO, but won't help in this case. We split the non-shell commands under Unix with glib's g_shell_parse_argv(), which uses the shell syntax. So quoting with 'name' and escaping ' is always right.
Not under sh, see above. But if we threw away sh and did our own thing then maybe. Lets invent a new quoting standard for Geany command splitting https://xkcd.com/927/ :)
[1] think of %f being "fred's data" inside single quotes
'fred'''s data' (I guess you don't mean " as part of the name, but they won't be problematic either)
[2] think of the %f being "$amounts", that will confuse the shell, and will be real fun if amounts is defined :)
'$amounts'. Our current double-quoted placeholders make it "$amounts", so the users users of Geany can have fun.
As the shell substitutes whatever amounts is defined to be :) But maybe I *want* to substitute $CFLAGS into the command.
So I still think we should either do nothing, or get it RIGHT.
Adding lots of code to just swap the current issues for a different set doesn't sound sensible, but I'm not sure we have quite hit on "right" yet.
For me right means:
- that it "just works" most of the time, - there are no situations where the user is screwed totally and can't get what they want, and - existing quoting either still works or crashes loudly, it never silently does something different.
And there is still those commands that are not run in the shell to consider, how are they quoted so they split up right.
And windows.
Sigh!
Cheers Lex
-- E-gards: Jimmy _______________________________________________ Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel
On 02.4.2015 г. 03:40, Lex Trotman wrote:
On 31.3.2015 г. 03:16, Lex Trotman wrote:
- In all default Geany commands, the placeholders are unquoted, meaning
that any file name with space(s) currently breaks. Shame on us. :)
Well some of the default commands are double quoted, but not all IIRC (filetypes.c uses all double quotes for eg).
Really... with double quotes, very nice for Windows. :)
Are you sure they don't? see http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/ever...
Says they work, no better than they work on sh though :)
To find proper quoting for Windows, you'll need to look no further than spawn_append_argument() in the new spawn. :)
The only thing I'm not sure about are \n \v ... - the official VS 2013 msdn [1] says "arguments are delimited by white space, which is either a space or a tab", but the correct(?) solution in this blog includes \n and \v without any explanation. And it may slightly different under the various C RTL-s for Win32.
I was aware of this problem, and that's why, even in the oldest spawn versions, the blank characters in a Windows command line are #define-d, and may be easily changed.
I should have clarified: if the original command, before any placeholder expansion, does not contain any quotes. If there is at least a single quote, we assume the user knows what (s)he is doing.
Well, its a simple rule, but on win, paths ending in backslash will break because you get prog "c:\dir with spaces" and the closing quote is escaped. So its no improvement on the current situation.
What I did not consider is that our placeholders may end in directory separator (pretty obvious actually, \ is the root Win~1 directory). In that case, spawn_append_argument() will do the job.
Note that the quotes on the C filetype build commands may be because of the system() hack, since it passes stuff through cmd before creating the process and cmd has *another* set of quotes, see the link above.
I know, and have no intention to support the cmd rules, neither in spawn nor anywhere else.
Also if I read the win rules right, quoting %d would prevent you doing "%d\filename with spaces" or "%d%f" to get absolute paths, since it doesn't do the catenation of separately quoted parts that sh does (as I read it). On sh you would do %d'\filename with spaces' or %d\%f.
spawn execute command line: showargs "foo""b a r""qux" [...] output: argv[0] = showargs argv[1] = foob a rqux
The command line is passed 1:1.
For Posix sh you simply can't put single quotes inside single quotes because backslash is not an escape inside single quotes. You have to end the single quotes just before the single quote, double quote the single quote and then single quote the rest. eg cmd 'no $ \ ` special chars here except '"'"' or here'
I know... :) We have working natively quoting and escaping and breaking and whatever else is required for native quotation functions for Unix (g_shell_quote) and Windows (spawn_append_argument), so maybe we can stop discussing these details?
It'll be a good thing to do IMHO, but won't help in this case. We split the non-shell commands under Unix with glib's g_shell_parse_argv(), which uses the shell syntax. So quoting with 'name' and escaping ' is always right.
Not under sh, see above. But if we threw away sh and did our own thing then maybe.
Under Unix, we either spawn the command with /bin/sh, which uses the shell syntax, or break it into argv with g_shell_parse_argv, which uses the shell syntax (without subshells, but let's be realistic). There is no difference.
So, in my understanding, you want to drop the shell syntax, and that is why you want to drop /bin/sh. But what stops "our own thing" from producing command line suitable for g_shell_parse_argv, and thus /bin/sh?
'$amounts'. Our current double-quoted placeholders make it "$amounts", so the users users of Geany can have fun.
As the shell substitutes whatever amounts is defined to be :) But maybe I *want* to substitute $CFLAGS into the command.
File and directory placeholders.
Now, if you *want* a file or directory placeholder to result in $CFLAGS, and then $CFLAGS to be expanded as a shell variable, the current double-quoted Geany build commands should be perfect for you, because they do exactly that. :)
So I still think we should either do nothing, or get it RIGHT.
Adding lots of code to just swap the current issues for a different set doesn't sound sensible, but I'm not sure we have quite hit on "right" yet.
It's not "lots", the proper quoting functions already exist. Remember that we must rewrite the placeholders replacement to *at least* do all replacements at once, and maybe add %% for literal % (though it may be currently escaped as % under Unix and quoted with "%" under Windows).
For me right means:
- that it "just works" most of the time,
- there are no situations where the user is screwed totally and can't
get what they want, and
- existing quoting either still works or crashes loudly, it never
silently does something different.
A reasonable minimum.
Personally I wanted to:
- considerably improve the current situation - with something simple (and it's simple, see "lots" above) - without breaking anything, even if that means not fixing something
But my idea was no good.
First, the default Geany commands are improperly double quoted, so we can expect their user-modified derivatives to retain the double quotes, which makes the idea less useful, because of the "no auto quoting if the original command line contains at least one quote of any kind". And it probably does. Gee.
Second, if we unquote the placeholders in all default commands, they will be auto quoted, but only until the user adds a probably unrelated quote, and then they'll break, which is unacceptable.
So I need some more time to think.
And there is still those commands that are not run in the shell to consider, how are they quoted so they split up right.
Once again, they are run shell-like, see above.
And windows.
And bash under Windows (part of MSYS), if we want to be complete.
Sigh!
-''-
[1] https://msdn.microsoft.com/en-us/library/17w5ykft.aspx
-- E-gards: Jimmy
[...]
Are you sure they don't? see
http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/ever...
Says they work, no better than they work on sh though :)
To find proper quoting for Windows, you'll need to look no further than spawn_append_argument() in the new spawn. :)
Well, thats obviously the name of a quoting function, perhaps it should be spawn_append_argument_win_quoted() so I might have noticed it :)
Its fine for appending an argument, but when replacing a placeholder you can't be certain it isn't inside quotes already. Which is the original b4n point. So we don't yet have a working placeholder replacement quoter (which is I think what you say below where you are thinking more :).
But it doesn't put arguments containing tabs inside " either.
The only thing I'm not sure about are \n \v ... - the official VS 2013 msdn [1]
[1] was what I was looking for when I found the blog, thanks :)
says "arguments are delimited by white space, which is either a space or a tab", but the correct(?) solution in this blog includes \n and \v without any explanation. And it may slightly different under the various C RTL-s for Win32.
It puts arguments containing \n and \v inside " which sounds reasonable but yeah unexplained why.
I was aware of this problem, and that's why, even in the oldest spawn versions, the blank characters in a Windows command line are #define-d, and may be easily changed.
I should have clarified: if the original command, before any placeholder expansion, does not contain any quotes. If there is at least a single quote, we assume the user knows what (s)he is doing.
Well, its a simple rule, but on win, paths ending in backslash will break because you get prog "c:\dir with spaces" and the closing quote is escaped. So its no improvement on the current situation.
What I did not consider is that our placeholders may end in directory separator (pretty obvious actually, \ is the root Win~1 directory). In that case, spawn_append_argument() will do the job.
Yep.
Note that the quotes on the C filetype build commands may be because of the system() hack, since it passes stuff through cmd before creating the process and cmd has *another* set of quotes, see the link above.
I know, and have no intention to support the cmd rules, neither in spawn nor anywhere else.
Agreed, we are trying to get rid of it :)
Also if I read the win rules right, quoting %d would prevent you doing "%d\filename with spaces" or "%d%f" to get absolute paths, since it doesn't do the catenation of separately quoted parts that sh does (as I read it). On sh you would do %d'\filename with spaces' or %d\%f.
Ok, your reference at [1] says it does do catenation, so to answer myself %d"\filename with spaces" should work on win if %d is quoted, same as sh.
[...]
I know... :) We have working natively quoting and escaping and breaking and whatever else is required for native quotation functions for Unix (g_shell_quote) and Windows (spawn_append_argument), so maybe we can stop discussing these details?
As b4n originally said, and I think you agree below, except if the command we are interpolating into has quotes in it already.
[...]
So, in my understanding, you want to drop the shell syntax, and that is why you want to drop /bin/sh. But what stops "our own thing" from producing command line suitable for g_shell_parse_argv, and thus /bin/sh?
I wasn't actually serious about dropping sh, how would we annoy b4n with pipes and subshells :)
'$amounts'. Our current double-quoted placeholders make it "$amounts", so the users users of Geany can have fun.
As the shell substitutes whatever amounts is defined to be :) But maybe I *want* to substitute $CFLAGS into the command.
File and directory placeholders.
Sorry I wasn't clear, these refer to the non-build placeholders that are not specifically files or directories. Where the selection is used for example. But hopefully any quotes that are added will get removed by g_shell_parse_argv() when it splits the command so the $substitutions will work again.
But g_shell_quote() only works if the text being quoted is a complete argument, not if its being interpolated inside existing quotes, eg if a user used `/bin/sh -c "%s"` as a context action expecting the selection to be executed as a shell command. Note the documented example for context actions includes quotes around the placeholder.
Now, if you *want* a file or directory placeholder to result in $CFLAGS, and then $CFLAGS to be expanded as a shell variable, the current double-quoted Geany build commands should be perfect for you, because they do exactly that. :)
Yeah, I doubt you would want to substitute *into* a filename.
[...]
Remember that we must rewrite the placeholders replacement to *at least* do all replacements at once, and maybe add %% for literal % (though it may be currently escaped as % under Unix and quoted with "%" under Windows).
I thought that the %% was just so a command could contain the literal sequence %f, %d etc, so its just something in build_replace_placeholders() not a quoting issue?
For me right means:
- that it "just works" most of the time,
- there are no situations where the user is screwed totally and can't
get what they want, and
- existing quoting either still works or crashes loudly, it never
silently does something different.
A reasonable minimum.
Personally I wanted to:
- considerably improve the current situation
- with something simple (and it's simple, see "lots" above)
Some of the ideas proposed by b4n don't meet my definition of "simple" in that its not obvious they are right and that they don't screw somebody :)
- without breaking anything, even if that means not fixing something
But my idea was no good.
First, the default Geany commands are improperly double quoted, so we can expect their user-modified derivatives to retain the double quotes, which makes the idea less useful, because of the "no auto quoting if the original command line contains at least one quote of any kind". And it probably does. Gee.
Second, if we unquote the placeholders in all default commands, they will be auto quoted, but only until the user adds a probably unrelated quote, and then they'll break, which is unacceptable.
So I need some more time to think.
No problem, take your time.
And there is still those commands that are not run in the shell to consider, how are they quoted so they split up right.
Once again, they are run shell-like, see above.
See above (and b4n's original post) for the problem.
And windows.
And bash under Windows (part of MSYS), if we want to be complete.
Sigh, making win folks install even more weird foreign programs is probably not the best solution, OTOH it could allow build commands to always be run under sh even on win so only one solution to the quoting problem is needed.
Cheers Lex
Sigh!
-''-
[1] https://msdn.microsoft.com/en-us/library/17w5ykft.aspx
-- E-gards: Jimmy _______________________________________________ Devel mailing list Devel@lists.geany.org https://lists.geany.org/cgi-bin/mailman/listinfo/devel
On 03.4.2015 г. 03:52, Lex Trotman wrote:
To find proper quoting for Windows, you'll need to look no further than spawn_append_argument() in the new spawn. :)
Well, thats obviously the name of a quoting function, perhaps it should be spawn_append_argument_win_quoted() so I might have noticed it :)
Its fine for appending an argument, but when replacing a placeholder you can't be certain it isn't inside quotes already. [...]
Sure.
But it doesn't put arguments containing tabs inside " either.
If you mean spawn_append_argument, it checks for at least one " or g_ascii_isspace(), and if so, quotes the entire argument. There's no way it'll miss a tab.
The only thing I'm not sure about are \n \v ... - the official VS 2013 msdn says "arguments are delimited by white space, which is either a space or a tab", [...]
It puts arguments containing \n and \v inside " which sounds reasonable but yeah unexplained why.
Actually, I confused the blanks which *may* cause quoting with the ones should delimit the program name. The former set must be permissive to cope with stupid RTL-s, while the latter must be restrictive to avoid Win~1 to avoid running unexpected things.
So the actual blog error is that '\f' and especially '\r' are not included. But I'm not much better: after checking g_ascii_isspace(), it turns out that '\v' is not a spacing character for glib, despite both C and C++ standards saying it is. *facepalm*
"Everyone quotes command line arguments the wrong way" - truer words have never been said. I'll switch to isspace(), and even if it considers some non-ascii characters to be spaces, that'll do no harm. (I can even quote unconditionally - the resulting CL will be a bit ugly, but it's internal to spawn anyway.)
-- E-gards: Jimmy
Hi,
Attached is a test program which does the following:
1. Expands an array of struct Placeholder { c, rep } in one step, to avoid the current collisions where expanded text may contain %x which is then interpreted as a placeholder.
It would be a shame not to include this.
2. Replaces %% with a literal %.
This is a logical thing to have...
3. Quotes any text between %" %" natively. (Rules 1 and 2 apply identically to the text inside %" %")
*This may be easily removed from the expansion.*
Pros: - Very unlikely to break the current user commands. - bash quoting support for Windows, search and tools: bash -c 'echo "$0" "$1" "$2"' echo %"foo"'bar%" qux echo foo"'bar qux - Simple to implement and understand. Contrary to the grim predictions, it's only ~35 lines, even considering that the expansion may be merged in a single function without it.
Cons: - Uncommon quoter. All good quotes are already taken. :) - With " ' and %" you can shoot yourself in the foot. Well, one can easily do that with shell by specifying $1 or "$CFLAGS", right?
I can propose rules for verifying the existing user commands, fixing some of them and making suggestions for others, and for verifying the new commands entered by the user to some extent. But if we don't like %" I'd skip that part.
4. Defines utils_quote_natively(), which quotes text natively. It will be good to quote at least --include=filespec under Windows, where the child is responsible for the filename expansion.
-- E-gards: Jimmy