[CCing geany devel list, using the address I'm subscribed with to avoid moderation]
On Tue, Dec 15, 2009 at 10:45:25PM +0000, Ximin Luo wrote:
Damián Viano wrote:
I used a text file containing the following (created with echo 'try to replace me \t' >test_file):
try to replace me \t
[snip: me seeing only the third case stated here]
My mistake; "Use regular expressions" needs to be on.
I see, I can reproduce it now. But there's more to it,
In "try to replace me with \t", with Find = "me":
This is the result with: "Use regular expressions" ON "Use escape sequences" OFF
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a backslash followed by a tab
Replace = "\\\t" replaces "me" with two backslashes followed by a tab
"Use regular expressions" ON "Use escape sequences" ON
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a tab
Replace = "\\\t" replaces "me" with a backslash followed by a tab
The correct behaviour (iirc) is to escape "\\" into "\", so that:
"Use regular expressions" OFF "Use escape sequences" ON
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with the literal string "\t"
Replace = "\\\t" replaces "me" with a backslash followed by a tab
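To make the three cases above concrete, here is a minimal sketch in Python of an escape-sequence pass that treats "\\" as an escaped backslash. It is not Geany's code; the function name expand_escapes and the small set of supported escapes are assumptions made for illustration only.

```python
def expand_escapes(text: str) -> str:
    """Expand a few backslash escapes in a replacement string.

    Hypothetical helper, not taken from Geany: "\\\\" becomes one backslash,
    "\\t", "\\n" and "\\r" become the matching control character.
    """
    simple = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in simple:
            # Known escape: consume both characters, emit the expansion.
            out.append(simple[text[i + 1]])
            i += 2
        else:
            # Ordinary character, unknown escape or trailing backslash:
            # keep it exactly as typed.
            out.append(ch)
            i += 1
    return "".join(out)


line = "try to replace me"
for typed in (r"\t", r"\\t", r"\\\t"):
    # repr() makes tabs and backslashes visible in the output.
    print(repr(typed), "->", repr(line.replace("me", expand_escapes(typed))))
```

Because pairs are consumed left to right, "\\t" ends up as the two literal characters backslash and t, while "\\\t" ends up as a backslash followed by a real tab.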
Let's hear what Enrico, Frank and/or Nick (upstream authors) have to say on this.
Thanks for your report and follow-up Ximin.
Damián.
Damián Viano wrote:
In "try to replace me with \t", with Find = "me":
This is the result with: "Use regular expressions" ON "Use escape sequences" OFF
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a backslash followed by a tab
Replace = "\\\t" replaces "me" with two backslashes followed by a tab
"Use regular expressions" ON "Use escape sequences" ON
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a tab
Replace = "\\\t" replaces "me" with a backslash followed by a tab
- the above 2 examples are wrong; "\t" should be "\t" and "\t" should be "\\t"
Right, but what if you want to search for a regular expression, *AND* replace it with something containing the literal "\t"? The above examples cannot do this.
In any case, if escape sequences are ON, then "\\" should be escaped to "\" regardless of whether regexp is on or not. Actually it should only be an option for non-regexp; it should be implicitly ON when you use regexp, as is standard in every other implementation.
The correct behaviour (iirc) is to escape "\\" into "\", so that:
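As one point of comparison for the "other implementations" mentioned above, Python's re module expands backslash escapes in the replacement template on its own, so a literal "\t" can be produced by doubling the backslash. This only illustrates one stock regexp library; it says nothing about how Scintilla or Geany process the replacement string.

```python
import re

line = "try to replace me"

# A raw string keeps the backslashes exactly as a user would type them
# into a Replace field.
as_tab = re.sub("me", r"\t", line)        # template \t   -> tab character
as_literal = re.sub("me", r"\\t", line)   # template \\t  -> backslash + "t"
as_both = re.sub("me", r"\\\t", line)     # template \\\t -> backslash + tab

for result in (as_tab, as_literal, as_both):
    # repr() makes tabs and backslashes visible in the output.
    print(repr(result))
```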
"Use regular expressions" OFF "Use escape sequences" ON
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with the literal string "\t"
Replace = "\\\t" replaces "me" with a backslash followed by a tab
"Use regular expressions" OFF is kind of useless if I want to search for a regexp. geany should exhibit the above behaviour even if regexp is ON.
I suggest using an existing regexp library instead, e.g. libpcre, or the code from GNU grep (which does not have exponential corner-cases - see http://swtch.com/~rsc/regexp/regexp1.html).
On Wed, 16 Dec 2009 13:06:42 +0000, Ximin wrote:
Damián Viano wrote:
In "try to replace me with \t", with Find = "me":
This is the result with: "Use regular expressions" ON "Use escape sequences" OFF
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a backslash followed by a tab
Replace = "\\\t" replaces "me" with two backslashes followed by a tab
"Use regular expressions" ON "Use escape sequences" ON
Replace = "\t" replaces "me" with a tab
Replace = "\\t" replaces "me" with a tab
Replace = "\\\t" replaces "me" with a backslash followed by a tab
- the above 2 examples are wrong; "\t" should be "\t" and "\t"
should be "\\t"
Right, but what if you want to search for a regular expression, *AND* replace it with something containing the literal "\t"? The above examples cannot do this.
In any case, if escape sequences are ON, then "\\" should be escaped to "\" regardless of whether regexp is on or not. Actually it should only be an option for non-regexp; it should be implicitly ON when you use regexp, as is standard in every other implementation.
I had a short look at your examples but to be honest for me this is just not that important. I surely can find more corner cases where the one or the other regexp/special character combination won't work. Either you write a patch to make things better or you maybe just use tools which are more specialised for such tasks like grep, sed and friends.
I suggest using an existing regexp library instead, e.g. libpcre, or the code from GNU grep (which does not have exponential corner-cases - see http://swtch.com/~rsc/regexp/regexp1.html).
You also said Geany was 'trying to implement its own regexp engine'. This is wrong. Geany uses the regexp implementation of Scintilla, the used editing component. It's not as good as it could be but it does the job and it doesn't pull in new, external dependencies like libpcre. Please read the documentation to learn more about the use of regular expressions within Find and Replace dialogs in Geany. There are some limitations and differences but generally it just works.
Regards, Enrico
Enrico Tröger wrote:
I had a short look at your examples but to be honest for me this is just not that important. I surely can find more corner cases where the one or the other regexp/special character combination won't work. Either you write a patch to make things better or you maybe just use tools which are more specialised for such tasks like grep, sed and friends.
Take it or leave it, it's your software. Regexp is pretty much a standard these days, and there are certain expectations people have when an engine claims to "support regexp". It's irrelevant to say "oh well I don't think these standards matter" because people will get tripped up.
If you expect everyone to "write a patch", there's no point for bug-reporting systems. Not everyone has the time to go looking through the source code.
Geany uses the regexp implementation of Scintilla, the used editing component. It's not as good as it could be but it does the job and it doesn't pull in new, external dependencies like libpcre. Please read the documentation to learn more about the use of regular expressions within Find and Replace dialogs in Geany. There are some limitations and differences but generally it just works.
Well excuse me for not looking through the source code before I filed the bug. I will forward this onto Scintilla, then. Also (as I said before), regexp is a *standard* - people should not HAVE to look through documentation to find out the exact quirks of a particular implementation.
BTW libpcre is really not a problem for *NIX - most systems will have some important component that depends on it anyway. And for windows/mac you can just static-link it.
On Thu, 17 Dec 2009 08:44:24 +0000, Ximin wrote:
Hi,
I had a short look at your examples but to be honest for me this is just not that important. I surely can find more corner cases where the one or the other regexp/special character combination won't work. Either you write a patch to make things better or you maybe just use tools which are more specialised for such tasks like grep, sed and friends.
Take it or leave it, it's your software. Regexp is pretty much a standard these days, and there are certain expectations people have when an engine claims to "support regexp". It's irrelevant to say "oh well I don't think these standards matter" because people will get tripped up.
Ok. Then take what I said as my personal point of view and so also the little motivation I personally have to work on this.
If you expect everyone to "write a patch", there's no point for bug-reporting systems. Not everyone has the time to go looking through the source code.
This includes me as well. One of the main ideas of open source is that everybody who wants can contribute and make changes to the code as he/she wishes. I don't expect anyone to do anything but I also can't implement anything anyone wishes. My time is limited as well as yours, probably.
Geany uses the regexp implementation of Scintilla, the used editing component. It's not as good as it could be but it does the job and it doesn't pull in new, external dependencies like libpcre. Please read the documentation to learn more about the use of regular expressions within Find and Replace dialogs in Geany. There are some limitations and differences but generally it just works.
Well excuse me for not looking through the source code before I filed the bug. I will forward this onto Scintilla, then. Also (as I said before), regexp is a *standard* - people should not HAVE to look through documentation to find out the exact quirks of a particular implementation.
I agree but still, there is a difference between what would be nice to have and what reality is...:(.
BTW libpcre is really not a problem for *NIX - most systems will have some important component that depends on it anyway. And for windows/mac you can just static-link it.
It's still another dependency. The goal is to use as few external libraries as possible. Btw, we already use the regexp implementation of the C runtime environment (if available) and use the GNU regexp implementation on all other systems (included in Geany's sources). So, if this would be enough, we would not need an external dependency.
Regards, Enrico
On Mon, 28 Dec 2009 18:44:20 +0100 Enrico Tröger <enrico.troeger@uvena.de> wrote:
BTW libpcre is really not a problem for *NIX - most systems will have some important component that depends on it anyway. And for windows/mac you can just static-link it.
It's still another dependency. The goal is to use as few external libraries as possible. Btw, we already use the regexp implementation of the C runtime environment (if available) and use the GNU regexp implementation on all other systems (included in Geany's sources). So, if this would be enough, we would not need an external dependency.
I think this is a good idea but I'm no expert. If the system/gnu engine is powerful enough, and someone wants to work on the changes then it would be good to change. Then we could probably cut out the scintilla regex implementation.
There's been the TODO item for a while:
o (better search & replace regex support e.g. multiline - use SCI_GETCHARACTERPOINTER and GNU regex?)
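A rough illustration of what the TODO item is after, assuming the current limitation is that patterns are matched one line at a time: a pattern containing a newline can never match when each line is searched separately, but does match when the engine is handed the whole buffer (which is what a pointer to the full document text would allow). The sketch uses Python's re module in place of GNU regex, and the function names are made up for this example.

```python
import re

buffer = "int main(void)\n{\n    return 0;\n}\n"

# A closing parenthesis, a line break, then an opening brace.
pattern = re.compile(r"\)\n\{")


def search_per_line(pat, text):
    """Search each line on its own; return (line_number, match_text) or None."""
    for line_no, line in enumerate(text.splitlines(keepends=True)):
        m = pat.search(line)
        if m:
            return line_no, m.group(0)
    return None


def search_whole_buffer(pat, text):
    """Search the entire text at once; return the (start, end) span or None."""
    m = pat.search(text)
    return m.span() if m else None


print(search_per_line(pattern, buffer))      # no single line contains ")\n{"
print(search_whole_buffer(pattern, buffer))  # the match spans two lines
```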
Regards, Nick