Hi.
There is a problem with "Search in files" when searching for text in non-Latin charsets while the current locale charset differs from the charset of the files in the project. For example, the files are in cp1251 and the current locale is UTF-8.
Maybe we could introduce a new "charset" option in the Project options, so that files can be recoded while performing a search?
Maybe there is another solution, but right now the problem seems unsolvable and we are forced to use external tools for this task (such as Midnight Commander, for instance ;).
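To make the mismatch concrete, here is a minimal Python sketch (illustration only, not Geany code; the sample word is made up). It shows that the same word has different bytes in UTF-8 and in cp1251, so a byte-wise search for the UTF-8 form cannot match inside a cp1251 file.

```python
# Illustration only: the same word has different bytes in UTF-8 and cp1251.
word = "Привет"  # made-up sample text

utf8_bytes = word.encode("utf-8")      # 12 bytes, two per Cyrillic letter
cp1251_bytes = word.encode("cp1251")   # 6 bytes, one per Cyrillic letter

# A file saved as cp1251 contains the cp1251 bytes...
file_content = ("print('" + word + "')\n").encode("cp1251")

# ...so a byte-wise search for the UTF-8 form finds nothing,
# while a search for the cp1251 form succeeds.
found_with_utf8 = utf8_bytes in file_content
found_with_cp1251 = cp1251_bytes in file_content

print(found_with_utf8, found_with_cp1251)
```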
On Wed, 12 Nov 2008 08:39:27 +0400, "Walery Studennikov" despairr@gmail.com wrote:
Hi.
There is a problem with "Search in files" when searching for text in non-Latin charsets while the current locale charset differs from the charset of the files in the project. For example, the files are in cp1251 and the current locale is UTF-8.
Delete those files and always use UTF-8 :). (j/k)
Maybe we could introduce a new "charset" option in the Project options, so that files can be recoded while performing a search?
Well, why in the project options? If we really need such an option, it should be in the Find in Files dialog. But I'm not really convinced of this.
Just to clarify: you are talking about wrongly displayed search results in the messages window at the bottom when searching in files with a non-UTF-8 encoding?
What is your general system locale? cp1251 or UTF-8?
Regards, Enrico
2008/11/12 Enrico Tröger enrico.troeger@uvena.de:
On Wed, 12 Nov 2008 08:39:27 +0400, "Walery Studennikov" despairr@gmail.com wrote:
Hi.
There is a problem with "Search in files" when searching for text in non-Latin charsets while the current locale charset differs from the charset of the files in the project. For example, the files are in cp1251 and the current locale is UTF-8.
Delete those files and always use UTF-8 :). (j/k)
I would be happy if a full migration to UTF-8 took only a few days ;)
Maybe we could introduce a new "charset" option in the Project options, so that files can be recoded while performing a search?
Well, why in the project options? If we really need such an option, it should be in the Find in Files dialog. But I'm not really convinced of this.
Yes.
Just to clarify: you are talking about wrongly displayed search results in the messages window at the bottom when searching in files with a non-UTF-8 encoding?
No. Nothing is found at all (because we are searching in the wrong encoding).
What is your general system locale? cp1251 or UTF-8?
General system locale -- UTF-8. Project files locale -- cp1251.
On Thu, 13 Nov 2008 12:18:50 +0400, "Walery Studennikov" despairr@gmail.com wrote:
Hey,
Just to clarify: you are talking about wrongly displayed search results in the messages window at the bottom when searching in files with a non-UTF-8 encoding?
No. Nothing is found at all (because we are searching in the wrong encoding).
Oh, ok. Now that you say it, it's obvious. We always pass UTF-8 text to 'grep', but this doesn't match when the file uses any other encoding (and you are searching for non-ASCII characters).
I added some code in SVN r3221 to provide an encoding list in the Find in Files dialog. The selected encoding is used to convert the entered search text into that encoding and to display the search results. Actually, the entered search text can be in UTF-8 or in the specified encoding, though I only tested it with UTF-8 text since this is the most common case (everything you select or copy from within Geany is always UTF-8, even if the file encoding is something else).
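As a rough illustration of that idea, the following Python sketch converts the search text into a chosen encoding, searches the raw bytes, and decodes matching lines again for display. It is a pure-Python stand-in for running grep and is not the code from r3221; the function name `find_in_files`, the file name and the sample text are made up for the example.

```python
import os
import tempfile


def find_in_files(paths, search_text, encoding):
    """Search files byte-wise for search_text converted to `encoding`.

    Returns (path, line_number, line_text) tuples, with line_text decoded
    back to a Python str for display. Pure-Python stand-in for grep;
    hypothetical helper, not Geany's implementation.
    """
    needle = search_text.encode(encoding)
    results = []
    for path in paths:
        with open(path, "rb") as handle:
            for number, raw_line in enumerate(handle, start=1):
                if needle in raw_line:
                    text = raw_line.decode(encoding, errors="replace")
                    results.append((path, number, text.rstrip("\r\n")))
    return results


# Usage: a temporary cp1251 file containing a made-up sample line.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "wb") as handle:
        handle.write("first line\nПривет, мир\n".encode("cp1251"))

    hits_cp1251 = [(n, t) for _, n, t in find_in_files([sample], "Привет", "cp1251")]
    hits_utf8 = [(n, t) for _, n, t in find_in_files([sample], "Привет", "utf-8")]

print(hits_cp1251)
print(hits_utf8)
```

With the encoding set to cp1251 the second line is found and shown as readable text; with UTF-8 the same search returns nothing, which matches the behaviour reported earlier in the thread.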
Any feedback is welcome.
Regards, Enrico
Enrico Tröger wrote:
Oh, ok. Now that you say it, it's obvious. We always pass UTF-8 text to 'grep', but this doesn't match when the file uses any other encoding (and you are searching for non-ASCII characters).
I added some code in SVN r3221 to provide an encoding list in the Find in Files dialog. The selected encoding is used to convert the entered search text into that encoding and to display the search results. Actually, the entered search text can be in UTF-8 or in the specified encoding, though I only tested it with UTF-8 text since this is the most common case (everything you select or copy from within Geany is always UTF-8, even if the file encoding is something else).
Any feedback is welcome.
Regards, Enrico
Hi,
Stop me if I say anything stupid, but can't the search pattern be translated to the encoding of each file? That sounds better to me than only providing an encoding choice, because choosing an encoding won't really help if some files are in another encoding. Furthermore, sometimes users don't or won't care (and don't know) about file encodings, for example if they work with files created with another editor or another system.
I don't know whether it is hard to implement or has a big speed impact; it's just the best behaviour I can see for now.
Regards, Colomban
On Thu, 13 Nov 2008 19:36:50 +0100, Colomban Wendling ban-ubuntu@club-internet.fr wrote:
Hi,
Stop me if I say anything stupid, but can't the search pattern be translated to the encoding of each file? That sounds better to me than only providing an encoding choice, because choosing an encoding won't really help if some files are in another
Well, of course it would be better if we knew the encoding of each file, converted the search text into that encoding and then did the search. But there are a few problems with that: we run 'grep [options] search text' in the chosen directory. So we run one command for all files in this directory (and maybe its subdirectories), and therefore need one search text for all files. Additionally, searching every file in its own encoding would mean reading every file first to detect its encoding. So we would read the file, detect its encoding and then search it with grep. Bah. Alternatively, to be more efficient, it would be better to search the file directly after opening it to detect the encoding. But this would mean rewriting almost all of the current code.
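For illustration, the "read each file, detect its encoding, then search it directly" alternative could look roughly like the Python sketch below. Everything in it is an assumption made for the example: the candidate list, the helper names `guess_encoding` and `search_each_file`, and the naive "first encoding that decodes cleanly" detection, which is far weaker than what a real editor would need.

```python
import os
import tempfile

# Hypothetical candidate list; a real tool would need a better detector.
CANDIDATE_ENCODINGS = ("utf-8", "cp1251")


def guess_encoding(data, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate that decodes `data` without errors, or None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


def search_each_file(paths, search_text):
    """Read every file, guess its encoding, then search the decoded text."""
    matches = []
    for path in paths:
        with open(path, "rb") as handle:
            data = handle.read()
        encoding = guess_encoding(data)
        if encoding is None:
            continue  # skip files that no candidate can decode
        for number, line in enumerate(data.decode(encoding).splitlines(), start=1):
            if search_text in line:
                matches.append((os.path.basename(path), encoding, number))
    return matches


# Usage: two made-up files with the same text in different encodings.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, encoding in (("a_utf8.txt", "utf-8"), ("b_cp1251.txt", "cp1251")):
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write("Привет\n".encode(encoding))
        paths.append(path)
    found = search_each_file(paths, "Привет")

print(found)
```

Note that every file is read in full before it is searched, which is exactly the extra cost described above.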
And last but not least, there is still our most loved problem of correctly detecting file encodings. This has never worked reliably (e.g. try to open a cp1251-encoded file in Geany: it opens as ISO-8859-1 unless your system locale is cp1251 too).
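A small Python sketch shows why single-byte encodings are so hard to tell apart. This only demonstrates the general ambiguity; it does not describe how Geany's detection code works, and the sample word and the helper `decodes_cleanly` are made up.

```python
data = "Привет".encode("cp1251")  # made-up sample text, stored as cp1251

as_latin1 = data.decode("iso-8859-1")  # succeeds, but yields mojibake
as_cp1251 = data.decode("cp1251")      # the intended text


def decodes_cleanly(raw, encoding):
    """Return True if `raw` can be decoded with `encoding` without errors."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


clean = {name: decodes_cleanly(data, name)
         for name in ("utf-8", "iso-8859-1", "cp1251")}

print(as_latin1, as_cp1251, clean)
```

UTF-8 can be ruled out because the bytes are not valid UTF-8, but both ISO-8859-1 and cp1251 accept them, so a decoder error alone cannot tell which one the author meant.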
encoding. Furthermore, sometimes users don't or won't care (and don't know) about file encodings, for example if they work with files created with another editor or another system.
I completely agree with you on that but I don't know a better way, see above.
Regards, Enrico
Enrico Tröger wrote:
But there are a few problems with that: we run 'grep [options] search text' in the chosen directory. So we run one command for all files in this directory (and maybe its subdirectories), and therefore need one search text for all files. Additionally, searching every file in its own encoding would mean reading every file first to detect its encoding. So we would read the file, detect its encoding and then search it with grep. Bah. Alternatively, to be more efficient, it would be better to search the file directly after opening it to detect the encoding. But this would mean rewriting almost all of the current code.
Hum, yes, seen this way it seems hard to implement. Bah, 'was just an idea.
And last but not least, there is still our most loved problem of correctly detecting file encodings. This has never worked reliably (e.g. try to open a cp1251-encoded file in Geany: it opens as ISO-8859-1 unless your system locale is cp1251 too).
Hum yes, I know this is hard. And you're right, if the detection fails the problem remains. There are just too many (useless) encodings around, I think.
encoding. Furthermore, sometimes users don't or won't care (and don't know) about file encodings, for example if they work with files created with another editor or another system.
I completely agree with you on that but I don't know a better way, see above.
Hum yes. Well, if the current behaviour works for those who need it, that's cool (for me nothing changes since I only use UTF-8).
Regards, Colomban
On Sat, 15 Nov 2008 20:12:17 +0100, Colomban Wendling ban-ubuntu@club-internet.fr wrote:
Enrico Tröger wrote:
But there are a few problems with that: we run 'grep [options] search text' in the chosen directory. So we run one command for all files in this directory (and maybe its subdirectories), and therefore need one search text for all files. Additionally, searching every file in its own encoding would mean reading every file first to detect its encoding. So we would read the file, detect its encoding and then search it with grep. Bah. Alternatively, to be more efficient, it would be better to search the file directly after opening it to detect the encoding. But this would mean rewriting almost all of the current code.
Hum, yes, seen this way it seems hard to implement. Bah, 'was just an idea.
Yeah, great anyway. The whole encoding problem is nasty. Things could be so easy if everyone used UTF-8, but we are probably still years away from that.
Regards, Enrico
2008/11/13 Enrico Tröger enrico.troeger@uvena.de:
Just to clarify: you are talking about wrongly displayed search results in the messages window at the bottom when searching in files with a non-UTF-8 encoding?
No. Nothing is found at all (because we are searching in the wrong encoding).
Oh, ok. Now that you say it, it's obvious. We always pass UTF-8 text to 'grep', but this doesn't match when the file uses any other encoding (and you are searching for non-ASCII characters).
I added some code in SVN r3221 to provide an encoding list in the Find in Files dialog. The selected encoding is used to convert the entered search text into that encoding and to display the search results. Actually, the entered search text can be in UTF-8 or in the specified encoding, though I only tested it with UTF-8 text since this is the most common case (everything you select or copy from within Geany is always UTF-8, even if the file encoding is something else).
Any feedback is welcome.
Yes, it works now, thanks. Just one more feature request for the "Find in files" dialog: search directories (the "Directory" field) are currently not remembered between Geany runs, but they probably should be. When I exit Geany and run it again, the directory list is empty. Also, I think it would be reasonable to set the "Directory" field to the project directory by default. Right now I'm forced to enter the project directory in this dialog again and again. I think this dialog would be just perfect after those improvements ;)
On Fri, 14 Nov 2008 09:08:00 +0400, "Walery Studennikov" despairr@gmail.com wrote:
Yes, it works now, thanks. Just one more feature request for the "Find in files" dialog: search directories (the "Directory" field) are currently not remembered between Geany runs, but they probably should be. When I exit Geany and run it again, the directory list is empty.
Yes, this could be implemented. But I won't work on that in the near future.
Also, I think it would be reasonable to set the "Directory" field to the project directory by default.
In current SVN version, the project base dir is added to the directory list in the Find in Files dialog though not chosen by default. But at least it's there once a project is open.
Regards, Enrico
2008/11/16 Enrico Tröger enrico.troeger@uvena.de:
Yes, this could be implemented. But I won't work on that in the near future.
Also, I think it would be reasonable to set the "Directory" field to the project directory by default.
In current SVN version, the project base dir is added to the directory list in the Find in Files dialog though not chosen by default. But at least it's there once a project is open.
It doesn't work for me. The project is open, but the directory list is empty.
On Mon, 17 Nov 2008 07:21:45 +0400, "Walery Studennikov" despairr@gmail.com wrote:
2008/11/16 Enrico Tröger enrico.troeger@uvena.de:
Yes, this could be implemented. But I won't work on that in the near future.
Also, I think it would be reasonable to set the "Directory" field to the project directory by default.
In current SVN version, the project base dir is added to the directory list in the Find in Files dialog though not chosen by default. But at least it's there once a project is open.
It doesn't work for me. The project is open, but the directory list is empty.
Are you sure you are running the latest revision? You need at least r3237. I just tested it again and it's working fine for me.
Regards, Enrico
2008/11/18 Enrico Tröger enrico.troeger@uvena.de:
Also, I think it would be reasonable to set the "Directory" field to the project directory by default.
In current SVN version, the project base dir is added to the directory list in the Find in Files dialog though not chosen by default. But at least it's there once a project is open.
It doesn't work for me. The project is open, but the directory list is empty.
Are you sure you are running the latest revision? You need at least r3237. I just tested it again and it's working fine for me.
Yes, it works now. Thanx.