On Tue, 25 Oct 2011 01:12:23 +1100, Lex wrote:
Dear Encoding specialist (or Colomban by default :)
Lol, this is sweet. But please don't take my answer as proof that "Enrico is an encoding specialist".
The thread [1] has exposed a problem with the way Geany handles encodings that it gets from the extraction regexes and from the locale.
Lex, thanks for spending the time and effort to track this issue down, and for all the debugging involved. It is much appreciated, even if that is not always said explicitly. The same goes for the rest of the gang who have been taking care of the geany-users mailing list over the last weeks/months. Yeah.
But Geany does not recognise CP1252 (Geany only knows WINDOWS-1252) so places that use the encoding index default to UTF-8.
I think the only real problem is that we get an encoding from the locale which doesn't match one of our predefined strings (at the top of src/encodings.c). And this is the only point we should fix, so that the further code relying on the index of the mentioned mapping keeps working.
Some quick ideas to find a solution for the problem:
- try to determine whether this is a Windows-only problem or whether it might happen on non-Windows systems as well
- we should review the way we retrieve the locale name from the system, for Windows in particular
- try to create an additional mapping of other possible locale names which can be mapped directly to the ones known by Geany*
* there is a file charset.alias or something with a similar name used by iconv, IIRC. This file holds a mapping of alias names for encodings or charsets. I don't remember the details right now, but on Windows it would be especially easy to distribute such an additional mapping. Though I still need to find some useful documentation on that and on how to do it properly.
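To make the mapping idea a bit more concrete, here is a rough sketch in plain C. It is not taken from Geany's code: the table, the struct and both function names are made up for illustration, and only the CP1252 entry comes from this thread (the other two rows are just examples of what further aliases might look like). The point is that the name we get from the locale is translated first, so the existing lookup by predefined string keeps working unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical alias table: name reported by the locale -> name Geany knows.
 * Only the CP1252 row comes from the thread; the others are illustrative. */
typedef struct {
    const char *alias;
    const char *canonical;
} EncodingAlias;

static const EncodingAlias encoding_aliases[] = {
    { "CP1252", "WINDOWS-1252" },
    { "CP1250", "WINDOWS-1250" },
    { "CP1251", "WINDOWS-1251" },
};

/* Case-insensitive equality, intended for plain ASCII encoding names. */
static int ascii_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}

/* Return the canonical name for a known alias, or the input unchanged
 * when no alias matches (NULL stays NULL). */
static const char *canonical_encoding_name(const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < sizeof encoding_aliases / sizeof encoding_aliases[0]; i++) {
        if (ascii_equal_nocase(name, encoding_aliases[i].alias))
            return encoding_aliases[i].canonical;
    }
    return name;
}
```

With something like this, the charset string from the locale would go through canonical_encoding_name() before we search the predefined strings, so "cp1252" would end up as "WINDOWS-1252" while names we already know pass through untouched. Whether such a table should live in the source or in a distributed alias file is still open.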
Regards, Enrico