[Geany-devel] Geany problems with unknown encodings names

Lex Trotman elextr at xxxxx
Mon Oct 24 23:03:39 UTC 2011


2011/10/25 Enrico Tröger <enrico.troeger at uvena.de>:
> On Tue, 25 Oct 2011 01:12:23 +1100, Lex wrote:
>
>>Dear Encoding specialist (or Colomban by default :)
>
> Lol, this is sweet.
> But please don't count my answer as "Enrico is an encoding specialist".

Of course Enricoding :)

>
>
>>The thread [1] has exposed a problem with the way Geany handles
>>encodings that it gets from the extraction regexes and from the
>>locale.
>
[...]
> I think the only real problem is that we get an encoding from the
> locale which doesn't match one of our predefined strings (at the top of
> src/encodings.c). And this is the only point we should fix, so that the
> further code relying on the index of the mentioned mapping keeps
> working.
>

I think the problem is we have a set of pre-defined strings whose only
real use should be to make the document->set encoding menu, but which
have snuck into other areas.

> Some quick ideas to find a solution for the problem:
>
> - try to define whether this is a Windows-only problem or whether it
> might happen on non-Windows systems as well

All my locales are UTF-8 so someone else will have to check that one.

AFAICT a regex-extracted charset used to open the file defaults to
leaving the text alone if it happens to validate as UTF-8; otherwise
the same problem can happen.  So it's not just Windows locales.
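
To make that fallback concrete, here is a minimal sketch in plain C. It
is one reading of the behaviour described above, not Geany's actual code:
the names is_valid_utf8, choose_open_action and the open_action enum are
hypothetical, and the declared_is_known flag stands in for "the extracted
charset matched one of the predefined strings".

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects overlong forms, surrogates and code points above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0, k, extra;
    unsigned long cp, min;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char c = buf[i];

        if (c < 0x80) { i++; continue; }               /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;                                  /* invalid lead byte */

        if (len - i <= extra)
            return 0;                                   /* truncated sequence */
        for (k = 1; k <= extra; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;                               /* not a continuation byte */
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += extra + 1;
    }
    return 1;
}

/* Hypothetical outcomes when opening a file with a declared charset. */
enum open_action { CONVERT_FROM_DECLARED, KEEP_AS_UTF8, NEEDS_FALLBACK };

/* Assumption: a known charset is converted as usual; an unknown one is
 * harmless only when the bytes already validate as UTF-8. */
enum open_action choose_open_action(const unsigned char *buf, size_t len,
                                    int declared_is_known)
{
    if (declared_is_known)
        return CONVERT_FROM_DECLARED;
    return is_valid_utf8(buf, len) ? KEEP_AS_UTF8 : NEEDS_FALLBACK;
}
```

The NEEDS_FALLBACK case is the one that matters for this thread: an
unrecognised charset name combined with bytes that are not valid UTF-8.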

> - we should review the way we retrieve the locale name from the system,
> for Windows in particular

Regexes as well.

> - try to create an additional mapping of possible other locale names
> which can be directly mapped to the ones known by Geany*
>

That's possibly non-portable or a maintenance issue.
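
For illustration only, such a mapping could be as small as the following
plain C sketch. Everything in it is an assumption: the table entries are
examples rather than the strings in src/encodings.c, and the names
alias_table, charset_name_equal and canonical_charset are hypothetical.
Comparing names while ignoring case and punctuation keeps the table
short, but the table itself is exactly the part that has to be
maintained.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: the canonical names stand in for the predefined
 * strings in src/encodings.c; the aliases are examples only. */
struct charset_alias { const char *alias; const char *canonical; };

static const struct charset_alias alias_table[] = {
    { "UTF-8",        "UTF-8" },
    { "ISO-8859-1",   "ISO-8859-1" },
    { "LATIN1",       "ISO-8859-1" },
    { "WINDOWS-1252", "WINDOWS-1252" },
    { "CP1252",       "WINDOWS-1252" },
};

/* Compare two names ignoring case and any non-alphanumeric characters,
 * so "utf8", "UTF_8" and "utf-8" all compare equal. */
static int charset_name_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        while (*a && !isalnum((unsigned char)*a)) a++;
        while (*b && !isalnum((unsigned char)*b)) b++;
        if (*a == '\0' || *b == '\0')
            return *a == *b;            /* equal only if both ended */
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
}

/* Return the canonical name for a locale or regex supplied charset name,
 * or NULL when the name is not in the table. */
const char *canonical_charset(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
        if (charset_name_equal(name, alias_table[i].alias))
            return alias_table[i].canonical;
    return NULL;
}
```

With this table, canonical_charset("cp-1252") yields "WINDOWS-1252",
while a name that is not listed, such as "KOI8-R", yields NULL and still
needs some other handling.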

>
> * there is a file charset.alias or something with a similar name used
> by iconv, IIRC. And this file holds a mapping of alias names for
> encodings or charsets. I don't remember the details right now but on
> Windows it would be especially easy to distribute such an additional
> mapping. Though I still need to find some useful documentation on that
> and how to do it properly.

IIUC this is a GNU iconv artifact, but g_convert uses the system iconv
if it exists, so the question is how to do it portably.

Having a UTF-8-only system means I can't develop any fixes myself, so
someone else is going to have to do that, sorry.

Cheers
Lex


