[geany/geany] UTF-8 issues (Issue #3792) - Github-comments

19 Mar 2024


Hi,
I'm being redirected here from GLib bug tracker:
I'm using the text editor 'Geany' version 2.0 under Windows 11. According to the "About" dialog, Geany version 2.0 is based on GLib 2.78.0.
I'm trying to convert a large UTF-8 encoded text file into ISO-8859-1 because I'm not skilled enough to properly handle UTF-8 strings in a program I'm making and I currently do not need to support anything but Spanish. However, when I attempt to change the file encoding, it throws an error during save:
`GLib.GException: Hay una secuencia de bytes no válida en la entrada de conversión.`
The supposedly bad character is:
```
'ἀ'
'Greek Small Letter Alpha with Psili'
U+1F00
UTF-8 bytes: 0xE1 0xBC 0x80
```
The character itself is correct. The file I'm processing holds linguistic information and contains many unusual characters, such as Greek letters. I've checked the file by hand in a hex editor and the bytes are properly encoded. I've therefore concluded that this must be a bug in the GLib library or, at least, in the way Geany handles GLib errors.
My hypothesis is that this particular character is outside the range of characters ISO-8859-1 supports. Not finding a 1:1 equivalent, the conversion raises an error to tell the program that information would be lost. What I don't understand is why it says "There is an invalid byte sequence in the conversion input" when the sequence is actually valid. If I'm right, it should use a different message.
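To check the hypothesis outside Geany, here is a minimal Python sketch. It uses Python's own codecs, not GLib, so it only illustrates the distinction between "invalid input bytes" and "valid character with no equivalent in the target encoding". The helper name `find_unencodable` and the sample string are made up for this illustration.

```python
def find_unencodable(text, encoding="iso-8859-1"):
    """Return (index, char, code point) for each character the target encoding lacks."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char, f"U+{ord(char):04X}"))
    return problems


# The three bytes reported in the error decode cleanly, so the input is valid UTF-8.
raw = bytes([0xE1, 0xBC, 0x80])
alpha = raw.decode("utf-8")

# Hypothetical sample: Spanish text (fits in ISO-8859-1) followed by the Greek letter.
sample = "señal " + alpha
print(find_unencodable(sample))  # [(6, 'ἀ', 'U+1F00')]
```

Running the same helper over the whole file (read with `encoding="utf-8"`) would list every character that has to be replaced or removed before the ISO-8859-1 conversion can succeed.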
https://github.com/geany/geany/issues/3792