Thank you for describing the workaround; I'll try it on real files.

> Scintilla editing widget claims to handle illegal bytes and show them as lozenge shapes with the hex in them

That is exactly what is needed here, and it sometimes already happens with Geany's encoding detection. I just think Geany could offer a manual way to say "display what you can in this encoding, otherwise show the illegal-byte symbol".
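
To illustrate the behavior I mean, here is a rough Python sketch. It is not Geany or Scintilla code; `decode_lenient` is a name I made up, and it only relies on Python's built-in `backslashreplace` error handler, which renders each undecodable byte as `\xNN`, roughly like the hex lozenges.

```python
# Hypothetical helper (not Geany/Scintilla code): decode with a user-chosen
# encoding and visibly mark every byte that cannot be decoded.
def decode_lenient(raw: bytes, encoding: str) -> str:
    # "backslashreplace" turns each undecodable byte into the text \xNN
    # instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors="backslashreplace")


data = b"caf\xe9 ok"  # a lone 0xE9 byte is not valid UTF-8
print(decode_lenient(data, "utf-8"))  # prints: caf\xe9 ok
```

The rest of the text stays readable, and only the offending byte is flagged.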

> any code point 128 or greater is encoded as a sequence of more than one byte with a value >= 128

I did not know that more than one byte was required in every case above 127; thank you for outlining it. Still, a UTF-8 buffer should be able to store values 128-255 with some hackery, which might not be needed at all given the Scintilla-level solution above.
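
For anyone else reading along, a small Python check of the multi-byte rule, plus one existing form of the "hackery" I have in mind: Python's `surrogateescape` error handler (PEP 383). This is only an illustration of the idea; I am not claiming Geany's buffer works this way or that the same trick fits it.

```python
# Every code point >= 128 encodes to 2-4 bytes, and each byte is >= 0x80.
for ch in ("\u00e9", "\u20ac", "\U0001F600"):
    encoded = ch.encode("utf-8")
    assert len(encoded) > 1 and all(b >= 0x80 for b in encoded)
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")

# The "hackery": surrogateescape maps each undecodable byte 0x80-0xFF to a
# lone surrogate U+DC80-U+DCFF, so the original bytes survive a round trip.
raw = b"caf\xe9"
text = raw.decode("utf-8", errors="surrogateescape")
assert text[-1] == "\udce9"
assert text.encode("utf-8", errors="surrogateescape") == raw
```

The catch is that the intermediate string contains lone surrogates, so it is not strictly valid Unicode text and has to be encoded back with the same handler.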

> encoding "detection" is "search for an encoding that will convert the file to UTF-8"

Maybe that could be better emphasized in the program somehow. Also, "Without encoding" is still a misleading name. Wouldn't "Auto-search" or "First found", while still generic, be more appropriate?
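
As I understand the description, the "detection" amounts to something like the following Python sketch. The candidate list, its order, and the name `first_decodable` are my own assumptions for illustration; Geany's actual list and logic may well differ.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical candidate list; Geany's real list and order may differ.
CANDIDATES = ("utf-8", "cp1252", "iso-8859-1")


def first_decodable(
    raw: bytes, candidates: Sequence[str] = CANDIDATES
) -> Optional[Tuple[str, str]]:
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return name, raw.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding rejects the bytes; try the next one
    return None


print(first_decodable(b"caf\xe9"))  # UTF-8 fails, so cp1252 is picked
```

The result is the first encoding that happens to convert without errors, not necessarily the one the file was written in, which is why a name like "First found" seems more honest to me.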

