[Github-comments] [geany/geany] fails to open Microsoft UTF-16LE file (MSO Word CUSTOM.DIC dictionary file) (#1238)

elextr notifications at xxxxx
Tue Sep 20 00:57:53 UTC 2016


> I'm not sure why we accept invalid UTF-8 (well, it's structurally valid, but contains reserved code points),

From pickyweedia: "Not decoding surrogate halves makes it impossible to store invalid UTF-16, such as Windows filenames, as UTF-8. Therefore, detecting these as errors is often not implemented and there are attempts to define this behavior formally (see WTF-8 and CESU below)."

And GLib needs to round-trip Windows filenames to/from UTF-8, so it's reasonable that it doesn't object.
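To make the round-trip point concrete, here is a minimal sketch using Python's codecs (not GLib or Geany code; the function names are made up for illustration). It assumes the "invalid UTF-16" in question is a lone high surrogate, and uses Python's `surrogatepass` error handler as a stand-in for a forgiving UTF-8 implementation:

```python
# Illustration only: this shows Python's codec behaviour, not GLib's.

# U+D800 with no following low surrogate: structurally invalid UTF-16.
LONE_SURROGATE_UTF16LE = b"\x00\xd8"


def utf16_to_lenient_utf8(raw: bytes) -> bytes:
    """Convert UTF-16LE to UTF-8, letting unpaired surrogates through."""
    text = raw.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogatepass")


def lenient_utf8_to_utf16(raw: bytes) -> bytes:
    """Inverse of utf16_to_lenient_utf8, so the original bytes come back."""
    text = raw.decode("utf-8", errors="surrogatepass")
    return text.encode("utf-16-le", errors="surrogatepass")


def is_strict_utf8(raw: bytes) -> bool:
    """True only if a strict decoder accepts the bytes as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    as_utf8 = utf16_to_lenient_utf8(LONE_SURROGATE_UTF16LE)
    print(as_utf8)                         # structurally valid, reserved code point
    print(is_strict_utf8(as_utf8))         # a strict decoder rejects it
    print(lenient_utf8_to_utf16(as_utf8))  # the forgiving path round-trips
```

A strict decoder can't carry such a filename through UTF-8 at all, which is the trade-off the quote describes.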

>  Or we don't use the same thing for UTF-8 (GLib) than UTF-16 (iconv through GLib), and GLib is more forgiving.

Well, for files that Geany thinks are UTF-8 (or is told by the user are UTF-8) we don't do a conversion, just validate, so it's different in that way.
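Roughly, the two paths look like the sketch below. This is Python rather than Geany's actual C code, `to_utf8_buffer` is a hypothetical name, and the strict check here stands in for whatever validator is really used (which, per the above, may be more forgiving):

```python
# Hypothetical sketch of "validate only" vs "convert"; not Geany's code.
import codecs


def to_utf8_buffer(raw: bytes, encoding: str) -> bytes:
    """Return UTF-8 bytes for a file's contents given its believed encoding."""
    if codecs.lookup(encoding).name == "utf-8":
        # UTF-8 path: no conversion. Check the bytes and hand them back
        # untouched, so anything the validator tolerates survives as-is.
        raw.decode("utf-8")  # raises UnicodeDecodeError if invalid
        return raw
    # Any other encoding: a real conversion, so the converter's own
    # strictness rules decide what is accepted.
    return raw.decode(encoding).encode("utf-8")


if __name__ == "__main__":
    print(to_utf8_buffer("caf\u00e9".encode("utf-8"), "UTF-8"))
    print(to_utf8_buffer("caf\u00e9".encode("utf-16-le"), "UTF-16LE"))
```

So the same problem data can be accepted on one path and rejected on the other simply because different code is judging it.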

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/geany/geany/issues/1238#issuecomment-248172325