[Geany] Geany UTF-16/32 bug and a possible "fix"

Enrico Tröger enrico.troeger at xxxxx
Thu May 24 10:31:29 UTC 2007


On Wed, 23 May 2007 21:17:03 +0300, Harri Koskinen
<geany_fi at fastmonkey.org> wrote:

Hi,

> I noticed that if I disable the NULL-check from the document.c file
> Geany then loads UTF16 and UTF32 encoded files correctly.
> 
> A small 'patch' is attached for quick & dirty testing :-)
Thanks.
With your patch, the test
if (filedata->len != (gsize) st.st_size)
will never trigger, because filedata->len is always exactly st.st_size.
This works for UTF-32 encoded files but it prevents completely opening
files which just contain one or more NULL bytes. At the moment, UTF-32
files can't be opened at all, which (I know) isn't any better. Two weeks
ago, I spent about two or three days trying to find a better algorithm,
but without an acceptable result. The real problem is in the code that
detects the character encoding: basically, we could open files
containing NULL bytes without problems, but then the encoding detection
fails.
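
For anyone who wants to play with this, the idea behind that length
comparison can be sketched roughly as follows. This is only an
illustration, not the code from document.c: has_embedded_nul() is a
hypothetical helper, and it assumes the buffer is followed by one
terminating NUL byte, as buffers returned by g_file_get_contents() are.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Geany's actual code.
 * Assumption: buf holds `size` bytes of file data followed by one
 * terminating NUL byte. strlen() stops at the first NUL, so its result
 * differs from `size` exactly when the data contains an embedded NUL. */
bool has_embedded_nul(const char *buf, size_t size)
{
    return strlen(buf) != size;
}
```

A plain ASCII buffer passes this check, while the UTF-16LE bytes for
"Hi" (48 00 69 00) are flagged, which is why such files get rejected
before the encoding detection even runs.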

I won't apply the patch because it only helps with opening UTF-32;
UTF-16 still fails. But I just committed a fix which at least enables
opening UTF-16 and UTF-32 encoded files with a valid BOM (Byte Order
Mark).
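
As a rough sketch of what a BOM check can look like (not the committed
code; encoding_from_bom() and the returned encoding names are made up
for this example), the first bytes of the file are compared against the
known BOM sequences:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not Geany's actual code: returns the encoding
 * name implied by a leading BOM, or NULL if no BOM is present. */
const char *encoding_from_bom(const unsigned char *buf, size_t len)
{
    /* UTF-32 must be tested before UTF-16, because the UTF-32LE BOM
     * (FF FE 00 00) begins with the UTF-16LE BOM (FF FE). */
    if (len >= 4 && memcmp(buf, "\xFF\xFE\x00\x00", 4) == 0)
        return "UTF-32LE";
    if (len >= 4 && memcmp(buf, "\x00\x00\xFE\xFF", 4) == 0)
        return "UTF-32BE";
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0)
        return "UTF-8";
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";
    return NULL; /* no BOM: the encoding has to be guessed another way */
}
```

One caveat of this ordering: a UTF-16LE file whose first character after
the BOM is U+0000 would be reported as UTF-32LE. That should be rare in
real text files, but it shows that even BOM detection is a heuristic.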

We still need a better way to differentiate between files which just
contain NULL bytes and files which are properly encoded in UTF-16/32
and therefore contain NULL bytes. Any pointers are welcome.

If anyone is interested in testing or improving the code, I have
attached a tarball with some test files in different encodings (don't
be surprised by the contents of these files, they are just test
files ;-)).

Regards,
Enrico

-- 
Get my GPG key from http://www.uvena.de/pub.key
-------------- next part --------------
A non-text attachment was scrubbed...
Name: utf_tests.tar.gz
Type: application/octet-stream
Size: 1833 bytes
Desc: not available
URL: <http://lists.geany.org/pipermail/users/attachments/20070524/df0630dd/attachment.obj>

