[Geany-devel] [CODE REVIEW] Changes to encodings
Matthew Brush
mbrush at xxxxx
Mon May 16 06:20:44 UTC 2011
On 05/15/11 22:09, Lex Trotman wrote:
> Hi Matthew,
>
> Comments from inspection only as I am still development machine
> challenged. Also only on the NUL handling, encodings are not my
> specialty (as in I avoid them like the plague).
Thanks for taking a look!
>
> The problem I see with opening files with embedded NULs is then
> ensuring that it doesn't break when code that expects NUL termination
> accesses the buffer.
I did think of this a little. I think mostly other code related to
Scintilla will work fine since Scintilla doesn't usually assume that a
NUL character will terminate a string, it just treats it like any other
valid UTF-8 character, except displays a special character for control
characters[1]. See below for specifics.
>
> 1. editor.c and others including plugins use sci_add_text a number of
> times, will any of that break? If so it needs to be disabled or fixed.
Using the old function will just lop off the end of a string with
embedded NULs because it uses strlen(), but it will still add the
truncated text fine.
> 2. do searches work with files containing NULs?
Seem to work as expected.
> 3. How do selections work with embedded NULs, AFAICT all the selection
> calls use NUL termination?
Selecting with the mouse/keyboard works the same. As far as selecting
programatically, assuming the Scintilla wrappers are more or less
directly wrapping the Scintilla messages, there shouldn't be a problem.
I'll check these out shortly, but at glance, they look fine.
> 4. All the places sci_get_string, sci_get_line, sci_get_text,
> sci_get_contents and any others that get/set text using NUL
> termination need to be disabled or fixed.
The sci_get_string() function properly finds out the size of the string
using a Scintilla message and then ensuring a final NUL-terminator, so
it should work fine. Any code calling these functions will either
truncate the returned string at the first NUL or handle it completely.
Same goes for the other sci_get_*() functions.
> 5. All of the above goes for plugins too.
Aside from the issues identified with the API breakage, the worst case
scenario should be truncation of the string I think.
> 6. Do lexers work with embedded NULs or does such a file have to be
> filetype None, if so it needs to be enforced.
At least the CPP lexer works fine, others to be tested, though I don't
think they rely too much (or at all) on NUL termination.
Probably a good idea would be to improve the warning dialog that pops up
when you open a file with embedded NULs that explains that some
functions may not work properly.
Cheers,
Matthew Brush
[1] http://www.scintilla.org/ScintillaDoc.html#SCI_SETCONTROLCHARSYMBOL
More information about the Devel
mailing list