On 05/15/11 22:09, Lex Trotman wrote:
Hi Matthew,
Comments from inspection only as I am still development machine challenged. Also only on the NUL handling, encodings are not my specialty (as in I avoid them like the plague).
Thanks for taking a look!
The problem I see with opening files with embedded NULs is then ensuring that it doesn't break when code that expects NUL termination accesses the buffer.
I did think about this a little. I think most other code related to Scintilla will work fine, since Scintilla doesn't usually assume that a NUL character terminates a string. It treats NUL like any other valid UTF-8 character, except that it displays a special symbol for control characters[1]. See below for specifics.
- editor.c and others including plugins use sci_add_text a number of
times, will any of that break? If so it needs to be disabled or fixed.
The old function uses strlen(), so it will just lop off everything from the first embedded NUL onward, but it will still add the truncated text fine.
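To illustrate the truncation, here is a minimal sketch in plain C, without Scintilla. The Doc type and both helper names are hypothetical stand-ins, not Geany or Scintilla code: add_text_strlen() mimics a wrapper that measures its argument with strlen(), and add_text_len() mimics one that takes an explicit length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a document buffer; not Geany or Scintilla code. */
typedef struct {
    char data[64];
    size_t len;
} Doc;

/* Append exactly n bytes, embedded NULs included. */
static void add_text_len(Doc *doc, const char *text, size_t n)
{
    assert(doc->len + n <= sizeof doc->data);
    memcpy(doc->data + doc->len, text, n);
    doc->len += n;
}

/* Append a C string. strlen() stops at the first NUL,
 * so any bytes after it are silently dropped. */
static void add_text_strlen(Doc *doc, const char *text)
{
    add_text_len(doc, text, strlen(text));
}
```

Given the 5-byte input "ab\0cd", the strlen() variant stores only the 2 bytes before the NUL, while the explicit-length variant stores all 5.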
- do searches work with files containing NULs?
Seem to work as expected.
- How do selections work with embedded NULs, AFAICT all the selection
calls use NUL termination?
Selecting with the mouse/keyboard works the same. As for selecting programmatically, assuming the Scintilla wrappers are more or less directly wrapping the Scintilla messages, there shouldn't be a problem. I'll check these out shortly, but at a glance they look fine.
- All the places sci_get_string, sci_get_line, sci_get_text,
sci_get_contents and any others that get/set text using NUL termination need to be disabled or fixed.
The sci_get_string() function properly finds out the size of the string using a Scintilla message and then ensures a final NUL terminator, so it should work fine. Any code calling these functions will either truncate the returned string at the first NUL or handle it completely. Same goes for the other sci_get_*() functions.
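As a rough sketch of that pattern (get_string_copy() is a hypothetical helper, not the real sci_get_string(); in Geany the length would come from a Scintilla message rather than a parameter):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical getter: the length is known up front, one extra byte
 * is allocated, and the copy is always NUL-terminated.
 * The caller owns the result and must free() it. */
static char *get_string_copy(const char *src, size_t len)
{
    char *out = malloc(len + 1);
    assert(out != NULL);
    memcpy(out, src, len);   /* copies embedded NULs too */
    out[len] = '\0';         /* final terminator after the last byte */
    return out;
}
```

For a 7-byte buffer "one\0two", the returned copy holds all 7 bytes plus the terminator. A caller that runs strlen() on it sees only 3 bytes, which is the truncation case. A caller that keeps track of the length sees everything.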
- All of the above goes for plugins too.
Aside from the issues identified with the API breakage, I think the worst-case scenario should be truncation of the string.
- Do lexers work with embedded NULs or does such a file have to be
filetype None, if so it needs to be enforced.
At least the CPP lexer works fine. Others are still to be tested, though I don't think they rely much (or at all) on NUL termination.
It would probably be a good idea to improve the warning dialog that pops up when you open a file with embedded NULs, so that it explains that some functions may not work properly.
Cheers, Matthew Brush
[1] http://www.scintilla.org/ScintillaDoc.html#SCI_SETCONTROLCHARSYMBOL