Le 16/05/2011 05:45, Matthew Brush a écrit :
Hi all,
Hey. A bit later, but I finally managed to find some time reading your patches. Below is basically only some comments on the patches, you already discussed some points with Lex, I won't add anything on what was said when I don't think I have something to add.
I was wondering if anyone had time to review some changes I've been working on?
No really as you see, times is rare these days :/ But I'm trying to do my best...
There's two commits involved (attached patches and linked below):
- Allow embedded NUL chars in files (without truncating).
I did a fast and inaccurate review of the patch, see my comments on it: https://github.com/codebrainz/geany/commit/81c89cce736c88f235761ce5765f2361f...
yeah, I'm trying GitHub to see what it actually gives. But I'm afraid it'll make my comments less visible for the ML readers, so unlikely fr them to comment back…
- Add more regex patterns for guessing encoding.
This one too, see https://github.com/codebrainz/geany/commit/5dec6db43b498378be3496d4e89496835...
The first one is kind of large and affects several files/functions. One thing I did which should change is in a few places I added a 'gsize tmp_size' var to pass to the encodings_convert_to_utf8* functions, where a simple NULL size parameter would've sufficed, in these cases, the string is known to be NUL-terminated so this parameter isn't required. Unfortunately I made a commit after this so fixing this will come afterwards (though it's not such a big deal). Also, I'm not completely satisfied with the name of the sci_add_text2() function added to the sciwrappers files, but there was already a function named sci_add_text() which truncates the text at the first NUL, and the new function is a more direct wrapper of the SCI_ADDTEXT message.
sci_add_text_full() ?
having counterpart for set_text() would probably be good, too.
These changes break the plugin API, and would require two small changes to the geany-plugins GeanyVC: geanyvc.c:480 and geanyvc.c:498 as well as GeanyGenDoc: ggd-plugin.c:343. It should be obvious why this breakage was required I hope.
yeah, sounds reasonable for such a change. However, prefer breaking the API once (or at least it a single row), so I wouldn't see these committed before further checking and other checks to whether other APIs have to be updated.
The following bug reports are related to the first set of changes: [1][2][3][4]. A few of the comments seem to suggest that it's not desired to support text files containing NUL control bytes, but IMO, it's nice to be able to show these anyway, especially since Scintilla has a nice way of dealing with these. IMO, as long as the file gets marked read-only, displaying as much as possible is better than truncating at the first NUL byte.
As you discussed with Lex, it's something that may please some, but supporting this partially may simply move the user complaints :D
The second one is a simple refactoring of the code around the regex-based encoding detection and adding a few more patterns to detect a few more types of files. Especially needing review are the actual pattern strings I used since, quite frankly, I suck at regexps. One thing I noticed looking at the bug report here[5], after creating these patches/commits, is that it seems the <?xml ... encoding="" ?> is already supported by the CODING regex, so you can ignore the new XML one I added if that truly is the case (I'll remove it from the proper patch I send in the future). Like I said, my regex skills aren't great :)
As said on GitHub, I'm puzzled with this one.
Any feedback would be much appreciated as I plan to submit these changes as proper patches once I've ensured they are good enough.
In the meanwhile, I will continue testing...
That's the most important part: daily testing ^^
Cheers, Colomban