Currently, with line-wrap disabled, when the typed characters reach the right-hand margin of the window, the view jumps (scrolls) by a bit more than half the width of the window: the character that was at about the middle of the window moves left to the beginning of the view. That is too much of a jump; would it be possible to make it smaller, ideally the size of the set indentation, or just the (average) width of a character? Maybe as an option for the user to set (though I don't see how a smaller jump would harm others).
Notepad++, for example, sets the auto-scroll to 4 char-widths (at least when using a monospaced font), which is better, although there is no option to change that.
Looking at https://www.scintilla.org/ScintillaDoc.html#ScrollingAndAutomaticScrolling , it seems it would be possible?... Particularly `SCI_GETXOFFSET` and/or `SCI_LINESCROLL(int columns, int lines)`.
I will illustrate why this is desirable; below, `|` represents the window margins. Say you type one line, press "enter", then type a second after an indentation:
|Aaaaaaaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaa |
| Caaaaaaaaaaaaadaaaaaaaaaaaaaaaaaaaaa|
Now when I type one more character... whoa Nelly, too big a jump! I see this (too bad the font here is not monospaced, at least in code blocks):
|baaaaaaaaaaaaaaaaaaaa | |daaaaaaaaaaaaaaaaaaaaa |
If it were to jump by at most an indentation size, you would see this:
|aaaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaa |
|Caaaaaaaaaaaaadaaaaaaaaaaaaaaaaaaaaa |
which is better, as you still see the whole line you are editing.
Auto scrolling is done by the Scintilla editing component Geany uses. It has a number of policies [available](https://www.scintilla.org/ScintillaDoc.html#ScrollingAndAutomaticScrolling) and currently Geany sets `CARET_JUMPS` and `CARET_EVEN` for x policy.
It would probably be reasonable for someone to provide a pull request that makes some/all settings available via a user setting, probably in Preferences->Editor->Display, next to `Lines visible around the cursor`.
...or in `filetypes.common`, which would be faster to implement, I guess.
No, `filetypes.common` is not a general catchall for settings, it's not even visible in the user interface, and there is also pushback against dumping things in `Various` too (just to pre-warn you :).
I read https://www.scintilla.org/ScintillaDoc.html#ScrollingAndAutomaticScrolling and I understood that, for this particular issue, it would suffice to expose the `CARET_JUMPS` flag. If `CARET_EVEN` is left as is (value 1), then when the typed character goes out of visibility / reaches the limit, the display will:
* move by one position when `CARET_JUMPS=0`
* jump and centre on the caret (last typed character) when `CARET_JUMPS=1`
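To make the difference between the two behaviours concrete, here is a toy model in Python. It is only a sketch under loud assumptions: columns are whole monospaced characters, the visible range is `[offset, offset + width)`, and the function name `scroll_offset` is made up for illustration. It is not Scintilla's code and does not call Scintilla.

```python
def scroll_offset(offset: int, caret: int, width: int, jumps: bool) -> int:
    """Return the new first visible column after the caret moves to `caret`.

    Toy model only (hypothetical helper, not Scintilla's algorithm):
    the visible columns are offset .. offset + width - 1.
    """
    if offset <= caret < offset + width:
        return offset                      # caret still visible: no scroll
    if jumps:
        return max(0, caret - width // 2)  # CARET_JUMPS=1: centre on the caret
    if caret < offset:
        return caret                       # CARET_JUMPS=0: minimal move left
    return caret - width + 1               # CARET_JUMPS=0: minimal move right


width = 40
print(scroll_offset(0, 40, width, jumps=True))   # 20
print(scroll_offset(0, 40, width, jumps=False))  # 1
```

With a 40-column window, typing past the right edge moves the view by about half a window in the "jumps" case and by a single column otherwise, which is the difference this issue is about.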
Ok, so only a checkbox needed probably.
SCI_LINESCROLL() I think is an instruction to scroll the screen, not part of autoscroll.
I thought autoscroll would use `SCI_LINESCROLL()` as a way to decide how much to scroll when `CARET_JUMPS=0`, i.e. the size of "one position" in "move by one position when `CARET_JUMPS=0`" above. Or maybe the Scintilla developers, by "one position", meant a character width; see https://www.scintilla.org/ScintillaDoc.html#SCI_SETXCARETPOLICY , 2nd line, last column, in the table with the columns: slop | strict | jumps | even
You need to understand the Scintilla terminology.
`In this document, 'character' normally refers to a byte even when multi-byte characters are used.`
`Positions within the Scintilla document refer to a character or the gap before that character.`
`There are places where the caret can not go where two character bytes make up one character.`
All of which means that "move the caret one position" moves it to the next legal byte location the caret can occupy, which is approximately one display character.
PS and nothing to do with the command SCI_LINESCROLL() which is for the application to use to scroll the display manually.
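As a rough illustration of "legal byte locations" in a UTF-8 buffer, the sketch below lists the byte offsets that fall on code point boundaries. The helper name `caret_positions` is hypothetical, and this only models the UTF-8 boundary rule, not whatever additional checks Scintilla itself performs.

```python
def caret_positions(buf: bytes) -> list[int]:
    """Byte offsets where a caret may sit: every code point start, plus the end.

    Hypothetical helper for illustration; assumes `buf` is valid UTF-8.
    """
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    starts = [i for i, b in enumerate(buf) if (b & 0xC0) != 0x80]
    return starts + [len(buf)]


text = "a\u00e7\u20acz"      # a (1 byte), U+00E7 (2 bytes), U+20AC (3 bytes), z (1 byte)
buf = text.encode("utf-8")
print(len(buf))              # 7
print(caret_positions(buf))  # [0, 1, 3, 6, 7]
```

Offsets 2, 4 and 5 are inside multi-byte sequences, so moving "one position" from offset 1 lands on offset 3, not 2.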
Thank you for the explanations. That's sort of what I guessed in my question in the above reply.
> 'character' normally refers to a byte even when multi-byte characters are used
That's definitely confusing, even logically self-contradictory. They should have said "byte" if they mean byte. So positions are between bytes, and usually that will be between visible characters (unless there are multi-byte characters).
> PS and nothing to do with the command SCI_LINESCROLL() which is for the application to use to scroll the display manually.
Oh, I see; that is for when clicking on the ends of right-side or bottom ribbon to make the view scroll.
Lots to learn...
I suspect that the terminology is taken from the original Windows edit control Scintilla originally emulated.
And that was probably designed well before multi-byte characters, hence the code page crap that is still in Scintilla.
It may seem unusual in this age of Unicode, but the world didn't suddenly wake up to Unicode; it evolved to it, with many false steps along the way, each leaving its legacy scars on applications like Scintilla.
And even Unicode isn't perfect: even if you stored each code point in the same number of bytes (ie no UTF-8 or UTF-16 encoding) there are still combinations of two code points that map to only one glyph (eg c̦, which is two code points: if you copy it to Geany, deleting forward removes the `c` but not the cedilla, and vice versa if you backspace, and it takes two forward cursor movements to move over it).
Geany always uses UTF-8 encoding in the buffer, so it only meets the weird world of other encodings at load or save time. But that does mean variable length code points and issues like the above that make screen positions and positions in the buffer hard to relate.
> there are still combinations of two code points that map to only one glyph (eg c̦
?!? what the .... Wikipedia (emphases mine):
Unicode is a computing industry standard for the *consistent* encoding, representation, and handling of text expressed in most of the world's writing systems.
In text processing, Unicode takes the role of providing a *unique* code point—a number, not a glyph—for each *character*
Whether or not it is truly consistent depends on their interpretation of "character", because this section https://en.wikipedia.org/wiki/Unicode#Ready-made_versus_composite_characters talks about "main characters" and "diacritical marks" combining to make what they call in earlier sections "abstract characters".
I personally believe it's a bad approach: not only because it makes things harder for the computing industry, but because it is in principle inconsistent with the treatment of most, if not all, characters (ex: A made of 3 bars, B of 1 bar and 2 semi-circles or partial circles...), thus `(almost) any visible character can be regarded as a combination of some small, primitive "marks" (and historically probably evolved that way).`
Not a perfect standard at all.
--- But my practical take-away is, still, that a character (and I mean a visible character, including the example of c̦ ) on the screen is represented by 1 or, for "complex" characters, more bytes. And the caret tries to step in between those bytes, sometimes ending up in "illegal positions" and not showing up.
> "diacritical marks" combining to make what they call "abstract character".
https://www.unicode.org/charts/PDF/U0300.pdf and https://www.unicode.org/charts/PDF/U1AB0.pdf and https://www.unicode.org/charts/PDF/U1DC0.pdf and https://www.unicode.org/charts/PDF/U20D0.pdf and https://www.unicode.org/charts/PDF/UFE20.pdf
Note that what they may be combined with isn't defined, that's language dependent. Some of the commonest combinations are also single code points in the standard, indeed this ç has the single code point U+00E7 and is treated as a single thing by Geany/Scintilla. But I don't think all legal combinations have a single code point, and then there are the symbolic ones.
The caret will only move between code points, so it's consistent that it takes two steps through the two code point version of c̦. Left to itself Scintilla will not put the caret within multiple bytes defining a code point, but the user program can still address those positions.
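The difference between the one code point and two code point forms can be checked with Python's standard `unicodedata` module. This sketch assumes the combining mark is U+0327 COMBINING CEDILLA, chosen because it composes to the U+00E7 mentioned above; the glyph pasted in this thread may use a different combining mark.

```python
import unicodedata

decomposed = "c\u0327"   # 'c' followed by U+0327 COMBINING CEDILLA (assumed mark)
precomposed = "\u00e7"   # U+00E7, the single code point form

# Number of code points: two caret steps versus one.
print(len(decomposed), len(precomposed))  # 2 1

# Number of UTF-8 bytes in the buffer.
print(len(decomposed.encode("utf-8")), len(precomposed.encode("utf-8")))  # 3 2

# NFC normalization composes the pair into the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# A non-zero combining class marks U+0327 as a combining character.
print(unicodedata.combining("\u0327") != 0)  # True
```

Both strings render as one glyph, yet they differ in code point count and byte length, which is why the caret needs two steps for one and a single step for the other.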
The computation of those "legal positions" must take into account all that mess of standards and encodings...
The [Unicode](https://www.unicode.org) Character Database (UCD) contains all the data about code points and their semantics: combining, bi-di, zero width, narrow, and double width. But one thing it does not define is visual glyphs.
To be fair to Unicode, it's messy because human languages are messy, damn those humans, why can't they all just speak numbers like us bots. :grin:
Anyhow this has gotten slightly away from your original issue, which as I said just needs the Scintilla setting to be supported by Geany. All you need is Glade 3.8 to modify the UI but still support GTK2, but most distros don't provide it, so you need to compile your own 3.8.5 from https://ftp.gnome.org/pub/GNOME/sources/glade3/3.8/.
I suspect most implementations prefer to work with the 1-code point versions of those combined characters; so willy-nilly those standards committees will be pushed in the right direction.
> The caret will only move between code points...
Nice, some relief here; now those tables matching Unicode code points to the corresponding code units in, say, UTF-8 encoding should make it simpler.
> ... why can't they all just speak numbers like us bots. :grin:
Uh-oh, what series/model are you? :)