[Geany-Users] Count Sel Chars

Lex Trotman elextr at xxxxx
Thu Nov 14 08:50:12 UTC 2013


On 14 November 2013 19:07, janc at janc.es <janc at janc.es> wrote:

> Hi! friends.
>
> I found that when selecting chars in a Linux UTF-8 text, the status bar
> counts double if it is an extended char, I mean one outside the ASCII table.
>

Actually it's counting octets in the underlying UTF-8 encoding that the
buffer uses, so it could count as high as four for a single code point,
or possibly higher when a glyph is made of two or more combining characters.
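
To make that concrete, here is a small sketch in Python (only an
illustration, not anything Geany itself runs; the sample strings and the
helper name utf8_octets are my own choices) showing how many octets a few
characters take in UTF-8:

```python
# Illustrative only: how many octets UTF-8 needs for some sample strings.
# The samples and the helper name are assumptions made for this example.
samples = {
    "ASCII letter a": "a",
    "precomposed e-acute": "\u00e9",
    "euro sign": "\u20ac",
    "emoji (outside the BMP)": "\U0001F600",
    "e + combining acute (one glyph, two code points)": "e\u0301",
}


def utf8_octets(text: str) -> int:
    """Return the number of octets in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


for name, text in samples.items():
    print(f"{name}: {utf8_octets(text)} octet(s)")
```

The last sample is the combining case: it displays as a single accented
letter but is stored as a base letter plus a combining mark.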


>
> I was using that selection to format the output of a script.
>
> I think it should be the number of 'text chars' not bytes.
>

The difficulty is, as I alluded to above, what is a "text char"?  Depending
on the use-case it could be the octets, the Unicode code points or the
glyphs shown on the screen.

The octet count is what the GUI editing component returns, which is why
that is what is shown.

It would be technically possible to scan the selection and count the
Unicode code points in it, but it would have performance implications if
the selection is large, for example if the user selected the whole document.
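
For what it's worth, such a scan could look roughly like the sketch below
(Python for brevity, and assuming the selection is already available as
valid UTF-8 bytes; the function name count_code_points is hypothetical).
It has to visit every octet, which is where the cost for a large selection
comes from:

```python
def count_code_points(utf8_bytes: bytes) -> int:
    """Count code points in valid UTF-8 data.

    Every code point starts with exactly one lead byte, and all following
    continuation bytes have the bit pattern 10xxxxxx, so counting the bytes
    that are NOT continuation bytes gives the number of code points.
    Assumes the input is well-formed UTF-8.
    """
    return sum(1 for b in utf8_bytes if (b & 0xC0) != 0x80)


# Hypothetical selection: "h", e-acute, "llo", space, euro sign.
selection = "h\u00e9llo \u20ac".encode("utf-8")
print(len(selection), "octets,", count_code_points(selection), "code points")
```

The work grows linearly with the number of octets selected, and it would
need repeating every time the selection changes.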

There is currently no way of knowing how many glyphs the GUI component used
to display a sequence of octets, so the counted code points may not match
what you see on the screen anyway.
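
A rough sketch of how the three counts can disagree (Python again; the
"glyph" figure here is only an approximation that skips combining marks,
since what is really drawn depends on the font and the text shaping, and
the function names are made up for the example):

```python
import unicodedata


def count_octets(text: str) -> int:
    """Octets in the UTF-8 encoding, i.e. what the status bar reports."""
    return len(text.encode("utf-8"))


def count_code_points(text: str) -> int:
    """Unicode code points; a Python 3 str is a sequence of code points."""
    return len(text)


def approx_glyphs(text: str) -> int:
    """Very rough glyph estimate: ignore combining marks.

    This is an assumption for illustration only; it does not handle
    ligatures, emoji sequences, or anything font-dependent.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


sample = "e\u0301\u20ac"  # e + combining acute accent, then a euro sign
print(count_octets(sample), count_code_points(sample), approx_glyphs(sample))
```

For that sample the three numbers are all different, so whichever one the
status bar showed would be "wrong" for somebody's use-case.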

So I don't think it's worth changing. The only option available, scanning
the selection to count code points each time it changes, is potentially slow
and may not do what you expect in any case.

Cheers
Lex



>
> What do you think about it?
>
> Cheers.
>
> --
> Jose Angel Navarro Cortes
> email: janc at janc.es
> web: http://janc.es/
> Usuario Linux: #49178