Re: [Geany-Users] Count Sel Chars

14 Nov 2013


On 14 November 2013 19:07, janc@janc.es wrote:
...
Hi! friends.
I found that when selecting chars in a Linux UTF-8 text, the status bar
counts double for each extended char, I mean one outside the ASCII table.
Actually it's counting octets in the underlying UTF-8 encoding that the
buffer uses, so it could count as high as four for a single code point,
or possibly higher when a glyph is made of two or more combining characters.
...
I was using that selection to format the output of a script.
I think it should be the number of 'text chars', not bytes.
The difficulty is, as I alluded to above, what is a "text char"?  Depending
on the use-case it could be the octets, the Unicode code points or the
glyphs shown on the screen.
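
To make the three possible meanings concrete, here is a minimal Python sketch (not Geany code; the function names and the sample string are made up for illustration). The "base character" count is only a rough stand-in for glyphs: it skips combining marks and does not implement full grapheme cluster segmentation.

```python
import unicodedata


def count_octets(text: str) -> int:
    """Number of bytes in the UTF-8 encoding (what the status bar reports)."""
    return len(text.encode("utf-8"))


def count_code_points(text: str) -> int:
    """Number of Unicode code points (a Python str is a code point sequence)."""
    return len(text)


def count_base_characters(text: str) -> int:
    """Rough glyph estimate: code points that are not combining marks.

    This is an approximation only; what is actually drawn on screen
    depends on the font and the rendering component.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# Hypothetical sample: 'a', n-tilde (U+00F1), euro sign (U+20AC),
# then 'e' followed by a combining acute accent (U+0301).
sample = "a\u00f1\u20ace\u0301"

print(count_octets(sample))           # bytes in UTF-8
print(count_code_points(sample))      # code points
print(count_base_characters(sample))  # code points minus combining marks
```

The same selection gives three different numbers, which is why "number of text chars" needs a definition before it can be shown in a status bar.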
Octets are what the GUI editing component returns, which is
why that is what is shown.
It would be technically possible to scan the selection and count the
Unicode code points in it, but it would have performance implications if
the selection is large, for example if the user selected the whole document.
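
As an illustration of what such a scan involves, here is a small Python sketch, assuming the selection is available as valid UTF-8 bytes. It is not how Geany or its editing component does anything; `count_code_points_utf8` is a hypothetical helper. It relies on the fact that in UTF-8 every continuation byte has the bit pattern 10xxxxxx, so each byte that is not a continuation byte starts a new code point.

```python
def count_code_points_utf8(buf: bytes) -> int:
    """Count code points in a buffer assumed to hold valid UTF-8.

    Every byte is inspected once, so the cost grows linearly with the
    size of the selection.
    """
    # Continuation bytes look like 0b10xxxxxx; anything else begins a code point.
    return sum(1 for b in buf if (b & 0xC0) != 0x80)


# Hypothetical selection: 'a', n-tilde (2 bytes), euro sign (3 bytes),
# and U+1F600 (4 bytes).
selection = "a\u00f1\u20ac\U0001F600".encode("utf-8")

print(len(selection))                     # octets
print(count_code_points_utf8(selection))  # code points
```

Because the whole selection has to be walked each time it changes, selecting an entire large document means rescanning all of it on every update.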
There is currently no way of knowing how many glyphs the GUI component used
to display a sequence of octets, so the counted code points may not match
what you see on the screen anyway.
So I don't think it's worth changing. The only option available, scanning
the selection to count code points each time it changes, is potentially slow
and may not do what you expect in any case.
Cheers
Lex
...
What do you think?
Cheers.
--
Jose Angel Navarro Cortes
email: janc@janc.es
web: http://janc.es/
Usuario Linux: #49178

Users mailing list
Users@lists.geany.org
https://lists.geany.org/cgi-bin/mailman/listinfo/users
