>As Enrico said above, Geany will not load a file containing NULs, that's one of the causes of the "binary file" error message, so check if the files contain

That indeed seems to be the problem.
It appears that when Windows saves an email as text,
it puts \x{00} at the end of the file,
which persuades Geany to open it in encoding UTF-16LE.

At all events, I have added the line
   s/\x{00}//;
to my Perl script that does DOS to Unix,
and, for all the files that I have tried so far,
Geany is now happy.
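
For anyone wanting to do the same clean-up outside Perl, here is a minimal sketch in Python. The function name strip_nuls is made up for illustration. Unlike a substitution without the /g modifier, which removes only the first match in each string it is applied to, this removes every NUL byte and reports how many it found.

```python
from typing import Tuple


def strip_nuls(data: bytes) -> Tuple[bytes, int]:
    """Return data with every NUL byte removed, plus the number removed.

    Hypothetical helper, not part of Geany or dos2unix.
    """
    count = data.count(b"\x00")
    return data.replace(b"\x00", b""), count


# Example: UTF-8 text with one stray NUL at the end of the file.
sample = "café\n".encode("utf-8") + b"\x00"
cleaned, removed = strip_nuls(sample)
```

The count is returned so that a conversion script can log which files were affected instead of silently changing them.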

Many thanks

Richard H

On 10/22/2013 02:40 AM, Lex Trotman wrote:



On 22 October 2013 05:07, Enrico Tröger <enrico.troeger@uvena.de> wrote:
Hi,

>How do I get Geany to recognize (Linux text) files
>as UTF-8 encoded?
>
>The files in question are legacy Windows txt files,
>written in French (i.e. with lots of accents)
>which I have converted to   mode: Unix (LF)   encoding:UTF-8
>by a Perl script that does
>
>     "iconv -f CP1252 -t UTF-8 --output=$tempfile $infile"
>and
>     "dos2unix -n -f $tempfile $outfile"

-f for 'force binary files'? Geany can't handle binary files.

In the default conversion mode (--ascii), I believe dos2unix expects only ASCII chars, so it needs -f to make it accept UTF-8 encodings.  Given that this is running on the output of iconv, this *should* be OK, unless the original files contained NULs or were not CP1252.
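
As a rough illustration of what the two-step conversion is meant to achieve, the sketch below does the same job in one pass in Python: decode as CP1252, turn CRLF into LF, and encode as UTF-8. It is not a replacement for iconv or dos2unix, and the function name is hypothetical. It refuses input containing NULs, on the assumption that such files need a closer look before conversion.

```python
def cp1252_to_utf8_unix(raw: bytes) -> bytes:
    """Convert CP1252 text with DOS line endings to UTF-8 with Unix ones.

    Hypothetical helper for illustration only.
    """
    # Assumption: a NUL byte means the file is not plain CP1252 text.
    if b"\x00" in raw:
        raise ValueError("input contains NUL bytes")
    # Raises UnicodeDecodeError for bytes that have no CP1252 mapping.
    text = raw.decode("cp1252")
    # Only CRLF pairs are changed; a lone CR is left alone.
    text = text.replace("\r\n", "\n")
    return text.encode("utf-8")


# Example: "café" in CP1252 with a DOS line ending.
converted = cp1252_to_utf8_unix(b"caf\xe9\r\n")
```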

 


>It appears that if the infile has a final \x{0A} character,
>then this arrives in the outfile.

\x0A is \n, hard to imagine this really confuses Geany that much.

Especially as we have an option to add this to files when they are saved :)

 

>
>I can open these files with JEdit or Kate, no problem.
>But Geany's behaviour with such files is inconsistent.
>
>Sometimes Geany refuses to do anything,
>saying "... does not look like a text
>file, or the file encoding is not supported",
>
>Sometimes Geany renders the file  using encoding
>UTF-16 LE, which makes it look as if written in
>Mandarin Chinese.

This sort of thing happens to me with Windows files that have *not* been converted to UTF-8.  Are you *sure* the iconv conversion was successful?  Are the files CP1252, or maybe ISO-8859-1 or some other code page?
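
One quick way to check whether a converted file really is valid UTF-8 is to try decoding it and see where it fails. A minimal sketch, with a made-up function name:

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# "café" still in CP1252: the lone 0xE9 byte is not valid UTF-8.
unconverted = b"caf\xe9\n"
bad_offset = first_invalid_utf8_offset(unconverted)

# The same text properly encoded as UTF-8.
ok_offset = first_invalid_utf8_offset("café\n".encode("utf-8"))
```

Note that a NUL byte is valid UTF-8 as far as the decoder is concerned, so this check alone will not catch the NUL problem; that needs a separate test.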

 
>
>And sometimes Geany opens such 'problem' files correctly,
>as UTF-8. So far as I can see, this tends to be the
>case if there are already several txt files open.

Do you mean the behaviour changes for a particular file depending on whether there are already several text files open?

 
>
>I have tried putting the line /* geany_encoding=utf-8 */
>as line 1 of a problem file, but that does not seem to
>have any consistent effect.

Without having a look at the code, I was sure in-file headers would
take precedence over guessed encodings.

Your memory is fine Enrico :)

The order (in the absence of a user forced selection) is:

1) Use the encoding the regex found, *if it converts and validates*.  For files with the line above it should be consistent, especially as there is a first-try special case for UTF-8 that validates.  The exception is when the file contains NULs, had a conversion error from the regex-matched encoding, or won't validate as UTF-8; in that case Geany assumes that the regex just matched some random text and goes on to try the steps below.

2) Use the encoding in the locale, if it converts without error and validates.  What locale do you have set?

3) Get desperate :)  Try each encoding in the list (in the order of the menu->document->set encodings->* list); the first conversion that succeeds and validates wins.  This heuristic is probably where you are getting strange encodings selected.
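
The three steps above can be sketched roughly as follows. This is a simplified model of the order described here, not Geany's actual code; all names are hypothetical, and treating a NUL character in the decoded text as a failed validation is an assumption made for the sketch.

```python
from typing import Iterable, Optional


def converts_and_validates(data: bytes, encoding: str) -> bool:
    """True if data decodes cleanly and the result contains no NUL.

    Assumption: a NUL in the decoded text counts as a failure.
    """
    try:
        text = data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return False
    return "\x00" not in text


def detect_encoding(
    data: bytes,
    declared: Optional[str] = None,
    locale_encoding: Optional[str] = None,
    fallbacks: Iterable[str] = (),
) -> Optional[str]:
    # 1) The encoding named in the file itself, if it works.
    if declared and converts_and_validates(data, declared):
        return declared
    # 2) The locale's encoding, if it works.
    if locale_encoding and converts_and_validates(data, locale_encoding):
        return locale_encoding
    # 3) Each fallback in list order; the first one that works wins.
    for candidate in fallbacks:
        if converts_and_validates(data, candidate):
            return candidate
    return None


# UTF-8 text followed by one stray NUL byte (8 bytes in total).
with_nul = "héllo\n".encode("utf-8") + b"\x00"
guess_with_nul = detect_encoding(with_nul, "utf-8", "utf-8", ["utf-16-le"])
guess_clean = detect_encoding(with_nul[:-1], "utf-8", "utf-8", ["utf-16-le"])
```

In this model the stray NUL makes UTF-8 fail at steps 1 and 2, while the same bytes happen to pair up into valid UTF-16LE code units, so step 3 picks UTF-16LE. That would be consistent with the "Mandarin Chinese" rendering reported, though whether Geany takes exactly this path is not confirmed here.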

Something further to try: in the open dialog, Geany gives you the chance to select the encoding to use.  Do your "problematic" files work if you select UTF-8 instead of "detect"?

As Enrico said above, Geany will not load a file containing NULs; that's one of the causes of the "binary file" error message, so check if the files contain NULs.  Gedit does accept NULs IIUC.
 

Cheers
Lex


Anyway, it's quite hard to help here without knowing what files
we are talking about.
Could you share some of the problematic files? If not possible in
public, at least via private mail?


Regards,
Enrico

--
Not sent from my smartphone.
_______________________________________________
Users mailing list
Users@lists.geany.org
https://lists.geany.org/cgi-bin/mailman/listinfo/users


