I need to edit an uncompressed PDF file, but Geany suddenly seems to ignore it completely, i.e. it cannot open that uncompressed PDF file. Does anyone know what is actually going on?
My guess would be a problem with encoding. Did you check the status tab in the bottom panel? It should tell you if there is some issue with the file being opened.
There's a message: "File 'some.PDF' does not look like a text file or the file encoding not supported"
Until now I thought Geany was capable of opening all code pages.
Any viable workaround ?
Some points:
1. Geany will not load any file which contains NUL bytes after converting the encoding to UTF-8.
2. There is no guaranteed way to detect the encoding of a file.
3. PDFs can contain image data, which can contain NUL bytes and is not text that can be converted to UTF-8.
4. The message "File 'some.PDF' does not look like a text file or the file encoding not supported" means one of the above occurred for every encoding Geany knows.
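The points above can be sketched roughly as follows. This is not Geany's actual loading code, only an illustration of the idea "try each known encoding and reject the result if it fails to decode or still contains NULs". The function name `first_usable_encoding` and the candidate list are made up for the example.

```python
from typing import Iterable, Optional


def first_usable_encoding(data: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate encoding that decodes `data` without NULs.

    Hypothetical helper: it mimics the rule described in the thread,
    not the real implementation in Geany.
    """
    for name in candidates:
        try:
            text = data.decode(name)
        except (UnicodeDecodeError, LookupError):
            # The bytes are not valid in this encoding (or the name is unknown).
            continue
        if "\x00" in text:
            # Decoded fine, but the result still contains NULs: rejected.
            continue
        return name
    # Every candidate failed, which corresponds to the error message above.
    return None


if __name__ == "__main__":
    print(first_usable_encoding(b"caf\xe9", ["utf-8", "iso-8859-1"]))
    print(first_usable_encoding(b"%PDF\x00\xff", ["utf-8", "iso-8859-1"]))
```

With these inputs the first call should settle on `iso-8859-1`, and the second should find nothing usable because of the NUL byte.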
> Any viable workaround ?
Viable I don't know, but I heard of some people working around NUL bytes by replacing them with a placeholder and converting back after editing. Something like `sed -i 's/\x00/%%NUL%%/g' file && geany file && sed -i 's/%%NUL%%/\x00/g' file` (GNU sed) or along those lines. Something like that should work, but I never used it myself.
Still the image data may not be convertible to UTF-8, it's just a sequence of bytes, not any encoding.
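The same placeholder idea, as a minimal Python sketch working on raw bytes. The helper names `hide_nuls` and `restore_nuls` and the `%%NUL%%` default are only illustrative. The one thing it adds over the one-liner is a check that the placeholder does not already occur in the file, since otherwise the round trip would not be lossless.

```python
PLACEHOLDER = b"%%NUL%%"  # arbitrary marker, assumed not to appear in the file


def hide_nuls(data: bytes, placeholder: bytes = PLACEHOLDER) -> bytes:
    """Replace every NUL byte with a printable placeholder."""
    if placeholder in data:
        # Restoring would turn these pre-existing occurrences into NULs.
        raise ValueError("placeholder already occurs in the data; pick another one")
    return data.replace(b"\x00", placeholder)


def restore_nuls(data: bytes, placeholder: bytes = PLACEHOLDER) -> bytes:
    """Turn the placeholder back into NUL bytes after editing."""
    return data.replace(placeholder, b"\x00")


if __name__ == "__main__":
    original = b"stream\x00\x01\x02\x00endstream"
    hidden = hide_nuls(original)
    print(hidden)
    print(restore_nuls(hidden) == original)
```

This only takes care of the NULs. As noted in the thread, the remaining binary bytes may still not be valid in whatever encoding the editor picks.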
base64 then, the solution to just about anything?
> base64 then, the solution to just about anything?
:-)
Well, base 64 would be a good solution, if iconv encoding converters recognised the image data inside the PDF and converted it to base 64 instead of just crapping out when the random bytes in the image do not make a valid encoding. (Image or any other binary that can be embedded in PDF).
If base 64 or any other no-NULs format is allowed in PDFs maybe the OP could use one of the PDF converters to do so before editing it in Geany.
But probably the best answer is for the OP to use an actual PDF editor, not try to edit PDF content in a text editor.
> Still the image data may not be convertible to UTF-8, it's just a sequence of bytes, not any encoding.

UTF-8 no, but many encodings are actually "a sequence of bytes", which is the reason why choosing the right one when opening is so hard (basically, it's a guessing game if you don't already know the answer). *Any* stream of bytes is gonna be valid in one of the ISO-8859-* encodings, the only reason we don't accept them is that we refuse NUL bytes.
Ok, ... convertible to UTF-8 with no NULs ...
Picky picky mumble mumble... :-)
Yes, any ISO-8859-* stream with no NULs is convertible to UTF-8 with no NULs. Or am I missing something? I don't think any of ISO-8859-* has things Unicode cannot represent :)
Yes, but image data is very likely to have NUL bytes, which those encodings convert to NULs IIRC. My point is that the text in the PDF will be convertible by some encoding just fine, but embedded images run through the same converter will come out as garbage and likely contain NULs. The iconv converters don't know about embedded images, so they cannot skip them.
Sure, it won't be a convenient experience. And if the non-binary data is not single-byte encoded, it's unlikely to be really usable: even stripping or replacing the NULs will not make the file convertible from that multi-byte encoding if it has stricter rules than "any byte goes anywhere" (like UTF-8).
Anyway, any solution for editing binary data is gonna be sub-optimal if not specialized for that type of data. Geany knows binary data that represents text, anything else it doesn't. Even real hex editors are usually a pain if they don't have specific support for the format -- but still, they let you do *some* useful things sometimes.
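Both halves of this exchange can be checked with a small sketch. It uses Python's `iso-8859-1` codec rather than iconv, on the assumption that the two behave the same for this encoding, and the helper name `to_utf8_via_latin1` and the sample bytes are invented for the illustration.

```python
def to_utf8_via_latin1(data: bytes) -> bytes:
    """Interpret arbitrary bytes as ISO-8859-1 and re-encode them as UTF-8."""
    # ISO-8859-1 maps every byte value to a code point, so decoding never fails.
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    # Every byte value except NUL: the UTF-8 result contains no NUL either.
    nul_free = bytes(range(1, 256))
    print(b"\x00" in to_utf8_via_latin1(nul_free))

    # Image-like data with NUL bytes: the NULs survive the conversion.
    image_like = b"\x89PNG\x00\x00\x00\rIHDR"
    print(b"\x00" in to_utf8_via_latin1(image_like))
```

So the conversion itself always succeeds for this encoding. What makes the result unacceptable is only the NULs that were already in the binary part.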
So to summarise, for PDF files use a PDF editor, for image files use an image editor, for pure text files use Geany.
I propose to close this, it's highly unlikely Geany will ever become a PDF editor. Or is there any reason in keeping this open?
Yeah it's probably fine to close.
FWIW, I have code on top of my encoding PRs for opening binary files, but that's limited to the loading and encoding management, not adapting all code to work with NULs. Yet, search seems to work fairly well @elextr 😉 Anyway, I'm not sure we're gonna merge it, as it's not 100% trivial and doesn't necessarily add a lot of value if most features are half-broken -- however it could help with viewing broken log files or fixing a tiny corruption.