Hi All,
In bug 3410977 the OP asked for Geany to use MIME types to determine filetypes. This caused some confusion.
1. I asked where the OP was suggesting we get the MIME type from.
2. Colomban suggested g_content_type_guess.
3. Matthew suggested mime.types and then libmagic.
Given a filename with an extension, all of these just look up the extension in a database, the same as we do, but, as Colomban pointed out, their databases are not as easily modified as ours.
Matthew and I looked at the code of g_content_type_guess and could hardly believe it was as stupid as it looked. So I tried it, and it was! Any file without an extension is an application/octet-stream unless it's binary. Feeding the content to it without a filename gives application/x-csrc for any file containing /* or //.
Libmagic is very slightly brighter: it gives C for files containing /* and C++ for files containing //.
None of these techniques makes a convincing case for converting to the standard MIME/magic libraries. All of them also reverse the order we currently use: we check inside the file first and then the extension, whereas they check the extension first and then inside the file. And libmagic would add another dependency.
If no one has any better ideas, we can close the bug on that basis.
Instead, I believe it would be useful to look for markers in the first and last couple of lines of the file. Examples are the modelines used by Emacs, which appear in all the C++ system library files (which have no extensions), and the modelines used by Vim. People who insist (however daftly) on using .h for C++ headers could then still mark the file to be highlighted with the C++ rules and keywords. (A note to C programmers: C++ has a lot more in its headers, sometimes the whole program, so highlighting them correctly is useful.)
Using a regular expression for the search would make it adaptable and the capture group could return the filetype. I'll look at this soon.
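To make the idea concrete, here is a rough sketch in Python (only for illustration, this is not Geany code). The names MODELINE_PATTERNS and detect_filetype are made up, and the two patterns are my own assumptions about what common Emacs and Vim modelines look like; they do not cover every form those editors accept. Each pattern has a single capture group that returns the filetype name.

```python
import re

# Hypothetical patterns for illustration only. Each one has exactly one
# capture group, which holds the filetype name found in the modeline.
MODELINE_PATTERNS = [
    # Emacs style: "-*- C++ -*-" or "-*- mode: c++; tab-width: 4 -*-"
    re.compile(
        r"-\*-\s*(?:.*?\bmode:\s*)?([A-Za-z0-9+#_-]+)\s*(?:;.*?)?-\*-",
        re.IGNORECASE,
    ),
    # Vim style: "vim: set ft=cpp:" or "vim: filetype=cpp"
    re.compile(r"\bvim?:.*?\b(?:ft|filetype)=([A-Za-z0-9+#_-]+)"),
]


def detect_filetype(text, patterns=MODELINE_PATTERNS, scan_lines=2):
    """Return the filetype named by a modeline, or None if there is none.

    Only the first and last `scan_lines` lines of `text` are searched.
    """
    lines = text.splitlines()
    candidates = lines[:scan_lines] + lines[-scan_lines:]
    for line in candidates:
        for pattern in patterns:
            match = pattern.search(line)
            if match:
                # Lower-case the name so "C++" and "c++" compare equal.
                return match.group(1).lower()
    return None


if __name__ == "__main__":
    header = "// <vector> -*- C++ -*-\n\nnamespace std {}\n"
    print(detect_filetype(header))
```

The captured name ("c++", "cpp" and so on) would still need mapping to our own filetype names, which is left out here. Keeping the patterns in a list is what makes it adaptable: supporting another editor's marker only means adding another expression.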
Cheers,
Lex