In the initial description you talk about "unicode". Unicode characters can be stored using different encodings: UTF-8, UTF-16 and so on. We cannot know which encoding was used for the filename of a file on disk.

Yes, we can assume UTF-16 or we can assume UTF-8, or even try both in turn.
But then the next user wants UTF-32 BE, the one after that UTF-32 LE, and so on.
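
To illustrate why "trying both" is only a guess, here is a minimal Python sketch. It is not taken from the changes under review; the function name `guess_filename_encoding` and the candidate list are hypothetical and exist only for this example.

```python
def guess_filename_encoding(raw, candidates=("utf-8", "utf-16-le")):
    """Return (text, encoding) for the first candidate that decodes
    `raw` without error, or (None, None) if none of them does.

    Hypothetical helper: the candidate order is an assumption, and a
    successful decode does not prove the guess is right.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    return None, None


# b"\xe9\x00" is not valid UTF-8, so the fallback to UTF-16 LE kicks in.
print(guess_filename_encoding(b"\xe9\x00"))

# b"\xc3\xa9" is valid UTF-8 ("é"), but it is also valid UTF-16 LE
# (a single, different character), so the result depends on the order.
print(guess_filename_encoding(b"\xc3\xa9"))
print(guess_filename_encoding(b"\xc3\xa9", candidates=("utf-16-le", "utf-8")))
```

The last two calls decode the same bytes to different strings without any error, which is the core problem: every additional encoding in the list adds more ways to guess wrong.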

The behaviour is probably very similar on non-Windows platforms. I think filenames should usually follow the system's locale. Mixed charsets are never a good idea.

Anyway, I don't know how to create a file on Windows with a filename that is not in the system's locale, so I cannot really test that case. I did test your changes with non-ASCII filenames in the system's locale, and they still work.

Before this can get merged, two remarks:

