grep'ing for '[^\x00-\x7F]' will show you everything that is not in the ASCII range. Because the set is negated, it will not show you \x00: that is the null byte, and it falls inside the ASCII range. Is there a reason there are so many non-ASCII characters in this file? If other code is writing to this file, could it be "polluting" it? Beyond learning how to fix it, you should probably find out why this is happening.
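To make the difference between non-ASCII bytes and null bytes concrete, here is a rough sketch. It assumes GNU grep (for -P) and uses a made-up file name, sample.txt, built just for the demonstration; swap in your own file.

```shell
# Work in a scratch directory; sample.txt is a hypothetical file name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Line 2 holds a UTF-8 "é" (bytes 0xC3 0xA9); line 3 holds a null byte.
printf 'plain line\ncaf\303\251 line\nnull\000byte line\n' > sample.txt

# Lines with bytes outside 0x00-0x7F. Assumes GNU grep:
#   -a treat the file as text, -n print line numbers, -P Perl-style regex.
# LC_ALL=C makes the match byte-wise.
nonascii_lines=$(LC_ALL=C grep -anP '[^\x00-\x7F]' sample.txt | cut -d: -f1)

# Count null bytes separately: delete everything except \000, then count.
null_count=$(tr -cd '\000' < sample.txt | wc -c | tr -d ' ')

echo "non-ASCII on line(s): $nonascii_lines"
echo "null bytes: $null_count"
```

The grep only flags the line with the "é"; the line with the null byte is not matched, which is why the null bytes have to be counted on their own.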
You could try deleting the null bytes, to see if that helps:
cat file | tr -d '\000' > new_file
diff file new_file
You can also try splitting the file (for testing) into smaller chunks until you find the problem area. You may want to use od for a closer inspection.
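One way the chunking idea might look in practice, as a sketch only: the file name sample.txt, the chunk_ prefix, and the 2-line chunk size are all made up for the example, so pick values that suit your file.

```shell
# Work in a scratch directory; sample.txt is a hypothetical file name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first line\nsecond\000line\nthird line\nfourth line\n' > sample.txt

# Split into 2-line pieces named chunk_aa, chunk_ab, ...
split -l 2 sample.txt chunk_

# Collect the pieces that contain at least one null byte.
bad_chunks=""
for piece in chunk_*; do
    nulls=$(tr -cd '\000' < "$piece" | wc -c | tr -d ' ')
    if [ "$nulls" -gt 0 ]; then
        bad_chunks="$bad_chunks $piece"
    fi
done
echo "pieces with null bytes:$bad_chunks"

# Dump the suspect piece byte by byte:
#   -A d prints decimal offsets, -c prints characters and escapes (null shows as \0).
od -A d -c chunk_aa
```

Once a piece is flagged, you can split that piece again with a smaller -l value, or read the od output to get the exact offset of the offending byte.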