Thursday, July 28, 2011

Open a UTF-8 encoded file saved in Windows on Unix or Linux

Scenario:
XML files are created programmatically with C# in a SharePoint library. These files are then manually sent over to a Linux system that reads them using the od command. The od output shows 3 special characters in front of the XML declaration, which are not visible when the file is opened in Notepad on Windows.

Solution:
The 3 special characters are the BOM (byte order mark) written at the very start of the generated XML file. In UTF-8 the BOM is the three bytes EF BB BF, which Notepad does not display. The solution is to leave out the BOM programmatically before writing the file to disk / SharePoint.
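For files that have already been generated, the BOM can also be detected and stripped from the raw bytes. The following is only a minimal sketch: the class name BomHelper and its method names are hypothetical and not part of the original solution.

```csharp
using System;

// Hypothetical helper (not from the original post): detects and strips
// a UTF-8 byte order mark from the raw bytes of a file.
public static class BomHelper
{
    // Returns true when the data begins with the UTF-8 BOM bytes EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        return data != null
            && data.Length >= 3
            && data[0] == 0xEF
            && data[1] == 0xBB
            && data[2] == 0xBF;
    }

    // Returns a copy of the data without the leading BOM,
    // or the original array when no BOM is present.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        if (!StartsWithUtf8Bom(data))
        {
            return data;
        }

        var result = new byte[data.Length - 3];
        Array.Copy(data, 3, result, 0, result.Length);
        return result;
    }
}
```

The bytes could be read with File.ReadAllBytes and written back with File.WriteAllBytes before the file is sent to the Linux system.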

This can be easily achieved at the point where the file is encoded: pass false as a parameter to the encoding's constructor so that it does not emit the BOM while encoding files.
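The post does not name the class, so the following sketch assumes the constructor in question is that of System.Text.UTF8Encoding, whose boolean parameter (encoderShouldEmitUTF8Identifier) controls whether the BOM is emitted. The class name Utf8NoBomExample and the method name EncodeWithoutBom are hypothetical, and a MemoryStream stands in for the real disk or SharePoint target.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical example class (not from the original post).
public static class Utf8NoBomExample
{
    // Encodes the given XML text as UTF-8 without a leading BOM.
    public static byte[] EncodeWithoutBom(string xml)
    {
        // Assumption: this is the constructor the post refers to.
        // Passing false tells the encoding not to emit the BOM (EF BB BF).
        var utf8NoBom = new UTF8Encoding(false);

        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, utf8NoBom))
            {
                writer.Write(xml);
            }

            // ToArray is still valid after the writer has closed the stream.
            return stream.ToArray();
        }
    }
}
```

With this encoding the first byte of the output is the "<" of the XML declaration, so od on the Linux side should no longer show the three extra characters. The same encoding object can be handed to other writers that accept an Encoding, for example through XmlWriterSettings.Encoding.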

