Wednesday, 25 November 2009

Inserting the BOM into a file

I have been working with Windows UTF-8 files a lot today, and I found that just saving the file as UTF-8 isn't enough for ASP.NET (well, 1.1 at least). Even though Emacs says that it will save the buffer with the BOM (Byte Order Mark), it doesn't seem to do so if the file doesn't already start with one. So I wrote myself a little helper function to add the BOM at the start of the file. It works by going to the start of the buffer you are in and inserting the BOM character, U+FEFF. Emacs encodes that character as UTF-8 when it saves the file, which for reference produces the bytes EF BB BF (thanks to pnkfelix for pointing that out).
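;; Insert the BOM at the start of a file for UTF-8
(defun insert-BOM ()
  "Insert the BOM character U+FEFF at the start of the current buffer."
  (interactive)
  (goto-char (point-min))
  ;; Originally written with `ucs-insert'; newer Emacs folds that into `insert-char'.
  (insert-char #xFEFF 1))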
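
Running the helper above twice would put two BOMs in the buffer, so a guarded variant may be handy. The following is only a sketch: the names insert-BOM-if-missing and bom-utf-8-bytes are made up for illustration and are not part of Emacs or of the original helper. The second function just checks the claim that U+FEFF comes out as EF BB BF in UTF-8.

```elisp
;; Hypothetical helper (not from the original post): add the BOM only
;; when the buffer does not already begin with one.
(defun insert-BOM-if-missing ()
  "Insert U+FEFF at the start of the buffer unless it is already there.
Return non-nil if a BOM was inserted, nil otherwise."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; `char-after' returns nil in an empty buffer, so `eq' is safe here.
    (unless (eq (char-after) #xFEFF)
      (insert (string #xFEFF))
      t)))

;; Hypothetical helper: show the bytes Emacs produces for U+FEFF in UTF-8.
(defun bom-utf-8-bytes ()
  "Return the UTF-8 encoding of U+FEFF as a list of byte values."
  (append (encode-coding-string (string #xFEFF) 'utf-8) nil))
```

Evaluating (bom-utf-8-bytes) should give (239 187 191), which is EF BB BF in hex. The guarded version uses save-excursion so that point stays where it was, unlike the original helper, which leaves point at the start of the buffer.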