Info logo
Encyclopedia

  

Byte Order Mark

Home :: Up
Google
www.fastload.org

Byte Order Mark

A Byte Order Mark (BOM) is the character at code point FEFF (ZERO-WIDTH NO-BREAK SPACE), when that character is used to denote the Endianness of an encoded string of UCS/Unicode characters.

A BOM can be used to indicate that unlabeled text is UTF-16 or UTF-8 encoded, as well as indicating the byte-order of UTF-16 text, whether labeled or not.

In UTF-16, a BOM is expressed as the 8-bit byte sequence FE FF at the beginning of the encoded string, to indicate that the encoded characters that follow it use big-endian byte order; or it is expressed as the byte sequence FF FE to indicate little-endian order.

UTF-8 text can also use a BOM, although this is rare, since UTF-8 prescribes a fixed byte order, and since UTF-8 is often assumed or implicit, so it doesn't need a signature. The UTF-8 representation of the BOM is the byte sequence EF BB BF.

External Links


Recommend us

This article is licensed under the GNU Free Documentation License.
You may copy and modify it as long as the entire work (including additions) remains under this license.
To view or edit this article at Wikipedia, follow this link.