UTF-8 is one of the most commonly used methods for encoding Unicode characters as bytes. It has some interesting properties; for example, characters in the ASCII range keep their single-byte ASCII encoding. This is an implementation of UTF-8 decoding in OCaml.
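To make the idea concrete, here is a minimal sketch of what such a decoder might look like. It is not the implementation from the post itself: the function name `decode_utf8` is hypothetical, and it assumes the whole input is available as a single OCaml `string` rather than arriving as a stream. It uses only the standard library.

```ocaml
(* Hypothetical helper (not from the original post): decode a complete
   UTF-8 string into a list of Unicode code points.
   Malformed input raises Invalid_argument. *)
let decode_utf8 (s : string) : int list =
  let n = String.length s in
  (* Read the continuation byte at index i (must look like 10xxxxxx)
     and return its low 6 payload bits. *)
  let cont i =
    if i >= n then invalid_arg "decode_utf8: truncated sequence";
    let b = Char.code s.[i] in
    if b land 0xC0 <> 0x80 then
      invalid_arg "decode_utf8: bad continuation byte";
    b land 0x3F
  in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      let b0 = Char.code s.[i] in
      if b0 < 0x80 then
        (* 0xxxxxxx: ASCII, the byte is the code point. *)
        go (i + 1) (b0 :: acc)
      else if b0 land 0xE0 = 0xC0 then begin
        (* 110xxxxx 10xxxxxx: two-byte sequence. *)
        let c1 = cont (i + 1) in
        let cp = ((b0 land 0x1F) lsl 6) lor c1 in
        if cp < 0x80 then invalid_arg "decode_utf8: overlong encoding";
        go (i + 2) (cp :: acc)
      end else if b0 land 0xF0 = 0xE0 then begin
        (* 1110xxxx 10xxxxxx 10xxxxxx: three-byte sequence. *)
        let c1 = cont (i + 1) in
        let c2 = cont (i + 2) in
        let cp = ((b0 land 0x0F) lsl 12) lor (c1 lsl 6) lor c2 in
        if cp < 0x800 then invalid_arg "decode_utf8: overlong encoding";
        if cp >= 0xD800 && cp <= 0xDFFF then
          invalid_arg "decode_utf8: surrogate code point";
        go (i + 3) (cp :: acc)
      end else if b0 land 0xF8 = 0xF0 then begin
        (* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: four-byte sequence. *)
        let c1 = cont (i + 1) in
        let c2 = cont (i + 2) in
        let c3 = cont (i + 3) in
        let cp =
          ((b0 land 0x07) lsl 18) lor (c1 lsl 12) lor (c2 lsl 6) lor c3
        in
        if cp < 0x10000 || cp > 0x10FFFF then
          invalid_arg "decode_utf8: code point out of range";
        go (i + 4) (cp :: acc)
      end else
        invalid_arg "decode_utf8: invalid lead byte"
  in
  go 0 []
```

The first branch is the ASCII property in action: any byte below `0x80` is returned unchanged as its own code point. Calling `decode_utf8 "h\xC3\xA9"` should yield the code points for `h` and `é`. Raising on malformed input is only one possible design choice; a decoder could instead substitute U+FFFD and continue.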