Module String.Ascii
US-ASCII string support.
The following functions act only on US-ASCII code points, that is on the bytes in range [0x00;0x7F]. The functions can be safely used on UTF-8 encoded strings but they will, of course, only deal with US-ASCII related matters.
References.
- Vint Cerf. ASCII format for Network Interchange. RFC 20, 1969.
Predicates
Casing transforms
The functions can be safely used on UTF-8 encoded strings; they will of course only deal with US-ASCII casings.
val uppercase : string -> string
uppercase s is s with US-ASCII characters 'a' to 'z' mapped to 'A' to 'Z'.
val lowercase : string -> string
lowercase s is s with US-ASCII characters 'A' to 'Z' mapped to 'a' to 'z'.
val capitalize : string -> string
capitalize s is like uppercase but performs the map only on s.[0].
val uncapitalize : string -> string
uncapitalize s is like lowercase but performs the map only on s.[0].
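To illustrate why these transforms are safe on UTF-8 text, here is a small sketch that uses the standard library's String.uppercase_ascii and String.capitalize_ascii as stand-ins. That they behave like the functions documented here is an assumption based on the descriptions above, not a statement about how this module is implemented.

```ocaml
(* Stand-in sketch: Stdlib functions are assumed to map the same
   US-ASCII ranges as the functions documented above. *)

(* "h\xC3\xA9llo" is the UTF-8 encoding of "héllo". *)
let s = "h\xC3\xA9llo"

(* Only the bytes 'h', 'l', 'l', 'o' are mapped; the two bytes
   encoding 'é' are outside [0x00;0x7F] and are left untouched. *)
let upper = String.uppercase_ascii s

(* Only s.[0] is considered. *)
let cap = String.capitalize_ascii s

(* The first byte is 0xC3, which is not a US-ASCII letter, so the
   string is returned unchanged. *)
let cap_non_ascii = String.capitalize_ascii "\xC3\xA9t\xC3\xA9"
```

Because every byte of a multi-byte UTF-8 sequence is at least 0x80, a transform restricted to [0x00;0x7F] can never corrupt an encoded character.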
Converting to US-ASCII hexadecimal characters
val to_hex : string -> string
to_hex s is the sequence of bytes of s as US-ASCII lowercase hexadecimal digits.
val of_hex : string -> (string, int) Stdlib.result
of_hex h parses a sequence of US-ASCII (lower or upper cased) hexadecimal digits from h into its corresponding byte sequence. Error n is returned with n either the index of a byte in h that is not a hexadecimal digit, or the length of h if a digit is missing at the end.
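The conversion and its error reporting can be sketched with the standard library alone. This is a minimal illustration of the behaviour described above, not the module's actual code; the helper hex_value is hypothetical.

```ocaml
(* Sketch only: a reimplementation written from the description above. *)

let to_hex s =
  let digits = "0123456789abcdef" in
  String.init (2 * String.length s) (fun i ->
      let byte = Char.code s.[i / 2] in
      (* Even positions hold the high nibble, odd positions the low one. *)
      if i mod 2 = 0 then digits.[byte lsr 4] else digits.[byte land 0xF])

(* Hypothetical helper: value of a hexadecimal digit, if [c] is one. *)
let hex_value c = match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

let of_hex h =
  let len = String.length h in
  let b = Buffer.create (len / 2) in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    match hex_value h.[i] with
    | None -> Error i (* index of the offending byte *)
    | Some hi ->
        (* A lone trailing digit is reported with the length of [h]. *)
        if i + 1 >= len then Error len else
        match hex_value h.[i + 1] with
        | None -> Error (i + 1)
        | Some lo ->
            Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
            loop (i + 2)
  in
  loop 0
```

With this sketch, of_hex "0g" yields Error 1 (the index of 'g') and of_hex "abc" yields Error 3 (the length, since the last digit has no partner).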
Converting to printable US-ASCII characters
val escape : string -> string
escape s escapes bytes of s to a representation that uses only US-ASCII printable characters. More precisely:
- [0x20;0x5B] and [0x5D;0x7E] are left unchanged. These are the printable US-ASCII bytes, except '\\' (0x5C).
- [0x00;0x1F], 0x5C and [0x7F;0xFF] are escaped by a hexadecimal "\xHH" escape with H a capital hexadecimal digit. These bytes are the US-ASCII control characters, the non US-ASCII bytes and '\\' (0x5C).
Use unescape to unescape. The invariant unescape (escape s) = Ok s holds.
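The two byte classes above translate directly into a pattern match. The following is a sketch written from that description using only the standard library; it is not the module's implementation.

```ocaml
(* Sketch only: escapes every byte outside the unchanged ranges. *)
let escape s =
  let b = Buffer.create (String.length s) in
  let add_byte c = match c with
    (* [0x20;0x5B] and [0x5D;0x7E]: printable bytes except '\\'. *)
    | ' ' .. '[' | ']' .. '~' -> Buffer.add_char b c
    (* Control characters, '\\' and non US-ASCII bytes. *)
    | _ -> Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
  in
  String.iter add_byte s;
  Buffer.contents b
```

Escaping the backslash itself is what makes the output unambiguous: every '\\' in the result starts a "\xHH" escape.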
val unescape : string -> (string, int) Stdlib.result
unescape s unescapes from s the escapes performed by escape. More precisely:
- "\xHH" with H a lower or upper case hexadecimal digit is unescaped to the corresponding byte value.
Any other escape following a '\\' not defined above makes the function return Error i with i the index of the error in the string.
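A decoder for this single escape form can be sketched as follows. It is an illustration under one assumption: the reported index is that of the offending '\\', which the text above does not specify precisely. The helper hex_value is hypothetical.

```ocaml
(* Hypothetical helper: value of a hexadecimal digit, if [c] is one. *)
let hex_value c = match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Sketch only. Assumption: errors report the index of the '\\'. *)
let unescape s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1)) else
    (* An escape needs four bytes: '\\', 'x' and two digits. *)
    if i + 3 >= len || s.[i + 1] <> 'x' then Error i else
    match hex_value s.[i + 2], hex_value s.[i + 3] with
    | Some hi, Some lo ->
        Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
        loop (i + 4)
    | _ -> Error i
  in
  loop 0
```

Under these assumptions, "\\n" is rejected with Error 0 because only the "\xHH" form is recognized.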
val ocaml_string_escape : string -> string
ocaml_string_escape s escapes the bytes of s to a representation that uses only US-ASCII printable characters and follows OCaml's conventions for string literals. More precisely:
- '\b' (0x08) is escaped to "\\b" (0x5C,0x62).
- '\t' (0x09) is escaped to "\\t" (0x5C,0x74).
- '\n' (0x0A) is escaped to "\\n" (0x5C,0x6E).
- '\r' (0x0D) is escaped to "\\r" (0x5C,0x72).
- '\"' (0x22) is escaped to "\\\"" (0x5C,0x22).
- '\\' (0x5C) is escaped to "\\\\" (0x5C,0x5C).
- 0x20, 0x21, [0x23;0x5B] and [0x5D;0x7E] are left unchanged. These are the printable US-ASCII bytes, except '\"' (0x22) and '\\' (0x5C).
- Remaining bytes are escaped by a hexadecimal "\xHH" escape with H an uppercase hexadecimal digit. These bytes are the US-ASCII control characters not mentioned above and non US-ASCII bytes.
Use ocaml_unescape to unescape. The invariant ocaml_unescape (ocaml_string_escape s) = Ok s holds.
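The case analysis above can be sketched as a single pattern match over bytes. This is a reimplementation written from the list, using only the standard library, and should not be read as the module's own code.

```ocaml
(* Sketch only: named escapes first, then printable bytes, then "\xHH". *)
let ocaml_string_escape s =
  let b = Buffer.create (String.length s) in
  let add_byte c = match c with
    | '\b' -> Buffer.add_string b "\\b"
    | '\t' -> Buffer.add_string b "\\t"
    | '\n' -> Buffer.add_string b "\\n"
    | '\r' -> Buffer.add_string b "\\r"
    | '"' -> Buffer.add_string b "\\\""
    | '\\' -> Buffer.add_string b "\\\\"
    (* Remaining printable US-ASCII bytes, [0x20;0x7E]; '"' and '\\'
       were already handled by the cases above. *)
    | ' ' .. '~' -> Buffer.add_char b c
    (* Other control characters and non US-ASCII bytes. *)
    | _ -> Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
  in
  String.iter add_byte s;
  Buffer.contents b
```

Note that the standard library's String.escaped is not a drop-in substitute for this sketch: it emits decimal "\DDD" escapes for non-printable bytes rather than "\xHH".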
val ocaml_unescape : string -> (string, int) Stdlib.result
ocaml_unescape s unescapes from s the escape sequences afforded by OCaml string and char literals. More precisely:
- "\\b" (0x5C,0x62) is unescaped to '\b' (0x08).
- "\\t" (0x5C,0x74) is unescaped to '\t' (0x09).
- "\\n" (0x5C,0x6E) is unescaped to '\n' (0x0A).
- "\\r" (0x5C,0x72) is unescaped to '\r' (0x0D).
- "\\ " (0x5C,0x20) is unescaped to ' ' (0x20).
- "\\\"" (0x5C,0x22) is unescaped to '\"' (0x22).
- "\\'" (0x5C,0x27) is unescaped to '\'' (0x27).
- "\\\\" (0x5C,0x5C) is unescaped to '\\' (0x5C).
- "\xHH" with H a lower or upper case hexadecimal digit is unescaped to the corresponding byte value.
- "\\DDD" with D a decimal digit is unescaped to the corresponding byte value.
- "\\oOOO" with O an octal digit is unescaped to the corresponding byte value.
Any other escape following a '\\' not defined above makes the function return Error i with i the location of the error in the string.
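The escape table above can be sketched as a small decoder. Two points are assumptions rather than documented behaviour: errors are reported at the index of the offending '\\', and numeric escapes whose value exceeds 0xFF are rejected. The helpers digit_value and parse_number are hypothetical.

```ocaml
(* Hypothetical helper: value of digit [c] in [base], if it is one. *)
let digit_value ~base c =
  let v = match c with
    | '0' .. '9' -> Char.code c - Char.code '0'
    | 'a' .. 'f' -> Char.code c - Char.code 'a' + 10
    | 'A' .. 'F' -> Char.code c - Char.code 'A' + 10
    | _ -> max_int
  in
  if v < base then Some v else None

(* Hypothetical helper: parses exactly [count] digits from [start]. *)
let parse_number s ~base ~start ~count =
  if start + count > String.length s then None else
  let rec loop acc k =
    if k = count then Some acc else
    match digit_value ~base s.[start + k] with
    | None -> None
    | Some v -> loop (acc * base + v) (k + 1)
  in
  loop 0 0

(* Sketch only. Assumption: errors report the index of the '\\'. *)
let ocaml_unescape s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1)) else
    if i + 1 >= len then Error i else
    let simple c = Buffer.add_char b c; loop (i + 2) in
    let numeric ~base ~start ~count =
      match parse_number s ~base ~start ~count with
      (* Assumption: values that do not fit in a byte are errors. *)
      | Some v when v <= 0xFF ->
          Buffer.add_char b (Char.chr v); loop (start + count)
      | _ -> Error i
    in
    match s.[i + 1] with
    | 'b' -> simple '\b'
    | 't' -> simple '\t'
    | 'n' -> simple '\n'
    | 'r' -> simple '\r'
    | ' ' | '"' | '\'' | '\\' as c -> simple c
    | 'x' -> numeric ~base:16 ~start:(i + 2) ~count:2
    | 'o' -> numeric ~base:8 ~start:(i + 2) ~count:3
    | '0' .. '9' -> numeric ~base:10 ~start:(i + 1) ~count:3
    | _ -> Error i
  in
  loop 0
```

In this sketch the three numeric forms "\\x41", "\\065" and "\\o101" all decode to the byte 'A' (0x41).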