Module String.Ascii
US-ASCII string support.
The following functions act only on US-ASCII code points, that is on the bytes in range [0x00;0x7F]. The functions can be safely used on UTF-8 encoded strings but they will, of course, only deal with US-ASCII related matters.
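The reason this is safe can be sketched with the standard library alone: in UTF-8, every byte of a multi-byte sequence lies in [0x80;0xFF], so a function that only touches [0x00;0x7F] cannot alter it. The helper name is_us_ascii below is hypothetical and not part of this module; String.for_all assumes OCaml 4.13 or later.

```ocaml
(* Hypothetical helper, not part of String.Ascii: true iff every byte of [s]
   is a US-ASCII code point, i.e. lies in the range [0x00;0x7F]. *)
let is_us_ascii s = String.for_all (fun c -> Char.code c <= 0x7F) s

(* "é" is encoded in UTF-8 as the two bytes 0xC3 0xA9, both above 0x7F. *)
let e_acute = "\xC3\xA9"

let () =
  assert (is_us_ascii "hello");
  assert (not (is_us_ascii e_acute));
  (* No byte of a UTF-8 multi-byte sequence falls in the US-ASCII range. *)
  assert (String.for_all (fun c -> Char.code c >= 0x80) e_acute)
```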
References.
- Vint Cerf. ASCII format for Network Interchange. RFC 20, 1969.
Predicates
Casing transforms
The functions can be safely used on UTF-8 encoded strings; they will of course only deal with US-ASCII casings.
val uppercase : string -> string
uppercase s is s with US-ASCII characters 'a' to 'z' mapped to 'A' to 'Z'.
val lowercase : string -> string
lowercase s is s with US-ASCII characters 'A' to 'Z' mapped to 'a' to 'z'.
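As an illustration of the two mappings, here is a minimal sketch written with the standard library only. The names uppercase_sketch and lowercase_sketch are hypothetical; this is not necessarily how the module implements them.

```ocaml
(* Hypothetical sketches of the documented mappings, not the module's code.
   In US-ASCII, a lowercase letter and its uppercase form differ by 32. *)
let uppercase_sketch s =
  String.map
    (fun c -> if 'a' <= c && c <= 'z' then Char.chr (Char.code c - 32) else c)
    s

let lowercase_sketch s =
  String.map
    (fun c -> if 'A' <= c && c <= 'Z' then Char.chr (Char.code c + 32) else c)
    s

let () =
  (* The two UTF-8 bytes of "é" (0xC3 0xA9) are not US-ASCII and are kept. *)
  assert (uppercase_sketch "caf\xC3\xA9 1!" = "CAF\xC3\xA9 1!");
  assert (lowercase_sketch "HeLLo" = "hello")
```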
val capitalize : string -> string
capitalize s is like uppercase but performs the map only on s.[0].
val uncapitalize : string -> string
uncapitalize s is like lowercase but performs the map only on s.[0].
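The first-byte-only behaviour can be sketched as follows, using the standard library's Char.uppercase_ascii and Char.lowercase_ascii. The names capitalize_sketch and uncapitalize_sketch are hypothetical, and returning the empty string unchanged is an assumption, since the text above does not say what happens when s.[0] does not exist.

```ocaml
(* Hypothetical sketches, not the module's code: only index 0 is mapped. *)
let capitalize_sketch s =
  String.mapi (fun i c -> if i = 0 then Char.uppercase_ascii c else c) s

let uncapitalize_sketch s =
  String.mapi (fun i c -> if i = 0 then Char.lowercase_ascii c else c) s

let () =
  assert (capitalize_sketch "hello world" = "Hello world");
  assert (uncapitalize_sketch "Hello World" = "hello World");
  (* Assumption: the empty string has no s.[0] and is returned as is. *)
  assert (capitalize_sketch "" = "");
  (* A leading non US-ASCII byte is left untouched. *)
  assert (capitalize_sketch "\xC3\xA9t\xC3\xA9" = "\xC3\xA9t\xC3\xA9")
```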
Converting to US-ASCII hexadecimal characters
val to_hex : string -> string
to_hex s is the sequence of bytes of s as US-ASCII lowercase hexadecimal digits.
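A minimal sketch of this conversion, assuming each byte yields exactly two digits with the high nibble first. The name to_hex_sketch is hypothetical and the code is illustrative only.

```ocaml
(* Hypothetical sketch, not the module's code: each input byte becomes two
   lowercase hexadecimal digits, high nibble first. *)
let to_hex_sketch s =
  let digits = "0123456789abcdef" in
  String.init (2 * String.length s) (fun i ->
      let byte = Char.code s.[i / 2] in
      if i mod 2 = 0 then digits.[byte lsr 4] else digits.[byte land 0xF])

let () =
  assert (to_hex_sketch "" = "");
  (* Bytes 0x00, 0xFF and 'A' (0x41). *)
  assert (to_hex_sketch "\x00\xFFA" = "00ff41")
```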
val of_hex : string -> (string, int) Stdlib.result
of_hex h parses a sequence of US-ASCII (lower or upper cased) hexadecimal digits from h into its corresponding byte sequence. Error n is returned with n either an index in the string which is not a hexadecimal digit or the length of h if there is a missing digit at the end.
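The error convention can be illustrated with the following sketch. The names hex_value and of_hex_sketch are hypothetical, and the order in which the two error cases are checked is an assumption; the text above does not specify it.

```ocaml
(* Hypothetical helper: numeric value of a hexadecimal digit, if any. *)
let hex_value c = match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Hypothetical sketch, not the module's code. Digits are consumed in pairs,
   left to right; the first problem encountered is reported. *)
let of_hex_sketch h =
  let len = String.length h in
  let b = Buffer.create (len / 2) in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    match hex_value h.[i] with
    | None -> Error i (* not a hexadecimal digit *)
    | Some hi ->
        if i + 1 >= len then Error len (* missing digit at the end *) else
        match hex_value h.[i + 1] with
        | None -> Error (i + 1)
        | Some lo ->
            Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
            loop (i + 2)
  in
  loop 0

let () =
  assert (of_hex_sketch "48690a" = Ok "Hi\n");
  assert (of_hex_sketch "4A" = Ok "J");
  assert (of_hex_sketch "4" = Error 1);  (* length of the input *)
  assert (of_hex_sketch "4g" = Error 1)  (* index of the bad digit *)
```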
Converting to printable US-ASCII characters
val escape : string -> string
escape s escapes bytes of s to a representation that uses only US-ASCII printable characters. More precisely:
- [0x20;0x5B] and [0x5D;0x7E] are left unchanged. These are the printable US-ASCII bytes, except '\\' (0x5C).
- [0x00;0x1F], 0x5C and [0x7F;0xFF] are escaped by a hexadecimal "\xHH" escape with H a capital hexadecimal digit. These bytes are the US-ASCII control characters, the non US-ASCII bytes and '\\' (0x5C).
Use unescape to unescape. The invariant unescape (escape s) = Ok s holds.
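The two rules above can be sketched as follows. The name escape_sketch is hypothetical; the code only illustrates the documented byte ranges and is not the module's implementation.

```ocaml
(* Hypothetical sketch, not the module's code: printable US-ASCII bytes other
   than '\\' are copied, every other byte becomes a "\xHH" escape with
   uppercase hexadecimal digits. *)
let escape_sketch s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      let code = Char.code c in
      if code >= 0x20 && code <= 0x7E && c <> '\\'
      then Buffer.add_char b c
      else Buffer.add_string b (Printf.sprintf "\\x%02X" code))
    s;
  Buffer.contents b

let () =
  (* Input bytes: 'a', '\\', 'b', newline, 0xFF. *)
  assert (escape_sketch "a\\b\n\xFF" = "a\\x5Cb\\x0A\\xFF")
```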
val unescape : string -> (string, int) Stdlib.result
unescape s unescapes from s the escapes performed by escape. More precisely:
- "\xHH" with H a lower or upper case hexadecimal digit is unescaped to the corresponding byte value.
Any other escape following a '\\' not defined above makes the function return Error i with i the index of the error in the string.
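A minimal sketch of the inverse direction. The names hex_value and unescape_sketch are hypothetical. Reporting the index of the offending '\\' as the error index is an assumption: the text above only says the index of the error is returned.

```ocaml
(* Hypothetical helper: numeric value of a hexadecimal digit, if any. *)
let hex_value c = match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Hypothetical sketch, not the module's code. Only "\xHH" is accepted after
   a backslash. Assumption: errors report the index of the backslash. *)
let unescape_sketch s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1)) else
    if i + 3 >= len || s.[i + 1] <> 'x' then Error i else
    match hex_value s.[i + 2], hex_value s.[i + 3] with
    | Some hi, Some lo ->
        Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
        loop (i + 4)
    | _ -> Error i
  in
  loop 0

let () =
  assert (unescape_sketch "a\\x5Cb\\x0a" = Ok "a\\b\n");
  assert (unescape_sketch "a\\n" = Error 1);  (* "\n" is not a valid escape *)
  assert (unescape_sketch "\\x4" = Error 0)   (* truncated escape *)
```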
val ocaml_string_escape : string -> string
ocaml_string_escape s escapes the bytes of s to a representation that uses only US-ASCII printable characters and according to OCaml's conventions for string literals. More precisely:
- '\b' (0x08) is escaped to "\\b" (0x5C,0x62).
- '\t' (0x09) is escaped to "\\t" (0x5C,0x74).
- '\n' (0x0A) is escaped to "\\n" (0x5C,0x6E).
- '\r' (0x0D) is escaped to "\\r" (0x5C,0x72).
- '\"' (0x22) is escaped to "\\\"" (0x5C,0x22).
- '\\' (0x5C) is escaped to "\\\\" (0x5C,0x5C).
- 0x20, 0x21, [0x23;0x5B] and [0x5D;0x7E] are left unchanged. These are the printable US-ASCII bytes, except '\"' (0x22) and '\\' (0x5C).
- Remaining bytes are escaped by a hexadecimal "\xHH" escape with H an uppercase hexadecimal digit. These bytes are the US-ASCII control characters not mentioned above and non US-ASCII bytes.
Use ocaml_unescape to unescape. The invariant ocaml_unescape (ocaml_string_escape s) = Ok s holds.
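The rules listed above map naturally onto a pattern match. The following is a sketch only; the name ocaml_string_escape_sketch is hypothetical and the code is not the module's implementation.

```ocaml
(* Hypothetical sketch, not the module's code: one match case per rule. *)
let ocaml_string_escape_sketch s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '\b' -> Buffer.add_string b "\\b"
      | '\t' -> Buffer.add_string b "\\t"
      | '\n' -> Buffer.add_string b "\\n"
      | '\r' -> Buffer.add_string b "\\r"
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      (* Remaining printable US-ASCII bytes, [0x20;0x7E], are kept. *)
      | ' ' .. '~' -> Buffer.add_char b c
      (* Everything else becomes "\xHH" with uppercase digits. *)
      | _ -> Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c)))
    s;
  Buffer.contents b

let () =
  (* Input bytes: say "hi" followed by newline, 0x00 and 0xC3. *)
  assert (ocaml_string_escape_sketch "say \"hi\"\n\x00\xC3"
          = "say \\\"hi\\\"\\n\\x00\\xC3")
```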
val ocaml_unescape : string -> (string, int) Stdlib.result
ocaml_unescape s unescapes from s the escape sequences afforded by OCaml string and char literals. More precisely:
- "\\b" (0x5C,0x62) is unescaped to '\b' (0x08).
- "\\t" (0x5C,0x74) is unescaped to '\t' (0x09).
- "\\n" (0x5C,0x6E) is unescaped to '\n' (0x0A).
- "\\r" (0x5C,0x72) is unescaped to '\r' (0x0D).
- "\\ " (0x5C,0x20) is unescaped to ' ' (0x20).
- "\\\"" (0x5C,0x22) is unescaped to '\"' (0x22).
- "\\'" (0x5C,0x27) is unescaped to '\'' (0x27).
- "\\\\" (0x5C,0x5C) is unescaped to '\\' (0x5C).
- "\xHH" with H a lower or upper case hexadecimal number is unescaped to the corresponding byte value.
- "\\DDD" with D a decimal number such that DDD is unescaped to the corresponding byte value.
- "\\oOOO" with O an octal number is unescaped to the corresponding byte value.
Any other escape following a '\\' not defined above makes the function return Error i with i the location of the error in the string.
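The escape table above can be sketched as a small left-to-right decoder. All names below (digit_value, read_number, ocaml_unescape_sketch) are hypothetical. Two points are assumptions not stated in the text: numeric escapes whose value exceeds 255 are rejected, and the reported error location is the index of the offending '\\'.

```ocaml
(* Hypothetical helper: value of digit [c] in [base], if it is one. *)
let digit_value ~base c =
  let v = match c with
    | '0' .. '9' -> Char.code c - Char.code '0'
    | 'a' .. 'f' -> Char.code c - Char.code 'a' + 10
    | 'A' .. 'F' -> Char.code c - Char.code 'A' + 10
    | _ -> max_int
  in
  if v < base then Some v else None

(* Hypothetical helper: reads exactly [count] digits of [base] starting at
   index [i]. Assumption: values above 255 are rejected. *)
let read_number s ~base ~count i =
  if i + count > String.length s then None else
  let rec loop acc k =
    if k = count then (if acc <= 0xFF then Some acc else None) else
    match digit_value ~base s.[i + k] with
    | None -> None
    | Some v -> loop (acc * base + v) (k + 1)
  in
  loop 0 0

(* Hypothetical sketch, not the module's code. Assumption: errors report the
   index of the backslash that starts the invalid escape. *)
let ocaml_unescape_sketch s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Ok (Buffer.contents b) else
    if s.[i] <> '\\' then (Buffer.add_char b s.[i]; loop (i + 1)) else
    if i + 1 >= len then Error i else
    let simple c = Buffer.add_char b c; loop (i + 2) in
    let number ~base ~count ~start =
      match read_number s ~base ~count start with
      | None -> Error i
      | Some v -> Buffer.add_char b (Char.chr v); loop (start + count)
    in
    match s.[i + 1] with
    | 'b' -> simple '\b'
    | 't' -> simple '\t'
    | 'n' -> simple '\n'
    | 'r' -> simple '\r'
    | ' ' | '"' | '\'' | '\\' as c -> simple c
    | 'x' -> number ~base:16 ~count:2 ~start:(i + 2)
    | 'o' -> number ~base:8 ~count:3 ~start:(i + 2)
    | '0' .. '9' -> number ~base:10 ~count:3 ~start:(i + 1)
    | _ -> Error i
  in
  loop 0

let () =
  (* Escapes in order: tab, hex 0x41, decimal 066, octal 101, quote. *)
  assert (ocaml_unescape_sketch "a\\tb\\x41\\066\\o101\\'" = Ok "a\tbABA'");
  assert (ocaml_unescape_sketch "ok\\q" = Error 2)
```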