Module String.Ascii
US-ASCII string support.
References.
- Vint Cerf. ASCII format for Network Interchange. RFC 20, 1969.
Predicates
Casing transforms
The following functions act only on US-ASCII code points, that is on bytes in the range [0x00;0x7F], leaving any other byte intact. The functions can be safely used on UTF-8 encoded strings; they will of course only deal with US-ASCII casings.
val uppercase : string -> string
uppercase s is s with US-ASCII characters 'a' to 'z' mapped to 'A' to 'Z'.
val lowercase : string -> string
lowercase s is s with US-ASCII characters 'A' to 'Z' mapped to 'a' to 'z'.
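The byte-range rule can be illustrated with a minimal sketch that uses only the OCaml standard library. The names uppercase_sketch and lowercase_sketch are hypothetical and are not the module's implementation; they only mirror the documented mapping so that the effect on a UTF-8 string is visible.

```ocaml
(* Hypothetical sketch, not the library's code: map only the bytes
   'a'..'z' (resp. 'A'..'Z') and leave every other byte intact. *)
let uppercase_sketch s =
  String.map
    (fun c ->
      if 'a' <= c && c <= 'z' then Char.chr (Char.code c - 32) else c)
    s

let lowercase_sketch s =
  String.map
    (fun c ->
      if 'A' <= c && c <= 'Z' then Char.chr (Char.code c + 32) else c)
    s

let () =
  (* U+00E9 is encoded as the bytes 0xC3 0xA9 in UTF-8. Both bytes are
     above 0x7F, so they are left untouched. *)
  assert (uppercase_sketch "caf\xC3\xA9" = "CAF\xC3\xA9");
  (* U+00C9 is 0xC3 0x89; it is not lowercased either. *)
  assert (lowercase_sketch "CAF\xC3\x89" = "caf\xC3\x89")
```

Since every byte of a multi-byte UTF-8 sequence is above 0x7F, a transform restricted to [0x00;0x7F] cannot corrupt the encoding.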
val capitalize : string -> string
capitalize s is like uppercase but performs the map only on s.[0].
val uncapitalize : string -> string
uncapitalize s is like lowercase but performs the map only on s.[0].
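A similar sketch for the first-byte variants, again with hypothetical names and standard library calls only. Returning the empty string unchanged is an assumption of this sketch; the text above only speaks of s.[0].

```ocaml
(* Hypothetical helper: apply [f] to the first byte only. *)
let map_first f s =
  if s = "" then s (* assumption: the empty string is returned as is *)
  else String.mapi (fun i c -> if i = 0 then f c else c) s

let capitalize_sketch s =
  map_first
    (fun c ->
      if 'a' <= c && c <= 'z' then Char.chr (Char.code c - 32) else c)
    s

let uncapitalize_sketch s =
  map_first
    (fun c ->
      if 'A' <= c && c <= 'Z' then Char.chr (Char.code c + 32) else c)
    s

let () =
  assert (capitalize_sketch "hello world" = "Hello world");
  assert (uncapitalize_sketch "OCaml" = "oCaml");
  (* A leading non US-ASCII byte (here 0xC3) is left intact. *)
  assert (capitalize_sketch "\xC3\xA9t\xC3\xA9" = "\xC3\xA9t\xC3\xA9")
```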
Escaping to printable US-ASCII
val escape : string -> string
escape s is s with:
- Any '\\' (0x5C) escaped to the sequence "\\\\" (0x5C, 0x5C).
- Any byte in the ranges [0x00;0x1F] and [0x7F;0xFF] escaped by a hexadecimal "\xHH" escape, with H a capital hexadecimal digit. These bytes are the US-ASCII control characters and non US-ASCII bytes.
- Any other byte is left unchanged.
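The three rules above can be sketched as a single pattern match over the bytes of the string. escape_sketch is a hypothetical name for a standard-library-only model of the documented behaviour, not the module's implementation.

```ocaml
(* Hypothetical sketch of the documented escaping rules. *)
let escape_sketch s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      (* Rule 1: a backslash is doubled. *)
      | '\\' -> Buffer.add_string b "\\\\"
      (* Rule 2: control and non US-ASCII bytes become \xHH with
         capital hexadecimal digits. *)
      | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
          Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
      (* Rule 3: any other byte is copied unchanged. *)
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  assert (escape_sketch "a\\b" = "a\\\\b");
  assert (escape_sketch "tab\there" = "tab\\x09here");
  assert (escape_sketch "\xC3\xA9" = "\\xC3\\xA9")
```

The result only contains bytes in the range [0x20;0x7E], which is what makes it printable US-ASCII.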
val unescape : string -> string option
unescape s unescapes what escape did. The letters of hex escapes can be upper, lower or mixed case, and any two-letter hex escape is decoded to its corresponding byte. Any other escape not defined by escape, or a truncated escape, makes the function return None.

The invariant unescape (escape s) = Some s holds.
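The decoding rules and the round-trip invariant can be checked with a small model written against the standard library only. The names hex_value, escape_sketch and unescape_sketch are hypothetical; copying unescaped bytes through unchanged is an assumption of this sketch.

```ocaml
(* Hypothetical model of escape, repeated so the block is self-contained. *)
let escape_sketch s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '\\' -> Buffer.add_string b "\\\\"
      | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
          Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Value of one hexadecimal digit, in either case. *)
let hex_value c =
  match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

let unescape_sketch s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Some (Buffer.contents b)
    else if s.[i] <> '\\' then begin
      (* Assumption: bytes outside an escape are copied as is. *)
      Buffer.add_char b s.[i];
      loop (i + 1)
    end
    else if i + 1 < len && s.[i + 1] = '\\' then begin
      Buffer.add_char b '\\';
      loop (i + 2)
    end
    else if i + 3 < len && s.[i + 1] = 'x' then begin
      match hex_value s.[i + 2], hex_value s.[i + 3] with
      | Some hi, Some lo ->
          Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
          loop (i + 4)
      | _ -> None
    end
    else None (* unknown or truncated escape *)
  in
  loop 0

let () =
  (* Mixed-case hex digits are accepted. *)
  assert (unescape_sketch "\\x4a\\x4B" = Some "JK");
  (* Truncated and undefined escapes are rejected. *)
  assert (unescape_sketch "\\x4" = None);
  assert (unescape_sketch "\\n" = None);
  (* The invariant, checked on every one-byte string. *)
  for i = 0 to 255 do
    let s = String.make 1 (Char.chr i) in
    assert (unescape_sketch (escape_sketch s) = Some s)
  done
```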
val escape_string : string -> string
escape_string s is like escape except it escapes s according to OCaml's lexical conventions for strings with:
- Any '\b' (0x08) escaped to the sequence "\\b" (0x5C, 0x62).
- Any '\t' (0x09) escaped to the sequence "\\t" (0x5C, 0x74).
- Any '\n' (0x0A) escaped to the sequence "\\n" (0x5C, 0x6E).
- Any '\r' (0x0D) escaped to the sequence "\\r" (0x5C, 0x72).
- Any '\"' (0x22) escaped to the sequence "\\\"" (0x5C, 0x22).
- Any other byte follows the rules of escape.
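The list above amounts to a few extra cases placed before the general rules of escape. A minimal sketch under that reading, with the hypothetical name escape_string_sketch and standard library calls only:

```ocaml
(* Hypothetical sketch: named escapes first, then the rules of escape. *)
let escape_string_sketch s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '\b' -> Buffer.add_string b "\\b"
      | '\t' -> Buffer.add_string b "\\t"
      | '\n' -> Buffer.add_string b "\\n"
      | '\r' -> Buffer.add_string b "\\r"
      | '"' -> Buffer.add_string b "\\\""
      (* From here on, the same rules as escape. *)
      | '\\' -> Buffer.add_string b "\\\\"
      | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
          Buffer.add_string b (Printf.sprintf "\\x%02X" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  assert (escape_string_sketch "a\tb\n" = "a\\tb\\n");
  assert (escape_string_sketch "say \"hi\"" = "say \\\"hi\\\"");
  (* Control bytes without a named escape still use \xHH. *)
  assert (escape_string_sketch "\x01" = "\\x01")
```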
val unescape_string : string -> string option
unescape_string is to escape_string what unescape is to escape, and additionally unescapes the sequence "\\'" (0x5C, 0x27) to "'" (0x27).
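A sketch of the corresponding decoder, assuming it accepts exactly the named escapes listed for escape_string, the extra "\\'" sequence, the doubled backslash and two-digit hex escapes. All names are hypothetical and only the standard library is used.

```ocaml
(* Value of one hexadecimal digit, in either case. *)
let hex_value c =
  match c with
  | '0' .. '9' -> Some (Char.code c - Char.code '0')
  | 'a' .. 'f' -> Some (Char.code c - Char.code 'a' + 10)
  | 'A' .. 'F' -> Some (Char.code c - Char.code 'A' + 10)
  | _ -> None

(* Byte denoted by the character that follows a backslash, if any. *)
let simple_escape c =
  match c with
  | 'b' -> Some '\b'
  | 't' -> Some '\t'
  | 'n' -> Some '\n'
  | 'r' -> Some '\r'
  | '"' -> Some '"'
  | '\'' -> Some '\'' (* accepted when decoding only *)
  | '\\' -> Some '\\'
  | _ -> None

let unescape_string_sketch s =
  let len = String.length s in
  let b = Buffer.create len in
  let rec loop i =
    if i >= len then Some (Buffer.contents b)
    else if s.[i] <> '\\' then begin
      Buffer.add_char b s.[i];
      loop (i + 1)
    end
    else if i + 1 >= len then None (* lone trailing backslash *)
    else
      match simple_escape s.[i + 1] with
      | Some c ->
          Buffer.add_char b c;
          loop (i + 2)
      | None ->
          if s.[i + 1] = 'x' && i + 3 < len then begin
            match hex_value s.[i + 2], hex_value s.[i + 3] with
            | Some hi, Some lo ->
                Buffer.add_char b (Char.chr ((hi lsl 4) lor lo));
                loop (i + 4)
            | _ -> None
          end
          else None (* unknown or truncated escape *)
  in
  loop 0

let () =
  assert (unescape_string_sketch "it\\'s" = Some "it's");
  assert (unescape_string_sketch "a\\tb" = Some "a\tb");
  assert (unescape_string_sketch "\\x41" = Some "A");
  assert (unescape_string_sketch "\\q" = None)
```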