Module Char.Ascii
US-ASCII character support
The following functions act only on US-ASCII code points, that is on the bytes in range [0x00;0x7F]. The functions can be safely used on UTF-8 encoded strings; they will of course only deal with US-ASCII related matters.
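The UTF-8 remark can be illustrated with a small stdlib-only sketch. It assumes a local `is_digit` that mirrors the predicate documented below; `is_digit` and `count_digits` are hypothetical names defined here for illustration, not this module's source. Every byte of a multi-byte UTF-8 sequence is above 0x7F, so a byte-wise US-ASCII scan never matches inside one.

```ocaml
(* Local stand-in for the documented predicate: true iff the byte
   lies in the range [0x30;0x39]. *)
let is_digit c = '0' <= c && c <= '9'

(* Count US-ASCII digits by scanning the string byte by byte. *)
let count_digits s =
  let n = ref 0 in
  String.iter (fun c -> if is_digit c then incr n) s;
  !n

let () =
  (* "é" is encoded in UTF-8 as the two bytes 0xC3 0xA9, both above
     0x7F, so only '4' and '2' are counted. *)
  assert (count_digits "caf\xC3\xA9 42" = 2)
```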
References.
- Vint Cerf. ASCII format for Network Interchange. RFC 20, 1969.
Predicates
val is_valid : char -> bool
is_valid c is true iff c is a US-ASCII character, that is a byte in the range [0x00;0x7F].
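As a rough sketch of how such a check might be used, the following stdlib-only code tests whether a whole string is US-ASCII. The local `is_valid` only mirrors the documented range, and `is_ascii_string` is a hypothetical helper that is not part of this module.

```ocaml
(* Mirrors the documented range [0x00;0x7F]; a local stand-in. *)
let is_valid c = Char.code c <= 0x7F

(* Hypothetical helper: true iff every byte of [s] is US-ASCII. *)
let is_ascii_string s =
  let len = String.length s in
  let rec loop i = i >= len || (is_valid s.[i] && loop (i + 1)) in
  loop 0

let () =
  assert (is_ascii_string "hello");
  (* The UTF-8 encoding of "é" contains bytes above 0x7F. *)
  assert (not (is_ascii_string "h\xC3\xA9llo"))
```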
val is_digit : char -> bool
is_digit c is true iff c is a US-ASCII digit '0'...'9', that is a byte in the range [0x30;0x39].
val is_hex_digit : char -> bool
is_hex_digit c is true iff c is a US-ASCII hexadecimal digit '0'...'9', 'a'...'f', 'A'...'F', that is a byte in one of the ranges [0x30;0x39], [0x41;0x46], [0x61;0x66].
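The three ranges can be written out directly. This is a minimal sketch of a predicate with the documented behaviour, defined locally with the standard library only; it is not claimed to be the module's actual implementation.

```ocaml
(* One comparison pair per documented range. *)
let is_hex_digit c =
  ('0' <= c && c <= '9')       (* [0x30;0x39] *)
  || ('A' <= c && c <= 'F')    (* [0x41;0x46] *)
  || ('a' <= c && c <= 'f')    (* [0x61;0x66] *)

let () =
  assert (is_hex_digit '7');
  assert (is_hex_digit 'c');
  assert (is_hex_digit 'E');
  (* 'g' (0x67) is just past the lowercase range. *)
  assert (not (is_hex_digit 'g'))
```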
val is_upper : char -> bool
is_upper c is true iff c is a US-ASCII uppercase letter 'A'...'Z', that is a byte in the range [0x41;0x5A].
val is_lower : char -> bool
is_lower c is true iff c is a US-ASCII lowercase letter 'a'...'z', that is a byte in the range [0x61;0x7A].
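A short sketch of the two letter predicates, written locally against the documented ranges (these definitions are stand-ins, not the module's source). It also shows that a byte outside US-ASCII is never classified as a letter, whatever it means in another encoding.

```ocaml
let is_upper c = 'A' <= c && c <= 'Z'   (* [0x41;0x5A] *)
let is_lower c = 'a' <= c && c <= 'z'   (* [0x61;0x7A] *)

let () =
  assert (is_upper 'Q' && not (is_lower 'Q'));
  assert (is_lower 'q' && not (is_upper 'q'));
  (* 0xC9 is an uppercase E with acute accent in Latin-1, but it is
     not a US-ASCII letter. *)
  assert (not (is_upper '\xC9'))
```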
val is_white : char -> bool
is_white c is true iff c is a US-ASCII white space character, that is one of space ' ' (0x20), tab '\t' (0x09), newline '\n' (0x0A), vertical tab (0x0B), form feed (0x0C), carriage return '\r' (0x0D).
val is_blank : char -> bool
is_blank c is true iff c is a US-ASCII blank character, that is either space ' ' (0x20) or tab '\t' (0x09).
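The difference between white space and blank characters is easy to see side by side. The sketch below defines local stand-ins that follow the two descriptions above; they illustrate the documented behaviour and are not taken from the module's source.

```ocaml
(* The six white space bytes listed above. *)
let is_white = function
  | ' ' | '\t' | '\n' | '\x0B' | '\x0C' | '\r' -> true
  | _ -> false

(* Blank characters are the subset that stays on the current line. *)
let is_blank = function
  | ' ' | '\t' -> true
  | _ -> false

let () =
  (* Newline is white space but not blank. *)
  assert (is_white '\n' && not (is_blank '\n'));
  (* Tab is both. *)
  assert (is_white '\t' && is_blank '\t')
```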
Casing transforms
Escaping to printable US-ASCII
val escape : char -> string
escape c escapes c with:
- '\\' (0x5C) escaped to the sequence "\\\\" (0x5C,0x5C).
- Any byte in the ranges [0x00;0x1F] and [0x7F;0xFF] escaped by a hexadecimal "\xHH" escape with H a capital hexadecimal digit. These bytes are the US-ASCII control characters and non US-ASCII bytes.
- Any other byte is left unchanged.
Use String.Ascii.unescape to unescape.
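The rules listed for escape can be sketched with the standard library alone. The function below is a local stand-in that follows those rules as stated; it is an illustration under that assumption and not the module's implementation.

```ocaml
(* Local sketch following the three documented rules. *)
let escape c =
  match c with
  | '\\' -> "\\\\"
  | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
      (* Two capital hexadecimal digits after a backslash and 'x'. *)
      Printf.sprintf "\\x%02X" (Char.code c)
  | c -> String.make 1 c

let () =
  assert (escape '\\' = "\\\\");
  (* Newline is a control character, so it gets a hexadecimal escape. *)
  assert (escape '\n' = "\\x0A");
  assert (escape '\xFF' = "\\xFF");
  assert (escape 'a' = "a")
```

Note that under these rules every result consists of printable US-ASCII bytes only, which is what makes the output safe to embed in logs or text formats.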
val escape_char : char -> string
escape_char c is like escape except it escapes c according to OCaml's lexical conventions for characters with:
- '\b' (0x08) escaped to the sequence "\\b" (0x5C,0x62).
- '\t' (0x09) escaped to the sequence "\\t" (0x5C,0x74).
- '\n' (0x0A) escaped to the sequence "\\n" (0x5C,0x6E).
- '\r' (0x0D) escaped to the sequence "\\r" (0x5C,0x72).
- '\'' (0x27) escaped to the sequence "\\'" (0x5C,0x27).
- Other bytes follow the rules of escape.
Use String.Ascii.unescape_string to unescape.
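A stdlib-only sketch of these rules follows. It is a local illustration that assumes the escape rules exactly as listed in this section (named escapes first, then backslash doubling, then hexadecimal escapes for control and non US-ASCII bytes); it is not the module's implementation.

```ocaml
(* Local sketch: named OCaml character escapes take priority, the
   remaining bytes follow the rules documented for escape. *)
let escape_char c =
  match c with
  | '\b' -> "\\b"
  | '\t' -> "\\t"
  | '\n' -> "\\n"
  | '\r' -> "\\r"
  | '\'' -> "\\'"
  | '\\' -> "\\\\"
  | '\x00' .. '\x1F' | '\x7F' .. '\xFF' ->
      Printf.sprintf "\\x%02X" (Char.code c)
  | c -> String.make 1 c

let () =
  assert (escape_char '\n' = "\\n");
  assert (escape_char '\'' = "\\'");
  (* Control characters without a named escape still use "\xHH". *)
  assert (escape_char '\x00' = "\\x00");
  assert (escape_char 'a' = "a")
```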