Topkg.String
Strings.
include module type of String
make n c
is a string of length n
with each index holding the character c
.
init n f
is a string of length n
with index i
holding the character f i
(called in increasing index order).
get s i
is the character at index i
in s
. This is the same as writing s.[i]
.
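As a small illustration of make, init and get, here is a sketch that uses the Stdlib String module this module includes. The names stars, digits and third exist only for this example.

```ocaml
(* Three copies of the same character. *)
let stars = String.make 3 '*'

(* Index i holds the digit character for i. *)
let digits = String.init 5 (fun i -> Char.chr (Char.code '0' + i))

(* String.get s i and s.[i] read the same character. *)
let third = String.get digits 2
let () = assert (third = digits.[2])
```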
Note. The Stdlib.( ^ )
binary operator concatenates two strings.
concat sep ss
concatenates the list of strings ss
, inserting the separator string sep
between each.
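A short sketch of concat next to the ( ^ ) operator, written against the Stdlib String module. The binding names are illustrative only.

```ocaml
(* The separator goes between elements, never before the first
   or after the last one. *)
let csv = String.concat ", " [ "a"; "b"; "c" ]

(* A single-element list yields that element unchanged. *)
let single = String.concat ", " [ "a" ]

(* ( ^ ) joins exactly two strings. *)
let joined = "foo" ^ "bar"
```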
compare s0 s1
sorts s0
and s1
in lexicographical order. compare
behaves like Stdlib.compare
on strings but may be more efficient.
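Because compare returns a negative, zero or positive integer, it can be passed directly to sorting functions. A minimal sketch, with illustrative binding names:

```ocaml
(* Sort a list of strings in lexicographical (byte-wise) order. *)
let sorted = List.sort String.compare [ "pear"; "apple"; "fig" ]

(* A proper prefix sorts before the longer string. *)
let sign = String.compare "apple" "apples"
```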
contains_from s start c
is true
if and only if c
appears in s
after position start
.
rcontains_from s stop c
is true
if and only if c
appears in s
before position stop+1
.
contains s c
is String.contains_from
s 0 c
.
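The following sketch shows how the position bounds behave with the Stdlib functions. Note that, in the Stdlib implementation, the search of contains_from includes the index start itself. The binding names are illustrative.

```ocaml
let s = "hello"

(* Index 2 holds 'l', so the search starting at 2 succeeds. *)
let from_2 = String.contains_from s 2 'l'

(* 'h' only occurs at index 0, which lies before position 1. *)
let from_1 = String.contains_from s 1 'h'

(* 'o' only occurs at index 4, which lies after position 2. *)
let upto_2 = String.rcontains_from s 2 'o'

(* contains searches the whole string. *)
let anywhere = String.contains s 'o'
```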
sub s pos len
is a string of length len
, containing the substring of s
that starts at position pos
and has length len
.
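A minimal sketch of sub with the Stdlib String module. In the Stdlib, a pos and len pair that does not designate a valid substring raises Invalid_argument, which the last binding captures as None. The names are illustrative.

```ocaml
let s = "hello world"

(* Five characters starting at index 6. *)
let word = String.sub s 6 5

(* A zero-length substring at the end of the string is valid. *)
let empty = String.sub s 11 0

(* Asking for more characters than remain is rejected. *)
let out_of_range =
  try Some (String.sub s 8 10) with Invalid_argument _ -> None
```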
split_on_char sep s
is the list of all (possibly empty) substrings of s
that are delimited by the character sep
.
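To make the "possibly empty" part concrete, here is a small sketch with the Stdlib String.split_on_char. It also rebuilds the input by concatenating the pieces with the separator. The binding names are illustrative.

```ocaml
(* Adjacent and trailing separators produce empty substrings. *)
let fields = String.split_on_char ',' "a,,b,"

(* Without any separator the result is the input as a single element. *)
let no_sep = String.split_on_char ',' "abc"

(* Joining the pieces with the separator gives back the input. *)
let round_trip = String.concat "," fields
```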
The function's result is specified by the following invariants:
The list is not empty.
Concatenating its elements using sep
as a separator returns a string equal to the input (concat (make 1 sep)
(split_on_char sep s) = s
).
No string in the result contains the sep
character.
map f s
is the string resulting from applying f
to all the characters of s
in increasing order.
mapi f s
is like map
but the index of the character is also passed to f
.
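A short sketch of map and mapi using the Stdlib String module. The binding names are illustrative.

```ocaml
(* Apply the same function to every character. *)
let upper = String.map Char.uppercase_ascii "abc"

(* mapi also receives the index: uppercase only the even positions. *)
let alternating =
  String.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"
```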
trim s
is s
without leading and trailing whitespace. Whitespace characters are: ' '
, '\x0C'
(form feed), '\n'
, '\r'
, and '\t'
.
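For instance, with the Stdlib String.trim (binding names are illustrative):

```ocaml
(* Leading and trailing whitespace is removed; inner whitespace stays. *)
let trimmed = String.trim "  \t hi there \n"

(* A string made only of whitespace trims to the empty string. *)
let blank = String.trim " \n "
```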
escaped s
is s
with special characters represented by escape sequences, following the lexical conventions of OCaml.
All characters outside the US-ASCII printable range [0x20;0x7E] are escaped, as well as backslash (0x5C) and double-quote (0x22).
The function Scanf.unescaped
is a left inverse of escaped
, i.e. Scanf.unescaped (escaped s) = s
for any string s
(unless escaped s
fails).
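The round trip can be sketched as follows with the Stdlib String.escaped and Scanf.unescaped. The quoted string literal {|...|} is used so that the expected escaped text can be written without a second level of escaping. The binding names are illustrative.

```ocaml
(* Six characters: a, double quote, b, backslash, c, newline. *)
let raw = "a\"b\\c\n"

(* The quote, the backslash and the newline each become a
   two-character escape sequence. *)
let esc = String.escaped raw

(* Scanf.unescaped undoes the escaping. *)
let back = Scanf.unescaped esc
```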
uppercase_ascii s
is s
with all lowercase letters translated to uppercase, using the US-ASCII character set.
lowercase_ascii s
is s
with all uppercase letters translated to lowercase, using the US-ASCII character set.
capitalize_ascii s
is s
with the first character set to uppercase, using the US-ASCII character set.
uncapitalize_ascii s
is s
with the first character set to lowercase, using the US-ASCII character set.
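A brief sketch of the four US-ASCII case functions, using the Stdlib String module. The binding names are illustrative.

```ocaml
let upper = String.uppercase_ascii "Hello World"
let lower = String.lowercase_ascii "Hello World"

(* Only the first character is affected. *)
let cap = String.capitalize_ascii "hello world"
let uncap = String.uncapitalize_ascii "Hello World"
```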
iter f s
applies function f
in turn to all the characters of s
. It is equivalent to f s.[0]; f s.[1]; ...; f s.[length s - 1]; ()
.
iteri
is like iter
, but the function is also given the corresponding character index.
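Since iter and iteri return unit, they are used for side effects. The sketch below collects those effects in a Buffer so that the order of calls is visible. The names buf and trace are illustrative.

```ocaml
let buf = Buffer.create 16

(* iter visits the characters from index 0 upwards. *)
let () =
  String.iter (fun c -> Buffer.add_char buf (Char.uppercase_ascii c)) "abc"

(* iteri additionally passes the index of each character. *)
let () =
  String.iteri
    (fun i c -> Buffer.add_string buf (Printf.sprintf " %d:%c" i c))
    "xy"

let trace = Buffer.contents buf
```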
index_from s i c
is the index of the first occurrence of c
in s
after position i
.
index_from_opt s i c
is the index of the first occurrence of c
in s
after position i
(if any).
rindex_from s i c
is the index of the last occurrence of c
in s
before position i+1
.
rindex_from_opt s i c
is the index of the last occurrence of c
in s
before position i+1
(if any).
index s c
is String.index_from
s 0 c
.
index_opt s c
is String.index_from_opt
s 0 c
.
rindex s c
is String.rindex_from
s (length s - 1) c
.
rindex_opt s c
is String.rindex_from_opt
s (length s - 1) c
.
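The index family can be compared side by side in a small sketch based on the Stdlib String module. In the Stdlib, the variants without the _opt suffix raise Not_found when the character is absent, which the last binding checks. The binding names are illustrative.

```ocaml
(* Indices: 0 'a', 1 '.', 2 'b', 3 '.', 4 'c' *)
let s = "a.b.c"

let first = String.index s '.'
let next = String.index_from s 2 '.'
let last = String.rindex s '.'
let prev = String.rindex_from s 2 '.'

(* The _opt variants return None instead of raising. *)
let missing = String.index_opt s '/'

let raised =
  try
    ignore (String.index s '/');
    false
  with Not_found -> true
```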
to_seq s
is a sequence made of the string's characters in increasing order. In "unsafe-string"
mode, modifications of the string during iteration will be reflected in the iterator.
to_seqi s
is like to_seq
but also tuples the corresponding index.
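A minimal sketch that turns both sequences into lists to show their contents, assuming an OCaml version that provides Seq (4.07 or later). The binding names are illustrative.

```ocaml
(* Characters in increasing index order. *)
let chars = List.of_seq (String.to_seq "abc")

(* Each character paired with its index. *)
let indexed = List.of_seq (String.to_seqi "ab")
```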
create n
returns a fresh byte sequence of length n
. The sequence is uninitialized and contains arbitrary bytes.
set s n c
modifies byte sequence s
in place, replacing the byte at index n
with c
. You can also write s.[n] <- c
instead of set s n c
.
blit src src_pos dst dst_pos len
copies len
bytes from the string src
, starting at index src_pos
, to byte sequence dst
, starting at character number dst_pos
.
fill s pos len c
modifies byte sequence s
in place, replacing len
bytes by c
, starting at pos
.
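The in-place functions above operate on byte sequences. Recent OCaml versions deprecate the String versions of create, set and fill in favor of the Bytes module, so the sketch below uses the Bytes equivalents together with String.blit. The name buf and the chosen contents are illustrative.

```ocaml
(* Start from an initialized buffer rather than an uninitialized one. *)
let buf = Bytes.make 5 '-'

(* Copy 3 bytes of "abc" into buf starting at index 1: "-abc-" *)
let () = String.blit "abc" 0 buf 1 3

(* Replace the byte at index 0: "[abc-" *)
let () = Bytes.set buf 0 '['

(* Overwrite 1 byte starting at index 4: "[abc]" *)
let () = Bytes.fill buf 4 1 ']'

let result = Bytes.to_string buf
```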
Return a copy of the argument, with all lowercase letters translated to uppercase, including accented letters of the ISO Latin-1 (8859-1) character set.
Return a copy of the argument, with all uppercase letters translated to lowercase, including accented letters of the ISO Latin-1 (8859-1) character set.
Return a copy of the argument, with the first character set to uppercase, using the ISO Latin-1 (8859-1) character set.
is_prefix ~affix s
is true
iff affix.[i] = s.[i]
for all indices i
of affix
.
is_suffix ~affix s
is true iff affix.[n - i] = s.[m - i]
for all indices i
of affix
with n = String.length affix - 1
and m =
String.length s - 1
.
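The two definitions above can be approximated with the Stdlib alone. This is a sketch that follows the stated specification, not the source of Topkg.String. The definitions is_prefix and is_suffix below are local stand-ins written for this example.

```ocaml
(* affix is a prefix of s when s starts with the bytes of affix. *)
let is_prefix ~affix s =
  let n = String.length affix in
  n <= String.length s && String.sub s 0 n = affix

(* affix is a suffix of s when s ends with the bytes of affix. *)
let is_suffix ~affix s =
  let n = String.length affix and m = String.length s in
  n <= m && String.sub s (m - n) n = affix
```

With these definitions the empty string is a prefix and a suffix of every string, which matches the "for all indices i of affix" wording.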
for_all p s
is true
iff for all indices i
of s
, p s.[i]
= true
.
exists p s
is true
iff there exists an index i
of s
with p s.[i] = true
.
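A Stdlib-only sketch of the two predicates as specified above. It is an illustration of the specification, not Topkg's implementation, and the helper is_digit is hypothetical.

```ocaml
(* True when p holds at every index. Vacuously true on "". *)
let for_all p s =
  let rec loop i = i >= String.length s || (p s.[i] && loop (i + 1)) in
  loop 0

(* True when p holds at some index. False on "". *)
let exists p s = not (for_all (fun c -> not (p c)) s)

(* Hypothetical predicate used for the example. *)
let is_digit c = '0' <= c && c <= '9'

let all_digits = for_all is_digit "2024"
let some_digit = exists is_digit "v1"
```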
with_index_range ~first ~last s
are the consecutive bytes of s
whose indices exist in the range [first
;last
].
first
defaults to 0
and last to String.length s - 1
.
Note that both first
and last
can be any integer. If first > last
the interval is empty and the empty string is returned.
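One way to read "indices that exist in the range" is to clamp the range to the valid indices of s before extracting. The sketch below does that with the Stdlib only. It is an interpretation of the description, not Topkg's source, and it assumes first and last are optional arguments as the stated defaults suggest.

```ocaml
let with_index_range ?(first = 0) ?last s =
  let max_idx = String.length s - 1 in
  (* Clamp both bounds to the indices that actually exist in s. *)
  let last = match last with None -> max_idx | Some l -> min l max_idx in
  let first = max first 0 in
  if first > last then "" else String.sub s first (last - first + 1)

let middle = with_index_range ~first:1 ~last:3 "hello"
let clamped = with_index_range ~first:(-5) ~last:100 "hello"
let empty = with_index_range ~first:3 ~last:1 "hello"
let tail = with_index_range ~first:2 "hello"
```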
cut ~sep s
is either the pair Some (l,r)
of the two (possibly empty) substrings of s
that are delimited by the first match of the separator character sep
or None
if sep
can't be matched in s
. Matching starts from the beginning of s
(rev
is false
, default) or the end (rev
is true
).
The invariant l ^ (String.make 1 sep) ^ r = s
holds.
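The behavior described above can be sketched with the Stdlib functions index_opt and rindex_opt. This is an illustration under the assumption that rev is an optional boolean argument, not Topkg's implementation.

```ocaml
let cut ?(rev = false) ~sep s =
  (* Search from the start by default, from the end when rev is true. *)
  let find = if rev then String.rindex_opt else String.index_opt in
  match find s sep with
  | None -> None
  | Some i ->
      let l = String.sub s 0 i in
      let r = String.sub s (i + 1) (String.length s - i - 1) in
      Some (l, r)

let forward = cut ~sep:'=' "key=a=b"
let backward = cut ~rev:true ~sep:'=' "key=a=b"
let absent = cut ~sep:'=' "none"
```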
cuts ~sep s
is the list of all substrings of s
that are delimited by matches of sep
. Empty substrings are omitted from the list if empty
is false
(defaults to true
). The invariant String.concat (String.make 1 sep) (cuts ~sep s) = s
holds.
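A rough approximation of this function can be built on the Stdlib split_on_char. The sketch assumes that empty is an optional boolean argument and may differ from Topkg's behavior on edge cases.

```ocaml
let cuts ?(empty = true) ~sep s =
  let parts = String.split_on_char sep s in
  (* Drop empty substrings only when the caller asks for it. *)
  if empty then parts else List.filter (fun p -> p <> "") parts

let with_empty = cuts ~sep:',' "a,,b"
let without_empty = cuts ~empty:false ~sep:',' "a,,b"
```

Joining the pieces with the separator recovers the input only when empty substrings are kept.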
parse_version
parses version strings of the form:
"[v]major.minor[.patchlevel][+additional-info]"
into (major, minor, patch, additional_info)
tuples.
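To illustrate the format, here is a simplified Stdlib-only parser for the form shown above. It is a sketch, not Topkg's parse_version: the option-typed result, the default patch level of 0 and the handling of malformed input are assumptions made for this example.

```ocaml
let parse_version v =
  (* Drop the optional leading 'v'. *)
  let v =
    if v <> "" && v.[0] = 'v' then String.sub v 1 (String.length v - 1)
    else v
  in
  (* Split off the optional "+additional-info" part. *)
  let nums, info =
    match String.index_opt v '+' with
    | None -> (v, None)
    | Some i ->
        (String.sub v 0 i, Some (String.sub v (i + 1) (String.length v - i - 1)))
  in
  (* Expect two or three integer components separated by dots. *)
  match List.map int_of_string_opt (String.split_on_char '.' nums) with
  | [ Some major; Some minor ] -> Some (major, minor, 0, info)
  | [ Some major; Some minor; Some patch ] -> Some (major, minor, patch, info)
  | _ -> None

let full = parse_version "v1.2.3+git"
let short = parse_version "4.14"
let bad = parse_version "abc"
```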