Module B0_std.String
Strings.
String
include module type of Stdlib.String
val get : string -> int -> char
String.get s n returns the character at index n in string s. You can also write s.[n] instead of String.get s n.
Raise Invalid_argument if n is not a valid index in s.
val set : bytes -> int -> char -> unit
String.set s n c modifies byte sequence s in place, replacing the byte at index n with c. You can also write s.[n] <- c instead of String.set s n c.
Raise Invalid_argument if n is not a valid index in s.
- deprecated
This is a deprecated alias of Bytes.set.
val create : int -> bytes
String.create n returns a fresh byte sequence of length n. The sequence is uninitialized and contains arbitrary bytes.
Raise Invalid_argument if n < 0 or n > Sys.max_string_length.
- deprecated
This is a deprecated alias of Bytes.create.
val make : int -> char -> string
String.make n c returns a fresh string of length n, filled with the character c.
Raise Invalid_argument if n < 0 or n > Sys.max_string_length.
val init : int -> (int -> char) -> string
String.init n f returns a string of length n, with character i initialized to the result of f i (called in increasing index order).
Raise Invalid_argument if n < 0 or n > Sys.max_string_length.
- since
- 4.02.0
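As a small illustration of make and init, the sketch below builds two strings. It uses Stdlib.String directly, on the assumption that the functions included here behave identically; the value names are illustrative only.

```ocaml
(* Three copies of the same character. *)
let dashes = String.make 3 '-'

(* Character i is computed from its index: 'a' for 0, 'b' for 1, ... *)
let alphabet = String.init 26 (fun i -> Char.chr (Char.code 'a' + i))
```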
val copy : string -> string
Return a copy of the given string.
- deprecated
Because strings are immutable, it doesn't make much sense to make identical copies of them.
val sub : string -> int -> int -> string
String.sub s start len returns a fresh string of length len, containing the substring of s that starts at position start and has length len.
Raise Invalid_argument if start and len do not designate a valid substring of s.
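The following sketch shows sub on a valid range and one way to turn the Invalid_argument case into an option. sub_opt is a hypothetical helper written for this example, not part of the module.

```ocaml
let s = "hello world"

(* Five bytes starting at index 6. *)
let word = String.sub s 6 5

(* Hypothetical helper: None when start and len do not designate a valid
   substring of s. *)
let sub_opt s start len =
  match String.sub s start len with
  | r -> Some r
  | exception Invalid_argument _ -> None
```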
val fill : bytes -> int -> int -> char -> unit
String.fill s start len c modifies byte sequence s in place, replacing len bytes with c, starting at start.
Raise Invalid_argument if start and len do not designate a valid range of s.
- deprecated
This is a deprecated alias of Bytes.fill.
val concat : string -> string list -> string
String.concat sep sl concatenates the list of strings sl, inserting the separator string sep between each.
Raise Invalid_argument if the result is longer than Sys.max_string_length bytes.
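A brief illustration of concat, including the one-element and empty-list cases where no separator is emitted. The value names are illustrative.

```ocaml
let csv = String.concat "," ["a"; "b"; "c"]

(* A single element yields that element; an empty list yields "". *)
let single = String.concat "," ["a"]
let empty = String.concat "," []
```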
val iter : (char -> unit) -> string -> unit
String.iter f s applies function f in turn to all the characters of s. It is equivalent to f s.[0]; f s.[1]; ...; f s.[String.length s - 1]; ().
val iteri : (int -> char -> unit) -> string -> unit
Same as String.iter, but the function is applied to the index of the element as first argument (counting from 0), and the character itself as second argument.
- since
- 4.00.0
val map : (char -> char) -> string -> string
String.map f s applies function f in turn to all the characters of s (in increasing index order) and stores the results in a new string that is returned.
- since
- 4.00.0
val mapi : (int -> char -> char) -> string -> string
String.mapi f s calls f with each character of s and its index (in increasing index order) and stores the results in a new string that is returned.
- since
- 4.02.0
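To make the difference between map and mapi concrete, here is a minimal sketch: map sees only the character, while mapi also receives its index. The value names are illustrative.

```ocaml
(* Replace every space by a dash. *)
let dashed = String.map (fun c -> if c = ' ' then '-' else c) "a b c"

(* Uppercase the characters at even indices. *)
let alternating =
  String.mapi
    (fun i c -> if i mod 2 = 0 then Char.uppercase_ascii c else c)
    "hello"
```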
val trim : string -> string
Return a copy of the argument, without leading and trailing whitespace. The characters regarded as whitespace are: ' ', '\012', '\n', '\r', and '\t'. If there is neither leading nor trailing whitespace character in the argument, return the original string itself, not a copy.
- since
- 4.00.0
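A short illustration of trim: only leading and trailing whitespace is removed, so inner runs of spaces survive. The value names are illustrative.

```ocaml
let trimmed = String.trim "  \t hello world \n"

(* Inner whitespace is left alone. *)
let inner = String.trim "a  b"

(* A string made only of whitespace trims to the empty string. *)
let blank = String.trim " \r\n"
```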
val escaped : string -> string
Return a copy of the argument, with special characters represented by escape sequences, following the lexical conventions of OCaml. All characters outside the ASCII printable range (32..126) are escaped, as well as backslash and double-quote.
If there is no special character in the argument that needs escaping, return the original string itself, not a copy.
Raise Invalid_argument if the result is longer than Sys.max_string_length bytes.
The function Scanf.unescaped is a left inverse of escaped, i.e. Scanf.unescaped (escaped s) = s for any string s (unless escaped s fails).
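The left-inverse property can be checked on a small input. The sketch below escapes a string containing a tab, double quotes and a backslash, then recovers it with Scanf.unescaped; the value names are illustrative.

```ocaml
let raw = "tab\there \"quoted\" back\\slash"

(* The tab becomes the two bytes \t; quotes and the backslash each gain a
   leading backslash. *)
let esc = String.escaped raw

(* Scanf.unescaped undoes the escaping. *)
let round_trip = Scanf.unescaped esc
```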
val index : string -> char -> int
String.index s c returns the index of the first occurrence of character c in string s.
Raise Not_found if c does not occur in s.
val index_opt : string -> char -> int option
String.index_opt s c returns the index of the first occurrence of character c in string s, or None if c does not occur in s.
- since
- 4.05
val rindex : string -> char -> int
String.rindex s c returns the index of the last occurrence of character c in string s.
Raise Not_found if c does not occur in s.
val rindex_opt : string -> char -> int option
String.rindex_opt s c returns the index of the last occurrence of character c in string s, or None if c does not occur in s.
- since
- 4.05
val index_from : string -> int -> char -> int
String.index_from s i c returns the index of the first occurrence of character c in string s after position i. String.index s c is equivalent to String.index_from s 0 c.
Raise Invalid_argument if i is not a valid position in s. Raise Not_found if c does not occur in s after position i.
val index_from_opt : string -> int -> char -> int option
String.index_from_opt s i c returns the index of the first occurrence of character c in string s after position i or None if c does not occur in s after position i. String.index_opt s c is equivalent to String.index_from_opt s 0 c.
Raise Invalid_argument if i is not a valid position in s.
- since
- 4.05
val rindex_from : string -> int -> char -> int
String.rindex_from s i c returns the index of the last occurrence of character c in string s before position i+1. String.rindex s c is equivalent to String.rindex_from s (String.length s - 1) c.
Raise Invalid_argument if i+1 is not a valid position in s. Raise Not_found if c does not occur in s before position i+1.
val rindex_from_opt : string -> int -> char -> int option
String.rindex_from_opt s i c returns the index of the last occurrence of character c in string s before position i+1 or None if c does not occur in s before position i+1. String.rindex_opt s c is equivalent to String.rindex_from_opt s (String.length s - 1) c.
Raise Invalid_argument if i+1 is not a valid position in s.
- since
- 4.05
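The index functions are easiest to compare side by side. In the sketch below, note that the search of index_from_opt includes position i itself, and that of rindex_from_opt includes position i, which is consistent with the stated equivalences to index_opt and rindex_opt. The value names are illustrative.

```ocaml
(* Slashes are at indices 3 and 9. *)
let path = "usr/local/bin"

let first_slash = String.index_opt path '/'
let last_slash = String.rindex_opt path '/'

(* Searching from index 3 finds the slash at 3; from index 4, the next one. *)
let from_3 = String.index_from_opt path 3 '/'
let from_4 = String.index_from_opt path 4 '/'

(* Searching backwards from index 8 skips the slash at 9. *)
let back_from_8 = String.rindex_from_opt path 8 '/'

let missing = String.index_opt path ':'
```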
val contains : string -> char -> bool
String.contains s c tests if character c appears in the string s.
val contains_from : string -> int -> char -> bool
String.contains_from s start c tests if character c appears in s after position start. String.contains s c is equivalent to String.contains_from s 0 c.
Raise Invalid_argument if start is not a valid position in s.
val rcontains_from : string -> int -> char -> bool
String.rcontains_from s stop c tests if character c appears in s before position stop+1.
Raise Invalid_argument if stop < 0 or stop+1 is not a valid position in s.
val uppercase : string -> string
Return a copy of the argument, with all lowercase letters translated to uppercase, including accented letters of the ISO Latin-1 (8859-1) character set.
- deprecated
Functions operating on Latin-1 character set are deprecated.
val lowercase : string -> string
Return a copy of the argument, with all uppercase letters translated to lowercase, including accented letters of the ISO Latin-1 (8859-1) character set.
- deprecated
Functions operating on Latin-1 character set are deprecated.
val capitalize : string -> string
Return a copy of the argument, with the first character set to uppercase, using the ISO Latin-1 (8859-1) character set.
- deprecated
Functions operating on Latin-1 character set are deprecated.
val uncapitalize : string -> string
Return a copy of the argument, with the first character set to lowercase, using the ISO Latin-1 (8859-1) character set.
- deprecated
Functions operating on Latin-1 character set are deprecated.
val uppercase_ascii : string -> string
Return a copy of the argument, with all lowercase letters translated to uppercase, using the US-ASCII character set.
- since
- 4.03.0
val lowercase_ascii : string -> string
Return a copy of the argument, with all uppercase letters translated to lowercase, using the US-ASCII character set.
- since
- 4.03.0
val capitalize_ascii : string -> string
Return a copy of the argument, with the first character set to uppercase, using the US-ASCII character set.
- since
- 4.03.0
val uncapitalize_ascii : string -> string
Return a copy of the argument, with the first character set to lowercase, using the US-ASCII character set.
- since
- 4.03.0
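The _ascii variants only touch US-ASCII letters. The sketch below uses a string containing the two UTF-8 bytes of "é" (written as \xC3\xA9) to show that bytes outside US-ASCII pass through unchanged; the value names are illustrative.

```ocaml
(* Only the ASCII letters h, l, l, o are uppercased. *)
let upper = String.uppercase_ascii "h\xC3\xA9llo"

let lower = String.lowercase_ascii "HeLLo"
let cap = String.capitalize_ascii "hello"
let uncap = String.uncapitalize_ascii "Hello"
```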
val compare : t -> t -> int
The comparison function for strings, with the same specification as Stdlib.compare. Along with the type t, this function compare allows the module String to be passed as argument to the functors Set.Make and Map.Make.
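As a sketch of the functor application mentioned above, the code below passes Stdlib.String to Set.Make and Map.Make. The module names SSet and SMap are chosen for this example only.

```ocaml
module SSet = Set.Make (String)
module SMap = Map.Make (String)

(* Duplicates collapse and elements come back in compare order. *)
let sorted = SSet.elements (SSet.of_list ["b"; "a"; "b"])

let ages = SMap.add "ada" 36 SMap.empty
```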
val split_on_char : char -> string -> string list
String.split_on_char sep s returns the list of all (possibly empty) substrings of s that are delimited by the sep character.
The function's output is specified by the following invariants:
- The list is not empty.
- Concatenating its elements using sep as a separator returns a string equal to the input (String.concat (String.make 1 sep) (String.split_on_char sep s) = s).
- No string in the result contains the sep character.
- since
- 4.04.0
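The invariants above can be observed on a few inputs. In this sketch, rejoin is a hypothetical helper that applies the concatenation invariant.

```ocaml
(* Adjacent and trailing separators produce empty substrings. *)
let fields = String.split_on_char ',' "a,,b,"

(* Without any separator, and even for "", the list has one element. *)
let no_sep = String.split_on_char ',' "abc"
let of_empty = String.split_on_char ',' ""

(* Hypothetical helper: splitting then concatenating gives back the input. *)
let rejoin sep s =
  String.concat (String.make 1 sep) (String.split_on_char sep s)
```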
Iterators
val to_seq : t -> char Stdlib.Seq.t
Iterate on the string, in increasing index order. Modifications of the string during iteration will be reflected in the iterator.
- since
- 4.07
val to_seqi : t -> (int * char) Stdlib.Seq.t
Iterate on the string, in increasing index order, yielding indices along with characters.
- since
- 4.07
val of_seq : char Stdlib.Seq.t -> t
Create a string from the generator
- since
- 4.07
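A minimal sketch of the iterators: to_seq and of_seq compose with the Seq combinators, here to filter a string. digits_only is a hypothetical helper for this example.

```ocaml
(* Hypothetical helper: keep only the ASCII digits of s. *)
let digits_only s =
  String.of_seq
    (Seq.filter (fun c -> c >= '0' && c <= '9') (String.to_seq s))

(* to_seqi pairs each character with its index. *)
let indexed = List.of_seq (String.to_seqi "ab")
```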
Predicates
val is_prefix : affix:string -> string -> bool
is_prefix ~affix s is true iff affix.[i] = s.[i] for all indices i of affix.
val is_infix : affix:string -> string -> bool
is_infix ~affix s is true iff there exists an index j such that for all indices i of affix, affix.[i] = s.[j + i].
val is_suffix : affix:string -> string -> bool
is_suffix ~affix s is true iff affix.[n - i] = s.[m - i] for all indices i of affix, with n = String.length affix - 1 and m = String.length s - 1.
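The three predicates can be sketched with Stdlib functions only. This is a reference sketch of the documented semantics, not the module's actual implementation, and it assumes that an affix longer than s never matches and that the empty affix always does.

```ocaml
let is_prefix ~affix s =
  let n = String.length affix in
  n <= String.length s && String.sub s 0 n = affix

let is_suffix ~affix s =
  let n = String.length affix and m = String.length s in
  n <= m && String.sub s (m - n) n = affix

(* Try every start index j at which affix still fits in s. *)
let is_infix ~affix s =
  let n = String.length affix and m = String.length s in
  let rec loop j =
    j + n <= m && (String.sub s j n = affix || loop (j + 1))
  in
  loop 0
```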
Extracting substrings
val with_index_range : ?first:int -> ?last:int -> string -> string
with_index_range ~first ~last s are the consecutive bytes of s whose indices exist in the range [first;last]. first defaults to 0 and last to String.length s - 1.
Note that both first and last can be any integer. If first > last the interval is empty and the empty string is returned.
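One way to read this specification is to clamp the requested range to the valid indices of s before extracting. The sketch below does that with Stdlib only; it illustrates the documented behaviour and is not the module's implementation.

```ocaml
let with_index_range ?(first = 0) ?last s =
  let max_idx = String.length s - 1 in
  (* Clamp both bounds to the indices that exist in s. *)
  let last = match last with None -> max_idx | Some l -> min l max_idx in
  let first = max first 0 in
  if first > last then "" else String.sub s first (last - first + 1)
```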
Breaking
Breaking with magnitudes
val take_left : int -> string -> string
take_left n s are the first n bytes of s. This is s if n >= length s and "" if n <= 0.
val take_right : int -> string -> string
take_right n s are the last n bytes of s. This is s if n >= length s and "" if n <= 0.
val drop_left : int -> string -> string
drop_left n s is s without the first n bytes of s. This is "" if n >= length s and s if n <= 0.
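The boundary cases above translate directly into code. The following is a sketch of the documented behaviour in terms of String.sub, not the module's own implementation.

```ocaml
let take_left n s =
  if n <= 0 then ""
  else if n >= String.length s then s
  else String.sub s 0 n

let take_right n s =
  let len = String.length s in
  if n <= 0 then ""
  else if n >= len then s
  else String.sub s (len - n) n

let drop_left n s =
  let len = String.length s in
  if n <= 0 then s
  else if n >= len then ""
  else String.sub s n (len - n)
```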
Breaking with predicates
val keep_left : (char -> bool) -> string -> string
keep_left sat s are the first consecutive sat satisfying bytes of s.
val keep_right : (char -> bool) -> string -> string
keep_right sat s are the last consecutive sat satisfying bytes of s.
val lose_left : (char -> bool) -> string -> string
lose_left sat s is s without the first consecutive sat satisfying bytes of s.
val lose_right : (char -> bool) -> string -> string
lose_right sat s is s without the last consecutive sat satisfying bytes of s.
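All four functions reduce to finding where the run of satisfying bytes ends (from the left) or begins (from the right). The sketch below expresses this with two hypothetical helpers, span_left and span_right; it illustrates the documented behaviour and is not the module's implementation.

```ocaml
(* Number of leading bytes of s that satisfy sat. *)
let span_left sat s =
  let len = String.length s in
  let rec go i = if i < len && sat s.[i] then go (i + 1) else i in
  go 0

(* Index at which the trailing run of sat satisfying bytes starts. *)
let span_right sat s =
  let rec go i = if i > 0 && sat s.[i - 1] then go (i - 1) else i in
  go (String.length s)

let keep_left sat s = String.sub s 0 (span_left sat s)

let lose_left sat s =
  let k = span_left sat s in
  String.sub s k (String.length s - k)

let keep_right sat s =
  let k = span_right sat s in
  String.sub s k (String.length s - k)

let lose_right sat s = String.sub s 0 (span_right sat s)

(* Example predicate used in the checks. *)
let is_digit c = c >= '0' && c <= '9'
```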
Breaking with separators
val cut_left : sep:string -> string -> (string * string) option
cut_left ~sep s is either the pair Some (l,r) of the two (possibly empty) substrings of s that are delimited by the first match of the separator string sep or None if sep can't be matched in s. Matching starts from the left of s.
The invariant l ^ sep ^ r = s holds.
- raises Invalid_argument
if sep is the empty string.
val cut_right : sep:string -> string -> (string * string) option
cut_right ~sep s is like cut_left but matching starts on the right of s.
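The difference between the two directions shows up when the separator occurs more than once. The sketch below is a naive Stdlib-only rendering of the documented behaviour (it compares substrings at each candidate index) and is not the module's implementation; the error message text is an assumption.

```ocaml
let cut_left ~sep s =
  let n = String.length sep and m = String.length s in
  if n = 0 then invalid_arg "cut_left: empty separator" else
  (* Scan candidate match positions from the left. *)
  let rec find j =
    if j + n > m then None
    else if String.sub s j n = sep then
      Some (String.sub s 0 j, String.sub s (j + n) (m - j - n))
    else find (j + 1)
  in
  find 0

let cut_right ~sep s =
  let n = String.length sep and m = String.length s in
  if n = 0 then invalid_arg "cut_right: empty separator" else
  (* Scan candidate match positions from the right. *)
  let rec find j =
    if j < 0 then None
    else if String.sub s j n = sep then
      Some (String.sub s 0 j, String.sub s (j + n) (m - j - n))
    else find (j - 1)
  in
  find (m - n)
```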
val cuts_left : ?drop_empty:bool -> sep:string -> string -> string list
cuts_left ~sep s is the list of all substrings of s that are delimited by matches of the non empty separator string sep. Empty substrings are omitted in the list if drop_empty is true (defaults to false).
Matching separators in s starts from the left of s. Once one is found, the separator is skipped and matching starts again, that is separator matches can't overlap. If there is no separator match in s, the list [s] is returned.
The following invariants hold:
concat sep (cuts_left ~drop_empty:false ~sep s) = s
cuts_left ~drop_empty:false ~sep s <> []
- raises Invalid_argument
if sep is the empty string.
val cuts_right : ?drop_empty:bool -> sep:string -> string -> string list
cuts_right ~sep s is like cuts_left but matching starts on the right of s.
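The non-overlapping, left-to-right matching can be sketched as a single scan that records the start of the current piece. This is an illustrative Stdlib-only version of cuts_left under the documented semantics, not the module's implementation; the error message text is an assumption.

```ocaml
let cuts_left ?(drop_empty = false) ~sep s =
  let n = String.length sep and m = String.length s in
  if n = 0 then invalid_arg "cuts_left: empty separator" else
  let keep acc piece =
    if drop_empty && piece = "" then acc else piece :: acc
  in
  (* start: where the current piece begins; j: next candidate match index. *)
  let rec go start j acc =
    if j + n > m then
      List.rev (keep acc (String.sub s start (m - start)))
    else if String.sub s j n = sep then
      (* Skip the whole separator so that matches cannot overlap. *)
      go (j + n) (j + n) (keep acc (String.sub s start (j - start)))
    else go start (j + 1) acc
  in
  go 0 0 []
```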
Traversing
Formatting
val pp : string Fmt.t
pp ppf s prints s's bytes on ppf.
val dump : string Fmt.t
dump ppf s prints s as a syntactically valid OCaml string on ppf.
Uniqueness
val uniquify : string list -> string list
uniquify ss is ss without duplicates, the list order is preserved.
val unique : exists:(string -> bool) -> string -> (string, string) Stdlib.result
unique ~exists n is n if exists n is false or r = strf "%s~%d" n d with d the smallest integer in [1;1e9] such that exists r is false or an error if there is no such string.
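The two uniqueness functions can be sketched as follows. This is an illustration only: the Ok and Error wrapping is inferred from the result type in the signature, the error message is made up, Printf.sprintf stands in for strf, and the quadratic uniquify is chosen for brevity rather than speed.

```ocaml
(* Keep the first occurrence of each string, preserving order. *)
let uniquify ss =
  List.rev
    (List.fold_left
       (fun acc s -> if List.mem s acc then acc else s :: acc)
       [] ss)

(* Try n, then n~1, n~2, ... up to n~1000000000. *)
let unique ~exists n =
  if not (exists n) then Ok n else
  let rec loop d =
    if d > 1_000_000_000
    then Error (Printf.sprintf "could not make %s unique" n)
    else
      let r = Printf.sprintf "%s~%d" n d in
      if exists r then loop (d + 1) else Ok r
  in
  loop 1
```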
Suggesting
val edit_distance : string -> string -> int
edit_distance s0 s1 is the number of single character edits (insertion, deletion, substitution) that are needed to change s0 into s1.
val suggest : ?dist:int -> string list -> string -> string list
suggest ~dist candidates s are the elements of candidates whose edit distance is the smallest to s and at most at a distance of dist of s (defaults to 2). If multiple results are returned the order of candidates is preserved.
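To make the interplay between the two functions concrete, here is a sketch that computes a plain Levenshtein distance over bytes with two rows of the dynamic programming table, then filters candidates by it. It assumes only the three edit operations named above; the module's actual algorithm may differ.

```ocaml
let edit_distance s0 s1 =
  let m = String.length s0 and n = String.length s1 in
  (* prev.(j) is the distance between the first i-1 bytes of s0 and the
     first j bytes of s1; curr is the row being filled in for i. *)
  let prev = Array.init (n + 1) (fun j -> j) in
  let curr = Array.make (n + 1) 0 in
  for i = 1 to m do
    curr.(0) <- i;
    for j = 1 to n do
      let cost = if s0.[i - 1] = s1.[j - 1] then 0 else 1 in
      curr.(j) <-
        min (min (prev.(j) + 1) (curr.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit curr 0 prev 0 (n + 1)
  done;
  prev.(n)

(* Keep the candidates at the smallest distance, provided it is <= dist. *)
let suggest ?(dist = 2) candidates s =
  let add (best, acc) c =
    let d = edit_distance s c in
    if d > best then (best, acc)
    else if d < best then (d, [c])
    else (best, c :: acc)
  in
  List.rev (snd (List.fold_left add (dist, []) candidates))
```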
Escaping and unescaping bytes
See also the Converting to printable US-ASCII characters.
XXX. Limitation: this cannot escape or unescape multiple bytes at once (e.g. UTF-8 byte sequences). This could be achieved by tweaking the signatures to return integer pairs, but that would allocate quite a bit.
val escaper : (char -> int) -> (bytes -> int -> char -> int) -> string -> string
escaper char_len set_char is a byte escaper that given a byte c uses char_len c bytes in the escaped form and uses set_char b i c to set the escaped form for c in b at index i returning the next writable index (no bounds check needs to be performed). For any b, c and i the invariant i + char_len c = set_char b i c must hold.
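The contract between char_len and set_char suggests a two-pass scheme: first sum the escaped lengths, then write into a buffer of exactly that size. The sketch below follows that reading; it is an assumption about how such an escaper could be built, not the module's implementation, and escape_newlines is a hypothetical client.

```ocaml
let escaper char_len set_char s =
  (* Pass 1: total size of the escaped form. *)
  let total = ref 0 in
  String.iter (fun c -> total := !total + char_len c) s;
  (* Pass 2: write each escaped byte; set_char returns the next index. *)
  let b = Bytes.create !total in
  let next = ref 0 in
  String.iter (fun c -> next := set_char b !next c) s;
  Bytes.to_string b

(* Hypothetical client: a newline becomes the two bytes \ and n. *)
let escape_newlines =
  let char_len c = if c = '\n' then 2 else 1 in
  let set_char b i c =
    if c = '\n'
    then (Bytes.set b i '\\'; Bytes.set b (i + 1) 'n'; i + 2)
    else (Bytes.set b i c; i + 1)
  in
  escaper char_len set_char
```

Because the invariant i + char_len c = set_char b i c holds for every byte, the second pass fills the buffer exactly, which is why no bounds check is needed in set_char.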
exception Illegal_escape of int
See unescaper.
val unescaper : (string -> int -> int) -> (bytes -> int -> string -> int -> int) -> string -> (string, int) Stdlib.result
unescaper char_len_at set_char is a byte unescaper that uses char_len_at to determine the length of a byte at a given index in the string to unescape and set_char b k s i to set at index k in b the unescaped character read at index i in s; and returns the next readable index in s (no bounds check needs to be performed). For any b, s, k and i the invariant i + char_len_at s i = set_char b k s i must hold.
Both char_len_at and set_char may raise Illegal_escape i if the given index i has an illegal or truncated escape. The unescaper only uses this exception internally; it returns Error i if it found an illegal escape at index i.
Strings as US-ASCII character sequences
module Ascii : sig ... end
US-ASCII string support.
String map and sets
module Set : sig ... end
String sets.
module Map : sig ... end
String maps.