Module Astring.String
Strings, substrings, string sets and maps.
A string s of length l is a zero-based indexed sequence of l bytes. An index i of s is an integer in the range [0;l-1]. It represents the ith byte of s, which can be accessed using the string indexing operator s.[i].
Important. OCaml's strings became immutable in 4.02. Whenever possible compile your code with the -safe-string option. This module does not expose any mutable operation on strings and assumes strings are immutable. See the porting guide.
v0.8.3 - homepage
String
val v : len:int -> (int -> char) -> string
v ~len f is a string s of length len with s.[i] = f i for all indices i of s. f is invoked in increasing index order.
Raises Invalid_argument if len is not in the range [0;Sys.max_string_length].
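As a rough illustration, v can be modelled with the standard library's String.init. The sketch below is a stand-alone model under that assumption, not Astring's implementation; the alphabet value is only an example.

```ocaml
(* Stand-alone model of v, assuming it behaves like Stdlib.String.init. *)
let v ~len f = String.init len f

(* Build the lowercase US-ASCII alphabet, one byte per index. *)
let alphabet = v ~len:26 (fun i -> Char.chr (Char.code 'a' + i))

let () = print_endline alphabet
```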
val get : string -> int -> char
get s i is the byte of s at index i. This is equivalent to the s.[i] notation.
Raises Invalid_argument if i is not an index of s.
val head : ?rev:bool -> string -> char option
head s is Some (get s h) with h = 0 if rev = false (default) or h = length s - 1 if rev = true. None is returned if s is empty.
val get_head : ?rev:bool -> string -> char
get_head s is like head but raises Invalid_argument if s is empty.
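The relation between head and get_head can be sketched with the standard library alone. This is a model of the documented behaviour, not Astring's code; the error message is an arbitrary choice.

```ocaml
(* Model of head: first byte, or last byte when rev is true. *)
let head ?(rev = false) s =
  let n = String.length s in
  if n = 0 then None
  else Some (String.get s (if rev then n - 1 else 0))

(* Model of get_head: same byte, but an exception instead of None. *)
let get_head ?rev s =
  match head ?rev s with
  | Some c -> c
  | None -> invalid_arg "get_head: empty string"

let () =
  match head ~rev:true "abc" with
  | Some c -> Printf.printf "last byte: %c\n" c
  | None -> print_endline "empty"
```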
Appending strings
val append : string -> string -> string
append s s' appends s' to s. This is equivalent to s ^ s'.
Raises Invalid_argument if the result is longer than Sys.max_string_length.
val concat : ?sep:string -> string list -> string
concat ~sep ss concatenates the list of strings ss, separating consecutive elements of the list with sep (defaults to empty).
Raises Invalid_argument if the result is longer than Sys.max_string_length.
Predicates
val is_prefix : affix:string -> string -> bool
is_prefix ~affix s is true iff affix.[i] = s.[i] for all indices i of affix.
val is_infix : affix:string -> string -> bool
is_infix ~affix s is true iff there exists an index j in s such that for all indices i of affix we have affix.[i] = s.[j + i].
val is_suffix : affix:string -> string -> bool
is_suffix ~affix s is true iff affix.[n - i] = s.[m - i] for all indices i of affix with n = String.length affix - 1 and m = String.length s - 1.
val for_all : (char -> bool) -> string -> bool
for_all p s is true iff for all indices i of s, p s.[i] = true.
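The affix predicates above can be modelled by comparing slices of s with affix. The following is a stand-alone sketch using only the standard library; it trades efficiency for readability and is not how Astring implements them.

```ocaml
(* affix matches the first bytes of s *)
let is_prefix ~affix s =
  let la = String.length affix in
  la <= String.length s && String.sub s 0 la = affix

(* affix matches the last bytes of s *)
let is_suffix ~affix s =
  let la = String.length affix and ls = String.length s in
  la <= ls && String.sub s (ls - la) la = affix

(* affix matches at some position j of s *)
let is_infix ~affix s =
  let la = String.length affix and ls = String.length s in
  let rec loop j =
    j + la <= ls && (String.sub s j la = affix || loop (j + 1))
  in
  loop 0

let () =
  Printf.printf "%b %b %b\n"
    (is_prefix ~affix:"ab" "abc")
    (is_suffix ~affix:"bc" "abc")
    (is_infix ~affix:"ac" "abc")
```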
Extracting substrings
Tip. These functions extract substrings as new strings. Using substrings may be less wasteful and more flexible.
val with_range : ?first:int -> ?len:int -> string -> string
with_range ~first ~len s are the consecutive bytes of s whose indices exist in the range [first;first + len - 1]. first defaults to 0 and len to max_int. Note that first can be any integer and len any non-negative integer.
Raises Invalid_argument if len is negative.
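To make the "indices that exist in the range" wording concrete, here is a stand-alone model of with_range built on String.sub. It is a sketch of the documented behaviour rather than Astring's implementation; the error message is an arbitrary choice.

```ocaml
(* Model of with_range: keep the bytes of s whose indices fall in
   [first; first + len - 1], ignoring the part of the range outside s. *)
let with_range ?(first = 0) ?(len = max_int) s =
  if len < 0 then invalid_arg "with_range: negative len";
  let n = String.length s in
  if first >= n || len = 0 then ""
  else if first >= 0 then String.sub s first (min len (n - first))
  else
    (* first < 0 here, so first + len - 1 cannot overflow *)
    let last = first + len - 1 in
    if last < 0 then "" else String.sub s 0 (min (last + 1) n)

let () = print_endline (with_range ~first:1 ~len:2 "abcd")
```

Out-of-bounds ranges do not raise in this model: only the existing indices are kept, so a range entirely outside s yields the empty string.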
val with_index_range : ?first:int -> ?last:int -> string -> string
with_index_range ~first ~last s are the consecutive bytes of s whose indices exist in the range [first;last]. first defaults to 0 and last to String.length s - 1.
Note that both first and last can be any integer. If first > last the interval is empty and the empty string is returned.
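The index-based variant can be modelled by clamping both bounds to the valid indices of s. This is a stand-alone sketch of the documented behaviour using the standard library only, not Astring's code.

```ocaml
(* Model of with_index_range: clamp [first; last] to the indices of s. *)
let with_index_range ?(first = 0) ?last s =
  let n = String.length s in
  let last = match last with None -> n - 1 | Some l -> l in
  let lo = max 0 first and hi = min (n - 1) last in
  if lo > hi then "" else String.sub s lo (hi - lo + 1)

let () = print_endline (with_index_range ~first:1 ~last:2 "abcd")
```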
val trim : ?drop:(char -> bool) -> string -> string
trim ~drop s is s with prefix and suffix bytes satisfying drop in s removed. drop defaults to Char.Ascii.is_white.
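A stand-alone model of trim follows. The is_white helper is an assumption standing in for Char.Ascii.is_white (space plus the bytes 0x09 to 0x0D); the rest uses only the standard library and is not Astring's implementation.

```ocaml
(* Assumed stand-in for Char.Ascii.is_white: space, \t, \n, VT, FF, \r. *)
let is_white = function ' ' | '\t' .. '\r' -> true | _ -> false

(* Model of trim: skip dropped bytes from both ends, keep the middle. *)
let trim ?(drop = is_white) s =
  let n = String.length s in
  let rec left i = if i < n && drop s.[i] then left (i + 1) else i in
  let rec right j = if j >= 0 && drop s.[j] then right (j - 1) else j in
  let i = left 0 in
  let j = right (n - 1) in
  if i > j then "" else String.sub s i (j - i + 1)

let () = Printf.printf "[%s]\n" (trim "  ab c \n")
```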
val span : ?rev:bool -> ?min:int -> ?max:int -> ?sat:(char -> bool) -> string -> string * string
span ~rev ~min ~max ~sat s is (l, r) where:
- if rev is false (default), l is at least min and at most max consecutive sat satisfying initial bytes of s or empty if there are no such bytes. r are the remaining bytes of s.
- if rev is true, r is at least min and at most max consecutive sat satisfying final bytes of s or empty if there are no such bytes. l are the remaining bytes of s.
If max is unspecified the span is unlimited. If min is unspecified it defaults to 0. If min > max the condition can't be satisfied and the left or right span, depending on rev, is always empty. sat defaults to (fun _ -> true).
The invariant l ^ r = s holds.
Raises Invalid_argument if max or min is negative.
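The interplay of rev, min and max is easier to see in code. Below is a stand-alone model of span written against the standard library; it follows the description above but is only a sketch, not Astring's implementation, and is_digit is a hypothetical helper used for the example.

```ocaml
(* Model of span: count up to max satisfying bytes from the scanning end;
   if fewer than min were found, the matching span is empty. *)
let span ?(rev = false) ?(min = 0) ?(max = max_int) ?(sat = fun _ -> true) s =
  if min < 0 || max < 0 then invalid_arg "span: negative min or max";
  let n = String.length s in
  (* k-th byte counted from the start, or from the end when rev is true *)
  let byte k = if rev then s.[n - 1 - k] else s.[k] in
  let rec count k =
    if k < n && k < max && sat (byte k) then count (k + 1) else k
  in
  let k = count 0 in
  let k = if k < min then 0 else k in
  if rev then (String.sub s 0 (n - k), String.sub s (n - k) k)
  else (String.sub s 0 k, String.sub s k (n - k))

(* Hypothetical helper for the example. *)
let is_digit c = '0' <= c && c <= '9'

let () =
  let l, r = span ~sat:is_digit "123abc" in
  Printf.printf "(%S, %S)\n" l r
```

In this model the two halves always concatenate back to s, which mirrors the stated invariant l ^ r = s.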
val take : ?rev:bool -> ?min:int -> ?max:int -> ?sat:(char -> bool) -> string -> string
take ~rev ~min ~max ~sat s is the matching span of span without the remaining one. In other words: (if rev then snd else fst) @@ span ~rev ~min ~max ~sat s
val drop : ?rev:bool -> ?min:int -> ?max:int -> ?sat:(char -> bool) -> string -> string
drop ~rev ~min ~max ~sat s is the remaining span of span without the matching span. In other words: (if rev then fst else snd) @@ span ~rev ~min ~max ~sat s
val cut : ?rev:bool -> sep:string -> string -> (string * string) option
cut ~sep s is either the pair Some (l,r) of the two (possibly empty) substrings of s that are delimited by the first match of the non empty separator string sep or None if sep can't be matched in s. Matching starts from the beginning of s (rev is false, default) or the end (rev is true).
The invariant l ^ sep ^ r = s holds.
Raises Invalid_argument if sep is the empty string.
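A stand-alone model of cut, using a naive separator search over the standard library, might look as follows. It is a sketch of the documented behaviour and not Astring's implementation; the key/value string is only an example.

```ocaml
(* Model of cut: split s around the first (or, with rev, last) match of sep. *)
let cut ?(rev = false) ~sep s =
  let ls = String.length sep and n = String.length s in
  if ls = 0 then invalid_arg "cut: empty separator";
  let matches_at i = String.sub s i ls = sep in
  (* forward search from index i *)
  let rec fwd i =
    if i + ls > n then None else if matches_at i then Some i else fwd (i + 1)
  in
  (* backward search from index i *)
  let rec bwd i =
    if i < 0 then None else if matches_at i then Some i else bwd (i - 1)
  in
  match (if rev then bwd (n - ls) else fwd 0) with
  | None -> None
  | Some i -> Some (String.sub s 0 i, String.sub s (i + ls) (n - i - ls))

let () =
  match cut ~sep:"=" "key=value=more" with
  | Some (k, v) -> Printf.printf "%s -> %s\n" k v
  | None -> print_endline "no separator"
```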
val cuts : ?rev:bool -> ?empty:bool -> sep:string -> string -> string list
cuts ~sep s is the list of all substrings of s that are delimited by matches of the non empty separator string sep. Empty substrings are omitted in the list if empty is false (defaults to true).
Matching separators in s starts from the beginning of s (rev is false, default) or the end (rev is true). Once one is found, the separator is skipped and matching starts again, that is separator matches can't overlap. If there is no separator match in s, the list [s] is returned.
The following invariants hold:
- concat ~sep (cuts ~empty:true ~sep s) = s
- cuts ~empty:true ~sep s <> []
Raises Invalid_argument if sep is the empty string.
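The first invariant can be checked on a small model. The sketch below covers only the default forward matching (the rev argument is deliberately left out) and uses the standard library alone; it is not Astring's implementation.

```ocaml
(* Forward-only model of cuts: split s at each non-overlapping match of sep. *)
let cuts ?(empty = true) ~sep s =
  let ls = String.length sep and n = String.length s in
  if ls = 0 then invalid_arg "cuts: empty separator";
  let rec loop acc start i =
    if i + ls > n then List.rev (String.sub s start (n - start) :: acc)
    else if String.sub s i ls = sep then
      (* record the piece before the match, then skip the separator *)
      loop (String.sub s start (i - start) :: acc) (i + ls) (i + ls)
    else loop acc start (i + 1)
  in
  let parts = loop [] 0 0 in
  if empty then parts else List.filter (fun p -> p <> "") parts

let () =
  let s = "a,,b," in
  let parts = cuts ~sep:"," s in
  (* concatenating the pieces with sep gives back s *)
  Printf.printf "%b\n" (String.concat "," parts = s)
```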
val fields : ?empty:bool -> ?is_sep:(char -> bool) -> string -> string list
fields ~empty ~is_sep s is the list of (possibly empty) substrings that are delimited by bytes for which is_sep is true. Empty substrings are omitted in the list if empty is false (defaults to true). is_sep defaults to Char.Ascii.is_white.
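The effect of the empty argument shows up as soon as separators are adjacent. Here is a stand-alone model of fields; is_white is an assumed stand-in for Char.Ascii.is_white, and the code is a sketch rather than Astring's implementation.

```ocaml
(* Assumed stand-in for Char.Ascii.is_white: space, \t, \n, VT, FF, \r. *)
let is_white = function ' ' | '\t' .. '\r' -> true | _ -> false

(* Model of fields: every separator byte ends the current field. *)
let fields ?(empty = true) ?(is_sep = is_white) s =
  let n = String.length s in
  let rec loop acc start i =
    if i = n then List.rev (String.sub s start (n - start) :: acc)
    else if is_sep s.[i] then
      loop (String.sub s start (i - start) :: acc) (i + 1) (i + 1)
    else loop acc start (i + 1)
  in
  let parts = loop [] 0 0 in
  if empty then parts else List.filter (fun p -> p <> "") parts

let () =
  List.iter (Printf.printf "[%s]") (fields ~empty:false " a  b");
  print_newline ()
```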
Substrings
type sub
The type for substrings.
val sub_with_range : ?first:int -> ?len:int -> string -> sub
sub_with_range is like with_range but returns a substring value. If first is smaller than 0 the empty string at the start of s is returned. If first is greater than the last index of s the empty string at the end of s is returned.
val sub_with_index_range : ?first:int -> ?last:int -> string -> sub
sub_with_index_range is like with_index_range but returns a substring value. If first and last are smaller than 0 the empty string at the start of s is returned. If first is greater than the last index of s the empty string at the end of s is returned. If first > last and first is an index of s the empty string at first is returned.
module Sub : sig ... end
Substrings.
Traversing strings
val find : ?rev:bool -> ?start:int -> (char -> bool) -> string -> int option
find ~rev ~start sat s is:
- If rev is false (default). The smallest index i, if any, greater or equal to start such that sat s.[i] is true. start defaults to 0.
- If rev is true. The greatest index i, if any, smaller or equal to start such that sat s.[i] is true. start defaults to String.length s - 1.
Note that start can be any integer.
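Since start can be any integer, a model of find has to clamp it before scanning. The following stand-alone sketch uses only the standard library and is not Astring's implementation.

```ocaml
(* Model of find: scan forward from start, or backward when rev is true.
   Out-of-range start values are clamped instead of raising. *)
let find ?(rev = false) ?start sat s =
  let n = String.length s in
  if rev then
    let start = match start with None -> n - 1 | Some i -> min i (n - 1) in
    let rec loop i =
      if i < 0 then None else if sat s.[i] then Some i else loop (i - 1)
    in
    loop start
  else
    let start = match start with None -> 0 | Some i -> max i 0 in
    let rec loop i =
      if i >= n then None else if sat s.[i] then Some i else loop (i + 1)
    in
    loop start

let () =
  match find ~rev:true (fun c -> c = '/') "usr/lib/ocaml" with
  | Some i -> Printf.printf "last slash at %d\n" i
  | None -> print_endline "no slash"
```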
val find_sub : ?rev:bool -> ?start:int -> sub:string -> string -> int option
find_sub ~rev ~start ~sub s is:
- If rev is false (default). The smallest index i, if any, greater or equal to start such that sub can be found starting at i in s, that is s.[i] = sub.[0], s.[i+1] = sub.[1], ... start defaults to 0.
- If rev is true. The greatest index i, if any, smaller or equal to start such that sub can be found starting at i in s, that is s.[i] = sub.[0], s.[i+1] = sub.[1], ... start defaults to String.length s - 1.
Note that start can be any integer.
val filter : (char -> bool) -> string -> string
filter sat s is the string made of the bytes of s that satisfy sat, in the same order.
val filter_map : (char -> char option) -> string -> string
filter_map f s is the string made of the bytes of s as mapped by f, in the same order. Bytes for which f returns None are dropped.
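One way to picture filter and filter_map is as a single pass that appends the kept bytes to a buffer. The sketch below is a stand-alone model on top of Stdlib.Buffer, not Astring's implementation.

```ocaml
(* Model of filter_map: keep f's Some results, drop the None ones. *)
let filter_map f s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c -> match f c with Some c' -> Buffer.add_char b c' | None -> ())
    s;
  Buffer.contents b

(* filter is the special case where kept bytes are left unchanged. *)
let filter sat s = filter_map (fun c -> if sat c then Some c else None) s

let () = print_endline (filter (fun c -> c <> '-') "a-b-c")
```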
val map : (char -> char) -> string -> string
map f s is s' with s'.[i] = f s.[i] for all indices i of s. f is invoked in increasing index order.
val mapi : (int -> char -> char) -> string -> string
mapi f s is s' with s'.[i] = f i s.[i] for all indices i of s. f is invoked in increasing index order.
val fold_left : ('a -> char -> 'a) -> 'a -> string -> 'a
fold_left f acc s is f (...(f (f acc s.[0]) s.[1])...) s.[m] with m = String.length s - 1.
val fold_right : (char -> 'a -> 'a) -> string -> 'a -> 'a
fold_right f s acc is f s.[0] (f s.[1] (...(f s.[m] acc))...) with m = String.length s - 1.
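The difference in nesting between the two folds is visible when the accumulator is a list. The following stand-alone model uses plain loops over the standard library; it is a sketch of the equations above, not Astring's implementation.

```ocaml
(* Model of fold_left: bytes are consumed from index 0 upwards. *)
let fold_left f acc s =
  let r = ref acc in
  String.iter (fun c -> r := f !r c) s;
  !r

(* Model of fold_right: bytes are consumed from the last index downwards. *)
let fold_right f s acc =
  let r = ref acc in
  for i = String.length s - 1 downto 0 do
    r := f s.[i] !r
  done;
  !r

(* Consing in fold_left reverses the bytes, in fold_right it keeps the order. *)
let reversed = fold_left (fun l c -> c :: l) [] "abc"
let in_order = fold_right (fun c l -> c :: l) "abc" []

let () =
  List.iter print_char reversed;
  print_char ' ';
  List.iter print_char in_order;
  print_newline ()
```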
Uniqueness
Strings as US-ASCII character sequences
module Ascii : sig ... end
US-ASCII string support.
Pretty printing
val pp : Stdlib.Format.formatter -> string -> unit
pp ppf s prints s's bytes on ppf.
val dump : Stdlib.Format.formatter -> string -> unit
dump ppf s prints s as a syntactically valid OCaml string on ppf using Ascii.escape_string.
String sets and maps
module Set : sig ... end
String sets.
module Map : sig ... end
String maps.
OCaml base type conversions
val to_char : string -> char option
to_char s is the single byte in s or None if there is no byte or more than one in s.
val of_bool : bool -> string
of_bool b is a string representation for b. Relies on Pervasives.string_of_bool.
val to_bool : string -> bool option
to_bool s is a bool from s, if any. Relies on Pervasives.bool_of_string.
val of_int : int -> string
of_int i is a string representation for i. Relies on Pervasives.string_of_int.
val to_int : string -> int option
to_int s is an int from s, if any. Relies on Pervasives.int_of_string.
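The option-returning conversions can be approximated with the standard library's _opt functions (available since OCaml 4.05). The sketch below is a model under that assumption, not Astring's code, which the text above describes as relying on the exception-raising functions.

```ocaml
(* Model of to_char: exactly one byte, otherwise None. *)
let to_char s = if String.length s = 1 then Some s.[0] else None

(* Models of to_bool and to_int: None replaces the Failure exception. *)
let to_bool s = bool_of_string_opt s
let to_int s = int_of_string_opt s

let () =
  match to_int "42" with
  | Some i -> Printf.printf "parsed %d\n" i
  | None -> print_endline "not an int"
```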
val of_nativeint : nativeint -> string
of_nativeint i is a string representation for i. Relies on Nativeint.to_string.
val to_nativeint : string -> nativeint option
to_nativeint s is a nativeint from s, if any. Relies on Nativeint.of_string.
val of_int32 : int32 -> string
of_int32 i is a string representation for i. Relies on Int32.to_string.
val to_int32 : string -> int32 option
to_int32 s is an int32 from s, if any. Relies on Int32.of_string.
val of_int64 : int64 -> string
of_int64 i is a string representation for i. Relies on Int64.to_string.
val to_int64 : string -> int64 option
to_int64 s is an int64 from s, if any. Relies on Int64.of_string.