String processing
This page documents the String type.
General concepts
A String is a sequence of characters enclosed between double quotes, such as "this". Strings in Phonometrica are immutable, which means that you cannot modify them in place. All functions that “modify” a string actually return a new, modified string and leave the original string unchanged.
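The following sketch illustrates immutability using to_upper(), which is documented further down this page: the call returns a new string, and the variable s keeps its original value. The variable names are only illustrative.

```
var s = "hello"
var t = s.to_upper() # t holds a new string
print(s) # prints "hello": the original is unchanged
print(t) # prints "HELLO"
```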
All string functions assume that strings are encoded in UTF-8. A good tutorial about UTF-8 can be found at the following address: http://www.zehnet.de/2005/02/12/unicode-utf-8-tutorial. In the remainder of this document, the term character means extended grapheme cluster in the sense of the Unicode specification, which generally corresponds to the notion of “user-perceived character”.
Methods
-
class String
-
at(pos)
Get the character at position pos. If pos is negative, counting starts from the end.
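A short sketch of positive and negative positions. It assumes that positions are 1-based, as the mid() example on this page suggests, and that -1 designates the last character; neither point is stated explicitly for at().

```
var s = "hello"
print(s.at(1))  # "h", assuming 1-based positions
print(s.at(-1)) # "o", assuming -1 is the last character
```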
-
concat(other)
Creates a new string which is the concatenation of this string and other.
Another, simpler way to concatenate strings is to use the operator +.
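As a minimal illustration, both forms below should produce the same string. The variable names are only illustrative.

```
var a = "phono"
var b = "metrica"
var c = a.concat(b)
print(c)     # prints "phonometrica"
print(a + b) # same result with the + operator
```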
-
contains(substring)
Returns true if the string contains substring, and false otherwise.
-
count(substring)
Returns the number of times substring appears in the string.
var s = "cacococococa"
var count = s.count("coco")
print(count) # prints "2"
Note: matches don’t overlap.
-
ends_with(suffix)
Returns true if the string ends with suffix, and false otherwise.
See also: starts_with()
-
insert(pos, other)
Returns a copy of the string with other inserted at position pos.
-
left(n)
Returns the substring made of the first n characters of the string.
-
ltrim()
Returns a copy of the string with whitespace characters removed at the left end of the string.
var s = " hello "
s = s.ltrim()
print("$" + s + "$") # prints "$hello $"
-
mid(from, to)
Returns the substring starting at index from and ending at index to (inclusive). If to equals -1, returns the substring from from until the end of the string.
var s = "c'était ça"
print(s.mid(3, 7)) # "était"
print(s.mid(3,-1)) # "était ça"
-
remove(substr)
Returns a copy of the string where all (non-overlapping) instances of the substring substr have been removed.
See also: remove_at(), remove_first(), remove_last()
-
remove_at(at, count)
Returns a copy of the string where count code points, starting at position at, have been removed.
See also: remove(), remove_first(), remove_last()
-
remove_first(substr)
Returns a copy of the string where the first instance of substr has been removed.
See also: remove_at(), remove(), remove_last()
-
remove_last(substr)
Returns a copy of the string where the last instance of substr has been removed.
See also: remove_at(), remove(), remove_first()
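The difference between remove(), remove_first() and remove_last() can be sketched on a string that contains the same substring twice. The input string is only illustrative.

```
var s = "a-b-c"
print(s.remove("-"))       # prints "abc": every instance is removed
print(s.remove_first("-")) # prints "ab-c": only the first instance
print(s.remove_last("-"))  # prints "a-bc": only the last instance
```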
-
replace(old, new)
Returns a copy of the string where all (non-overlapping) instances of the substring old have been replaced by new.
See also: replace_at(), replace_first(), replace_last()
-
replace_at(at, count, new)
Returns a copy of the string where count code points, starting at position at, have been replaced by new.
See also: replace(), replace_first(), replace_last()
-
replace_first(old, new)
Returns a copy of the string where the first instance of the substring old has been replaced by new.
See also: replace_at(), replace(), replace_last()
-
replace_last(old, new)
Returns a copy of the string where the last instance of the substring old has been replaced by new.
See also: replace_at(), replace(), replace_first()
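The three substring-based replacement methods can be compared on the same input. This sketch assumes that replace_first() and replace_last() are called as methods with only the old and new arguments, like replace(); the input string is only illustrative.

```
var s = "a-b-c"
print(s.replace("-", "+"))       # prints "a+b+c": every instance is replaced
print(s.replace_first("-", "+")) # prints "a+b-c": only the first instance
print(s.replace_last("-", "+"))  # prints "a-b+c": only the last instance
```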
-
reverse()
Returns a new string with the characters of the string in reverse order.
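A minimal example on an ASCII string, where each character is a single code point:

```
var s = "abc"
print(s.reverse()) # prints "cba"
```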
-
right(n)
Returns the substring made of the last n characters of the string.
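left() and right() mirror each other, as this small sketch shows. The input string is only illustrative.

```
var s = "phonometrica"
print(s.left(5))  # prints "phono"
print(s.right(7)) # prints "metrica"
```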
-
rtrim()
Returns a copy of the string with whitespace characters removed at the right end of the string.
var s = " hello "
s = s.rtrim()
print("$" + s + "$") # prints "$ hello$"
-
split(delim)
Returns a table of strings which have been split at each occurrence of the substring delim. If delim is the empty string, it returns a list of the characters in the string.
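The two cases described above might look as follows. The comments describe the expected items rather than any printed output, since how the resulting collection is displayed is not specified here; the variable names are only illustrative.

```
var s = "a,b,c"
var parts = s.split(",")   # three items: "a", "b" and "c"
var word = "abc"
var chars = word.split("") # one item per character: "a", "b" and "c"
```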
-
starts_with(prefix)
Returns true if the string starts with prefix, and false otherwise.
See also: ends_with()
-
to_lower()
Returns a copy of the string where each code point has been converted to lower case.
var s1 = "C'ÉTAIT ÇA"
var s2 = s1.to_lower()
print(s2) # prints "c'était ça"
See also: to_upper()
-
to_upper()
Returns a copy of the string where each code point has been converted to upper case.
var s1 = "c'était ça"
var s2 = s1.to_upper()
print(s2) # prints "C'ÉTAIT ÇA"
See also: to_lower()
-
trim()
Returns a copy of the string with whitespace characters removed at both ends of the string.
var s = "\t hello \n"
s = s.trim()
print("$" + s + "$") # prints "$hello$"