String processing
This page documents the String type.
General concepts
A String is a sequence of characters enclosed between double quotes,
such as "this". Strings in Phonometrica are immutable, which means that you
cannot modify them directly. All functions which “modify” a string
actually return a new (modified) version of the string but leave the
original string unchanged.
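The following short sketch illustrates immutability using to_upper(), which is documented below. The variable names are only illustrative.

```
var s = "hello"
var t = s.to_upper()   # returns a new string
print(s)               # the original still holds "hello"
print(t)               # the new string holds "HELLO"
```

To keep the modified version under the same name, assign the result back to the variable, as the ltrim() and trim() examples on this page do.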
All string functions assume that strings are encoded according to the UTF-8 Unicode standard. A good tutorial about UTF-8 can be found at the following address: http://www.zehnet.de/2005/02/12/unicode-utf-8-tutorial. In the remainder of this document, the term character is used to mean extended grapheme cluster in the sense of the Unicode specification. This generally corresponds to the notion of “user-perceived character”.
Methods
class String
- at(pos)
Returns the character at position pos. If pos is negative, counting starts from the end of the string.
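A minimal sketch of at() follows. It assumes that positions start at 1, which is what the mid() example on this page suggests, and that -1 designates the last character. Neither assumption is stated explicitly for at(), so no output is shown.

```
var s = "hello"
print(s.at(1))    # first character, assuming 1-based positions
print(s.at(-1))   # last character, assuming -1 counts from the end
```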
- concat(other)
Creates a new string which is the concatenation of this string and other.
A simpler way to concatenate strings is to use the + operator.
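As a small illustration, the two forms below should produce the same result. The variable names are only illustrative.

```
var a = "phono"
var b = "metrica"
print(a.concat(b))   # "phonometrica"
print(a + b)         # same result with the + operator
```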
- contains(substring)
Returns true if the string contains substring, and false otherwise.
- count(substring)
Returns the number of times substring appears in the string.
var s = "cacococococa"
var count = s.count("coco")
print(count) # prints "2"
Note: matches don’t overlap.
- ends_with(suffix)
Returns true if the string ends with suffix, and false otherwise.
See also: starts_with()
- insert(pos, other)
Returns a copy of the string with other inserted at position pos.
- left(n)
Returns the substring consisting of the first n characters of the string.
- ltrim()
Returns a copy of the string with whitespace characters removed at the left end of the string.
var s = " hello "
s = s.ltrim()
print("$" + s + "$") # prints "$hello $"
- mid(from, to)
Returns the substring starting at index from and ending at index to (inclusive). If to equals -1, returns the substring from from until the end of the string.
var s = "c'était ça"
print(s.mid(3, 7)) # "était"
print(s.mid(3,-1)) # "était ça"
- remove(substr)
Returns a copy of the string where all (non-overlapping) instances of the substring substr have been removed.
See also: remove_at(), remove_first(), remove_last()
- remove_at(at, count)
Returns a copy of the string where count code points, starting at position at, have been removed.
See also: remove(), remove_first(), remove_last()
- remove_first(substr)
Returns a copy of the string where the first instance of substr has been removed.
See also: remove_at(), remove(), remove_last()
- remove_last(substr)
Returns a copy of the string where the last instance of substr has been removed.
See also: remove_at(), remove(), remove_first()
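The difference between remove(), remove_first() and remove_last() can be sketched as follows. The expected values in the comments are derived from the descriptions above rather than from a run of the program.

```
var s = "abcabc"
print(s.remove("b"))         # all instances removed: "acac"
print(s.remove_first("b"))   # only the first instance: "acabc"
print(s.remove_last("b"))    # only the last instance: "abcac"
```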
- replace(old, new)
Returns a copy of the string where all (non-overlapping) instances of the substring old have been replaced by new.
See also: replace_at(), replace_first(), replace_last()
- replace_at(at, count, new)
Returns a copy of the string where count code points, starting at position at, have been replaced by new.
See also: replace(), replace_first(), replace_last()
- replace_first(old, new)
Returns a copy of the string where the first instance of the substring old has been replaced by new.
See also: replace_at(), replace(), replace_last()
- replace_last(old, new)
Returns a copy of the string where the last instance of the substring old has been replaced by new.
See also: replace_at(), replace(), replace_first()
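The three substring-based replacement methods can be compared with the sketch below. It assumes that replace_first() and replace_last() are called on the string with only old and new as arguments, as their descriptions indicate. The expected values in the comments follow from those descriptions.

```
var s = "abcabc"
print(s.replace("b", "X"))         # all instances: "aXcaXc"
print(s.replace_first("b", "X"))   # first instance only: "aXcabc"
print(s.replace_last("b", "X"))    # last instance only: "abcaXc"
```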
- reverse()
Returns a new string with the characters of the string in reverse order.
- right(n)
Returns the substring consisting of the last n characters of the string.
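As a brief illustration, left() and right() extract the two ends of a string. The values in the comments follow from the descriptions of the two methods.

```
var s = "phonometrica"
print(s.left(5))    # first 5 characters: "phono"
print(s.right(7))   # last 7 characters: "metrica"
```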
- rtrim()
Returns a copy of the string with whitespace characters removed at the right end of the string.
var s = " hello "
s = s.rtrim()
print("$" + s + "$") # prints "$ hello$"
- split(delim)
Returns a table of strings which have been split at each occurrence of
the substring delim. If delim is the empty string, it returns a
list of the characters in the string.
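A minimal sketch of split() is given below. The variable names are illustrative, and the comments describe the pieces expected from the description above, not the printed form of the returned collection.

```
var s = "one,two,three"
var parts = s.split(",")   # expected pieces: "one", "two", "three"
var chars = "abc".split("")   # empty delimiter: expected pieces "a", "b", "c"
```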
- starts_with(prefix)
Returns true if the string starts with prefix, and false otherwise.
See also: ends_with()
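The three test methods starts_with(), ends_with() and contains() can be sketched together as follows. The comments give the value each call should return according to the descriptions on this page. The exact text printed for a Boolean is not specified here.

```
var s = "phonometrica"
print(s.starts_with("phono"))   # returns true
print(s.ends_with("ica"))       # returns true
print(s.contains("metric"))     # returns true
print(s.contains("xyz"))        # returns false
```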
- to_lower()
Returns a copy of the string where each code point has been converted to lower case.
var s1 = "C'ÉTAIT ÇA"
var s2 = s1.to_lower()
print(s2) # prints "c'était ça"
See also: to_upper()
- to_upper()
Returns a copy of the string where each code point has been converted to upper case.
var s1 = "c'était ça"
var s2 = s1.to_upper()
print(s2) # prints "C'ÉTAIT ÇA"
See also: to_lower()
- trim()
Returns a copy of the string with whitespace characters removed at both ends of the string.
var s = "\t hello \n"
s = s.trim()
print("$" + s + "$") # prints "$hello$"