The following APIs are capable of handling Unicode objects and strings
on input (we refer to them as strings in the descriptions) and return
Unicode objects or integers as appropriate.
They all return NULL or -1 if an exception occurs, except where noted otherwise.
PyObject* PyUnicode_Split(PyObject *s,
PyObject *sep,
int maxsplit)
Return value: New reference.
Split a string, giving a list of Unicode strings. If sep is NULL,
splitting is done at all whitespace substrings. Otherwise,
splits occur at the given separator. At most maxsplit splits
are done; if maxsplit is negative, no limit is set. Separators are not
included in the resulting list.
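PyObject* PyUnicode_Splitlines(PyObject *s,
int keepend)
Return value: New reference.
Split a Unicode string at line breaks, returning a list of Unicode
strings. CRLF is considered to be one line break. If keepend is 0,
the line break characters are not included in the resulting strings.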
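PyObject* PyUnicode_Translate(PyObject *str,
PyObject *table,
const char *errors)
Return value: New reference.
Translate a string by applying a character mapping table to it and
return the resulting Unicode object.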
The mapping table must map Unicode ordinal integers to Unicode
ordinal integers or None (causing deletion of the character).
Mapping tables need only provide the __getitem__()
interface; dictionaries and sequences work well. Unmapped character
ordinals (ones which cause a LookupError) are left
untouched and are copied as-is.
errors has the usual meaning for codecs. It may be NULL
which indicates to use the default error handling.
PyObject* PyUnicode_Join(PyObject *separator,
PyObject *seq)
Return value: New reference.
Join a sequence of strings using the given separator and return the
resulting Unicode string.
int PyUnicode_Tailmatch(PyObject *str,
PyObject *substr,
int start,
int end,
int direction)
Return 1 if substr matches str[start:end] at
the given tail end (direction == -1 means to do a prefix
match, direction == 1 a suffix match), 0 otherwise.
Return -1 if an error occurred.
int PyUnicode_Find(PyObject *str,
PyObject *substr,
int start,
int end,
int direction)
Return the first position of substr in
str[start:end] using the given direction
(direction == 1 means to do a forward search,
direction == -1 a backward search). The return value is the
index of the first match; a value of -1 indicates that no
match was found, and -2 indicates that an error occurred and
an exception has been set.
int PyUnicode_Count(PyObject *str,
PyObject *substr,
int start,
int end)
Return the number of non-overlapping occurrences of substr in
str[start:end]. Returns -1 if an
error occurred.
PyObject* PyUnicode_Replace(PyObject *str,
PyObject *substr,
PyObject *replstr,
int maxcount)
Return value: New reference.
Replace at most maxcount occurrences of substr in
str with replstr and return the resulting Unicode object.
maxcount == -1 means replace all occurrences.