cafe.common package
cafe.common.unicode
-
cafe.common.unicode.PLANE_NAMES = <class 'cafe.common.unicode.PLANE_NAMES'>
Namespace that defines all standard Unicode Plane names.
A list-like object (UnicodeRangeList) made up of UnicodeRange objects. It covers the same total range as UNICODE_BLOCKS, but is organized by plane names instead of block names, which results in fewer but larger ranges.
-
cafe.common.unicode.BLOCK_NAMES = <class 'cafe.common.unicode.BLOCK_NAMES'>
Namespace that defines all standard Unicode Block names.
A list-like object (UnicodeRangeList) made up of UnicodeRange objects. Each UnicodeRange object in the list corresponds to a named Unicode Block, and contains the start and end integer for that Block.
-
cafe.common.unicode.UNICODE_BLOCKS (cafe.common.unicode.UnicodeRangeList)
List-like object that iterates through named ranges of Unicode codepoints. Instantiated at runtime (when the module is imported) near the bottom of the module source.
-
cafe.common.unicode.UNICODE_PLANES (cafe.common.unicode.UnicodeRangeList)
List-like object that iterates through ranges of ranges of Unicode codepoints. Instantiated at runtime (when the module is imported) near the bottom of the module source.
Usage Examples:
    # Print all the characters in the "Thai" unicode block
    for c in UNICODE_BLOCKS.get_range(BLOCK_NAMES.thai).encoded_codepoints():
        print(c)

    # Iterate through all the integer codepoints in the "Thai" unicode block
    for i in UNICODE_BLOCKS.get_range(BLOCK_NAMES.thai).codepoints():
        do_something(i)

    # Get a list of the names of all the characters in the "Thai" unicode block
    [n for n in UNICODE_BLOCKS.get_range(
        BLOCK_NAMES.thai).codepoint_names()]
-
cafe.common.unicode.UNICODE_ENDING_CODEPOINT = 1114109
Integer denoting the last Unicode codepoint.
-
cafe.common.unicode.UNICODE_STARTING_CODEPOINT = 0
Integer denoting the first Unicode codepoint.
-
class cafe.common.unicode.UnicodeRange(start, end, name)
Bases: object
Iterable representation of a range of Unicode codepoints. This can represent a standard Unicode Block, a standard Unicode Plane, or even a custom range.
A UnicodeRange object has start, end, and name attributes; start and end normally correspond to the first and last integers of a range of Unicode codepoints.
Each UnicodeRange object includes generators for performing common functions on the codepoints in that integer range.
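To make the start/end/name shape concrete, here is a minimal sketch of a range object with a codepoints generator. SketchRange is a hypothetical stand-in written for illustration, not the library's implementation, and it assumes that end is inclusive.

```python
class SketchRange(object):
    """Hypothetical stand-in for UnicodeRange; not the library's code."""

    def __init__(self, start, end, name):
        self.start = start
        self.end = end
        self.name = name

    def codepoints(self):
        # Assumption: `end` is inclusive, as it is for standard Unicode blocks.
        for codepoint in range(self.start, self.end + 1):
            yield codepoint


# The Thai block spans U+0E00 through U+0E7F.
thai = SketchRange(0x0E00, 0x0E7F, "thai")
thai_codepoints = list(thai.codepoints())
```

Because codepoints is a generator, nothing is materialized until the caller iterates, which matters for ranges as large as a whole plane.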
-
codepoint_names()
Generator that yields the name of each codepoint in range as a string.
If a name cannot be found, the codepoint’s integer value is yielded in hexadecimal format as a string.
Return type: generator, returns strings
-
-
class cafe.common.unicode.UnicodeRangeList
Bases: list
A list-like object for containing collections of UnicodeRange objects.
Allows iteration through all codepoints in the collected ranges, even if the ranges are disjoint. Useful for creating custom ranges for specialized testing.
-
codepoint_names()
Generator that yields the name of each codepoint in range as a string.
If a name cannot be found, the codepoint’s integer value is yielded in hexadecimal format as a string.
Return type: generator, returns strings
-
codepoints()
Generator that yields each codepoint in all ranges as an integer.
Return type: generator, returns ints
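Iterating across disjoint ranges can be sketched with itertools.chain. The helper iter_codepoints and the plain (start, end) tuples below are hypothetical and only illustrate the idea; the library's own objects are not used here.

```python
from itertools import chain

# Hypothetical inclusive (start, end) pairs for two disjoint ranges.
basic_latin_digits = (0x30, 0x39)   # U+0030..U+0039
thai_digits = (0x0E50, 0x0E59)      # U+0E50..U+0E59


def iter_codepoints(ranges):
    """Yield every integer in each inclusive (start, end) pair, in order."""
    return chain.from_iterable(
        range(start, end + 1) for start, end in ranges)


digits = list(iter_codepoints([basic_latin_digits, thai_digits]))
```

The caller sees one flat stream of integers and never has to know where one range stops and the next begins.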
-
encoded_codepoints(encoding='utf-8')
Generator that yields each codepoint name in range, encoded.
Parameters: encoding (string) – the encoding to use on the string
Return type: generator, returns unicode strings
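The following sketch shows one way such a generator could work, assuming (as the usage examples suggest) that it is the character for each codepoint that gets encoded. encode_range is a hypothetical helper written for Python 3, where chr accepts any codepoint; it is not taken from the library.

```python
def encode_range(start, end, encoding="utf-8"):
    """Yield the encoded character for each codepoint in start..end inclusive."""
    for codepoint in range(start, end + 1):
        # chr() maps the integer to a one-character string, which is
        # then encoded to bytes using the requested encoding.
        yield chr(codepoint).encode(encoding)


# U+0E01 (THAI CHARACTER KO KAI) is a three-byte sequence in UTF-8.
first_thai = next(encode_range(0x0E01, 0x0E5B))
```

Note that surrogate codepoints (U+D800 through U+DFFF) cannot be encoded as UTF-8 in Python 3, so a range that includes them would need extra handling.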
-
get_range(range_name)
Get a range of unicode codepoints by block name.
Returns a single UnicodeRange object representing the codepoints in the unicode block range named by range_name, if such a range exists in the instance of UnicodeRangeList that get_range is being called from.
Parameters: range_name (string) – name of the requested unicode block range.
Return type: UnicodeRange class instance, or None
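A lookup with this contract (the matching range, or None) could look roughly like the sketch below. NamedRange, SketchRangeList, and the name strings are hypothetical stand-ins; the values that BLOCK_NAMES actually holds are not shown in this reference.

```python
from collections import namedtuple

# Hypothetical stand-in for UnicodeRange.
NamedRange = namedtuple("NamedRange", ["start", "end", "name"])


class SketchRangeList(list):
    """Hypothetical stand-in for UnicodeRangeList."""

    def get_range(self, range_name):
        # Linear scan; return the first range whose name matches.
        for item in self:
            if item.name == range_name:
                return item
        return None


blocks = SketchRangeList([
    NamedRange(0x0000, 0x007F, "basic_latin"),
    NamedRange(0x0E00, 0x0E7F, "thai"),
])
thai = blocks.get_range("thai")
missing = blocks.get_range("no_such_block")
```

Since the result may be None, callers that chain a generator call directly onto get_range should be sure the name exists in the list they are querying.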
-
get_range_list(range_name_list)
Get a list of ranges of unicode codepoints by block names.
Returns a single UnicodeRangeList object representing the codepoints in the unicode block ranges named by range_name_list, if such ranges exist in the instance of UnicodeRangeList that get_range_list is being called from.
Parameters: range_name_list (list of strings) – name(s) of requested unicode block ranges.
Return type: UnicodeRangeList class instance, or None
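Building a custom collection from several block names might be sketched as follows. SketchRangeList and the name strings are hypothetical, and returning None when nothing matches is an assumption made for this example; the reference does not say how partially matching name lists are handled.

```python
class SketchRangeList(list):
    """Hypothetical stand-in holding (start, end, name) tuples."""

    def get_range_list(self, range_name_list):
        wanted = set(range_name_list)
        matches = SketchRangeList(
            item for item in self if item[2] in wanted)
        # Assumption: an empty result is reported as None, not an empty list.
        return matches if matches else None


blocks = SketchRangeList([
    (0x0000, 0x007F, "basic_latin"),
    (0x0370, 0x03FF, "greek_and_coptic"),
    (0x0E00, 0x0E7F, "thai"),
])
custom = blocks.get_range_list(["thai", "basic_latin"])
nothing = blocks.get_range_list(["no_such_block"])
```

In this sketch the result keeps the order of the source list rather than the order of the requested names, and it is the same list type, so it can be queried again.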
-
-
cafe.common.unicode.codepoint_name(codepoint_integer)
Expects a Unicode codepoint as an integer.
Returns the Unicode name of codepoint_integer if it is a valid Unicode codepoint, None otherwise.
If a name cannot be found, the codepoint’s integer value is returned in hexadecimal format as a string.
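The name-or-hexadecimal fallback can be sketched with the standard unicodedata module. name_or_hex is a hypothetical helper written for Python 3; the exact hexadecimal format the library produces is not specified here, so the use of hex() is an assumption.

```python
import unicodedata


def name_or_hex(codepoint_integer):
    """Return the Unicode name, a hex string if unnamed, or None if invalid."""
    if not 0 <= codepoint_integer <= 0x10FFFF:
        # Outside the Unicode codespace: not a valid codepoint.
        return None
    try:
        return unicodedata.name(chr(codepoint_integer))
    except ValueError:
        # unicodedata.name raises ValueError for unnamed codepoints,
        # such as control characters.
        return hex(codepoint_integer)


thai_name = name_or_hex(0x0E01)
control_name = name_or_hex(0x0000)
invalid_name = name_or_hex(0x110000)
```

Control characters such as U+0000 have no name in the Unicode database, which is the case the hexadecimal fallback covers.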