Overview:
- The str_len() function of NumPy computes the length of each string in a NumPy array-like whose elements are of type bytes, str_ or StringDType. The lengths are returned as an ndarray of integers. When the argument is a scalar, a single integer is returned.
- For strings constructed from bytes, the length is the number of bytes.
- For unicode strings, the length is the number of Unicode code points. What a reader perceives as a single character may need more than one code point to represent it. Hence the length of a unicode string is not necessarily the number of characters in it, but the total number of code points in the string.
Example 1 - Finding lengths of unicode strings:
In this example the strings can have a maximum length of ten. If a string is longer than ten, the remaining characters are silently discarded. To avoid this, the variable-length string type StringDType can be used, as in Example 3.
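    # Example Python program that finds the length of unicode strings
    import numpy

    # Create a NumPy array of unicode strings
    # Hello in different languages
    stringArray = numpy.array([["Hello", "こんにちは"], ["안녕하세요", "สวัสดี"]], dtype="U10")
    print("An ndarray of unicode strings:")
    print(stringArray)

    lengths = numpy.strings.str_len(stringArray)
    print("Length of unicode strings:")
    print(lengths)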
Output:
    An ndarray of unicode strings:
    [['Hello' 'こんにちは']
     ['안녕하세요' 'สวัสดี']]
    Length of unicode strings:
    [[5 5]
     [5 6]]
Example 2 - Finding lengths of strings made of bytes:
    # Example Python program that finds the length of a
    # Get the length of ASCII bytes
    msgLength = len(msg)
    # Get the length of a unicode string
Output:
    11
    11
    9
    9
Example 3 - Using StringDType:
    # Example Python program that uses variable length
    # Create an ndarray whose elements are variable length strings
    vStrings[0][0] = "A"
    lens = numpy.strings.str_len(vStrings)
    # Print the array
    # Print the string lengths
Output:
    [['A' 'AAAA']
     ['AAAAAA' 'AAAAAAAAAAAA']]
    [[ 1  4]
     [ 6 12]]