xoutil.string - Common string operations

Some additions to the standard string module.

In this module the str and unicode types are not used, because Python 2 and Python 3 treat strings differently (see String Ambiguity in Python for more information). The types bytes and text_type are used instead, with the following conventions:

  • In Python 2, str is a synonym of bytes, and both unicode and str are string types inheriting from basestring.
  • In Python 3, str is always Unicode text, and the unicode and basestring types do not exist. The bytes type can be used as an array in which each item is one byte.
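
As a small illustration of the Python 3 side of these conventions, the sketch below defines a local text_type alias (an assumption made for the example, not the way this module necessarily defines it) and shows that indexing bytes yields one integer per byte:

```python
# Python 3 only. On Python 2 the equivalent alias would be `unicode`.
text_type = str  # local alias, assumed for illustration

text = "Áé"                   # text: two Unicode code points
data = text.encode("utf-8")   # bytes: four items, one per byte

first_item = data[0]          # indexing bytes gives an int in Python 3
round_trip = data.decode("utf-8")
```

Here len(text) is 2 while len(data) is 4, which is why the two types are kept apart.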

xoutil.string.cut_prefix(value, prefix)

Remove the leading prefix if it exists; otherwise return value unchanged.

xoutil.string.cut_any_prefix(value, *prefixes)

Apply cut_prefix() for the first matching prefix.

xoutil.string.cut_prefixes(value, *prefixes)

Apply cut_prefix() for all provided prefixes in order.
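
The difference between "first matching prefix" and "all prefixes in order" can be sketched as follows. The *_sketch functions are hypothetical stand-ins written from the descriptions above; they are not the library's implementation and may differ from it in edge cases.

```python
def cut_prefix_sketch(value, prefix):
    """Return value without prefix if it starts with it, else unchanged."""
    return value[len(prefix):] if value.startswith(prefix) else value


def cut_any_prefix_sketch(value, *prefixes):
    """Cut only the first prefix, in argument order, that matches."""
    for prefix in prefixes:
        if value.startswith(prefix):
            return cut_prefix_sketch(value, prefix)
    return value


def cut_prefixes_sketch(value, *prefixes):
    """Apply the single-prefix cut once per prefix, on the running result."""
    for prefix in prefixes:
        value = cut_prefix_sketch(value, prefix)
    return value


host = "www.api.example.org"
any_cut = cut_any_prefix_sketch(host, "api.", "www.")    # only "www." is cut
all_cut = cut_prefixes_sketch(host, "www.", "api.")      # both are cut
reordered = cut_prefixes_sketch(host, "api.", "www.")    # "api." is tried too early
```

Under this reading, the order of the prefixes matters for the "all in order" variant: a prefix that does not match when its turn comes is not retried later.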

xoutil.string.cut_suffix(value, suffix)

Remove the trailing suffix if it exists; otherwise return value unchanged.

xoutil.string.cut_any_suffix(value, *suffixes)

Apply cut_suffix() for the first matching suffix.

xoutil.string.cut_suffixes(value, *suffixes)

Apply cut_suffix() for all provided suffixes in order.
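
For the suffix family, a minimal sketch under the same assumptions (hypothetical *_sketch names, behaviour inferred only from the descriptions above) shows one detail worth care: an empty suffix must be guarded, because value[:-0] would return an empty string.

```python
def cut_suffix_sketch(value, suffix):
    """Return value without suffix if it ends with it, else unchanged."""
    # Guard the empty suffix: value[:-0] is '' rather than value.
    if suffix and value.endswith(suffix):
        return value[:-len(suffix)]
    return value


def cut_any_suffix_sketch(value, *suffixes):
    """Cut only the first suffix, in argument order, that matches."""
    for suffix in suffixes:
        if value.endswith(suffix):
            return cut_suffix_sketch(value, suffix)
    return value


def cut_suffixes_sketch(value, *suffixes):
    """Apply the single-suffix cut once per suffix, on the running result."""
    for suffix in suffixes:
        value = cut_suffix_sketch(value, suffix)
    return value


name = "archive.tar.gz"
stem = cut_suffixes_sketch(name, ".gz", ".tar")      # outermost suffix first
partial = cut_suffixes_sketch(name, ".tar", ".gz")   # ".tar" does not match yet
single = cut_any_suffix_sketch(name, ".tar", ".gz")  # only ".gz" is cut
```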

xoutil.string.error2str(error)

Convert an error to a string.

xoutil.string.make_a10z(string)

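Utility to abbreviate a word by keeping its first and last letters and replacing the letters in between with their count, so that “internationalization” becomes “i18n”.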

Examples:

>>> print(make_a10z('parametrization'))
p13n
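
The abbreviation in the example above can be reproduced with a short sketch. make_a10z_sketch is a hypothetical name, and the handling of words shorter than three characters is an assumption made here, since the text above does not cover that case.

```python
def make_a10z_sketch(word):
    """Keep the first and last letters; replace the middle with its length."""
    if len(word) < 3:
        # Assumption: nothing to abbreviate, so return the word unchanged.
        return word
    return word[0] + str(len(word) - 2) + word[-1]


i18n = make_a10z_sketch("internationalization")
p13n = make_a10z_sketch("parametrization")
```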

xoutil.string.slugify(value, replacement='-', invalid_chars='', valid_chars='', encoding=None)

Return the normal form of a given string value that is valid for slugs.

Convert all non-ASCII characters to valid characters, whenever possible, using Unicode ‘NFKC’ normalization, and lower-case the result. Replace unwanted characters with the value of replacement, collapsing repeated replacements into one.

Default valid characters are [_a-z0-9]. The extra arguments invalid_chars and valid_chars can modify this standard behaviour, as described below:

Parameters:
  • value – The source value to slugify.
  • replacement

    A character to be used as the replacement for unwanted characters. It can be given either as the first extra positional argument or as a keyword argument. The default value is a hyphen ('-').

    There will be a contradiction if this argument contains any invalid character (see invalid_chars). None or False is converted to an empty string for backward compatibility with old versions of this function; do not rely on this, since it will be deprecated.

  • invalid_chars

    Characters to be considered invalid. There is a default set of valid characters which are kept in the resulting slug. Characters given in this parameter are removed from the resulting valid character set (see valid_chars).

    Extra argument values can be used for compatibility with the invalid_underscore argument of the deprecated normalize_slug function:

    • True is a synonym for the underscore "_".
    • False or None means an empty set.

    It can be given as a keyword argument or as the second extra positional argument. The default value is an empty set.

  • valid_chars – A collection of extra valid characters. It can be a valid string, any iterator of strings, or None to use only the default valid characters. Non-ASCII characters are ignored.
  • encoding – If value is not text (unicode), it is decoded with this encoding before ASCII normalization.
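
The interplay of replacement, invalid_chars and valid_chars can be approximated with the sketch below. slugify_sketch is a hypothetical function, not the library's code: it accepts only plain strings for the character sets, omits the True/False/None compatibility values, assumes UTF-8 when no encoding is given, and swaps in NFKD decomposition followed by an ASCII encode to strip accents, rather than the ‘NFKC’ step named above.

```python
import unicodedata

DEFAULT_VALID = set("_abcdefghijklmnopqrstuvwxyz0123456789")


def slugify_sketch(value, replacement="-", invalid_chars="", valid_chars="",
                   encoding=None):
    """Approximate the documented behaviour for simple inputs."""
    if isinstance(value, bytes):
        # Assumption: fall back to UTF-8 when no encoding is given.
        value = value.decode(encoding or "utf-8")
    text = unicodedata.normalize("NFKD", str(value))
    # Drop combining marks and anything else outside ASCII, then lower-case.
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    allowed = DEFAULT_VALID | set(valid_chars.lower())
    allowed -= set(invalid_chars.lower())
    pieces = []
    pending = False  # True while inside a run of unwanted characters
    for char in text:
        if char in allowed:
            # Emit one replacement per run, never at the start of the slug.
            if pending and pieces:
                pieces.append(replacement)
            pending = False
            pieces.append(char)
        else:
            pending = True
    # A trailing run of unwanted characters is dropped, not replaced.
    return "".join(pieces)


plain = slugify_sketch("  Á.e i  Ó  u  ")
dotted = slugify_sketch("  Á.e i  Ó  u  ", valid_chars=".")
```

In this sketch, invalid_chars is subtracted after valid_chars is added, so a character listed in both ends up invalid.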

Examples:

>>> slugify('  Á.e i  Ó  u  ') == 'a-e-i-o-u'
True

>>> slugify(' Á.e i  Ó  u  ', '.', invalid_chars='AU') == 'e.i.o'
True

>>> slugify('  Á.e i  Ó  u  ', valid_chars='.') == 'a.e-i-o-u'
True

>>> slugify('_x', '_') == '_x'
True

>>> slugify('-x', '_') == 'x'
True

>>> slugify(None) == 'none'
True

>>> slugify(1 == 1)  == 'true'
True

>>> slugify(1.0) == '1-0'
True

>>> slugify(135) == '135'
True

>>> slugify(123456, '', invalid_chars='52') == '1346'
True

Changed in version 1.5.5: Added the invalid_underscore parameter.

Changed in version 1.6.6: Replaced the invalid_underscore parameter by invalids. Added the valids parameter.

Changed in version 1.7.2: Clarified the role of invalids with regards to replacement.

Changed in version 1.8.0: Deprecated the invalids parameter name in favor of invalid_chars, and the valids parameter name in favor of valid_chars.

Changed in version 1.8.7: Added the encoding parameter.

xoutil.string.normalize_slug(value, replacement='-', invalid_chars='', valid_chars='', encoding=None)

Deprecated alias of slugify().