xotl.tools.records - Records definitions

Records definitions.

A record lets you describe plain external data and provides a simplified model for reading it. The main use of records is to represent data that is read from a CSV file.

See the record class to find out how to use it.

class xotl.tools.records.record(raw_data)

Base record class.

Records represent a sequence or mapping of values extracted from external sources as a dict-like Python value.

The first use-case for this abstraction is importing data from a CSV file. You could represent each line as an instance of a properly defined record.

An instance of a record would represent a single line (or row) from the external data source.

Records are expected to declare fields. Each field name must be a valid identifier written in UPPERCASE, for example:

>>> class INVOICE(record):
...     ID = 0
...     REFERENCE = 1

Field values must be integers or plain strings, and field names must not begin with an underscore (“_”). Each line of external data must support indexing by values of those types.
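
The indexing requirement can be illustrated without the library itself. The sketch below uses only the standard csv module; the sample data and the names ID, REFERENCE and REFERENCE_KEY are illustrative assumptions. It suggests why an integer field suits rows read as sequences, while a string field suits rows read as mappings.

```python
import csv
import io

# Hypothetical sample data; a real source would usually be a file on disk.
RAW = "id,reference\n1,AA20X138874Z012\n"

# Integer fields index rows that are sequences (lists or tuples).
ID, REFERENCE = 0, 1
rows = list(csv.reader(io.StringIO(RAW)))
first_row = rows[1]  # rows[0] is the header line
reference_by_position = first_row[REFERENCE]

# String fields index rows that are mappings (here, dicts keyed by header).
REFERENCE_KEY = "reference"
dict_rows = list(csv.DictReader(io.StringIO(RAW)))
reference_by_name = dict_rows[0][REFERENCE_KEY]
```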

You can use the classmethod get_field() to get the value of a field from a single line (the data as provided by the external source):

>>> line = (1, 'AA20X138874Z012')
>>> INVOICE.get_field(line, INVOICE.REFERENCE)
'AA20X138874Z012'

You may also create an instance and read the field as an attribute:

>>> invoice = INVOICE(line)
>>> invoice.reference
'AA20X138874Z012'

Note

Instance attributes are named after the fields, in lowercase. Do not create any other attribute whose name matches a field name in lowercase, or it will be overwritten.

You can define a reader for any field. For instance, if you have a “CREATED_DATETIME” field, you may create a “_created_datetime_reader” function that will be used to parse the raw value into the expected type. See the included reader builders below.

Readers are always converted to staticmethods, whether or not you declare them as such:

>>> from dateutil import parser
>>> class BETTER_INVOICE(INVOICE):
...     CREATED_TIME = 2
...     _created_time_reader = lambda val: parser.parse(val)

>>> line = (1, 'AA20X138874Z012', '2014-02-17T17:29:21.965053')
>>> BETTER_INVOICE.get_field(line, BETTER_INVOICE.CREATED_TIME)
datetime.datetime(2014, 2, 17, 17, 29, 21, 965053)

Warning

Creating readers for fields defined in super classes is not directly supported. To do so, you must declare the reader as a staticmethod yourself.

Note

Currently there’s no concept of relationship between rows in this model. We are evaluating whether placing some sort of context into the kwargs argument would make it possible to write readers that fetch other instances.

Included reader builders

The following functions build readers for standard types.

Note

You cannot use these functions themselves as readers; you must call them to obtain the desired reader.

All these functions take a pair of keyword arguments, nullable and default. The argument nullable indicates whether a null value is allowed. The function check_nullable() implements this check and allows others to create their own builders with the same semantics.

xotl.tools.records.datetime_reader(format, nullable=False, default=None, strict=True)

Returns a datetime reader.

Parameters:
  • format – The format in which the datetime is expected to appear in the external data. This is passed to datetime.datetime.strptime().
  • strict – Whether to be strict about datetime format.

The reader first passes the value to the datetime.datetime.strptime() function. If that fails with a ValueError and strict is True, the reader fails entirely.

If strict is False, the reader applies different rules. First, if the dateutil package is installed, its parser module is tried. If dateutil is not available and nullable is True, the reader returns None; if nullable is False and default is not null (as defined by isnull()), it returns default; otherwise it raises a ValueError.
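
The rules above can be approximated with a short sketch. This is not the library's implementation: datetime_reader_sketch is a hypothetical name, the null test is simplified to None and the empty string, and the source does not say what happens when dateutil is installed but also fails, so here its error simply propagates.

```python
from datetime import datetime


def datetime_reader_sketch(format, nullable=False, default=None, strict=True):
    """Hypothetical builder that follows the rules described above."""

    def reader(value):
        try:
            return datetime.strptime(value, format)
        except ValueError:
            if strict:
                raise  # strict mode: fail entirely
        # Non-strict path: try dateutil when it is installed.
        try:
            from dateutil import parser
        except ImportError:
            parser = None
        if parser is not None:
            return parser.parse(value)  # assumption: errors propagate
        if nullable:
            return None
        # Simplified null test for the default value.
        if default is not None and default != "":
            return default
        raise ValueError("cannot parse %r as a datetime" % (value,))

    return reader


# The builder is called once; the returned reader is applied to raw values.
read_created = datetime_reader_sketch("%Y-%m-%dT%H:%M:%S.%f")
created = read_created("2014-02-17T17:29:21.965053")
```

With strict left at its default of True, a value that does not match the format raises the ValueError produced by strptime().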

xotl.tools.records.boolean_reader(true=('1', ), nullable=False, default=None)

Returns a boolean reader.

Parameters:
  • true – A collection of raw values considered to be True. Only the values in this collection will be considered True values.

xotl.tools.records.integer_reader(nullable=False, default=None)

Returns an integer reader.

xotl.tools.records.decimal_reader(nullable=False, default=None)

Returns a Decimal reader.

xotl.tools.records.float_reader(nullable=False, default=None)

Returns a float reader.

xotl.tools.records.date_reader(format, nullable=False, default=None, strict=True)

Return a date reader.

This is similar to datetime_reader() but instead of returning a datetime.datetime it returns a datetime.date.

This function delegates most of its functionality to datetime_reader().

Checking for null values

xotl.tools.records.isnull(val)

Return True if val is null.

Null values are None, the empty string and any False instance of xotl.tools.symbols.boolean.

Notice that 0, the empty list and other false values in Python are not considered null. This way, the CSV null (the empty string) is treated correctly, while sources that provide numbers (where 0 is a valid value) are not misinterpreted as null.
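
A minimal sketch of this null test, using only the standard library, might look as follows. It deliberately omits the xotl.tools.symbols.boolean case, so it is an approximation and not the library's implementation; isnull_sketch is a hypothetical name.

```python
def isnull_sketch(val):
    """Simplified null test: only None and the empty string are null."""
    # The type check keeps 0, [] and other falsy values from counting as null.
    return val is None or (isinstance(val, str) and val == "")


# Falsy values that are not null are kept as legitimate data.
samples = [None, "", 0, [], "0"]
null_flags = [isnull_sketch(item) for item in samples]
```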

xotl.tools.records.check_nullable(val, nullable)

Check the restriction of nullable.

Return True if val is non-null. If nullable is True and val is null, return False. If nullable is False and val is null, raise a ValueError.

The null test is done with the function isnull().
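
The three outcomes can be sketched as below. check_nullable_sketch is a hypothetical stand-in written for illustration; its inline null test covers only None and the empty string, which is an assumption that simplifies the behaviour of isnull().

```python
def check_nullable_sketch(val, nullable):
    """Hypothetical stand-in that mirrors the three documented outcomes."""
    # Simplified null test (assumption): only None and '' are null.
    is_null = val is None or (isinstance(val, str) and val == "")
    if not is_null:
        return True   # a usable value is present
    if nullable:
        return False  # null is allowed; the caller should use its default
    raise ValueError("a null value is not allowed here")


has_value = check_nullable_sketch("2014-02-17", nullable=False)
allowed_null = check_nullable_sketch("", nullable=True)
```

Returning False instead of raising lets a reader fall back to its default without a try/except block.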

These two functions allow you to define new builders that use the same concept of null. For instance, if you need readers that parse dates in different locales, you may do:

def date_reader(nullable=False, default=None, locale=None):
    from xotl.tools.records import check_nullable
    from babel.dates import parse_date, LC_TIME
    if not locale:
        # Fall back to Babel's default locale for dates and times.
        locale = LC_TIME

    def reader(value):
        # check_nullable() raises ValueError for a null value when nullable is False.
        if check_nullable(value, nullable):
            return parse_date(value, locale=locale)
        else:
            return default
    return reader