xotl.tools.records - Records definitions
A record describes plain external data and provides a simplified model to read it. The main use of records is to represent data read from a CSV file.
See the record class to find out how to use it.
Base record class.
Records represent a sequence or mapping of values extracted from external sources as a dict-like Python value.
The first use case for this abstraction is importing data from a CSV file. You can represent each line as an instance of a properly defined record.
An instance of a record would represent a single line (or row) from the external data source.
Records are expected to declare fields. Each field name must be a valid identifier written in CAPITALS, for example:
>>> class INVOICE(record):
...     ID = 0
...     REFERENCE = 1
Field values must be integers or plain strings, and field names must not begin with an underscore (“_”). Lines of external data are required to support indexes of those types.
You can use the classmethod get_field() to get the value of a field from a single line (data as provided by the external source):
>>> line = (1, 'AA20X138874Z012')
>>> INVOICE.get_field(line, INVOICE.REFERENCE)
'AA20X138874Z012'
You may also have an instance:
>>> invoice = INVOICE(line)
>>> invoice.reference
'AA20X138874Z012'
Instance attributes are named after the fields in lowercase. You must therefore not create any other attribute whose name matches a field name in lowercase, or it will be overwritten.
You can define a reader for any field. For instance, if you have a “CREATED_DATETIME” field you may create a “_created_datetime_reader” function that will be used to parse the raw value into the expected type. See the included reader builders below.
Readers are always cast as staticmethods, whether or not you have explicitly stated that fact:
>>> from dateutil import parser
>>> class BETTER_INVOICE(INVOICE):
...     CREATED_TIME = 2
...     _created_time_reader = lambda val: parser.parse(val)
>>> line = (1, 'AA20X138874Z012', '2014-02-17T17:29:21.965053')
>>> BETTER_INVOICE.get_field(line, BETTER_INVOICE.CREATED_TIME)
datetime.datetime(2014, 2, 17, 17, 29, 21, 965053)
Creating readers for fields defined in super classes is not directly supported. To do so, you must declare the reader as a staticmethod yourself.
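The conventions described so far (capitalized fields, lowercase instance attributes, and optional “_<field>_reader” functions) can be illustrated with a small stand-in class. This is only a sketch: MiniRecord and MINI_INVOICE are hypothetical names, the lookup logic is an assumption about how such a class could work, and none of it is the actual implementation in xotl.tools.records.

```python
from datetime import datetime


class MiniRecord:
    """Hypothetical stand-in; NOT the real xotl.tools.records.record."""

    def __init__(self, raw_data):
        cls = type(self)
        for name in cls._field_names():
            # The field REFERENCE becomes the instance attribute `reference`.
            value = cls.get_field(raw_data, getattr(cls, name))
            setattr(self, name.lower(), value)

    @classmethod
    def _field_names(cls):
        # Fields are upper-case class attributes not starting with "_".
        return [n for n in dir(cls) if n.isupper() and not n.startswith("_")]

    @classmethod
    def get_field(cls, raw_data, field):
        value = raw_data[field]
        for name in cls._field_names():
            if getattr(cls, name) == field:
                # Look for an optional `_<field>_reader` on the class.
                reader = getattr(cls, "_%s_reader" % name.lower(), None)
                if reader is not None:
                    return reader(value)
        return value


class MINI_INVOICE(MiniRecord):
    ID = 0
    REFERENCE = 1
    CREATED_TIME = 2
    # Accessed through the class, so it behaves like a staticmethod.
    _created_time_reader = lambda val: datetime.strptime(
        val, "%Y-%m-%dT%H:%M:%S.%f"
    )


line = (1, "AA20X138874Z012", "2014-02-17T17:29:21.965053")
invoice = MINI_INVOICE(line)
print(invoice.reference, invoice.created_time)
```

The reader is always fetched from the class rather than from the instance, which is why a plain function or lambda works without an explicit staticmethod in this sketch.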
Currently there’s no concept of relationship between rows in this model. We are evaluating whether placing some sort of context into the kwargs argument would make it possible to write readers that fetch other instances.
Included reader builders
The following functions build readers for standard types.
You cannot use these functions themselves as readers; you must call them to obtain the desired reader.
All these functions take a pair of keyword arguments, nullable and default. The argument nullable indicates whether the value must be present or not. The function check_nullable() implements this check and allows others to create their own builders with the same semantics.
datetime_reader(format, nullable=False, default=None, strict=True)
Returns a datetime reader.
- format – The format the datetime is expected to be in the external
data. This is passed to
- strict – Whether to be strict about datetime format.
The reader first passes the value to the strict datetime.datetime.strptime() function. If that fails with a ValueError and strict is True, the reader fails entirely.
If strict is False, the reader applies different rules. First, if the dateutil package is installed, its parser module is tried. If dateutil is not available and nullable is True, the reader returns None; if nullable is False and default is not null (as in isnull()), it returns default; otherwise it raises a ValueError.
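The fallback rules above can be sketched as follows. This is a hypothetical re-creation written from the description only, not the library's code: the name sketch_datetime_reader is invented, and the null test on default is simplified to None or the empty string.

```python
from datetime import datetime


def sketch_datetime_reader(format, nullable=False, default=None, strict=True):
    """Hypothetical re-creation of the documented rules."""

    def reader(value):
        try:
            # Step 1: always try the strict format first.
            return datetime.strptime(value, format)
        except ValueError:
            if strict:
                raise  # strict readers fail entirely
        # Step 2 (non-strict only): try dateutil when it is installed.
        try:
            from dateutil import parser
        except ImportError:
            parser = None
        if parser is not None:
            return parser.parse(value)
        # Step 3: dateutil is unavailable, so apply nullable/default.
        if nullable:
            return None
        # Assumption: "null" is simplified to None or the empty string.
        if default is not None and default != "":
            return default
        raise ValueError("cannot parse %r as a datetime" % (value,))

    return reader


read_day = sketch_datetime_reader("%Y-%m-%d")
print(read_day("2014-02-17"))
```

Note that, as in the signature above, the builder is called once to obtain the reader, and the reader is then applied to each raw value.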
boolean_reader(true=('1', ), nullable=False, default=None)
Returns a boolean reader.
Parameters: true – A collection of raw values considered to be True. Only the values in this collection will be considered True values.
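As a rough illustration of the true parameter, a builder of this kind could be written as below. The name sketch_boolean_reader is hypothetical, and the nullable and default arguments are deliberately left out because their interaction with booleans is not described here.

```python
def sketch_boolean_reader(true=("1",)):
    """Hypothetical builder; nullable/default handling is omitted."""

    def reader(value):
        # Only raw values listed in `true` count as True.
        return value in true

    return reader


is_paid = sketch_boolean_reader(true=("yes", "y"))
print(is_paid("y"), is_paid("no"))
```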
Returns an integer reader.
Returns a Decimal reader.
Returns a float reader.
date_reader(format, nullable=False, default=None, strict=True)
Returns a date reader.
This function delegates most of its functionality to datetime_reader().
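One plausible way to picture this delegation is a reader that parses a full datetime and keeps only its date part. The sketch below assumes exactly that; sketch_date_reader is a hypothetical name, it covers only the strict path, and the real function may delegate differently.

```python
from datetime import date, datetime


def sketch_date_reader(format):
    """Hypothetical builder covering only the strict path."""

    def reader(value):
        # Parse as a datetime, then drop the time component.
        return datetime.strptime(value, format).date()

    return reader


print(sketch_date_reader("%Y-%m-%d")("2014-02-17"))
```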
Checking for null values
Return True if val is null.
Null values are None, the empty string and any False instance of
Notice that 0, the empty list and other false values in Python are not considered null. This way the CSV null (the empty string) is treated correctly, while sources that provide numbers (where 0 is a valid number) are not misinterpreted as null.
Check the restriction of nullable.
Returns True if val is non-null. If nullable is True and val is null, returns False. If nullable is False and val is null, raises a ValueError.
Test for null is done with function
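The two checks described in this section can be sketched as follows. Both function names are hypothetical, and the null test is an assumption that covers only None and the empty string, so it is narrower than the library's own definition.

```python
def sketch_isnull(val):
    # Assumption: only None and the empty string are treated as null here.
    return val is None or (isinstance(val, str) and val == "")


def sketch_check_nullable(val, nullable):
    if not sketch_isnull(val):
        return True  # a value is present, go ahead and parse it
    if nullable:
        return False  # null is allowed, the caller should use its default
    raise ValueError("null value in a non-nullable field")


print(sketch_isnull(""), sketch_isnull(0), sketch_check_nullable("", True))
```

Because 0 and the empty list are not null in this sketch either, numeric sources keep their zeros while empty CSV cells are still detected.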
These two functions allow you to define new builders that use the same null concept. For instance, if you need readers that parse dates in different locales you may do:
    def date_reader(nullable=False, default=None, locale=None):
        from xotl.tools.records import check_nullable
        from babel.dates import parse_date, LC_TIME
        from datetime import datetime
        if not locale:
            locale = LC_TIME
        def reader(value):
            if check_nullable(value, nullable):
                return parse_date(value, locale=locale)
            else:
                return default
        return reader