datawork.api.data

Module implementing abstract Data class.

class datawork.api.data.Data(desc=None, name=None)[source]

Data placeholder class.

This class represents data that either has not yet been computed or is not yet fully specified. Classes inheriting from Data implement placeholders for specific data types, e.g. pandas DataFrames or NumPy arrays.

Subclasses of Data are typically instantiated by invocations of Tool.

Data and Invocation are thus connected and form the backbone of the computational graph, while Tool objects attach to Invocation as the configurable components.

Note that the provider attribute, itself an Invocation, can be “partial”, in which case the data object itself is callable. When called, its arguments are passed to the provider, which creates new invocations, potentially non-partial ones.
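The callable-placeholder behaviour described above can be illustrated with a small stand-alone sketch. It does not use datawork itself: SketchInvocation, SketchData, bind and is_partial are hypothetical names invented for this illustration, and the real Invocation class may track missing arguments quite differently.

```python
# Illustrative stand-ins only; these are NOT the datawork classes.
class SketchInvocation:
    """Records a function plus the positional arguments supplied so far."""

    def __init__(self, func, n_args, args=()):
        self.func = func
        self.n_args = n_args          # how many arguments func needs in total
        self.args = tuple(args)

    def missing_args(self):
        return self.n_args - len(self.args)

    def is_partial(self):
        return self.missing_args() > 0

    def bind(self, *more):
        # Supplying arguments yields a NEW invocation; the original is untouched.
        return SketchInvocation(self.func, self.n_args, self.args + more)


class SketchData:
    """Placeholder whose provider may still be waiting for arguments."""

    def __init__(self, provider=None, name=None):
        self.provider = provider
        self.name = name

    def missing_args(self):
        return self.provider.missing_args() if self.provider else 0

    def __call__(self, *args):
        # Only placeholders backed by a partial provider are callable.
        if self.provider is None or not self.provider.is_partial():
            raise TypeError("only data with a partial provider is callable")
        return SketchData(self.provider.bind(*args), name=self.name)


def add(a, b):
    return a + b


partial_sum = SketchData(SketchInvocation(add, 2, (1,)), name="sum")
full_sum = partial_sum(2)   # forwards the argument to the provider
```

In this sketch, calling the placeholder never mutates it: partial_sum still reports one missing argument, while full_sum wraps a new invocation that has none.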

__call__(*args)[source]

Enable calling for placeholder Data objects.

__init__(desc=None, name=None)[source]

Construct a placeholder data object.

Parameters:
  • desc – a plain-text description of this data object
  • name – a short-hand name for this data object

__repr__()[source]

Represent data including provider and name.

static check_type(value)[source]

Guard value to ensure it is of proper type.

classmethod constant(val, name='constant')[source]

Create a constant from appropriately typed variable.
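How check_type and constant might cooperate in a typed subclass is sketched below. This is an assumption-laden illustration rather than the datawork code: SketchData, SketchInt and the _data attribute are hypothetical, and the sketch assumes that check_type returns the value when it is acceptable and raises TypeError otherwise.

```python
# Illustrative stand-ins only; these are NOT the datawork classes.
class SketchData:
    def __init__(self, desc=None, name=None):
        self.desc = desc
        self.name = name
        self.provider = None   # constants have no provider
        self._data = None

    @staticmethod
    def check_type(value):
        # The base placeholder accepts anything; subclasses narrow this.
        return value

    @classmethod
    def constant(cls, val, name="constant"):
        # Guard the value with the subclass's own check before storing it.
        obj = cls(name=name)
        obj._data = cls.check_type(val)
        return obj


class SketchInt(SketchData):
    @staticmethod
    def check_type(value):
        if not isinstance(value, int):
            raise TypeError("expected int, got " + type(value).__name__)
        return value


three = SketchInt.constant(3)
```

Because constant is a classmethod, the guard that runs is the one defined on the subclass it is called on, so SketchInt.constant("3") would be rejected in this sketch while SketchData.constant("3") would not.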

data

Getter for data attribute.

get_data()[source]

Getter for data attribute.

get_hash()[source]

Return hash of provider if exists, or of data itself for constants.
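The two branches of get_hash can be pictured with the following sketch. The choice of SHA-256 over a repr-based serialization is an assumption made purely for illustration; SketchProvider and SketchData are hypothetical stand-ins, and the documentation above does not say which hash function or serialization datawork actually uses.

```python
import hashlib


# Illustrative stand-ins only; these are NOT the datawork classes.
class SketchProvider:
    """Pretend invocation, identified here by a plain label."""

    def __init__(self, label):
        self.label = label

    def get_hash(self):
        return hashlib.sha256(self.label.encode("utf-8")).hexdigest()


class SketchData:
    def __init__(self, data=None, provider=None):
        self.data = data
        self.provider = provider

    @staticmethod
    def serialize(data):
        # Assumed serialization: the value's repr as a string.
        return repr(data)

    def get_hash(self):
        if self.provider is not None:
            # Computed data: identity comes from how it is produced.
            return self.provider.get_hash()
        # Constant: identity comes from the value itself.
        text = self.serialize(self.data)
        return hashlib.sha256(text.encode("utf-8")).hexdigest()


const_a = SketchData(data=[1, 2, 3])
const_b = SketchData(data=[1, 2, 3])
computed = SketchData(provider=SketchProvider("add(1, 2)"))
```

Under these assumptions, two constants holding equal values share a hash, and a computed placeholder can be hashed before its data exists because only the provider is consulted.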

missing_args()[source]

Count number of missing arguments.

parents()[source]

Return provider as only parent if it is set.

read(filename)[source]

Read data from disk.

static serialize(data)[source]

Convert data to string.

set_data(value, cache=True)[source]

Setter for data attribute.

write(filename)[source]

Write data to disk.