Provider¶

Warning

Documentation for the module is under development.

Provider is a class that converts chunks of data using the transformers stored in self.transformers. Two ready-made implementations of this class exist: Merger and Serial.

__init__¶

def __init__(self, numofthreads=4)

Creates a Provider object.

Parameters

Parameter	Allowed types	Description	Default
numofthreads	int	Number of threads involved in data preparation	4

Return value

None

__enter__¶

def __enter__(self)

Enables the use of a Provider in a with statement.

Parameters

None.

Return value

The Provider object itself.

__exit__¶

def __exit__(self, exc_type, exc_value, traceback)

Called on exit from the with statement. Calls the [closePool](#closepool) method.

Parameters

Parameter	Allowed types	Description	Default
exc_type	–	Standard context manager argument; see the Python documentation for the with statement	–
exc_value	–	Standard context manager argument; see the Python documentation for the with statement	–
traceback	–	Standard context manager argument; see the Python documentation for the with statement	–

Return value

None.
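
The with protocol described above can be illustrated with a stand-in class. This is a minimal sketch, not the library's implementation: SketchProvider, its pool attribute, and the closed flag are hypothetical names, and holding the threads in a multiprocessing.pool.ThreadPool is an assumption.

```python
from multiprocessing.pool import ThreadPool


class SketchProvider:
    """Hypothetical stand-in that mimics the documented with-statement behaviour."""

    def __init__(self, numofthreads=4):
        # Assumption: the pool is a ThreadPool with numofthreads workers.
        self.pool = ThreadPool(numofthreads)
        self.closed = False

    def __enter__(self):
        # Return the object itself so it can be bound with "as".
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release the pool; returning None does not suppress exceptions.
        self.closePool()

    def closePool(self):
        self.pool.close()
        self.pool.join()
        self.closed = True


with SketchProvider(numofthreads=2) as provider:
    inside = provider.closed   # False while the block is running

after = provider.closed        # True once the block has exited
```

Using the with statement means the pool is released even if the body raises, which is the main reason to prefer it over calling closePool by hand.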

closePool¶

def closePool(self)

Closes the thread pool.

Parameters

None.

Return value

None.

addTransformer¶

def addTransformer(self, transformer)

Adds a transformer to this object's list of transformers.

Parameters

Parameter	Allowed types	Description	Default
transformer	Transformer	Transformer to be used for data conversion	–

Return value

None.

getNextChunk¶

def getNextChunk(self, chunksize, **kwargs)

Returns a data batch of size chunksize. The kwargs parameter specifies how the data batch is constructed. By default the method does nothing; you have to override it to suit your needs.

Parameters

Parameter	Allowed types	Description	Default
chunksize	int	Size of returned data batches	–
**kwargs	dict	Dictionary that can specify the parameters for constructing data batches	–

Return value

There is no default return value in Provider. If the method is overridden, the return value is the same as in Merger.
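
Overriding getNextChunk might look like the sketch below. It is self-contained and does not import the library: SketchBase stands in for the base class, ListProvider and the offset keyword are hypothetical, and returning a plain list is an assumption (the actual return format follows Merger).

```python
class SketchBase:
    """Hypothetical stand-in for the base class: getNextChunk does nothing."""

    def getNextChunk(self, chunksize, **kwargs):
        return None


class ListProvider(SketchBase):
    """Hypothetical subclass that serves chunks from an in-memory list."""

    def __init__(self, items):
        self.items = list(items)
        self.position = 0

    def getNextChunk(self, chunksize, **kwargs):
        # 'offset' is an illustrative keyword, not a documented parameter.
        start = kwargs.get("offset", self.position)
        chunk = self.items[start:start + chunksize]
        # Remember where the next chunk should begin.
        self.position = start + len(chunk)
        return chunk


provider = ListProvider(range(10))
first = provider.getNextChunk(4)    # [0, 1, 2, 3]
second = provider.getNextChunk(4)   # [4, 5, 6, 7]
tail = provider.getNextChunk(4)     # [8, 9], shorter than chunksize
```

The last chunk may be shorter than chunksize when the source runs out, so callers should not assume a fixed length.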

prepareData¶

def prepareData(self, chunksize=20000, **kwargs)

Fetches the next data batch with getNextChunk and prepares the transformed data in multithreaded mode.

Parameters

Parameter	Allowed types	Description	Default
chunksize	int	Size of returned data batches	20000
**kwargs	dict	Dictionary that can specify the parameters for constructing data batches	–

Return value

None.

getData¶

def getData(self)

Returns the processed data. It is called after prepareData; if the data is not ready yet, it waits for processing to finish.

Parameters

None.

Return value

Prepared data from self.data.
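
The hand-off between prepareData and getData can be sketched as follows. This is an illustration under stated assumptions, not the library's code: SketchProvider, process, pending, and double are hypothetical names, transformers are modelled as plain callables, and the even split of a chunk across threads is a guess at one reasonable strategy.

```python
from multiprocessing.pool import ThreadPool


def double(batch):
    # Illustrative transformer: a plain callable (assumption).
    return [x * 2 for x in batch]


class SketchProvider:
    """Hypothetical stand-in showing the prepareData / getData hand-off."""

    def __init__(self, source, numofthreads=4):
        self.source = list(source)
        self.position = 0
        self.numofthreads = numofthreads
        self.pool = ThreadPool(numofthreads)
        self.transformers = [double]
        self.pending = None
        self.data = None

    def getNextChunk(self, chunksize, **kwargs):
        chunk = self.source[self.position:self.position + chunksize]
        self.position += len(chunk)
        return chunk

    def process(self, batch):
        for transformer in self.transformers:
            batch = transformer(batch)
        return batch

    def prepareData(self, chunksize=20000, **kwargs):
        chunk = self.getNextChunk(chunksize, **kwargs)
        # Split the chunk into one batch per thread (ceiling division).
        step = max(1, -(-len(chunk) // self.numofthreads))
        batches = [chunk[i:i + step] for i in range(0, len(chunk), step)]
        # Start processing in the background and return immediately.
        self.pending = [
            self.pool.apply_async(self.process, (batch,)) for batch in batches
        ]

    def getData(self):
        # AsyncResult.get() blocks until the corresponding batch is ready.
        self.data = [x for result in self.pending for x in result.get()]
        return self.data

    def closePool(self):
        self.pool.close()
        self.pool.join()


provider = SketchProvider(range(6), numofthreads=2)
provider.prepareData(chunksize=4)   # returns without waiting
data = provider.getData()           # waits, then yields [0, 2, 4, 6]
provider.closePool()
```

Because prepareData returns immediately, the caller can do other work (for example, train on the previous chunk) before calling getData.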

worker¶

def worker(transformers, batch, threadidx)

In separate threads, workers apply each transformer to their data batch.

Parameters

Parameter	Allowed types	Description	Default
transformers	list	Transformer array self.transformers	–
batch	np.ndarray, list	Data batch from self.data, passed to this worker for processing	–
threadidx	int	Index of the worker thread	–

Return value

A tuple (batch, threadidx), where batch is the processed data batch.
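
A worker with the documented signature and return value could be sketched like this. The body is an assumption: transformers are treated as plain callables that take and return a batch, add_one and square are hypothetical examples, and the dispatch through ThreadPool.starmap is only one possible way to run the workers.

```python
from multiprocessing.pool import ThreadPool


def worker(transformers, batch, threadidx):
    # Apply every transformer in order; each is assumed to be a callable
    # that takes a batch and returns the transformed batch.
    for transformer in transformers:
        batch = transformer(batch)
    return batch, threadidx


def add_one(batch):
    return [x + 1 for x in batch]


def square(batch):
    return [x * x for x in batch]


transformers = [add_one, square]
batches = [[1, 2], [3, 4]]

pool = ThreadPool(2)
results = pool.starmap(
    worker, [(transformers, batch, idx) for idx, batch in enumerate(batches)]
)
pool.close()
pool.join()

# The thread index lets the caller restore the original batch order.
ordered = [batch for batch, idx in sorted(results, key=lambda r: r[1])]
```

Returning threadidx alongside the batch keeps the result self-describing, so the caller can reassemble batches correctly even if they complete out of order.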
