Model

class gordo.machine.model.base.GordoBase(**kwargs)

Bases: ABC

Initialize the model

abstract get_metadata(): Get model-specific metadata, if any

abstract get_params(deep=False): Return a dict containing all parameters used to initialize the object

abstract score(X: ndarray | DataFrame, y: ndarray | DataFrame, sample_weight: ndarray | None = None): Score the model; implementations must use the correct default scorer for the model type
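As a rough illustration of the three abstract methods above, the sketch below defines a stand-in abstract base and a toy subclass. `BaseSketch` and `ConstantModel` are hypothetical names used only here; the stand-in mirrors the method names in this section and is not gordo's own class. The scorer (negative mean absolute error) is likewise just an assumption chosen for the toy model.

```python
from abc import ABC, abstractmethod


class BaseSketch(ABC):
    """Hypothetical stand-in exposing the same three abstract methods."""

    @abstractmethod
    def get_metadata(self):
        ...

    @abstractmethod
    def get_params(self, deep=False):
        ...

    @abstractmethod
    def score(self, X, y, sample_weight=None):
        ...


class ConstantModel(BaseSketch):
    """Toy model that always predicts one constant (illustration only)."""

    def __init__(self, constant=0.0):
        self.constant = constant

    def get_metadata(self):
        # No model-specific metadata for this toy model
        return {}

    def get_params(self, deep=False):
        # Everything needed to re-create the object
        return {"constant": self.constant}

    def score(self, X, y, sample_weight=None):
        # Weighted negative mean absolute error, so that higher is better
        weights = sample_weight if sample_weight is not None else [1.0] * len(y)
        errors = [abs(target - self.constant) for target in y]
        return -sum(w * e for w, e in zip(weights, errors)) / sum(weights)


model = ConstantModel(constant=2.0)
params = model.get_params()
score = model.score([[0.0], [1.0]], [1.0, 3.0])
```

Instantiating `BaseSketch` directly raises `TypeError`, because its abstract methods have no implementation.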

class gordo.machine.model.models.KerasAutoEncoder(*args: Any, **kwargs: Any)

Bases: KerasBaseEstimator, TransformerMixin

Subclass of the KerasBaseEstimator to allow fitting to just X without requiring y.

Initializes a scikit-learn API-compatible Keras model from a pre-registered builder function or from a builder function passed directly.

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs
kwargs (dict) – Any additional args which are passed to the factory building function and/or any additional args to be passed to Keras’ fit() method
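The two ways of supplying kind (a registered name or a builder function) can be sketched with a small stand-in registry. Everything here is hypothetical: `_BUILDERS`, `register`, `tiny_network`, and `resolve_kind` are illustrative names, not gordo's API, and the builder returns a plain dict where a real builder would return a compiled Keras model.

```python
from typing import Callable, Dict, Union

# Hypothetical registry; gordo keeps its own via register_model_builder().
_BUILDERS: Dict[str, Callable] = {}


def register(func: Callable) -> Callable:
    """Record a builder under its function name (illustrative only)."""
    _BUILDERS[func.__name__] = func
    return func


@register
def tiny_network(n_features: int, **kwargs) -> dict:
    # n_features comes first; everything else arrives through **kwargs.
    # A dict stands in for the Keras model a real builder would return.
    return {"n_features": n_features, "units": kwargs.get("units", 8)}


def resolve_kind(kind: Union[str, Callable]) -> Callable:
    """Accept either a registered name or a builder function."""
    if callable(kind):
        return kind
    return _BUILDERS[kind]


by_name = resolve_kind("tiny_network")(n_features=4, units=16)
direct = resolve_kind(tiny_network)(n_features=4, units=16)
```

Both calls reach the same builder, so `by_name` and `direct` are equal.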

score(X: ndarray | DataFrame, y: ndarray | DataFrame, sample_weight: ndarray | None = None, **kwargs) → float

Returns the explained variance score between the autoencoder’s input and its output.

Parameters:

X – Input data to the model
y – Target
sample_weight – sample weights
kwargs – Additional kwargs for model.predict()

Returns:

The explained variance score
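For intuition about the metric itself, here is a minimal single-column sketch of explained variance, computed as 1 - Var(y_true - y_pred) / Var(y_true). It uses only the standard library and is not the code path gordo runs; the function name `explained_variance` and the sample values are assumptions for illustration.

```python
from statistics import pvariance


def explained_variance(y_true, y_pred):
    """1 - Var(y_true - y_pred) / Var(y_true) for a single output column."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(residuals) / pvariance(y_true)


inputs = [1.0, 2.0, 3.0, 4.0]

# A perfect reconstruction leaves no residual variance.
perfect = explained_variance(inputs, inputs)

# A constant offset also leaves no residual variance: the metric ignores bias.
constant_offset = explained_variance(inputs, [x + 0.5 for x in inputs])

# Predicting the mean everywhere explains none of the variance.
mean_only = explained_variance(inputs, [2.5] * len(inputs))
```

The constant-offset case is worth noting: a reconstruction that is systematically shifted still scores as well as a perfect one under this metric.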

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasAutoEncoder

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class gordo.machine.model.models.KerasBaseEstimator(*args: Any, **kwargs: Any)

Bases: KerasRegressor, GordoBase, BaseEstimator

Initializes a scikit-learn API-compatible Keras model from a pre-registered builder function or from a builder function passed directly.

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs
kwargs (dict) – Any additional args which are passed to the factory building function and/or any additional args to be passed to Keras’ fit() method

classmethod extract_supported_fit_args(kwargs)

Filters kwargs, keeping only the fit-related arguments.

Parameters:: kwargs – Keyword arguments to filter

fit(X: ndarray | DataFrame | DataArray, y: ndarray | DataFrame | DataArray, **kwargs)

Fit the model to X given y.

Parameters:

X – numpy array or pandas dataframe
y – numpy array or pandas dataframe
sample_weight – array-like; weights to assign to samples
kwargs – Any additional kwargs to supply to keras fit method.

classmethod from_definition(definition: dict)

Handler for gordo.serializer.from_definition()

Parameters:: definition – Model definition

get_metadata()

Get metadata for the KerasBaseEstimator. Includes a dictionary with key “history”. The key’s value is a dictionary with a key “params” pointing to another dictionary with various parameters. The metrics are defined in the params dictionary under “metrics”. For each of the metrics there is a key whose value is a list of values for this metric per epoch.

Return type:: Metadata dictionary, including a history object if present
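The nesting described above can be easier to follow with a concrete shape. The dictionary below is hypothetical: its layout follows the description in this section, but the metric name and every number are made up for illustration and do not come from a real training run.

```python
# Hypothetical metadata shaped as described above; all values are made up.
metadata = {
    "history": {
        "params": {"epochs": 3, "metrics": ["loss"]},
        "loss": [0.9, 0.5, 0.3],  # one value per epoch
    }
}

# The history may be absent, so fall back to empty containers.
history = metadata.get("history", {})
metric_names = history.get("params", {}).get("metrics", [])

# Final-epoch value for every recorded metric
final_values = {
    name: history[name][-1] for name in metric_names if name in history
}
```

Using `.get()` with defaults keeps the lookup safe when no history is present in the metadata.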

static get_n_features(X: ndarray | DataFrame | DataArray) → int | tuple

static get_n_features_out(y: ndarray | DataFrame | DataArray) → int | tuple

get_params(**params)

Gets the parameters for this estimator

Parameters:: params – ignored (exists for API compatibility).
Return type:: Parameters used in this estimator

into_definition() → dict: Handler for gordo.serializer.into_definition

load_kind(kind)

static parse_module_path(module_path) → Tuple[str | None, str]

predict(X: ndarray, **kwargs) → ndarray

Parameters:

X – Input data
kwargs – kwargs which are passed to Keras’ predict method

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasBaseEstimator

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

property sk_params: Parameters used for scikit-learn kwargs

supported_fit_args = ['batch_size', 'epochs', 'verbose', 'callbacks', 'validation_split', 'shuffle', 'class_weight', 'initial_epoch', 'steps_per_epoch', 'validation_batch_size', 'max_queue_size', 'workers', 'use_multiprocessing']
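A list like supported_fit_args lends itself to splitting mixed keyword arguments into those meant for fitting and those meant for the model builder. The sketch below shows one way to do that under the assumption that filtering is a simple membership test; `SUPPORTED_FIT_ARGS` is a shortened copy of the list above and `split_fit_args` is a hypothetical helper, not a gordo function.

```python
# Shortened copy of the list above, for illustration only
SUPPORTED_FIT_ARGS = ["batch_size", "epochs", "verbose", "validation_split", "shuffle"]


def split_fit_args(kwargs):
    """Separate fit-related kwargs from everything else."""
    fit_args = {k: v for k, v in kwargs.items() if k in SUPPORTED_FIT_ARGS}
    other_args = {k: v for k, v in kwargs.items() if k not in SUPPORTED_FIT_ARGS}
    return fit_args, other_args


# "encoding_dim" is a made-up builder argument used only for this example
fit_args, builder_args = split_fit_args(
    {"epochs": 5, "batch_size": 32, "encoding_dim": [16, 8]}
)
```

Here `fit_args` holds `epochs` and `batch_size`, while the unrecognised `encoding_dim` ends up in `builder_args`.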

class gordo.machine.model.models.KerasLSTMAutoEncoder(*args: Any, **kwargs: Any)

Bases: KerasLSTMBaseEstimator

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs.
lookback_window – Number of timestamps (lags) used to train the model.
batch_size – Number of training examples used in one batch, i.e. per gradient update.
epochs – Number of epochs to train the model. An epoch is an iteration over the entire data provided.
verbose – Verbosity mode. Possible values are 0, 1, or 2 where 0 = silent, 1 = progress bar, 2 = one line per epoch.
kwargs – Any arguments which are passed to the factory building function and/or any additional args to be passed to the intermediate fit method.

kwargs

property lookahead: int: Steps ahead in y the model should target

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasLSTMAutoEncoder

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class gordo.machine.model.models.KerasLSTMBaseEstimator(*args: Any, **kwargs: Any)

Bases: KerasBaseEstimator, TransformerMixin

Abstract base class for training a many-to-one LSTM autoencoder and an LSTM one-step forecast.

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs.
lookback_window – Number of timestamps (lags) used to train the model.
batch_size – Number of training examples used in one batch, i.e. per gradient update.
epochs – Number of epochs to train the model. An epoch is an iteration over the entire data provided.
verbose – Verbosity mode. Possible values are 0, 1, or 2 where 0 = silent, 1 = progress bar, 2 = one line per epoch.
kwargs – Any arguments which are passed to the factory building function and/or any additional args to be passed to the intermediate fit method.

fit(X: ndarray, y: ndarray, **kwargs) → KerasLSTMForecast

This fits a one step forecast LSTM architecture.

Parameters:

X – 2D numpy array of dimension n_samples x n_features. Input data to train.
y – 2D numpy array representing the target
kwargs – Any additional args to be passed to Keras fit_generator method.

Return type:

KerasLSTMForecast

get_metadata()

Add number of forecast steps to metadata

Return type:: Metadata dictionary, including forecast steps.

kwargs

abstract property lookahead: int: Steps ahead in y the model should target

predict(X: ndarray, **kwargs) → ndarray

Parameters:

X – Data to predict/transform. 2D numpy array of dimension n_samples x n_features where n_samples must be > lookback_window.

Returns:

2D numpy array of dimension (n_samples - lookback_window) x 2*n_features.
The first half of the array (results[:, :n_features]) corresponds to X offset
by lookback_window+1 (i.e., X[lookback_window:,:]) whereas the second half corresponds to
the predicted values of X[lookback_window:,:].

Example

>>> import numpy as np
>>> from gordo.machine.model.factories.lstm_autoencoder import lstm_model
>>> from gordo.machine.model.models import KerasLSTMForecast
>>> #Define train/test data
>>> X_train = np.array([[1, 1], [2, 3], [0.5, 0.6], [0.3, 1], [0.6, 0.7]])
>>> X_test = np.array([[2, 3], [1, 1], [0.1, 1], [0.5, 2]])
>>> #Initiate model, fit and transform
>>> lstm_ae = KerasLSTMForecast(kind="lstm_model",
...                             lookback_window=2,
...                             verbose=0)
>>> model_fit = lstm_ae.fit(X_train, y=X_train.copy())
>>> model_transform = lstm_ae.predict(X_test)
>>> model_transform.shape
(2, 2)

score(X: ndarray | DataFrame, y: ndarray | DataFrame, sample_weight: ndarray | None = None, **kwargs) → float

Returns the explained variance score between 1 step forecasted input and true input at next time step (note: for LSTM X is offset by lookback_window).

Parameters:

X – Input data to the model.
y – Target
sample_weight – Sample weights
kwargs – Additional kwargs for predict

Returns:

The explained variance score.
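Because the LSTM produces fewer output rows than it receives, the targets have to be trimmed before they can be compared with the predictions. The sketch below shows one plausible alignment, assuming the predictions correspond to the final rows of y; `align_targets` is a hypothetical helper, and the placeholder predictions stand in for real model output.

```python
def align_targets(y, predictions):
    """Keep the last len(predictions) rows of y so both have equal length."""
    return y[-len(predictions):]


lookback_window = 2
X_test = [[2, 3], [1, 1], [0.1, 1], [0.5, 2]]

# Placeholder predictions: one row per forecastable step,
# i.e. n_samples - lookback_window rows for a one-step forecast.
predictions = [[0.0, 0.0] for _ in range(len(X_test) - lookback_window)]

# Rows of X_test that the forecasts are compared against
y_aligned = align_targets(X_test, predictions)
```

With four input rows and a lookback window of two, only the last two rows remain as targets.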

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasLSTMBaseEstimator

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class gordo.machine.model.models.KerasLSTMForecast(*args: Any, **kwargs: Any)

Bases: KerasLSTMBaseEstimator

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs.
lookback_window – Number of timestamps (lags) used to train the model.
batch_size – Number of training examples used in one batch, i.e. per gradient update.
epochs – Number of epochs to train the model. An epoch is an iteration over the entire data provided.
verbose – Verbosity mode. Possible values are 0, 1, or 2 where 0 = silent, 1 = progress bar, 2 = one line per epoch.
kwargs – Any arguments which are passed to the factory building function and/or any additional args to be passed to the intermediate fit method.

kwargs

property lookahead: int: Steps ahead in y the model should target

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasLSTMForecast

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

class gordo.machine.model.models.KerasRawModelRegressor(*args: Any, **kwargs: Any)

Bases: KerasAutoEncoder

Create a scikit-learn like model with an underlying tensorflow.keras model from a raw config.

Examples

>>> import yaml
>>> import numpy as np
>>> config_str = '''
...   # Arguments to the .compile() method
...   compile:
...     loss: mse
...     optimizer: adam
...
...   # The architecture of the model itself.
...   spec:
...     tensorflow.keras.models.Sequential:
...       layers:
...         - tensorflow.keras.layers.Dense:
...             units: 4
...         - tensorflow.keras.layers.Dense:
...             units: 1
... '''
>>> config = yaml.safe_load(config_str)
>>> model = KerasRawModelRegressor(kind=config)
>>>
>>> X, y = np.random.random((10, 4)), np.random.random((10, 1))
>>> model.fit(X, y, verbose=0)
KerasRawModelRegressor(kind: {'compile': {'loss': 'mse', 'optimizer': 'adam'},
 'spec': {'tensorflow.keras.models.Sequential': {'layers': [{'tensorflow.keras.layers.Dense': {'units': 4}},
                                                            {'tensorflow.keras.layers.Dense': {'units': 1}}]}}})
>>> out = model.predict(X)

Initializes a scikit-learn API-compatible Keras model from a pre-registered builder function or from a builder function passed directly.

Parameters:

kind – The structure of the model to build, as designated by any builder function registered with gordo.machine.model.register.register_model_builder(). Alternatively, one may pass a builder function directly to this argument. Such a function should accept n_features as its first argument and accept any additional parameters as **kwargs
kwargs (dict) – Any additional args which are passed to the factory building function and/or any additional args to be passed to Keras’ fit() method

kwargs

load_kind(kind)

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KerasRawModelRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

gordo.machine.model.models.create_keras_timeseriesgenerator(X: ndarray, y: ndarray | None, batch_size: int, lookback_window: int, lookahead: int) → tensorflow.keras.preprocessing.sequence.TimeseriesGenerator

Provides a keras.preprocessing.sequence.TimeseriesGenerator for use with LSTMs, but with the added ability to specify the lookahead of the target in y.

If lookahead==0 then the last element of each generated sample in X is the same as the corresponding Y. If lookahead is 1 then the values in Y are shifted so that they are one step in the future compared to the last value in the samples in X, and similarly for larger values.

Parameters:

X – 2d array of values, each row being one sample.
y – array representing the target.
batch_size – How big should the generated batches be?
lookback_window – How far back each sample should see. 1 means that it contains a single measurement.
lookahead – How much is Y shifted relative to X

Returns:

3d matrix with a list of batchX-batchY pairs, where batchX is a batch of
X-values, and correspondingly for batchY. A batch consists of batch_size
pairs of samples (or y-values), and each sample is a list of length
lookback_window.

Examples

>>> import numpy as np
>>> X, y = np.random.rand(100,2), np.random.rand(100, 2)
>>> gen = create_keras_timeseriesgenerator(X, y,
...                                        batch_size=10,
...                                        lookback_window=20,
...                                        lookahead=0)
>>> len(gen) # 9 = ceil((100 - 20 + 1) / 10)
9
>>> len(gen[0]) # batchX and batchY
2
>>> len(gen[0][0]) # batch_size=10
10
>>> len(gen[0][0][0]) # a single sample, lookback_window = 20
20
>>> len(gen[0][0][0][0]) # n_features = 2
2
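The relationship between lookback_window, lookahead, and the number of batches can be worked through without Keras. The pure-Python sketch below pairs each window of X with its target row in y; `make_windows` is a hypothetical helper that reproduces only the indexing described in this section, not the generator's implementation.

```python
from math import ceil


def make_windows(X, y, lookback_window, lookahead):
    """Pair each lookback window of X with the y row `lookahead` steps after its last row."""
    pairs = []
    last_start = len(X) - lookback_window - lookahead
    for start in range(last_start + 1):
        end = start + lookback_window      # exclusive end of the window
        target_index = end - 1 + lookahead  # lookahead=0 targets the window's last row
        pairs.append((X[start:end], y[target_index]))
    return pairs


# 100 rows with 2 features each, used as both input and target
X = [[float(i), float(i) * 10] for i in range(100)]

same_step = make_windows(X, X, lookback_window=20, lookahead=0)
next_step = make_windows(X, X, lookback_window=20, lookahead=1)

# Number of batches when the pairs are grouped ten at a time
n_batches = ceil(len(same_step) / 10)
```

With lookahead=0 there are 100 - 20 + 1 = 81 windows, which gives 9 batches of at most 10; with lookahead=1 one window fewer fits, leaving 80.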