Utils
- gordo.server.utils.check_metadata_file(directory: str, name: str)
Checks whether the directory with metadata exists, since it might have been deleted through the DELETE endpoint.
- gordo.server.utils.dataframe_from_dict(data: dict) -> DataFrame
The inverse procedure done by multi_lvl_column_dataframe_from_dict(). Reconstructs a MultiIndex column dataframe from a previously serialized one.
Expects data to be a nested dictionary where each top-level key has a value capable of being loaded by pandas.core.DataFrame.from_dict().
- Parameters:
data – Data to be loaded into a MultiIndex column dataframe
- Return type:
MultiIndex column dataframe.
Examples
>>> serialized = {
...     'feature0': {'sub-feature-0': {'2019-01-01': 0, '2019-02-01': 4},
...                  'sub-feature-1': {'2019-01-01': 1, '2019-02-01': 5}},
...     'feature1': {'sub-feature-0': {'2019-01-01': 2, '2019-02-01': 6},
...                  'sub-feature-1': {'2019-01-01': 3, '2019-02-01': 7}}
... }
>>> dataframe_from_dict(serialized)
                feature0                    feature1
           sub-feature-0 sub-feature-1 sub-feature-0 sub-feature-1
2019-01-01             0             1             2             3
2019-02-01             4             5             6             7
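One way such a nested dictionary could be turned back into a two-level column frame is sketched below. It is a minimal illustration built on pandas.concat and is not taken from gordo's implementation; the helper name nested_dict_to_frame is hypothetical.

```python
import pandas as pd


def nested_dict_to_frame(data: dict) -> pd.DataFrame:
    """Hypothetical helper: rebuild a two-level column frame from a nested dict."""
    # Load each top-level value as its own frame. The dict keys handed to
    # pd.concat become the outer level of the resulting column MultiIndex.
    frames = {top: pd.DataFrame.from_dict(sub) for top, sub in data.items()}
    return pd.concat(frames, axis=1)


serialized = {
    "feature0": {"sub-feature-0": {"2019-01-01": 0, "2019-02-01": 4},
                 "sub-feature-1": {"2019-01-01": 1, "2019-02-01": 5}},
    "feature1": {"sub-feature-0": {"2019-01-01": 2, "2019-02-01": 6},
                 "sub-feature-1": {"2019-01-01": 3, "2019-02-01": 7}},
}
df = nested_dict_to_frame(serialized)
```

In this sketch the index stays as plain strings; converting it to timestamps (for example with pd.to_datetime) would be a separate step.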
- gordo.server.utils.dataframe_from_parquet_bytes(buf: bytes) -> DataFrame
Convert bytes representing a parquet table into a pandas dataframe.
- Parameters:
buf – Bytes representing a parquet table. Can be the direct result of gordo.server.utils.dataframe_into_parquet_bytes()
- gordo.server.utils.dataframe_into_parquet_bytes(df: DataFrame, compression: str = 'snappy') -> bytes
Convert a dataframe into bytes representing a parquet table.
- Parameters:
df – DataFrame to be compressed
compression – Compression to use, passed to pyarrow.parquet.write_table()
- gordo.server.utils.dataframe_to_dict(df: DataFrame) -> dict
Convert a dataframe that can have a pandas.MultiIndex as columns into a dict where each key is the top-level column name and the value holds the columns under that top-level name. If it's a simple dataframe, pandas.core.DataFrame.to_dict() will be used.
This allows json.dumps() to be performed, where pandas.DataFrame.to_dict() would convert such a multi-level column dataframe into keys of tuple objects, which are not JSON serializable. However, this ends up working with pandas.DataFrame.from_dict().
- Parameters:
df – Dataframe expected to have columns of type pandas.MultiIndex, 2 levels deep.
- Return type:
Nested dict representing the dataframe in a 'flattened' form.
Examples
>>> import pprint
>>> import pandas as pd
>>> import numpy as np
>>> columns = pd.MultiIndex.from_tuples((f"feature{i}", f"sub-feature-{ii}") for i in range(2) for ii in range(2))
>>> index = pd.date_range('2019-01-01', '2019-02-01', periods=2)
>>> df = pd.DataFrame(np.arange(8).reshape((2, 4)), columns=columns, index=index)
>>> df
                feature0                    feature1
           sub-feature-0 sub-feature-1 sub-feature-0 sub-feature-1
2019-01-01             0             1             2             3
2019-02-01             4             5             6             7
>>> serialized = dataframe_to_dict(df)
>>> pprint.pprint(serialized)
{'feature0': {'sub-feature-0': {'2019-01-01': 0, '2019-02-01': 4},
              'sub-feature-1': {'2019-01-01': 1, '2019-02-01': 5}},
 'feature1': {'sub-feature-0': {'2019-01-01': 2, '2019-02-01': 6},
              'sub-feature-1': {'2019-01-01': 3, '2019-02-01': 7}}}
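The JSON limitation mentioned above can be seen with the standard library alone. The sketch below uses a hand-written dict keyed by tuples, similar in shape to what a plain to_dict() call yields for two-level columns, and then nests it by top-level name. The variable names are illustrative and none of this is gordo's own code.

```python
import json

# Flat mapping keyed by (top-level, sub-level) tuples.
flat = {
    ("feature0", "sub-feature-0"): {"2019-01-01": 0, "2019-02-01": 4},
    ("feature0", "sub-feature-1"): {"2019-01-01": 1, "2019-02-01": 5},
    ("feature1", "sub-feature-0"): {"2019-01-01": 2, "2019-02-01": 6},
}

try:
    json.dumps(flat)
    tuple_keys_ok = True
except TypeError:
    # json.dumps only accepts str, int, float, bool or None as keys.
    tuple_keys_ok = False

# Nest by top-level name so that every key is a plain string.
nested: dict = {}
for (top, sub), column in flat.items():
    nested.setdefault(top, {})[sub] = column

encoded = json.dumps(nested)
```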
- gordo.server.utils.delete_revision(directory: str, name: str)
Delete a model revision.
- Parameters:
directory – Revision directory
name – Model name
- gordo.server.utils.extract_X_y(method)
For a given flask view, attempts to extract an 'X' and 'y' from the request and assign them to flask's 'g' global request context.
If it fails to extract 'X' and (optionally) 'y' from the request, it will not run the function but will return a BadRequest response notifying the client of the failure.
- Parameters:
method – The flask route to decorate, which will return its own response object and will want to use flask.g.X and/or flask.g.y
- Returns:
Will return a flask.Response with status code 400 if it fails to extract the X and optionally the y. Otherwise will run the decorated method, which is also expected to return some sort of flask.Response object.
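The short-circuit behaviour described above can be sketched without flask. Everything here is a stand-in and an assumption: g is a SimpleNamespace rather than flask.g, the request is a plain dict, responses are (body, status) tuples, and the name extract_X_y_sketch is hypothetical.

```python
import functools
from types import SimpleNamespace

# Stand-in for flask.g; a real view would use flask's request-scoped object.
g = SimpleNamespace()


def extract_X_y_sketch(method):
    """Hypothetical decorator: pull 'X' (required) and 'y' (optional) from a payload."""

    @functools.wraps(method)
    def wrapper(payload: dict, *args, **kwargs):
        if "X" not in payload:
            # Do not call the view; report the problem to the client instead.
            return {"error": "Cannot predict without 'X'"}, 400
        g.X = payload["X"]
        g.y = payload.get("y")
        return method(*args, **kwargs)

    return wrapper


@extract_X_y_sketch
def view():
    # The decorated view reads the extracted values from the shared context.
    return {"rows": len(g.X), "has_y": g.y is not None}, 200
```

Calling view({"X": [[1, 2], [3, 4]]}) runs the decorated function, while view({}) returns the 400 tuple without ever reaching it.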
- gordo.server.utils.load_metadata(directory: str, name: str) -> dict
Load metadata from a directory for a given model by name.
- Parameters:
directory – Directory to look for the model's metadata
name – Name of the model to load metadata for; this would be the subdirectory within the directory parameter.
- gordo.server.utils.load_model(directory: str, name: str) -> BaseEstimator
Load a given model from the directory by name.
- Parameters:
directory – Directory to look for the model
name – Name of the model to load; this would be the subdirectory within the directory parameter.
- gordo.server.utils.metadata_required(f)
Decorate a view which has gordo_name as a URL parameter and will set g.metadata to that model's metadata
- gordo.server.utils.model_required(f)
Decorate a view which has gordo_name as a URL parameter and will set g.model to be the loaded model and g.metadata to that model's metadata
- gordo.server.utils.validate_gordo_name(gordo_name: str)
The gordo_name argument should contain alphanumeric characters or '-' symbols.
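A check of this kind could be written with a regular expression, as in the sketch below. The exact pattern, the exception type and the name validate_name_sketch are assumptions made for illustration; the source does not state how gordo reports an invalid name.

```python
import re

# Assumed pattern: one or more ASCII letters, digits or '-' characters.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def validate_name_sketch(gordo_name: str) -> str:
    """Hypothetical validator: raise ValueError if any other character appears."""
    if not _NAME_PATTERN.fullmatch(gordo_name):
        raise ValueError(f"Invalid gordo_name: {gordo_name!r}")
    return gordo_name
```

Restricting names this way also keeps path separators and dots out of a value that is later used as a subdirectory name.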
The general model input/output operations applied by blueprints.
- gordo.server.model_io.get_model_output(model: Pipeline, X: ndarray) -> ndarray
Get the raw output from the current model given X. Will try to predict and then transform, raising an error if both fail.
- Parameters:
X – 2d array of sample(s)
- Return type:
The raw output of the model in numpy array form.
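The predict-then-transform fallback can be sketched with duck-typed stand-ins. Which exceptions gordo actually catches and raises is not stated in the source, so the use of AttributeError and ValueError here is an assumption, and all names below are hypothetical.

```python
class OnlyPredicts:
    def predict(self, X):
        return [sum(row) for row in X]


class OnlyTransforms:
    def transform(self, X):
        return [[value * 2 for value in row] for row in X]


def get_output_sketch(model, X):
    """Hypothetical helper: try predict() first, then fall back to transform()."""
    try:
        return model.predict(X)
    except AttributeError:
        # No usable predict(); fall through and try transform() instead.
        pass
    try:
        return model.transform(X)
    except AttributeError as exc:
        raise ValueError("Model has neither a usable predict nor transform") from exc
```

With X = [[1, 2], [3, 4]], the first stand-in is served by predict() and the second by transform(); an object offering neither method ends in the ValueError.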