Reporters
- class gordo.reporters.base.BaseReporter
Bases:
ABC
- class gordo.reporters.mlflow.MlFlowReporter(*args, model_builder_class: str | Type[ModelBuilder] | None = None, **kwargs)
Bases:
BaseReporter
- exception gordo.reporters.mlflow.MlflowLoggingError
Bases:
ReporterException
- gordo.reporters.mlflow.batch_log_items(metrics: List[Metric], params: List[Param], n_max_metrics: int = 200, n_max_params: int = 100) → List[Dict[str, Metric | Param]]
Split metrics, params and tags into batches that satisfy the limits imposed by MLflow and AzureML.
NOTE: The default maximum numbers of metrics and parameters set here are those imposed by AzureML as of 18 February 2020.
Also, the 1 MB request size limit is not evaluated here, as doing so should not be necessary and is not addressable in a succinct way. MLflow also has a limit of 1000 log items per request, but reaching this is not possible with AzureML’s current limit on metrics.
- Parameters:
metrics – List of MLFlow Metric objects to log.
params – List of MLFlow Param objects to log.
n_max_metrics – Limit to number of metrics AzureML allows per batch log request payload.
n_max_params – Limit to number of params MLFlow allows per batch log request payload.
- Returns:
List of MlflowClient.log_batch keyword arguments, split into quantities
that respect the limits present for MLflow and AzureML.
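The batching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not gordo's implementation: `split_into_batches` is a hypothetical helper, and plain Python values stand in for MLflow Metric and Param objects. Only the `metrics` and `params` keys, which match MlflowClient.log_batch keyword arguments, and the default limits are taken from the documentation above.

```python
from math import ceil
from typing import Any, Dict, List


def split_into_batches(
    metrics: List[Any],
    params: List[Any],
    n_max_metrics: int = 200,
    n_max_params: int = 100,
) -> List[Dict[str, List[Any]]]:
    """Hypothetical helper: split two lists into per-request batches."""
    # Use enough batches to hold whichever list needs the most requests.
    n_batches = max(
        ceil(len(metrics) / n_max_metrics),
        ceil(len(params) / n_max_params),
    )
    batches = []
    for i in range(n_batches):
        batches.append(
            {
                "metrics": metrics[i * n_max_metrics : (i + 1) * n_max_metrics],
                "params": params[i * n_max_params : (i + 1) * n_max_params],
            }
        )
    return batches


# Integers stand in for Metric and Param objects (illustration only).
batches = split_into_batches(list(range(450)), list(range(120)))
print([(len(b["metrics"]), len(b["params"])) for b in batches])
# prints [(200, 100), (200, 20), (50, 0)]
```

Each dictionary in the result could then be unpacked into one logging call per batch, so that no single request exceeds either limit.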
- gordo.reporters.mlflow.epoch_now() → int
Get current timestamp in UTC as milliseconds since Unix epoch.
- Return type:
Milliseconds since Unix epoch.
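One way to compute such a timestamp with the standard library is sketched below. `epoch_ms_now` is a hypothetical stand-in written for this illustration; whether gordo computes the value the same way is an assumption.

```python
from datetime import datetime, timezone


def epoch_ms_now() -> int:
    """Hypothetical stand-in: current UTC time as integer milliseconds."""
    # timestamp() returns seconds since the Unix epoch as a float.
    return int(datetime.now(timezone.utc).timestamp() * 1000)


print(epoch_ms_now())
```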
- gordo.reporters.mlflow.get_kwargs_from_secret(name: str, keys: List[str]) → dict
Get keyword arguments dictionary from secrets environment variable
- Parameters:
name – Name of the environment variable whose content is a colon separated list of secrets.
- Return type:
Dictionary of keyword arguments parsed from environment variable.
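The parsing described above can be sketched as follows. This is a minimal illustration, assuming that `keys` names the colon-separated elements in order; `kwargs_from_env`, the variable name `EXAMPLE_WORKSPACE_STR`, and the handling of unset or empty values are all assumptions made for this example rather than documented behavior.

```python
import os
from typing import Dict, List


def kwargs_from_env(name: str, keys: List[str]) -> Dict[str, str]:
    """Hypothetical helper: map colon-separated secrets onto the given keys."""
    secret = os.environ.get(name)
    if secret is None:
        raise ValueError(f"Environment variable {name!r} is not set")
    if secret == "":
        # Assumption: an empty value means "no configuration".
        return {}
    values = secret.split(":")
    if len(values) != len(keys):
        raise ValueError(
            f"Expected {len(keys)} colon-separated values in {name!r}, "
            f"got {len(values)}"
        )
    return dict(zip(keys, values))


# EXAMPLE_WORKSPACE_STR and its value are made up for this illustration.
os.environ["EXAMPLE_WORKSPACE_STR"] = "my-subscription:my-group:my-workspace"
workspace_kwargs = kwargs_from_env(
    "EXAMPLE_WORKSPACE_STR",
    ["subscription_id", "resource_group", "workspace_name"],
)
print(workspace_kwargs)
```

Note that a scheme like this cannot represent secrets that themselves contain a colon.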
- gordo.reporters.mlflow.get_machine_log_items(machine: Machine) → Tuple[List[Metric], List[Param]]
Create flat lists of MLflow logging entities from multilevel dictionary
For more information, see the mlflow docs.
- Parameters:
machine – Machine to log.
- Returns:
List of MLFlow Metric objects to log.
List of MLFlow Param objects to log.
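Flattening a multilevel dictionary, as mentioned above, can be sketched generically as follows. The `flatten` helper, the dotted key naming, and the sample metadata are assumptions for illustration; how gordo names keys and decides which values become metrics or params is not stated here.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten(nested: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_key, leaf_value) pairs from a nested dictionary."""
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the key path.
            yield from flatten(value, full_key)
        else:
            yield full_key, value


# Made-up metadata, for illustration only.
metadata = {"model": {"epochs": 10, "scores": {"r2": 0.93, "mse": 0.02}}}
flat = dict(flatten(metadata))
print(flat)
# prints {'model.epochs': 10, 'model.scores.r2': 0.93, 'model.scores.mse': 0.02}
```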
- gordo.reporters.mlflow.get_mlflow_client(workspace_kwargs: dict = {}, service_principal_kwargs: dict = {}) → MlflowClient
Set remote tracking URI for mlflow to AzureML workspace
- Parameters:
workspace_kwargs –
AzureML Workspace configuration to use for remote MLFlow tracking. An empty dict will result in local logging by the MlflowClient.
{ "subscription_id":<value>, "resource_group":<value>, "workspace_name":<value> }
service_principal_kwargs (dict) –
AzureML ServicePrincipalAuthentication keyword arguments. An empty dict will result in interactive authentication.
{ "tenant_id":<value>, "service_principal_id":<value>, "service_principal_password":<value> }
- Return type:
Client with tracking uri set to AzureML if configured.
- gordo.reporters.mlflow.get_run_id(client: MlflowClient, experiment_name: str, model_key: str) → str
Get an existing or create a new run for the given model_key and experiment_name.
The model key corresponds to a unique configuration of the model. The corresponding run must be manually stopped using the mlflow.tracking.MlflowClient.set_terminated method.
- Parameters:
client – Client with tracking uri set to AzureML if configured.
experiment_name – Name of experiment to log to.
model_key – Unique ID of model configuration.
- Return type:
Unique ID of MLflow run to log to.
- gordo.reporters.mlflow.get_spauth_kwargs() → dict
Get AzureML keyword arguments from environment
The name of this environment variable is set in the Argo workflow template, and its value should be in the format: <tenant_id>:<service_principal_id>:<service_principal_password>
- Returns:
AzureML ServicePrincipalAuthentication keyword arguments. See
gordo.reporters.mlflow.get_mlflow_client().
- gordo.reporters.mlflow.get_workspace_kwargs() → dict
Get AzureML keyword arguments from environment
The name of this environment variable is set in the Argo workflow template, and its value should be in the format: <subscription_id>:<resource_group>:<workspace_name>.
- Returns:
AzureML Workspace configuration to use for remote MLflow tracking. See
gordo.reporters.mlflow.get_mlflow_client().
- gordo.reporters.mlflow.log_machine(mlflow_client: MlflowClient, run_id: str, machine: Machine)
Send logs to configured MLflow backend
- Parameters:
mlflow_client – Client instance to call logging methods from.
run_id – Unique ID of the MLflow run to log to.
machine – Machine to log with MlflowClient.
- gordo.reporters.mlflow.mlflow_context(name: str, model_key: str = 'e040fc7030e64d00a2290451a11d6c38', workspace_kwargs: dict = {}, service_principal_kwargs: dict = {})
Generate MLflow logger function with either a local or AzureML backend
- Parameters:
name – The name of the log group to log to (e.g. a model name).
model_key – Unique ID of logging run.
workspace_kwargs – AzureML Workspace configuration to use for remote MLflow tracking. See gordo.reporters.mlflow.get_mlflow_client().
service_principal_kwargs – AzureML ServicePrincipalAuthentication keyword arguments. See gordo.reporters.mlflow.get_mlflow_client().
Example
>>> import tempfile
>>> import mlflow
>>> with tempfile.TemporaryDirectory() as tmp_dir:
...     mlflow.set_tracking_uri(f"file:{tmp_dir}")
...     with mlflow_context("log_group", "unique_key", {}, {}) as (mlflow_client, run_id):
...         log_machine(mlflow_client, run_id, machine)
- class gordo.reporters.postgres.Machine(*args, **kwargs)
Bases:
Model
- DoesNotExist
alias of
MachineDoesNotExist
- dataset = <BinaryJSONField: Machine.dataset>
- metadata = <BinaryJSONField: Machine.metadata>
- model = <BinaryJSONField: Machine.model>
- name = <CharField: Machine.name>
- class gordo.reporters.postgres.PostgresReporter(host: str, port: int = 5432, user: str = 'postgres', password: str = 'postgres', database: str = 'postgres')
Bases:
BaseReporter
Reporter storing the gordo.machine.Machine into a Postgres database.
- db = <playhouse.postgres_ext.PostgresqlExtDatabase object>
- exception gordo.reporters.postgres.PostgresReporterException
Bases:
ReporterException