reading-notes

Module 3 - Build and operate machine learning solutions with Azure Machine Learning

Introduction to the Azure Machine Learning SDK

Azure Machine Learning enables you to manage data preparation, training, validation, and deployment for machine learning models. It supports existing frameworks such as Scikit-Learn, PyTorch, and TensorFlow, and provides a cross-platform service for operationalizing machine learning in the cloud.

Azure Machine Learning workspaces

A workspace is a context for the experiments, data, compute targets, and other assets associated with a machine learning workload. A workspace defines the boundary for a set of related machine learning assets. You can use workspaces to group machine learning assets based on projects, deployment environments (for example, test and production), teams, or some other organizing principle. The assets in a workspace include:

- Compute targets for development, training, and deployment.
- Data for experimentation and model training.
- Notebooks containing shared code and documentation.
- Experiments, including run history with logged metrics and outputs.
- Pipelines that define orchestrated multi-step processes.
- Models that you have trained.

Workspaces are Azure resources, and as such they are defined within a resource group in an Azure subscription, along with other related Azure resources that are required to support the workspace. The Azure resources created alongside a workspace include:

- A storage account, used to store files used by the workspace as well as data for experiments and model training.
- An Application Insights instance, used to monitor predictive services in the workspace.
- An Azure Key Vault instance, used to manage secrets such as authentication keys and credentials used by the workspace.
- A container registry, created as needed to manage containers for deployed models.
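
You can create a workspace (and the resources above) directly from the SDK. A minimal sketch, where the workspace and resource group names and the subscription ID are illustrative placeholders:

  from azureml.core import Workspace

  # Create a new workspace and its associated Azure resources
  ws = Workspace.create(name='aml-workspace',
                        subscription_id='123456-abc-123...',
                        resource_group='aml-resources',
                        create_resource_group=True,
                        location='eastus')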

You can assign role-based access control (RBAC) authorization policies to a workspace, enabling you to manage permissions that restrict what actions specific Azure Active Directory (AAD) principals can perform.
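
After a workspace exists, you can connect to it from the SDK. A minimal sketch, assuming you have downloaded the workspace's config.json file to your working directory (or a parent directory):

  from azureml.core import Workspace

  # Load the workspace details from the config.json file
  ws = Workspace.from_config()
  print(ws.name)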


Azure Machine Learning tools and interfaces

Azure Machine Learning provides a cloud-based service that offers flexibility in how you use it. There are user interfaces specifically designed for Azure Machine Learning:

- Azure Machine Learning studio, a web-based interface for managing the assets in a workspace.
- The Azure Machine Learning extension for Visual Studio Code.

Compute Instances

Azure Machine Learning includes the ability to create Compute Instances in a workspace to provide a development environment that is managed with all of the other assets in the workspace. Compute Instances include Jupyter Notebook and JupyterLab installations that you can use to write and run code that uses the Azure Machine Learning SDK to work with assets in your workspace. You can choose a compute instance image that provides the compute specification you need, from small CPU-only VMs to large GPU-enabled workstations. Because compute instances are hosted in Azure, you only pay for the compute resources when they are running. You can store notebooks independently in workspace storage, and open them in any compute instance.

VS Code Extension

The Azure Machine Learning Extension for Visual Studio Code provides a graphical interface for working with assets in an Azure Machine Learning workspace. You can combine the capabilities of the Azure Machine Learning and Python extensions to manage a complete end-to-end machine learning workload in Azure Machine Learning from the Visual Studio Code environment.


Azure Machine Learning experiments

In Azure Machine Learning, an experiment is a named process, usually the running of a script or a pipeline, that can generate metrics and outputs and be tracked in the Azure Machine Learning workspace.

When you submit an experiment, you use its run context to initialize and end the experiment run that is tracked in Azure Machine Learning. After the experiment run has completed, you can view the details of the run in the Experiments tab in Azure Machine Learning studio.

  from azureml.core import Experiment

  # create an experiment variable (ws is an existing Workspace object)
  experiment = Experiment(workspace = ws, name = "my-experiment")

  # start the experiment
  run = experiment.start_logging()

  # experiment code goes here

  # end the experiment
  run.complete()

Every experiment generates log files that include the messages that would be written to the terminal during interactive execution. This enables you to use simple print statements to write messages to the log. However, if you want to record named metrics for comparison across runs, you can do so by using the Run object, which provides a range of logging functions specifically for this purpose. These include:

- log: Record a single named value.
- log_list: Record a named list of values.
- log_row: Record a row with multiple columns.
- log_table: Record a dictionary as a table.
- log_image: Record an image file or a plot.
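
A minimal sketch of these logging functions, where the metric names and values are illustrative:

  # Assumes 'run' is an active Run (for example, from experiment.start_logging())
  run.log('accuracy', 0.95)                              # a single named value
  run.log_list('accuracies', [0.6, 0.7, 0.95])           # a named list of values
  run.log_row('confusion', tp=900, fp=25, fn=40, tn=35)  # a row with multiple columns
  run.log_table('class_counts', {'class': ['a', 'b'], 'count': [500, 500]})  # a dictionary as a table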

You can view the metrics logged by an experiment run in Azure Machine Learning studio or by using the RunDetails widget in a notebook:

  from azureml.widgets import RunDetails

  RunDetails(run).show()

You can also retrieve the metrics using the Run object’s get_metrics method, which returns a dictionary of the metrics that you can, for example, format as JSON:

  import json

  # Get logged metrics
  metrics = run.get_metrics()
  print(json.dumps(metrics, indent=2))

The previous code might produce output similar to this:

  {
    "observations": 15000
  }

In addition to logging metrics, an experiment can generate output files. Often these are trained machine learning models, but you can save any sort of file and make it available as an output of your experiment run. You can upload local files to the run’s outputs folder by using the Run object’s upload_file method in your experiment code:

  run.upload_file(name='outputs/sample.csv', path_or_stream='./sample.csv')

When running an experiment in a remote compute context, any files written to the outputs folder in the compute context are automatically uploaded to the run’s outputs folder when the run completes. You can retrieve a list of output files from the Run object like this:

  import json

  files = run.get_file_names()
  print(json.dumps(files, indent=2))

The previous code would produce output similar to this:

  [
    "outputs/sample.csv"
  ]
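
To retrieve one of these output files locally, you can use the Run object’s download_file method. A minimal sketch, where the file name and local path are illustrative:

  # Download a named output file from the run to a local path
  run.download_file(name='outputs/sample.csv', output_file_path='./sample.csv')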

You can run an experiment inline using the start_logging method of the Experiment object, but it’s more common to encapsulate the experiment logic in a script and run the script as an experiment. To access the experiment run context (which is needed to log metrics) the script must import the azureml.core.Run class and call its get_context method. The script can then use the run context to log metrics, upload files, and complete the experiment:

  from azureml.core import Run
  import pandas as pd
  import os

  # Get the experiment run context
  run = Run.get_context()

  # Load the data (data.csv must be in the script's folder)
  data = pd.read_csv('data.csv')

  # Count the rows and log the result
  row_count = len(data)
  run.log('observations', row_count)

  # Save a sample of the data as an output file
  os.makedirs('outputs', exist_ok=True)
  data.sample(100).to_csv("outputs/sample.csv", index=False, header=True)

  # Complete the run
  run.complete()

To run a script as an experiment, you must define a script configuration that defines the script to be run and the Python environment in which to run it. This is implemented by using a ScriptRunConfig object. The following code could be used to run an experiment based on a script in the experiment_files folder (which must also contain any files used by the script, such as the data.csv file in the previous script code example):

  from azureml.core import Experiment, ScriptRunConfig

  # Folder containing the experiment script and its data files
  experiment_folder = 'experiment_files'

  # Create a script config
  script_config = ScriptRunConfig(source_directory=experiment_folder,
                                  script='experiment.py')

  # submit the experiment
  experiment = Experiment(workspace = ws, name = 'my-experiment')
  run = experiment.submit(config=script_config)
  run.wait_for_completion(show_output=True)

Note: An implicitly created RunConfiguration object defines the Python environment for the experiment, including the packages available to the script. If your script depends on packages that are not included in the default environment, you must associate the ScriptRunConfig with an Environment object that makes use of a CondaDependencies object to specify the Python packages required.
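
A minimal sketch of associating an environment with the script configuration, assuming scikit-learn and pandas are the packages your script needs:

  from azureml.core import Environment
  from azureml.core.conda_dependencies import CondaDependencies

  # Create an environment and specify its package dependencies
  env = Environment('experiment-env')
  deps = CondaDependencies.create(conda_packages=['scikit-learn', 'pandas'],
                                  pip_packages=['azureml-defaults'])
  env.python.conda_dependencies = deps

  # Associate the environment with the script configuration
  script_config = ScriptRunConfig(source_directory=experiment_folder,
                                  script='experiment.py',
                                  environment=env)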


Summary

In this module, you learned how to:

- Provision an Azure Machine Learning workspace.
- Use tools and interfaces to work with Azure Machine Learning.
- Run code-based experiments in an Azure Machine Learning workspace.

Source: Microsoft Learn