Python code runs in the context of a virtual environment that defines the version of the Python runtime to be used as well as the installed packages available to the code. In most Python installations, packages are installed and managed in environments using Conda or pip. To improve portability, environments are usually created in Docker containers that are in turn hosted on compute targets, such as your development computer, virtual machines, or clusters in the cloud.
There are multiple ways to create environments in Azure Machine Learning:
You can use a Conda or pip specification file to define the packages required in a Python environment, and use it to create an Environment object. For example, you could save the following Conda configuration settings in a file named conda.yml:
name: py_env
dependencies:
  - numpy
  - pandas
  - scikit-learn
  - pip:
    - azureml-defaults
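If you prefer to generate the specification file from Python rather than save it by hand, a minimal sketch using only the standard library might look like the following. The helper name write_conda_spec is an assumption made for illustration; it is not part of the Azure Machine Learning SDK. The demonstration writes into a temporary directory so that it does not touch your working folder.

```python
import tempfile
from pathlib import Path

# Same specification as the conda.yml shown above.
CONDA_SPEC = """\
name: py_env
dependencies:
  - numpy
  - pandas
  - scikit-learn
  - pip:
    - azureml-defaults
"""


def write_conda_spec(path="conda.yml", text=CONDA_SPEC):
    """Write the specification text to `path` and return the path written.

    Hypothetical helper for illustration; not an SDK function.
    """
    spec_path = Path(path)
    spec_path.write_text(text, encoding="utf-8")
    return spec_path


# Demonstration: write the file into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = write_conda_spec(Path(tmp_dir) / "conda.yml")
    written_lines = demo_path.read_text(encoding="utf-8").splitlines()

print(len(written_lines), "lines written")
```

Calling write_conda_spec() with no arguments would write conda.yml to the current folder, which is the location the from_conda_specification example expects.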
You could then use the following code to create an Azure Machine Learning environment from the saved specification file:
from azureml.core import Environment
env = Environment.from_conda_specification(name='training_environment',
                                           file_path='./conda.yml')
If you have an existing Conda environment defined on your workstation, you can use it to define an Azure Machine Learning environment:
from azureml.core import Environment
env = Environment.from_existing_conda_environment(name='training_environment',
                                                  conda_environment_name='py_env')
You can define an environment by specifying the Conda and pip packages you need in a CondaDependencies object, like this:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
env = Environment('training_environment')
deps = CondaDependencies.create(conda_packages=['scikit-learn','pandas','numpy'],
                                pip_packages=['azureml-defaults'])
env.python.conda_dependencies = deps
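The two package lists passed to CondaDependencies correspond to the two levels of a Conda specification: top-level entries are Conda packages, and entries nested under pip are pip packages. The sketch below shows that correspondence with plain Python data. The function name split_dependencies is an assumption for illustration, and the dictionary is written out by hand as a stand-in for what a YAML parser would produce from the conda.yml file.

```python
def split_dependencies(spec):
    """Return (conda_packages, pip_packages) from a parsed Conda specification.

    Hypothetical helper for illustration; not an SDK function.
    """
    conda_packages = []
    pip_packages = []
    for entry in spec.get("dependencies", []):
        if isinstance(entry, dict):
            # A nested mapping such as {"pip": [...]} lists pip-installed packages.
            pip_packages.extend(entry.get("pip", []))
        else:
            # A plain string is a Conda package.
            conda_packages.append(entry)
    return conda_packages, pip_packages


# Hand-written equivalent of the parsed conda.yml specification.
spec = {
    "name": "py_env",
    "dependencies": [
        "numpy",
        "pandas",
        "scikit-learn",
        {"pip": ["azureml-defaults"]},
    ],
}

conda_packages, pip_packages = split_dependencies(spec)
print("Conda packages:", conda_packages)
print("pip packages:", pip_packages)
```

The resulting lists have the same shape as the conda_packages and pip_packages arguments used with CondaDependencies.create.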
Usually, environments for experiment scripts are created in containers. The following code configures a script-based experiment to host the env environment created previously in a container (this is the default unless you use a DockerConfiguration with a use_docker attribute of False, in which case the environment is created directly in the compute target):
from azureml.core import Experiment, ScriptRunConfig
from azureml.core.runconfig import DockerConfiguration
docker_config = DockerConfiguration(use_docker=True)
script_config = ScriptRunConfig(source_directory='my_folder',
                                script='my_script.py',
                                environment=env,
                                docker_runtime_config=docker_config)
Azure Machine Learning uses a library of base images for containers, choosing the appropriate base for the compute target you specify. If you have created custom container images and registered them in a container registry, you can override the default base images and use your own by modifying the attributes of the environment’s docker property:
env.docker.base_image = 'my-base-image'
# base_image_registry is a registry object; set its address to the registry host
env.docker.base_image_registry.address = 'myregistry.azurecr.io'
Alternatively, you can have an image created on demand based on the base image and additional settings in a Dockerfile:
env.docker.base_image = None
env.docker.base_dockerfile = './Dockerfile'
By default, Azure Machine Learning handles Python paths and package dependencies. If your image already includes an installation of Python with the dependencies you need, you can override this behavior by setting python.user_managed_dependencies to True and setting an explicit Python path for your installation:
env.python.user_managed_dependencies = True
env.python.interpreter_path = '/opt/miniconda/bin/python'
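If you are unsure which path to supply as the interpreter path, one option is to run a short script with the Python installation inside your image and read the path it reports. The following is a minimal sketch that uses only the standard library; the function name describe_interpreter is an assumption for illustration.

```python
import sys


def describe_interpreter():
    """Report the running interpreter's path and version.

    Hypothetical helper for illustration; run it with the Python
    installation whose path you want to discover.
    """
    return {
        "interpreter_path": sys.executable,
        "python_version": "%d.%d.%d" % sys.version_info[:3],
    }


info = describe_interpreter()
print("Interpreter path:", info["interpreter_path"])
print("Python version:", info["python_version"])
```

The reported path is only meaningful when the script runs inside the image itself; running it on your workstation reports your local interpreter instead.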
Use the register method of an Environment object to register an environment: env.register(workspace=ws).
You can view the registered environments in your workspace like this:
from azureml.core import Environment
env_names = Environment.list(workspace=ws)
for env_name in env_names:
    print('Name:', env_name)
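A workspace listing typically includes curated environments provided by Azure Machine Learning alongside the ones you register yourself. Assuming curated environments can be recognized by an AzureML- name prefix, a sketch for separating the two groups might look like this. The function name and the example names are hypothetical; in practice the names would come from the Environment.list call.

```python
# Assumption: curated environment names begin with this prefix.
CURATED_PREFIX = "AzureML-"


def split_environment_names(names):
    """Return (curated, custom) name lists, each sorted alphabetically.

    Hypothetical helper for illustration; not an SDK function.
    """
    curated = sorted(n for n in names if n.startswith(CURATED_PREFIX))
    custom = sorted(n for n in names if not n.startswith(CURATED_PREFIX))
    return curated, custom


# Hypothetical names standing in for the keys returned by Environment.list.
example_names = [
    "training_environment",
    "AzureML-Example-Curated",
    "scoring_environment",
]

curated, custom = split_environment_names(example_names)
print("Curated:", curated)
print("Custom:", custom)
```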
You can retrieve a registered environment by using the get method of the Environment class, and then assign it to a ScriptRunConfig. The following code sample retrieves the training_environment registered environment, and assigns it to a script run configuration:
from azureml.core import Environment, ScriptRunConfig
training_env = Environment.get(workspace=ws, name='training_environment')
script_config = ScriptRunConfig(source_directory='my_folder',
                                script='my_script.py',
                                environment=training_env)
When an experiment based on this script run configuration is run, Azure Machine Learning looks for an existing environment that matches the definition; if none is found, a new environment is created based on the registered environment specification.
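The reuse behavior described above follows a general get-or-create pattern: derive a key from the environment definition, reuse a previous build if the key is known, and build otherwise. The sketch below illustrates that pattern only. It is not Azure Machine Learning's actual implementation, and the class, the functions, the definition layout, and the extra matplotlib package are all assumptions made for the example.

```python
import hashlib
import json


def definition_key(definition):
    """Derive a stable key from a definition dict (illustrative only)."""
    canonical = json.dumps(definition, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class EnvironmentCache:
    """Reuse a built environment when an identical definition was seen before."""

    def __init__(self, build):
        self._build = build      # callable that performs the (expensive) build
        self._built = {}         # definition key -> build result
        self.build_count = 0     # number of real builds performed

    def get_or_create(self, definition):
        key = definition_key(definition)
        if key not in self._built:
            self._built[key] = self._build(definition)
            self.build_count += 1
        return self._built[key]


def fake_build(definition):
    """Stand-in for a real image build; returns a label instead of an image."""
    return "image-for-" + definition["name"]


cache = EnvironmentCache(fake_build)
definition = {
    "name": "training_environment",
    "conda": ["numpy", "pandas", "scikit-learn"],
    "pip": ["azureml-defaults"],
}

first = cache.get_or_create(definition)
second = cache.get_or_create(dict(definition))  # identical copy: reused
changed = cache.get_or_create({**definition, "pip": ["azureml-defaults", "matplotlib"]})

print("Builds performed:", cache.build_count)
```

In this sketch, the second request reuses the first build because the definitions are identical, while the changed package list produces a new key and therefore a second build.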
In Azure Machine Learning, Compute Targets are physical or virtual computers on which experiments are run. Types of compute are:
Source: Microsoft Learn