Error when deploying Auto ML model for image classification

Chris (Proudback) 20 Reputation points
2025-11-27T03:40:05.33+00:00

Hi community,

We created a simple image classification (single-label) model using AutoML. The process seems to finish fine and the model (based on seresnext) is available for deployment.

When using the automated deploy mechanism to deploy to a real-time endpoint, the system uses the generated scoring script, conda files, etc. to create the environment.

Creation of the endpoint succeeds but deployment fails with the following error log:

    LibMambaUnsatisfiableError: Encountered problems while solving:
    2025-11-27T02:19:59: #10 29.79 - nothing provides _python_rc needed by python-3.14.0rc1-h4dad89b_2_cp314t

Then comes a long list of incompatible libraries:

    Could not solve for environment specs
    2025-11-27T02:19:59: #10 29.79 The following packages are incompatible
    2025-11-27T02:19:59: #10 29.79 ├─ botocore =1.23.19 * is installable and it requires
    2025-11-27T02:19:59: #10 29.79 │ └─ urllib3 >=1.25.4,<1.27 * with the potential options
    2025-11-27T02:19:59: #10 29.79 │ ├─ urllib3 [1.25.5|1.25.6|1.25.7] would require
    2025-11-27T02:19:59: #10 29.79 │ │ └─ python >=2.7,<2.8.0a0 * with the potential options
    2025-11-27T02:19:59: #10 29.79 │ │ ├─ python [2.7.18|3.10.10|...|3.8.20] would require
    2025-11-27T02:19:59: #10 29.79 │ │ │ ├─ libffi >=3.4,<4.0a0 *, which can be installed;
    2025-11-27T02:19:59: #10 29.79 │ │ │ └─ pypy3.8 [=7.3.11 *|=7.3.9 *] with the potential options
    2025-11-27T02:19:59: #10 29.79 │ │ │ ├─ pypy3.8 [7.3.11|7.3.9] would require
    2025-11-27T02:19:59: #10 29.79 │ │ │ │ └─ libffi >=3.4,<4.0a0 *, which can be installed;
    2025-11-27T02:19:59: #10 29.79 │ │ │ └─ pypy3.8 [7.3.8|7.3.9] would require
    2025-11-27T02:19:59: #10 29.79 │ │ │ └─ libffi >=3.4.2,<3.5.0a0 *, which can be installed;
    2025-11-27T02:19:59: #10 29.79 │ │ ├─ python [2.7.13|2.7.14|...|3.8.6] would require
    2025-11-27T02:19:59: #10 29.79 │ │ │ ├─ libffi [=3.2 *|>=3.2.1,<3.3.0a0 *|>=3.2.1,<3.3a0 *], which can be installed;

And so on, leading to

    ERROR: process "/bin/sh -c ldconfig /usr/local/cuda/lib64/stubs && conda env create -p /azureml-envs/azureml_75ba4dda9b29345db018f712dXXXXXX -f azureml-environment-setup/mutated_conda_dependencies.yml && rm -rf "$HOME/.cache/pip" && conda clean -aqy && CONDA_ROOT_DIR=$(conda info --root) && rm -rf "$CONDA_ROOT_DIR/pkgs" && find "$CONDA_ROOT_DIR" -type d -name pycache -exec rm -rf {} + && ldconfig" did not complete successfully: exit code: 1

Isn't creating the right dependencies the whole point of automatic environment generation, or are we expected to edit the conda files by hand?

Or is there a recommended curated environment for computer vision (e.g. image classification) models?

Thanks!

Chris

Azure Machine Learning

2 answers

  1. Anshika Varshney 3,795 Reputation points Microsoft External Staff Moderator
    2025-11-27T04:38:08.03+00:00

    Hi Chris (Proudback),

    Thanks for the question.

    The error you’re seeing usually happens when the AutoML deployment tries to build an environment with package versions that cannot be resolved. In your case, the logs show that the environment is trying to use a release-candidate Python version (python-3.14.0rc1), which causes dependency conflicts during the image-build step.

    AutoML often auto-generates a Conda environment, but sometimes the selected Python version or library versions are incompatible. When that happens, the deployment fails before the model is even packaged. This is a common issue with computer-vision models because they depend on several heavy libraries.

    A good next step is to download the generated conda file from your AutoML run and replace the Python version with a stable one (for example Python 3.8 or 3.10). After updating the file, you can redeploy the model using a custom environment. This typically resolves the dependency conflict.
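As a minimal sketch of that edit done programmatically with only the standard library (the filename `conda_env.yml` is a placeholder for whatever conda file your AutoML run actually produced):

```python
import re
from pathlib import Path

def pin_python(conda_yaml: str, version: str = "3.10") -> str:
    """Rewrite any 'python' dependency pin in a conda YAML string.

    AutoML sometimes emits an unstable pin (here, python=3.14.0rc1);
    pinning a stable release lets the solver find a consistent set.
    """
    return re.sub(
        r"^(\s*-\s*python)\s*[=<>!].*$",   # matches e.g. "- python=3.14.0rc1"
        r"\1=" + version,
        conda_yaml,
        flags=re.MULTILINE,
    )

# Hypothetical filename -- use the conda file downloaded from your run.
path = Path("conda_env.yml")
if path.exists():
    path.write_text(pin_python(path.read_text()))
```

The regex only touches lines whose dependency name is exactly `python`, so pins like `python-dateutil` are left alone.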

    You can also test the environment locally or manually rebuild the environment in Azure ML by specifying your own conda YAML. Many users find that creating a simple custom environment avoids these “unsatisfiable dependency” errors completely.
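As an illustrative sketch, a CLI v2 managed online deployment YAML that supplies a custom environment might look like the following. Every name, path, model reference, and the base image tag are placeholders to replace with your own (check the current Azure ML base-image list for a suitable GPU image):

```yaml
# deployment.yml -- all values below are placeholders
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: automl-vision-deploy
endpoint_name: my-endpoint
model: azureml:my-automl-image-model:1
environment:
  # base image tag is an example; pick a current Azure ML GPU image
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
  conda_file: conda_env.yml        # the patched conda file
code_configuration:
  code: .
  scoring_script: score.py         # the scoring script downloaded from the run
instance_type: Standard_NC6s_v3
instance_count: 1
```

You would then create the deployment with `az ml online-deployment create --file deployment.yml`.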

    If the issue continues even after updating the environment, or if you have any remaining questions or additional details, let me know; I'll be glad to provide further clarification or guidance.

    Thank you!


  2. Sina Salam 26,661 Reputation points Volunteer Moderator
    2025-11-27T16:25:29.63+00:00

    Hello Chris (Proudback),

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are encountering an error when deploying an AutoML model for image classification.

    To narrow this down, try the following:

    1. Get the logs (replace the names with your own):
    # if you deployed via the older CLI v1 (az ml service):
    az ml service get-logs --workspace-name MY_WS --name MY_SERVICE --verbose
    # with CLI v2, logs live on the deployment, not the endpoint:
    az ml online-deployment get-logs --endpoint-name MY_ENDPOINT --name MY_DEPLOYMENT
    

    Read the logs for any of:

    • An explicit provider registration error
    • Stack traces from init()
    • A scheduler message like "0/3 nodes are available: Insufficient nvidia.com/gpu", which means the instance type cannot satisfy the GPU request; pick a GPU SKU or scale the compute
    2. If it is a provider registration error:
    az provider register --namespace Microsoft.MachineLearningServices
    az provider show --namespace Microsoft.MachineLearningServices --query registrationState
    
    3. If the container crashed: locate the image name in the logs, then:
       docker pull <image>
       # the inference server listens on port 5001 inside the container
       docker run -p 8000:5001 <image>
       # send a sample POST to http://localhost:8000/score to reproduce init()/run() errors
    
    4. If a dependency is missing: create a conda YAML with the needed packages and use that environment in the deployment config (or use a custom Dockerfile). See the AutoML docs for how to supply a custom environment: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?view=azureml-api-2
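    For the local docker test in step 3, a sample request can be built with the standard library. The payload schema below (a base64-encoded image under `input_data`) is the one commonly used by AutoML image models, but verify it against your generated scoring script, since it varies by SDK version; the URL and filename are placeholders:

    ```python
    import base64
    import json
    import urllib.request

    def build_score_request(image_bytes: bytes,
                            url: str = "http://localhost:8000/score"):
        """Build a sample scoring request for an AutoML image model.

        Assumes the base64-image payload shape; check your scoring
        script's expected input before relying on it.
        """
        body = json.dumps({
            "input_data": {
                "columns": ["image"],
                "data": [base64.b64encode(image_bytes).decode("utf-8")],
            }
        }).encode("utf-8")
        return urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )

    # Against the locally running container from step 3:
    #   resp = urllib.request.urlopen(build_score_request(open("test.jpg", "rb").read()))
    #   print(resp.read())
    ```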

    Two things to keep in mind:

    1. A container that crashes because the scoring init() throws requires reading the container logs and fixing the code or dependencies; re-registering providers will not help.
    2. Dependency mismatches between the training and inference environments require changing and rebuilding the environment, or using the correct curated inference image; they are not solved by region or provider changes. https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models?view=azureml-api-2
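    To make an init() crash visible in the logs rather than a bare container exit, the scoring script can wrap its startup in logging. This is a generic sketch, not AutoML's generated script: `load_model` and `predict` are placeholders for the real loader and inference code, and `AZUREML_MODEL_DIR` is the environment variable Azure ML sets to the model's mount path:

    ```python
    import logging
    import os
    import traceback

    logger = logging.getLogger(__name__)
    model = None

    def init():
        """Called once at container start; log the full traceback on failure."""
        global model
        try:
            model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
            logger.info("Loading model from %s", model_dir)
            model = load_model(model_dir)   # placeholder for the real loader
        except Exception:
            logger.error("init() failed:\n%s", traceback.format_exc())
            raise

    def run(raw_data):
        """Called per request; log and re-raise so errors reach the logs."""
        try:
            return predict(model, raw_data)  # placeholder for real inference
        except Exception:
            logger.error("run() failed:\n%s", traceback.format_exc())
            raise
    ```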

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as an answer if it is helpful.

