Block code download from Azure Machine Learning – almost

I’m currently working on a project to deploy secure Azure Machine Learning workspaces. This involves Private Endpoints and a tightly locked-down network, using all the tools we have available. One of the restrictions we wanted to put in place was to block file downloads from the notebook view. This is where users store their Python code, Jupyter notebooks, etc., but they can also write output data there. That is because it is actually a datastore, and it’s there by default. It’s backed by Azure Files on the default storage account attached to your workspace.

To prevent this, we looked into Azure RBAC roles and found that the specific action of downloading files can be blocked with RBAC. The action is listed as:

Microsoft.MachineLearningServices/workspaces/notebooks/storage/download/action

Putting this in the “notActions” part of our custom role let us block file downloads from the notebook view:

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "actions": {
            "value": [
                "Microsoft.MachineLearningServices/workspaces/*/read",
                "Microsoft.MachineLearningServices/workspaces/*/action",
                "Microsoft.MachineLearningServices/workspaces/*/delete",
                "Microsoft.MachineLearningServices/workspaces/*/write"
            ]
        },
        "notActions": {
            "value": [
                "Microsoft.MachineLearningServices/workspaces/*/delete",
                "Microsoft.MachineLearningServices/workspaces/write",
                "Microsoft.MachineLearningServices/workspaces/computes/*/write",
                "Microsoft.MachineLearningServices/workspaces/computes/*/delete",
                "Microsoft.MachineLearningServices/workspaces/computes/listKeys/action",
                "Microsoft.MachineLearningServices/workspaces/listKeys/action",
                "Microsoft.MachineLearningServices/workspaces/notebooks/storage/download/action"
            ]
        },
        "roleName": {
            "value": "Custom Azure ML Data Scientist"
        },
        "roleDescription": {
            "value": "Can run experiments but can't create or delete compute, or download files."
        }
    }
}
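To make it easier to see how the two lists combine, here is a small Python sketch that checks whether an action ends up allowed by the role above. It is a simplified model and not Azure’s actual evaluation logic: it assumes that `*` matches any run of characters (including `/`), that matching is case-insensitive, and that anything matched by “notActions” is subtracted from what “actions” grants. The function names `matches` and `is_allowed` are made up for this illustration.

```python
from fnmatch import fnmatchcase

PREFIX = "Microsoft.MachineLearningServices/workspaces"

# The same lists as in the parameter file above.
ACTIONS = [
    f"{PREFIX}/*/read",
    f"{PREFIX}/*/action",
    f"{PREFIX}/*/delete",
    f"{PREFIX}/*/write",
]
NOT_ACTIONS = [
    f"{PREFIX}/*/delete",
    f"{PREFIX}/write",
    f"{PREFIX}/computes/*/write",
    f"{PREFIX}/computes/*/delete",
    f"{PREFIX}/computes/listKeys/action",
    f"{PREFIX}/listKeys/action",
    f"{PREFIX}/notebooks/storage/download/action",
]


def matches(pattern: str, action: str) -> bool:
    # Assumption: '*' matches any characters and comparison ignores case.
    return fnmatchcase(action.lower(), pattern.lower())


def is_allowed(action: str) -> bool:
    # Effective permission in this model: granted by ACTIONS, minus NOT_ACTIONS.
    granted = any(matches(p, action) for p in ACTIONS)
    excluded = any(matches(p, action) for p in NOT_ACTIONS)
    return granted and not excluded


if __name__ == "__main__":
    for action in (
        f"{PREFIX}/notebooks/storage/download/action",
        f"{PREFIX}/computes/listKeys/action",
        f"{PREFIX}/computes/read",
    ):
        print(action, "->", "allowed" if is_allowed(action) else "blocked")
```

In this model the download action is granted by the `*/action` wildcard and then removed again by the explicit entry in “notActions”, which is the effect we were after.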

But… when you’re using compute instances in Azure Machine Learning, each compute instance also has JupyterLab available through the browser. Through JupyterLab it is also possible to download files, and unfortunately the RBAC role does not prevent this.

As I mentioned at the beginning, we are using Private Endpoints, so the workspace is only available from the internal network. That means we can secure the client machines (in this case Azure Virtual Desktop) and ensure users can’t extract files from them. I’ll go over that setup in another post soon.
