Welcome to this next post in my series about Azure Monitor for Service Providers. This post will answer questions about the number of subscriptions needed, how to delegate access, and pros and cons for whatever choice you make.
First of all, this series assumes you’re a Microsoft CSP partner, so you can create subscriptions for your customers. If you don’t know about tenants, and the importance of using the correct tenant, please read this post: https://cloudpuzzles.net/2016/11/08/azure-ad-matters-youre-csp/
The reason I want you to be able to create subscriptions is that the design is based on one subscription per customer. This means all data and configuration for customer A is located in a dedicated subscription in that customer’s tenant. This would look like this:
As you can see, everything is completely isolated between customers. This is what we’ve chosen to be the best design, but it’s not the only option and it’s not perfect.
Access to subscriptions
For access to subscriptions, we previously used Azure AD B2B. This worked, but we had to switch between tenants all the time. Luckily, Azure Lighthouse was released, and we’re now using this for access. In short, Lighthouse lets you light up customer subscriptions in your own tenant. You can then, in your own tenant, browse resources, deploy new things, and, for example, build Workbooks to access monitoring data in customer environments. The only thing missing for us is Azure AD PIM (Privileged Identity Management). This is being worked on: https://feedback.azure.com/forums/169401-azure-active-directory/suggestions/38627143-customer-tenants-should-be-manageable-by-pim
Pros and cons
Easy to Maintain
We prioritized an environment where we can easily add and remove customers. With this design, removing a customer is as simple as deleting a subscription, and adding a new customer is as simple as creating a subscription and deploying our templates (more on that later).
Show Microsoft we can drive consumption
I know this might sound completely irrelevant, but as a CSP with Microsoft, and a partner in general, we want to show Microsoft that we can drive consumption and build solutions in Azure. This is a great way of doing it, since we don’t depend on the customer adding, and keeping, us as a Digital Partner of Record (or registered with Partner Admin Link). This is our subscription, with no other Microsoft Partners having access to it.
Since we control the subscription, we can see who has access to it and who can make changes to it. There is of course a way for the customer, with a global admin account, to gain access to it. We’ve chosen to be open with the customer about these things, and tell them how it works. But at the same time, we have a very strict contract, stating that this is our code and that they simply buy the service from us. On top of that, we also monitor any changes to role assignments on our subscription, through the Azure Activity Log at the subscription level.
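To illustrate the role assignment monitoring, here is a minimal sketch in Python. It assumes the Activity Log entries have already been exported into a list of dictionaries. The record shape (`operationName`, `caller`, `eventTimestamp` as flat string fields) and the sample values are simplified assumptions for illustration, not our actual alert rule. The two operation names are the ones Azure records when a role assignment is created or deleted.

```python
# Operation names the Azure Activity Log records for role assignment changes.
# Compared in lower case because the casing varies between export formats.
ROLE_ASSIGNMENT_OPERATIONS = {
    "microsoft.authorization/roleassignments/write",
    "microsoft.authorization/roleassignments/delete",
}


def role_assignment_changes(events):
    """Return the events that create or delete a role assignment.

    `events` is a list of dicts in a simplified, hypothetical shape with
    `operationName`, `caller` and `eventTimestamp` keys.
    """
    return [
        event
        for event in events
        if event.get("operationName", "").lower() in ROLE_ASSIGNMENT_OPERATIONS
    ]


# Made-up sample data: two role assignment changes and one unrelated event.
sample_events = [
    {"operationName": "Microsoft.Authorization/roleAssignments/write",
     "caller": "admin@customer-a.example",
     "eventTimestamp": "2020-01-01T10:00:00Z"},
    {"operationName": "Microsoft.Compute/virtualMachines/start/action",
     "caller": "ops@provider.example",
     "eventTimestamp": "2020-01-01T10:05:00Z"},
    {"operationName": "MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/DELETE",
     "caller": "admin@customer-a.example",
     "eventTimestamp": "2020-01-01T10:10:00Z"},
]

for change in role_assignment_changes(sample_events):
    print(change["eventTimestamp"], change["caller"], change["operationName"])
```

In practice the same filter would more likely live in an alert rule on the Activity Log than in a script, but the condition is the same: match on the role assignment write and delete operations.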
While it’s good for us to show Microsoft a huge consumption, we also have a responsibility to the customer, to deliver a solution that’s not too expensive. With this model we do have some duplicate resources across customers, which adds a cost. This could be a heartbeat alert. It’s almost the same across all customers, but we create it specifically for every customer. That’s an extra cost, even if it’s a low cost. We discussed this a lot internally, but in the end we agreed that it was the best solution. It also adds flexibility for us to differentiate thresholds between customers.
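A minimal sketch of how per-customer thresholds could be expressed: shared defaults merged with a small set of overrides, written out as an ARM parameter file per customer. The parameter names (`heartbeatMissingMinutes`, `alertSeverity`), the default values and the customer names are hypothetical. Only the parameter file layout (`$schema`, `contentVersion`, `parameters` with `value` entries) is the standard ARM format.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)

# Hypothetical defaults shared by every customer.
DEFAULTS = {"heartbeatMissingMinutes": 5, "alertSeverity": 1}


def build_parameters(customer_overrides):
    """Merge the shared defaults with one customer's overrides.

    Returns a dict in the ARM parameter file layout; overrides win
    over defaults when both define the same parameter.
    """
    merged = {**DEFAULTS, **customer_overrides}
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in merged.items()},
    }


# Made-up customers: A uses the defaults, B tolerates a longer silence.
customers = {
    "customer-a": {},
    "customer-b": {"heartbeatMissingMinutes": 15},
}

for name, overrides in customers.items():
    print(name)
    print(json.dumps(build_parameters(overrides), indent=2))
```

The alert itself is still duplicated in every subscription, so the extra cost remains. What this buys is that the differences between customers are limited to a few values in a parameter file.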
The moving parts
As mentioned previously, we deploy our monitoring with templates. With SCOM we had a lot of manual work to get a customer up and running: installing gateways, creating and maintaining certificates, etc. We wanted to make this a lot easier with Azure Monitor. It was very important for us to automate this as much as possible, and we’ve come a long way, but still have work to do.
PowerShell, ARM or Terraform?
Early in the process we had to choose how to deploy our solution. As mentioned above, it should be automated as much as possible. We briefly considered building it all in PowerShell, but found that new features weren’t always available quickly in the PowerShell modules. On top of that, back when we had to make the decision, Azure PowerShell had a lot of quirks and it was clear that an overhaul was needed (which has since come with the Az modules).
With PowerShell out of the picture, we had to pick between Azure Resource Manager (ARM) templates, or Terraform. For me it was quite easy to pick one, but in the end it comes down to what skills you have. We had a lot of experience with ARM, and none with Terraform. We knew that new Azure Monitor features would be coming at a quick cadence, and we didn’t trust that Terraform would be as quick to support these, as Microsoft themselves would in ARM.
I’m not saying Terraform isn’t a great solution, I’m just saying that we had to make a choice. And we ended up with ARM templates combined with PowerShell to deploy them.
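For readers who haven’t worked with ARM templates, here is a minimal sketch of what such a template document contains, built in Python so it can be printed and inspected. It declares a single Log Analytics workspace. The `workspaceName` parameter, the 30-day retention and the choice of API version are assumptions for illustration, not a copy of our templates.

```python
import json

TEMPLATE_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentTemplate.json#"
)


def build_workspace_template(retention_days=30):
    """Return a minimal ARM template declaring one Log Analytics workspace."""
    return {
        "$schema": TEMPLATE_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {
            # Hypothetical parameter; supplied per customer at deployment time.
            "workspaceName": {"type": "string"},
        },
        "resources": [
            {
                "type": "Microsoft.OperationalInsights/workspaces",
                "apiVersion": "2020-08-01",
                "name": "[parameters('workspaceName')]",
                "location": "[resourceGroup().location]",
                "properties": {
                    "sku": {"name": "PerGB2018"},
                    "retentionInDays": retention_days,
                },
            }
        ],
    }


template = build_workspace_template()
print(json.dumps(template, indent=2))
```

Saved to a file, a template like this can be deployed from PowerShell with `New-AzResourceGroupDeployment`, passing the file through `-TemplateFile` and the customer-specific values through `-TemplateParameterFile`.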
When we started out, OMS solutions were a thing. This is probably the closest we come to anything like a Management Pack. A lot of things were changing in OMS though, and a lot of new features weren’t fully supported in OMS solutions, so it was one step forward, two steps back with everything we built. We chose to drop OMS solutions, and just build normal Azure resources.
We decided to build different basic “solutions/management packs”, and some of what we built is:
- Windows Server
- Monitors CPU, RAM, Disk and events like unexpected shutdowns
- We could have put this in our Windows Server solution, but wanted to support Linux too, so right now this is living in its own solution
- Change Tracking
- Change Tracking is an OMS solution that gives us data about changes on a computer. This could be registry changes, file changes, or, and this is what we initially needed, changes to Windows Services. If a service stops, we need to generate an alert.
- Patch Management
- Not really a monitoring kind of thing, but as mentioned in my first post, we added “management” to our offering. Patch Management in Azure is one of those management tools we can now provide.
- A simple monitoring of Veeam backup, looking for events for failed backup jobs.
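The service check described under Change Tracking can be sketched as a small filter over change records. This is a minimal illustration in Python. It assumes the records have already been fetched and use field names modeled on the Change Tracking data (`ConfigChangeType`, `SvcName`, `SvcState`, `SvcPreviousState`, `Computer`). The sample records are made up, and the real alert would be a log query rather than a script.

```python
def stopped_service_alerts(changes):
    """Return (computer, service) pairs for services that changed to Stopped.

    `changes` is a list of dicts with field names modeled on Change
    Tracking records; other change types (files, registry) are ignored.
    """
    alerts = []
    for change in changes:
        if (
            change.get("ConfigChangeType") == "WindowsServices"
            and change.get("SvcState") == "Stopped"
            and change.get("SvcPreviousState") != "Stopped"
        ):
            alerts.append((change["Computer"], change["SvcName"]))
    return alerts


# Made-up sample data: one stopped service, one started service, one file change.
sample_changes = [
    {"Computer": "web01", "ConfigChangeType": "WindowsServices",
     "SvcName": "W3SVC", "SvcState": "Stopped", "SvcPreviousState": "Running"},
    {"Computer": "web01", "ConfigChangeType": "WindowsServices",
     "SvcName": "Spooler", "SvcState": "Running", "SvcPreviousState": "Stopped"},
    {"Computer": "sql01", "ConfigChangeType": "Files"},
]

for computer, service in stopped_service_alerts(sample_changes):
    print(f"Service {service} stopped on {computer}")
```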
All of this is deployed with a PowerShell script.
Next post will go deeper into Azure Lighthouse, and how we configure this for customers. It’s the first thing we deploy, after the subscription.