Azure In-VM Instance Metadata & Managed Service Identities with ARM Templates and GoLang combined – inside/out

A lot has changed since my last blog post… we had a great and beautiful summer with an awesome vacation, and I am now part of the Azure Customer Advisory Team, which is the customer-facing part of Azure Engineering. So I finally ended up in Jason Zander's part of Microsoft, the person responsible for Azure itself. That means I am now involved in the most complex Azure projects we run with customers and am no longer dedicated to SAP only, although I still work with SAP a lot.

In the meantime, a lot of Azure technology has evolved as well. In this post I want to focus on two specific features: the In-VM Instance Metadata Service and the Managed Service Identity (MSI for short), which we recently started using in a customer project even before MSI became publicly available and announced.

I already posted about the need for in-VM instance metadata, as well as an approach for allowing virtual machines to perform automated management operations, in a previous blog post. While what I wrote back then is technically still possible, MSI and in-VM Instance Metadata are now the recommended approach for such scenarios. So you can consider this the long-awaited follow-up to that previous post!

Recapping the scenario

The scenario I posted about back then was about virtual machines that need to read data about themselves and also modify their own configuration settings through Azure Resource Manager REST API calls. In the meantime, that very same customer I blogged about back then came to us with a new scenario that requires a similar capability.

Essentially, in that scenario a VM needed to capture its own IP addresses and determine the IP addresses of its peers in order to perform automated configuration of network routes and keepalived settings for an HA setup (more details to follow in a separate blog post).

All of this is possible through a combined use of the new Azure in-VM instance metadata service and the Managed Service Identity!

In-VM Instance Metadata in a Nutshell

This is really nothing special; AWS and other cloud providers have had it for ages. It essentially gives applications and scripts running inside of a VM an HTTP endpoint that is available from within the VM only. This endpoint returns fundamental details about the virtual machine such as its name, network configuration, unique identifiers, etc. For Azure Virtual Machines, this endpoint is available at http://169.254.169.254/metadata/instance?api-version=2017-04-02 and returns JSON-formatted data about the virtual machine that looks similar to the following:

myuser@mylinuxvm:~$ curl -H Metadata:true "http://169.254.169.254/metadata/instance?api-version=2017-04-02" | jq

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   515  100   515    0     0   115k      0 --:--:-- --:--:-- --:--:--  125k
{
  "compute": {
    "location": "westeurope",
    "name": "mylinuxvm",
    "offer": "UbuntuServer",
    "osType": "Linux",
    "platformFaultDomain": "0",
    "platformUpdateDomain": "0",
    "publisher": "Canonical",
    "sku": "16.04-LTS",
    "version": "16.04.201708151",
    "vmId": "d7......-9...-4..4-b..b-2..........4",
    "vmSize": "Standard_D2s_v3"
  },
  "network": {
    "interface": [
      {
        "ipv4": {
          "ipAddress": [
            {
              "privateIpAddress": "10.1.0.5",
              "publicIpAddress": "xx.xx.xx.xx"
            }
          ],
          "subnet": [
            {
              "address": "10.1.0.0",
              "prefix": "24"
            }
          ]
        },
        "ipv6": {
          "ipAddress": []
        },
        "macAddress": "00........B3"
      }
    ]
  }
}
myuser@mylinuxvm:~$

It's a simple REST service only accessible to anything that runs inside of the VM. All you need to take care of is ensuring that you pass the Metadata: true HTTP header when calling the service. The call above shows only the fundamental basics; the service provides much more. For a complete look, review the documentation.
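
As a small example of that, the service also lets you address individual leaf nodes of the document and request plain-text output. A sketch (the paths mirror the JSON structure shown above; double-check them against the documentation for your api-version):

# query single values as plain text instead of the full JSON document
curl -H Metadata:true "http://169.254.169.254/metadata/instance/compute/vmId?api-version=2017-04-02&format=text"
curl -H Metadata:true "http://169.254.169.254/metadata/instance/network/interface/0/ipv4/ipAddress/0/privateIpAddress?api-version=2017-04-02&format=text"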

Managed Service Identities (MSI)

The in-VM instance metadata service is great if you need to query details about the VM itself. But what if you need to query more? For example, which other servers are available in the same resource group, so you can configure keepalived for an automated HA setup with unicast instead of multicast for the availability pings? That's especially important on Azure, since multicast is blocked by the VNET infrastructure. Finding out which other servers are available in the same resource group is not possible through the in-VM instance metadata service!

In my previous blog post about this topic, written when Instance Metadata and MSI were not available yet, the scenario was for a Marketplace image to open up ports on Azure NSGs as part of an automated process, after the user entered more details into a post-provisioning registration application that ran inside of the VM. Again, such actions require access to the Azure Resource Manager REST APIs… and that, in turn, requires authenticating against Azure Active Directory with a valid principal.

In the past, you had to manually create a Service Principal for such actions and assign it permissions in the Azure subscription. Then, from within the VM, you had to sign in to Azure AD from your script or application using this Service Principal to gain access to the Azure Resource Manager REST APIs. This introduced a very delicate challenge: where would you store the credentials needed to sign in with the Service Principal from within the VM!?

With Managed Service Identities, these kinds of scenarios become way easier to implement, and the challenge of managing Service Principal secrets inside virtual machines goes away. With MSI activated, all sorts of Azure service instances can get identities assigned which are fully managed by Azure through its Microsoft.ManagedIdentity resource provider.

MSIs can be enabled on Virtual Machines, but also on other types of services, as you can read in the documentation. You can enable it through the portal, via an ARM template, or with PowerShell or the Azure CLI!
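
For instance, enabling a system-assigned identity on an existing VM through the Azure CLI is roughly a one-liner. A sketch with placeholder names (older CLI builds exposed this as az vm assign-identity instead of az vm identity assign):

# enable a system-assigned managed identity on an existing VM (resource group and VM name are placeholders)
az vm identity assign --resource-group myResourceGroup --name myVm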

Enabling Managed Service Identities

There are two pieces to it, which become more visible when you enable MSI:

  • Assigning an MSI to a resource, which essentially results in the creation of a “managed service principal” for an Azure resource such as a virtual machine that is made available to this Azure resource only!
  • Making tokens available to the respective resource for which the Managed Service Identity has been created. For VMs, this happens through a virtual machine extension called ManagedIdentityExtensionForWindows or ManagedIdentityExtensionForLinux, respectively. When the extension is enabled for a virtual machine, any software running inside of the VM can request a token which is created as a result of an authentication against Azure AD with the MSI credentials. You don't have to take care of those credentials since they are managed by the MSI infrastructure for you.

Once you have an MSI attached to a virtual machine (or another Azure resource), you can assign permissions to this identity for performing management operations against resources in your Azure subscriptions. The following screenshot shows this in the portal:

Assigning Permissions to a Managed Service Identity

If you need to assign the permissions via CLI, then you need to get the object IDs and App IDs for the service principals which are managed for you behind the scenes. Below is an excerpt of Azure CLI commands and results showing what you need to do!

mszcool@dev:~$ az vm show --resource-group LinuxHaWithUdrs --name lxHaServerVm0 --out json
{
  ...
  "id": "/subscriptions/a...fe/resourceGroups/LinuxHaWithUdrs/providers/Microsoft.Compute/virtualMachines/lxHaServerVm0",
  "identity": {
    "principalId": "f3....26d",
    "tenantId": "72....47",
    "type": "SystemAssigned"
  },
  "instanceView": null,
  "licenseType": null,
  "location": "westeurope",
  "name": "lxHaServerVm0",
  "networkProfile": {
    ...
  },
  "osProfile": {
    ...
  },
  "plan": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "LinuxHaWithUdrs",
  "resources": [
    ...
  ],
  "storageProfile": {
      ...
    }
  },
  "tags": {},
  "type": "Microsoft.Compute/virtualMachines",
  "vmId": "52.....6bf"
}
mszcool@dev:~$ az ad sp show --id f3....26d
AppId             DisplayName       ObjectId          ObjectType
----------------  ----------------  ----------------  ----------------
8b............f1  RN_lxHaServerVm0  f3............6d  ServicePrincipal

As you can see, when you get the VM object through ARM, it contains a new section called identity, which holds all the details about the managed service identity that you need in order to retrieve further information from Azure AD (above, done using the CLI as well).

That information can be used for things such as creating custom roles with permissions and then assigning the MSI to this custom role instead of assigning explicit permissions.
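
If you prefer the CLI over the portal, a role assignment for that identity could look roughly like the following sketch (role, scope and the principal ID are placeholders you would replace with your own values):

# assign a role to the VM's managed identity at resource-group scope
# (the principal ID comes from the "identity" section of 'az vm show' above)
az role assignment create \
    --assignee <principalId-of-the-MSI> \
    --role Reader \
    --resource-group LinuxHaWithUdrs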

An end-to-end example

As I've mentioned before, one of the main use cases for combining these capabilities, also for my customer, is VMs that need to retrieve (and modify) details about themselves and their peers in a joint deployment. In a simplified example I want to demonstrate the basic mechanics of the Instance Metadata Service and the Managed Service Identity so that you understand how you can make use of them in your own scripts and applications.

The sample builds the foundation for the scenarios I've explained earlier (VMs getting information about themselves and their peers). Rather than trying to cover it all in a single post, you can expect more complex scenario posts later on that make use of the mechanics explained here.

Essentially, the sample creates an infrastructure with a jump-box and a set of servers as shown in the following Azure Network Watcher topology diagram.

All of the code is available on my GitHub repository for review:

https://github.com/mszcool/azureMsiAndInstanceMetadata

Network Watcher Topology

On each of the servers runs a simple Go-based REST API which can show the instance metadata of the server itself as well as retrieve all the other servers in the same resource group. The servers are exposed through an Azure Load Balancer using NAT rules, so that every server can be accessed individually on its own port. Note that I've set this up this way for demo purposes only, so that you can easily access each server and examine its instance metadata and the details it retrieves about its peers.

In a real-world environment I can hardly think of scenarios in which you would expose instance metadata or data about peers to the public directly. So, to reiterate: this setup is for demo purposes only.

Assigning MSIs to the Servers and giving them permissions

For the sample, I used ARM templates to assign MSIs to the individual Server VMs and enable the respective MSI VM extension so that an application running inside of the respective VM can get a token for accessing resources under the identity of the VM it’s running in – the excerpt is from the azuredeploy.json template on my GitHub repository.

...
{
    "apiVersion": "[variables('computeAPIVersion')]",
    "type": "Microsoft.Compute/virtualMachines",
    "copy": {
        "name": "serverVmCopy",
        "count": "[parameters('serverCount')]"
    },
    "name": "[concat(variables('serverVmNamePrefix'), copyIndex())]",
    "location": "[parameters('location')]",
    "identity": {
        "type": "systemAssigned"
    },
    "dependsOn": [
        "[resourceId('Microsoft.Network/networkInterfaces',concat(variables('serverNicNamePrefix'),copyIndex()))]",
        "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]",
        "[variables('serversAvSetId')]"
    ],
    "properties": {
        ...
    }
}
...
{
    "apiVersion": "[variables('computeAPIVersion')]",
    "type": "Microsoft.Compute/virtualMachines/extensions",
    "name": "[concat(variables('serverVmNamePrefix'),copyIndex(),'/IdentityExtension')]",
    "location": "[parameters('location')]",
    "copy": {
        "name": "serverVmMsiExtensionCopy",
        "count": "[parameters('serverCount')]"
    },
    "dependsOn": [
        "[resourceId('Microsoft.Compute/virtualMachines', concat(variables('serverVmNamePrefix'), copyIndex()))]"
    ],
    "properties": {
        "publisher": "Microsoft.ManagedIdentity",
        "type": "ManagedIdentityExtensionForLinux",
        "typeHandlerVersion": "1.0",
        "autoUpgradeMinorVersion": true,
        "settings": {
            "port": "[variables('msiExtensionPort')]"
        },
        "protectedSettings": {}
    }
}
...

As you can see above, the server-VM gets a system assigned identity in the ARM template. Further down in the template, the Managed Identity Extension is activated for each server VM instance. The variable msiExtensionPort is set to 50342 in my example, which means that an application or script running inside of the VM can retrieve a token for management operations from within the VM on that port (http://localhost:50342/oauth2/token).
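
Before looking at the GoLang code further below, here is roughly what such a token request looks like from a shell inside the VM. A sketch based on the port and endpoint above (the response is a JSON document containing, among others, access_token and token_type fields):

# request a token for the Azure Resource Manager from the MSI VM extension (note the mandatory Metadata header)
curl -H Metadata:true \
     "http://localhost:50342/oauth2/token?resource=https%3A%2F%2Fmanagement.azure.com%2F"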

Taking care of RBAC

Now we have an MSI and the ability for applications to get tokens when running inside of the VM. But so far the possibilities of using that identity are limited since it does not have any permissions, yet. These are assigned through the ARM template, as well:

...
{
    "apiVersion": "[variables('authAPIVersion')]",
    "type": "Microsoft.Authorization/roleAssignments",
    "name": "[parameters('rbacGuids')[add(mul(copyIndex(),2),1)]]",
    "copy": {
        "name": "serverVmRbacDeployment",
        "count": "[parameters('serverCount')]"
    },
    "dependsOn": [
        "[resourceId('Microsoft.Compute/virtualMachines', concat(variables('serverVmNamePrefix'), copyIndex()))]"
    ],
    "properties": {
        "roleDefinitionId": "[variables('rbacContributorRole')]",
        "principalId": "[reference(concat(resourceId('Microsoft.Compute/virtualMachines',concat(variables('serverVmNamePrefix'),copyIndex())),'/providers/Microsoft.ManagedIdentity/Identities/default'),variables('managedIdentityAPIVersion')).principalId]",
        "scope": "[resourceGroup().id]"
    }
},
...

This assigns the Contributor role to the MSIs created for the VMs, scoped to the resource group the VMs are deployed in. To get the role definition ID, which is stored in [variables('rbacContributorRole')] in my template, I executed an Azure CLI statement along the lines of the following:

az role definition list --query "[?properties.roleName == 'Contributor']" --out json

The next tricky bit is the name of the RBAC role assignment. Unfortunately, that needs to be a unique GUID. In my very simplified example, I pass in the GUIDs for the role assignments as parameters to the template:

...
"rbacGuids": {
    "type": "array",
    "metadata": {
        "description": "Exactly ONE UNIQUE GUID for each server VM is needed in this array for the RBAC assignments (sorry for that)! WARNING: if you want to keep this template deployment repeatable, you must generate new GUIDs for every run or delete RBAC assignments before running it, again!"
    },
    "defaultValue": [
        "12f66315-2fdf-460a-9c53-8654ae72c390",
        "12f66315-2fdf-460a-9c53-8654ae72c391",
        "12f66315-2fdf-460a-9c53-8654ae72c392",
        "12f66315-2fdf-460a-9c53-8654ae72c393",
        "12f66315-2fdf-460a-9c53-8654ae72c394",
        "12f66315-2fdf-460a-9c53-8654ae72c395",
        "12f66315-2fdf-460a-9c53-8654ae72c396",
        "12f66315-2fdf-460a-9c53-8654ae72c397",
        "12f66315-2fdf-460a-9c53-8654ae72c398",
        "12f66315-2fdf-460a-9c53-8654ae72c399"
    ],
    "minLength": 4,
    "maxLength": 18
}
...

The reason for this is to make it simple to replace those values as part of an integrated CI/CD pipeline with every continuous build that might involve such an ARM-template deployment. I might write a separate, short post about that topic. For now, I just grab a GUID for each server-RBAC-assignment I want to make as part of my template to generate a unique name for the assignment by using "name": "[parameters('rbacGuids')[add(mul(copyIndex(),2),1)]]".
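
As a rough sketch of what such a pipeline step could look like, assuming uuidgen and jq are available and using a made-up extra parameters file name (newer CLI builds call the deployment command az deployment group create):

# generate one fresh GUID per RBAC assignment name (10 here, mirroring the default array in the template)
GUIDS=$(for i in $(seq 1 10); do uuidgen; done | jq -R . | jq -s .)
jq -n --argjson g "$GUIDS" '{rbacGuids: {value: $g}}' > rbac.parameters.json

# pass them to the deployment together with your other parameters
az group deployment create \
    --resource-group LinuxHaWithUdrs \
    --template-file azuredeploy.json \
    --parameters @azuredeploy.parameters.json \
    --parameters @rbac.parameters.json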

The next tricky part of this section in the template is getting the ID of the principal created for the managed service identity of the respective server VM. This part of the template gets really hard to read, so I broke it up into multiple lines here, although you cannot do that in a real template:

    "properties": {
        "roleDefinitionId": "[variables('rbacContributorRole')]",
        "principalId": "[reference
        (
            concat(
                resourceId(
                    'Microsoft.Compute/virtualMachines',
                    concat(
                        variables('serverVmNamePrefix'),copyIndex()
                    )
                ),'/providers/Microsoft.ManagedIdentity/Identities/default'
            ),
            variables('managedIdentityAPIVersion')
        ).principalId]",
        "scope": "[resourceGroup().id]"
    }

The code is using the reference()-template-function to get the principal ID of the service principal created as managed identity. That principal is a child-object of the virtual machine, so we need to start with the resourceId() of the virtual machine and attach the identities section to it. Finally, the reference()-function requires an API version where we use the version for the managed identity provider from a variable "managedIdentityAPIVersion": "2015-08-31-PREVIEW" in the code.

Getting a Token for your MSI

Based on the requests from that specific customer project where we needed this functionality, I decided to use Go as a programming language. I am still not a GoLang-expert, so I took the opportunity to learn. Using MSIs always follows two major steps:

  • Acquire a token through the locally installed VM extension. This happens by calling the http://localhost:<port-selected-in-MSI-extension-settings>/oauth2/token endpoint which is offered by the MSI VM extension.
  • Use that token in REST API calls to the Azure Resource Manager. These are regular REST calls with the HTTP Authorization header containing the bearer token retrieved earlier.

In my GoLang-based example, I have one module contained in the file msitoken.go which performs a REST-call against the local OAuth2 server offered by the VM Extension (note that this is an incomplete excerpt, for the full code look at the file msitoken.go on my GitHub repo):

// etc. ...

const msiTokenURL string = "http://localhost:%d/oauth2/token"
const resourceURL string = "https://management.azure.com/"

// etc. ...

var myToken MsiToken

// Build a request to call the MSI Extension OAuth2 Service
// The request must contain the resource for which we request the token
finalRequestURL := fmt.Sprintf("%s?resource=%s", fmt.Sprintf(msiTokenURL, msiPort), url.QueryEscape(resourceURL))
req, err := http.NewRequest("GET", finalRequestURL, nil)
if err != nil {
    log.Printf("--- %s --- Failed creating http request --- %s", t.Format(time.RFC3339Nano), err)
    return myToken, "{ \"error\": \"failed creating http request object to request MSI token!\" }"
}

// Set the required header for the HTTP request
req.Header.Add("Metadata", "true")

// Create the HTTP client and call the MSI token endpoint offered by the VM extension
client := &http.Client{}
resp, err := client.Do(req)
if err != nil {
    t = time.Now()
    log.Printf("--- %s --- Failed calling MSI token service --- %s", t.Format(time.RFC3339Nano), err)
    return myToken, "{ \"error\": \"failed calling MSI token service!\" }"
}
// Complete reading the body
defer resp.Body.Close()

// Now return the instance metadata JSON or another error if the status code is not in 2xx range
if (resp.StatusCode >= 200) && (resp.StatusCode <= 299) {
    dec := json.NewDecoder(resp.Body)
    err := dec.Decode(&myToken)
    // etc. ...
}
// etc. ...

Two aspects are important:

  • First, you always need to add the “Metadata: true” header to the call. Calls without it will be rejected!
  • Second, you need to add a query-string parameter to the request called resource=uri://to-your-resource-you-want-to-do-calls-to. In our case, this is always the Azure Resource Manager REST API resource https://management.azure.com/.

Once we have executed the call, we have a valid token available. Note that we didn't have to deal with any kind of secret, which is super convenient. The Azure MSI infrastructure takes care of all the required details, and there isn't even a way to get access to any secrets for Managed Identities.

Using the MSI Token

This is the rather simple part of the story because it's no different from any other Azure REST API call performed with any other kind of Azure AD user or principal. Once you have the token, you just use it in the HTTP Authorization header to call into the Azure Resource Manager REST APIs, and if permissions are set up as previously outlined in the RBAC section, all should go well.

The following snippets are parts of the GoLang source file mypeers.go:

const (
    environmentNameSubscription string = "SUBSCRIPTION_ID"
    environmentNameResourceGroup string = "RESOURCE_GROUP"

    restAPIEndpoint string =
        "https://management.azure.com/subscriptions/%s/resourceGroups/%s/%s"

    vmRelativeEndpoint string =
        "providers/Microsoft.Compute/virtualmachines?api-version=2016-04-30-preview"

    authorizationHeader string = "%s %s"
)

func GetMyPeerVirtualMachines(msiToken MsiToken) (vms string, errOut string) {
    // etc. ...
    subID := os.Getenv(environmentNameSubscription)
    resGroup := os.Getenv(environmentNameResourceGroup)
    // etc. ...

    // Create the final endpoint URLs to call into the Azure Resource Manager VM REST API
    finalURL := fmt.Sprintf(restAPIEndpoint, 
                              subID, resGroup, vmRelativeEndpoint)
    finalAuthHeader := fmt.Sprintf(authorizationHeader,
                              msiToken.TokenType, msiToken.AccessToken)

    // Build a request to call the Azure Resource Manager REST API
    req, err := http.NewRequest("GET", finalURL, nil)
    if err != nil {
        // etc. ...
    }
    req.Header.Add("Authorization", finalAuthHeader)

    // Create the HTTP client and call the Azure Resource Manager REST API
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        // etc. ...
    }
    // Complete reading the body
    defer resp.Body.Close()

    // Now return the raw VM JSON or another error if the status code is not in 2xx range
    if (resp.StatusCode >= 200) && (resp.StatusCode <= 299) {
        bodyContent, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            // etc. ...
        }
        // etc. ...
        return string(bodyContent), ""
    }

    // etc. ...

    return "", fmt.Sprintf("{ \"error\": \"Azure Resource Manager REST API call returned non-OK status code: %d \" }", resp.StatusCode)
}

This code is super simple and just retrieves all other servers in the same resource group. It assumes that the resource group and the subscription ID are both set as environment variables before the Go application is started. This should give you an idea of how a server in a resource group could find other servers and get their private IP addresses to automatically configure components such as keepalived during an automated post-provisioning step or something similar.

The Instance Metadata Service

MSI and Azure Resource Manager REST API calls can help with retrieving details about peers or performing more complex management operations, including creating or updating resources, depending on the permissions given to a particular MSI. But for retrieving information about itself, a VM does not need to go through MSI and the ARM REST APIs; there's a much simpler approach if it's just about details of the VM itself.

For a few months now, Azure has made an in-VM instance metadata service available which can be called from within the VM only, without additional authentication requirements. The documentation about the instance metadata service shows how to retrieve the data with simple tools such as curl. Again, the important thing is to include the Metadata header, just as with the MSI token service before.

In this end-to-end sample, I show how to call the in-VM instance metadata service from a GoLang application. Again, I just show the mechanics without a concrete scenario for this post, but it should equip you to implement scenarios such as the ones I've described throughout the post. I plan subsequent blog posts that make use of these mechanics for a real scenario implementation. Below is again an excerpt of the GoLang code that retrieves instance metadata; for the full code please review metadata.go:

const instanceMetaDataURL string =
          "http://169.254.169.254/metadata/instance?api-version=2017-04-02"

/*GetInstanceMetadata ()
 *Calls the Azure in-VM Instance Metadata service and returns the results to the caller*/
func GetInstanceMetadata() string {
    // etc. ...

    // Build a request to call the Azure in-VM instance metadata service
    req, err := http.NewRequest("GET", instanceMetaDataURL, nil)
    if err != nil {
        // etc. ...
    }

    // Set the required header for the HTTP request
    req.Header.Add("Metadata", "true")

    // Create the HTTP client and call the instance metadata service
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        // etc. ...
    }
    // Complete reading the body
    defer resp.Body.Close()

    if (resp.StatusCode >= 200) && (resp.StatusCode <= 299) {
        bodyContent, err := ioutil.ReadAll(resp.Body)
        // etc. ...
        return string(bodyContent)
    }
    // etc. ...
    return fmt.Sprintf("{ \"error\": \"instance meta data service returned non-OK status code: %d \" }", resp.StatusCode)
}

The Main Go-Application

Before putting it all together, let's have a quick look at the main GoLang application so that you get a sense of where those previous pieces of code are called from. The main application is fairly simple: it bootstraps a GoLang HTTP server and configures some routes for the HTTP handlers (full source in main.go).

package main

import (
     "log"
     "net/http"
     "github.com/gorilla/mux"
)

var myRoutes = map[string]func(http.ResponseWriter, *http.Request){
        "/": Index,
        "/meta": MyMeta,
        "/servers": MyPeers}

func main() {
    router := mux.NewRouter().StrictSlash(true)
    for key, value := range myRoutes {
        router.HandleFunc(key, value)
    }
    log.Fatal(http.ListenAndServe(":8080", router))
}

The handlers.go file then contains the functions referred to in the myRoutes map defined in the source code above. These are the actual functions called when the respective route URLs are requested:

/*Index (w, r)
 *Returns with a list of available functions for this simple API*/
func Index(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintln(w, "Welcome!")
}

/*MyMeta (w, r)
 *Returns instance metadata retrieved through the in-VM instance metadata service of the VM*/
func MyMeta(w http.ResponseWriter, r *http.Request) {
    metaDataJSON := GetInstanceMetadata()
    fmt.Fprint(w, metaDataJSON)
}

/*MyPeers (w, r)
 *Uses the MSI to get a token and list all the other servers available in the resource group*/
func MyPeers(w http.ResponseWriter, r *http.Request) {
    token, err := GetMsiToken(50342)
    if err != "" {
        fmt.Fprint(w, err)
    } else {
        peerVms, err := GetMyPeerVirtualMachines(token)
        if err != "" {
            fmt.Fprint(w, err)
        } else {
            fmt.Fprint(w, peerVms)
        }
    }
}
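
To get a feel for these mechanics, you could also build and exercise the API manually on one of the servers. A minimal sketch, assuming Go and the repository sources are in place (the environment variable names come from mypeers.go, the port and routes from main.go):

# fetch the only external dependency and build the API (GOPATH-style workflow, as used in the setup script)
go get github.com/gorilla/mux
go build -o msitests .

# mypeers.go reads the subscription and resource group from environment variables
export SUBSCRIPTION_ID="<your-subscription-id>"
export RESOURCE_GROUP="LinuxHaWithUdrs"
./msitests &

# exercise the three routes registered in main.go
curl http://localhost:8080/
curl http://localhost:8080/meta
curl http://localhost:8080/servers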

Putting it all together

To make exploring this as easy as possible for you, the ARM templates and scripts I provide as part of this solution set up the entire environment automatically. To recap, here's the screenshot of the entire environment from Azure Network Watcher again:

Network Watcher Topology

The ARM template sets up the network, virtual machines, network security groups, etc. To make it simple to explore the responses of the different servers without SSHing into the VMs, I also added a load balancer that exposes the GoLang application of each server via a port mapping on the public load balancer. That means you can just perform an HTTP request against the public load balancer with the port that maps to the server whose responses you would like to see. A few examples:
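
The exact DNS name and NAT ports depend on your deployment, so the following is just a sketch with placeholder values:

# each NAT port on the public load balancer maps to port 8080 of one specific server (all values are placeholders)
curl http://<your-lb-dns-label>.westeurope.cloudapp.azure.com:<nat-port-of-server-0>/meta
curl http://<your-lb-dns-label>.westeurope.cloudapp.azure.com:<nat-port-of-server-0>/servers
curl http://<your-lb-dns-label>.westeurope.cloudapp.azure.com:<nat-port-of-server-1>/meta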

Of course, you can also SSH into the Jump-Box set up as part of this deployment and explore everything from the inside. Essentially, what I do is the following as part of the ARM template deployment to automate the setup of the GoLang application:

  • The ARM-template contains a custom script extension that runs on each of the servers to build the Go-application and generate a shell-script that registers the GoLang REST-API I’ve explained above as a service daemon.
  • The service daemon script, which is generated as part of the server setup and copied to /etc/init.d/msiandmeta.sh, sets the subscription ID and the target resource group as environment variables before launching the GoLang application.

To keep the process simple and easy to follow, I use a template for the init.d script that gets generated by the custom script extension. This template script is also in my GitHub repository, named template.msiandmeta.sh:

#!/bin/bash
### BEGIN INIT INFO
# Provides:          msiandmeta
# Required-Start:    $local_fs $network $named $time $syslog
# Required-Stop:     $local_fs $network $named $time $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: GoLang App using Azure MSI and Metadata
# Description:       Runs a Go Application which is a web server that demonstrates usage of Managed Service Identities and in-VM Instance Metadata
### END INIT INFO

appUserName=__USER__
appPath=__APP_PATH__
appName=__APP_NAME__

processIDFilename=$appPath/$appName.pid
logFilename=$appPath/$appName.log

#
# Starts the simple GO REST service
# 
start() {
    # Needed by the GO App to access subscription and resource group, correctly
    export SUBSCRIPTION_ID="__SUBSCRIPTION_ID__"
    export RESOURCE_GROUP="__RESOURCE_GROUP__"

    # Check if the service runs by looking at its process ID and log files
    if [ -f $processIDFilename ] && [ "`ps | grep -w $(cat $processIDFilename)`" ]; then
        echo 'Service already running' >&2
        return 1
    fi
    echo 'Starting service...' >&2
    su -c "start-stop-daemon -SbmCv -x /usr/bin/nohup -p \"$processIDFilename\" -d \"$appPath\" -- \"./$appName\" > \"$logFilename\"" $appUserName
    echo 'Service started' >&2
}

#
# Stops the simple GO REST service
#
stop() {
    if [ ! -f $processIDFilename ] && [ ! "`ps | grep -w $(cat $processIDFilename)`" ]; then
        echo "Service not running" >&2
        return 1
    fi
    echo "Stopping Service..." >&2
    start-stop-daemon -K -p "$processIDFilename"
    rm -f "$processIDFilename"
    echo "Service stopped!" >&2
}

#
# Main script execution
#

case $1 in

    start)
      start
      ;;

    stop)
      stop
      ;;

    restart)
      stop
      start
      ;;

    *)
      echo "Usage: $0 start|stop|restart"
esac

In this script, you can see tokens such as __SUBSCRIPTION_ID__. These tokens are replaced by the script that’s executed at provisioning time for each of the servers through the custom script extension definition in the main ARM template for the entire solution:

{
    "apiVersion": "[variables('computeAPIVersion')]",
    "type": "Microsoft.Compute/virtualMachines/extensions",
    "name": "[concat(variables('serverVmNamePrefix'),copyIndex(),'/SetupScriptExtension')]",
    "location": "[parameters('location')]",
    "copy": {
        "name": "serverVmSetupExtensionCopy",
        "count": "[parameters('serverCount')]"
    },
    "dependsOn": [
        "[resourceId('Microsoft.Compute/virtualMachines',concat(variables('serverVmNamePrefix'), copyIndex()))]",
        "[concat('Microsoft.Compute/virtualMachines/', concat(variables('serverVmNamePrefix'),copyIndex()),'/extensions/IdentityExtension')]"
    ],
    "properties": {
        "publisher": "Microsoft.Azure.Extensions",
        "type": "CustomScript",
        "typeHandlerVersion": "2.0",
        "autoUpgradeMinorVersion": true,
        "settings": {
            "fileUris": [
                "[concat(parameters('_artifactsLocation'),'/scripts/setup_server_node.sh',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/scripts/template.msiandmeta.sh',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/app/main.go',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/app/handlers.go',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/app/metadata.go',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/app/msitoken.go',parameters('_artifactsStorageSasToken'))]",
                "[concat(parameters('_artifactsLocation'),'/app/mypeers.go',parameters('_artifactsStorageSasToken'))]"
            ]
        },
        "protectedSettings": {
            "commandToExecute": "[concat('./setup_server_node.sh -a ', parameters('adminUsername'), ' -s ', subscription().subscriptionId, ' -r ', resourceGroup().name)]"
        }
    }
}

The script that's invoked through the custom script extension above is also in my GitHub repository and generates the final init.d script for the service registration based on the input parameters. These input parameters are exactly the subscription ID, the resource group name, and the admin user under which the daemon should run. Here's an excerpt of setup_server_node.sh that builds the GoLang app and generates the target init.d script:

#
# Next compile the Go Application
#
mkdir ./app
mv *.go ./app

export PATH="$PATH:/usr/local/go/bin"
export GOPATH="`realpath ./`/app"
export GOBIN="$GOPATH/bin"
go get ./app
go build -o msitests ./app

sudo mkdir /usr/local/msiandmeta
sudo cp ./msitests /usr/local/msiandmeta
sudo chown -R $adminName:$adminName /usr/local/msiandmeta

#
# Generate the init.d script from the template by replacing the placeholder tokens
#
cat ./template.msiandmeta.sh \
| awk -v USER="$adminName" '{gsub("__USER__", USER)}1' \
| awk -v APP_NAME="msitests" '{gsub("__APP_NAME__", APP_NAME)}1' \
| awk -v APP_PATH="/usr/local/msiandmeta" '{gsub("__APP_PATH__", APP_PATH)}1' \
| awk -v SUBS="$subscriptionId" '{gsub("__SUBSCRIPTION_ID__", SUBS)}1' \
| awk -v RGROUP="$resGroup" '{gsub("__RESOURCE_GROUP__", RGROUP)}1' \
>> msiandmeta.sh

#
# Now make sure the script is handled by the system for starting/stopping the service
#
sudo cp ./msiandmeta.sh /etc/init.d
sudo chmod +x /etc/init.d/msiandmeta.sh
sudo update-rc.d msiandmeta.sh defaults

With that, the GoLang application that accesses the ARM REST APIs through MSI and the instance metadata service should run automatically and always find the correct subscription ID and resource group name in its environment variables, since they're set by the init.d script generated from the template.
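
If you SSH into one of the servers via the jump box, a quick sanity check could look like the following sketch (path, script and app names as defined in the generated init.d script above):

# restart the daemon and look at its log file
sudo /etc/init.d/msiandmeta.sh restart
cat /usr/local/msiandmeta/msitests.log

# the Go REST API should answer locally on port 8080
curl http://localhost:8080/servers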

Testing the environment

Once you have deployed the ARM template into your subscription, you should be able to call the GoLang application explained above, which demonstrates the mechanics of the instance metadata service and the Managed Service Identity in action, through the load balancer using the NAT ports for each server. The reason for mapping each server through a port to the outside world was for demo purposes, to make it as easy as possible for you to examine the responses of the different servers without SSHing into any machine. The following screenshot shows this in action by comparing the responses from different servers.

Running the app in action

Of course, in the real world you would not expose these things directly, but rather use them from within your applications! For this sample, and to enable you to ramp up on the details quickly, it should hopefully be helpful.

Final Words

Managed Service Identities and the in-VM Instance Metadata Service are extremely helpful, and these kinds of capabilities were long overdue. Both services allow you to implement complex scenarios such as:

  • Implementing licensing and IP-protection strategies based on the in-VM instance metadata service.
  • Script automated configurations of clustered environments by being able to call into Azure Resource Manager REST APIs from within Virtual Machines without the need of managing secrets for Service Principals.
  • many, many more and similar scenarios.

With both services available on Azure, my previous blog post becomes obsolete for this specific scenario, although there might still be many reasons for leveraging service principals in other scenarios, of course (so it might still be a good source for learning details about service principals in Azure AD in general). But the specific scenario outlined in both that previous post and this one can be implemented much better with Managed Service Identities and the in-VM Instance Metadata Service combined!

I hope you enjoyed reading this and that it was valuable for you. We implemented something that leverages these mechanics in a very similar way for a concrete scenario with one of my customers… my plan is to post about that concrete scenario as one of my next blogging activities.

Stay Tuned!

Cloud Foundry – SAP Cloud Platform on Azure

Cloud Foundry – SAP Cloud Platform on Azure (Beta)

The main project that kept me busy for the past 7 months was the work with SAP Labs Israel to get a beta version of SAP Cloud Platform running on Azure. During SAP Sapphire, SCP on Azure was announced together with other public cloud vendors as part of SAP's multi-cloud strategy (see here and here). It is currently available as a public beta for anyone to try on Azure. Here I dig into some of the first basics…

Note: as outlined in my previous blog post about HANA Express, please note that this is my personal blog which reflects my personal thoughts instead of Microsoft’s official opinion. For official announcements and statements, please refer to the Microsoft Azure Blog.

Cloud Foundry on Azure and SCP on Azure

Before digging into the details of SAP Cloud Platform (short: SCP), just a quick reminder about how we support Cloud Foundry as a platform on Azure in general.

With SCP, we now support the second commercially available Cloud Foundry flavor on Azure, next to Pivotal Cloud Foundry. But what makes SCP on Azure really different from the offers we had so far is that it is a fully managed PaaS… managed by SAP. So using SCP on Azure is exactly the same as on any other public cloud platform, which gives you full portability in a multi-cloud world. We do have a few differentiators in my opinion, though:

Finally, I strongly believe that from the work we've done with SAP, Cloud Foundry on Azure benefits across the board. We've made numerous improvements to the BOSH CPI implementation for Azure as part of these efforts, and we fixed several bugs. Examples are the keep-alive of unhealthy virtual machines for debugging purposes or support for managed disks (which was already planned, but we needed it earlier :)).

Until general availability of SAP Cloud Platform on Azure, you'll see many more improvements in our CPI, on which I cannot elaborate in detail at the moment.

SAP Cloud Platform on Azure – Getting Started

To get SCP on Azure, you work through the regular SCP administration cockpit. So your starting point is the same as for any other work you do with SCP. You register or log in as usual; nothing is Azure-specific about that. To activate Azure, which is available as Beta+Trial at the time of writing this post, you click the “Start Cloud Foundry Trial” button within the cockpit!

https://account.hanatrial.ondemand.com

SCP Cockpit Login and highlight activation button

When clicking the button, the SCP cockpit will ask you to select a region in which you want to activate the trial. Within SCP, activated Azure regions simply appear as SCP regions in the cockpit. That means you pick an SCP region for the activation that runs on top of Azure, as shown below:

Pick the SCP region

So far, I've done several trial activations. It usually takes a minute until the tenant is activated and available for you. For Cloud Foundry knowledgeable folks: what happens is the creation of an organization and a space for you within the SCP Cloud Foundry environment.

Exploring what’s available

After that process is completed, you can navigate to your organization and space within the SCP cockpit. Through the cockpit, you have convenient ways to browse and monitor deployed applications, or also browse the marketplace to review which backing services SAP has enabled for the Beta+Trial on Azure. Cloud Foundry veterans will find it quite easy to navigate through the portal.

SCP Marketplace

Now, since this is a beta as well as a trial, the backing services are all running as single instances in containers behind the scenes. I think that’s more than fair because you’re not paying for the service as long as it is in beta. And… this is also the general approach followed by SAP for using trial accounts on generally available regions, as well. For GA and production (non-trial) environments, these backing services will run in Virtual Machine clusters, of course.

The exception even for today in that list is… guess what… SAP HANA of course. The HANA backing service offered as part of the Azure SCP beta is a multi-tenant, shared HANA environment that runs in Virtual Machines behind the scenes.

Note: When using any of the backing services from the SCP market place, you indeed use SAP-deployed backing services. That gives you a higher level of portability since SAP and not Azure controls the versions and configurations used for the technologies that are the foundation for those backing services. Now, if you still would prefer to use native Azure services, you can use the Azure Meta Service Broker to hook them up with applications you run in SCP the Cloud Foundry way (you can use them, directly, as well, of course). But in that case, you need to be aware that this has impact on how portable your applications are across multiple cloud vendors if that’s relevant for you.

Using the Cloud Foundry CLI

Now, the cool thing about SCP is that it is Cloud Foundry. It's a Cloud Foundry PaaS enriched by many SAP services (ok, it will take a while until we have them all active on Azure, as that requires further engineering work). It ultimately means you manage and work with it as with other Cloud Foundry environments, including the CLI and the APIs. For SAP-specific services such as HANA, SAP provides plug-ins for the CLI to make it easier to work with those services, such as the HANA MTA plug-in which I think SAP announced during this year's Sapphire as well.

For CLI/API interaction purposes, SCP uses different API endpoints for each region. When using the Azure Beta region which we’ve deployed right now, you’d use the following API endpoint:

cf api api.cf.us20.hana.ondemand.com
cf login

From that moment forward, everything looks and feels like Cloud Foundry. You can push apps, you can explore your marketplace etc. The learning curve is quite easy for Cloud Foundry experienced developers, I’d say:

SCP CF CLI in Action

Of course, I was curious what development runtimes SAP has enabled and as a .NET developer I wanted to know about .NET. Looking at the screen shot above, I found it great that .NET Core was enabled. But be aware: SAP typically offers some sort of enterprise support for languages and runtimes in their environment, but from the build packs listed, only a few enjoy that status. .NET Core does not. So, check back with SAP on that… .NET Core only gets community-level support through the Cloud Foundry Foundation.

Using the Azure Meta Service Broker with SAP Cloud Platform

When you're running on Azure, it makes sense to use Azure-native services if you want to. At the end of the day, that's when you can unleash the full power of running on a specific cloud platform, right :)? And… you could still implement your “portability” through IoC/DI at the application level to have a more effective/efficient integration with native cloud services but still stay portable to a certain extent.

Fortunately, SAP allows you to enable the Azure Meta Service Broker at the space level when using SCP. I thought I'd try that out again now that this is in beta (we prototyped it in early phases). Essentially, you really only need to follow the standard procedure to enable the Azure Meta Service Broker as documented here.

The first step, though, is to create an Azure SQL Database instance and a Service Principal in your Azure Active Directory, as shown below. The Service Principal is needed because the service broker dynamically creates/deletes resources as they are requested through the Cloud Foundry API (or the CLI).

$ az ad app create --display-name=yourazureadappdisplayname --homepage=http://yourhomepage.com --identifier-uris="https://yourazureadserviceprincipalname" --key-type=Password --reply-urls="https://notreallyimportantwhatyouputhere"
AppId                                 DisplayName                  Homepage                 ObjectId                              ObjectType
------------------------------------  ---------------------------  -----------------------  ------------------------------------  ------------
yourazureadserviceprincipalappid      yourazureadappdisplayname    http://yourhomepage.com  thecreatedappobjectid                 Application
$
$ az ad sp create --id https://yourazureadserviceprincipalname
AppId                                 DisplayName                  ObjectId                              ObjectType
------------------------------------  ---------------------------  ------------------------------------  ----------------
yourazureadappdisplayname             yourazureadappdisplayname    thecreatedappobjectid                 ServicePrincipal
$
$ az ad app update --id=yourazureadserviceprincipalappid --password="yourazureadserviceprincipalpassword"
$

Don’t forget to give the created service principal contributor-rights to the resource group you want to use for the resources of the Meta Service Broker.
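
A sketch of that role assignment with the placeholder names from above (the resource group name here is made up as well):

# give the service broker's service principal Contributor rights on the resource group it will manage
az role assignment create \
    --assignee yourazureadserviceprincipalappid \
    --role Contributor \
    --resource-group yourmetabrokerresourcegroup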

After you've done the basic creation of the assets mentioned above, the next step is to clone the Azure Meta Service Broker repository and adjust the Cloud Foundry application manifest for it. The Meta Service Broker essentially is a CF application that represents the service broker for several Azure services, as documented. You can run it wherever you want, but it's built for being pushed as an application into CF with configuration data about the Azure subscription to use for the resources created by the Meta Service Broker. Here's a sample configuration showing how a filled manifest.yml for the Azure Meta Service Broker looks:

Note that the parameters for SUBSCRIPTION_ID, TENANT_ID, CLIENT_ID and CLIENT_SECRET need to match those you've used or retrieved through the Azure CLI commands above, as they represent your subscription as well as the service principal you've created above! Ok, the commands above don't give you the subscription ID. You can retrieve that by issuing az account list --output=json or by looking it up in the Azure Portal.
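
As a small aside, a shorter way to grab just the ID of your currently selected subscription is a query like the following (assuming the CLI is already logged in):

# print only the ID of the currently selected subscription
az account show --query id --output tsv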

---
applications:
- name: meta-azure-service-broker
  buildpack: https://github.com/cloudfoundry/nodejs-buildpack
  instances: 1
  env:
    ENVIRONMENT: AzureCloud
    SUBSCRIPTION_ID: yoursubscriptionid
    TENANT_ID: yourazureadtenantidfromtheserviceprincipal
    CLIENT_ID: yourazureadserviceprincipalappid
    CLIENT_SECRET: yourazureadserviceprincipalpassword
    SECURITY_USER_NAME: usertoauthenticateagainstmetaservicebroker
    SECURITY_USER_PASSWORD: passwordtoauthenticateagainstmetaservicebroker
    AZURE_BROKER_DATABASE_PROVIDER: sqlserver
    AZURE_BROKER_DATABASE_SERVER: yourazuresqldbserver.database.windows.net
    AZURE_BROKER_DATABASE_USER: yourazuresqldbuser@yourazuresqldbserver
    AZURE_BROKER_DATABASE_PASSWORD: yourazuresqldbpassword
    AZURE_BROKER_DATABASE_NAME: yourazuresqldbdatabasename
    AZURE_BROKER_DATABASE_ENCRYPTION_KEY: 32charactersofyourchoicegohereok
    AZURE_SQLDB_ALLOW_TO_CREATE_SQL_SERVER: false
    AZURE_SQLDB_ENABLE_TRANSPARENT_DATA_ENCRYPTION: true

Assuming you have logged in with cf login as mentioned above, you can just push the meta service broker application into your SCP tenant using cf push. Important tip: since the settings specified above in the manifest are set as environment variables, be careful with special characters interpreted by bash when defining the passwords. Either avoid or escape them!! A successful deployment of the meta service broker app should end with the following output on your console:

Service Broker App Pushed

When you've pushed the application, you only created the meta service broker, but you did not tell Cloud Foundry that this is indeed a service broker and not any other type of general application. So the next step, after the meta service broker has been deployed successfully, is to register it as a service broker. The steps for that are:

  • Review the deployed application and take note of its entry point URL.
  • Create the service broker using that URL. You also need to authenticate against the meta service broker. For that purpose, use the previously specified SECURITY_USER_NAME and SECURITY_USER_PASSWORD parameters from the manifest file above.
$ cf apps
Getting apps in org yourorgname_trial / space dev as your@username.com...
OK

name                        requested state   instances   memory   disk   urls
meta-azure-service-broker   started           1/1         1G       1G     urlfromyourapppush.cfapps.us20.hana.ondemand.com
$
$
$ cf create-service-broker nameofyourbroker usertoauthenticateagainstmetaservicebroker passwordtoauthenticateagainstmetaservicebroker  https://urlfromyourapppush.cfapps.us20.hana.ondemand.com --space-scoped

Important Note: since SCP is a multi-tenant Cloud Foundry environment, you need to know that you are only the administrator of the spaces you're creating. Permissions on organizations are limited, and you don't have any permissions beyond organizations. That means when activating the service broker, you need to use the --space-scoped switch to activate the broker for your space. When you do that, you are using a so-called private broker with private plans, which become active automatically. That means that, although the documentation of the Azure Meta Service Broker states it, you do not need to call cf enable-service-access, since all services are enabled for your space in that case by default. A simple cf marketplace confirms that:

cf marketplace results

As a next step, you can start using the services exposed by the Azure Meta Service Broker. At the time of writing this post, those services included Azure Storage, Azure SQL DB, Azure DocumentDB, Azure Service Bus, Azure Redis Cache and Azure Key Vault. That means you can start using those services right away. If you have requests for additional services Microsoft should enable, your best bet is to post requests/issues in the GitHub repository for the Azure Meta Service Broker or to file a pull request.

The way you make use of the services follows standard Cloud Foundry patterns for service brokers. You create a service and then you bind the service to one or many applications using the Cloud Foundry service broker APIs or CLI commands. Let's create an Azure Storage account that way so you get a sense of it. The approach is the same for all services offered through the Azure Meta Service Broker.

First you create a parameters file specifying some of the provisioning parameters for creating the respective Azure service. An example of such a parameters JSON file for creating an Azure storage account via the service broker looks as follows:

{
  "resource_group_name": "marioszpScpMetaBroker",
  "storage_account_name": "marioszpscpteststrg",
  "location": "westus",
  "account_type": "Standard_LRS"
}

With this JSON-file in-place, you can leverage the service broker CLI commands or Cloud Foundry APIs to create a new service and make Cloud Foundry aware of it:

$ cf create-service azure-storage standard marioszpscpstoragetest -c .\params.json

$ cf service marioszpscpstoragetest

Service instance: marioszpscpstoragetest
Service: azure-storage
Bound apps:
Tags:
Plan: standard
Description: Azure Storage Service
Documentation url:
Dashboard:

Last Operation
Status: create succeeded
Message: Created the storage account, state: Succeeded
Started: 2017-05-19T11:02:21Z
Updated: 2017-05-19T11:02:40Z
$

Results of creating a service with the broker

Now, once this is done, you can check on the provisioning status and availability of that service using the cf service marioszpscpstoragetest command, which gives you all the details about the created service. Finally, by using the cf bind-service command, you can bind the service to any application you've deployed. Configuration parameters will then be exposed through environment variables as per the documentation of the respective service in the Azure Meta Service Broker GitHub repository. For an Azure storage account, an environment variable with a JSON document will be added to your application's environment variables that contains e.g. the storage account name and storage account keys you can use to authenticate and execute calls against the created storage account.
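
For completeness, the binding step could look like this sketch (the application name is a placeholder for one of your own CF apps):

# bind the storage service instance to an application and restage it so the new environment variables get injected
cf bind-service yourappname marioszpscpstoragetest
cf restage yourappname

# inspect the injected credentials (VCAP_SERVICES)
cf env yourappname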

What's really cool is that all of what I've done above with the Cloud Foundry CLI is visible in the SAP Cloud Platform cockpit as well. For example, since I've activated the Azure Meta Service Broker, you'll see all of its services in the marketplace in the portal, too, including the Azure storage account, which is surfaced as a service as shown below. That means the Azure Meta Service Broker really brings SCP and Azure close together and allows you to look at Azure resources through the SCP as well as the Azure management perspectives.

SCP Cockpit with Azure Services

Final Words

This is just the beginning of SAP Cloud Platform on Azure. Note that we've announced a beta. On the path toward general availability, you'll see several improvements in Azure's BOSH CPI and optimized backing services (currently all except HANA run in single-instance containers in the Beta/Trial).

There's one question a lot of people ask when looking at something like this, though: how does that relate to native Azure PaaS services such as Azure App Services or Service Fabric? The answer is simple: Azure is an open cloud platform. Microsoft's strategy is clearly to enable as many relevant platforms as possible in addition to Microsoft's own technology. This gives customers and partners choice and allows them to leverage the skills and knowledge they already have in their portfolio. Cloud Foundry and its flavors fit very well into this strategy. If you prefer Cloud Foundry based PaaS platforms, then Azure is a place where you can go… and if you want, you can even integrate with all the native goodness of Azure, especially including Azure Active Directory for single sign-on, or all of the other services in addition to those offered with the Azure Meta Service Broker.

Developed an SAP HANA Express Azure Quick Start Template

Context

This is an exciting week for me… although I am usually not that much into attending Business-oriented conferences, over the past two years I did so by attending SAP Sapphire.

Up until now, that has mainly been because I've contributed some aspects to what was announced regarding the partnership between Microsoft and SAP at the conference. Last year, my part was mainly about Office 365 with the work I supported for Concur, Ariba, Fieldglass and SuccessFactors, as well as the Sports Basement demo which was shown in the keynote to highlight the HANA One certification for Azure in our DS14 VM series at that time.

While I cannot write about the major project I supported for this year, yet, here’s a little nugget to get started with – a Quick Start Template for SAP HANA Express!

Important Note: While I am working for Microsoft, this blog summarizes my personal opinions and my personal understanding of topics. This means that all I am writing here is not related to Microsoft’s official opinion, at all! If you want to get that view, look at the Azure Blog or Jason Zander’s Blog for official announcements!

Important Note: The Pull-Request into the official Microsoft Azure Quick Start Templates GitHub repository is not completed, yet. Therefore, the link still redirects to the working branch in my GitHub repository. I’ll update the blog-post as soon as the pull request is completed (there’s currently a general issue with the Travis CI pipeline used for validation on the Azure Quick Start Templates GitHub repository that caused an unexpected delay for the pull request to go through in time for Sapphire).

SAP HANA Express

As you might know, already, SAP HANA is SAP’s in-memory Database Engine which is supposed to back all of the major future releases of SAP’s core business suites centered around S/4HANA. But, HANA can be used as a stand-alone database system for developing custom solutions, as well. The Sports Basement Demo on Azure DS14 Instances from last year’s Sapphire Key Note is an example of that. It was a plain HANA Database fronted by a Java Web Application.

Between last year’s Sapphire and now, SAP released a version of SAP HANA called HANA Express that is free for development and testing purposes for up to 32GB of RAM.

For more details, you should navigate to the SAP HANA Express Homepage to get the full picture and the official view:

https://www.sap.com/developer/topics/sap-hana-express.html

Azure Quick-Start-Templates

Now, shortly before Sapphire, some folks from Microsoft and SAP approached me about creating an Azure Marketplace image for HANA Express. That’s something we’re working on, but it is not something that can be done in just a few days – too short… But since HANA Express clearly addresses developers, I thought a good solution that can be implemented within a few days is an Azure Quick-Start-Template.

For those of you who are new to Azure: Azure Quick-Start-Templates are open source based Azure Resource Manager Templates and Deployment Scripts which can be used to quickly spin-up Solutions on Azure. Many of those are used as a learning resource, but some of them can definitely be used for dev/test scenarios or even as a starting point for production scenarios.

All of these are available under the following two links:

Now, the main point with these quick start templates is, that they’re automating most of the setup/provisioning procedure by using Scripts and Templates so you can get started, quickly.

SAP HANA Express Quick Start Template

I decided to build such a template for HANA Express and make it available as part of the Quick-Start-Templates. This quick start works with SAP HANA Express 2.0 SPS1; it should also work with other versions, but I’ve only tested it with this one.

http://aka.ms/sap-hana-express-quickstart

There’s just one caveat: since SAP requires you to accept the EULA for SAP HANA Express, you first need to go to SAP’s HANA Express home page, register, and download the SAP HANA Express setup images manually before you can use the Quick Start Template I’ve created. From there on, everything else is handled by the template. The basic workings of the template are:

  1. First you register with SAP and download the HANA Express Setup Package.
  2. Then you use the quick start template to upload the setup package into your private Azure Storage Account.
  3. From that moment forward you can use the Azure Resource Manager Template included in the template to deploy as many HANA Express Instances into your own subscription as you want.

Starting with step 2, everything is automated with scripts and templates. That means only the first step – downloading the setup packages and accepting the EULA at the SAP HANA Express homepage – is something you need to do manually. Sure, a Marketplace image would be more convenient, but we’ll work with SAP on that…

All the details are explained in the sections below, including how you can validate at the end of the entire process that the installation really went well.

Important Note: Please don’t forget that quick start templates are not backed by Microsoft Support in any way. They are here to help you get started on your own; they are fully open source and maintained on a best-effort basis!

Requirements

Before moving on, all you need on your local machine are the following assets/tools:

Register and Download the HANA Express Setup from SAP

Yes, that’s the first step. It needs to be done for two reasons: first, you need to accept SAP’s EULA, and second, by registering, SAP will inform you about important updates and service releases for HANA.

That said, the first thing you do is navigate to the SAP HANA Express home page to register with SAP and accept the EULA:

HANA Express

HANA Express Download Manager Option

Next, you’ll need to use the SAP Download Manager to download the SAP HANA Express setup packages. The Download Manager exists as a native version for Linux or Windows or as a cross-platform version for Java (distributed as a JAR package). I’ve used the JAR version, but it should not matter at all.

What matters is selecting the right type of setup packages. The quick-start template I’ve built is tested for the server-only version, without XS Advanced services. So you should select the following for download when using the SAP HANA Express Download Manager:

HANA Express Server Only Download Option

The downloaded setup package will appear as a TAR archive in your local Downloads folder (or wherever you downloaded it to). It should be called something along the lines of hxe.tgz.

Upload the HANA Express to your Azure Storage Account

To automate the setup procedure of SAP HANA Express inside of an Azure Virtual Machine, the HANA Express setup packages need to be available for download by an automation script! To avoid uploading the setup package (~1.6GB of data) with each new VM, the best approach is to upload it once into an Azure storage account and use a shared access signature to feed the setup files into the provisioning script for the Virtual Machines.

Now, performing those steps can be done manually by using a tool such as the Azure Storage Explorer. But I decided to automate the procedure using the Azure CLI 2.0.

Assuming that you have your SAP HANA Express setup package downloaded locally (e.g. to /mnt/c/temp/hxe.tgz on Bash on Windows, or /home/mydirectory/hxe.tgz as used below), you can execute the following commands:

# Login into your Azure Subscription using the Azure CLI 2.0
az login

# Create a resource group for the storage account
az group create --name "sampleresourcegroupname" --location "westeurope"

# Upload the HANA Express Setup Files to your Azure Storage Account
./prepare-hxe-setup-files.sh sampleresourcegroupname samplestorageaccountname samplecontainer westeurope /home/mydirectory/hxe.tgz

Now, as I mentioned before, the automated setup of HANA Express happens in a script that runs in the post-provisioning phase of the Azure VM. That means this script needs access to the setup files for an automated download without user interaction. To enable that scenario, the prepare-hxe-setup-files.sh script of my quick start uploads the HANA Express setup packages to an Azure storage account and generates a Shared Access Signature URL, which allows the packages to be downloaded with wget or any similar shell tool, using that signature as the means of authentication.
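
Conceptually, what the install script later does with that URL boils down to something like the following (a sketch; the variable name is just a placeholder for the Shared Access Signature URL handed over to the script):

# Download the HANA Express setup package using the SAS URL (no interactive authentication needed)
wget -O /tmp/hxe.tgz "$hxeSetupFileUrl"

# Extract the setup package so the installer can be started
tar -xzf /tmp/hxe.tgz -C /tmp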

The following screenshot shows the output of the prepare-hxe-setup-files.sh script. You should especially take note of the Shared Access Signature URL the script outputs at the end!

Output of prepare-hxe-setup-files.sh

Deploy an SAP HANA Express using the templates

With the Setup packages for SAP HANA Express uploaded to an Azure Storage Account and the Storage Shared Access Signature generated as mentioned above, you can deploy as many SAP HANA Express Virtual Machines as you need to.

Important Note: the prepare-hxe-setup-files.sh script above generates Shared Access Signatures that are valid for one year. That means after a year you need to run the script again to generate a new Shared Access Signature. Note that the script detects if the files have already been uploaded, and if so, it generates the signature for the existing blob instead of uploading it again!
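
If you ever need to re-create such a Shared Access Signature manually, a minimal sketch with the Azure CLI 2.0 could look like the following (the account, container and blob names are the sample values from above, the expiry date is an arbitrary example, and you may need to supply the storage account key via --account-key or the AZURE_STORAGE_KEY environment variable):

# Generate a read-only SAS token for the uploaded setup package
sasToken=$(az storage blob generate-sas \
    --account-name "samplestorageaccountname" \
    --container-name "samplecontainer" \
    --name "hxe.tgz" \
    --permissions r \
    --expiry "2018-08-31T00:00:00Z" \
    --output tsv)

# Compose the full download URL including the SAS token
blobUrl=$(az storage blob url \
    --account-name "samplestorageaccountname" \
    --container-name "samplecontainer" \
    --name "hxe.tgz" \
    --output tsv)

echo "$blobUrl?$sasToken"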

When using the quick-start template, you can either use the “Deploy-To-Azure” button presented on the landing page of the quick start, or you fill out the parameters in the azuredeploy.parameters.json file as shown below and deploy the template via PowerShell or the Azure CLI:

Parameters filled in the azuredeploy.parameters.json

After you’ve filled out the parameters – the screen shot above shows the minimum ones you need to fill out – you can move ahead and deploy the template using code similar to the following:

# Create a resource group for your HANA Express VM Resources
az group create --name "samplehanaexpressgroup" --location "westeurope"

# Deploy the template with the filled parameters file
az group deployment create --resource-group="samplehanaexpressgroup" \
                           --template-file="azuredeploy.json" \
                           --parameters="@azuredeploy.sample.parameters.json" \
                           --name="samplehanaexpress"

The output of that script should look similar to the following screen shot:
Output of the template deployment via the Azure CLI

Validating the Installation

So far so good: if the output looks similar to the screenshot above, then you should be all set! But you can of course validate your installation in two ways: use regular HANA tools to see if your instance is responsive, or look at the installation logs from the provisioning process.
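
For the first option, a quick smoke test from inside the VM could look similar to the following (a sketch, assuming the default HANA Express instance number 90 and the master password you passed to the template):

# Switch to the HANA administration user created by the setup
# (the following commands are then run as hxeadm)
sudo su - hxeadm

# List the HANA processes - they should show up as running
HDB info

# Run a trivial query against the system database to verify the instance responds
hdbsql -i 90 -d SystemDB -u SYSTEM -p '<yourHxeMasterPassword>' "SELECT * FROM DUMMY"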

For the second option, looking at the installation logs, you need to understand some background. I am using the Azure Custom Script Extension for Linux to automatically execute the HANA installation with all required prerequisites during the post-provisioning phase of the Azure Virtual Machine. That is expressed in the Azure Resource Manager template with the following code:

... REST OF THE TEMPLATE ...

    {
        "type": "extensions",
        "name": "hxeinstallextension",
        "apiVersion": "2016-04-30-preview",
        "location": "[resourceGroup().location]",
        "dependsOn": [
            "[concat('Microsoft.Compute/virtualMachines/', parameters('vmNamePrefix'))]"
        ],
        "properties": {
            "publisher": "Microsoft.Azure.Extensions",
            "type": "CustomScript",
            "typeHandlerVersion": "2.0",
            "autoUpgradeMinorVersion": true,
            "settings": {
                "fileUris": [
                    "[parameters('hxeInstallScriptUrl')]"
                ]
            },
            "protectedSettings": {
                "commandToExecute": "[concat('sudo ./', parameters('hxeInstallScriptName'), ' \"', parameters('hxeSetupFileUrl'), '\" \"', parameters('hxeMasterPwd'), '\" && exit 0')]"
            }
        }
    }

... REST OF THE TEMPLATE ...

This part of the template shows that after the Virtual Machine resource has been provisioned, the Azure Virtual Machine Agent is used to run the script specified in the template. This script is downloaded directly from the Quick-Start-Templates GitHub repository, so no further steps are needed to enable this.

If you now want to validate whether the installation script for SAP HANA Express ran successfully, you should first review the deployment logs within the Azure Portal, similar to what’s shown in the following screenshot:

Azure Portal Deployment Log

If you still need to see more, just SSH into the created virtual machine (refer to the DNS name specified in azuredeploy.parameters.json as per the screenshots above) and output the content of the stdout and stderr files within the /var/lib/waagent/custom-script/download/0 directory, similar to what’s shown in the following screenshot:

Virtual Machine Deployment Log
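
In concrete terms, the commands behind that screenshot boil down to something like the following (replace the user and DNS name with the values from your parameters file):

# SSH into the freshly provisioned VM (then run the two cat commands inside the VM)
ssh youradminuser@yourdnsname.westeurope.cloudapp.azure.com

# Print the logs written by the custom script extension during the HANA Express installation
sudo cat /var/lib/waagent/custom-script/download/0/stdout
sudo cat /var/lib/waagent/custom-script/download/0/stderr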

When you look at the output, you’ll quickly realize how much this accelerates you. The setup script automatically performs the following steps for you:

  • Install the needed Oracle JDK on your machine
  • Install required library packages using zypper
  • Download and Extract the HANA Express Setup Packages to the VM
  • Install HANA Express on the VM

Final Words

The post above went into quite some detail on every step of how the quick start works. But the essence can literally be done within about 30 minutes, depending on how fast your Internet connection is:

  • Register for an Azure Subscription if you don’t have one, yet
  • Setup the Azure CLI 2.0 on your machine if not done so, yet
  • Register with SAP
  • Download the SAP HANA Express Setup Packages
  • Execute the script prepare-hxe-setup-files.sh and note the generated Shared Access Signature
  • Click the “Deploy-to-Azure Button” or update the parameters file and execute az group deployment create

All of the needed prerequisites, and of course SAP HANA Express itself, get set up for you, and within about 30 minutes you have an instance running in Microsoft Azure. I hope you find this valuable and that it helps you accelerate the setup of dev/test environments with SAP HANA Express, since you don’t need to walk through all of the setup steps manually.

Azure Virtual Machines – A Solution for Instance Metadata in Linux (and Windows) VMs

At SAP Sapphire we announced the availability of SAP HANA on Azure. My little contribution to this was working on a case that was shown as a demo in the keynote at SAP Sapphire 2016: Sports Basement with HANA on Azure. It was meant as a showcase and proof of running HANA One workloads in Azure DS14 VMs, and it was the first productive case of HANA on Azure outside of SAP HANA on Azure Large Instances.

While we proved we can run HANA One in DS14, what’s still missing is the official Marketplace image. We are working on that on-boarding of HANA One into the Azure Marketplace at the time of writing this post. This post is about a very specific challenge which I know many others face as well. While Azure will have a built-in solution, it is not available today (August 2016), so this might be of help for you!

Scenario: A VM reading and modifying data about itself

This is a very common scenario, and HANA One needs it as well. On other cloud platforms, especially AWS, a Virtual Machine can query information about itself without any hurdles through an instance metadata service. On Azure, as powerful as it is, we don’t have such a service available yet (as of August 2016). To be precise, we do, but it currently delivers information about regular maintenance only. See here for further details. While a full metadata service is in the works, it is not available yet.

Instance metadata is especially interesting for software providers which want to offer their solutions through the marketplace. The metadata can be used for various aspects including association and validation of licenses or protection of software assets inside of the VM.

But what if a VM needs to modify settings through cloud provider management APIs automatically? Even with an instance metadata service available, such requirements need a more advanced approach.

Solution: A possible approach outlined (and code on my GitHub Repo)

Based on that I started thinking about this challenge, prototyping it and sharing it with the broader technical community. With Azure having the concept of Service Principals available, I tried the following path:

  1. If we could pass in a Service Principal at the creation of the VM, we’d have all we need to call into Azure Resource Manager APIs.
  2. The VM can identify itself through its “Unique VM ID”. So we could query the Azure Resource Manager APIs and find the VM based on this ID.
  3. For Marketplace use cases it is necessary that the user is FORCED to enter the credentials. So an ARM template with mandatory parameters for passing in the details of the Service Principal credentials is needed.

With this in place we can solve both problems with a single solution: with the right permissions equipped, a Service Principal can query instance metadata through Azure Resource Manager APIs and modify virtual machine settings at the same time. Indeed, the Azure Cloud Foundry Bosh solution uses that approach as well, although it does not need to “identify” virtual machines. It just creates and deletes them…

For most Marketplace vendors, including the case above, the VM needs to change details about itself. So there needs to be a way for the VM to find itself through the VM Unique ID. Since nobody was able to answer the question whether that’s possible, I prototyped it with the Azure CLI.

Important Note: This is considered to be a prototype to prove that what is outlined above generally works. For production scenarios you’d need to code this with professional frameworks, better protect secrets by using those, and build this into your product.

GitHub Repository: I’ve prototyped the entire solution and published it on my GitHub Repository here:

–>> https://github.com/mszcool/azureSpBasedInstanceMetadata

Step #1: Create a Service Principal

The first step is creating a Service Principal. That is not an easy task, especially when you think about offerings in a Marketplace where business people want to have fast and simple on-boarding.

Guess why I’ve created this solution prototype on my GitHub repository (with a blog post to follow). The idea of this prototype is to provide a ready-to-use service that creates Service Principals in your own subscription.

I still run this on my Azure Subscription, so if you need a Service Principal and you don’t like scripting, just use my tool for creating it. Note: please use in-private browsing and sign-in with a Global Admin (or get a Global Admin who does an Admin-Consent for my tool in your tenant).

If you love scripting, then you can use tools such as the Azure PowerShell or the Azure Cross Platform CLI. In my prototype, I built the entire set of scripts with the Azure CLI and tested it on Ubuntu Linux (14.04 LTS). Even cooler, I indeed developed and debugged all the Scripts on the new Bash on Ubuntu on Windows:
Bash on Windows

The script createsp.sh is a sample script which creates a Service Principal and assigns the roles it needs to read VM metadata in the subscription (it would be better to just target the resource group in which you want to create the VM… I just kept it like that for convenience).

# Each Service Principal in Azure AD is backed by an 'Application-registration'
azure ad app create --name "$servicePrincipalName" \
                    --home-page "$servicePrincipalIdUri" \
                    --identifier-uris "$servicePrincipalIdUri" \
                    --reply-urls "$servicePrincipalIdUri" \
                    --password $servicePrincipalPwd

# I use JQ to extract data out of JSON results such as the AppId
createdAppJson=$(azure ad app show --identifierUri "$servicePrincipalIdUri" --json)
createdAppId=$(echo $createdAppJson | jq --raw-output '.[0].appId')

azure ad sp create --applicationId "$createdAppId"

Note: I created the App and the Service Principal separately since the AppID is needed anyway to log in with the Service Principal using the Azure CLI, and I needed to read both the App and the Service Principal Object IDs.

Note: JQ is really a handy command line tool to extract data from the neat JSON-responses of the Azure CLI. Take a look at further details here.

After the Service Principal and the App are both created, I can assign the roles to the Service Principal so that it can query the VM metadata in my subscription:

# If I would create the resource group earlier, I could use the
# --resource-group switch instead of the --subscription switch here to scope
# permissions to the resource group of the VM to-be-created, only.
azure role assignment create --objectId "$createSpObjectId" \
                             --roleName Reader \
                             --subscription "$subId" 

Finally, to complete the work, I needed the Tenant ID of the Azure AD tenant for the target subscription, which is also needed to log in with a Service Principal using the Azure CLI. Indeed, the following code snippet is at the very beginning of the createsp.sh script:

# Get the entry for the target subscription
accountsJson=$(azure account list --json)

# The Subscription ID is needed throughout the script
subId=$(echo $accountsJson | jq --raw-output --arg pSubName $subscriptionName '.[] | select(.name == $pSubName) | .id')

# Finally get the TenantID of the Azure AD tenant which is associated to the Azure Subscription:
tenantId=$(echo $accountsJson | jq --raw-output --arg pSubName $subscriptionName '.[] | select(.name == $pSubName) | .tenantId')

With the data assets above in place (the tenantId, the appId and the password selected for the app creation), we can log in with the service principal using the Azure CLI as follows:

azure telemetry --disable
azure config mode arm
azure login --username "$appId" --service-principal --tenant "$tenantId" --password "$pwd"

Note: Since we want to log in from a script that runs automated in the VM to extract the metadata for an application at provisioning time (in my sample; in the real world this could happen on a regular basis with a cron job or something similar), we need to make sure to avoid any user prompts. The latest versions of the Azure CLI prompt for telemetry data collection on the first call after installation. In an automation script you should always turn this off with the first command in your script (azure telemetry --disable).

Step #2: A Metadata Extraction Script

Okay, now we have a Service Principal that can be used from backend jobs to extract metadata for the VM in an automated way, e.g. with the Azure CLI. Next we need a script to do exactly that. For my prototype, I’ve created a shell script (readmeta.sh) which I inject through the Custom Script Extension for Linux.

Note: Since the SAP HANA One team uses Linux as their primary OS, I just developed the entire prototype with Shell-Scripts for Linux. But fortunately, due to the Bash on Ubuntu on Windows 10, you can also run those from your Windows 10 machine right away (if you have the 2016 Anniversary Update installed).

You can dig into the depths of the entire readmeta.sh script if you’re interested. I just extract VM and networking details in there to show how to crack the VM UUID and how to extract related items which are exposed as separate ARM resources attached to the VM.

Let’s start with first things first: the script requires the Azure Cross Platform CLI installed. On a newly provisioned Azure VM, that’s not there. So the script starts with installing stuff:

sudo mkdir /home/metadata
export HOME=/home/metadata

#
# Install the pre-requisites using apt-get
#

sudo apt-get -y update
sudo apt-get -y install build-essential
sudo apt-get -y install jq

curl -sL https://deb.nodesource.com/setup_4.x | sudo -E bash -
sudo apt-get -y install nodejs

sudo npm install -g azure-cli

Important Note: Since the script runs as a Custom Script Extension, it does not have things like a user HOME directory set. To make NodeJS and NPM work, we need a home directory, therefore I set HOME to /home/metadata, to which I also save all the metadata JSON responses during the script.

The next hard thing was cracking the VM Unique ID. This Unique ID has been available in Azure for some time, and it identifies a Virtual Machine for its entire lifetime in Azure. The ID changes when you take the VM off of Azure or delete it and re-create it. But as long as you just provision/de-provision or start/shutdown/start the VM, the ID remains the same.

But the key question is whether you can use that ID to find a VM through the ARM REST APIs to read metadata about itself, or even change its settings through those APIs. Obviously, the answer is yes, otherwise I would not have written this post:). But the VM ID presented in responses from the Azure Resource Manager REST APIs is different from what you get when reading it inside of the VM out of its asset tags – due to Big Endian bit ordering differences, also documented here.

So in my Bash-script for reading the metadata, I had to convert the VM ID before trying to use it to find my VM through the ARM REST APIs as follows:

#
# Read the VMID from the BIOS asset tag (skip the prefix, i.e. the first 6 characters)
#
vmIdLine=$(sudo dmidecode | grep UUID)
echo "---- VMID ----"
echo $vmIdLine
vmId=${vmIdLine:6:37}
echo "---- VMID ----"
echo $vmId

#
# Now switch the order due to encoding differences between the Windows and Linux World
#
vmIdCorrectParts=${vmId:20}
vmIdPart1=${vmId:0:9}
vmIdPart2=${vmId:10:4}
vmIdPart3=${vmId:15:4}
vmId=${vmIdPart1:7:2}${vmIdPart1:5:2}${vmIdPart1:3:2}${vmIdPart1:1:2}-${vmIdPart2:2:2}${vmIdPart2:0:2}-${vmIdPart3:2:2}${vmIdPart3:0:2}-$vmIdCorrectParts
vmId=${vmId,,}
echo "---- VMID fixed ----"
echo $vmId

That did the trick to get a VM ID which I can use to find my VM through ARM REST APIs, or through the Azure CLI since I am using bash-scripts here:

#
# Login, and don't forget to turn off telemetry to avoid user prompts in an automation script.
#
azure telemetry --disable
azure config mode arm
azure login --username "$appId" --service-principal --tenant "$tenantId" --password "$pwd"

#
# Get the details for the VM and save it
#
vmJson=$(azure vm list --json | jq --arg pVmId "$vmId" 'map(select(.vmId == $pVmId))')
echo $vmJson > /home/metadata/vmmetadatalist.json
echo "---- VM JSON ----"
echo $vmJson

What you see above is that today (as of August 2016) there is no way to query the Azure Resource Manager REST APIs by the VM Unique ID; only attributes such as resource group and VM name can be used. Of course that applies to the Azure CLI as well. Therefore I retrieve a list of VMs and filter it down by the VM ID using JQ… which fortunately is delivered as an attribute in the JSON response from the ARM REST APIs.

Now we have our first metadata asset: a simple list entry with basic attributes for the VM in which we are running. But what if you need more details? The obvious way is to execute an azure vm show --json command to get the full VM JSON. But even that will not include all details. E.g. let’s say you need the public or the private IP address assigned to the VM. What you need to do then is navigate through the relationships between those Azure Resource Manager assets (specifically, the VM and the Network Interface Card resource). That is where it gets a bit tricky:

#
# Get the detailed VM JSON with relationship attributes (e.g. the NIC identified through its unique Resource ID)
#
vmResGroup=$(echo $vmJson | jq -r '.[0].resourceGroupName')
vmName=$(echo $vmJson | jq -r '.[0].name')
vmDetailedJson=$(azure vm show --json -n "$vmName" -g "$vmResGroup")
echo $vmDetailedJson > /home/metadata/vmmetadatadetails.json

#
# Then get the NIC for the VM through ARM / Azure CLI
#
vmNetworkResourceName=$(echo $vmJson | jq -r '.[0].networkProfile.networkInterfaces[0].id')
netJson=$(azure network nic list -g $vmResGroup --json | jq --arg pVmNetResName "$vmNetworkResourceName" '.[] | select(.id == $pVmNetResName)')
echo $netJson > /home/metadata/vmnetworkdetails.json

#
# The private IP is contained in the previously received NIC config (netJson)
#
netIpConfigsForVm=$(echo $netJson | jq '{ "ipCfgs": .ipConfigurations }')
echo $netIpConfigsForVm > /home/metadata/vmipconfigs.json

#
# But the public IP is a separate resource in ARM, so you need to navigate and execute a further call
#
netIpPublicResourceName=$(echo $netJson | jq -r '.ipConfigurations[0].publicIPAddress.id')
netIpPublicJson=$(azure network public-ip list -g $vmResGroup  --json | jq --arg ipid $netIpPublicResourceName '.[] | select(.id == $ipid)')
echo $netIpPublicJson > /home/metadata/vmipconfigspublicip.json

This should give you enough of the needed concepts to get all sorts of VM Metadata for your own VM using Bash-scripting. If you want to translate this to your Java, .NET, NodeJS or whatsoever code, then you need to look at the management libraries for the respective runtimes/languages.

Step #3: Putting it all together – the ARM template

Finally, we need to put this all together! That happens in an ARM template and the parameters this template requires the user to enter on provisioning. An ARM template similar to this could be built for a solution-template-based Marketplace offer.

On my GitHub repository for this prototype, the ARM template and its parameters are baked into the files azuredeploy.json and azuredeploy.parameters.json. I won’t go through all details of these templates. The most important aspects are in the parameters-section and in the VM creation section where I hook up the Service Principal with the Script and attach it as a Custom Script Extension. Start with an excerpt of the “parameters”-section of the template:

"parameters": {
    "storageAccountName": {
      "type": "string"
    },
    "dnsNameForPublicIP": {
      "type": "string"
    },
    "adminUserName": {
      "type": "string"
    },
    "adminPassword": {
      "type": "securestring"
    },
    "azureAdTenantId": {
      "type": "string"
    },
    "azureAdAppId": {
      "type": "string"
    },
    "azureAdAppSecret": {
      "type": "securestring"
    },
    ...
  },
...

The important parameters are the azureAdTenantId, azureAdAppId and azureAdAppSecret parameters. Those together form the sign-in details for the Service Principal as it is used in the script described in the previous section to read out the metadata for the VM on provisioning, automatically.

Reading the metadata is initiated through specifying my readmeta.sh-script as a custom script extension for the VM in the ARM template as below:

...
    {
      "type": "Microsoft.Compute/virtualMachines/extensions",
      "name": "[concat(parameters('vmName'),'/writemetadatajson')]",
      "apiVersion": "2015-06-15",
      "location": "[parameters('location')]",
      "dependsOn": [
        "[concat('Microsoft.Compute/virtualMachines/', parameters('vmName'))]"
      ],
      "properties": {
        "publisher": "Microsoft.OSTCExtensions",
        "type": "CustomScriptForLinux",
        "typeHandlerVersion": "1.5",
        "settings": {
          "fileUris": [
            "[concat('https://', parameters('storageAccountName'), '.blob.core.windows.net/customscript/readmeta.sh')]"
          ]
        },
        "protectedSettings": {
          "commandToExecute": "[concat('bash readmeta.sh ', parameters('azureAdTenantId'), ' ', parameters('azureAdAppId'), ' ', parameters('azureAdAppSecret'))]"
        }
      }
    }
...

Since the Azure Linux Custom Script Extension prints a lot of diagnostic details about what it is doing, we need to at least make sure that our sensitive data, especially the Service Principal’s password, is NOT included in those diagnostic logs to keep it protected (well… as well as possible:)). Therefore the commandToExecute setting is put into the protectedSettings section, which is NOT disclosed in any diagnostic logs from the Custom Script Extension.

Important Note: In the Azure Quickstart Templates gallery there are many templates that use the Custom Script Extension version 1.2. To have the commandToExecute setting in the protectedSettings section, you have to use a newer version. For me, version 1.5 (the latest at the time of writing this post) worked. With previous versions it just didn’t call the script.

Step #4: Trying it out…

Before you can try things out, there’s one thing you need to prepare: create the storage account and upload the readmeta.sh script into that account (argh, next time I’ll just write the scripts to clone my GitHub repository:)). To make it easy, I created a script called deploy.sh with 10 parameters that does everything:

  1. Create the Resource group
  2. Create the storage account
  3. Upload the script to the storage account
  4. Update the parameters in azuredeploy.parameters.json to reflect your service principal attributes
  5. Start the deployment with the template and the updated template parameters.

While trying it out I thought that the 10 parameters make it flexible, but it’s still a hard start if you just want to quickly try this. So I created another bash script called getstarted.sh, which asks you for all the data interactively and calls the createsp.sh and deploy.sh scripts based on the input you entered. Just like below:

Getting Started

Final Words

With this in place, you have a solution that allows you to do both, reading instance metadata of the VM in which your software runs and also (with the right permissions set on the Service Principal) modify aspects of the VM through Azure Resource Manager APIs or Command Line Interfaces.

Sure, this reads like a complex, long thing. It would be much easier for instance metadata if you could do it without authentication and Service Principals. All I can say is that this will change and become easier. But for now, this is a solution, and I hope I’ve provided you with valuable assets that make the story less complex for you to achieve this goal!

And even once we have a simpler solution for instance metadata available in Azure, the content above shows you some advanced scripting concepts which I hope you can learn from. The coolest thing about it: since the Windows 10 Anniversary Update you can run all of the above on both Windows and Ubuntu Linux, BECAUSE everything is written as Bash scripts.

For me the nice side effect of this was experiencing how mature the Windows Subsystem for Linux seems to be. What really surprised me is that I can even run Node Version Manager and build-essential on it (I even tried compiling Node.js v5 with it, and it ran through and works).

Anyways – if you have any questions, reach out to me on Twitter.

A Deep Dive into Azure AD Multi-Tenant Apps, OAuth/OpenIdConnect Flows, Admin Consent and Azure AD Graph API

I am currently working with one of my main Global Independent Software Vendor (ISV) partners on on-boarding their solution into the Azure Marketplace. The main challenge that we face there is that the solution needs to do some post-provisioning steps in the end-customer’s target subscription as well as in their Azure Active Directory tenant:

  • Creating a Service Principal that can be used by the Software inside of the provisioned VM in the end-customer’s target directory.
  • Using that service principal to read data from the end-customer’s Azure Subscription.

Note: the end customer in this case is the customer, who purchases the product published by the ISV in the store!

Such cases typically require the creation of a "multi-tenant" Azure Active Directory application, and this application then needs to access the end-customer’s target directory using the Azure AD Graph API. At the same time, creating service principals is not an easy task.

A Multi-Tenant Web App to create Service Principals as Sample

To make this as practical as possible, I decided to create a web app that creates service principals in the target Azure Active Directory of an end-customer that’s using the web app.

This shows how the general multi-tenancy challenge can be solved and at the same time provides a handy tool for creating Service Principals, which is a hard task on its own.

All the details for using the app and for cloning the source code are available on my GitHub-repository under the link below. In addition, I also run the app on my Azure Subscription as a free-tier Azure Web App.

The documentation shows how to register an application in your Azure AD tenant to make it available as a multi-tenant application. It also shows how such an application is reflected in a customer’s target Azure AD tenant and how to manage access to it.

The sample also demonstrates the various OAuth and OpenIdConnect flows which are needed, in a simple yet practical and useful scenario. All of this should be easy to map to your own scenarios, and I found that, although Microsoft has decent docs for Azure Active Directory out there, such a focused end-to-end sample is not easy to find. That’s what I tried to create.

The basic/initial OpenIdConnect-Flow for Signing-In

So, let’s start with digging into the OAuth details. First of all, all the theory is well-explained on the official Microsoft Azure and MSDN documentation pages (see last section of the article).

I just go down to the protocol-trace level so that it’s easy for developers to understand what’s going on and how simple those protocols indeed are. It should also help with configuring/using other frameworks on all sorts of platforms appropriately to fit into this model.

For all of the below I am using the real deployment of my Service Principal Web App Demo mentioned above (note: I might remove that deployment at any point in time, since there is guidance on my GitHub repo for how to deploy it in your own Azure AD tenant, as well).

  1. First the user browses to the target application which is secured by Azure AD.

  2. That typically ends up in a redirect to Azure AD as an IDP to get an initial token. A typical Redirect Request for an OAuth Sign-In flow looks as follows (using line-breaks to make it easier to read):

    GET https://login.microsoftonline.com/common/oauth2/authorize?
        client_id=---your client id from azure ad app registration---
        &response_mode=form_post
        &response_type=code+id_token
        &scope=openid+profile
        &state=OpenIdConnect.AuthenticationProperties%3dW1HmJdRTdYw...
        &redirect_uri=https%3a%2f%2flocalhost%3a44330%2f HTTP/1.1
    Host: login.microsoftonline.com
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch, br
    Accept-Language: en-US,en;q=0.8
    
    
    • The client_id-parameter reflects the Client ID that is configured in Azure Active Directory for that application.
    • The scope-parameter requests the OpenIdConnect scopes (openid and profile) which are needed for the sign-in.
    • The nonce is used to protect against token replay attacks. Its value provided in the request must match the one in the response and is typically unique per user session.
  3. When the user (assume Admin) signs in for the first time, a consent dialog is displayed. This is part of the OAuth Authorization flow and gives the user a chance to "Accept" or decline the permissions the app needs. Since that is handled by Azure AD as an IdP, we don’t look into the details of the requests issued there.

    Consent

  4. Once the user accepted this consent, Azure AD posts a token to a target URL which was specified in the earlier request with the redirect_uri parameter. Let’s look at the details (again with newlines for readability):

    POST https://localhost:44330/ HTTP/1.1
    Host: localhost:44330
    Connection: keep-alive
    Content-Length: 2428
    Cache-Control: max-age=0
    Origin: https://login.microsoftonline.com
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
    Content-Type: application/x-www-form-urlencoded
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Referer: https://login.microsoftonline.com/common/Consent/Grant
    Accept-Encoding: gzip, deflate, br
    Accept-Language: en-US,en;q=0.8
    Cookie: OpenIdConnect.nonce.Z9f6E8u...
    
    code=AAABA...
    

    That post contains an OAuth authorization code in the body. This code can be used to request tokens from Azure AD for downstream calls to APIs which are also secured by Azure AD. Of course, the code will only work for APIs to which the app has been given permissions in the Azure AD portal.

    Permissions of the App

    For the "Service Principal Demo App" those permissions are highlighted in the screen shot above. The Code therefore would work for requests of tokens for the Azure Active Directory Graph API (identified as https://graph.windows.net) and the Azure Service Management and Resource Manager APIs (identified as https://management.core.windows.net).

  5. When the Service Principal Web App receives the request, it uses the authorization code to request an additional token that permits the app to call into the Azure Active Directory Graph API. This is another token request which the app executes when it receives the post above.

    POST https://login.microsoftonline.com/common/oauth2/token HTTP/1.1
    Accept: application/json
    x-client-last-request: a5db36d8-ab46-4dfc-b96e-9dc31cf06a5c
    x-client-last-response-time: 1284
    x-client-last-endpoint: token
    x-client-SKU: PCL.Desktop
    x-client-Ver: 3.10.0.0
    x-client-CPU: x64
    x-client-OS: Microsoft Windows NT 10.0.10586.0
    x-ms-PKeyAuth: 1.0
    client-request-id: 1eb9034c-e02c-4e7b-8c4f-0fe5e2faabfe
    return-client-request-id: true
    Content-Type: application/x-www-form-urlencoded
    Host: login.microsoftonline.com
    Content-Length: 1079
    Expect: 100-continue
    
    resource=https%3A%2F%2Fgraph.windows.net&client_id=---your client id from azure ad app registration---&client_secret=---your client secret configured in the azure ad portal&grant_type=authorization_code&code=---previously received authorization code---&redirect_uri=https%3A%2F%2Flocalhost%3A44330%2F
    

    Such a request then responds with a new OAuth Bearer Token that permits us to call into the Azure AD Graph API. This token then needs to be added to the HTTP Authorization header on each request. Here’s an example response for the request above:

    HTTP/1.1 200 OK
    Cache-Control: no-cache, no-store
    Pragma: no-cache
    Content-Type: application/json; charset=utf-8
    Expires: -1
    Server: Microsoft-IIS/8.5
    Strict-Transport-Security: max-age=31536000; includeSubDomains
    X-Content-Type-Options: nosniff
    x-ms-request-id: fb2db119-ca9e-421d-8007-6ae7e97d163e
    client-request-id: 1eb9034c-e02c-4e7b-8c4f-0fe5e2faabfe
    x-ms-responsehealth: TargetId=ESTSFE_IN_329;Action=None;Category=None;Health=0;Load=9;
    P3P: CP="DSP CUR OTPi IND OTRi ONL FIN"
    Set-Cookie: esctx=AAABAA ...; domain=.login.microsoftonline.com; path=/; secure; HttpOnly
    Set-Cookie: x-ms-gateway-slice=productionb; path=/; secure; HttpOnly
    Set-Cookie: stsservicecookie=ests; path=/; secure; HttpOnly
    X-Powered-By: ASP.NET
    Date: Wed, 29 Jun 2016 21:31:37 GMT
    Content-Length: 3826
    
    {
      "token_type": "Bearer",
      "scope": "Directory.AccessAsUser.All Directory.ReadWrite.All Group.ReadWrite.All User.Read",
      "expires_in": "3599",
      "ext_expires_in": "3600",
      "expires_on": "1467239498",
      "not_before": "1467235598",
      "resource": "https://graph.windows.net",
      "access_token": "eyJ0eXAiOiJK...",
      "refresh_token": "AAABAAAAiL9Kn2..."
    }
    

    The response is a JSON document containing some helpful details about the issued token as well as a refresh token to renew the actual access token. Note: if you need to get a new access token with the refresh token, you still need to have the Client ID and the App Secret available in that refresh request.
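
    For completeness, such a refresh request against the same token endpoint would look roughly like the following (a sketch with curl; the placeholder values are obviously not real):

    curl -X POST "https://login.microsoftonline.com/common/oauth2/token" \
         -H "Content-Type: application/x-www-form-urlencoded" \
         --data-urlencode "grant_type=refresh_token" \
         --data-urlencode "refresh_token=---your previously received refresh token---" \
         --data-urlencode "client_id=---your client id from azure ad app registration---" \
         --data-urlencode "client_secret=---your client secret configured in the azure ad portal---" \
         --data-urlencode "resource=https://graph.windows.net"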

OAuth Admin Consent for Multi-Tenant Azure AD Apps

Yikes, the biggest challenge I faced when building the tool was that ordinary Azure AD users (role = ‘User’) were not able to use it. You had to be a ‘Global Admin’ to execute it.

The main reason for that was that my app requires "acting as the signed-in user" against the Azure AD Graph API. And for that, the Azure AD team changed the default behavior for a good reason a while ago (well, in March 2015): https://blogs.msdn.microsoft.com/aadgraphteam/2015/03/18/update-to-graph-api-consent-permissions/.

So, to enable ordinary users to make use of such applications, a Global Admin first needs to "approve" the application for the target directory by running through an OAuth Admin Consent. This is a special type of consent that asks the Global Admin whether he wants to make the permissions the app requires available to ordinary users inside of the organization (technically: in the target directory against which the multi-tenant app works, depending on the signed-in user).

The steps are:

  1. The Global Admin needs to Sign-in into the application.

  2. The application needs to provide the appropriate "on-boarding"-function, which essentially initiates the Admin-Consent against the target directory of the signed-in user. I did this by just adding a button to my app that starts the Admin Consent.

    Admin Consent Function

  3. All that button does is composing a URL that goes against the Azure Active Directory OAuth endpoints to walk through the Admin Consent. This leads to the following request that initiates the Admin Consent:

    GET https://login.windows.net/yourazureadtenantid/oauth2/authorize?
        api-version=1.0
        &response_type=code
        &client_id=yourazureadappid
        &resource=https://management.core.windows.net/
        &redirect_uri%20=https://mszcoolserviceprincipal.azurewebsites.net/Home/CatchConsentResult
        &prompt=admin_consent 
        HTTP/1.1
    Host: login.windows.net
    Connection: keep-alive
    Cache-Control: max-age=0
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Referer: https://mszcoolserviceprincipal.azurewebsites.net/
    Accept-Encoding: gzip, deflate, sdch, br
    Accept-Language: en-US,en;q=0.8    
    

    The really important aspect of that request is the query-string parameter prompt=admin_consent which does the work. I combine the request with issuing an authorization code right away, but I do think that’s optional (would need to read back in the specs:)).

  4. After that initial admin consent is completed, every ordinary user (role = ‘User’) can sign in to the application and make use of it. The admin consent literally has an administrator approve the application for the organization for security reasons.

The Graph API calls with the issued tokens

Finally, with that Access Token we can make calls into the Azure AD Graph API. The sample-calls for creating a Service Principal are similar to the following types of requests.

  1. First, the app checks whether the "Application" needed for the Service Principal has already been created in Azure AD:

    GET https://graph.windows.net/yourazureadtenantid/applications()?$filter=identifierUris/any(iduri:iduri%20eq%20'http%3A%2F%2Fyourappidurienteredinthescreen')&api-version=1.6 HTTP/1.1
    DataServiceVersion: 3.0;NetFx
    MaxDataServiceVersion: 3.0;NetFx
    Accept: application/json;odata=minimalmetadata
    Accept-Charset: UTF-8
    DataServiceUrlConventions: KeyAsSegment
    User-Agent: Microsoft Azure Graph Client Library 2.1.1
    Authorization: Bearer eyJ0eXAiOiJK...
    X-ClientService-ClientTag: Office 365 API Tools 1.1.0612
    Host: graph.windows.net
    Connection: Keep-Alive
    
    

    The request above checks whether an Application is registered in the target tenant yourazureadtenantid with the App ID URI http://yourappidurienteredinthescreen. If an App already exists, the HTTP response will contain an OData-based JSON document with the resulting elements.

    {
      "odata.metadata": "https://graph.windows.net/yourazureadtenantid/$metadata#directoryObjects/Microsoft.DirectoryServices.Application",
      "value":[
        ...
      ]
    }
    
  2. If no application exists, it creates the application by posting an Application entity to the Graph API:

    POST https://graph.windows.net/yourazureadtenantid/applications?api-version=1.6 HTTP/1.1
    DataServiceVersion: 3.0;NetFx
    MaxDataServiceVersion: 3.0;NetFx
    Content-Type: application/json;odata=minimalmetadata
    Accept: application/json;odata=minimalmetadata
    Accept-Charset: UTF-8
    DataServiceUrlConventions: KeyAsSegment
    User-Agent: Microsoft Azure Graph Client Library 2.1.1
    Authorization: Bearer eyJ0eXAiOiJKV1...
    X-ClientService-ClientTag: Office 365 API Tools 1.1.0612
    Host: graph.windows.net
    Content-Length: 201
    Expect: 100-continue
     
    {
     "odata.type": "Microsoft.DirectoryServices.Application",
     "displayName": "YourAppDisplayName",
     "identifierUris@odata.type": "Collection(Edm.String)",
     "identifierUris": [
         "http://YourAppIdUri"
     ]
    }
    

    This post returns a detailed JSON object which contains all the details about the created App, including its AppId.

  3. Then the application does the same to check whether a Service Principal already exists for the previously created Application.

    GET https://graph.windows.net/yourazureadtenantid/servicePrincipals()?$filter=appId%20eq%20'b3ccae52-19bc-45a1-a4e4-f572f6963213'&api-version=1.6 HTTP/1.1
    DataServiceVersion: 1.0;NetFx
    MaxDataServiceVersion: 3.0;NetFx
    Accept: application/json;odata=minimalmetadata
    Accept-Charset: UTF-8
    DataServiceUrlConventions: KeyAsSegment
    User-Agent: Microsoft Azure Graph Client Library 2.1.1
    Authorization: Bearer eyJ0eXAiOiJKV1...
    X-ClientService-ClientTag: Office 365 API Tools 1.1.0612
    Host: graph.windows.net
    

    The response will again contain an OData JSON document with the service principal if it already exists. I am skipping the details for now…

  4. Finally, if the Service Principal does not exist, the app creates one with a password credential attached to it. That means this principal can be used by service and backend applications.

    POST https://graph.windows.net/yourazureadtenantid/servicePrincipals?api-version=1.6 HTTP/1.1
    DataServiceVersion: 3.0;NetFx
    MaxDataServiceVersion: 3.0;NetFx
    Content-Type: application/json;odata=minimalmetadata
    Accept: application/json;odata=minimalmetadata
    Accept-Charset: UTF-8
    DataServiceUrlConventions: KeyAsSegment
    User-Agent: Microsoft Azure Graph Client Library 2.1.1
    Authorization: Bearer eyJ0eXAiOiJKV1...
    X-ClientService-ClientTag: Office 365 API Tools 1.1.0612
    Host: graph.windows.net
    Content-Length: 627
    Expect: 100-continue
     
    {
     "odata.type": "Microsoft.DirectoryServices.ServicePrincipal",
     "accountEnabled": true,
     "appId": "b3ccae52-19bc-45a1-a4e4-f572f6963213",
     "displayName": "tttttteeeeeeeessssstttt",
     "passwordCredentials@odata.type": "Collection(Microsoft.DirectoryServices.PasswordCredential)",
     "passwordCredentials": [
         {
             "customKeyIdentifier": null,
             "endDate": "2017-06-29T21:43:15.6654372Z",
             "keyId": "0259571d-a663-4507-94e9-9381629e2116",
             "startDate": "2016-06-29T21:43:15.6639533Z",
             "value": "pass@word1"
         }
     ],
     "servicePrincipalNames@odata.type": "Collection(Edm.String)",
     "servicePrincipalNames": [
         "b3ccae52-19bc-45a1-a4e4-f572f6963213",
         "http://tttttteeeeeeeessssstttt"
     ]
    }
    

Note: One piece still missing is assigning appropriate Azure Resource Manager Role-Based Access Control roles so that the Service Principal can execute the needed service management operations against the management APIs.
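
A sketch of how that could be done with the classic Azure CLI, analogous to the role assignment shown in the instance-metadata post earlier in this blog (the variable names are just placeholders for the Service Principal’s object id and the target subscription id):

# Grant the newly created Service Principal e.g. the Reader role on the target subscription
azure role assignment create --objectId "$createdSpObjectId" \
                             --roleName Reader \
                             --subscription "$subId"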

Do you really need to know all of these details?

With that we went through all the protocol details for the OAuth, OpenIdConnect and Graph API calls that are needed to accomplish an end-to-end task. It’s actually a very practical look at how all those "sequence diagrams" talking about OAuth look in the real world.

My intent in showing these details was to help people who are working with programming languages and runtimes that do not have nice SDKs available for encapsulating those protocol details, so that they at least have a high-level overview and starting point without reading the OAuth and OpenIdConnect specs. I know it’s high-level, but it’s practical.

OAuth and OpenId Connect Azure AD Resources

The following links explain all the different query string parameters of the OAuth/OpenIdConnect flows with Azure AD. They are a great resource for better understanding the HTTP requests I’ve outlined above.

SDKs for languages and Runtimes

Fortunately, if you are a .NET, Java, Node.js, PHP or Python developer, there are numerous examples and resources available. Also for Azure AD’s Graph API there’s a nice tool available to dig into all the JSON and protocol details.

Here are the most important links:

For Graph API there are also good samples and SDKs out there:

I hope that was helpful and gives you good background or even a handy tool to create Service Principals. My partner needed to understand how to build such multi-tenant Azure AD applications that access the Azure AD Graph API, and they needed to create Service Principals out of such a multi-tenant web application. So I thought it was worth spending the additional time and getting it documented!

Microsoft Cognitive Services – Shopping Offer Comparisons with Bing Image Search v5

This week I got the chance to make my first attempts with Cognitive Services and image search, based on a request from my Global ISV partner:) While the services are actually easy to use for a developer, the documentation is behind and Internet search is highly misleading (since it mostly points to a previous Bing Search API that does not exist anymore). That’s why I decided to blog about it and point people in the right direction.

The Case – Price Comparisons

The use case is simple and has been made available with our latest Bing app for iOS: perform price comparisons of products across multiple shops. But instead of an app, this blog post is about doing this from within any application with the right usage of the new Bing Search APIs in their version 5.0.

Cognitive Services

At the annual //build 2016 conference, Microsoft announced its new Cognitive Services. These are services exposed through simple-to-use APIs for doing all sorts of intelligent stuff based on extensive data and machine learning algorithms. Face recognition, speech recognition and the like are some of the more advanced APIs.

What many people don’t know is that the Bing APIs are now also part of Cognitive Services. That’s because a lot of the intelligence behind Cognitive Services is powered by Bing services, including their intelligence and machine learning components. So don’t let yourself be misled by Internet search results pointing to any kind of previous services.

That means don’t search the Internet, just navigate to Cognitive services right away and dig into the documentation.

Walking through the Use Case

Ok, let’s walk through the Use Case in a schematic way by looking at the APIs and their responses. That will give you a good view on how it actually works.

  1. You need a Microsoft Account, so if you don’t have one, sign up for one first.
  2. If you have a Microsoft Account, the first thing you need is signing up for the Cognitive Services Preview. For that purpose just navigate to the Cognitive Services Subscriptions Page.
  3. Once you have signed up for the subscriptions, you get application keys for each of the different types of APIs as shown in my screen-shot below. You need to “Show” and “Copy” the key for Bing Search to implement the case I’ve described above:
    Subscription Keys
  4. Once you have copied the subscription key for Bing Search, you can use Bing Image Search to look for images products you want to get price comparisons for.
    • I know, that sounds a bit confusing. But let’s assume you want to look for offerings of a sports watch such as Garmin Forerunner 225 (which I am currently interested in:)).
    • To get to shopping offers through Bing or Bing APIs, you’d do an image search for “Garmin Forerunner 225” and then the “new Bing” and “new Bing APIs” will give you that additional details of shopping offers.
    • Let me show you how that works with the API, but note that the same thing works with Bing Search in the browser for end-users.
  5. For testing the APIs I used the available test user interfaces, which let you learn the APIs without writing code right away.
  6. Now, let’s dig into a few requests. If you want to get offers for e.g. a “Garmin Forerunner” watch, first you need to find images for that watch.
    GET https://bingapis.azure-api.net/api/v5/images/search?q=garmin forerunner 225&count=10&offset=0&mkt=en-us&safeSearch=Moderate HTTP/1.1
    Host: bingapis.azure-api.net
    Ocp-Apim-Subscription-Key: <<your API Key taken from the previous screen shot above>>
    
  7. Next you need to examine the results of the request above. The interesting pieces of the JSON response are the imageInsightsToken and the insightsSourcesSummary elements as highlighted below:
    Insights Details Highlighted
  8. These two attributes are used for the following purposes:
    • imageInsightsToken is used in the next subsequent request to get further details about the image kept and managed by Bing.
    • insightsSourcesSummary is something you can use to assess whether it’s worth querying further insights for the image. E.g. in my case I wanted to get as many shop offers for the Garmin as possible. So if I wrote this in a program, I
      would search the top-most search results (the first page of JSON elements I got from the previous API request) and pick those with the highest shoppingSourcesCount value as a simple strategy (sketched in code right after this walkthrough).
  9. So, let’s use the imageInsightsToken to get further details about the image. The following code shows the next request we’re executing, with the token passed in as an additional parameter. Also note the use of the modulesRequested parameter,
    which I use to specify which kinds of additional details I’d like to get from Bing for that image. That said, there are different modules providing additional information beyond the shopping sources.

    GET https://bingapis.azure-api.net/api/v5/images/search?
      q=garmin forerunner 225&count=10&offset=0
      &mkt=en-us
      &safeSearch=Moderate
      &modulesRequested=shoppingSources
      &insightstoken=ccid_eDfbozgF*mid_7F932B775F08CAD51E0BC3609B3ABA1B7AB73856*simid_608041772346312873 
      HTTP/1.1
    Host: bingapis.azure-api.net
    Ocp-Apim-Subscription-Key: <<your API Key taken from the previous screen shot above>>
    
  10. Finally we get the results we want from this request: a list of sources offering the product for a given price, plus the basic stock information Bing extracted from the web pages of those sources for my Garmin Forerunner 225 search.
    Shopping Sources Results
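
To make the flow above a bit more tangible in code, here is a minimal PowerShell sketch of the two requests. Treat it as an illustration rather than production code: the endpoint, header and query parameters are taken from the requests above, while the property names I navigate (the value array in particular) are assumptions based on the JSON responses shown in the screenshots and may differ slightly in the live API.

# Minimal sketch of the two-step flow described above (illustrative only).
$subscriptionKey = "<<your Bing Search subscription key>>"
$headers = @{ "Ocp-Apim-Subscription-Key" = $subscriptionKey }
$query = [uri]::EscapeDataString("garmin forerunner 225")
$searchUrl = "https://bingapis.azure-api.net/api/v5/images/search?q=$query&count=10&offset=0&mkt=en-us&safeSearch=Moderate"

# Step 1: image search to get candidate images incl. imageInsightsToken and insightsSourcesSummary
$results = Invoke-RestMethod -Uri $searchUrl -Headers $headers

# Simple strategy: pick the result with the highest shoppingSourcesCount
$best = $results.value |
    Sort-Object { $_.insightsSourcesSummary.shoppingSourcesCount } -Descending |
    Select-Object -First 1

# Step 2: request the shoppingSources module for that image using its insights token
$insightsUrl = "$searchUrl&modulesRequested=shoppingSources&insightstoken=$($best.imageInsightsToken)"
$insights = Invoke-RestMethod -Uri $insightsUrl -Headers $headers

# Inspect the shopping offers in the response (the exact property path may differ)
$insights | ConvertTo-Json -Depth 10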

I really find this kind of cool. It was really interesting to work through this with my ISV partner, and given that the documentation is not always up2date on those new services, I thought it’s worth blogging about it:).

Further resources for reading: here are a few helpful links with further details and information. Fortunately, between the time I started writing this post and the time I published it, the team updated the documentation on MSDN with additional details!

Let me know your thoughts, best via http://twitter.com/mszcool!

NServiceBus, Azure Service Bus and Service Bus for Windows Server – A PoC for a Hybrid Cloud / Portable Solution

NServiceBus is a very popular messaging and workflow framework for .NET developers across the globe. This week a few peers of mine and I worked with one of our global ISV partners to evaluate whether NServiceBus can be used for hybrid cloud and portable solutions that can be moved seamlessly from on-premises to the public cloud and vice versa.

My task was to evaluate whether NServiceBus can be used with both Microsoft Azure Service Bus in the public cloud and Service Bus 1.1 for Windows Server in the private cloud. It was a very interesting collaboration, and I finally got to write some prototype code for one of our partners again. How cool is that – doing interesting stuff that at the same time helps a partner. That’s how it should be!

Part #1: On-Premises Service Bus 1.1 Environment

The journey and prototyping began with setting up an on-premises Service Bus 1.1 environment in my home lab. Fortunately there are some good instructions out there, but of course nothing comes without pitfalls. Here’s a good set of instructions to start with – note that I set up an entire Azure Pack Express environment, which is clearly optional. But it makes things more convenient, especially for presentations, since it provides the nice, good old Azure Management Portal experience for your Service Bus on-premises. Here’s where you should look to set things up:

  • Install the Azure Pack Express Setup
    • This shows how to set up a basic Azure Pack environment using the Web Platform Installer.
    • I did not run into any problems installing it on a Hyper-V box in my home lab, so it should be fairly straightforward.
    • You can install everything on a single machine. All you need is SQL Server (Express is sufficient) pre-installed.
  • Install Service Bus for Windows Server
    • Again this happens via the Web Platform Installer. With this I had a little challenge: unfortunately, as of writing this article, the link to the required version of the Windows Fabric in the Web Platform Installer was broken. I’ve uploaded it
      to my public OneDrive for convenience – you find it here. But I’ve had conversations with the product team and they will fix the broken link, so by the time you try it, it might work already.
  • Configure Service Bus using the Wizard.
    • After installing you need to configure Service Bus for Windows Server. That happens through a Wizard. It essentially allows you to configure endpoints, ports and certificates used for security purposes for Service Bus 1.1 for Windows Server.
    • The configuration failed on the first attempt because something went wrong with installing the Service Bus patch. Uninstalling and re-installing solved the problem.
  • Configure Service Bus for Windows Server for the Azure Pack Portal
    • That’s the final step to get the Azure Pack management portal experience for Service Bus 1.1 for Windows Server.
    • If you are fine with managing Service Bus through PowerShell, you can skip the entire Azure Pack Express stuff, start with Service Bus 1.1 right away and manage it through PowerShell (a minimal sketch of that follows right after this list).
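
As a rough illustration of what that PowerShell-only route looks like, here is a minimal sketch using the Service Bus for Windows Server cmdlets. The namespace and account names are placeholders; check the Service Bus 1.1 PowerShell documentation for the full parameter sets.

# Sketch: managing Service Bus 1.1 for Windows Server via PowerShell (run on a farm node).
# Inspect the farm that the configuration wizard created
Get-SBFarm
Get-SBFarmStatus

# Create a namespace and grant a user management permissions (placeholder names)
New-SBNamespace -Name "MyOnPremNamespace" -ManageUsers "MYDOMAIN\myuser"

# Retrieve the on-premises connection string you later hand to your applications
Get-SBClientConfiguration -Namespaces "MyOnPremNamespace"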

At the end of this journey, which took me about half a day overall starting from scratch and figuring out the little gotchas mentioned above, I had a lab environment to test against. I am a fan of using Royal TS, hence the screen shot with my on-premises Service Bus as web pages embedded in Royal TS:

Part #2: NServiceBus and Azure Service Bus

The second part of the challenge was easy – figuring out whether NServiceBus already supports Azure Service Bus. That would give us a good starting point, wouldn’t it!? Here are the docs for the NServiceBus transport extension for Azure Service Bus, along with the source code.

But the point is: the earliest version that supports Azure seems to be NServiceBus v5.0.0, and the code base started with a Microsoft.ServiceBus.dll above version 3.x. That version of the library is not compatible with Service Bus 1.1 for Windows Server. So I had to dig into the source code and back-port the library. Fortunately, Particular open-sources most of the framework’s bits and pieces on GitHub – including the NServiceBus.AzureServiceBus transport here.

Note that I am referring to version 6.2 of the implementation directly, since that works with NServiceBus 5.0.0, which our global ISV partner is using at this point in time. I also tried back-porting the current development branch, but that turned
out to be way more complex and risky. And it was not needed for the partner, either:)

Part #3: Back-Porting to Microsoft.ServiceBus.dll v2.1

So the required step was to back-port to a Service Bus SDK library that also works with Service Bus 1.1. Service Bus for Windows Server recently received a patch to work with .NET 4.6.1, but it has not received any major updates since its original release,
so it is behind, with regards to its APIs, compared to Service Bus in Azure.

I’ve done all of the steps below on my GitHub repository in a fork of the original implementation. Note that you should only look at my work in the branch ‘support-6.2’, which is the one that works with NServiceBus 5.0.0. The rest should be considered experiments as of right now:)

Here is the link to my GitHub repo and the fork!!

The first step was to remove the WindowsAzure.ServiceBus NuGet package and replace it with a library that works with Service Bus for Windows Server. Fortunately, Microsoft released a separate NuGet package that contains a version compatible with Service Bus 1.1:

The rest was all about finding where the NServiceBus implementation uses features that are not available in version 2.1 of the Service Bus SDK, and testing it against my Service Bus 1.1 for Windows Server lab setup. I think the best way to see
what actually changed is to look at the change logs on my GitHub repository:

  • Initial back-port with most code changes (click here to open details on GitHub)
    • Update the NuGet Package to “ServiceBus.v1_1” instead of “WindowsAzure.ServiceBus”.
    • Remove EnablePartitioning because that’s not supported on SB 1.1.
    • Use MessagingFactory.CreateFromConnectionString() instead of MessagingFactory.Create(), because the latter does not account for different ports on different endpoints for the different APIs Service Bus exposes. But that’s typically the case on default setups of Service Bus for Windows Server (see my first screenshot).
    • I also added some regular expressions to detect whether a Service Bus connection string targets on-premises or the public cloud, in order to keep most of the default behaviors intact when connecting against the public cloud. See the code snippet below. It might not be complete or perfect, but it fulfills the basic needs.
  • Added some samples (click here to open details on GitHub)
    • This contains a basic Sender and Receiver implementation that uses the transport.
    • You need to set the environment variable AzureServiceBus.ConnectionString in a command prompt and start Visual Studio from that prompt for the samples to execute successfully. Btw., that’s also needed if you want to run the tests; in that case you
      also need to set AzureServiceBus.ConnectionString.Fallback to an alternate Service Bus connection string.

Here is the little code-snippet that checks if the code is used for on-premises Service Bus services or for Azure Service Bus instances:

class CreatesMessagingFactories : ICreateMessagingFactories
{
    #region mszcool 2016-04-01

    // mszcool - Added Connection String parsing to detect whether a public or private cloud Service Bus is addressed!
    public static readonly string Sample = "Endpoint=sb://[namespace name].servicebus.windows.net;SharedAccessKeyName=[shared access key name];SharedAccessKey=[shared access key]";
    private static readonly string Pattern =
        "^Endpoint=sb://(?<namespaceName>[A-Za-z][A-Za-z0-9-]{4,48}[A-Za-z0-9]).servicebus.windows.net/?;SharedAccessKeyName=(?<sharedAccessPolicyName>[\\w\\W]+);SharedAccessKey=(?<sharedAccessPolicyValue>[\\w\\W]+)$";

    public static readonly string OnPremSample = "Endpoint=[namespace name];StsEndpoint=[sts endpoint address];RuntimePort=[port];ManagementPort=[port];SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=[shared access key]";
    private static readonly string OnPremPattern =
        "^Endpoint=sb\\://(?<serverName>[A-Za-z][A-Za-z0-9\\-\\.]+)/(?<namespaceName>[A-Za-z][A-Za-z0-9]{4,48}[A-Za-z0-9])/?;" +
        "StsEndPoint=(?<stsEndpoint>https\\://[A-Za-z][A-Za-z0-9\\-\\.]+\\:[0-9]{2,5}/[A-Za-z][A-Za-z0-9]+)/?;" +
        "RuntimePort=[0-9]{2,5};ManagementPort=[0-9]{2,5};" +
        "SharedAccessKeyName=(?<sharedAccessPolicyName>[\\w\\W]+);" +
        "SharedAccessKey=(?<sharedAccessPolicyValue>[\\w\\W]+)$";

    private bool DetectPrivateCloudConnectionString(string connectionString)
    {
        // Note: Regex.IsMatch expects the input string first and the pattern second
        if (Regex.IsMatch(connectionString, OnPremPattern, RegexOptions.IgnoreCase))
            return true;
        else if (Regex.IsMatch(connectionString, Pattern, RegexOptions.IgnoreCase))
            return false;
        else
        {
            throw new ArgumentException("Invalid Azure Service Bus connection string configured. " +
                                        $"Valid examples: {Environment.NewLine}" +
                                        $"public cloud: {Sample} {Environment.NewLine}" +
                                        $"private cloud (SB 1.1): {OnPremSample}");
        }
    }

    #endregion

    ICreateNamespaceManagers createNamespaceManagers;
    // ... rest of the implementation ...
}

The part where I needed this detection most was deciding how to instantiate the MessagingFactory. This is the relevant piece of code – note that MessagingFactory.Create() with the NamespaceManager address passed in only works in the public cloud, not with Service Bus 1.1 for Windows Server:

class CreatesMessagingFactories : ICreateMessagingFactories
{
    // ... earlier stuff in that class including 'DetectPrivateCloudConnectionString' ...

    ICreateNamespaceManagers createNamespaceManagers;

    public CreatesMessagingFactories(ICreateNamespaceManagers createNamespaceManagers)
    {
        this.createNamespaceManagers = createNamespaceManagers;
    }

    public MessagingFactory Create(Address address)
    {
        var potentialConnectionString = address.Machine;
        var namespaceManager = createNamespaceManagers.Create(potentialConnectionString);

        // mszcool - Updated to detect if Service Bus 1.1 for Windows Server is used
        if (DetectPrivateCloudConnectionString(potentialConnectionString))
        {
            // mszcool - Need to use this approach because different ports for control and transport endpoints are used
            return MessagingFactory.CreateFromConnectionString(potentialConnectionString);
        }
        else
        {
            var settings = new MessagingFactorySettings
            {
                TokenProvider = namespaceManager.Settings.TokenProvider,
                NetMessagingTransportSettings =
                {
                    BatchFlushInterval = TimeSpan.FromSeconds(0.1)
                }
            };
            return MessagingFactory.Create(namespaceManager.Address, settings);
        }
    }
}

Finally, with those fixes incorporated, I was able to get almost everything working and all except two tests passing for now. For the proof-of-concept that is good enough, since it proves that the partner can achieve what they need to achieve.

Part #4: See it in Action

Now comes the cool part – testing it out and seeing it in action. The samples I’ve added to the Git repository are simple messaging examples which I’ve modified from the NServiceBus samples repository. Note that I took the non-durable MSMQ sample as the basis, since the starting point for the partner was MSMQ and I wanted something super-simple to start with. That is just how far I got; eventually I’ll try other samples (but no promise at this time:)). Below is the code snippet of the sender – the receiver looks nearly identical and you can look it up in my repository on GitHub:

static void Main()
{
    string connStr = System.Environment.GetEnvironmentVariable("AzureServiceBus.ConnectionString");

    Console.Title = "Samples.MessageDurability.Sender";
    #region non-transactional
    BusConfiguration busConfiguration = new BusConfiguration();
    busConfiguration.Transactions()
        .Disable();
    #endregion
    busConfiguration.EndpointName("Samples.MessageDurability.Sender");
    busConfiguration.ScaleOut().UseSingleBrokerQueue();
    busConfiguration.UseTransport<AzureServiceBusTransport>()
        .ConnectionString(connStr);
    busConfiguration.UseSerialization<JsonSerializer>();
    busConfiguration.EnableInstallers();
    busConfiguration.UsePersistence<InMemoryPersistence>();

    using (IBus bus = Bus.Create(busConfiguration).Start())
    {
        bus.Send("Samples.MessageDurability.Receiver", new MyMessage());
        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }
}

Here’s the code actually in action and working. You see what it produced on my on-premises Service Bus as well as the log output from the console windows. Note that the message handler part of the receiver outputs that it received a message.

One thing I played around with was having two message handlers; that’s why you see two output lines for one single message in my receiver. To clarify, here’s the code of MyHandler.cs from the Receiver project which outputs those lines:

public class MyHandler : IHandleMessages<MyMessage>
{
    static ILog logger = LogManager.GetLogger<MyHandler>();

    public void Handle(MyMessage message)
    {
        logger.Info("Hello from MyHandler");
    }
}

public class MyHandler2 : IHandleMessages<MyMessage>
{
    static ILog logger = LogManager.GetLogger<MyHandler2>();

    public void Handle(MyMessage message)
    {
        logger.Info("Hello from MyHandler2!");
    }
}

Final Words

I think this proof-of-concept we’ve built, alongside other aspects we covered for that global software vendor partner in the UK, demonstrates several things:

  • That it is possible to have a solution that works (nearly) seamlessly on-premises and in the public cloud with largely the same code base.
    • The situation should DRAMATICALLY improve once Microsoft has released the Azure Stack, which is the successor of what I’ve used here (which was the Azure Pack).
    • We can expect that the Azure Stack will deliver a much more up2date and consistent experience with Azure in the public cloud once it is fully available incl. Service Bus.
  • That NServiceBus, one of the most important third-party middleware frameworks, plays very well together with Azure, and that it can also be used with Service Bus on-premises with some caveats (like back-porting the transport library).
    • An alternative, which I also tried to demonstrate with the simple sample, would be to use the MSMQ NServiceBus transport on-premises and the AzureServiceBus transport from NServiceBus for public cloud deployments. As long as only features
      supported on both sides are used, that might be the preferred way, since with that you can fully rely on code delivered for you by NServiceBus without any changes.

Note that my attempts are meant to be a Proof-of-Concept, only. You can look at them, try them and even apply them for your solutions fully at your own risk:)

I think it was a great experience working with the team in UK on this part of a larger Proof-of-Concept (which also included e.g. Azure Service Fabric for software that
needs to be portable between on-premises and the public cloud but wants to make use of a true Platform-as-a-Service foundation).

I hope you enjoyed reading this and found it interesting and useful.

How-to remove secrets from the entire history of a Git-repository!!

I really do like Git a lot, and even for my private projects I use it as the default. But some aspects of it are quite tricky. A well-known practice is that you should never check in secrets or things you don’t want to share with others into a Git repository. That is especially important with public repositories hosted on e.g. GitHub.

Well, knowing that you should not and actually never forgetting about it are two different things. Sometimes it just happens. And even if you are careful with secrets, it can also be other stuff you checked in but didn’t want to share with others. That’s what happened to me when I wrote the last blog post about automating my developer machine setup and published my Machine Setup Script.

The secrets in the history on GitHub!?

As explained in my previous blog post, in the setup automation script I use for re-setting up a fresh developer machine, I also clone a handful of repositories which are relevant to me and/or to which I contributed some code. The majority of those repositories are public on GitHub. But some of them are from real-world projects with our customers and partners and are hosted in a private VSTS environment we run for our global team. I accidentally published that list of git clone commands as well.

No passwords, no secrets – but the repository names sometimes contained the names of the partners/customers and some of that work is not done or public, yet. So even though these were not secrets, I am not supposed to share them, yet.

Unfortunately, I realized that only after a few check-ins. So the entire history in my public GitHub repository contained those repository-names from an internal VSTS environment which I didn’t want to share. Damn… the post is out, the link points to
the repository… what to do?

How-to remove secrets/content from the entire history with Git?

Of course the “easy” way for this specific case would have been to delete the repository and re-create a new one with the fixed file published. That works for cases where the history is not really important and where you have a truly small repository. In other words, it works for samples and the likes. But I had even received a pull request for that file which I didn’t want to lose, either. By all means, deleting and re-creating is not something that should be considered a solution for this problem.

So, I did a little Internet search and came across something that, I guess, can save many GitHub repositories from otherwise mandatory deletion when things need to be removed from the entire history:

The BFG Repo Cleaner

This is an awesome tool if you ran into the problem I’ve had. Let’s say you have published something into a Git repository across multiple commits and pushes that you want to get rid of across the entire history. All you need to do are the following steps (a consolidated snippet follows after the list):

  1. Download the BFG Repo Cleaner into a local directory of your choice.
    1. The app is written in Scala.
    2. It requires a Java runtime on your machine.
    3. It is distributed as a JAR package that contains all dependencies.
  2. Open up a command prompt and switch to a temporary directory.
    1. I did this in a temp-directory because it requires a new git clone --mirror of your repository which is a 1:1 mirror of the remote repository.
    2. After that you need to push that mirror back to the remote repository again. And then you can delete the mirror and return to your ordinary repository clone.
  3. Perform a clone of your repository with the option --mirror (I am using my devmachinesetup-repo here since I had to do it with this one, so just replace devmachinesetup with any of your repository-names in the commands below).
    1. git clone --mirror https://github.com/mszcool/devmachinesetup.git
    2. This clones a mirror of your remote repository with the entire history into a sub-folder of the current folder called devmachinesetup.git.
  4. Stay in the folder that contains the devmachinesetup.git folder with the mirrored repository in it.
  5. Create a text file that contains the text you want to purge from the history of all files in your git repository.
    1. Each line contains a string (incl. spaces, special characters etc.) that you want to remove. In my case these strings were the complete git clone <<repositoryname>> commands which I wanted to remove from the history
      of commits of the script in the repository. Each line in this text-file contained one of those entire commands.
    2. BFG searches every file in your Git mirror folder and replaces each instance of each line from the text file with the text *** REMOVED *** in the target files of the repository.
    3. A little sample excerpt shows how simple the content of that text file is – in my case it was just one git clone command per line which I wanted to remove from the history:
      git clone https://xyz.visualstudio.com/_DefaultCollection/first.git
      git clone https://xyz.visualstudio.com/_DefaultCollection/second.git
      git clone https://xyz.visualstudio.com/_DefaultCollection/third%20complex%20name.git thirdrepo
  6. Execute the BFG command. Note that BFG is based on the Java-runtime, so either add the folder with BFG JAR-package to your CLASSPATH environment variable or specify the full path to the JAR-package when executing Java. This looks similar to:
    1. java -jar C:\Temp\bfg-1.12.8.jar --replace-text myunwantedtext.txt devmachinesetup.git
    2. Note that when you download the BFG JAR package, the version in the name of the .jar-file might be different.
    3. The file myunwantedtext.txt contained the full git clone commands (one per line) which I wanted to remove, as shown in the example above.
  7. Now BFG has replaced the unwanted content in the local clone. Last-but-not-least you need to push that one back to the remote repository.
    1. In your command prompt window, change into the devmachinesetup.git directory that contains your Git mirror.
    2. Execute git push to push the cleaned mirror back to the remote repository.
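
Put together, the whole clean-up is just a handful of commands. The snippet below simply consolidates the steps above (paths and the repository name are the ones from my example, so adjust them); the reflog/gc commands are what the BFG documentation recommends running before the final push in order to actually strip the old objects.

# Consolidated sketch of the steps above (run from a PowerShell or command prompt window)
cd C:\Temp

# 1. Mirror-clone the repository you want to clean
git clone --mirror https://github.com/mszcool/devmachinesetup.git

# 2. Run BFG against the mirror, replacing every line listed in myunwantedtext.txt
java -jar C:\Temp\bfg-1.12.8.jar --replace-text myunwantedtext.txt devmachinesetup.git

# 3. From inside the mirror, expire the old history (as recommended by the BFG docs) and push back
cd .\devmachinesetup.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive
git push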

That’s it, you’re done. After I executed the steps above on my repository, I checked several commits online to verify it worked. In your case, the result should look similar to what I’ve achieved once you’ve completed the steps above:

Results of Removing unwanted content

Final words

Removing unwanted content from the entire history of a Git repository is sometimes needed – whether it’s about accidental commits of secrets or other (sensitive) content, or e.g. large files you want to clean out of your repository.

The BFG Repo Cleaner is a handy tool for such cases. It can indeed be used for cases such as the one I described. But it also contains options for other cases such as removing large files from the history of your repository which are not needed there, anymore.

BFG is cool and handy, but if you need more advanced scenarios, you might need to fall back to the way more powerful, yet much more complex git-filter-branch tool (here). I guess that for 80% of the cases BFG might be good enough, and given the fact that it is super-easy to use, I’d give it a chance first before digging through the docs of git-filter-branch.

Kudos to the folks who built BFG… great job and thank you very much for saving my day (I will donate;))…

My “Developer Machine Setup Automation Script” / Chocolatey & PowerShell published

I know, I wanted to blog more often… but then a happy event came in between – the birth of my little baby daughter, Reinhard’s two-years-younger sister Linda. Now I am sitting here at the registry office for her documents. What better time to write a blog post than while waiting for the next steps!?

Since my brand-new Surface Book arrived yesterday, one thing I have to do is set it up for my daily use. That means installing all the handy tools that I use on a daily basis, such as PDF readers, KeePass and the likes. It also means setting up the
many tools and development environments I use (some of them more often, others less often).

Well, now the cool thing is that I’ve automated most of these setup procedures using Chocolatey and PowerShell. And given the situation I am
right now in (new daughter, waiting at the registry office, Surface Book arrived) I thought why not share my script with the rest of the world and explain it a bit…

Chocolatey and PowerShell Script

A while ago, when one of my peers, Kristofer Liljeblad, pointed me to Chocolatey, I started building out a fairly comprehensive script that automates most of the steps for setting up my machines. I’ve now published that script on my GitHub repository:

mszCool’s Dev Machine Setup Script

Since I don’t want to install all tools on every machine, I added some switches to the script that allow me to install only a group of tools. E.g. when I set up a machine for one of my relatives that typically is not used for development,
I only install a bunch of end-user tools such as a PDF reader. On my developer machines I typically run all of the switches.

But it turned out that sometimes PowerShell needs to be restarted after certain install procedures. Therefore I added a few other switches to the script so that I can re-start PowerShell in between those steps.

Finally, I still do install certain parts manually. E.g. for SQL Server Management Studio on Windows 10 I need to add the .NET Framework 3.5 Feature Set which I have not coded into PowerShell, yet (I know it does work).

All of this resulted in a workflow of several phases, for which most of the time the script just does its work, but sometimes I need to intervene and re-start PowerShell or do some manual installation steps. I know, there might be more elegant solutions
to this. But remember, it’s just for setting up my developer machines, so it’s quick and pragmatic.

Using the Script

To use the script, as I said there are a few steps that need to be executed manually upfront.

  • Install the .NET Framework 3.5 Feature through the Windows Settings Panel
  • Start an elevated PowerShell window (run it as Administrator)
  • Set the PowerShell Execution Policy to Unrestricted by executing Set-ExecutionPolicy Unrestricted

After that you can start the script the first time. For that purpose I typically download it into a temp-directory on my local machine since it downloads things (which it typically deletes afterwards, but not always:)). From there, just execute the following
commands in the previously opened PowerShell Window:

.\Install-WindowsMachine.ps1 -installChoco -tools -ittools -dev -data

This installs the following items:

  • -installChoco installs Chocolatey as per their homepage.
  • -tools installs tools I commonly use such as Adobe Reader or KeePass.
  • -ittools installs tools I categorized as OS tools, such as the SysInternals suite.
  • -dev installs a set of independent tools I use for development such as Fiddler.
  • -data installs tools to manage databases, but not the DB-engines themselves. Examples are SQL Server Management Studio or MySQL Workbench.

Since this phase installs many tools which are added to the environment path, it is now necessary to re-start PowerShell before moving on.

Installing IDEs

The next step is installing Integrated Development Environments such as Visual Studio. If you are fine with the [free Community Edition of Visual Studio](https://www.visualstudio.com/en-us/products/visual-studio-community-vs.aspx), the script includes a switch for that.

As an Eclipse-based IDE I decided to stick with the Spring Tool Suite. The main reasons are, one, that Spring is one of the most popular Java frameworks and, two, that it comes with a whole lot
of add-ins (e.g. Maven and many more) pre-packaged. That makes it convenient! For Java I sometimes also play around with IntelliJ IDEA, so I included that as well.

For everything else I am using Visual Studio Code and Sublime, although I haven’t included VS Code in the script yet (didn’t have the time to update it:)) and still
install it manually.

If you don’t like those choices, now is the time to manually install your favorite versions of Visual Studio, Eclipse and co. If you like them, the command below does the job and installs all the aforementioned environments – for Visual Studio
I support both versions, 2013 and 2015:

.\Install-WindowsMachine.ps1 -installVs -vsVersion 2015 -installOtherIDE

After that phase you need to re-start PowerShell again, because IDEs, and especially Visual Studio, add a few things to the environment path.

Visual Studio Extensions, Web PI and Database Engines

Yes, I also got bored of installing Visual Studio extensions over and over again. So what I did is write a little PowerShell function which downloads Visual Studio extensions and installs them using the Visual Studio Extension Installer. There are also a few more tools which I typically need with Visual Studio and which are available in the Web Platform Installer (WebPI) – things such as the Azure SDKs and tools or Azure PowerShell and the likes.
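
To give you an idea of what that helper does, here is a heavily simplified sketch of such a function. It is not the exact code from my script – the download URL is a placeholder, the path to VSIXInstaller.exe depends on your Visual Studio version, and I am assuming the installer’s /quiet switch for silent installs:

# Simplified sketch: download a Visual Studio extension (VSIX) and install it silently.
function Install-VsExtensionSketch {
    param(
        [Parameter(Mandatory = $true)][string]$vsixUrl,
        [string]$vsixInstaller = "C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\VSIXInstaller.exe"
    )

    # Download the extension package to a temporary file
    $vsixFile = Join-Path $env:TEMP ([System.IO.Path]::GetRandomFileName() + ".vsix")
    Invoke-WebRequest -Uri $vsixUrl -OutFile $vsixFile

    # Install it silently and wait for the installer to finish
    Start-Process -FilePath $vsixInstaller -ArgumentList "/quiet `"$vsixFile`"" -Wait

    # Clean up the downloaded package
    Remove-Item $vsixFile -Force
}

# Hypothetical usage:
# Install-VsExtensionSketch -vsixUrl "https://example.com/some-extension.vsix"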

Simply call the script with the following switches after you have Visual Studio installed (note that -dev2 and -vsext do not depend on each other):

.\Install-WindowsMachine.ps1 -dev2 -vsext -vsVersion 2015

To install database engines such as SQL Server Express or Cassandra (based on Datastax’s distribution), I’ve included an additional switch:

.\Install-WindowsMachine.ps1 -dataSrv

Cloning Repositories

Finally, on every developer machine I need to clone many repositories from GitHub or private TFS and Bitbucket environments. I’ve also added that to the script, although I removed the ones from private environments for this publication. For this
I’ve added the following switch, which can be combined with ANY of the above AFTER -dev has run (because that switch installs Git):

.\Install-WindowsMachine.ps1 -cloneRepos

What’s missing & Caveats

This script really saves me a lot of time whenever setting up machines on a regular basis (i.e. development VMs for trying out something or re-setting up my machines because of Windows Insider Build stuff etc.).

A few things are missing which I typically install. I plan to add those pieces to the script at some point in time in the future as soon as I have some spare time. But for now I do install them manually:

For those I need to update the script to be able to download and run MSIs in silent mode. Also, posh-git requires cloning a Git repo and executing a few statements. All of that is simple, but I haven’t had the time to do it yet. But I also accept
pull requests if you find the time to work on it and think it’s a useful extension to this work.

Final Words

This script is a simple and pragmatic thing. It installs most things in an automated way, and typically with it and a good Internet connection I have a fully working developer machine after a few hours without really doing a lot. The things
I install manually (e.g. Visual Studio 2015 Enterprise instead of Community) also run a bit longer, but they don’t distract me while I do some other work on one of my other machines. And that’s the point – I can set up new images nearly end-to-end while
working in parallel, and it won’t hold me back from other work (as opposed to continuously clicking through some installer wizards:)).

I hope you like the work and, as mentioned, I am happy about any feedback and pull requests for it. Just go, look, download and use it from my GitHub repository.

Azure VMs – SQL Server AlwaysOn Setup across multiple Data Centers fully automated (Classic Service Management)

Last December I started working with two of my peers, Max Knor and Igor Pagliai, with a partner in Madrid on implementing a Cross-Data Center SQL Server AlwaysOn availability group setup for a financial services solution which is supposed to be provided to 1000s of banks across the world running in Azure. Igor posted about our setup experience which we partially automated with Azure PowerShell and Windows PowerShell – see here.

At the moment the partner’s software still requires SQL Server in VMs as opposed to Azure SQL Databases because of some legacy functions they use from full SQL Server – therefore this decision.

One of the bold goals was to fully enable the partner and their customers to embrace DevOps and continuous delivery across multiple environments. For this purpose we wanted to FULLY AUTOMATE the setup of their application together with an entire cross-data-center SQL Server AlwaysOn environment as outlined in the following picture:

In December we did a one-week hackfest to start these efforts. We successfully did setup the environment, but partially automated, only. Over the past weeks we went through the final effort to fully automate the process. I’ve published the result on my github repository here:

Deployment Scripts Sample Published on my GitHub Repository

Note: Not Azure Resource Groups, yet

Since Azure Resource Manager v2 which would allow us to dramatically improve the performance and reduce the complexity of the basic Azure VM environment setup is still in Preview, we were forced to use traditional Azure Service Management.

But about 50%-60% of the efforts we have done are re-usable given the way we built up the scripts. E.g. all the database setup and custom service account setup which is primarily built on-top of Azure Custom Script VM Extensions can be re-used after the basic VM setup is completed. We are planning to create a next version of the scripts that does the fundamental setup using Azure Resource Groups because we clearly see the advantages.

Basic Architecture of the Scripts

Essentially the scripts are structured into the following main parts which you would need to touch if you want to leverage them or understand them for learning purposes as shown below:

  • Prep-ProvisionMachine.ps1 (prepare deployment machine)
    A basic script you should execute on a machine before starting first automated deployments. It installs certificates for encrypting passwords used as parameters to Custom Script VM Extensions as well as copying the basic PowerShell modules into the local PowerShell module directories so they can be found.
  • Main-ProvisionConfig.psd1 (primary configuration)
    A nice little trick by Max to provide at least some sort of declarative configuration was to build a separate script file that creates an object tree with all the configuration data typically used for building up the cluster. It contains cluster configuration settings, node configuration settings and default subscription selection data (an illustrative sketch follows after this list).
  • Main-ProvisionCrossRegionAlwaysOn.ps1 (main script for automation)
    This is the main deployment script. It performs all the actions to setup the entire cross-region cluster including the following setups:
    • Setup your subscription if requested
    • Setup storage accounts if they do not exist, yet
    • Upload scripts required for setup inside of the VMs to storage
    • Setup cloud services if requested
    • Create Virtual Networks in both regions (Primary/Secondary)
    • Connect the Virtual Networks by creating VPN Gateways
    • Set up the primary AD forest VM and create the forest inside of the VM
    • Setup secondary AD DC VMs including installing AD
    • Provision SQL Server VMs
    • Setup the Internal Load Balancer for the AlwaysOn Listener
    • Configure all SQL VMs to have AlwaysOn enabled
    • Configure the Primary AlwaysOn node with the initial database setup
    • Join secondary AlwaysOn nodes and restore databases for sync
    • Configure a file-share based witness in the cluster
  • VmSetupScripts Folder
    This is essentially a folder with a series of PowerShell scripts that do perform single installation/configuration steps inside of the Virtual Machines. They are downloaded with a Custom Script VM Extension into the Virtual Machines and executed through VM Extensions, as well.
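
To give you a feeling for what such a declarative configuration looks like conceptually, here is a heavily shortened, purely illustrative sketch of a *.psd1 data file. The property names below are invented for illustration – the real Main-ProvisionConfig.psd1 in the repository contains many more settings:

# Illustrative only - a PowerShell data file (*.psd1) simply returns a hashtable "object tree".
@{
    DefaultSubscriptionName = "<your subscription name>"

    Cluster = @{
        DomainName      = "msztest.local"
        DomainNameShort = "msztest"
        PrimaryRegion   = "East US"
        SecondaryRegion = "East US 2"
    }

    Nodes = @(
        @{ Name = "sqlnode1"; Region = "Primary";   Role = "SqlPrimary"   }
        @{ Name = "sqlnode2"; Region = "Primary";   Role = "SqlSecondary" }
        @{ Name = "sqlnode3"; Region = "Secondary"; Role = "SqlSecondary" }
    )
}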

Executing the Script and Looking at the Results

Before executing the main command make sure to execute .\Prep-ProvisionMachine.ps1 to setup certificates or import the default certificate which I provide as part of the sample. If you plan to seriously use those scripts, please create your own certificate. Prep-ProvisionMachine.ps1 provides you with that capability assuming you have makecert.exe somewhere on your machines installed (please check Util-CertsPasswords for the paths in which I look for makecert.exe).

# To install a new certificate
.\Prep-ProvisionMachine.ps1

# To install a new certificate (overwriting existing ones with same Subject Names)
.\Prep-ProvisionMachine.ps1 -overwriteExistingCerts

# Or to install the sample certificate I deliver as part of the sample:
.\Prep-ProvisionMachine.ps1 -importDefaultCertificate

Then everything should be fine to execute the main script. If you don’t specify the certificate-related parameters as shown below I assume you use my sample default certificate I include in the repository to encrypt secrets pushed into VM Custom Script Extensions.

# Enter the Domain Admin Credentials
$domainCreds = Get-Credential

# Perform the main provisioning

.\Main-ProvisionCrossRegionAlwaysOn.ps1 -SetupNetwork -SetupADDCForest -SetupSecondaryADDCs -SetupSQLVMs -SetupSQLAG -UploadSetupScripts -ServiceName "mszsqlagustest" -StorageAccountNamePrimaryRegion "mszsqlagusprim" -StorageAccountNameSecondaryRegion "mszsqlagussec" -RegionPrimary "East US" -RegionSecondary "East US 2" -DomainAdminCreds $domainCreds -DomainName "msztest.local" -DomainNameShort "msztest" -Verbose

After executing a main script command such as the one above, you will get 5 VMs in the primary region and 2 VMs in the secondary region acting as a manual failover target.

The following image shows several aspects in action, such as the failover cluster resources which are part of the AlwaysOn availability group, as well as SQL Server Management Studio accessing the AlwaysOn Availability Group listener and the SQL nodes directly. Click on the image to enlarge it and see all the details.

Please note that the failover in the secondary region needs to happen MANUALLY by executing either a planned manual failover or a forced manual failover as documented on MSDN. Failover in the primary region (from the first to the second SQL Server) is configured to happen automatically.

In addition on Azure it means to take the IP cluster resource for the secondary region online which by default is offline in the cluster setup as you can see on the previous image.

Customizing the Parts you Should Customize

As you can see in the image above, the script creates sample databases which it sets up for the AlwaysOn Availability Group to be synchronized across the two nodes in the primary region. This happens based on *.sql scripts you can add to your configuration. To customize the SQL scripts and databases affected, you need to perform the following steps:

  • Create *.sql scripts with T-SQL code that creates the databases you want to create as part of your AlwaysOn Availability Group.
  • Copy the *.sql files into the VmSetupScripts directory BEFORE starting the execution of the main script. That ensures they are included in the package that gets pushed to the SQL Server VMs.
  • Open up the main configuration file and customize the database list based on the databases created with your SQL scripts, as well as the list of SQL scripts that should be pushed into osql.exe/sqlcmd.exe as part of the setup process for creating the databases (a minimal sketch of that step follows after this list).
  • Also don’t forget to customize the subscription name if you plan to not override it through the script-parameters (as it happens with the example above).
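
Just to illustrate what “pushing scripts into osql.exe/sqlcmd.exe” means mechanically inside the SQL VMs, here is a minimal, purely illustrative sketch. The server, authentication and file names are placeholders; the actual VmSetupScripts do quite a bit more:

# Illustrative sketch: run the configured *.sql files through sqlcmd.exe on a SQL node.
$sqlScripts = @("CreateDb-Sample01.sql", "CreateDb-Sample02.sql")   # placeholder file names

foreach ($script in $sqlScripts) {
    # -S = server, -E = integrated security, -i = input file
    & sqlcmd.exe -S "localhost" -E -i $script
    if ($LASTEXITCODE -ne 0) {
        throw "sqlcmd failed for $script with exit code $LASTEXITCODE"
    }
}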

The following image shows those configuration settings highlighted (in our newly released Visual Studio Code editor which also has basic support for PowerShell):


Fundamental Challenges

The main script can primarily be seen as a PowerShell workflow (we didn’t have the time to really implement it as a Workflow, but that would be a logical next step after applying Azure Resource Groups).

It creates one set of Azure VMs after another and joins them to the virtual networks it has created before. It then executes scripts locally on the Virtual Machines, which perform the actual setup, by using Azure VM Custom Script Extensions. Although Custom Script Extensions are cool, you have two main challenges with them, for which the overall package I published provides re-usable solutions:

  • Passing “Secrets” as Parameters to VM Custom Script Extensions such as passwords or storage account keys in a more secure way as opposed to clear-text.
  • Running Scripts under a Domain User Account as part of Custom Script Extensions that require full process level access to the target VMs and Domains (which means PowerShell Remoting does not work in most cases even with CredSSP enabled … such as for Cluster setups).

For these two purposes the overall script package ships with some additional PowerShell Modules I have written, e.g. based on a blog-post from my colleague Haishi Bai here.

Running Azure VM Custom Script Extensions under a different User

Util-PowerShellRunAs.psm1 includes a function called Invoke-PoSHRunAs which allows you to run a target script with its parameters under a different user account as part of a custom script VM Extension. A basic invocation of that script looks as follows:

$scriptName = [System.IO.Path]::Combine($scriptsBaseDirectory, "Sql-Basic01-SqlBasic.ps1") 
Write-Host "Calling into $scriptName"
try {
    $arguments = "-domainNameShort $domainNameShort " + `
                 "-domainNameLong $domainNameLong " +  `
                 "-domainAdminUser $usrDom " +  `
                 "-dataDriveLetter $dataDriveLetter " +  `
                 "-dataDirectoryName $dataDirectoryName " +  `
                 "-logDirectoryName $logDirectoryName " +  `
                 "-backupDirectoryName $backupDirectoryName " 
    Invoke-PoSHRunAs -FileName $scriptName -Arguments $arguments -Credential $credsLocal -Verbose:($IsVerbosePresent) -LogPath ".\LogFiles" -NeedsToRunAsProcess
} catch {
    Write-Error $_.Exception.Message
    Write-Error $_.Exception.ItemName
    Write-Error ("Failed executing script " + $scriptName + "! Stopping Execution!")
    Exit
}

This function allows you to run the target script either through PowerShell remoting or in a separate process. Many setup steps of the environment we set up do not actually work through PowerShell remoting because they rely on impersonation/delegation or do PowerShell remoting on their own, which imposes several limitations.

Therefore the second option this script provides is executing the target script as a full-blown process. Since Custom Script Extensions run as Local System, it is nevertheless not as simple as just doing a Start-Process with credentials passed in (or a System.Diagnostics.Process.Start() with different credentials) – Local System does not have those permissions, unfortunately. So the work-around is to use the Windows Task Scheduler. For such cases the function performs the following actions (a simplified sketch follows after the list):

  • Schedule a task in the Windows Task Scheduler with the credentials needed to run the process as.
  • Manually start the task using PowerShell cmdLets
    • (Start-ScheduledTask -TaskName $taskName)
  • Wait for the task to finish running
  • Look at the exit code
  • Throw an Exception if the exit code is non-zero, otherwise assume success
  • Delete the task again from the task scheduler
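
A heavily simplified sketch of that Task Scheduler trick is shown below. The real Invoke-PoSHRunAs function in Util-PowerShellRunAs.psm1 does more (logging, argument handling, more robust waiting); the task name, user and script path here are placeholders:

# Simplified sketch: run a script under different credentials via the Windows Task Scheduler,
# since Local System cannot simply Start-Process with alternate credentials.
$taskName  = "mszcool-setup-task"                  # placeholder
$user      = "MYDOMAIN\setupuser"                  # placeholder
$password  = "<plain-text password>"               # placeholder (decrypted earlier in the real scripts)
$arguments = "-ExecutionPolicy Bypass -File C:\Setup\Sql-Basic01-SqlBasic.ps1"

# 1. Register the task with the credentials it should run under
$action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument $arguments
Register-ScheduledTask -TaskName $taskName -Action $action -User $user -Password $password -RunLevel Highest | Out-Null

# 2. Start it and wait (naively) until it is no longer running
Start-ScheduledTask -TaskName $taskName
do {
    Start-Sleep -Seconds 5
} while ((Get-ScheduledTask -TaskName $taskName).State -eq "Running")

# 3. Inspect the exit code, clean up, and fail if the script did not succeed
$exitCode = (Get-ScheduledTaskInfo -TaskName $taskName).LastTaskResult
Unregister-ScheduledTask -TaskName $taskName -Confirm:$false
if ($exitCode -ne 0) {
    throw "Scheduled task '$taskName' failed with exit code $exitCode"
}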

This “work-around” helped us to execute all of the setup steps successfully. We also discussed this with the engineers building the SQL AlwaysOn Azure Resource Group template that is available for single-data-center deployments in the new Azure portal today. They are indeed doing the same thing; only the details differ a bit.

Encrypting Secrets Passed to Custom Script VM Extensions

Sometimes we simply had to pass secret information such as storage account keys to Custom Script Extensions. Since Azure VM Custom Script Extensions are logged very verbosely, it would be a piece of cake to get to that secret information by doing a Get-AzureVM and looking at the ResourceExtensionStatusList member, which contains the status and detailed call information for all VM Extensions.

Therefore we wanted to encrypt secrets as they are passed to Azure VM Extensions. The basic (yet not perfect) approach works based on some guidance from a blog post from Haishi Bai as mentioned earlier.

I’ve essentially written another PowerShell module (Util-CertsPasswords) which can perform the following actions (conceptually sketched right after this list):

  • Create a self-signed certificate as per guidance on MSDN for Azure.
  • Encrypt Passwords using such a certificate and return a base64-encoded, encrypted version.
  • Decrypt Passwords using such a certificate and return the clear-text password.
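
Conceptually, the encrypt/decrypt part boils down to the sketch below. The function names match the ones used later in this post, but the bodies are an illustration of the approach (look up the certificate by subject name in the LocalMachine\My store, RSA-encrypt the UTF-8 bytes of the password, Base64-encode the result) rather than the exact module code:

# Illustrative sketch of certificate-based password encryption/decryption (Windows PowerShell).
function Get-EncryptedPassword {
    param([string]$certName, [string]$passwordToEncrypt)

    # Find the certificate by subject name in the local machine store
    $cert = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Subject -like "*CN=$certName*" } | Select-Object -First 1
    if ($cert -eq $null) { throw "Certificate '$certName' not found." }

    # Encrypt with the public key and return a Base64-encoded string
    $bytes = [System.Text.Encoding]::UTF8.GetBytes($passwordToEncrypt)
    $encrypted = $cert.PublicKey.Key.Encrypt($bytes, $true)
    return [Convert]::ToBase64String($encrypted)
}

function Get-DecryptedPassword {
    param([string]$certName, [string]$encryptedBase64Password)

    # The decrypting side needs the certificate including its private key
    $cert = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Subject -like "*CN=$certName*" } | Select-Object -First 1
    if ($cert -eq $null -or -not $cert.HasPrivateKey) { throw "Certificate '$certName' with private key not found." }

    # Decrypt with the private key and return the clear-text password
    $encrypted = [Convert]::FromBase64String($encryptedBase64Password)
    $bytes = $cert.PrivateKey.Decrypt($encrypted, $true)
    return [System.Text.Encoding]::UTF8.GetString($bytes)
}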

In our overall workflow all secrets including passwords and storage account keys which are passed to VM Custom Script Extensions as parameters are passed as encrypted values using this module.

Using Azure cmdlets we make sure that the certificates are published with the VM as part of our main provisioning script, as per Michael Washam’s guidance from the Azure product group.

Every script that gets executed as part of a custom VM Script Extension receives an encrypted password and uses the module I’ve written to decrypt it and use it for the remaining script such as follows:

#
# Import the module that allows running PowerShell scripts easily as different user
#
Import-Module .\Util-PowerShellRunAs.psm1 -Force
Import-Module .\Util-CertsPasswords.psm1 -Force

#
# Decrypt encrypted passwords using the passed certificate
#
Write-Verbose "Decrypting Password with Password Utility Module..."
$localAdminPwd = Get-DecryptedPassword -certName $certNamePwdEnc -encryptedBase64Password $localAdminPwdEnc 
$domainAdminPwd = Get-DecryptedPassword -certName $certNamePwdEnc -encryptedBase64Password $domainAdminPwdEnc 
Write-Verbose "Successfully decrypted VM Extension passed password"

The main provisioning script encrypts the passwords and secrets using that very same module before being passed into VM Custom Script Extensions as follows:

$vmExtParamStorageAccountKeyEnc = Get-EncryptedPassword -certName $certNameForPwdEncryption `
                                                        -passwordToEncrypt ($StorageAccountPrimaryRegionKey.Primary)

That way we at least make sure that no un-encrypted secret is visible in the Azure VM Custom Script Extension logs that can easily be retrieved with the Azure Service Management API PowerShell CmdLets.

Final Words and More…

As I said, there are lots of other re-usable parts in the package I’ve just published on my GitHub repository, which can even be used to apply further setup and configuration steps on VM environments that have been provisioned entirely with Azure Resource Groups and Azure Resource Manager. A few examples:

  • Execute additional Custom Script VM Extensions on running VMs.
  • Wait for Custom Script VM Extensions to complete on running VMs.
  • A ready-to-use PowerShell function that makes it easier to Remote PowerShell into provisioned VMs (a short sketch of the underlying commands follows below).
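
For that last bullet, as an example, remoting into a classic (Service Management) VM essentially boils down to the following commands, which the helper function wraps for you. The cloud service name is the one from the example command earlier; the VM name is a placeholder:

# Sketch: open a Remote PowerShell session to a classic Azure VM via its public WinRM endpoint.
$cred = Get-Credential
$uri  = Get-AzureWinRMUri -ServiceName "mszsqlagustest" -Name "sqlnode1"   # VM name is a placeholder

# Skip certificate checks since the default WinRM listener uses a self-signed certificate
$options = New-PSSessionOption -SkipCACheck -SkipCNCheck

Enter-PSSession -ConnectionUri $uri -Credential $cred -SessionOption $options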

We also make use of an AzureNetworking PowerShell module published on the Technet Gallery. But note that we also made some bug-fixes in that module (such as dealing with “totally empty VNET configuration XML files”).

Overall, building these ~2500 lines of PowerShell code was super-hard but a great learning experience. I am really keen to publish the follow-up post that demonstrates how much easier Azure Resource Group templates make such a complex setup.

Also, I do hope that we will have such a multi-data-center template in the default gallery soon, since it is highly valuable for all partners and customers that need to provide high availability across multiple data centers using SQL Server Virtual Machines. In the meantime we will try to provide a sample based on the work above as soon as we have the time/resources for the implementation.

Finally – thanks to Max Knor and Igor Pagliai – without their help we would not have achieved these goals at this level of completeness!