How to remove secrets from the entire history of a Git repository!

I really do like Git a lot, and even for my private projects I use it as the default. But some aspects of it are quite tricky. A well-known practice is that you should never check secrets, or other things you don’t want to share, into a Git repository. That is especially relevant for public repositories hosted on e.g. GitHub.

Well, saying you should not and actually never forgetting about it are two different things. Sometimes it just happens. And even if you are careful with secrets, it can also be other stuff you checked in but didn’t want to share with others. That’s what happened to me when I wrote the last blog post about automating my developer machine setup and published my Machine Setup Script.

The secrets in the history on GitHub!?

As explained in my previous blog post, in the setup automation script I use for re-setting up a fresh developer machine I also clone a handful of repositories which are of relevance to me and/or to which I contributed some code. The majority of those repositories are public on GitHub. But some of them are from real-world projects with our customers and partners, hosted in a private VSTS environment we run for our global team. I accidentally published that list of git clone commands as well.

No passwords, no secrets – but the repository names sometimes contained the names of the partners/customers, and some of that work is not finished or public yet. So even though these were not secrets, I am not supposed to share them yet.

Unfortunately, I realized that only after a few check-ins. So the entire history in my public GitHub repository contained those repository names from an internal VSTS environment which I didn’t want to share. Damn… the post is out, the link points to the repository… what to do?

How to remove secrets/content from the entire history with Git?

Of course the “easy” way for this specific case would have been to delete the repository and re-create a new one with only the fixed file published. That works for cases where the history is not really important and where you have a truly small repository. In other words, it works for samples and the likes. But I had even received a pull request for that file which I didn’t want to lose, either. By all means, deleting and re-creating is not something that should be considered as a solution for this problem.

So, I did a little Internet search and came across something that can save many GitHub repositories from otherwise mandatory deletion when things need to be removed from the entire history, I guess:

The BFG Repo Cleaner

This is an awesome tool if you’ve run into the problem I had. Let’s say you have published something into a Git repository, across multiple commits and pushes, that you want to get rid of from the entire history. All you need to do are the following steps:

  1. Download the BFG Repo Cleaner into a local directory of your choice.
    1. The app is written in Scala.
    2. It requires a Java runtime on your machine.
    3. It is distributed as a JAR-package that contains all dependencies.
  2. Open up a command prompt and switch to a temporary directory.
    1. I did this in a temp-directory because it requires a new git clone --mirror of your repository which is a 1:1 mirror of the remote repository.
    2. After that you need to push that mirror back to the remote repository again. And then you can delete the mirror and return to your ordinary repository clone.
  3. Perform a clone of your repository with the option --mirror (I am using my devmachinesetup-repo here since I had to do it with this one, so just replace devmachinesetup with any of your repository-names in the commands below).
    1. git clone --mirror https://github.com/mszcool/devmachinesetup.git
    2. This clones a mirror of your remote repository with the entire history into a sub-folder of the current folder called devmachinesetup.git.
  4. Stay in the folder that contains the devmachinesetup.git folder with the mirrored repository in it.
  5. Create a text file that contains the text you want to purge from the history of all files in your git repository.
    1. Each line contains a string (incl. spaces, special characters etc.) that you want to remove. In my case these strings were the complete git clone <<repositoryname>> commands which I wanted to remove from the history of commits of the script in the repository. Each line in this text file contained one of those entire commands.
    2. BFG searches every file in your git mirror and replaces each instance of each line from the text file with the text *** REMOVED *** in the target files of the repository.
    3. A little sample excerpt of the content of that text file shows how simple it is – in my case it was just one git clone command per line which I wanted to remove from the history:

      git clone https://xyz.visualstudio.com/_DefaultCollection/first.git
      git clone https://xyz.visualstudio.com/_DefaultCollection/second.git
      git clone https://xyz.visualstudio.com/_DefaultCollection/third%20complex%20name.git thirdrepo
  6. Execute the BFG command. Note that BFG is based on the Java runtime, so either add the folder with the BFG JAR-package to your CLASSPATH environment variable or specify the full path to the JAR-package when executing Java. This looks similar to:
    1. java -jar C:\Temp\bfg-1.12.8.jar --replace-text myunwantedtext.txt devmachinesetup.git
    2. Note that when you download the BFG JAR-package, the version in the name of the .jar-file might be different.
    3. The file myunwantedtext.txt contains the unwanted strings to be replaced, as outlined above.
  7. Now BFG has replaced the unwanted content in the local clone. Last but not least, you need to push that mirror back to the remote repository (the condensed sequence after this list shows all commands in one place).
    1. In your command prompt window, change into the devmachinesetup.git sub-directory with your Git mirror.
    2. Execute git push to push the rewritten history back.
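
For reference, here is the whole flow condensed into the commands I ran – repository name and BFG version as in my case above. Note that the BFG documentation additionally recommends expiring the reflog and running garbage collection before pushing, so that the replaced content really gets stripped from the repository:

git clone --mirror https://github.com/mszcool/devmachinesetup.git
java -jar C:\Temp\bfg-1.12.8.jar --replace-text myunwantedtext.txt devmachinesetup.git
cd devmachinesetup.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive
git push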

That’s it, you’re done. After I executed the steps above on my repository, I checked several commits online to verify that it worked. In your case, the result should look similar to what I achieved once you’ve completed the steps above:

Results of Removing unwanted content

Final words

Removing unwanted content from the entire history of a Git repository is needed sometimes – whether it’s about accidental commits of secrets or other (sensitive) content, or e.g. large files you want to clean up from your repository.

The BFG Repo Cleaner is a handy tool for such cases. It can indeed be used for cases such as the one I described. But it also contains options for other cases, such as removing large files from the history of your repository which are not needed there anymore.

BFG is cool and handy, but if you have more advanced scenarios you might need to fall back to the way more powerful, yet much more complex git-filter-branch tool (here). I guess that for 80% of the cases BFG might be good enough, and given the fact that it is super easy to use, I’d give it a chance first before digging through the docs of git-filter-branch.

Kudos to the folks who built BFG… great job and thank you very much for saving my day (I will donate;))…

My “Developer Machine Setup Automation Script” / Chocolatey & PowerShell published

I know, I wanted to blog more often… but then a happy event came in between: the birth of my little baby daughter – Reinhard’s two years younger sister Linda. Now I am sitting here at the registry office for her documents. What better way to pass the waiting time than writing a blog post!?

Since my brand-new Surface Book arrived yesterday, one thing I have to do is set it up for my daily use. That means installing all the handy tools that I use on a daily basis, such as PDF readers, KeePass and the likes. It also means setting up the many tools and development environments I use (some of them more often, others less often).

Well, the cool thing is that I’ve automated most of these setup procedures using Chocolatey and PowerShell. And given the situation I am in right now (new daughter, waiting at the registry office, Surface Book arrived) I thought why not share my script with the rest of the world and explain it a bit…

Chocolatey and PowerShell Script

A while ago, when one of my peers, Kristofer Liljeblad, pointed me to Chocolatey, I started building out a fairly comprehensive script that automates most of the steps for setting up my machines. I’ve now published that script on my GitHub repository:

mszCool’s Dev Machine Setup Script

Since I don’t want to install all tools on every machine, I added some switches to the script that allow me to install only a group of tools. E.g. when I set up a machine for one of my relatives, which typically is not used for development, I only install a bunch of end-user tools such as a PDF reader. On my developer machines I typically run all of the switches.

But it turned out that sometimes PowerShell needs to be restarted after certain install procedures. Therefore I added a few other switches to the script so that I can re-start PowerShell in between those steps.

Finally, I still do install certain parts manually. E.g. for SQL Server Management Studio on Windows 10 I need to add the .NET Framework 3.5 feature set, which I have not coded into PowerShell yet (although I know it would work).

All of this resulted in a workflow of several phases; most of the time the script just does its work, but sometimes I need to intervene and re-start PowerShell or do some manual installation steps. I know, there might be more elegant solutions to this. But remember, it’s just for setting up my developer machines, so it’s quick and pragmatic.

Using the Script

To use the script, as I said, there are a few steps that need to be executed manually upfront.

  • Install the .NET Framework 3.5 Feature through the Windows Settings Panel
  • Start an elevated PowerShell window as Administrator
  • Set the PowerShell Execution Policy to Unrestricted by executing Set-ExecutionPolicy Unrestricted

After that you can start the script for the first time. For that purpose I typically download it into a temp directory on my local machine, since it downloads things (which it typically deletes afterwards, but not always:)). From there, just execute the following commands in the previously opened PowerShell window:

.\Install-WindowsMachine.ps1 -installChoco -tools -ittools -dev -data

This installs the following items:

  • -installChoco installs Chocolatey as per their homepage.
  • -tools installs tools I commonly use such as Adobe Reader or KeePass.
  • -ittools installs tools I categorized as OS tools such as SysInternals.
  • -dev installs a set of independent tools I use for development such as Fiddler.
  • -data installs tools to manage databases, but not the DB-engines themselves. Examples are SQL Server Management Studio or MySQL Workbench.

Since this phase installs many tools which are added to the environment path, it is now necessary to re-start PowerShell before moving on.

Installing IDEs

The next step is installing Integrated Development Environments such as Visual Studio. If you are fine with the [free Community Edition of Visual Studio](https://www.visualstudio.com/en-us/products/visual-studio-community-vs.aspx), the script includes a switch for that.

As an Eclipse-based IDE I decided to stick with Spring Tool Suite. The main reasons for that are, one, that Spring is one of the most popular Java frameworks and, two, that it comes with a whole lot of add-ins (e.g. Maven and many more) pre-packaged. That makes it convenient! For Java I sometimes also play around with IntelliJ IDEA, so I included that as well.

For all the rest I am using Visual Studio Code and Sublime, although I haven’t included VS Code in the script yet (didn’t have the time to update it:)) and still install it manually.

If you don’t like those choices, now is the time to manually install your favorite versions of Visual Studio, Eclipse and co. If you like them, the command below does the job and installs all the aforementioned environments – for Visual Studio I support both versions, 2013 and 2015:

.\Install-WindowsMachine.ps1 -installVs -vsVersion 2015 -installOtherIDE

After that phase it’s necessary to re-start PowerShell again, because the IDEs, and especially Visual Studio, add a few things to the environment path.

Visual Studio Extensions, Web PI and Database Engines

Yes, I also got bored by installing Visual Studio extensions over and over again. So what I did was write a little PowerShell function which downloads Visual Studio extensions and installs them using the Visual Studio extension installer. There are also a few more tools which I typically need in Visual Studio and which are available through the Web Platform Installer (WebPI), such as the Azure SDKs and tools or Azure PowerShell and the likes.
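
The function itself is part of the published script, but a minimal sketch of the underlying idea could look like the one below – the function name, the download-URL parameter and the Visual Studio 2015 default install path are illustrative assumptions, not the actual code from the script:

# Sketch: download a .vsix extension package and install it silently using
# the VSIX installer that ships with Visual Studio (2015 path assumed below).
function Install-VsExtensionSketch {
    param(
        [string]$downloadUrl   # direct download link to the .vsix package
    )
    $vsixFile = Join-Path $env:TEMP "extension.vsix"
    Invoke-WebRequest -Uri $downloadUrl -OutFile $vsixFile
    $vsixInstaller = "${env:ProgramFiles(x86)}\Microsoft Visual Studio 14.0\Common7\IDE\VSIXInstaller.exe"
    # /q performs a quiet installation without showing the installer UI
    Start-Process -FilePath $vsixInstaller -ArgumentList "/q `"$vsixFile`"" -Wait
}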

Simply call the script with the following switches after you’ve installed Visual Studio (note that -dev2 and -vsext are not dependent on each other):

.\Install-WindowsMachine.ps1 -dev2 -vsext -vsVersion 2015

To install database engines such as SQL Server Express or Cassandra (based on DataStax’s distribution), I’ve included an additional switch:

.\Install-WindowsMachine.ps1 -dataSrv

Cloning Repositories

Finally, on every developer machine I need to clone many repositories from GitHub or private TFS and Bitbucket environments. I’ve also added that to the script, although I removed the ones from private environments for this publication. For this I’ve added the following switch, which can be combined with ANY of the above AFTER -dev has run (because that installs Git):

.\Install-WindowsMachine.ps1 -cloneRepos

What’s missing & Caveats

This script really saves me a lot of time, since I set up machines on a regular basis (i.e. development VMs for trying out something, or re-setting up my machines because of Windows Insider Build stuff etc.).

A few things are missing which I typically install. I plan to add those pieces to the script at some point in the future, as soon as I have some spare time. But for now I do install them manually.

For those I need to update the script to be able to download and run MSIs in silent mode. Also, Git-Posh requires cloning a Git repo and executing a few statements. All of that is simple, but I haven’t had the time to do it yet. I do accept pull requests, though, if you find the time to work on it and think it’s a useful extension to this work.

Final Words

This script is a simple and pragmatic thing. It installs most of the things in an automated way, and typically, with it and a good Internet connection, I have a fully working developer machine after a few hours without really doing a lot. The things I install manually (e.g. Visual Studio 2015 Enterprise instead of Community) do run a bit longer, but they don’t distract me while I do some other work on one of my other machines. And that’s the point – I can set up new images nearly end-to-end while working in parallel, and it won’t hold me off from other work (as opposed to continuously clicking through some installer wizards:)).

I hope you like the work and, as mentioned, I am happy about any feedback and pull requests for it. Just go, look, download and use it from my GitHub repository!

Detecting if a Virtual Machine Runs in Azure – Part 2 – Updates for Linux VMs

A few months ago I wrote a blog post about how to detect whether a virtual machine runs in Azure or not. This is vital for many independent software vendors who are planning to offer their own software through the Azure Marketplace for Virtual Machines.

The main detection strategy (Windows, Ubuntu)

In that post I explained a few tricks for detecting whether the VM runs in Azure or not, for both Windows and Linux. Still, the most reliable check known as of today is to check if the DHCP option “unknown-245” is set in the DHCP lease options for a virtual machine.

  • Ubuntu Linux: I posted a bash script in my previous blog. I generally stated that this works for Linux all up, without considering that other Linux distributions might have different configuration files for storing DHCP lease details. Hence the following script works on Ubuntu-based flavors, only:
      if `grep -q unknown-245 /var/lib/dhcp/dhclient.eth0.leases`; then
          echo "Running in an Azure VM"
      fi
    

Detecting if a CentOS VM runs on Azure

My peer and colleague Arsen Vladimirskiy pointed out that on CentOS the file for DHCP leases is stored in a different location. Hence the detection strategy for the DHCP lease option I explained in my original post does not work in CentOS-based virtual machines.

For CentOS-based virtual machines the DHCP lease options are indeed stored in the path /var/lib/dhclient/dhclient.leases (or, in the case of multiple network interfaces, dhclient-eth0.leases, where the part eth0 needs to be replaced with the network interface device you’re going to check against).

Therefore in a default configuration with just one ethernet adapter the script needs to be updated as follows to work inside of a CentOS virtual machine:

# manually start dhclient (seems to be a workaround)
dhclient

# then check against the lease files
if `grep -q unknown-245 /var/lib/dhclient/dhclient.leases`; then
   echo "Running in Azure VM"
fi

Note: There was one weird issue I ran into when trying the approach above, hence the script starts with launching dhclient. On a freshly deployed CentOS 7 VM in Azure from the marketplace stock image, dhclient is not started by default. Therefore files such as dhclient.leases or dhclient-*.leases do not exist by default under /var/lib/dhclient/.

Only after manually executing the command sudo dhclient to start the DHCP client were the files created successfully, and the check works. Well, now someone could think that this might be related to static IP addresses – but in Azure that’s not correct, since IP addresses are always assigned by the Azure DHCP server. In case you want to have static IPs, you configure those through the Azure Portal or Management APIs so that the Azure DHCP server always assigns the same, static IP address to the VM in the private, virtual network. So that cannot be the reason.

A more Complete Story for detecting DHCP unknown-245 in Linux

Well, the distributions above are very common ones, but they are by far not all of the ones supported on Azure. The source code for the Azure Linux Agent contains all the secrets currently valid, if you really want to be on the safe side across multiple Linux distributions. A few hints in the Python-based source code are:

  • Lines 99-100 show the directories you should consider for your detection strategy:
      VarLibDhcpDirectories = ["/var/lib/dhclient", "/var/lib/dhcpcd", "/var/lib/dhcp"]
      EtcDhcpClientConfFiles = ["/etc/dhcp/dhclient.conf", "/etc/dhcp3/dhclient.conf"]
    
  • Further down in the code starting at line 5107 there is a section that makes use of option 245 as well:
      # ... other code before
      elif option == 3 or option == 245:
          # ...
      else:
          # ...
      # ... more code goes here
    

This code was updated to version 2.0.15 only 24 days before writing/publishing this post. So it should still be safe to leverage option 245 for your detection strategy. As soon as there’s something better available, I’ll definitely post another update to this blog post!

Final Disclaimer

The approaches outlined above did work during my tests on both Ubuntu and CentOS 7 based VMs in Resource Manager based deployments (using the new ARM-template approach introduced by the Azure teams earlier this year) at the time of publishing this post (2015-09). When I published the original post I tested them with classic Service Management based VMs, of course.

Therefore, and as there is still no better way available at the time of publishing this post, the options outlined in this and my original post are still valid and eventually the best you can get so far for detecting if your VM runs inside of Microsoft Azure or not…

If you found better options, don’t hesitate to contact me via my Twitter feed!

Azure VMs – SQL Server AlwaysOn Setup across multiple Data Centers fully automated (Classic Service Management)

Last December I started working with two of my peers, Max Knor and Igor Pagliai, with a partner in Madrid on implementing a Cross-Data Center SQL Server AlwaysOn availability group setup for a financial services solution which is supposed to be provided to 1000s of banks across the world running in Azure. Igor posted about our setup experience which we partially automated with Azure PowerShell and Windows PowerShell – see here.

At the moment the partner’s software still requires SQL Server in VMs as opposed to Azure SQL Databases because of some legacy functions they use from full SQL Server – therefore this decision.

One of the bold goals was to fully enable the partner and their customers to embrace DevOps and continuous delivery across multiple environments. For this purpose we wanted to FULLY AUTOMATE the setup of their application together with an entire cross-data-center SQL Server AlwaysOn environment as outlined in the following picture:

In December we did a one-week hackfest to start these efforts. We successfully set up the environment, but it was only partially automated. Over the past weeks we went through the final effort to fully automate the process. I’ve published the result in my GitHub repository here:

Deployment Scripts Sample Published on my GitHub Repository

Note: Not Azure Resource Groups, yet

Since Azure Resource Manager v2, which would allow us to dramatically improve the performance and reduce the complexity of the basic Azure VM environment setup, is still in Preview, we were forced to use traditional Azure Service Management.

But about 50%-60% of the efforts we have done are re-usable, given the way we built up the scripts. E.g. all the database setup and custom service account setup, which is primarily built on top of Azure Custom Script VM Extensions, can be re-used after the basic VM setup is completed. We are planning to create a next version of the scripts that does the fundamental setup using Azure Resource Groups, because we clearly see the advantages.

Basic Architecture of the Scripts

Essentially the scripts are structured into the following main parts, which you would need to touch if you want to leverage them or understand them for learning purposes:

  • Prep-ProvisionMachine.ps1 (prepare deployment machine)
    A basic script you should execute on a machine before starting the first automated deployments. It installs certificates for encrypting passwords used as parameters for Custom Script VM Extensions, and copies the basic PowerShell modules into the local PowerShell module directories so they can be found.
  • Main-ProvisionConfig.psd1 (primary configuration)
    A nice little trick by Max to provide at least some sort of declarative configuration: a separate script file that creates an object tree with all the configuration data typically used for building up the cluster. It contains cluster configuration settings, node configuration settings and default subscription selection data.
  • Main-ProvisionCrossRegionAlwaysOn.ps1 (main script for automation)
    This is the main deployment script. It performs all the actions to setup the entire cross-region cluster including the following setups:
    • Setup your subscription if requested
    • Setup storage accounts if they do not exist, yet
    • Upload scripts required for setup inside of the VMs to storage
    • Setup cloud services if requested
    • Create Virtual Networks in both regions (Primary/Secondary)
    • Connect the Virtual Networks by creating VPN Gateways
    • Set up the primary AD Forest VM and the Forest inside of the VM
    • Setup secondary AD DC VMs including installing AD
    • Provision SQL Server VMs
    • Setup the Internal Load Balancer for the AlwaysOn Listener
    • Configure all SQL VMs to have AlwaysOn enabled
    • Configure the Primary AlwaysOn node with the initial database setup
    • Join secondary AlwaysOn nodes and restore databases for sync
    • Configure a file-share based witness in the cluster
  • VmSetupScripts Folder
    This is essentially a folder with a series of PowerShell scripts that perform single installation/configuration steps inside of the Virtual Machines. They are downloaded into the Virtual Machines with a Custom Script VM Extension and executed through VM Extensions as well.

Executing the Script and Looking at the Results

Before executing the main command, make sure to execute .\Prep-ProvisionMachine.ps1 to set up certificates or import the default certificate which I provide as part of the sample. If you plan to seriously use those scripts, please create your own certificate. Prep-ProvisionMachine.ps1 provides you with that capability, assuming you have makecert.exe installed somewhere on your machine (please check Util-CertsPasswords for the paths in which I look for makecert.exe).

# To install a new certificate
.\Prep-ProvisionMachine.ps1

# To install a new certificate (overwriting existing ones with same Subject Names)
.\Prep-ProvisionMachine.ps1 -overwriteExistingCerts

# Or to install the sample certificate I deliver as part of the sample:
.\Prep-ProvisionMachine.ps1 -importDefaultCertificate

Then everything should be fine to execute the main script. If you don’t specify the certificate-related parameters as shown below I assume you use my sample default certificate I include in the repository to encrypt secrets pushed into VM Custom Script Extensions.

# Enter the Domain Admin Credentials
$domainCreds = Get-Credential

# Perform the main provisioning

.\Main-ProvisionCrossRegionAlwaysOn.ps1 -SetupNetwork -SetupADDCForest -SetupSecondaryADDCs -SetupSQLVMs -SetupSQLAG -UploadSetupScripts -ServiceName "mszsqlagustest" -StorageAccountNamePrimaryRegion "mszsqlagusprim" -StorageAccountNameSecondaryRegion "mszsqlagussec" -RegionPrimary "East US" -RegionSecondary "East US 2" -DomainAdminCreds $domainCreds -DomainName "msztest.local" -DomainNameShort "msztest" -Verbose

After executing a main script command such as the one above, you will get 5 VMs in the primary region and 2 VMs in the secondary region acting as a manual failover.

The following image shows several aspects in action, such as the failover cluster resources which are part of the AlwaysOn availability group, as well as SQL Server Management Studio accessing the AlwaysOn Availability Group Listener and the SQL nodes directly. Click on the image to enlarge it and see all details.

Please note that the failover in the secondary region needs to happen MANUALLY by executing either a planned manual failover or a forced manual failover as documented on MSDN. Failover in the primary region (from the first to the second SQL Server) is configured to happen automatically.

In addition, on Azure it means taking the IP cluster resource for the secondary region online, which by default is offline in the cluster setup, as you can see in the previous image.

Customizing the Parts you Should Customize

As you can see in the image above, the script creates sample databases which it sets up for the AlwaysOn Availability Group to be synchronized across two nodes. This happens based on *.sql scripts you can add to your configuration. To customize the SQL scripts and databases affected, you need to perform the following steps:

  • Create *.sql scripts with T-SQL code that creates the databases you want to create as part of your AlwaysOn Availability Group.
  • Copy the *.sql files into the VmSetupScripts directory BEFORE starting the execution of the main script. That leads to them being included in the package that gets pushed to the SQL Server VMs.
  • Open up the main configuration file and customize the database list based on the databases created by your SQL scripts, as well as the list of SQL scripts that should be pushed into osql.exe/sqlcmd.exe as part of the setup process for creating the databases (see the hypothetical excerpt after this list).
  • Also don’t forget to customize the subscription name if you plan to not override it through the script-parameters (as it happens with the example above).
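
To give you an idea of the concept, here is a purely hypothetical excerpt of such a configuration object tree – the actual property names and structure in Main-ProvisionConfig.psd1 differ, so treat this as an illustration only:

# Hypothetical excerpt of a declarative configuration object tree;
# NOT the actual schema of Main-ProvisionConfig.psd1.
@{
    SubscriptionName = "YourSubscriptionName"
    Databases = @(
        @{ DatabaseName = "SampleDb01"; SetupScript = "Sql-CreateSampleDb01.sql" }
        @{ DatabaseName = "SampleDb02"; SetupScript = "Sql-CreateSampleDb02.sql" }
    )
}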

The following image shows those configuration settings highlighted (in our newly released Visual Studio Code editor which also has basic support for PowerShell):


Fundamental Challenges

The main script can primarily be seen as a PowerShell workflow (we didn’t have the time to really implement it as a Workflow, but that would be a logical next step after applying Azure Resource Groups).

It creates one set of Azure VMs after another and joins them to the virtual networks it has created before. It then executes scripts locally inside the Virtual Machines, which do the actual setup, by using Azure VM Custom Script Extensions. Although Custom Script Extensions are cool, you have two main challenges with them, for which the overall package I published provides re-usable solutions:

  • Passing “Secrets” as Parameters to VM Custom Script Extensions such as passwords or storage account keys in a more secure way as opposed to clear-text.
  • Running Scripts under a Domain User Account as part of Custom Script Extensions that require full process level access to the target VMs and Domains (which means PowerShell Remoting does not work in most cases even with CredSSP enabled … such as for Cluster setups).

For these two purposes the overall script package ships with some additional PowerShell Modules I have written, e.g. based on a blog-post from my colleague Haishi Bai here.

Running Azure VM Custom Script Extensions under a different User

Util-PowerShellRunAs.psm1 includes a function called Invoke-PoSHRunAs which allows you to run a target script with its parameters under a different user account as part of a custom script VM Extension. A basic invocation of that script looks as follows:

$scriptName = [System.IO.Path]::Combine($scriptsBaseDirectory, "Sql-Basic01-SqlBasic.ps1") 
Write-Host "Calling into $scriptName"
try {
    $arguments = "-domainNameShort $domainNameShort " + `
                 "-domainNameLong $domainNameLong " +  `
                 "-domainAdminUser $usrDom " +  `
                 "-dataDriveLetter $dataDriveLetter " +  `
                 "-dataDirectoryName $dataDirectoryName " +  `
                 "-logDirectoryName $logDirectoryName " +  `
                 "-backupDirectoryName $backupDirectoryName " 
    Invoke-PoSHRunAs -FileName $scriptName -Arguments $arguments -Credential $credsLocal -Verbose:($IsVerbosePresent) -LogPath ".\LogFiles" -NeedsToRunAsProcess
} catch {
    Write-Error $_.Exception.Message
    Write-Error $_.Exception.ItemName
    Write-Error ("Failed executing script " + $scriptName + "! Stopping Execution!")
    Exit
}

This function allows you to run the target script either through PowerShell remoting or in a separate process. Many setup steps of the environment we set up do actually not work through PowerShell remoting, because they rely on impersonation/delegation or do PowerShell remoting on their own, which imposes several limitations.

Therefore the second option this script provides is executing the target script as a full-blown process. Since Custom Script Extensions run as Local System, it is nevertheless not as simple as just doing a Start-Process with credentials being passed in (or a System.Diagnostics.Process.Start() with different credentials). Local System does not have those permissions, unfortunately. So the work-around is to use the Windows Task Scheduler. For such cases the function performs the following actions (a minimal sketch in code follows after the list):

  • Schedule a task in the Windows Task Scheduler with the credentials needed to run the process as.
  • Manually start the task using PowerShell cmdlets
    • (Start-ScheduledTask -TaskName $taskName)
  • Wait for the task to finish running
  • Look at the exit code
  • Throw an Exception if the exit code is non-zero, otherwise assume success
  • Delete the task again from the task scheduler
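
Here is a minimal sketch of that sequence, using the ScheduledTasks cmdlets available on Windows Server 2012 and later – task name, script path and the $credential variable (a PSCredential for the target user) are illustrative; the actual module does a lot more around logging and argument handling:

# Sketch: run a setup script under different credentials via the task
# scheduler, because Local System cannot Start-Process as another user.
$taskName = "RunSetupStepAs"
$action = New-ScheduledTaskAction -Execute "powershell.exe" `
              -Argument "-File C:\Setup\Sql-Basic01-SqlBasic.ps1"
Register-ScheduledTask -TaskName $taskName -Action $action `
    -User $credential.UserName `
    -Password $credential.GetNetworkCredential().Password | Out-Null
Start-ScheduledTask -TaskName $taskName
# Wait until the task has finished running
while ((Get-ScheduledTask -TaskName $taskName).State -eq "Running") {
    Start-Sleep -Seconds 5
}
# Inspect the exit code and remove the task again
$exitCode = (Get-ScheduledTaskInfo -TaskName $taskName).LastTaskResult
Unregister-ScheduledTask -TaskName $taskName -Confirm:$false
if ($exitCode -ne 0) { throw "Setup script failed with exit code $exitCode!" }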

This “work-around” helped us to execute the entire set of setup steps successfully. We were also discussing this with the engineers building the SQL AlwaysOn Azure Resource Group template that is available for single-data-center deployments in the new Azure Portal today. They are indeed doing the same thing; the details are just a bit different.

Encrypting Secrets Passed to Custom Script VM Extensions

Sometimes we were simply required to pass secret information such as storage account keys to custom script extensions. Since Azure VM Custom Script Extensions are logged very verbosely, it would be a piece of cake to get to that secret information by doing a Get-AzureVM and looking at the ResourceExtensionStatusList member, which contains the status and detailed call information for all VM Extensions.

Therefore we wanted to encrypt secrets as they are passed to Azure VM Extensions. The basic (yet not perfect) approach works based on some guidance from a blog post from Haishi Bai as mentioned earlier.

I’ve essentially written another PowerShell module (Util-CertsPasswords) which can perform the following actions (a conceptual sketch follows after the list):

  • Create a self-signed certificate as per guidance on MSDN for Azure.
  • Encrypt Passwords using such a certificate and return a base64-encoded, encrypted version.
  • Decrypt Passwords using such a certificate and return the clear-text password.
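
Conceptually, the two directions can be sketched as follows – this is not the actual code of Util-CertsPasswords, just an illustration of the mechanics using a certificate from the local machine store:

# Sketch: RSA-encrypt a password with a certificate's public key and return
# it base64-encoded; decryption requires the private key on the target VM.
function Get-EncryptedPasswordSketch {
    param([string]$certName, [string]$passwordToEncrypt)
    $cert = Get-ChildItem Cert:\LocalMachine\My |
                Where-Object { $_.Subject -eq "CN=$certName" }
    $bytes = [System.Text.Encoding]::UTF8.GetBytes($passwordToEncrypt)
    [System.Convert]::ToBase64String($cert.PublicKey.Key.Encrypt($bytes, $true))
}

function Get-DecryptedPasswordSketch {
    param([string]$certName, [string]$encryptedBase64Password)
    $cert = Get-ChildItem Cert:\LocalMachine\My |
                Where-Object { $_.Subject -eq "CN=$certName" }
    $bytes = [System.Convert]::FromBase64String($encryptedBase64Password)
    [System.Text.Encoding]::UTF8.GetString($cert.PrivateKey.Decrypt($bytes, $true))
}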

In our overall workflow all secrets including passwords and storage account keys which are passed to VM Custom Script Extensions as parameters are passed as encrypted values using this module.

Using the Azure cmdlets, we make sure that the certificates are published with the VM as part of our main provisioning script, as per Michael Washam’s guidance from the Azure product group.

Every script that gets executed as part of a custom VM Script Extension receives an encrypted password and uses the module I’ve written to decrypt it and use it for the remaining script such as follows:

#
# Import the module that allows running PowerShell scripts easily as different user
#
Import-Module .\Util-PowerShellRunAs.psm1 -Force
Import-Module .\Util-CertsPasswords.psm1 -Force

#
# Decrypt encrypted passwords using the passed certificate
#
Write-Verbose "Decrypting Password with Password Utility Module..."
$localAdminPwd = Get-DecryptedPassword -certName $certNamePwdEnc -encryptedBase64Password $localAdminPwdEnc 
$domainAdminPwd = Get-DecryptedPassword -certName $certNamePwdEnc -encryptedBase64Password $domainAdminPwdEnc 
Write-Verbose "Successfully decrypted VM Extension passed password"

The main provisioning script encrypts the passwords and secrets using that very same module before being passed into VM Custom Script Extensions as follows:

$vmExtParamStorageAccountKeyEnc = `
    Get-EncryptedPassword -certName $certNameForPwdEncryption `
                          -passwordToEncrypt ($StorageAccountPrimaryRegionKey.Primary)

That way we at least make sure that no un-encrypted secret is visible in the Azure VM Custom Script Extension logs, which can easily be retrieved with the Azure Service Management API PowerShell cmdlets.

Final Words and More…

As I said, there are lots of other re-usable parts in the package I’ve just published on my GitHub repository, which can even be used to apply further setup and configuration steps on VM environments that have entirely been provisioned with Azure Resource Groups and Azure Resource Manager. A few examples:

  • Execute additional Custom Script VM Extensions on running VMs.
  • Wait for Custom Script VM Extensions to complete on running VMs.
  • A ready-to-use PowerShell function that makes it easier to Remote PowerShell into provisioned VMs.
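
For the last item, the basic mechanics for classic Service Management VMs can be sketched like this – service and VM names are placeholders, and the ready-to-use function in the package wraps more convenience around this idea:

# Sketch: open a remote PowerShell session into a classic Azure VM.
$uri = Get-AzureWinRMUri -ServiceName "mszsqlagustest" -Name "sqlnode1"
$session = New-PSSession -ConnectionUri $uri.AbsoluteUri `
               -Credential (Get-Credential) `
               -SessionOption (New-PSSessionOption -SkipCACheck -SkipCNCheck)
Invoke-Command -Session $session -ScriptBlock { hostname }
Remove-PSSession $session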

We also make use of an AzureNetworking PowerShell module published on the TechNet Gallery. But note that we also made some bug fixes in that module (such as dealing with “totally empty VNET configuration XML files”).

Generally, the experience of building these ~2500 lines of PowerShell code was super hard, but a great learning experience. I am really keen to publish the follow-up post on this that demonstrates how much easier Azure Resource Group templates make such a complex setup.

Also, I do hope that we will have such a multi-data-center template in the default gallery soon, since it is highly valuable for all partners and customers that need to provide high availability across multiple data centers using SQL Server Virtual Machines. In the meantime we will try to provide a sample based on the work above as soon as we have the time/resources for the implementation.

Finally – thanks to Max Knor and Igor Pagliai – without their help we would not have achieved these goals at this level of completeness!

Detecting if a Virtual Machine Runs in Microsoft Azure (Linux & Windows) to Protect your Software when distributed via the Azure Marketplace

Our team started working more and more with software vendors we categorize as “Enablers” at a global scale. Such companies provide building-block services which can be used to build finished software services that run in the cloud (or on-premises).

For such “Enablers” the Azure Marketplace is a key instrument to gain visibility and traction, as well as for instantiating their services in their customers’ Microsoft Azure subscriptions.

At the moment most of the partners are working with us to deploy offerings based on templates with single or multiple Virtual Machines that do run their software. Later down the path we will also enable on-boarding of “Application Services” where customers do not have to instantiate and manage Virtual Machines, anymore.

One of the main challenges our partners face when putting their software into Virtual Machine templates which can be instantiated and/or purchased through the Azure Marketplace is protecting their software from being operated outside of Azure, since this would enable malicious people to operate the software without being charged for it.

Customers have full control of VMs provisioned via the Marketplace

Since end-customers who create Virtual Machines via the Azure Marketplace have full control of the resulting, instantiated VMs after they provisioned them, many of our partners start asking the following obvious question: How can I detect if a Virtual Machine runs in Azure, so that my software can block itself from being started when not running in Azure?

Unfortunately, as of today there’s no good and simple answer to that. There are various approaches out there which I would like to summarize below. I think the best possible way as of today (April 2015) is a combination of all of these approaches, to make it as hard as possible to run your software outside of an Azure VM.

Query for DHCP Option 245

The first option is one that originally came from a fellow peer on our Azure support engineering team. It has been provided for Windows Virtual Machines as a PowerShell script and essentially performs the following two actions:

  1. Check if the VMBus driver from Hyper-V is active.
  2. If so, check the DHCP lease attributes for option “unknown-245”

The option “unknown-245” is an Azure-proprietary option which only gets issued by an Azure DHCP server. Since in Azure you will always get an address via DHCP (static IPs are also managed by the DHCP server and the REST management API), you will always (and in theory only) get this option as part of the DHCP lease attributes when your machine runs in Azure.

For Windows there is a ready-made PowerShell CmdLet that allows you to detect if a VM runs in Azure: https://gallery.technet.microsoft.com/scriptcenter/Detect-Windows-Azure-aed06d51

For Linux you can create a bash-script such as the following one to detect if the option unknown-245 is available to have a first indicator of whether you run in Azure or not:

if `grep -q unknown-245 /var/lib/dhcp/dhclient.eth0.leases`; then
    echo "Running in an Azure VM"
fi

This is currently considered to be the most used and simplest approach to detect if you’re running on Azure that is “good enough”. But for some partners it is understandably not enough, yet…

Use the Azure Agent as Detection-Strategy

On Linux VMs specifically, another approach is reading the configuration from the Microsoft Azure Agent, which is always installed on a Linux VM, and trying to reach its ping-counterpart on the host-agent side. If a VM does not run in Azure, trying to reach the host-agent endpoint will always result in a timeout. Here’s a sample script for doing so:

curl --connect-timeout 1 `grep FullConfig /var/lib/waagent/GoalState.1.xml | perl -pe 's/<.?FullConfig>//g; s/\s//g'` && echo azure || echo no-azure

On Windows VMs there’s only an agent available when you explicitly select the VM Agent to be installed for VM Extensions. Some partners check if that agent is available and explicitly document for their customers that they MUST install the VM Agent when provisioning a Marketplace image from the Azure Marketplace for their software to work correctly.

Checking your external IP Address

If your Virtual Machine has a public endpoint attached, you can also verify the public IP address your VM is using when trying to access other services and compare it against the IP address ranges that are reserved for Azure data centers.

The Azure data center IP address ranges can be downloaded from the Microsoft Download Center here: http://www.microsoft.com/en-us/download/details.aspx?id=41653

Valuable services for getting your publicly visible IP address include http://ifconfig.me/ip, which can easily be used in a PowerShell script or bash script:

function Get-ExternalIP {
    (Invoke-WebRequest ifconfig.me/ip).Content
}
Get-ExternalIP

A more complex script can then even automatically download the Azure IP ranges from the Microsoft Download Center via the direct URL (they are stored as XML) and check if the ranges match.
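
A rough sketch of the range check itself could look like the following – it assumes you downloaded the XML file from the link above to a local path and that the file follows the structure published at the time of writing (IpRange elements with a Subnet attribute in CIDR notation); function and parameter names are mine, for illustration:

# Sketch: check whether an IPv4 address falls into one of the Azure
# data center ranges from the downloaded XML file.
function Test-IpInAzureRanges {
    param(
        [string]$IpAddress,     # the externally visible IP address of your VM
        [string]$RangesXmlPath  # local path to the downloaded IP ranges XML
    )
    # Convert a dotted IPv4 address into a number for prefix comparison
    function ConvertTo-IpValue([string]$ip) {
        $bytes = ([System.Net.IPAddress]::Parse($ip)).GetAddressBytes()
        [System.Array]::Reverse($bytes)
        [long][System.BitConverter]::ToUInt32($bytes, 0)
    }
    $ipValue = ConvertTo-IpValue $IpAddress
    $xml = [xml](Get-Content $RangesXmlPath)
    foreach ($subnet in $xml.SelectNodes("//IpRange/@Subnet")) {
        $network, $prefix = $subnet.Value.Split('/')
        # Compare only the top $prefix bits of both addresses
        $shift = 32 - [int]$prefix
        if (($ipValue -shr $shift) -eq ((ConvertTo-IpValue $network) -shr $shift)) {
            return $true
        }
    }
    return $false
}

Test-IpInAzureRanges -IpAddress (Get-ExternalIP) -RangesXmlPath ".\PublicIPs.xml"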

Leveraging the Azure REST Management API or CLI-interfaces

Finally, you could also ship your VM with either the Azure PowerShell cmdlets or the Azure Cross-Platform CLI included, and query details about your VM through the REST management API.

But that has one big step you need to take upfront: this requires you to force the user to somehow provide credentials or a management certificate that gives the VM access to the customer’s subscription in which the VM is deployed, so that you can query the details about the VM (which belongs to the customer’s subscription and is owned by the customer… and not by you as the provider/creator of the marketplace VM template and offering).

To get this done you need to do e.g. one of the following things:

  • Write very explicit documentation for your customers that explains what they need to do after they provisioned the VM from the Azure Marketplace into their subscription, before they can use your software in that VM or VMs.
  • Or e.g. write a little provisioning web application which is shipped as part of the VM image and which the user needs to browse to immediately after provisioning the VM from the marketplace, to enter the remaining details that enable your software or “bootstrapping scripts” to use the Azure Service Management API or CLI to query additional information about your VM and use that for detecting if you run in Azure or not (e.g. query your public and internal IP and compare them with what your VM reports etc.). Of course you need to make sure that this “provisioning app” is only active in the provisioned instance after the initial creation from the marketplace, to avoid any kind of security issues.

At some point in the near future, the Azure Marketplace service will enable publishers of images to require the user to provide additional details through the Azure portal as part of the provisioning/creation process. But as long as that’s not possible, you need to look at approaches such as the ones I’ve outlined above.

Final Words

The approaches outlined above are all used by publishers of VM templates in the marketplace today, and they work. I know they are not optimal, but we also know that the product group is aware of the challenges and will work on better solutions in the future. For now, the approaches I outlined above are easy and pragmatic ways that at least give you some level of guarantee for detecting whether a VM and your software run in Microsoft Azure (public cloud) or not…