NASA is about to launch its most powerful rocket ever, the Space Launch System (SLS), on its maiden voyage around the moon any day now. Cobbled together from old Space Shuttle parts, it has taken 11 years and $4 billion of taxpayers’ money to build. The SLS, like the giant Saturn V rocket before it, is not reusable.
Meanwhile the privately funded company SpaceX has been working away on its giant Starship rocket, which is totally reusable.
Both these rockets are designed to do the same job: get astronauts to the moon. But what’s interesting is the difference in the processes that brought them to life.
It was thought that using old Space Shuttle parts would make the SLS cheaper to develop; in fact, the opposite has proven true. NASA’s public funding model has had a very undesirable effect: the requirement from Congress to provide jobs through contracts to American companies means that delivery is incentivised to take longer and therefore to cost more.
Fortunately, under the Obama administration, NASA did provide funding for private space companies to take up the challenge, and this has resulted in SpaceX being able to develop Starship, which competes directly with the SLS at a fraction of the cost.
The SLS seems to have been a very costly insurance project, just in case the privately funded space companies did not rise to the challenge. But I can’t help thinking: what else could NASA have built with $4 billion? Maybe something akin to amazing scientific research projects like the James Webb Space Telescope.
At the end of the day these rockets are really just infrastructure, a way to get payloads up into low earth orbit and eventually to the moon. They require iteration, experimentation, an agile mindset and, above all, an ability to embrace failure. This is where competition and innovation shine, and it is what SpaceX has in spades. It stands in stark contrast to NASA’s compliance-driven, risk-averse, bureaucratic culture, no longer driven by the mission and a far cry from the NASA of 1969 and the first moon landings.
If you’re using tools like Checkmarx or JFrog Xray to scan for security vulnerabilities in the third-party dependencies of your NPM builds, you may have noticed that they can highlight a lot of vulnerabilities that come from development-only dependencies.
If you’re producing a shared NPM library or service there is no need for your development dependencies to be included in the final package; to achieve this, pass the --only=production flag when installing.
This will also save a lot of time, as security scans will then only consider production dependencies.
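In its simplest form this is just the install step. A minimal sketch as a pipeline script step (on newer versions of npm the equivalent flag is --omit=dev):

steps:
- script: npm install --only=production
  displayName: Install production dependencies only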
Example – Using JFrog Xray on Azure Pipelines
Here is the complete code snippet to install only production dependencies, pack and publish the artifact, collect the build-info (for Xray) and then perform an Xray scan of the build.
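The sketch below assumes the JFrog Artifactory extension for Azure Pipelines is installed; the task names and inputs vary between extension versions, and ‘my-artifactory’ stands in for your own Artifactory service connection, so treat this as a starting point rather than a definitive snippet:

steps:
- task: ArtifactoryNpm@2
  displayName: npm install (production dependencies only)
  inputs:
    command: install
    artifactoryService: my-artifactory    # placeholder service connection name
    sourceRepo: npm-remote                # placeholder repo names
    arguments: --only=production
    collectBuildInfo: true
    buildName: $(Build.DefinitionName)
    buildNumber: $(Build.BuildNumber)
- task: ArtifactoryNpm@2
  displayName: npm pack and publish
  inputs:
    command: pack and publish
    artifactoryService: my-artifactory
    targetRepo: npm-local
    collectBuildInfo: true
    buildName: $(Build.DefinitionName)
    buildNumber: $(Build.BuildNumber)
- task: ArtifactoryPublishBuildInfo@1
  displayName: Publish build-info to Artifactory
  inputs:
    artifactoryService: my-artifactory
    buildName: $(Build.DefinitionName)
    buildNumber: $(Build.BuildNumber)
- task: ArtifactoryXrayScan@1
  displayName: Xray scan of the build
  inputs:
    artifactoryService: my-artifactory
    buildName: $(Build.DefinitionName)
    buildNumber: $(Build.BuildNumber)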
Deployment from the Azure DevOps cloud service to on-premise servers can be done in either a pull or a push setup. Usually I’ve found the pull approach most suitable, as it easily scales to multiple target machines in each environment and does not require the pipeline deployment job itself to know about each server. (A server in this case can be a VM or an actual physical server.)
Using YAML pipelines (preferred over the classic release GUI pipelines) we can implement pull deployments using Environments. Each Environment is configured within Azure Pipelines and target servers are added to it. Each target server requires an agent running on it to communicate back to the deployment job in Azure Pipelines. So each of your target servers not only runs your application but also runs the deployment (via the installed agent).
Tags allow you to differentiate between server types or roles, such as web, app and database, or primary and secondary regions. This is useful when you configure your jobs in the pipeline so that certain jobs run against certain targets, as in the sketch below.
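A minimal sketch of a deployment job that targets only the machines tagged ‘web’ in a hypothetical ‘Production’ Environment:

jobs:
- deployment: DeployWeb
  environment:
    name: Production             # hypothetical Environment name
    resourceType: VirtualMachine
    tags: web                    # only run against resources tagged ‘web’
  strategy:
    runOnce:
      deploy:
        steps:
        - script: echo deploying to a web server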
One good thing about Environments is that their agents do not count towards your limit for parallel agents.
The agent talks to Azure DevOps over port 443, which means you can have pretty strong rules on inbound traffic to the server, as only outbound traffic over 443 is required for the agent to work.
One risk to be aware of: compromised agent software would have direct, on-machine access to all servers, including production.
Having a variation of the application for each environment. A single artifact should be built once and then deployed to all environments, otherwise you can’t guarantee that each variation has been tested.
Having to re-build and re-deploy the artifact if changes are required in its environmental configuration, for the same reason as above.
Having secrets mixed in with your non-secret environment config (or anywhere in source code, for that matter).
Consider Separation of Concerns
Consider who is going to make changes to the config, both secret and non-secret.
Developers are responsible for the schema of the configuration: they need to know the keys, NOT the values, across environments.
The infrastructure team is responsible for the life cycle of the configuration (create, read, update and delete; renewing expiring secrets; ensuring security): they need to set the values, as they are the ones who create these for each environment.
The infrastructure team has to modify application source code to update values (or pass values to developers).
Resource names are potentially sensitive information that might help a hacker to gain access to systems.
Infrastructure would need to modify multiple app configs if those apps are deployed to a single service, e.g. a K8s cluster, whereas if config is obtained by the service at runtime this only has to be done once.
Configuration store: holds environment-specific values for the application but does not contain sensitive information, so it does not require encryption. Access is by general operations roles.
Secret store: holds sensitive configuration values such as connection strings, certificates and access tokens. Each application should have its own scope. Access is restricted to elevated operational roles.
Consider how config changes get into production
By accessing the config at runtime you avoid having to rebuild and redeploy the app when config changes.
What is required to make a config change live depends on when the config store is accessed:
Config store accessed at runtime, e.g. Azure App Configuration accessed by SDK: the app picks up the change while running.
Config store accessed at build time, e.g. a config file in with the source code, with a file for each environment (appsettings.json, appsettings-dev.json, appsettings-test.json): rebuild and redeploy the app.
Config store accessed at deploy time, e.g. Helm: redeploy the app, as in the sketch below.
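A minimal sketch of deploy-time config with Helm in a pipeline step; the release name, chart path and per-environment values files are all hypothetical:

steps:
- script: helm upgrade --install myapp ./charts/myapp -f values-$(environmentName).yaml
  displayName: Deploy with environment-specific values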
Why use a config store?
Allows config to be centrally stored, so it is easier to debug problems and compare config across related services.
Supports hierarchies of config parameters.
Controls feature availability in real time through feature flags.
Identify the long-running stages that don’t need to run sequentially. For example you may run static code analysers, a Sonar code quality scan and also a Checkmarx CxSAST security scan. These can be run independently and so are good candidates to run at the same time. They also tend to take a few minutes, which is generally longer than most other build tasks.
Azure Pipelines runs jobs in parallel simply when you don’t specify a dependsOn between them; stages, by contrast, run sequentially in the order they are defined unless you give them an empty (or shared) dependsOn.
jobs:
- job: Windows
  steps:
  - script: echo hello from Windows
- job: macOS
  steps:
  - script: echo hello from macOS
- job: Linux
  steps:
  - script: echo hello from Linux
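Applying the same idea at stage level, the two scans from the example above can each depend only on the build stage so that they run side by side; the stage and job names here are illustrative:

stages:
- stage: Build
  jobs:
  - job: BuildApp
    steps:
    - script: echo building the app
- stage: SonarScan
  dependsOn: Build
  jobs:
  - job: Sonar
    steps:
    - script: echo running the Sonar scan
- stage: CheckmarxScan
  dependsOn: Build    # same dependency as SonarScan, so both scan stages run in parallel
  jobs:
  - job: Checkmarx
    steps:
    - script: echo running the Checkmarx scan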
Rather than run all your build tasks for all branches, including feature branches, think about moving longer-running tasks to execute only on pull request builds. For example you could move the Sonar code quality scan to run only when merging to master through a pull request. The downside to this is that developers get the feedback slightly later in the cycle, but one way to mitigate this is to run SonarLint within your IDE to get feedback as you code. https://www.sonarqube.org/sonarlint/
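One way to achieve this is to condition the step on the build reason; the echo here is a stand-in for your actual scan task:

steps:
- script: echo running the Sonar scan    # stand-in for the real scan task
  displayName: Sonar scan (pull request builds only)
  condition: eq(variables['Build.Reason'], 'PullRequest')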
Downloading dependencies can be bandwidth intensive and time consuming. By caching the third party packages that are needed for your build you can avoid the cost of downloading each time. This is especially important if you use disposable agents that are thrown away after executing their build stage.
Azure Pipelines also supports caching across multiple pipeline runs.
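For example, npm’s cache directory can be cached with a key derived from the lock file; this follows the pattern in the Azure Pipelines caching docs, using the conventional npm_config_cache variable:

variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  displayName: Cache npm packages
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)
- script: npm ci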
Check how many agents you have available to run pipeline tasks. If you are running tasks in parallel you will need multiple agents per pipeline run. Check the queued tasks and consider increasing the number of available agents if you see tasks waiting for others to complete.
This post shows how to structure a monorepo for NuGet packages and then automate their build using Azure YAML pipelines.
Use a package’s solution file to define the location of the package’s source code and test code. This single solution file can then be passed to the DotNetCoreCLI@2 task to build and pack only that particular package.
Project file structure for two packages, Common.A and Common.B, with solution files Common.A.sln and Common.B.sln.
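A sketch of the build and pack steps for one of the packages, passing its solution file to the task; the output directory is the standard staging folder:

steps:
- task: DotNetCoreCLI@2
  displayName: Build Common.A
  inputs:
    command: build
    projects: Common.A.sln
- task: DotNetCoreCLI@2
  displayName: Pack Common.A
  inputs:
    command: pack
    packagesToPack: Common.A.sln
    outputDir: $(Build.ArtifactStagingDirectory)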
As Seth Godin says, opportunity cost just went up. Building and delivering software is getting more complicated, so keep your human mind free to focus on the interesting bits and leave the boring, repetitive stuff to the machines.
What’s the best way to do this? Build your CD pipeline first, set up the infrastructure (preferably serverless) from the start and then get into the flow of development. Add more checks and balances as you go. Run automated tests. Deploy to the cloud.
Azure Pipelines supports storing secret variables within the project, either in variable groups or as individual pipeline secret variables.
This is a convenient place to store all those database connection strings and access tokens you need to access external services like JFrog Artifactory, or to deploy to Azure services such as a Kubernetes cluster or CosmosDB.
Simply enter the secret value, check the padlock and everything is safe, right? Well, that depends what you mean by ‘safe’!
Not so fast kiddo
Although Azure Pipelines takes pains to obscure the echoing of secrets to the pipeline console, and even prevents secrets from being made available by default to pipeline scripts, this does not mean the secrets can’t still be, how shall we say, obtained.
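For context, a secret variable only reaches a script if it is explicitly mapped into the script’s environment; a minimal sketch, with ‘superSecret’ as a hypothetical secret variable (the log shows *** but the script itself sees the real value):

steps:
- script: printenv MY_MAPPED_SECRET
  env:
    MY_MAPPED_SECRET: $(superSecret)    # explicit mapping is required for secret variables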
A pipeline developer who has access to azure-pipelines.yml can quite easily grab the secret value, write it to a file, publish that file as a pipeline artifact and then simply download the file from the pipeline console once the run completes.
What do you mean – ‘developers can see the production passwords’?
Well, if you use variable groups to store secret variables for each environment you deploy to, then YES!
Now this may be fine if only admins have access to run the pipelines, but if your azure-pipelines.yml file is embedded within your application source code, this in theory means an application developer could change the pipeline definition to reveal production secrets.
So how do I prevent this happening in our team?
Luckily the Azure Pipelines security onion does have a good selection of layers to peel back.
Force azure-pipelines.yml to extend a ‘master’ template which restricts what tasks and scripts can be run by the child pipeline, using a mechanism similar to inheritance. Then use a ‘required template’ check to ensure only sanctioned pipelines that extend the ‘master’ template can run, as sketched below.
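A sketch of the extends mechanism; the template repository and file names are hypothetical, and the parameter shape depends entirely on what the master template defines:

# azure-pipelines.yml in the app repo
resources:
  repositories:
  - repository: templates
    type: git
    name: PipelineTemplates    # hypothetical protected repo holding the master template

extends:
  template: master.yml@templates    # the ‘required template’ check verifies this
  parameters:
    buildSteps:
    - script: echo build the app    # only steps the master template permits will run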
Don’t put your azure-pipelines.yml in your app source code; instead store it within a separate protected repo and pull in a reference to your app repo.
Use permissions on variable groups to restrict access to pipeline admin roles, excluding developer roles. However, this only works if devs are not allowed to deploy code to higher environments.
Separate the CI phase from the CD phase. This is similar to the previous technique and is in fact how pre-YAML Azure Pipelines structured its builds. You could argue that the CI phase can be run freely by developers, including deploying to a dev environment, but the CD phase should only be accessible to more privileged users who can promote the deployment through the various environments to production.
Finally, don’t use pipeline library secret variables; instead use Service Connections. A problem here is that not all services support service connections, e.g. Azure Databricks.
Security should always be concern number zero with any production system, and a CD pipeline that holds the keys to so many precious castles is a core component to protect.