This repository contains solutions to two distinct tasks:
- A Python Flask API deployed to a production environment
- Secure database access from the API
This document is organized as below:
- Python Postgres Azure Flask Application
- Table of Contents
- Python Flask API
- Theoretical Case: Secure Database Access
- Assumptions
- ToDo
In this section, we will cover a practical solution for setting up a deployable production environment for a simplified application. This environment consists of an API service deployed on Azure Cloud. The goal is to automate the setup of this production environment using "infrastructure as code" principles. Below are the steps to achieve this:
Note: The solution development was conducted on a MacBook M1. Therefore, the instructions are tailored for use in a macOS environment or a similar development environment.
Before we get started, ensure you have the tools below set up:
- Python v3.13 - For developing the Flask API application.
- Terraform v1.5.7 - Leading Infrastructure as Code framework for building Cloud & On-Prem Infrastructure.
- GoLang go v1.21.1 (Optional) - We are using Terratest for testing our TF Code, so you only need Go if you wish to write or run those tests.
- Docker Desktop - Used for containerizing and testing the application locally.
- Azure Cloud Account (Optional) - For deploying the application to the Cloud Environment.
- Azure Container Apps - A fully managed serverless containerized application platform. We are deploying our API service to it.
- Github Account (Free Tier) - If you would like to fork this repo and run the pipeline.
- Github Actions - Our CI/CD platform providing a cloud agnostic approach for application and infrastructure deployment.
- DockerHub - For storing our private Docker image; a SaaS solution that is quick to use. Ideally, for an enterprise solution, we would need a paid tier of DockerHub or an alternative such as Azure Container Registry.
- Behave - For BDD testing of the API using the Gherkin syntax.
All the tools used so far are free for personal use.
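To illustrate the BDD tests mentioned above, a minimal Behave feature for the greeting endpoint might look like the sketch below; the feature wording and step phrases are illustrative, not the repository's actual test suite:

```gherkin
Feature: Greeting API health check

  Scenario: The API responds with a greeting
    Given the API is running at http://127.0.0.1:3000
    When I send a GET request to "/"
    Then the response status code should be 200
    And the response message should be "Hello world!"
```

Beyond these, we rely on the following security and quality tooling across the SDLC: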
- TFSec - Scans Terraform Code for vulnerable configurations.
- TFLint - Lints the Terraform code.
- Terraform-Compliance - Lightweight, security and compliance focused test framework.
- Terratest - For Unit Testing the IAC and ensuring the configuration matches the desired state.
- Bandit - A Python library for static code analysis.
- Safety - A Python library for dependency vulnerability analysis.
- Checkov - An IAC vulnerability scanning tool; we use it for scanning our Dockerfile.
- Trivy - For scanning the Docker image for vulnerabilities before pushing it to the registry.
- OWASP ZAP - Penetration testing of the deployed API.
- Github Secret Scanning
- Dependabot and Mergify - Dependabot bumps the dependencies by creating a PR. This helps us keep our dependencies up to date and avoid vulnerabilities. We also use Mergify to streamline the PR merging process, automating it when all the necessary checks and criteria are satisfied.
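As a sketch of how the Mergify automation could be wired up, a .mergify.yml along these lines would auto-merge Dependabot PRs once checks pass; the rule name and check name are assumptions, not this repository's actual configuration:

```yaml
pull_request_rules:
  - name: auto-merge Dependabot PRs once all checks pass
    conditions:
      - author=dependabot[bot]   # only PRs raised by Dependabot
      - check-success=build      # assumed name of the required CI check
    actions:
      merge:
        method: squash           # squash-merge to keep history clean
```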
- Start Docker Desktop.
- From the root directory of this repository, execute the command below:
make start-app-db
The above command uses Docker Compose to run containerized instances of our API and a postgres-13.5 database, and then loads the mock data into the Postgres database.
- Test the application by making API requests. For example:
- GET - Greeting API (Health check)
curl http://127.0.0.1:3000
{"message":"Hello world!"}
- GET - Rates API
curl "http://127.0.0.1:3000/rates?date_from=2021-01-01&date_to=2021-01-31&orig_code=CNGGZ&dest_code=EETLL"
The output should be something like this:
{ "rates" : [ { "count" : 3, "day" : "2021-01-31", "price" : 1154.33333333333 }, { "count" : 3, "day" : "2021-01-30", "price" : 1154.33333333333 }, ... ] }
- Stop the application:
make stop-app-db
- Check all available options:
make help
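For reference, make start-app-db boils down to a Docker Compose setup along these lines; the service names, credentials, and file paths here are illustrative assumptions, and the repository's actual docker-compose.yml is authoritative:

```yaml
services:
  db:
    image: postgres:13.5              # matches the postgres-13.5 version used above
    environment:
      POSTGRES_PASSWORD: ratestask    # placeholder credential
    volumes:
      - ./db/rates.sql:/docker-entrypoint-initdb.d/rates.sql  # loads the mock data on startup
  api:
    build: .                          # containerized Flask API
    ports:
      - "3000:3000"                   # exposes the API on http://127.0.0.1:3000
    depends_on:
      - db
```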
Click here to check the local execution steps
There’s an SQL dump in db/rates.sql that needs to be loaded into a PostgreSQL 13.5 database. After installing the database, the data can be imported through:
createdb rates
psql -h localhost -U postgres < db/rates.sql
You can verify that the database is running through:
psql -h localhost -U postgres -c "SELECT 'alive'"
The output should be something like:
 ?column?
----------
 alive
(1 row)
Start from the rates folder:
DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y python3-pip
pip install -U gunicorn
pip3 install -Ur requirements.txt
Then start the server:
gunicorn -b :3000 wsgi
The API should now be running on http://localhost:3000.
Get average rates between ports:
curl "http://127.0.0.1:3000/rates?date_from=2021-01-01&date_to=2021-01-31&orig_code=CNGGZ&dest_code=EETLL"
The output should be something like this:
{ "rates" : [ { "count" : 3, "day" : "2021-01-31", "price" : 1154.33333333333 }, { "count" : 3, "day" : "2021-01-30", "price" : 1154.33333333333 }, ... ] }
The solution uses Terraform (infrastructure as code) components that allow you to deploy this environment on cloud providers such as Azure.
The Terraform code is split into two logical segments, as below:
Bootstrap Infrastructure refers to the essential infrastructure resources that are necessary during the initial provisioning phase and ideally remain unchanged or require only infrequent modification.
It contains Terraform code for provisioning the Resource Group, Storage Account, Service Principal, etc.
This kind of infrastructure requires a higher level of privileges to provision, and in some organizations it is maintained by a separate team (Cloud Engineering, SRE, etc.) for various purposes.
Click here to see implementation details
Building infrastructure from scratch poses a "Chicken and Egg Paradox" challenge. This challenge arises because, for Terraform to store its state, a storage container is required. We may choose to provision this storage container manually. However, as per our commitment to provisioning all resources through Infrastructure as Code (IAC), we encounter a problem. We can deal with this problem as below:
- Initialize Terraform without a "remote backend."
- Write bare minimal Terraform code for provisioning a storage container.
- Apply Terraform changes to create this storage container successfully.
- After the container is provisioned, the Terraform backend configuration is added to the Terraform provider. Subsequently, Terraform is reinitialized.
- Now, the backend provisioning is managed by Terraform itself, ensuring ease of management.
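To make the last two steps concrete, the backend wiring might look like the sketch below; the resource names and state key are placeholders, not values from this repository:

```hcl
# main.tf: the backend block stays commented out for the very first apply
terraform {
  # backend "azurerm" {}   # uncommented once the storage container exists
}
```

The matching dev/backend-config.hcl then points Terraform at the newly created storage:

```hcl
# dev/backend-config.hcl (placeholder values)
resource_group_name  = "dev-bootstrap-rg"
storage_account_name = "devbootstrapsa"
container_name       = "tfstate"
key                  = "bootstrap.tfstate"
```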
Follow the steps below for provisioning the bootstrap infrastructure:
- Ensure you have Terraform 1.5.7 installed:
terraform --version
- Authenticate the Azure CLI by running the command below from the terminal (or other appropriate means):
az login
- From the terminal, change the directory:
cd infrastructure/bootstrap
- Ensure the Terraform backend config backend "azurerm" {} is commented out in infrastructure/bootstrap/main.tf
- Initialize Terraform for the dev environment (terraform init does not accept -var-file):
terraform init
- Plan the Terraform changes and review the output:
terraform plan -var-file=./dev/terraform.tfvars
- Apply the changes once the plan is reviewed:
terraform apply -var-file=./dev/terraform.tfvars
- Reinitialize Terraform to use a remote backend:
Uncomment # backend "azurerm" {} in main.tf, then execute the command below and respond yes when prompted:
terraform init -backend-config=./dev/backend-config.hcl
Once this executes successfully, the local terraform.tfstate file is securely stored in the Azure Storage Account.
- Repeat the steps for the other environments.
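The full flow for the dev environment condenses to the commands below:

```sh
cd infrastructure/bootstrap
terraform init                                      # local backend for the first run
terraform plan  -var-file=./dev/terraform.tfvars
terraform apply -var-file=./dev/terraform.tfvars
# uncomment `backend "azurerm" {}` in main.tf, then migrate the state:
terraform init -backend-config=./dev/backend-config.hcl
```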
We have configured a bootstrap-infrastructure Github Actions pipeline here to automate the bootstrap infrastructure provisioning and avoid any local execution apart from the initial setup.
Contains Terraform code for provisioning the Azure VNet, its associated infrastructure components, and Azure Container Apps for deploying a simple containerized application.
Application infrastructure refers to any infrastructure we use for deploying and running the application. It is intended for the team that owns the application and generally requires very specific privileges for infrastructure deployment.
Click here to see the implementation details
The application infrastructure primarily contains the Terraform code for deploying the Python Flask API. The database connection configuration can be managed through the terraform.tfvars file in the respective environment directory.
Override the default values set in variables.tf for each environment in the respective [ENVIRONMENT_NAME]/terraform.tfvars file, as shown below:
app_container_config = {
  name          = "[ENVIRONMENT_NAME]-python-postgres-azure-app"
  revision_mode = "Single"
  memory        = "0.5Gi"
  cpu           = 0.25
  ingress = {
    allow_insecure_connections = false
    external_enabled           = true
    target_port                = 5000
  }
  # Plain environment variables passed to the app container
  environment_variables = [
    {
      name  = "name"
      value = "[ENVIRONMENT_NAME]postgres"
    },
    {
      name  = "user"
      value = "[ENVIRONMENT_NAME]abhishek"
    },
    {
      name  = "host"
      value = "[POSTGRES_HOST_URL_FOR_RESPECTIVE_ENVIRONMENT]"
    }
  ]
}
These infrastructure changes are deployed as part of the python-application-deployment Github Actions pipeline configured here.
We use Github Actions for automating the operations below:
- Provisioning our Azure Infrastructure in all the environments with all the quality and security gates
- Deploying the Python Flask API in all the environments with all the quality and security gates
- Testing Dependabot and other PRs and automatically merging them once all the success criteria are met
- Scheduled vulnerability scanning of all components of the SDLC and infrastructure
- Scheduled smoke tests on the infrastructure for drift detection
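The scheduled jobs above rely on Github Actions' cron triggers; a minimal trigger sketch is shown below (the schedule itself is an illustrative choice):

```yaml
# Trigger block for a scheduled scanning / smoke-test workflow
on:
  schedule:
    - cron: "0 2 * * *"   # every day at 02:00 UTC
  workflow_dispatch: {}    # allows manual runs for debugging
```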
We are using Github-OIDC for securing the connectivity between Github Actions and Azure Cloud, thus reducing the risk of compromising credentials. Once the service principal is provisioned by the bootstrap infrastructure, you must configure its credentials in the Github repository under Settings > Secrets and Variables > Actions > New Repository Secret
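With those secrets in place, a workflow job can authenticate via OIDC roughly as below; this is a minimal sketch assuming the azure/login action and the secret names shown, which may differ from this repository's actual workflow:

```yaml
permissions:
  id-token: write   # required for Github to issue the OIDC token
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
```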
Secure Access to the Postgres Database deployed on Azure Cloud requires:
- End-to-end auditing capabilities for any operation performed.
- An automated solution for rotating the database credentials every X number of days, plus workflow capabilities for user management and manual approvals/reviews before an action is taken.
- The solution should provide zero downtime for the application.
We would like to use the below components of Azure Cloud for implementing the solution:
- Azure Active Directory: Integrate Azure AD with the PostgreSQL database to manage users, authentication, and access control. This allows for centralized user management and provides a basis for approval workflows.
- Azure Key Vault: Store and manage secrets and passwords securely within Azure Key Vault. Key Vault provides features for secret rotation and auditing, which align with our compliance requirements.
- Azure Logic Apps: Use Logic Apps to automate user creation, approval workflows, and password rotation. Logic Apps can integrate with Azure AD and Key Vault to perform these tasks.
- Azure Monitor and Azure Security Center: Use these Azure services to monitor and audit activities on our Azure PostgreSQL database. They provide insights and compliance checks for our database operations.
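As a small illustration of the Key Vault piece, a rotation workflow would ultimately write the new credential with a command along these lines; the vault and secret names are placeholders:

```sh
# Store the freshly rotated database password as a new secret version
az keyvault secret set \
  --vault-name example-postgres-kv \
  --name postgres-app-password \
  --value "$NEW_PASSWORD"
```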
The solution can be visualized with the help of the request flow diagram below:
- The API needs to be publicly accessible.
- The intended audience has a decent understanding of Azure Cloud, Github Actions, Terraform, and Docker.
- They are using a MacBook or a similar development environment.
Below is the list of things we must do to make the implementation production ready.
- Add links to the official documentation.
- Migrate from Pip to modern package managers like Pipenv or Poetry for better dependency management.
- Implement unit tests and E2E tests appropriately to adhere to the Test Pyramid strategy, thus maturing the testing strategy.
- Tagging strategy: currently we use the job ID for tagging Docker images, which remains unique across pipeline executions. A preferred approach would be to use semver for versioning the images and the API.
- Use Terratest for integration tests.
- Implement Smoke/E2E testing for IAC once the infrastructure is provisioned. Execute it on a scheduled event to detect any drift from the desired state defined as IAC.
- Pass the plan artifact from plan to apply (a Terraform best practice).
- Analyze the pros and cons of TF workspaces.
- Analyze the pros and cons of splitting the AZ Container Apps Environment provisioning through TF from the AZ Container App deployment and configuration through Github Actions and YML configs.
- Refactor the Github Actions pipeline code to reduce boilerplate code and practice DRY.
- Perform a pen test after the Dev deployment.
- Certain organizations require a manual approval step before Prod deployment and the creation of a Change Request for auditing purposes; this should be implemented for the application and infrastructure deployments.
- Automate Chaos Testing using Simian Army for testing the Disaster Recovery strategy.
- Restrict Ingress and Egress to the API.
- Integrate a Web Application Firewall.
- Integrate an API Gateway and API Management with the AZ Container Apps. Use appropriate authentication and authorization mechanisms to protect the API.
- Block public access for Dev and Pre. Configure the VNet to allow access only within the organization's private network, for example once users are connected to the VPN.
- Set up a tunnel for use with the CI/CD pipeline, allowing access to the Dev/Pre environment APIs for executing tests once deployed.
- Limit the scope of the Service Principals and the roles assigned. Create a separate Service Principal for application deployment.