
Infrastructure as Code (IaC)

Introduction

Infrastructure as Code (IaC) is a key practice in modern cloud computing and DevOps that allows you to manage and provision computing infrastructure through machine-readable definition files, rather than through physical hardware configuration or interactive configuration tools.

What is IaC?

Infrastructure as Code treats infrastructure the same way developers treat application code:

  • Version controlled: Infrastructure definitions are stored in version control systems
  • Reproducible: The same infrastructure can be created multiple times with identical results
  • Automated: Infrastructure deployment and management can be automated
  • Testable: Infrastructure changes can be tested before being applied to production

Why IaC for MLOps?

In MLOps, IaC becomes crucial for several reasons:

  • Consistency: Keep development, staging, and production environments consistent with one another
  • Scalability: Easily replicate infrastructure for different models or experiments
  • Collaboration: Teams can collaborate on infrastructure changes through code reviews
  • Disaster Recovery: Quickly rebuild infrastructure from code definitions
  • Cost Control: Track and manage cloud resources more effectively

IaC Tools

Several tools can be used for IaC:

  • AWS CloudFormation: Native AWS service for infrastructure provisioning
  • AWS CDK: Code-based approach using familiar programming languages
  • AWS SAM: Simplified approach specifically for serverless applications
  • Pulumi: Modern IaC using real programming languages
  • Terraform: Popular multi-cloud IaC tool
  • OpenTofu: Community-driven fork of Terraform

OpenTofu: Fun Fact!

In 2023, HashiCorp changed Terraform's license from the open-source MPL 2.0 to the more restrictive Business Source License (BUSL), which raised concerns about the project's freedom and sustainability.

In response, the community created an open fork to keep development collaborative and transparent.

Initially called OpenTF and later renamed OpenTofu, the project is now maintained by the Linux Foundation, which ensures open governance and compatibility with Terraform.

It has thus become a free and reliable alternative for users and companies that depend on IaC.

For this class, we'll focus on Terraform as it's widely adopted across the industry and provides excellent multi-cloud support.

Terraform Overview

Terraform is an open-source infrastructure as code software tool created by HashiCorp. It enables users to define and provision infrastructure using a declarative configuration language.

Key benefits of Terraform:

  • Multi-cloud: Works with AWS, Azure, GCP, and many other providers
  • Declarative: Describe what you want, not how to get there (see the sketch after this list)
  • State Management: Tracks infrastructure state and manages changes
  • Plan and Apply: Preview changes before applying them
  • Modular: Create reusable modules for common patterns
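
A minimal sketch of what declarative means in practice (the bucket name below is just a placeholder): you state the desired end result, and Terraform works out the create, update, or delete actions needed to reach it.

resource "aws_s3_bucket" "example" {
  # Desired end state: a bucket with this name exists.
  # Terraform decides whether to create it, update it, or do nothing.
  bucket = "my-example-bucket"
}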

Setting Up the Environment

Install Terraform

Install the latest version of Terraform following the instructions for your operating system:

Extra

If needed, see the official installation instructions on the Terraform website.

Linux (Ubuntu/Debian):

$ sudo apt-get update && sudo apt-get install -y gnupg software-properties-common
$ wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
$ gpg --no-default-keyring --keyring /usr/share/keyrings/hashicorp-archive-keyring.gpg --fingerprint
$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
$ sudo apt update
$ sudo apt-get install terraform

macOS (Homebrew):

$ brew tap hashicorp/tap
$ brew install hashicorp/tap/terraform

Windows:

  1. Download the Terraform binary from https://developer.hashicorp.com/terraform/install
  2. Extract the ZIP file to a directory (e.g., C:\terraform)
  3. Add the directory to the system PATH

Verify the installation:

$ terraform version
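
The output should look something like this (the exact version and platform will vary):

Terraform v1.9.5
on linux_amd64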

Configure AWS Profile

Make sure you have AWS credentials configured. Set the AWS profile for the session:

Linux/macOS (bash/zsh):

$ export AWS_PROFILE=mlops

Windows (PowerShell):

$ $env:AWS_PROFILE="mlops"

Windows (Command Prompt):

$ set AWS_PROFILE=mlops

Verify your AWS configuration:

$ aws sts get-caller-identity --profile mlops
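
If your credentials are configured correctly, the command returns a JSON document along these lines (all values below are placeholders):

{
    "UserId": "AIDAEXAMPLEID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/mlops"
}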

Question 1

Before we start, make sure you have:

  1. AWS CLI configured with appropriate credentials
  2. Docker installed and running
  3. Terraform installed

Getting Started with Terraform: S3 Bucket Experiment

Before diving into a more complex project, let's start with a simple Terraform experiment to understand the basics.

Create S3 Bucket Experiment

Let's create a simple S3 bucket using Terraform to understand the workflow.

$ mkdir terraform-s3-experiment
$ cd terraform-s3-experiment

Basic Terraform Configuration for S3

In Terraform, the main.tf file is where we define our infrastructure resources.

Question 2

Create a file main.tf with the following content:

# Configure Terraform and required providers
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_profile
}

# Random ID for unique bucket naming
resource "random_id" "bucket_suffix" {
  byte_length = 8
}

# S3 Bucket
resource "aws_s3_bucket" "experiment_bucket" {
  bucket = "${var.bucket_prefix}-${var.student_id}-${random_id.bucket_suffix.hex}"

  tags = {
    Name        = "Terraform Experiment Bucket"
    Environment = "learning"
    StudentId   = var.student_id
    CreatedBy   = "Terraform"
  }
}

# S3 Bucket versioning
resource "aws_s3_bucket_versioning" "experiment_bucket" {
  bucket = aws_s3_bucket.experiment_bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}

# S3 Bucket encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "experiment_bucket" {
  bucket = aws_s3_bucket.experiment_bucket.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# S3 Bucket public access block
resource "aws_s3_bucket_public_access_block" "experiment_bucket" {
  bucket = aws_s3_bucket.experiment_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
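
Note how ordering is never stated explicitly above: because the versioning, encryption, and public access block resources all reference aws_s3_bucket.experiment_bucket.id, Terraform infers that the bucket must be created first. When no such reference exists, depends_on forces an ordering. A hypothetical illustration, not part of this experiment's files:

# Hypothetical example: upload an object only after versioning is enabled
resource "aws_s3_object" "readme" {
  bucket  = aws_s3_bucket.experiment_bucket.id # implicit dependency on the bucket
  key     = "README.txt"
  content = "Uploaded by Terraform"

  # Explicit dependency: nothing here references the versioning resource,
  # so we have to name it directly
  depends_on = [aws_s3_bucket_versioning.experiment_bucket]
}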

Variables Configuration for S3 Experiment

The variables.tf file is where we define the input variables for our Terraform configuration.

Question 3

Create a file variables.tf:

Info!

You don't need to modify the variables.tf file for this experiment!

variable "aws_region" {
  description = "AWS region for resources"
  type        = string
  default     = "us-east-2"
}

variable "aws_profile" {
  description = "AWS profile to use"
  type        = string
  default     = "mlops"
}

variable "student_id" {
  description = "Unique student identifier"
  type        = string
  validation {
    condition     = can(regex("^[a-z0-9-]+$", var.student_id))
    error_message = "Student ID must contain only lowercase letters, numbers, and hyphens."
  }
}

variable "bucket_prefix" {
  description = "Prefix for the S3 bucket name"
  type        = string
  default     = "terraform-experiment"
}
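
Besides defaults and tfvars files, variable values can also be supplied on the command line or through environment variables prefixed with TF_VAR_. A quick sketch (the value alice is a placeholder):

$ terraform plan -var="student_id=alice"
$ TF_VAR_student_id=alice terraform plan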

Variable Values for S3 Experiment

We can define the variable values in a separate file called terraform.tfvars.

Question 4

Create a file terraform.tfvars:

Important

Replace student_id with your unique identifier (e.g., your username)

# AWS Configuration
aws_region  = "us-east-2"
aws_profile = "mlops"

# Student Configuration (REQUIRED - Make this unique!)
student_id = "your-username-here"

# Bucket Configuration
bucket_prefix = "terraform-experiment"

Outputs Configuration for S3 Experiment

The outputs.tf file is where we define the output values for our Terraform configuration.

The output values are used to extract information from the resources created by Terraform.

Question 5

Create a file outputs.tf:

Info!

You don't need to modify the outputs.tf file for this experiment!

output "bucket_name" {
  description = "Name of the created S3 bucket"
  value       = aws_s3_bucket.experiment_bucket.bucket
}

output "bucket_arn" {
  description = "ARN of the created S3 bucket"
  value       = aws_s3_bucket.experiment_bucket.arn
}

output "bucket_region" {
  description = "Region of the created S3 bucket"
  value       = aws_s3_bucket.experiment_bucket.region
}

output "student_id" {
  description = "Student identifier for this deployment"
  value       = var.student_id
}
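
Outputs are also convenient in shell scripts. The -json flag emits every output as machine-readable JSON, which can then be parsed with a tool such as jq (assuming you have jq installed):

$ terraform output -json
$ terraform output -json | jq -r '.bucket_name.value'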

Deploy the S3 Bucket

Question 6

Now let's deploy our first Terraform infrastructure:

Step 1: Initialize Terraform

The terraform init command initializes a Terraform working directory, downloading the providers required by the configuration.

$ terraform init
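
After initialization, the working directory gains a .terraform/ directory (the downloaded provider plugins) and a .terraform.lock.hcl file (pinned provider versions). Once all four files from the questions above exist, a listing should look roughly like this:

$ ls -a
.  ..  .terraform  .terraform.lock.hcl  main.tf  outputs.tf  terraform.tfvars  variables.tf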

Step 2: Validate the configuration

The terraform validate command checks that the configuration files in the directory are syntactically valid and internally consistent.

$ terraform validate

Step 3: Format the configuration

The terraform fmt command rewrites Terraform configuration files into a canonical format and style.

$ terraform fmt

Step 4: Plan the deployment

The terraform plan command creates an execution plan, showing what actions Terraform will take to change the infrastructure.

Info!

It won't make any changes to your infrastructure, just show you what will happen.

$ terraform plan
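
Optionally, the plan can be saved to a file and then applied exactly as reviewed, which is useful for audited or automated workflows (tfplan is an arbitrary file name):

$ terraform plan -out=tfplan
$ terraform apply tfplan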

Step 5: Apply the configuration

When we run the terraform apply command, Terraform will create the resources defined in the configuration files.

This means that the S3 bucket will be created in your AWS account.

Warning!

Type yes when prompted to confirm the deployment.

$ terraform apply
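
For non-interactive runs (e.g., CI pipelines), the confirmation prompt can be skipped with the -auto-approve flag; use it with care:

$ terraform apply -auto-approve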

Verify S3 Bucket Creation

Question 7

After the deployment completes, verify your bucket was created using AWS CLI:

Step 1: View Terraform outputs

Let's check the outputs from our Terraform deployment.

$ terraform output

Step 2: List all S3 buckets and filter for yours

To show your S3 buckets, use the following command:

Info!

Not all buckets were created by Terraform.

Some may exist from previous classes!

$ aws s3 ls --profile mlops

Step 3: Filter buckets by your student ID

To filter the S3 buckets by your student ID, use the following command:

$ aws s3api list-buckets --query "Buckets[?contains(Name, '$(terraform output -raw student_id)')].{Name:Name,CreationDate:CreationDate}" --output table --profile mlops

Step 4: Get detailed information about your bucket

$ BUCKET_NAME=$(terraform output -raw bucket_name)
$ aws s3api head-bucket --bucket $BUCKET_NAME --profile mlops
$ aws s3api get-bucket-location --bucket $BUCKET_NAME --profile mlops
$ aws s3api get-bucket-versioning --bucket $BUCKET_NAME --profile mlops
$ aws s3api get-bucket-encryption --bucket $BUCKET_NAME --profile mlops

Test S3 Bucket Functionality

Let's test our bucket by uploading and downloading a file:

Question 8

Create a simple text file to upload to our S3 bucket.

$ echo "Hello from Terraform experiment!" > test-file.txt
$ echo "Student ID: $(terraform output -raw student_id)" >> test-file.txt
$ echo "Bucket Name: $(terraform output -raw bucket_name)" >> test-file.txt
$ echo "Created on: $(date)" >> test-file.txt
$ cat test-file.txt

Question 9

Upload the file to S3

$ BUCKET_NAME=$(terraform output -raw bucket_name)
$ aws s3 cp test-file.txt s3://$BUCKET_NAME/test-file.txt --profile mlops

Question 10

List bucket contents

$ aws s3 ls s3://$BUCKET_NAME/ --profile mlops

Question 11

Download the file with a new name

$ aws s3 cp s3://$BUCKET_NAME/test-file.txt downloaded-file.txt --profile mlops
$ cat downloaded-file.txt

Understanding Terraform State

Now, we are going to explore Terraform state management.

Question 12

View the state file:

$ terraform show

Question 13

List resources in state:

$ terraform state list

Question 14

View specific resource details:

$ terraform state show aws_s3_bucket.experiment_bucket
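
By default, this experiment keeps its state in a local terraform.tfstate file in the working directory. It can contain sensitive values, so don't commit it to version control. In team settings, state usually lives in a remote backend instead. A minimal sketch, assuming a pre-existing bucket named my-terraform-state (hypothetical, and not needed for this class):

terraform {
  backend "s3" {
    bucket = "my-terraform-state" # hypothetical bucket; must already exist
    key    = "experiments/s3/terraform.tfstate"
    region = "us-east-2"
  }
}

Switching an existing project to a remote backend requires re-initializing with terraform init -migrate-state.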

Make Changes and Update

Let's modify our infrastructure to understand how Terraform handles changes.

Question 15

Add a lifecycle rule to the bucket

Update your main.tf file by adding this resource after the existing S3 bucket resources:

# S3 Bucket lifecycle configuration
resource "aws_s3_bucket_lifecycle_configuration" "experiment_bucket" {
  bucket = aws_s3_bucket.experiment_bucket.id

  rule {
    id     = "delete_old_versions"
    status = "Enabled"

    filter {
      prefix = ""
    }

    noncurrent_version_expiration {
      noncurrent_days = 30
    }
  }

  rule {
    id     = "delete_incomplete_multipart_uploads"
    status = "Enabled"

    filter {
      prefix = ""
    }

    abort_incomplete_multipart_upload {
      days_after_initiation = 7
    }
  }
}

Question 16

Plan the changes. Which resources are being created or modified?

$ terraform plan

Question 17

Apply the changes.

$ terraform apply

Question 18

Verify the lifecycle configuration.

$ BUCKET_NAME=$(terraform output -raw bucket_name)
$ aws s3api get-bucket-lifecycle-configuration --bucket $BUCKET_NAME --profile mlops

Clean Up S3 Experiment

When you're ready to move to the next experiment, clean up these resources!

Question 19

Remove test files from bucket.

$ BUCKET_NAME=$(terraform output -raw bucket_name)
$ aws s3 rm s3://$BUCKET_NAME/test-file.txt --profile mlops

Question 20

Remove all objects and versions from bucket (required for versioned buckets).

$ BUCKET_NAME=$(terraform output -raw bucket_name)
$ aws s3api delete-objects --bucket $BUCKET_NAME --delete "$(aws s3api list-object-versions --bucket $BUCKET_NAME --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --profile mlops)" --profile mlops
$ aws s3api delete-objects --bucket $BUCKET_NAME --delete "$(aws s3api list-object-versions --bucket $BUCKET_NAME --query='{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --profile mlops)" --profile mlops
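
To confirm the bucket is now truly empty (no object versions or delete markers remain), run the listing on its own:

$ aws s3api list-object-versions --bucket $BUCKET_NAME --profile mlops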

Alternative: Manual Bucket Cleanup

If terraform output is not available or if you need to clean up manually, you can use these commands instead:

Attention

Replace BUCKET_NAME with your actual bucket name

$ aws s3 ls --profile mlops | grep terraform-experiment
$ BUCKET_NAME="terraform-experiment-your-student-id-xxxxxxxx"
$ aws s3api delete-objects --bucket $BUCKET_NAME --delete "$(aws s3api list-object-versions --bucket $BUCKET_NAME --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --profile mlops)" --profile mlops
$ aws s3api delete-objects --bucket $BUCKET_NAME --delete "$(aws s3api list-object-versions --bucket $BUCKET_NAME --query='{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --profile mlops)" --profile mlops
$ aws s3 rb s3://$BUCKET_NAME --force --profile mlops

Question 21

Destroy the infrastructure.

Bucket Not Empty Error

If you get a "BucketNotEmpty" error during terraform destroy, it means there are still objects or versions in the bucket.

Use the commands above to remove all objects and versions before running terraform destroy again.

Info!

Type yes when prompted to confirm the destruction.

$ terraform destroy

Question 22

Verify bucket deletion.

Attention

After terraform destroy, the outputs are removed from the state, so terraform output -raw student_id no longer works here. Substitute your student ID literally:

$ aws s3api list-buckets --query "Buckets[?contains(Name, 'your-username-here')]" --profile mlops
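
You can also check Terraform's own view: after a successful destroy, the state contains no resources, so this command should print nothing:

$ terraform state list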

Now, you can go to the next activity!