📡 Building a Linux Web Server with Terraform & Ansible – Part 13: Alerts

To wrap up this series, we’re going to set up monitoring and alerts so that we find out quickly when our server stops being healthy or responsive in production.

Way back in Part 1, we enabled monitoring in our Terraform configuration by setting monitoring = true on our droplet. This installed the DigitalOcean metrics agent, which collects CPU, memory, and disk metrics from our server.
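As a quick reminder, the relevant part of the droplet resource might look roughly like the sketch below. Only the resource name digitalocean_droplet.web and the monitoring = true line come from this series; the name, image, region, and size values are placeholders, so keep whatever you chose in Part 1.

```hcl
resource "digitalocean_droplet" "web" {
  name   = "web"              # placeholder, use your own value from Part 1
  image  = "ubuntu-22-04-x64" # placeholder
  region = "fra1"             # placeholder
  size   = "s-1vcpu-1gb"      # placeholder

  # Enables the DigitalOcean metrics agent that the alerts below rely on
  monitoring = true
}
```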

Now it’s time to use that data to create alerts that notify us when something goes wrong.


⚙️ Step 1: Add a New Terraform File

Create a new file for our monitoring configuration:

cd terraform
touch monitoring.tf

✏️ Step 2: Add the Monitoring Configuration

Here’s what monitoring.tf should contain:

# 🔹 CPU Usage Alert (Triggers if CPU > 80% for 5 minutes)
resource "digitalocean_monitor_alert" "cpu_high" {
  description = "High CPU Usage"
  type        = "v1/insights/droplet/cpu"
  compare     = "GreaterThan"
  value       = 80
  window      = "5m"
  enabled     = true
  entities    = [digitalocean_droplet.web.id]

  alerts {
    email = [var.email]
  }
}

# 🔹 Memory Usage Alert (Triggers if RAM > 90% for 10 minutes)
resource "digitalocean_monitor_alert" "memory_high" {
  description = "High Memory Usage"
  type        = "v1/insights/droplet/memory_utilization_percent"
  compare     = "GreaterThan"
  value       = 90
  window      = "10m"
  enabled     = true
  entities    = [digitalocean_droplet.web.id]

  alerts {
    email = [var.email]
  }
}

# 🔹 Disk I/O Alert (Triggers if the disk read rate stays above the threshold for 5 minutes)
resource "digitalocean_monitor_alert" "disk_io_high" {
  description = "High Disk I/O"
  type        = "v1/insights/droplet/disk_read"
  compare     = "GreaterThan"
  value       = 5  # MB/s: the DigitalOcean API docs list disk_read in MBps, not bytes per second. Adjust based on usage
  window      = "5m"
  enabled     = true
  entities    = [digitalocean_droplet.web.id]

  alerts {
    email = [var.email]
  }
}

# Create a new check for the target endpoint in a specific region
resource "digitalocean_uptime_check" "this" {
  name    = "Blog uptime check"
  target  = "https://${var.hostname}"
  regions = ["eu_west"]
}

resource "digitalocean_uptime_alert" "this" {
  name       = "latency-alert"
  check_id   = digitalocean_uptime_check.this.id
  type       = "latency"
  threshold  = 300
  comparison = "greater_than"
  period     = "2m"
  notifications {
    email = [var.email]
  }
}
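The configuration above relies on var.email and var.hostname. If you haven’t declared them in an earlier part of the series, a minimal sketch of the declarations could look like this. The descriptions are my own wording, and the choice of file (for example variables.tf) is an assumption. Skip this if the variables already exist, since Terraform rejects duplicate declarations.

```hcl
# Hypothetical declarations; skip if these variables were already defined earlier.
variable "email" {
  description = "Email address that receives alert notifications"
  type        = string
}

variable "hostname" {
  description = "Public hostname of the site, without the https:// prefix"
  type        = string
}
```

The values can then be supplied through a terraform.tfvars file or with -var flags, whichever you have used so far.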

🧠 What Each Alert Does

  • CPU Alert: Notifies you when CPU usage averages above 80% over a 5-minute window.
  • Memory Alert: Triggers if memory usage stays over 90% for 10 minutes.
  • Disk I/O Alert: Watches for high disk read activity (tune the threshold as needed).
  • Uptime Check: Probes your site over HTTPS from outside your infrastructure (here from the eu_west region). The check itself sends no notifications; the alerts attached to it do.
  • Latency Alert: Notifies you if response time from Europe exceeds 300 ms for over 2 minutes.
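Note that the only alert attached to the uptime check is a latency alert. If we also want an email when the site stops responding altogether, a second alert on the same check could look like the sketch below. The down type is one of the alert types the provider documents; that it needs only a period, with no threshold or comparison, is my assumption, so check the provider docs before applying. The resource name "down" is just an illustrative choice.

```hcl
# Sketch only: notifies when the uptime check reports the site as down.
resource "digitalocean_uptime_alert" "down" {
  name     = "down-alert"
  check_id = digitalocean_uptime_check.this.id
  type     = "down"
  period   = "2m" # assumed to be sufficient for a "down" alert

  notifications {
    email = [var.email]
  }
}
```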

These are lightweight defaults, but a great starting point. You can always add more alerts for disk usage, load average, or custom ports.
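For example, disk usage and load average alerts could be sketched like this, following the same pattern as the alerts above. The metric types are taken from the DigitalOcean alert policy types; the resource names and the thresholds (85% and a load of 2) are illustrative assumptions that you should tune for your droplet size.

```hcl
# Sketch: alert when the disk is more than 85% full for 10 minutes
resource "digitalocean_monitor_alert" "disk_usage_high" {
  description = "High Disk Usage"
  type        = "v1/insights/droplet/disk_utilization_percent"
  compare     = "GreaterThan"
  value       = 85 # illustrative threshold
  window      = "10m"
  enabled     = true
  entities    = [digitalocean_droplet.web.id]

  alerts {
    email = [var.email]
  }
}

# Sketch: alert when the 5-minute load average stays above 2
resource "digitalocean_monitor_alert" "load_high" {
  description = "High Load Average"
  type        = "v1/insights/droplet/load_5"
  compare     = "GreaterThan"
  value       = 2 # illustrative; roughly scale with the number of vCPUs
  window      = "5m"
  enabled     = true
  entities    = [digitalocean_droplet.web.id]

  alerts {
    email = [var.email]
  }
}
```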


🛰️ Step 3: Apply the Changes

Run the usual Terraform commands:

terraform plan
terraform apply

Once applied, the alerts will show up in your DigitalOcean monitoring dashboard.
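If you’d like to confirm from the command line that the resources were created, one option is to expose their identifiers as outputs. This is a sketch: the output names are my own, and it assumes the resource names used in monitoring.tf above.

```hcl
# Sketch: print the identifiers of the monitoring resources after apply
output "monitor_alert_uuids" {
  description = "UUIDs of the droplet alert policies"
  value = {
    cpu     = digitalocean_monitor_alert.cpu_high.uuid
    memory  = digitalocean_monitor_alert.memory_high.uuid
    disk_io = digitalocean_monitor_alert.disk_io_high.uuid
  }
}

output "uptime_check_id" {
  description = "ID of the uptime check"
  value       = digitalocean_uptime_check.this.id
}
```

After the next terraform apply, running terraform output should list these values.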


🧪 Optional: Trigger a Test Alert

If you want to see an alert in action, you can stress the CPU temporarily using the stress tool:

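sudo apt update && sudo apt install -y stress
stress --cpu "$(nproc)" --timeout 600

This keeps every CPU core at 100% for 10 minutes, comfortably longer than the alert’s 5-minute window, so the alert should trigger. (With --cpu 1 on a multi-core droplet, overall CPU usage might never reach the 80% threshold.)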


✅ Summary (This Article)

In this final article, we:

  • Created infrastructure-level monitoring alerts using Terraform
  • Set up CPU, memory, and disk I/O alerts
  • Added an uptime and latency check for external monitoring
  • Learned how to test alerts and get notified by email

Your app is now monitored, secure, and production-ready 🎉


🏁 Series Recap

Let’s recap what we accomplished in this series:

  1. 🧱 Provisioned a droplet with Terraform
  2. 🔐 Secured the server with Ansible (SSH hardening, fail2ban, UFW)
  3. 🐍 Deployed a Flask API with Gunicorn + Nginx
  4. 🚀 Automated deployments with GitHub Actions
  5. 🐬 Installed MySQL and connected the app to a real database
  6. 🌍 Configured DNS records using Terraform
  7. 🔒 Set up SSL certificates with Certbot and Let’s Encrypt
  8. 📡 Monitored the app and created automatic alerts

Not only is your Flask app now live, but you’ve built a full DevOps pipeline using Terraform, Ansible, GitHub Actions, and DigitalOcean—a real-world setup used by teams in production.


🔚 Final Thoughts

This series was designed to help you practice key DevOps skills:

  • Infrastructure-as-Code
  • Server automation
  • App deployment
  • CI/CD
  • Security and observability

Whether you’re a junior cloud engineer, a solo developer, or just DevOps-curious, I hope this series helped you level up.

🫶 Thanks for following along.

And if you want to dive deeper—Kubernetes, logging stacks, advanced scaling, or containerization—just let me know 👇
