Merge pull request #1 from sparkfabrik/0000-add-cloud-sql-monitors
refs #0000 Add Cloud SQL monitoring
Showing 10 changed files with 352 additions and 21 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,56 @@ | ||
# Terraform Module Template | ||
# Terraform GCP Services Monitoring Module | ||
|
||
This project can be used as a template for the initial stub of a Terraform | ||
module. | ||
This module creates a set of monitoring alerts for Google Cloud Platform services. | ||
|
||
We suggest following Terraform best practices as described in https://www.terraform-best-practices.com/code-structure. | ||
Supported services: | ||
|
||
- Cloud SQL | ||
- CPU usage | ||
- Storage usage | ||
- Memory usage | ||
|
||
<!-- BEGIN_TF_DOCS --> | ||
## Providers | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="provider_google"></a> [google](#provider\_google) | >= 5.33 | | ||
|
||
## Requirements | ||
|
||
| Name | Version | | ||
|------|---------| | ||
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.5 | | ||
| <a name="requirement_google"></a> [google](#requirement\_google) | >= 5.33 | | ||
|
||
## Inputs | ||
|
||
| Name | Description | Type | Default | Required | | ||
|------|-------------|------|---------|:--------:| | ||
| <a name="input_auto_close"></a> [auto\_close](#input\_auto\_close) | n/a | `string` | `"86400s"` | no | | ||
| <a name="input_cloud_sql"></a> [cloud\_sql](#input\_cloud\_sql) | n/a | <pre>object({<br> project = optional(string, null)<br> auto_close = optional(string, null)<br> notification_channels = optional(list(string), [])<br> instances = optional(map(object({<br> cpu_utilization = optional(list(object({<br> severity = optional(string, "CRITICAL"),<br> threshold = optional(number, 0.90)<br> alignment_period = optional(string, "120s")<br> duration = optional(string, "300s")<br> })), [<br> {<br> severity = "WARNING",<br> threshold = 0.85,<br> duration = "1200s",<br> },<br> {<br> severity = "CRITICAL",<br> threshold = 1,<br> duration = "300s",<br> alignment_period = "60s",<br> }<br> ])<br> memory_utilization = optional(list(object({<br> severity = optional(string, "CRITICAL"),<br> threshold = optional(number, 0.90)<br> alignment_period = optional(string, "300s")<br> duration = optional(string, "300s")<br> })), [<br> {<br> severity = "WARNING",<br> threshold = 0.80,<br> },<br> {<br> severity = "CRITICAL",<br> threshold = 0.90,<br> }<br> ])<br> disk_utilization = optional(list(object({<br> severity = optional(string, "CRITICAL"),<br> threshold = optional(number, 0.90)<br> alignment_period = optional(string, "300s")<br> duration = optional(string, "600s")<br> })), [<br> {<br> severity = "WARNING",<br> threshold = 0.85,<br> },<br> {<br> severity = "CRITICAL",<br> threshold = 0.95, <br> }<br> ])<br> })), {})<br> })</pre> | n/a | yes | | ||
| <a name="input_notification_channels"></a> [notification\_channels](#input\_notification\_channels) | n/a | `list(string)` | `[]` | no | | ||
| <a name="input_project"></a> [project](#input\_project) | n/a | `string` | `null` | no | | ||
|
||
## Outputs | ||
|
||
| Name | Description | | ||
|------|-------------| | ||
| <a name="output_cloud_sql_cpu_utilization"></a> [cloud\_sql\_cpu\_utilization](#output\_cloud\_sql\_cpu\_utilization) | n/a | | ||
| <a name="output_cloud_sql_disk_utilization"></a> [cloud\_sql\_disk\_utilization](#output\_cloud\_sql\_disk\_utilization) | n/a | | ||
| <a name="output_cloud_sql_memory_utilization"></a> [cloud\_sql\_memory\_utilization](#output\_cloud\_sql\_memory\_utilization) | n/a | | ||
|
||
## Resources | ||
|
||
| Name | Type | | ||
|------|------| | ||
| [google_monitoring_alert_policy.cloud_sql_cpu_utilization](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource | | ||
| [google_monitoring_alert_policy.cloud_sql_disk_utilization](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource | | ||
| [google_monitoring_alert_policy.cloud_sql_memory_utilization](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy) | resource | | ||
|
||
## Modules | ||
|
||
No modules. | ||
|
||
|
||
<!-- END_TF_DOCS --> |
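
The inputs documented above can be wired together as in the following minimal sketch. The project ID, notification channel name, and instance name are hypothetical placeholders, not values taken from this repository; the module source is the one used in the example further down in this commit.

```hcl
# Minimal usage sketch; all literal values below are hypothetical placeholders.
module "gcp_services_monitoring" {
  source = "github.com/sparkfabrik/terraform-google-services-monitoring"

  # Module-level defaults, used unless overridden inside `cloud_sql`.
  project = "my-gcp-project"
  notification_channels = [
    "projects/my-gcp-project/notificationChannels/1234567890",
  ]

  # An empty object per instance enables the default CPU, memory,
  # and disk utilization alerts declared in the `cloud_sql` variable.
  cloud_sql = {
    instances = {
      "my-sql-instance" = {}
    }
  }
}
```

With the defaults from the `cloud_sql` variable, each instance listed this way should get six alert policies: a WARNING and a CRITICAL policy for each of the three metrics.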
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
# ---------------------- | ||
# CloudSQL | ||
# ---------------------- | ||
locals { | ||
# Use the cloud_sql project if specified, otherwise use the project. | ||
cloud_sql_project = var.cloud_sql.project != null ? var.cloud_sql.project : var.project | ||
|
||
# Use the cloud_sql notification channels if specified, otherwise use the module-level notification_channels. | ||
cloud_sql_notification_channels = length(var.cloud_sql.notification_channels) > 0 ? var.cloud_sql.notification_channels : var.notification_channels | ||
|
||
# Use the cloud_sql auto_close if specified, otherwise use the auto_close. | ||
cloud_sql_auto_close = var.cloud_sql.auto_close != null ? var.cloud_sql.auto_close : var.auto_close | ||
|
||
cloud_sql_cpu_utilization = { | ||
for item in flatten( | ||
[ | ||
for instance, instance_config in var.cloud_sql.instances : [ | ||
for cpu_utilization in instance_config.cpu_utilization : | ||
merge( | ||
{ | ||
"instance" : instance, | ||
}, | ||
cpu_utilization | ||
) | ||
] | ||
] | ||
) : "${item.instance}--${item.severity}--${item.threshold}" => item | ||
} | ||
|
||
cloud_sql_memory_utilization = { | ||
for item in flatten( | ||
[ | ||
for instance, instance_config in var.cloud_sql.instances : [ | ||
for memory_utilization in instance_config.memory_utilization : | ||
merge( | ||
{ | ||
"instance" : instance, | ||
}, | ||
memory_utilization | ||
) | ||
] | ||
] | ||
) : "${item.instance}--${item.severity}--${item.threshold}" => item | ||
} | ||
|
||
cloud_sql_disk_utilization = { | ||
for item in flatten( | ||
[ | ||
for instance, instance_config in var.cloud_sql.instances : [ | ||
for disk_utilization in instance_config.disk_utilization : | ||
merge( | ||
{ | ||
"instance" : instance, | ||
}, | ||
disk_utilization | ||
) | ||
] | ||
] | ||
) : "${item.instance}--${item.severity}--${item.threshold}" => item | ||
} | ||
} | ||
|
||
# ---------------------- | ||
# CloudSQL CPU utilization | ||
# ---------------------- | ||
resource "google_monitoring_alert_policy" "cloud_sql_cpu_utilization" { | ||
for_each = local.cloud_sql_cpu_utilization | ||
|
||
display_name = "${local.cloud_sql_project} ${each.value.instance} - CPU utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
combiner = "OR" | ||
severity = each.value.severity | ||
|
||
conditions { | ||
condition_threshold { | ||
filter = "resource.type = \"cloudsql_database\" AND resource.labels.database_id = \"${local.cloud_sql_project}:${each.value.instance}\" AND metric.type = \"cloudsql.googleapis.com/database/cpu/utilization\"" | ||
comparison = "COMPARISON_GT" | ||
threshold_value = each.value.threshold | ||
duration = each.value.duration | ||
trigger { | ||
count = 1 | ||
} | ||
aggregations { | ||
alignment_period = each.value.alignment_period | ||
per_series_aligner = "ALIGN_MEAN" | ||
} | ||
} | ||
display_name = "${local.cloud_sql_project} ${each.value.instance} - CPU utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
} | ||
alert_strategy { | ||
auto_close = local.cloud_sql_auto_close | ||
} | ||
notification_channels = local.cloud_sql_notification_channels | ||
} | ||
|
||
# ---------------------- | ||
# CloudSQL Memory utilization | ||
# ---------------------- | ||
resource "google_monitoring_alert_policy" "cloud_sql_memory_utilization" { | ||
for_each = local.cloud_sql_memory_utilization | ||
|
||
display_name = "${local.cloud_sql_project} ${each.value.instance} - Memory utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
combiner = "OR" | ||
severity = each.value.severity | ||
conditions { | ||
display_name = "${local.cloud_sql_project} ${each.value.instance} - Memory utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
condition_threshold { | ||
filter = "resource.type = \"cloudsql_database\" AND resource.labels.database_id = \"${local.cloud_sql_project}:${each.value.instance}\" AND metric.type = \"cloudsql.googleapis.com/database/memory/utilization\"" | ||
duration = each.value.duration | ||
comparison = "COMPARISON_GT" | ||
threshold_value = each.value.threshold | ||
|
||
aggregations { | ||
alignment_period = each.value.alignment_period | ||
per_series_aligner = "ALIGN_MEAN" | ||
} | ||
} | ||
} | ||
|
||
alert_strategy { | ||
auto_close = local.cloud_sql_auto_close | ||
} | ||
|
||
notification_channels = local.cloud_sql_notification_channels | ||
} | ||
|
||
# ---------------------- | ||
# CloudSQL disk utilization | ||
# ---------------------- | ||
resource "google_monitoring_alert_policy" "cloud_sql_disk_utilization" { | ||
for_each = local.cloud_sql_disk_utilization | ||
|
||
display_name = "${local.cloud_sql_project} ${each.value.instance} - Disk utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
combiner = "OR" | ||
severity = each.value.severity | ||
|
||
conditions { | ||
display_name = "${local.cloud_sql_project} ${each.value.instance} - Disk utilization ${each.value.severity} ${each.value.threshold * 100}%" | ||
condition_threshold { | ||
filter = "resource.type = \"cloudsql_database\" AND resource.labels.database_id = \"${local.cloud_sql_project}:${each.value.instance}\" AND metric.type = \"cloudsql.googleapis.com/database/disk/utilization\"" | ||
duration = each.value.duration | ||
comparison = "COMPARISON_GT" | ||
threshold_value = each.value.threshold | ||
|
||
aggregations { | ||
alignment_period = each.value.alignment_period | ||
per_series_aligner = "ALIGN_MEAN" | ||
} | ||
} | ||
} | ||
|
||
alert_strategy { | ||
auto_close = local.cloud_sql_auto_close | ||
} | ||
notification_channels = local.cloud_sql_notification_channels | ||
} |
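
The three `locals` maps above all follow the same flatten-and-key pattern. The sketch below reproduces it in isolation for the CPU rules, so the resulting `for_each` keys are easier to see. The instance name `db-main` and the names `instances`, `cpu_alerts`, and `cpu_alert_keys` are hypothetical; the rule values mirror the defaults declared for `cpu_utilization`.

```hcl
locals {
  # Hypothetical input, shaped like var.cloud_sql.instances once the
  # optional() defaults have been applied.
  instances = {
    "db-main" = {
      cpu_utilization = [
        { severity = "WARNING", threshold = 0.85, alignment_period = "120s", duration = "1200s" },
        { severity = "CRITICAL", threshold = 1, alignment_period = "60s", duration = "300s" },
      ]
    }
  }

  # One map entry per (instance, rule) pair, keyed by instance, severity,
  # and threshold so that each entry can drive one alert policy via for_each.
  cpu_alerts = {
    for item in flatten([
      for instance, cfg in local.instances : [
        for rule in cfg.cpu_utilization : merge({ instance = instance }, rule)
      ]
    ]) : "${item.instance}--${item.severity}--${item.threshold}" => item
  }
}

output "cpu_alert_keys" {
  value = keys(local.cpu_alerts)
}
```

Under these assumptions, `cpu_alert_keys` should contain `db-main--CRITICAL--1` and `db-main--WARNING--0.85`. One consequence of this keying scheme is that two rules with the same severity and threshold on the same instance would produce the same key, which Terraform rejects as a duplicate key in the `for` expression.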
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,51 @@ | ||
/* | ||
# A simple example of how to use this module | ||
*/ | ||
|
||
locals { | ||
# Enable all Cloud SQL monitoring on the selected instances, e.g.: | ||
cloud_sql = { | ||
instances = { | ||
(google_sql_database_instance.master.name) = {} | ||
(google_sql_database_instance.stage.name) = {} | ||
} | ||
} | ||
|
||
# Use custom Cloud SQL cpu monitoring on google_sql_database_instance.master.name | ||
# Use all default Cloud SQL monitoring on google_sql_database_instance.stage.name | ||
# cloud_sql = { | ||
# instances = { | ||
# (google_sql_database_instance.master.name) = { | ||
# cpu_utilization = [{ | ||
# severity = "CRITICAL" | ||
# threshold = 0.90 | ||
# }] | ||
# } | ||
# (google_sql_database_instance.stage.name) = {} | ||
# } | ||
# } | ||
|
||
# Disable Cloud SQL monitoring | ||
# cloud_sql = { | ||
# instances = {} | ||
# } | ||
|
||
# Enable default Cloud SQL monitoring on instance google_sql_database_instance.master.name | ||
# Disable cpu utilization monitoring on instance google_sql_database_instance.stage.name | ||
# cloud_sql = { | ||
# instances = { | ||
# (google_sql_database_instance.master.name) = {} | ||
# (google_sql_database_instance.stage.name) = { cpu_utilization = [] } | ||
# } | ||
# } | ||
|
||
} | ||
|
||
module "example" { | ||
source = "github.com/sparkfabrik/terraform-module-template" | ||
source = "github.com/sparkfabrik/terraform-google-services-monitoring" | ||
version = ">= 0.1.0" | ||
|
||
name = var.name | ||
notification_channels = var.notification_channels | ||
project = var.project | ||
cloud_sql = local.cloud_sql | ||
} |
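
The `cloud_sql` object also accepts its own `project`, `auto_close`, and `notification_channels`, which take precedence over the module-level values for the Cloud SQL alerts. The following sketch shows one possible override; every literal value (project ID, channel name, instance name, thresholds) and the local name `cloud_sql_custom` are hypothetical.

```hcl
locals {
  # Hypothetical override of the module-level settings for Cloud SQL only.
  cloud_sql_custom = {
    project    = "my-database-project"
    auto_close = "3600s" # close incidents after 1h instead of the 24h default
    notification_channels = [
      "projects/my-database-project/notificationChannels/1234567890",
    ]

    instances = {
      "db-main" = {
        # Supplying a list replaces the default WARNING/CRITICAL pair for
        # this metric; omitted attributes fall back to the per-attribute
        # defaults (alignment_period "300s", duration "600s" for disk).
        disk_utilization = [
          { severity = "WARNING", threshold = 0.70 },
        ]
      }
    }
  }
}
```

Passing `local.cloud_sql_custom` as the `cloud_sql` argument should then create a single disk alert for `db-main`, while CPU and memory keep their default rules.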
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,6 @@ | ||
name = "SimpleExample" | ||
project = "Simple project" | ||
|
||
notification_channels = [ | ||
"cloud_support_email", | ||
"slack-channel" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,10 @@ | ||
variable "name" { | ||
type = string | ||
description = "Describe what this variable is used for." | ||
|
||
variable "project" { | ||
type = string | ||
default = "" | ||
} | ||
|
||
variable "notification_channels" { | ||
type = list(string) | ||
default = [] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +0,0 @@ | ||
resource "google_storage_bucket" "example" { | ||
name = var.name | ||
location = "EU" | ||
} | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,11 @@ | ||
output "example" { | ||
value = google_storage_bucket.example.name | ||
description = "The name of the resource." | ||
output "cloud_sql_disk_utilization" { | ||
value = { for k, v in google_monitoring_alert_policy.cloud_sql_disk_utilization : k => v.name } | ||
} | ||
|
||
output "cloud_sql_memory_utilization" { | ||
value = { for k, v in google_monitoring_alert_policy.cloud_sql_memory_utilization : k => v.name } | ||
} | ||
|
||
output "cloud_sql_cpu_utilization" { | ||
value = { for k, v in google_monitoring_alert_policy.cloud_sql_cpu_utilization : k => v.name } | ||
} |
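
Each output above is a map from the `for_each` key to the alert policy `name`. A caller could consume it as in this sketch, which assumes the module is instantiated as `module.example` and that a hypothetical instance named `db-main` is monitored; the output name is also illustrative.

```hcl
# Collect the CPU alert policy names that belong to one instance.
# Keys have the form "<instance>--<severity>--<threshold>".
output "db_main_cpu_policy_names" {
  value = [
    for key, policy_name in module.example.cloud_sql_cpu_utilization :
    policy_name if startswith(key, "db-main--")
  ]
}
```

Filtering on the key prefix avoids hard-coding a full key such as `db-main--WARNING--0.85`, which would break whenever a threshold changes.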
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,72 @@ | ||
variable "name" { | ||
type = string | ||
description = "Describe what this variable is used for." | ||
variable "project" { | ||
type = string | ||
default = null | ||
} | ||
|
||
variable "notification_channels" { | ||
type = list(string) | ||
default = [] | ||
} | ||
|
||
variable "auto_close" { | ||
type = string | ||
default = "86400s" # 24h | ||
} | ||
|
||
variable "cloud_sql" { | ||
type = object({ | ||
project = optional(string, null) | ||
auto_close = optional(string, null) | ||
notification_channels = optional(list(string), []) | ||
instances = optional(map(object({ | ||
cpu_utilization = optional(list(object({ | ||
severity = optional(string, "CRITICAL"), | ||
threshold = optional(number, 0.90) | ||
alignment_period = optional(string, "120s") | ||
duration = optional(string, "300s") | ||
})), [ | ||
{ | ||
severity = "WARNING", | ||
threshold = 0.85, | ||
duration = "1200s", | ||
}, | ||
{ | ||
severity = "CRITICAL", | ||
threshold = 1, | ||
duration = "300s", | ||
alignment_period = "60s", | ||
} | ||
]) | ||
memory_utilization = optional(list(object({ | ||
severity = optional(string, "CRITICAL"), | ||
threshold = optional(number, 0.90) | ||
alignment_period = optional(string, "300s") | ||
duration = optional(string, "300s") | ||
})), [ | ||
{ | ||
severity = "WARNING", | ||
threshold = 0.80, | ||
}, | ||
{ | ||
severity = "CRITICAL", | ||
threshold = 0.90, | ||
} | ||
]) | ||
disk_utilization = optional(list(object({ | ||
severity = optional(string, "CRITICAL"), | ||
threshold = optional(number, 0.90) | ||
alignment_period = optional(string, "300s") | ||
duration = optional(string, "600s") | ||
})), [ | ||
{ | ||
severity = "WARNING", | ||
threshold = 0.85, | ||
}, | ||
{ | ||
severity = "CRITICAL", | ||
threshold = 0.95, | ||
} | ||
]) | ||
})), {}) | ||
}) | ||
} |