Merge pull request #1 from skyscrapers/init
Add Dynamodb, Kinesis and Lambda modules
simonrondelez authored Mar 20, 2019
2 parents 26b4ff1 + b2bec14 commit 368d0dd
Showing 7 changed files with 308 additions and 0 deletions.
61 changes: 61 additions & 0 deletions README.md
@@ -0,0 +1,61 @@
# terraform-cloudwatch

Terraform modules that set up CloudWatch alarms and send their notifications to an SNS topic. This repository contains the following modules:

* `kinesis`: Creates alarms for a Kinesis stream. This is used in the `skyscrapers/terraform-kinesis` module.
* `dynamodb`: Creates the alarms needed for a DynamoDB table.
* `lambda_function`: Creates the alarms for a Lambda function.

## kinesis

Creates alarms for a Kinesis stream. This is used in the `skyscrapers/terraform-kinesis` module.

The following resources are created:

* CloudWatch alarms for the Kinesis stream passed in as a variable

### Available variables

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| kinesis\_iterator\_age\_error\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"1"` | no |
| kinesis\_iterator\_age\_error\_period | The period in seconds over which the specified stat is applied. | string | `"300"` | no |
| kinesis\_iterator\_age\_error\_threshold | The value against which the specified statistic is compared. | string | `"1000000"` | no |
| kinesis\_stream\_name | Name of the kinesis stream to monitor | string | n/a | yes |
| kinesis\_write\_throughput\_exceeded\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"6"` | no |
| kinesis\_write\_throughput\_exceeded\_period | The period in seconds over which the specified stat is applied. | string | `"300"` | no |
| kinesis\_write\_throughput\_exceeded\_threshold | The value against which the specified statistic is compared. | string | `"10"` | no |
| sns\_topic\_arn | ARN of the SNS topic you want the alerts to be sent to | string | n/a | yes |
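
A minimal usage sketch, assuming the module is consumed from the `kinesis` subdirectory of this repository on GitHub. The `source` path, the `aws_sns_topic.alerts` and `aws_kinesis_stream.events` resources, and all names below are illustrative assumptions rather than part of the module:

```hcl
# Hypothetical SNS topic that receives the alarm notifications.
resource "aws_sns_topic" "alerts" {
  name = "cloudwatch-alerts"
}

# Hypothetical stream to monitor.
resource "aws_kinesis_stream" "events" {
  name        = "events"
  shard_count = 1
}

module "kinesis_alarms" {
  # Assumed source path; adjust it to wherever this module actually lives.
  source = "github.com/skyscrapers/terraform-cloudwatch//kinesis"

  kinesis_stream_name = "${aws_kinesis_stream.events.name}"
  sns_topic_arn       = "${aws_sns_topic.alerts.arn}"

  # Optional override. GetRecords.IteratorAgeMilliseconds is reported in
  # milliseconds, so 600000 alarms once the maximum iterator age within a
  # period exceeds 10 minutes.
  kinesis_iterator_age_error_threshold = "600000"
}
```

All other variables fall back to the defaults listed in the table above.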

## dynamodb

Creates the alarms needed for a DynamoDB table.

### Available variables

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| dynamodb\_table\_name | Name of the dynamodb table to monitor | string | n/a | yes |
| dynamodb\_throttle\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"1"` | no |
| dynamodb\_throttle\_period | The period in seconds over which the specified stat is applied. | string | `"60"` | no |
| dynamodb\_throttle\_threshold | The value against which the specified statistic is compared. | string | `"0"` | no |
| sns\_topic\_arn | ARN of the SNS topic you want the alerts to be sent to | string | n/a | yes |
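
A minimal usage sketch, assuming the module is consumed from the `dynamodb` subdirectory of this repository on GitHub. The `source` path, the `alerts_topic_arn` variable, and the table name are hypothetical:

```hcl
# Hypothetical input: ARN of an already existing SNS topic.
variable "alerts_topic_arn" {
  description = "ARN of the SNS topic that receives alarm notifications"
}

module "orders_table_alarms" {
  # Assumed source path; adjust it to wherever this module actually lives.
  source = "github.com/skyscrapers/terraform-cloudwatch//dynamodb"

  dynamodb_table_name = "orders" # hypothetical table name
  sns_topic_arn       = "${var.alerts_topic_arn}"

  # Optional override: only alarm after 3 consecutive 60-second periods
  # that each contain throttled requests, instead of the default of 1.
  dynamodb_throttle_evaluation_periods = 3
}
```

Raising the number of evaluation periods trades faster detection for fewer notifications on short throttling bursts.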

## lambda_function

Creates the alarms for a Lambda function.

### Available variables

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| lambda\_function | Name of the lambda function to monitor | string | n/a | yes |
| lambda\_invocation\_error\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"1"` | no |
| lambda\_invocation\_error\_period | The period in seconds over which the specified stat is applied. | string | `"60"` | no |
| lambda\_invocation\_error\_threshold | The value against which the specified statistic is compared. | string | `"5"` | no |
| lambda\_iterator\_age\_error\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"1"` | no |
| lambda\_iterator\_age\_error\_period | The period in seconds over which the specified stat is applied. | string | `"60"` | no |
| lambda\_iterator\_age\_error\_threshold | The value against which the specified statistic is compared. | string | `"1000000"` | no |
| lambda\_throttle\_error\_evaluation\_periods | The number of periods over which data is compared to the specified threshold. | string | `"1"` | no |
| lambda\_throttle\_error\_period | The period in seconds over which the specified stat is applied. | string | `"60"` | no |
| lambda\_throttle\_error\_threshold | The value against which the specified statistic is compared. | string | `"0"` | no |
| sns\_topic\_arn | ARN of the SNS topic you want the alerts to be sent to | string | n/a | yes |
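
A minimal usage sketch, assuming the module is consumed from the `lambda_function` subdirectory of this repository on GitHub. The `source` path, the SNS topic name, and the function name are hypothetical:

```hcl
# Look up an existing SNS topic by name (the name is hypothetical).
data "aws_sns_topic" "alerts" {
  name = "cloudwatch-alerts"
}

module "processor_alarms" {
  # Assumed source path; adjust it to wherever this module actually lives.
  source = "github.com/skyscrapers/terraform-cloudwatch//lambda_function"

  lambda_function = "stream-processor" # hypothetical function name
  sns_topic_arn   = "${data.aws_sns_topic.alerts.arn}"

  # Optional overrides: alarm on more than 10 errors within 5 minutes
  # instead of the default of more than 5 errors within 60 seconds.
  lambda_invocation_error_threshold = "10"
  lambda_invocation_error_period    = "300"
}
```

Lambda reports the `IteratorAge` metric only for functions that read from a stream event source, so for other functions the iterator age alarm would be expected to stay in the insufficient data state.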
19 changes: 19 additions & 0 deletions dynamodb/main.tf
@@ -0,0 +1,19 @@
// Dynamo requests are being throttled.
resource "aws_cloudwatch_metric_alarm" "dynamo_table_throttles" {
alarm_name = "${var.dynamodb_table_name}_throttles"
alarm_description = "Requests are being throttled to the ${var.dynamodb_table_name} table"
namespace = "AWS/DynamoDB"
metric_name = "ThrottledRequests"
statistic = "Sum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.dynamodb_throttle_threshold}"
period = "${var.dynamodb_throttle_period}"
evaluation_periods = "${var.dynamodb_throttle_evaluation_periods}"

alarm_actions = ["${var.sns_topic_arn}"]
ok_actions = ["${var.sns_topic_arn}"]

dimensions = {
TableName = "${var.dynamodb_table_name}"
}
}
25 changes: 25 additions & 0 deletions dynamodb/variables.tf
@@ -0,0 +1,25 @@
variable "sns_topic_arn" {
description = "ARN of the SNS topic you want the alerts to be sent to"
}

// Dynamodb table name
variable "dynamodb_table_name" {
type = "string"
description = "Name of the dynamodb table to monitor"
}

//Alarm settings
variable "dynamodb_throttle_threshold" {
default = 0
description = "The value against which the specified statistic is compared."
}

variable "dynamodb_throttle_period" {
default = 60
description = "The period in seconds over which the specified stat is applied."
}

variable "dynamodb_throttle_evaluation_periods" {
default = 1
description = "The number of periods over which data is compared to the specified threshold."
}
43 changes: 43 additions & 0 deletions kinesis/main.tf
@@ -0,0 +1,43 @@
// Kinesis: Iterator Age
resource "aws_cloudwatch_metric_alarm" "_kinesis_iterator_age" {
alarm_name = "${var.kinesis_stream_name}_high_iterator_age"
alarm_description = "The Get iterator age of ${var.kinesis_stream_name} is starting to lag behind"
namespace = "AWS/Kinesis"
metric_name = "GetRecords.IteratorAgeMilliseconds"
statistic = "Maximum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.kinesis_iterator_age_error_threshold}"
evaluation_periods = "${var.kinesis_iterator_age_error_evaluation_periods}"
period = "${var.kinesis_iterator_age_error_period}"

alarm_actions = ["${var.sns_topic_arn}"]
ok_actions = ["${var.sns_topic_arn}"]
insufficient_data_actions = ["${var.sns_topic_arn}"]

dimensions = {
StreamName = "${var.kinesis_stream_name}"
}
}

// Kinesis: Write Throughput Exceeded
resource "aws_cloudwatch_metric_alarm" "_kinesis_write_exceeded" {
alarm_name = "${var.kinesis_stream_name}_write_exceeded"
alarm_description = "There are more writes going into ${var.kinesis_stream_name} than the shards can handle"
namespace = "AWS/Kinesis"
metric_name = "WriteProvisionedThroughputExceeded"
statistic = "Sum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.kinesis_write_throughput_exceeded_threshold}"
evaluation_periods = "${var.kinesis_write_throughput_exceeded_evaluation_periods}"
period = "${var.kinesis_write_throughput_exceeded_period}"

alarm_actions = ["${var.sns_topic_arn}"]
ok_actions = ["${var.sns_topic_arn}"]
insufficient_data_actions = ["${var.sns_topic_arn}"]

dimensions = {
StreamName = "${var.kinesis_stream_name}"
}
}
41 changes: 41 additions & 0 deletions kinesis/variables.tf
@@ -0,0 +1,41 @@
variable "sns_topic_arn" {
description = "ARN of the SNS topic you want the alerts to be sent to"
}

// Kinesis stream name
variable "kinesis_stream_name" {
description = "Name of the kinesis stream to monitor"
type = "string"
}

// Kinesis Iterator Age Alarm Settings
variable "kinesis_iterator_age_error_threshold" {
description = "The value against which the specified statistic is compared."
default = "1000000"
}

variable "kinesis_iterator_age_error_evaluation_periods" {
description = "The number of periods over which data is compared to the specified threshold."
default = "1"
}

variable "kinesis_iterator_age_error_period" {
description = "The period in seconds over which the specified stat is applied."
default = "300"
}

// Kinesis Write Throughput Alarm Settings
variable "kinesis_write_throughput_exceeded_threshold" {
description = "The value against which the specified statistic is compared."
default = "10"
}

variable "kinesis_write_throughput_exceeded_evaluation_periods" {
description = "The number of periods over which data is compared to the specified threshold."
default = "6"
}

variable "kinesis_write_throughput_exceeded_period" {
description = "The period in seconds over which the specified stat is applied."
default = "300"
}
61 changes: 61 additions & 0 deletions lambda_function/main.tf
@@ -0,0 +1,61 @@
// Set up CloudWatch alarms for the Lambda function, aggregated across all of its aliases (environments).
// Send alerts to the given SNS topic.

// Lambda: Invocation Errors
resource "aws_cloudwatch_metric_alarm" "streamalert_lambda_invocation_errors" {
alarm_name = "${var.lambda_function}_invocation_errors"
alarm_description = "More than ${var.lambda_invocation_error_threshold} errors on ${var.lambda_function} within ${var.lambda_invocation_error_period} seconds"
namespace = "AWS/Lambda"
metric_name = "Errors"
statistic = "Sum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.lambda_invocation_error_threshold}"
evaluation_periods = "${var.lambda_invocation_error_evaluation_periods}"
period = "${var.lambda_invocation_error_period}"

alarm_actions = ["${var.sns_topic_arn}"]

dimensions = {
FunctionName = "${var.lambda_function}"
}
}

// Lambda: Throttles
resource "aws_cloudwatch_metric_alarm" "streamalert_lambda_throttles" {
alarm_name = "${var.lambda_function}_throttles"
alarm_description = "Lambda function ${var.lambda_function} is being throttled"
namespace = "AWS/Lambda"
metric_name = "Throttles"
statistic = "Sum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.lambda_throttle_error_threshold}"
evaluation_periods = "${var.lambda_throttle_error_evaluation_periods}"
period = "${var.lambda_throttle_error_period}"

alarm_actions = ["${var.sns_topic_arn}"]
ok_actions = ["${var.sns_topic_arn}"]

dimensions = {
FunctionName = "${var.lambda_function}"
}
}

// Lambda: IteratorAge
resource "aws_cloudwatch_metric_alarm" "streamalert_lambda_iterator_age" {
alarm_name = "${var.lambda_function}_iterator_age"
alarm_description = "Lambda High Iterator Age for ${var.lambda_function}"
namespace = "AWS/Lambda"
metric_name = "IteratorAge"
statistic = "Maximum"
comparison_operator = "GreaterThanThreshold"
threshold = "${var.lambda_iterator_age_error_threshold}"
evaluation_periods = "${var.lambda_iterator_age_error_evaluation_periods}"
period = "${var.lambda_iterator_age_error_period}"

alarm_actions = ["${var.sns_topic_arn}"]
ok_actions = ["${var.sns_topic_arn}"]

dimensions = {
FunctionName = "${var.lambda_function}"
}
}
58 changes: 58 additions & 0 deletions lambda_function/variables.tf
@@ -0,0 +1,58 @@
variable "sns_topic_arn" {
description = "ARN of the SNS topic you want the alerts to be sent to"
}

// Lambda Function name
variable "lambda_function" {
type = "string"
description = "Name of the lambda function to monitor"
}

// Lambda Invocation Error Alarm Settings

variable "lambda_invocation_error_threshold" {
default = "5"
description = "The value against which the specified statistic is compared."
}

variable "lambda_invocation_error_evaluation_periods" {
default = "1"
description = "The number of periods over which data is compared to the specified threshold."
}

variable "lambda_invocation_error_period" {
default = "60"
description = "The period in seconds over which the specified stat is applied."
}

// Lambda Throttling Alarm Settings
variable "lambda_throttle_error_threshold" {
default = "0"
description = "The value against which the specified statistic is compared."
}

variable "lambda_throttle_error_evaluation_periods" {
default = "1"
description = "The number of periods over which data is compared to the specified threshold."
}

variable "lambda_throttle_error_period" {
default = "60"
description = "The period in seconds over which the specified stat is applied."
}

// Lambda Iterator Age Alarm Settings
variable "lambda_iterator_age_error_threshold" {
default = "1000000"
description = "The value against which the specified statistic is compared."
}

variable "lambda_iterator_age_error_evaluation_periods" {
default = "1"
description = "The number of periods over which data is compared to the specified threshold."
}

variable "lambda_iterator_age_error_period" {
default = "60"
description = "The period in seconds over which the specified stat is applied."
}
