dubiety/terraform-aws-elasticsearch-cloudwatch-sns-alarms

Terraform module that configures important alarms for AWS Elasticsearch and sends them to an SNS topic

HCL, Makefile
Star and fork statistics for the dubiety/terraform-aws-elasticsearch-cloudwatch-sns-alarms repository: as of 27 Apr 2024, this repository has 28 stars and 44 forks.

# terraform-aws-elasticsearch-cloudwatch-sns-alarms

Terraform module that configures the recommended Amazon Elasticsearch alarms using CloudWatch and sends alerts to an SNS topic. By default, this module creates an SNS topic, but it can be configured to point to an existing SNS topic (see the second example under Examples).

- v1.x supports Terraform v0.12+ syntax.
- v2.0+ supports Terraform v1.0+ syntax.

This project is inspired by CloudPosse. It is 100% open source and licensed under the Apache License 2.0.

## Metrics and Alarms

| Area | Metric | Operator | Threshold | Rationale |
|------|--------|----------|-----------|-----------|
| Sharding | ClusterStatus.red | >= | 1 | At least one primary shard and its replicas are not allocated to a node. |
| Sharding | ClusterStatus.yellow | >= | 1 | At least one replica shard is not allocated to a node. |
| Storage | FreeStorageSpace | <= | 20480 MB | A node in your cluster is running low on free storage space. This alarm uses the Minimum statistic, so it triggers per node in your cluster. This logic is based on the AWS recommended alarms. It does not, however, alarm on the aggregate free space remaining. |
| Storage | FreeStorageSpaceTotal | <= | 20480 MB | The overall free disk space is low. This alarm uses the Sum across all your nodes, which can be useful on multi-node clusters. Disabled by default; to enable it, set `monitor_free_storage_space_total_too_low` to `true` and set `free_storage_space_total_threshold`. The recommended threshold is the number of nodes in your cluster multiplied by `free_storage_space_threshold`. |
| Storage | ClusterIndexWritesBlocked | >= | 1 | Your cluster is blocking write requests. |
| Node Count | Nodes | < | x | At least one node in your cluster has been unreachable for one day (x is the configured `min_available_nodes`). |
| Snapshot | AutomatedSnapshotFailure | >= | 1 | An automated snapshot failed. This failure is often the result of a red cluster health status. |
| CPU | CPUUtilization | >= | 80 % | 100% CPU utilization isn't uncommon, but sustained high usage is problematic. Consider using larger instance types or more instances. |
| Memory | JVMMemoryPressure | >= | 80 % | The cluster could encounter out-of-memory errors if usage increases. Consider scaling vertically. |
| CPU | MasterCPUUtilization | >= | 80 % | Consider using larger instance types for your dedicated master nodes. |
| Memory | MasterJVMMemoryPressure | >= | 80 % | Consider using larger instance types for your dedicated master nodes. |
| KMS | KMSKeyError | >= | 1 | The KMS encryption key used to encrypt data at rest in your domain is disabled. Re-enable it to restore normal operations. |
| KMS | KMSKeyInaccessible | >= | 1 | The KMS encryption key used to encrypt data at rest in your domain has been deleted, or its grants to Amazon ES have been revoked. |

For more information, see the recommended Amazon Elasticsearch alarms.

## Examples

See the examples/ directory for working examples.

```hcl
resource "aws_elasticsearch_domain" "es" {
  domain_name           = "example"
  elasticsearch_version = "7.10"

  cluster_config {
    instance_type = "r4.large.elasticsearch"
  }

  snapshot_options {
    automated_snapshot_start_hour = 23
  }

  tags = {
    Domain = "TestDomain"
  }
}

module "es_alarms" {
  source      = "git::https://github.com/dubiety/terraform-aws-elasticsearch-cloudwatch-sns-alarms.git?ref=master"
  domain_name = "example"

  tags = {
    Domain = "TestDomain"
  }
}
```

Alternatively, you can have this module skip creating an SNS topic if you already have one created elsewhere.

```hcl
module "es_alarms" {
  source      = "git::https://github.com/dubiety/terraform-aws-elasticsearch-cloudwatch-sns-alarms.git?ref=master"
  domain_name = "example"

  # Put your full SNS topic ARN here; if necessary, read it from a variable or a resource.
  sns_topic        = "arn:aws:sns:us-east-1:123456123456:sns-to-slack"
  create_sns_topic = false

  tags = {
    Domain = "TestDomain"
  }
}
```

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| domain_name | The Elasticsearch domain name you want to monitor | string | - | yes |
| cluster_type | The type of cluster, single or multi-node | string | `"single"` | no |
| alarm_name_postfix | Alarm name postfix | string | `""` | no |
| alarm_name_prefix | Alarm name prefix | string | `""` | no |
| create_sns_topic | Whether to create an SNS topic. If you set this to `false`, you MUST set `sns_topic` to a FULL ARN | bool | `true` | no |
| sns_topic | The SNS topic to use. If left empty, the name is built from a prefix with a timestamp appended. If `create_sns_topic` is set to `false`, this MUST be a FULL ARN | string | `""` | no |
| sns_topic_postfix | SNS topic postfix | string | `""` | no |
| sns_topic_prefix | SNS topic prefix | string | `""` | no |
| tags | Tags to associate with all created resources | map | `{}` | no |
| cpu_utilization_threshold | The maximum percentage of CPU utilization | string | `80` | no |
| free_storage_space_threshold | The minimum amount of available storage space in MiB | string | `20480` | no |
| jvm_memory_pressure_threshold | The maximum percentage of the Java heap used for all data nodes in the cluster | string | `80` | no |
| master_cpu_utilization_threshold | The maximum percentage of CPU utilization of master nodes | string | `""` | no |
| master_jvm_memory_pressure_threshold | The maximum percentage of the Java heap used for master nodes in the cluster | string | `""` | no |
| min_available_nodes | The minimum number of available (reachable) nodes; set to non-zero to enable the alarm | string | `0` | no |
| monitor_automated_snapshot_failure | Enable monitoring of automated snapshot failures | bool | `true` | no |
| monitor_cluster_status_is_red | Enable monitoring of red cluster status | bool | `true` | no |
| monitor_cluster_status_is_yellow | Enable monitoring of yellow cluster status | bool | `true` | no |
| monitor_cluster_index_writes_blocked | Enable monitoring of blocked cluster index writes | bool | `true` | no |
| monitor_cpu_utilization_too_high | Enable monitoring of CPU utilization being too high | bool | `true` | no |
| monitor_free_storage_space_too_low | Enable monitoring of minimum per-node free storage being too low | bool | `true` | no |
| monitor_free_storage_space_total_too_low | Enable monitoring of cluster total free storage being too low | bool | `false` | no |
| monitor_jvm_memory_pressure_too_high | Enable monitoring of JVM memory pressure being too high | bool | `true` | no |
| monitor_kms | Enable monitoring of KMS-related metrics; enable if using KMS | bool | `false` | no |
| monitor_master_cpu_utilization_too_high | Enable monitoring of master node CPU utilization being too high. Only enable this when dedicated master is enabled | bool | `false` | no |
| monitor_master_jvm_memory_pressure_too_high | Enable monitoring of master node JVM memory pressure being too high. Only enable this when dedicated master is enabled | bool | `false` | no |
| monitor_min_available_nodes | Enable monitoring of minimum available nodes | bool | `true` | no |
| alarm_automated_snapshot_failure_periods | The number of periods in which automated snapshots must fail before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_cluster_status_is_red_periods | The number of periods the cluster status must be red before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_cluster_status_is_yellow_periods | The number of periods the cluster status must be yellow before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_cluster_index_writes_blocked_periods | The number of periods cluster index writes must be blocked before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_cpu_utilization_too_high_periods | The number of periods CPU usage must be too high before alerting; raise this to make the alarm less noisy | number | `3` | no |
| alarm_free_storage_space_too_low_periods | The number of periods per-node free storage must be too low before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_free_storage_space_total_too_low_periods | The number of periods total free storage must be too low before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_jvm_memory_pressure_too_high_periods | The number of periods JVM memory pressure must be in the alarmed state before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_kms_periods | The number of periods a KMS metric must indicate failure before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_master_cpu_utilization_too_high_periods | The number of periods master node CPU usage must be too high before alerting; raise this to make the alarm less noisy | number | `3` | no |
| alarm_master_jvm_memory_pressure_too_high_periods | The number of periods master node JVM memory pressure must be in the alarmed state before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_min_available_nodes_periods | The number of periods the available node count must be below the threshold before alerting; raise this to make the alarm less noisy | number | `1` | no |
| alarm_min_available_nodes_period | The period, in seconds, over which the statistic is applied for the minimum available nodes alarm | string | `86400` | no |
| alarm_automated_snapshot_failure_period | The period, in seconds, over which the statistic is applied for the automated snapshot failure alarm | string | `60` | no |
| alarm_cluster_index_writes_blocked_period | The period, in seconds, over which the statistic is applied for the cluster index writes blocked alarm | string | `300` | no |
| alarm_cluster_status_is_red_period | The period, in seconds, over which the statistic is applied for the red cluster status alarm | string | `60` | no |
| alarm_cluster_status_is_yellow_period | The period, in seconds, over which the statistic is applied for the yellow cluster status alarm | string | `60` | no |
| alarm_cpu_utilization_too_high_period | The period, in seconds, over which the statistic is applied for the CPU utilization alarm | string | `900` | no |
| alarm_free_storage_space_too_low_period | The period, in seconds, over which the statistic is applied for the per-node minimum free storage alarm | string | `60` | no |
| alarm_free_storage_space_total_too_low_period | The period, in seconds, over which the statistic is applied for the cluster total free storage alarm | string | `60` | no |
| alarm_jvm_memory_pressure_too_high_period | The period, in seconds, over which the statistic is applied for the JVM memory pressure alarm | string | `900` | no |
| alarm_kms_period | The period, in seconds, over which the statistic is applied for the KMS-related alarms | string | `60` | no |
| alarm_master_cpu_utilization_too_high_period | The period, in seconds, over which the statistic is applied for the master node CPU utilization alarm | string | `900` | no |
| alarm_master_jvm_memory_pressure_too_high_period | The period, in seconds, over which the statistic is applied for the master node JVM memory pressure alarm | string | `900` | no |

## Outputs

| Name | Description |
|------|-------------|
| sns_topic_arn | The ARN of the SNS topic |
| sns_topic_name | The SNS topic name |

## Share the Love

Like this project? Please give it a ★ on our GitHub!

## Help

Got a question? File a GitHub issue.

## Bug Reports & Feature Requests

Please use the issue tracker to report any bugs or file feature requests.

## License

See LICENSE for full details.

Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.