
Dynamic Blocking

Anyone who works with log management knows the situation: a rogue process, or a bug introduced into an application, creates a flood of logs that overloads the log management system and pushes the account to its quota limit. Dynamic Blocking can help prevent this.

Dynamic Blocking is a scripted solution that uses API calls to cap the amount of data sent to Coralogix from a specific application/subsystem, so that the account does not reach its daily quota and get blocked.

 

How to Implement Dynamic Blocking

We recommend running the script every 30 minutes. You can adapt this interval, but be aware that the number of API calls per minute is limited. The limits are listed in the Elastic API guide.

The script checks which subsystems have passed their defined threshold and creates an appropriate block rule. One rule is created for all subsystems associated with the same application.

A second script needs to run each day at midnight UTC. Its role is to disable the dynamic block rules that were created, which restarts the flow of logs into the team.

This document provides the Elasticsearch (ES) and Rules API calls needed to implement Dynamic Blocking. You can wrap these calls with additional logic in the implementation environment of your choice.

 

Query the Offending Subsystems

{
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {
                    "range": {
                        "coralogix.timestamp": {
                            "gte": "now/d",
                            "lt": "now"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "Application Name": {
            "terms": {
                "field": "coralogix.metadata.applicationName",
                "size": 30
            },
            "aggs": {
                "subsystemName": {
                    "terms": {
                        "field": "coralogix.metadata.subsystemName",
                        "min_doc_count": 100000,
                        "size": 200
                    }
                }
            }
        }
    }
}

This is an ES API query. It uses min_doc_count as the daily log-count threshold for a subsystem (100,000 is a placeholder). The query returns all subsystems that have sent at least min_doc_count logs since midnight UTC, up to 200 subsystems for each of up to 30 applications. The result is aggregated by application name and lists the subsystems to block.
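As a rough illustration of how a script might consume the result, the sketch below walks the standard Elasticsearch terms-aggregation response shape (aggregations, then buckets, then key) using the aggregation names from the query above. The function name offending_subsystems and the sample data are hypothetical.

```python
from typing import Dict, List


def offending_subsystems(response: dict) -> Dict[str, List[str]]:
    """Map each application name to the subsystems returned for it."""
    offenders: Dict[str, List[str]] = {}
    app_buckets = (
        response.get("aggregations", {})
        .get("Application Name", {})
        .get("buckets", [])
    )
    for app in app_buckets:
        sub_buckets = app.get("subsystemName", {}).get("buckets", [])
        subsystems = [bucket["key"] for bucket in sub_buckets]
        # An application bucket can come back with no subsystem over the
        # threshold; there is nothing to block for it, so skip it.
        if subsystems:
            offenders[app["key"]] = subsystems
    return offenders


# Hypothetical response, trimmed to the fields the function reads.
sample = {
    "aggregations": {
        "Application Name": {
            "buckets": [
                {
                    "key": "billing",
                    "doc_count": 450000,
                    "subsystemName": {
                        "buckets": [
                            {"key": "api", "doc_count": 250000},
                            {"key": "worker", "doc_count": 150000},
                        ]
                    },
                },
                {
                    "key": "search",
                    "doc_count": 9000,
                    "subsystemName": {"buckets": []},
                },
            ]
        }
    }
}

print(offending_subsystems(sample))
```

The resulting mapping (application to list of subsystems) is the input the blocking step needs: one rule per application, covering all of its listed subsystems.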

 

Get All Team Rules

curl --location --request GET 'https://api.coralogix.com/api/v1/rules' \
--header 'Content-Type: application/json' \
--header 'Cache-Control: no-cache' \
--header 'Authorization: Bearer API_KEY'
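
If the script is written in Python rather than shell, the same call could be prepared with the standard library roughly as follows. This is a sketch only: the URL and headers are copied from the curl command above, build_get_rules_request is a hypothetical helper name, and the request is built but not sent.

```python
import urllib.request

# URL copied from the curl call above.
RULES_URL = "https://api.coralogix.com/api/v1/rules"


def build_get_rules_request(api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) the 'get all team rules' request."""
    return urllib.request.Request(
        RULES_URL,
        headers={
            "Content-Type": "application/json",
            "Cache-Control": "no-cache",
            "Authorization": f"Bearer {api_key}",
        },
        method="GET",
    )


req = build_get_rules_request("API_KEY")
# To execute it, pass the request to urllib.request.urlopen(req)
# and decode the body with json.load.
```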

Create a Group of Rules

curl --location --request POST 'https://api.coralogix.com/api/v1/external/actions/rule' \
--header 'Content-Type: application/json' \
--header 'Cache-Control: no-cache' \
--header 'Authorization: Bearer API_KEY' \
--data-raw '{
	"Name":"auto_APP_NAME"
}'

 

Create a Rule

curl --location --request POST 'https://api.coralogix.com/api/v1/external/action/rule/GROUP_ID' \
--header 'Content-Type: application/json' \
--header 'Cache-Control: no-cache' \
--header 'Authorization: Bearer API_KEY' \
--data-raw '
{
    "Name": "auto_APP_NAME",
    "Description": "DO NOT CHANGE, Created via API",
    "Enabled": true,
    "RuleMatchers": [
        {
            "field": "applicationName",
            "constraint": "APP_NAME"
        }
    ],
    "Rule": "\\bSUBSYSTEM_NAME_01\\b",
    "SourceField": "subsystemName",
    "Type": "block",
    "KeepBlockedLogs":false
}'

 

Update a Rule

curl --location --request PUT 'https://api.coralogix.com/api/v1/external/action/RULE_ID/rule/GROUP_ID' \
--header 'Content-Type: application/json' \
--header 'Cache-Control: no-cache' \
--header 'Authorization: Bearer API_KEY' \
--data-raw '{
	"Name":"auto_APP_NAME",
	"Enabled": true,
	"Rule": "\\bSUBSYSTEM_NAME_01\\b|\\bSUBSYSTEM_NAME_02\\b|\\bSUBSYSTEM_NAME_03\\b",
	"SourceField": "subsystemName",
	"Type": "block"
}'
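
The "Rule" value above is an alternation of subsystem names, each wrapped in \b word boundaries. A minimal sketch of generating it and the request body might look like this; build_block_regex and build_update_payload are hypothetical helper names, and the use of re.escape assumes the rule engine accepts Python-style escaping of special characters.

```python
import json
import re
from typing import Iterable


def build_block_regex(subsystems: Iterable[str]) -> str:
    """Join subsystem names into one \\bNAME\\b|\\bNAME\\b alternation."""
    unique = sorted(set(subsystems))  # de-duplicate, keep the output stable
    return "|".join(rf"\b{re.escape(name)}\b" for name in unique)


def build_update_payload(app_name: str, subsystems: Iterable[str]) -> str:
    """Serialize the body of the 'update rule' call shown above."""
    body = {
        "Name": f"auto_{app_name}",
        "Enabled": True,
        "Rule": build_block_regex(subsystems),
        "SourceField": "subsystemName",
        "Type": "block",
    }
    # json.dumps doubles the backslashes, giving the \\b form seen in the curl body.
    return json.dumps(body)


payload = build_update_payload(
    "APP_NAME", ["SUBSYSTEM_NAME_02", "SUBSYSTEM_NAME_01"]
)
print(payload)
```

Building the whole regex first means a single update call per application is enough, which helps stay within the API rate limits.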

 

Disable a Rule

curl --location --request PUT 'https://api.coralogix.com/api/v1/external/action/RULE_ID/rule/GROUP_ID' \
--header 'Content-Type: application/json' \
--header 'Cache-Control: no-cache' \
--header 'Authorization: Bearer API_KEY' \
--data-raw '{
	"Name":"auto_APP_NAME",
	"Enabled": false,
	"Rule": "AUTODISABLED",
	"SourceField": "subsystemName",
	"Type": "block"
}'

 

Block Rule Example

Now let’s combine the building blocks above into a script. It is written as pseudo-code so that you can implement it in the environment of your choice and add your own logic.

Run the “get all team rules” call to return a list of all existing team rules.

For each {application} {
    If (application_dynamicBlock rules group exists) {
        Save the group ID and rule ID
    } Else {
        Use the “group create” call to create a new group of rules, name it application_dynamicBlock
        Use the “rule create” call to create the rule, name it application_dynamicBlock as well
        Save the group ID and rule ID
    }
    For each {subsystem} {
        Use the “rule update” call, add the subsystem name to the rule’s regex
    }
}

Rule example:

\bSUB_SYSTEM_1\b|\bSUB_SYSTEM_2\b|\bSUB_SYSTEM_3\b
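
One possible way to express the blocking loop in Python is sketched below. The three callables stand in for the “get all team rules”, “group create” plus “rule create”, and “rule update” API calls; their names and signatures are assumptions made for this illustration, not part of the Coralogix API. The rule name follows the application_dynamicBlock convention from the pseudo-code.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

RuleRef = Tuple[str, str]  # (group_id, rule_id), both needed by the update URL


def apply_blocks(
    offenders: Dict[str, List[str]],
    find_rule: Callable[[str], Optional[RuleRef]],
    create_group_and_rule: Callable[[str], RuleRef],
    update_rule: Callable[[RuleRef, str, str], None],
) -> None:
    """Ensure one block rule per application and point it at its offenders."""
    for app, subsystems in offenders.items():
        rule_name = f"{app}_dynamicBlock"
        ref = find_rule(rule_name)
        if ref is None:
            # No group/rule yet for this application: create both.
            ref = create_group_and_rule(rule_name)
        # Build the full regex once so each application needs one update call.
        regex = "|".join(rf"\b{re.escape(name)}\b" for name in subsystems)
        update_rule(ref, rule_name, regex)


# Dry run with in-memory stand-ins instead of real API calls.
existing = {"billing_dynamicBlock": ("g1", "r1")}
log: List[tuple] = []
apply_blocks(
    {"billing": ["api"], "search": ["indexer", "crawler"]},
    find_rule=existing.get,
    create_group_and_rule=lambda name: ("g2", "r2"),
    update_rule=lambda ref, name, regex: log.append((ref, name, regex)),
)
print(log)
```

Passing the API calls in as functions keeps the decision logic testable without network access; the real implementations would wrap the calls listed earlier.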

 

At 00:00 UTC

Run the “get all team rules” call to return a list of all existing team rules.

For each {rule} {
    If (rule name contains the constant value used for dynamic block rules, e.g. dynamicBlock) {
        Save the group ID and rule ID
    }
}

For each {saved rule} {
    Run the “disable rule” call to disable the rule
}
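
The midnight job can be sketched in the same way. The shape of the rule records below (dictionaries with "Name", "Id" and "GroupId" keys) is an assumption for illustration only, since the response format of the “get all team rules” call is not shown here; the payload fields mirror the “Disable a Rule” call above.

```python
from typing import Dict, List

MARKER = "_dynamicBlock"  # constant part of every dynamic block rule name


def rules_to_disable(rules: List[dict], marker: str = MARKER) -> List[dict]:
    """Keep only the rules whose name carries the dynamic-block marker."""
    return [rule for rule in rules if marker in rule.get("Name", "")]


def disable_payload(rule_name: str) -> Dict[str, object]:
    """Body of the 'disable rule' call, mirroring the curl example above."""
    return {
        "Name": rule_name,
        "Enabled": False,
        "Rule": "AUTODISABLED",
        "SourceField": "subsystemName",
        "Type": "block",
    }


# Hypothetical rule records; the key names are assumed.
team_rules = [
    {"Name": "billing_dynamicBlock", "Id": "r1", "GroupId": "g1"},
    {"Name": "parse-nginx", "Id": "r7", "GroupId": "g4"},
]

for rule in rules_to_disable(team_rules):
    print(rule["GroupId"], rule["Id"], disable_payload(rule["Name"]))
```

Filtering on the name marker is what keeps the job from touching rules that were created by hand.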

 

Notes:

  1. You cannot have more than one block rule per group. Each group contains one rule that blocks all offending subsystems associated with one specific application.
  2. Coralogix limits the number of rules per team. The default is 30 rules. Make sure that you have enough rules left in your account.
  3. This script uses Coralogix’s RuleMatchers option, which is currently not supported via the UI. Do not change the rule via the UI, because doing so will disable the option.
  4. Blocked data is not necessarily lost. You can use our new soft block feature, which lets you query these logs in real time through Livetail and archive them in your S3 bucket. With this option, 70% of the blocked log volume is not counted toward the quota. To enable it, set the “KeepBlockedLogs” parameter to true in the “create rule” API call.
