At PromptCloud, we deal with data on a massive scale daily, with millions of records and logs being written into Elasticsearch in near real-time. Because our architecture is built to scale, this volume of data keeps increasing. We have run into many undesirable situations in the past: unexpected behaviour in the data, servers going down, exhausted disk space, extreme CPU load on the servers, and so on. In a distributed system of this scale, it gradually becomes nearly impossible to detect anomalies and inconsistencies in the data within a reasonable time. We realized that we needed a monitoring tool to supervise the massive amount of data stored in Elasticsearch, so we added ElastAlert to our technical stack as the alerting tool for data monitoring.
What is ElastAlert?
ElastAlert is a nifty framework used for sending out alerts on anomalies, spikes, or other patterns of interest from data stored in Elasticsearch. If your system demands writing data into Elasticsearch in near real-time and you want to be alerted based on preset rules, ElastAlert can be the best option for you. It even works with all versions of Elasticsearch.
ElastAlert is very easy to set up, purely event-driven, modular, and highly reliable. A simple demonstration of how to install ElastAlert is given here.
In simple words, ElastAlert’s job is to search for a particular pattern in the bulk of data being written into Elasticsearch and send out an alert whenever it detects that pattern. The pattern is written by the system administrator and is called an ElastAlert rule.
A simple example of an ElastAlert rule for service monitoring is given below:
Say we would like to know all the different URIs that took more than 20 seconds to serve a request over a period of 2 weeks.
# All requests in 2 weeks with more than 20 seconds to serve requests
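Only the comment line of the rule is shown here, so the following is a minimal sketch of what such a rule might look like, not the exact rule we run. The rule name, the index pattern, and the field names `uri` and `request_time` are assumptions made for illustration; the option names (`type`, `index`, `num_events`, `timeframe`, `query_key`, `filter`, `alert`) are standard ElastAlert rule options.

```yaml
name: slow-requests        # hypothetical rule name
type: frequency            # match when num_events occur within timeframe
index: logstash-*          # assumed index pattern; adjust to your setup

num_events: 1              # a single slow request is enough to match
timeframe:
  weeks: 2                 # the window described in the text

query_key: uri             # assumed field; events are counted per URI

filter:
- range:
    request_time:          # assumed field holding the serve time in seconds
      gt: 20

alert:
- debug                    # logs the match; swap in email, slack, etc.
```

The `debug` alerter only logs matches, which makes it a safe way to try a rule before wiring it to a real notification channel.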
There are numerous other important use cases where ElastAlert can be used with custom rule types.
Several rule types with common monitoring paradigms are pre-built into ElastAlert:
- Match instances where there are at least X events in Y time (frequency type)
- Match if the rate of events increases or decreases (spike type)
- Match if there are fewer than X events in Y time (flatline type)
- Match if a certain field matches a blacklist/whitelist (blacklist and whitelist type)
- Find and match any event matching a provided filter (any type)
- Match if a single field has multiple values within some time (change type)
- Match if a previously unseen term appears in a field (new_term type)
- Match if the number of unique values for a field is above or below a certain threshold (cardinality type)
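To make the list above more concrete, here is a rough sketch of the type-specific options for a few of these rule types. The thresholds and the field names (`status`, `client_ip`) are made up for illustration. Each fragment would live in its own rule file alongside the usual `name`, `index`, `filter`, and `alert` settings; the `---` separators are only used here to keep the fragments apart.

```yaml
# frequency: at least 50 events within 5 minutes
type: frequency
num_events: 50
timeframe:
  minutes: 5
---
# spike: the current hour has 3x the events of the previous hour
type: spike
spike_height: 3
spike_type: up             # "up", "down", or "both"
timeframe:
  hours: 1
---
# flatline: fewer than 10 events in 15 minutes
type: flatline
threshold: 10
timeframe:
  minutes: 15
---
# blacklist: the assumed field "status" takes a listed value
type: blacklist
compare_key: status
blacklist:
- "500"
- "503"
---
# cardinality: more than 100 unique values of the assumed field "client_ip" in an hour
type: cardinality
cardinality_field: client_ip
max_cardinality: 100
timeframe:
  hours: 1
```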
If the rules are written correctly and for genuine use cases, ElastAlert’s performance is great for any kind of unstructured data.
There are a host of other features that make ElastAlert more useful:
- Alerts link to Kibana dashboards
- Aggregate counts for arbitrary fields
- Combine alerts into periodic reports
- Separate alerts by using a unique key field
- Intercept and enhance match data
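As a rough guide, the features above correspond to rule options along the following lines. The option names are the ones we recall from the original ElastAlert documentation and may differ in newer versions or forks; the dashboard name, field names, and enhancement module path are hypothetical.

```yaml
# Link alerts to an existing Kibana dashboard (assumed dashboard name)
use_kibana_dashboard: "Service Health"

# Include counts of the most common values of these fields in the alert
top_count_keys:
- uri                      # assumed field
- hostname                 # assumed field

# Combine all matches from a 2-hour period into one report
aggregation:
  hours: 2

# Track and alert separately for each value of this field
query_key: hostname        # assumed field

# Run custom code on each match before the alert is sent
match_enhancements:
- "elastalert_modules.my_enhancements.MyEnhancement"   # hypothetical module
```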
As the volume of data increases, so does the need for automation. It is practically impossible to manually search for patterns or anomalies in large chunks of data, and there’s always the risk of missing something crucial. Monitoring the data available in Elasticsearch in near real-time using ElastAlert is a practical solution to this problem.
It is a great way to watch for specific patterns and trigger alerts, keep duplicates at bay, and avoid unnecessarily triggering a heavyweight service that could eat up resources.
While Kibana is exceptionally good for querying and visualizing data, a companion tool like ElastAlert makes it possible to be alerted when inconsistencies or patterns are detected in the data.