
Pruning incoming log volumes with Elastic

Jul 5, 2023 by iHash



"To log or not to log?" has always been a difficult question that software engineers struggle with, often to the detriment of their site reliability engineering (SRE) colleagues. Developers don't always get the level or context of the warnings and errors they capture in applications right, and they often log messages that aren't helpful to SREs. I can admit to being one of those developers! This confusion often leads to a flood of events being ingested into logging platforms, making application monitoring and issue investigation for SREs feel a bit like this:

![I Love Lucy Conveyor Belt Gif](./images/1.gif)

Source: GIPHY

When looking to reduce your log volume, you can drop information along two dimensions: individual fields within an event, or the entire event itself. Trimming away irrelevant data ensures we can focus on known events of interest, as well as unknown events that may turn out to be of interest.

![Log event versus field](./images/event-vs-field.jpg)

In this blog, we will discuss approaches for dropping known irrelevant events and fields from logs at various stages of collection. Specifically, we will focus on Beats, Logstash, Elastic Agent, ingest pipelines, and filtering with OpenTelemetry Collectors.

Table of Contents

  • Beats
  • Logstash filtering
  • Agent
  • Ingest pipelines
  • OpenTelemetry collectors
  • Conclusions
  • Resources

Beats

Beats are a family of lightweight shippers that forward events from a particular source. They are commonly used to ingest events not just into Elasticsearch, but also into other outputs such as Logstash, Kafka, or Redis, as shown in the Filebeat documentation. There are six types of Beat available, which are summarized here.

Our example will focus on Filebeat specifically, but both drop processors discussed here apply to all Beats. After following the quick start guide in the Filebeat documentation, you will have a running process whose configuration file, filebeat.yml, dictates which log files you monitor with Filebeat from any of the supported input types. Your configuration should specify a series of inputs in a format similar to the below:

```yml
filebeat.inputs:
- type: filestream
  id: my-logging-app
  paths:
    - /var/log/*.log
```

Filebeat has many configurable options, a full listing of which is given in the filebeat.reference.yml file in the documentation. However, it is the drop_event and drop_fields processors in particular that can help us exclude unhelpful messages and isolate only the relevant fields in a given event, respectively.

When using the drop_event processor, make sure at least one condition is present: if no condition is specified, the processor will drop all events. For example, if we are not interested in HTTP requests against the /profile endpoint, we can amend the configuration to use the following condition:

```yml
filebeat.inputs:
- type: filestream
  id: my-logging-app
  paths:
    - /var/tmp/other.log
    - /var/log/*.log
processors:
  - drop_event:
      when:
        and:
          - equals:
              url.scheme: http
          - equals:
              url.path: /profile
```

Meanwhile, the drop_fields processor will drop the specified fields, except for the @timestamp and type fields, when the specified condition is fulfilled. As with the drop_event processor, if the condition is missing, the fields will always be dropped. If we wanted to exclude the error message field for successful HTTP requests, we could configure a processor similar to the below:

```yml
filebeat.inputs:
- type: filestream
  id: my-logging-app
  paths:
    - /var/tmp/other.log
    - /var/log/*.log
processors:
  - drop_fields:
      when:
        and:
          - equals:
              url.scheme: http
          - equals:
              http.response.status_code: 200
      fields: ["event.message"]
      ignore_missing: false
```

When dropping fields, there is always a possibility that a field does not exist on a given log message. If a specified field does not exist on an event, Filebeat will raise an error unless ignore_missing is set to true; the default value is false.
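
As a minimal sketch, setting ignore_missing to true makes the processor tolerate events that lack the field (the condition is omitted here for brevity):

```yml
processors:
  - drop_fields:
      fields: ["event.message"]
      # Tolerate events where event.message is absent instead of raising an error
      ignore_missing: true
```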

Logstash filtering

Logstash is a free and open data processing pipeline tool that allows you to ingest, transform, and output data between a myriad of sources; it sits within the Extract, Transform, and Load (ETL) domain. Compared with the Beats discussed earlier, Logstash is recommended when you want to centralize transformation logic, whereas Beats or Elastic Agent let you drop events closer to the source, reducing network traffic.
Logstash provides a variety of transformation plugins out of the box that can be used to format and transform events from any source that Logstash is connected to. A typical pipeline, defined in a Logstash pipeline configuration file (for example, logstash.conf), contains three main sections:

  1. input denotes the source of data for the pipeline.
  2. filter contains the relevant data transformation logic.
  3. output defines the target for the transformed data.

To prevent events from reaching the output, the drop filter plugin drops any event that meets the stated condition. A typical example of reading from input files, dropping most HTTP requests against the /profile endpoint, and outputting to Elasticsearch is as follows:
```conf
input {
  file {
    id => "my-logging-app"
    path => [ "/var/tmp/other.log", "/var/log/*.log" ]
  }
}
filter {
  if [url][scheme] == "http" && [url][path] == "/profile" {
    drop {
      percentage => 80
    }
  }
}
output {
  elasticsearch {
    hosts => "https://my-elasticsearch:9200"
    data_stream => "true"
  }
}
```

One of the lesser-known options of this filter is the ability to configure a drop rate using the percentage option. One of the scary things about filtering out log events is the fear that you will inadvertently drop unknown but relevant entries that could be useful in an outage. Alternatively, your software may send such a large volume of messages that it floods your instance, takes up vital hot storage, and increases your costs. The percentage option addresses both concerns by letting only a subset of matching events through to Elasticsearch: in our example above, we ingest 20% of messages matching the criteria.
Similar to the drop_fields processor found in Beats, Logstash has a remove_field option for removing individual fields. Although it is available in many Logstash plugins, it is commonly used within the mutate filter plugin to transform events, similar to the below:

```conf
# Input configuration omitted
filter {
  if [url][scheme] == "http" && [http][response][status_code] == 200 {
    drop {
      percentage => 80
    }
    mutate {
      remove_field => [ "[event][message]" ]
    }
  }
}
# Output configuration omitted
```

Just like our Beats example, this removes event.message from the events retained after the drop filter.

Agent

Elastic Agent is a single agent for logs, metrics, and security data that can execute on your host and send events from multiple services and infrastructure to Elasticsearch.
Similar to Beats, you can use the drop_event and drop_fields processors in any integration that supports processors. For standalone installations, you should specify the processors within your elastic-agent.yml config. When using Fleet, the processing transforms are normally specified when configuring the integration under the Advanced options pop-out section, as shown below:

![Elastic Agent Kafka Integration Sample Processor](./images/elastic-agent-kafka-processor.png)

Comparing the above example with our Beats example, you'll notice that both use the same YAML-based format for processors. There are some limitations to be aware of when using Elastic Agent processors, which are covered in the Fleet documentation. If you are unsure whether processing data via Elastic Agent processors is right for your use case, check out this handy matrix.
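
For a standalone installation, here is a minimal sketch of how the drop_event condition from our Beats example might be attached to an input in elastic-agent.yml. The input id and stream paths are assumptions, and the exact input schema varies by version, so treat this as illustrative rather than definitive:

```yml
inputs:
  - type: filestream
    id: my-logging-app
    streams:
      - paths:
          - /var/log/*.log
    processors:
      # Drop HTTP requests against the /profile endpoint, as in the Beats example
      - drop_event:
          when:
            and:
              - equals:
                  url.scheme: http
              - equals:
                  url.path: /profile
```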

Ingest pipelines

The Elastic Agent processors discussed in the previous section will process raw event data, meaning they execute before ingest pipelines. As a result, when using both approaches, proceed with caution as removing or altering fields expected by an ingest pipeline can cause the pipeline to break.
As covered in the Create and manage pipeline documentation, new pipelines can be created either within the Stack Management > Ingest Pipelines screen or via the _ingest API, which we will use.
Just like the other tools covered in this piece, the drop processor allows any event that meets the required condition to be dropped. As before, if no condition is specified, all events coming through will be dropped. What is different is that the conditional logic is written in Painless, a Java-like scripting language, rather than the YAML syntax we have used previously:

```console
PUT _ingest/pipeline/my-logging-app-pipeline
{
  "description": "Event and field dropping for my-logging-app",
  "processors": [
    {
      "drop": {
        "description" : "Drop event",
        "if": "ctx?.url?.scheme == 'http' && ctx?.url?.path == '/profile'",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "description" : "Drop field",
        "field" : "event.message",
        "if": "ctx?.url?.scheme == 'http' && ctx?.http?.response?.status_code == 200",
        "ignore_failure": false
      }
    }
  ]
}
```

The ctx variable is a map representation of the fields within the document coming through the pipeline, meaning our example compares the values of the url.scheme and http.response.status_code fields. JavaScript developers will recognize the ?. null-safe operator, which guards against null when accessing fields.
As visible in the second processor in the above example, Painless conditional logic also applies to the remove processor, which drops the specified fields from the event when the condition matches.
One of the capabilities that gives ingest pipelines an edge over the other approaches is the ability to specify failure processors, either on the pipeline or on a specific processor. Although Beats do have an ignore_missing option, as discussed previously, ingest pipelines allow us to add exception handling, such as setting an error message that gives details of the processor failure:

```console
PUT _ingest/pipeline/my-logging-app-pipeline
{
  "description": "Event and field dropping for my-logging-app with failures",
  "processors": [
    {
      "drop": {
        "description" : "Drop event",
        "if": "ctx?.url?.scheme == 'http' && ctx?.url?.path == '/profile'",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "description" : "Drop field",
        "field" : "event.message",
        "if": "ctx?.url?.scheme == 'http' && ctx?.http?.response?.status_code == 200",
        "ignore_failure": false
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "description": "Set 'ingest.failure.message'",
        "field": "ingest.failure.message",
        "value": "Ingestion issue"
        }
      }
  ]
}
```

The pipeline can then be used on a single indexing request, set as the default pipeline for an index, or even used alongside Beats and Elastic Agent.
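
For example, here is a sketch of both options, assuming a hypothetical index named my-logging-app-index:

```console
# Apply the pipeline to a single indexing request
POST my-logging-app-index/_doc?pipeline=my-logging-app-pipeline
{
  "url": { "scheme": "http", "path": "/profile" }
}

# Or set it as the default pipeline for the index
PUT my-logging-app-index/_settings
{
  "index.default_pipeline": "my-logging-app-pipeline"
}
```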

OpenTelemetry collectors

OpenTelemetry, or OTel, is an open standard that provides APIs, tooling, and integrations to enable the capture of telemetry data such as logs, metrics, and traces from applications. Application developers commonly use the OpenTelemetry agent for their programming language of choice to send trace data and metrics directly to the Elastic Stack, as Elastic supports the OpenTelemetry protocol (OTLP).
In some cases, having every application send behavioral information directly to the observability platform may be unwise. Large enterprise ecosystems may have centralized observability capabilities, or may run large microservice estates where adopting a standard tracing practice is difficult. Sanitizing events and traces also becomes more challenging as the number of applications and services grows. These are the situations where using one or more collectors as a router of data to the Elastic Stack makes sense.
As demonstrated in the OpenTelemetry documentation and the example collector in the Elastic APM documentation, the basic configuration for an OTel collector has four main sections:

  1. receivers define the sources of data, which can be push- or pull-based.
  2. processors can filter or transform the received data before export, which is what we are interested in doing.
  3. exporters define how the data is sent to the final destination, in this case Elastic!
  4. service defines which of the components configured above are enabled in the collector.

Dropping events and fields can be achieved using the filter and attributes processors, respectively, in the collector config. A selection of examples of both is shown below:
```yml
receivers:
  filelog:
    include: [ /var/tmp/other.log, /var/log/*.log ]
processors:
  filter/denylist:
    error_mode: ignore
    logs:
      log_record:
        - 'attributes["url.scheme"] == "http"'
        - 'attributes["url.path"] == "/profile"'
        - 'attributes["http.response.status_code"] == 200'
  attributes/errors:
    actions:
      - key: error.message
        action: delete
  memory_limiter:
    check_interval: 1s
    limit_mib: 2000
  batch:
exporters:
  # Exporters configuration omitted
service:
  pipelines:
    # Pipelines configuration omitted
```

The filter processor, applied to a telemetry type (logs, metrics, or traces), drops an event if it matches any of the specified conditions. Meanwhile, the attributes processor deletes the error.message attribute from all events. The pattern option can also be used in place of key to remove fields matching a specified regular expression. Conditional field deletion, as performed in our Beats, Logstash, and ingest pipeline examples, is not supported by the attributes processor; an alternative is the transform processor, which can modify or delete a field based on a condition.
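
As a sketch of that alternative, the transform processor below deletes error.message only for successful responses. The processor name is illustrative, and we assume the relevant values live in the log record attributes:

```yml
processors:
  transform/drop-error-field:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Delete error.message only when the request succeeded
          - delete_key(attributes, "error.message") where attributes["http.response.status_code"] == 200
```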

Conclusions

The aim of the DevOps movement is to align the processes and practices of software engineering and SRE. That includes working together to ensure that relevant logs, metrics, and traces are sent from applications to our Observability platform.
As we have seen first-hand with Beats, Logstash, Elastic Agent, Ingest Pipelines, and OTel collectors, the approaches for dropping events and individual fields vary according to the tool used.
You may be wondering, which option is right for you?

  1. If the overhead of sending large messages over the network is a concern, transforming closer to the source using Beats or Logstash is the better option. If you’re looking to minimize the system resources used in your collection and transformation, Beats may be preferred over Logstash as they have a small footprint.
  2. For centralizing transformation logic to apply to many application logs, using processors in an OTel collector may be the right approach.
  3. If you want to make use of centrally managed ingestion and transformation policies with popular services and systems such as Kafka, Nginx, or AWS, using Elastic Agent with Fleet is recommended.
  4. Ingest pipelines are great for transforming events at ingestion if you are less concerned about network overhead and would like your logic centralized within Elasticsearch.
Although not covered here, other techniques such as runtime fields, index-level compression, and synthetic _source can also be used to reduce disk storage requirements and CPU overhead. If your favorite way to drop events or fields is not listed here, do let us know!

Resources

  1. Elastic Beats
  2. Filebeat | Filter and enhance data with processors
  3. Logstash
  4. Logstash | Filter plugins
  5. Elastic Agent
  6. Elastic Agent | Processors
  7. Ingest Pipelines
  8. Elasticsearch | Ingest processor reference
  9. OTel Collectors
  10. OTel Transforming telemetry



