Implementing Filters

Filters are the building blocks of the data parsing pipeline. They transform raw, unstructured logs into standardized, structured objects ready for analysis.

Pipeline Architecture

A filter consists of a pipeline, which contains one or more stages. Each stage matches specific dataTypes and executes a sequence of steps.

pipeline:
  - dataTypes:          # logs with this dataType are routed to this stage
      - wineventlog
    steps:
      - json:           # parse the raw field as JSON
          source: raw
      - rename:         # map the vendor field to the UTMStack schema
          from: [log.host.ip]
          to: origin.ip

How Filters Work

  1. Selection: When a log arrives, the parsing plugin matches it to a filter stage based on its dataType.
  2. Sequential Execution: Steps are executed in the exact order they appear in the YAML.
  3. State Management: Each step modifies the current "Draft" of the log. If a step fails, the pipeline may continue depending on the step type and error.
  4. Standardization: The goal is to map vendor-specific fields (e.g., src_ip) to the common UTMStack schema (origin.ip); a minimal example follows this list.
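
For instance, a minimal standardization step using the rename operation shown above (the vendor field name src_ip is illustrative) could look like this:

- rename:
    from: [src_ip]
    to: origin.ip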

Conditional Steps

Every step supports a where clause that determines whether the step should run:

- delete:
    fields: [temporary_meta]
    where: exists("action")    # run this step only when the action field exists

Example: Apache Log Processing

Here is a high-level view of an Apache parsing pipeline; a YAML sketch of these steps follows the list:

  1. JSON Parse: Extract structured metadata from the raw entry.
  2. Rename: Map Apache fields to UTMStack standards.
  3. Grok Patterns: Extract IP, User, and Path from the message string.
  4. Enrichment: Add Geolocation using the dynamic plugin.
  5. Normalization: Map HTTP status codes to standardized actions (e.g., accepted, denied).
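
Expressed as a filter, the pipeline above might look roughly like the sketch below. Only the json and rename steps use syntax shown earlier on this page; the grok and dynamic step names, their parameters, and all field names are assumptions used for illustration, so consult the Filter Steps Reference for the exact syntax.

pipeline:
  - dataTypes:
      - apache
    steps:
      - json:                    # 1. JSON parse: extract structured metadata from the raw entry
          source: raw
      - rename:                  # 2. rename: map an Apache field to the UTMStack schema
          from: [log.client_ip]  #    (field name is illustrative)
          to: origin.ip
      - grok:                    # 3. grok: extract IP, user, and path from the message string
          source: log.message    #    (parameter names and pattern are assumptions)
          pattern: '%{IP:origin.ip} %{USER:origin.user} %{URIPATH:target.path}'
      - dynamic:                 # 4. enrichment: add geolocation using the dynamic plugin
          plugin: geolocation    #    (plugin name and parameters are assumptions)
          source: origin.ip
      # 5. normalization of HTTP status codes to standardized actions (accepted, denied)
      #    would follow here; see the Filter Steps Reference for the exact step to use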

For a detailed list of all available operations, see the Filter Steps Reference.
