Skip to main content

MATCH: ordered pattern matching

The MATCH command finds specific sequences of events in your data. Pattern matching detects sequences like "10 failed logins followed by a successful login" or "error events followed by restart events within 5 minutes." Unlike windowed aggregations, pattern matching requires that event patterns happen in a specific sequence.

Basic pattern matching

The MATCH command searches for ordered sequences using named subqueries and regular expression-style quantifiers. You define what events you're looking for, then specify the pattern and timing constraints. Create named subqueries for the events you want to match, then specify the sequence pattern. This example finds sequences where 5 or more failed logins are followed by at least one successful login:

DEF failed_logins =
FROM security_logs
| WHERE action == 'login_failed';

DEF successful_logins =
FROM security_logs
| WHERE action == 'login_success';

MATCH failed_logins{5,} successful_logins+

Pattern quantifiers

Quantifiers control how many of each event type to match. These work like regular expression quantifiers but apply to your named event datasets.

Exact counts

Specify exact numbers of events to match. This example finds exactly 3 error events followed by exactly 1 restart event:

DEF errors =
FROM system_logs
| WHERE level == 'ERROR';

DEF restarts =
FROM system_logs
| WHERE action == 'service_restart';

MATCH errors{3} restarts{1}

Range quantifiers

Use ranges to specify minimum and maximum counts. This example finds between 2 and 5 high-severity alerts followed by 1 or 2 acknowledgment events:

DEF alerts =
FROM monitoring
| WHERE severity == 'HIGH';

DEF acknowledgments =
FROM monitoring
| WHERE action == 'acknowledge';

MATCH alerts{2,5} acknowledgments{1,2}

Open-ended quantifiers

Use + for "one or more" and * for "zero or more". This example finds one or more failed requests followed by one or more successful requests:

DEF failed_requests =
FROM api_logs
| WHERE status_code >= 500;

DEF success_requests =
FROM api_logs
| WHERE status_code < 400;

MATCH failed_requests+ success_requests+

Time constraints with WITHIN

The WITHIN clause adds constraints to patterns, measuring the distance from the first event to the last event in the matched sequence. For timestamp-based ordering, this represents a time window. For numeric ordering, this represents a numeric distance.

Time window constraints

Require that patterns complete within a specific time period. This example finds 10 or more failed logins followed by successful logins, but only when the entire sequence happens within 10 minutes:

DEF failed_logins =
FROM security_logs
| WHERE action == 'login_failed';

DEF successful_logins =
FROM security_logs
| WHERE action == 'login_success';

MATCH failed_logins{10,} successful_logins+ WITHIN 10m

The WITHIN clause measures on the actual SORT field. When you don't specify a SORT clause, timestamp ordering is applied automatically. When using WITHIN, you must have exactly one SORT expression, and the SORT field type must be compatible with the WITHIN type:

  • TIMESTAMP fields work with INTERVAL (e.g., 5m) or CALENDAR_INTERVAL (e.g., 1y, 3mon)
  • Numeric fields require matching numeric types (e.g., INT sort with INT within)

Numeric ordering with WITHIN

You can use WITHIN with numeric fields to constrain sequences by numeric distance rather than time:

DEF step_a =
FROM process_log
| WHERE step == 'A';

DEF step_b =
FROM process_log
| WHERE step == 'B';

MATCH step_a step_b SORT sequence_number WITHIN 100

This finds sequences where step A is followed by step B, and the sequence numbers differ by at most 100. The WITHIN constraint measures last(sequence_number) - first(sequence_number) <= 100.

Using first() and last() functions

The first() and last() functions access the earliest and latest events in each matched group. This example finds CPU spikes followed by memory alerts within 15 minutes:

DEF cpu_spikes =
FROM metrics
| WHERE cpu_usage > 90;

DEF memory_alerts =
FROM metrics
| WHERE memory_usage > 85;

MATCH cpu_spikes{3,} memory_alerts+ WITHIN 15m

Aggregating over matched rows

The AGG clause lets you compute aggregations over the rows that participated in each matched sequence. Match aggregations only operate on the rows that were part of the match — not all rows within the time window. This distinction matters when the WITHIN window contains more events than the pattern actually matched.

The available match aggregation functions are count(), count(x), sum(x), avg(x), min(x), max(x), first(x), last(x), array_agg(x), and count_distinct(x).

Counting matched events

Use count() to count how many events participated in each matched sequence. When a pattern like failed_logins{5,} successful_logins+ matches, count() returns the total number of rows across all pattern groups — not just one group. You can combine count() with first() and last() to capture both the size and time span of each match. This example detects brute force login attempts and counts how many events were in each attack sequence:

DEF failed_logins =
FROM security_logs
| WHERE action == 'login_failed';

DEF successful_logins =
FROM security_logs
| WHERE action == 'login_success';

MATCH failed_logins{5,} successful_logins+
AGG attempt_count = count(),
first_attempt = first(timestamp),
last_attempt = last(timestamp)
BY user_id
WITHIN 10m

The count() returns the total number of matched events — all the failed logins plus the successful login. The first() and last() functions return the timestamps of the earliest and latest events in the match.

Aggregating field values

You can use sum(), avg(), min(), and max() to aggregate numeric fields across matched events. These work exactly like their regular aggregation counterparts, but they only operate on the events that participated in the match. This is important when the WITHIN window contains events that didn't match the pattern — those events are excluded from the aggregation. This example detects data exfiltration patterns and computes the total bytes transferred during the matched sequence:

DEF access =
FROM file_logs
| WHERE action == 'read';

DEF transfer =
FROM file_logs
| WHERE action == 'upload';

MATCH access+ transfer
AGG total_bytes = sum(bytes),
avg_bytes = avg(bytes),
file_count = count()
BY user_id
WITHIN 30m

The sum(bytes) only totals the bytes from the events that participated in the matched pattern. If there were other file events in the 30-minute window that didn't match the access+ transfer pattern, they are excluded from the aggregation.

Complex pattern examples

Security incident detection

Look for suspicious login patterns that might indicate a brute force attack. This example detects external brute force attempts followed by successful logins and optional privilege escalations, all within 30 minutes:

DEF failed_logins =
FROM auth_logs
| WHERE outcome == 'FAILURE'
| WHERE source_ip NOT IN ('10.0.0.0/8', '192.168.0.0/16');

DEF successful_logins =
FROM auth_logs
| WHERE outcome == 'SUCCESS';

DEF privilege_escalations =
FROM audit_logs
| WHERE action == 'privilege_escalation';

MATCH failed_logins{5,} successful_logins{1,3} privilege_escalations* WITHIN 30m

When to use MATCH vs WINDOW

The key difference is that WINDOW performs unordered correlation while MATCH performs ordered correlation.

When you pull multiple event patterns into a sliding window, you can aggregate each individual pattern or aggregate across all the patterns together. However, you cannot require that certain subpatterns happen before others—the window treats all events within the time frame as unordered.

MATCH specifies that certain events must happen before others in a specific sequence. Ordered correlation matters when the timing and sequence of events affects your analysis.

Use MATCH when order matters:

  • Security attack sequences (failed logins → successful login → privilege escalation)
  • System failure cascades (errors → timeouts → circuit breaker trips)
  • User workflow analysis (page view → form submission → purchase)
  • Compliance violations (access → modification → deletion)