Skip to main content

DEF: using named statements

You use the DEF keyword to create named datasets that you can reference later in your query. This lets you break complex analysis into manageable steps, reuse filtered data multiple times, and make your queries more readable and maintainable.

Creating named datasets

The DEF command assigns a name to any dataset, letting you reference it throughout your query. This is essential when you need to combine data from multiple sources or apply different filters to the same base data:

DEF error_events =
FROM events
| WHERE status = 'ERROR';

DEF service_counts =
FROM logs
| AGG event_count = count()
BY service;

FROM error_events
| JOIN service_counts ON service == service_counts.service

Common patterns

Here are three scenarios where named subqueries make complex analysis straightforward and readable.

Combining filtered event types

You can combine different types of events after applying specific filters to each type. This lets you analyze related events together while maintaining distinct filtering criteria:

DEF login_events =
FROM security_logs
| WHERE event_type = 'user_login'
| WITHIN -1hr;

DEF logout_events =
FROM security_logs
| WHERE event_type = 'user_logout'
| WITHIN -1hr;

FROM login_events, logout_events
| AGG total_auth_events = count()

This creates separate filtered datasets for logins and logouts, then combines them to get a total count of authentication events from the past hour.

Security event correlation

You can analyze relationships between different security events by creating named datasets for each event type and then using window functions to find patterns:

DEF login_events =
FROM security_logs
| WHERE event_type = 'user_login'
| WITHIN -1hr;

DEF priv_escalation =
FROM security_logs
| WHERE event_type = 'privilege_escalation'
| WITHIN -1hr;

FROM login = login_events, priv = priv_escalation
| WINDOW count(login), count(priv)
WITHIN 5min
BY user_id

// Focus on windows where new events occur to avoid duplicate alerts
| WHERE `count(login)` == 1 AND `count(priv)` >= 1 OR
`count(login)` >= 1 AND `count(priv)` == 1

This creates separate datasets for login and privilege escalation events, then uses a sliding window to find 5-minute periods where both types occur for the same user.

Enrichment with filtered lookup data

You can enrich your main dataset with information from lookup tables that have been pre-filtered to include only relevant records:

DEF active_departments =
FROM departments
| WHERE status = 'active' AND budget > 100000;

DEF user_activity =
FROM user_events
| WITHIN -24hr
| AGG session_count = count() BY user_id;

FROM user_activity
| JOIN users ON user_id == users.user_id
| JOIN active_departments ON department_id == active_departments.department_id
| WHERE session_count > 10

This pattern filters the department lookup table first to include only active, well-funded departments, then enriches user activity data with this filtered information. This approach is more efficient than filtering departments after the join.