AGG: performing ordinary aggregation
The AGG command groups and aggregates datasets to create summary statistics
and analytical insights. You can analyze user behavior patterns, system
performance metrics, or security events by grouping related records together and
applying mathematical functions to each group.
AGG syntax
The AGG command follows a simple pattern that groups data and applies aggregation functions to each group:
AGG result = function(expression), ... BY grouping_expression, ...
When you omit the BY clause, Hamelin aggregates all records into a single group, which is useful for overall dataset statistics and global metrics. This example counts every event in the dataset without any grouping or partitioning:
FROM events
| AGG total_events = count()
When you omit explicit field names, Hamelin generates them automatically from
the expressions you provide; learn more about this feature in Automatic Field
Names. The following example creates fields named count() and
avg(response_time) that you can reference using backticks in subsequent
commands:
FROM requests
| AGG count(), avg(response_time) BY service_name
You can also rename fields in the BY clause and use any expression for grouping. This example groups by a renamed event_type, an aliased timestamp, and the email domain extracted with split(), creating clear field names for downstream analysis:
FROM events
| AGG
total_events = count(),
avg_duration = avg(duration)
BY event_category = event_type,
hour_bucket = timestamp,
user_domain = split(email, '@')[1]
Simple aggregation examples
Basic counting
Event counting groups events by their characteristics and calculates how many events fall into each category. Note that Hamelin uses count() with no arguments, not count(*) as in SQL; the empty parentheses count all rows in each group, providing a clean syntax for the most common aggregation operation:
FROM events
| AGG event_count = count() BY event_type
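The grouping-and-counting behavior of this query can be sketched in plain Python. This is an illustration of the semantics only, not how Hamelin executes; the sample rows are invented for the example:

```python
from collections import Counter

# Invented rows standing in for the `events` dataset.
events = [
    {"event_type": "login"},
    {"event_type": "login"},
    {"event_type": "error"},
]

# `AGG event_count = count() BY event_type`: one row count per group.
event_count = Counter(e["event_type"] for e in events)
```

Each distinct `event_type` value becomes a group, and the result holds one count per group.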
Multiple aggregations
Calculating several metrics at once in a single AGG command ensures all metrics use consistent grouping logic:
FROM requests
| AGG
total_requests = count(),
avg_response_time = avg(response_time),
max_response_time = max(response_time),
error_count = count_if(status_code >= 400)
BY service_name
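Conceptually, a single AGG with several functions computes all metrics in one pass over each group. A minimal Python sketch of that single-pass accumulation, using invented sample rows with the field names from the example above:

```python
from collections import defaultdict

# Invented rows standing in for the `requests` dataset.
requests = [
    {"service_name": "api", "response_time": 120, "status_code": 200},
    {"service_name": "api", "response_time": 300, "status_code": 500},
    {"service_name": "web", "response_time": 90, "status_code": 200},
]

# One accumulator per group; all four metrics updated in a single pass,
# mirroring one AGG command with one shared BY clause.
acc = defaultdict(lambda: {"total": 0, "sum_rt": 0, "max_rt": 0, "errors": 0})
for r in requests:
    a = acc[r["service_name"]]
    a["total"] += 1
    a["sum_rt"] += r["response_time"]
    a["max_rt"] = max(a["max_rt"], r["response_time"])
    a["errors"] += r["status_code"] >= 400  # bool counts as 0/1

summary = {
    svc: {
        "total_requests": a["total"],
        "avg_response_time": a["sum_rt"] / a["total"],
        "max_response_time": a["max_rt"],
        "error_count": a["errors"],
    }
    for svc, a in acc.items()
}
```

Because every metric shares the same accumulator, all of them are guaranteed to use identical grouping logic.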
Conditional aggregation
Conditional aggregation functions like count_if() count only the rows that meet a condition, without pre-filtering the dataset. Each group keeps its full context while different filters apply to different calculations:
FROM auth_logs
| AGG
failures = count_if(outcome == 'FAILURE'),
successes = count_if(outcome == 'SUCCESS')
BY user_name
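The idea behind count_if() can be sketched in Python: summing a boolean condition per group counts only the rows where it holds, while every row still contributes to its group. The sample rows are invented for illustration:

```python
from collections import defaultdict

# Invented rows standing in for the `auth_logs` dataset.
auth_logs = [
    {"user_name": "alice", "outcome": "FAILURE"},
    {"user_name": "alice", "outcome": "SUCCESS"},
    {"user_name": "alice", "outcome": "FAILURE"},
    {"user_name": "bob", "outcome": "SUCCESS"},
]

# count_if(cond) is equivalent to summing the condition per group:
# True adds 1, False adds 0, and no pre-filtering is needed.
stats = defaultdict(lambda: {"failures": 0, "successes": 0})
for row in auth_logs:
    stats[row["user_name"]]["failures"] += row["outcome"] == "FAILURE"
    stats[row["user_name"]]["successes"] += row["outcome"] == "SUCCESS"
```

Both counters are computed over the same groups in the same pass, which is what makes conditional aggregation preferable to running two separately filtered queries.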
Time series aggregations
Time series aggregations combine time truncation with grouping to create time-based buckets for temporal analysis. Bucketing records by time produces summaries for monitoring system performance, tracking business metrics, and understanding user behavior patterns across different time scales.
Hourly summaries
Hourly aggregations provide detailed views of system activity and user behavior throughout the day:
FROM logs
| AGG
hourly_events = count(),
avg_response = avg(response_time),
error_rate = count_if(status >= 400) / count()
BY timestamp
| SORT timestamp
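The bucketing step can be sketched in Python: truncate each timestamp to the top of its hour, group on the truncated value, then aggregate per bucket. This illustrates the semantics only, with invented sample rows:

```python
from collections import defaultdict
from datetime import datetime

# Invented rows standing in for the `logs` dataset.
logs = [
    {"timestamp": datetime(2024, 5, 1, 9, 15), "response_time": 100, "status": 200},
    {"timestamp": datetime(2024, 5, 1, 9, 45), "response_time": 300, "status": 500},
    {"timestamp": datetime(2024, 5, 1, 10, 5), "response_time": 200, "status": 200},
]

# Truncate each timestamp to its hour, then collect rows per bucket.
buckets = defaultdict(list)
for row in logs:
    hour = row["timestamp"].replace(minute=0, second=0, microsecond=0)
    buckets[hour].append(row)

hourly = {
    hour: {
        "hourly_events": len(rows),
        "avg_response": sum(r["response_time"] for r in rows) / len(rows),
        "error_rate": sum(r["status"] >= 400 for r in rows) / len(rows),
    }
    for hour, rows in sorted(buckets.items())  # mirrors `| SORT timestamp`
}
```

Rows from 09:15 and 09:45 land in the same 09:00 bucket, so the error rate is computed over both of them.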
Daily trends
Daily aggregations reveal longer-term trends and enable comparison across different time periods:
FROM events
| WITHIN -30d..now()
| AGG
daily_events = count(),
unique_users = count_distinct(user_name),
high_severity = count_if(severity == 'HIGH')
BY timestamp
| SORT timestamp DESC
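Daily bucketing works the same way at a coarser grain, and count_distinct() maps naturally onto keeping a set per group. A Python sketch of the semantics, with invented sample rows:

```python
from collections import defaultdict
from datetime import datetime

# Invented rows standing in for the `events` dataset.
events = [
    {"timestamp": datetime(2024, 5, 1, 8), "user_name": "alice", "severity": "HIGH"},
    {"timestamp": datetime(2024, 5, 1, 12), "user_name": "alice", "severity": "LOW"},
    {"timestamp": datetime(2024, 5, 2, 9), "user_name": "bob", "severity": "HIGH"},
]

# Bucket by calendar day; a set per day deduplicates users,
# mirroring `unique_users = count_distinct(user_name)`.
days = defaultdict(lambda: {"daily_events": 0, "users": set(), "high_severity": 0})
for e in events:
    d = days[e["timestamp"].date()]
    d["daily_events"] += 1
    d["users"].add(e["user_name"])
    d["high_severity"] += e["severity"] == "HIGH"

daily = {
    day: {
        "daily_events": d["daily_events"],
        "unique_users": len(d["users"]),
        "high_severity": d["high_severity"],
    }
    for day, d in sorted(days.items(), reverse=True)  # SORT timestamp DESC
}
```

Two events from the same user on the same day still count as one distinct user, which is the difference between count() and count_distinct().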