Five core commands

Hamelin uses five core commands to handle basic data operations: FROM, WHERE, SET, SELECT, and DROP. Each command operates narrowly and serves a specific purpose. You combine these commands using the pipe character |. These core commands serve the same purpose as SQL clauses, but you can use them in any order, with each command feeding its output to the next.

Command reference

`FROM` - Access the rows of a dataset

The FROM command pulls rows from one or more datasets. You start most queries with this command to specify the data sources.

Pull rows from the events dataset:

FROM events

Pull rows from both the users and orders datasets:

FROM users, orders

This operation translates to a SQL UNION ALL, not a join. It pulls all rows from both sources without performing any filtering or row alignment.

When you query multiple sources, Hamelin automatically expands types to accommodate all fields from both datasets. Fields with the same name get aligned, while unique fields are added with NULL values for rows that don't contain them. Learn more about how this works in Type Expansion.

You can organize results from multiple datasets by grouping fields into separate sections. This lets you keep track of which data comes from which source:

FROM
  allows = events.access_allowed,
  denys = events.access_denied

The allows field becomes a struct containing all fields from events.access_allowed (set to null for rows from events.access_denied). The denys field works the same way for events.access_denied. All other fields are aligned normally. This makes it easy to both reference a field's specific lineage and to ignore lineage when you don't need it.

`WHERE` - Filter rows

The WHERE command filters rows based on conditions you specify. You can stack multiple WHERE commands to apply multiple filters.

Only query rows whose action is 'login':

FROM events
| WHERE event.action == 'login'

This has the same effect as using AND between each condition:

FROM users
| WHERE user.role == 'admin'
| WHERE status == 'active'

`SET` - Add or modify fields

The SET command adds new fields or modifies existing ones without affecting other fields. This lets you create calculated fields and enrich datasets as you build queries.

Create a full name by concatenating first and last names:

FROM users
| SET full_name = user.first_name + ' ' + user.last_name

Calculate how many days ago an event occurred:

FROM events
| SET days_ago = (now() - timestamp) / 1d

Add a severity level based on the event action:

FROM events
| SET severity = if(event.action == 'login_failed', 'high', 'low')

You can set nested fields within existing structs to modify specific parts of records:

FROM events
| SET user.display_name = user.first_name + ' ' + user.last_name

You can also create entirely new nested structs by setting multiple nested fields:

FROM events
| SET metadata.processed_at = now()
| SET metadata.version = '2.1'

When creating new structs, using a struct literal is often more readable:

FROM events
| SET metadata = {
    processed_at: now(),
    version: '2.1'
  }

`SELECT` - Define output fields

The SELECT command completely redefines what fields appear in the results. This replaces all existing fields with only the ones you specify.

Select only the user's email, timestamp, and event action from the events dataset:

FROM events
| SELECT user.email, timestamp, event.action

Select existing fields and add new computed fields with custom names:

FROM metrics
| SELECT count, avg_time, category = 'security'

Create new fields using expressions and conditional logic:

FROM events
| SELECT user_id, severity = if(error_count > 10, 'high', 'low')

Transform existing fields while selecting them:

FROM logs
| SELECT timestamp, message, log_level = upper(level)

When you don't provide explicit field names in SELECT, Hamelin automatically generates sensible names based on your expressions. This reduces the need to think of names for simple calculations. Learn more about this in Automatic Field Names.

`DROP` - Remove fields

The DROP command removes specific fields from the results. This is useful when you want to exclude sensitive data or reduce the size of the output.

Remove unwanted fields from the dataset:

FROM events
| DROP internal_id, debug_info

Basic example

This example shows how you can combine the core commands to build a common query.

FROM events
| WHERE event.action != null
| SET days_ago = (now() - timestamp) / 1d
| DROP debug_data, internal_flags

This example demonstrates how the core commands work together in a typical workflow. You start by pulling rows from the events dataset, filter out records with missing action data, add a calculated field for how many days ago each event occurred, and remove unwanted debug fields

FROM events
| WHERE event.action != null
| SET days_ago = (now() - timestamp) / 1d
| SELECT event.action, timestamp, days_ago

This example shows a similar pattern. Rather than dropping specific fields, you select only the ones you want to keep in the final output.

Command reference​

FROM - Access the rows of a dataset​

WHERE - Filter rows​

SET - Add or modify fields​

SELECT - Define output fields​

DROP - Remove fields​

Basic example​