Type expansion

Hamelin expands types when you query multiple datasets. If you write FROM events, logs, Hamelin creates an expanded type that includes all fields from both sources. This lets you write queries that work across datasets without worrying about schema differences.

How type expansion works

Hamelin constructs expanded types by combining field names from all source types. Fields with the same name get aligned into a single field in the expanded type. This lets you write queries that work across datasets with consistent field naming. As an example, consider searching across different authentication logs with varying schemas:

FROM security_logs, audit_logs, access_logs
| WHERE action = 'login' OR event_type = 'authentication'
| SELECT timestamp, user_id, source_ip, action, event_type, session_id

Each logging system has its own schema:

Security logs track authentication attempts with IP addresses:

{timestamp: timestamp, user_id: string, source_ip: string, action: string}

Audit logs capture detailed session information:

{timestamp: timestamp, user_id: string, event_type: string, session_id: string}

Access logs record basic user activity:

{timestamp: timestamp, user_id: string, source_ip: string}

The expanded result type becomes:

{
  timestamp: timestamp,
  user_id: string,
  source_ip: string,
  action: string,
  event_type: string,
  session_id: string
}

Rows from security_logs will have NULL for the event_type and session_id fields. Rows from audit_logs will have NULL for the source_ip and action fields. Rows from access_logs will have NULL for the action, event_type, and session_id fields.

The result contains rows like:

timestamp	user_id	source_ip	action	event_type	session_id
2024-01-15 10:30:00	alice.smith	192.168.1.100	login	NULL	NULL
2024-01-15 10:31:00	alice.smith	NULL	NULL	authentication	sess_abc123
2024-01-15 10:32:00	alice.smith	192.168.1.100	NULL	NULL	NULL

All three datasets contribute to the same timestamp and user_id fields because they use identical field names. You can filter and select on shared fields without knowing which source contributed each row. This lets you write a single query to search for authentication events across all systems, even though each system logs different fields. The expanded type accommodates all possible fields, and you can filter on any field that exists in any source.

Nested type expansion

Type expansion works with nested structures. Hamelin expands the type hierarchy to accommodate nested fields from different sources. As an example, consider combining user data from different systems:

FROM user_profiles, account_settings
| SELECT user.name, user.email, user.preferences

Each system has its own nested user structure:

User profiles contain basic identity information:

{user: {name: string, email: string}}

Account settings store user preferences:

{user: {preferences: string}}

The expanded result type becomes:

{user: {name: string, email: string, preferences: string}}

This means you can access user.name from profile data and user.preferences from settings data in the same query, even though the original sources have different nested structures.

note

Hamelin maintains stable field ordering when merging nested structures. Fields from the first source appear first, then fields from the second source are added in their original order. This consistent ordering means you can rely on the structure of expanded types being predictable across queries.

Array literal expansion

Type expansion also happens when you create array literals containing struct types with different schemas. Just like FROM clauses, Hamelin creates an expanded type that accommodates all fields from every struct in the array. As an example, consider creating an array mixing user records with different available fields:

SET mixed_users = [
  {name: 'Alice', age: 30, department: 'Engineering'},
  {name: 'Bob', email: '[email protected]', age: 25},
  {name: 'Carol', email: '[email protected]', department: 'Sales'}
]

Each struct has its own schema:

The first user record has name, age, and department:

{name: string, age: number, department: string}

The second user record has name, email, and age:

{name: string, email: string, age: number}

The third user record has name, email, and department:

{name: string, email: string, department: string}

The expanded array type becomes:

[{name: string, age: number, department: string, email: string}]

Each element gets NULL values for missing fields. The first element has NULL for email. The second element has NULL for department. The third element has NULL for age. This lets you create arrays from structs with different schemas while maintaining type consistency across all elements.

You can then query the expanded array just like any other dataset:

SET mixed_users = [
  {name: 'Alice', age: 30, department: 'Engineering'},
  {name: 'Bob', email: '[email protected]', age: 25},
  {name: 'Carol', email: '[email protected]', department: 'Sales'}
]
| UNNEST mixed_users

The query works across all elements regardless of which fields were originally present in each struct. Missing fields appear as NULL in the results, just like with FROM clause expansion.

The results would be:

name	age	department	email
Alice	30	Engineering	NULL
Bob	25	NULL	[email protected]
Carol	NULL	Sales	[email protected]

How type expansion works​

Nested type expansion​

Array literal expansion​

How type expansion works

Nested type expansion

Array literal expansion