Type expansion
Hamelin expands types when you query multiple datasets. If you write FROM events, logs, Hamelin creates an expanded type that includes all fields from
both sources. This lets you write queries that work across datasets without
worrying about schema differences.
How type expansion works
Hamelin constructs expanded types by combining field names from all source types. Fields with the same name get aligned into a single field in the expanded type. This lets you write queries that work across datasets with consistent field naming. As an example, consider searching across different authentication logs with varying schemas:
FROM security_logs, audit_logs, access_logs
| WHERE action = 'login' OR event_type = 'authentication'
| SELECT timestamp, user_id, source_ip, action, event_type, session_id
Each logging system has its own schema:
Security logs track authentication attempts with IP addresses:
{timestamp: timestamp, user_id: string, source_ip: string, action: string}
Audit logs capture detailed session information:
{timestamp: timestamp, user_id: string, event_type: string, session_id: string}
Access logs record basic user activity:
{timestamp: timestamp, user_id: string, source_ip: string}
The expanded result type becomes:
{
timestamp: timestamp,
user_id: string,
source_ip: string,
action: string,
event_type: string,
session_id: string
}
Rows from security_logs will have NULL for the event_type and
session_id fields. Rows from audit_logs will have NULL for the
source_ip and action fields. Rows from access_logs will have NULL for
the action, event_type, and session_id fields.
The result contains rows like:
| timestamp | user_id | source_ip | action | event_type | session_id |
|---|---|---|---|---|---|
| 2024-01-15 10:30:00 | alice.smith | 192.168.1.100 | login | NULL | NULL |
| 2024-01-15 10:31:00 | alice.smith | NULL | NULL | authentication | sess_abc123 |
| 2024-01-15 10:32:00 | alice.smith | 192.168.1.100 | NULL | NULL | NULL |
All three datasets contribute to the same timestamp and user_id fields
because they use identical field names. You can filter and select on shared
fields without knowing which source contributed each row. This lets you write a
single query to search for authentication events across all systems, even
though each system logs different fields. The expanded type accommodates all
possible fields, and you can filter on any field that exists in any source.
Nested type expansion
Type expansion works with nested structures. Hamelin expands the type hierarchy to accommodate nested fields from different sources. As an example, consider combining user data from different systems:
FROM user_profiles, account_settings
| SELECT user.name, user.email, user.preferences
Each system has its own nested user structure:
User profiles contain basic identity information:
{user: {name: string, email: string}}
Account settings store user preferences:
{user: {preferences: string}}
The expanded result type becomes:
{user: {name: string, email: string, preferences: string}}
This means you can access user.name from profile data and user.preferences
from settings data in the same query, even though the original sources have
different nested structures.
Hamelin maintains stable field ordering when merging nested structures. Fields from the first source appear first, then fields from the second source are added in their original order. This consistent ordering means you can rely on the structure of expanded types being predictable across queries.
Array literal expansion
Type expansion also happens when you create array literals containing struct
types with different schemas. Just like FROM clauses, Hamelin creates an
expanded type that accommodates all fields from every struct in the array. As
an example, consider creating an array mixing user records with different
available fields:
SET mixed_users = [
{name: 'Alice', age: 30, department: 'Engineering'},
{name: 'Bob', email: '[email protected]', age: 25},
{name: 'Carol', email: '[email protected]', department: 'Sales'}
]
Each struct has its own schema:
The first user record has name, age, and department:
{name: string, age: number, department: string}
The second user record has name, email, and age:
{name: string, email: string, age: number}
The third user record has name, email, and department:
{name: string, email: string, department: string}
The expanded array type becomes:
[{name: string, age: number, department: string, email: string}]
Each element gets NULL values for missing fields. The first element has
NULL for email. The second element has NULL for department. The third
element has NULL for age. This lets you create arrays from structs with
different schemas while maintaining type consistency across all elements.
You can then query the expanded array just like any other dataset:
SET mixed_users = [
{name: 'Alice', age: 30, department: 'Engineering'},
{name: 'Bob', email: '[email protected]', age: 25},
{name: 'Carol', email: '[email protected]', department: 'Sales'}
]
| UNNEST mixed_users
The query works across all elements regardless of which fields were originally
present in each struct. Missing fields appear as NULL in the results, just
like with FROM clause expansion.
The results would be:
| name | age | department | |
|---|---|---|---|
| Alice | 30 | Engineering | NULL |
| Bob | 25 | NULL | [email protected] |
| Carol | NULL | Sales | [email protected] |