Schemas

Schemas are recommended for any component or node where the structure of the data is known ahead of time. They give users clean, documented data and let other components interoperate safely. Here’s a (truncated) example schema file:

schemas/Order.yml
name: Order
description: E-commerce order
unique_on:
  - id
field_roles:
  created_ordering: created_at
  updated_ordering: updated_at
  strictly_monotonic_ordering: id
immutable: false
fields:
  id:
    type: Integer
  customer_id:
    type: Text
  amount:
    type: Decimal(12, 2)
  created_at:
    type: DateTime
  updated_at:
    type: DateTime
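
To make the idea of "clean and documented data" concrete, here is a minimal sketch of what checking a record against the Order fields could look like in plain Python. It does not use the Patterns library: `ORDER_FIELDS`, `conformance_errors`, and the mapping from schema types to Python types (Integer to `int`, Text to `str`, Decimal to `Decimal`, DateTime to `datetime`) are all assumptions made for illustration only.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical in-memory mirror of the fields declared in schemas/Order.yml.
# The schema-type to Python-type mapping is an assumption for this sketch.
ORDER_FIELDS = {
    "id": int,
    "customer_id": str,
    "amount": Decimal,
    "created_at": datetime,
    "updated_at": datetime,
}


def conformance_errors(record, fields=ORDER_FIELDS):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for name, expected in fields.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {actual}")
    return errors
```

Calling `conformance_errors` on a record with every field present and correctly typed returns an empty list; each missing or mistyped field adds one message, in the order the fields are declared.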
N.B. field_roles are used by components and the Patterns protocol to provide semantic information for automatic behavior. A primary use case for field roles is to provide the default stream ordering of a table (i.e. the ordering used when calling table.as_stream() without an order_by argument), which uses strictly_monotonic_ordering if it is defined and falls back to created_ordering otherwise. If neither is defined, calling as_stream() with no order_by argument will result in an error.
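
The fallback rule described in this note can be sketched in a few lines of Python. This is an illustration of the rule as stated, not the Patterns implementation: the function name `default_stream_ordering`, the plain-dict representation of field_roles, and the choice of `ValueError` are assumptions.

```python
def default_stream_ordering(field_roles):
    """Pick the field used to order a stream when no order_by is given.

    Prefers strictly_monotonic_ordering, then created_ordering, and raises
    if neither role is defined (mirroring the error described above).
    """
    for role in ("strictly_monotonic_ordering", "created_ordering"):
        field = field_roles.get(role)
        if field:
            return field
    # Hypothetical error type; the real library may raise something else.
    raise ValueError("no ordering role defined; pass order_by explicitly")
```

For the Order schema this selects `id`, because strictly_monotonic_ordering is defined. A schema that defines only created_ordering would be ordered by that field, and a schema with only updated_ordering would raise.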

In our component graph.yml we can then include the schema and expose it:

graph.yml
title: Augment w/ Timestamp
slug: augment-with-timestamp
version: 0.1.0
exposes:
  inputs:
    - input_table
  outputs:
    - output_table
  parameters:
    - timestamp_field_name
  schemas:
    - Order
functions:
  - node_file: augment_with_timestamp.py
schemas:
  - schema_file: schemas/Order.yml

This schema will now be available to nodes within the component as well as usable by external graphs:

...
schemas:
  - uses: my-org/my-component@v0
    schema: Order

Apply a schema to a Table by specifying it on the Table declaration in the node code:

from patterns import *

new_products = Table("new_products", schema="products")

...