
Core Concepts

Patterns is a unified platform for building end-to-end data systems — an operating system and API for building pipelines and applications on top of your data warehouse. Patterns makes developing data applications simple and fast, whether automating flows with Python, building ETL with SQL, or visualizing with dashboards. It hides the complexity of data infrastructure and lets you focus on solving the specific problem at hand.


A Patterns App is a bundle of code and configuration that defines a specific project. Like a Terraform project, an app can be thought of as “data infrastructure as code” — fully defined by git-controllable files. Apps can be developed in the Patterns Studio UI or locally using your own dev tools and the Patterns devkit.

An app contains a set of Nodes connected in a directed graph that defines the flow of data. A small set of specific node types combine to enable powerful end-to-end applications: operator nodes (Python, SQL, Webhook), storage nodes (Table store, Stream store), and presentation nodes (Chart, Markdown).

An app also contains dashboards, each of which is a view of a set of nodes in your app. You can assemble tables and charts together in a dashboard to create user-friendly displays of data and analytics.

Reactive Node Graph

The Graph is the execution and orchestration engine that powers all data flow for an app. When you write function nodes (Python, SQL), you declare their storage inputs and outputs (Stream, Table) -- this is how the graph is built. As an example, below we read data from a stream and write data to a table in Python.

Python: message stream to table
from datetime import datetime

from patterns import Stream, Table

# An input (readable) Stream
stream = Stream("messages")

# An output (writeable) Table
table = Table("historical_messages", "w")

# Timestamp each record as it is consumed, then append the batch to the table
records = []
for record in stream.consume_records():
    record["consumed_at"] = datetime.now()
    records.append(record)
table.append(records)


Through this process of declaring where data is read from and written to, Patterns automatically infers node relationships and manages execution dependencies between the nodes.

The graph is reactive: nodes update automatically when upstream nodes generate new data, so your App stays up-to-date and your data stays fresh. For root nodes, or for expensive nodes, you can set an explicit schedule to control when the node runs, e.g. once a day, or every hour at 5 past the hour. Execution logs and errors are available in the UI for monitoring and debugging your runs.
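Schedules like these are conventionally written as cron expressions; assuming Patterns' scheduler accepts cron-style syntax (an assumption — check the node's scheduling settings for the exact format), the two examples above would look like:

```
0 0 * * *   # once a day, at midnight
5 * * * *   # every hour, at 5 past the hour
```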

Node types


Python nodes execute a Python script that can interact with any number of Table or Stream store inputs and outputs. Each node runs in a dedicated, isolated compute container and has access to standard data libraries. Here's a simple Python node that augments streamed records with a new field:

from datetime import datetime

from patterns import Stream

leads = Stream("new_leads")
enriched = Stream("enriched_leads", mode="w")

# Stamp each lead with a processing time and write it to the output stream
for lead in leads.consume_records():
    lead["processed_at"] = datetime.now()
    enriched.append(lead)


SQL nodes interact with Table stores and execute against the database those stores are attached to. They use Jinja templating to reference tables and parameters. By default, a select statement creates a new table version on the connected output Table. Here's a simple SQL node:

select
    customer_id
  , sum(amount) as total_sales
from {{ Table("transactions") }} as t
group by 1


Table nodes are storage nodes that are similar to standard database tables, but provide a layer of abstraction that enables data versioning, retention, and cross-platform support. They are attached to a specific database and create actual database tables based on the operations of Python and SQL nodes.
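To make the versioning idea concrete, here is a toy sketch (illustration only, not the Patterns API): each full write produces a new immutable table version, and older versions are retained for the retention window.

```python
class VersionedTable:
    """Toy model of a versioned table store (not the Patterns API)."""

    def __init__(self):
        self._versions = []  # older versions are retained, newest last

    def write(self, records):
        # A full write creates a new immutable table version
        self._versions.append(list(records))

    def latest(self):
        # Readers see the most recent version by default
        return self._versions[-1]

    def version_count(self):
        return len(self._versions)
```

A SQL node's select statement behaves like `write()` here: re-running it replaces what readers see, while prior versions remain available for rollback until retention expires.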


Stream nodes are storage nodes built for processing a stream of JSON records, in a one-at-a-time fashion. Streams enable real-time processing of records with Python nodes. Streams can be accumulated and flattened into a Table with the standard Convert Stream to Table marketplace component, and vice-versa, a Table can be converted into a Stream with the respective component.
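"Flattening" here means turning nested JSON records into flat table rows. The marketplace component handles this for you (plus accumulation and schema inference); a minimal sketch of the core idea:

```python
def flatten_record(record, prefix=""):
    """Flatten one nested JSON record into a single flat row."""
    row = {}
    for key, value in record.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, joining keys with underscores
            row.update(flatten_record(value, full_key + "_"))
        else:
            row[full_key] = value
    return row
```

For example, `{"user": {"geo": {"city": "NYC"}}}` becomes a row with a `user_geo_city` column.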


Webhook nodes provide an endpoint that you can POST JSON data to, which is then processed into a Stream store. Webhooks are a convenient way to stream data into an app and process in real-time or flatten into a Table.
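Any HTTP client can post to a webhook endpoint. A sketch using Python's standard library (the URL is a placeholder — copy the real endpoint from your Webhook node in the Studio UI):

```python
import json
from urllib.request import Request

# Placeholder URL -- substitute your Webhook node's real endpoint
WEBHOOK_URL = "https://example.com/your-webhook-endpoint"

payload = {"event": "signup", "email": "user@example.com"}
req = Request(
    WEBHOOK_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted here so the
# sketch has no side effects
```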


Chart nodes allow you to visualize data from a Table and display it directly in the graph. Charts are defined by a JSON file specifying a valid Vega-Lite chart. The easiest way to get started is to explore the library of example Vega-Lite charts.

Note: Charts are currently a beta feature that you should expect to change in backwards incompatible ways.
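A chart definition is just a Vega-Lite spec. A minimal illustrative example (field names are hypothetical, and the data binding to the connected Table is omitted here):

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "mark": "bar",
  "encoding": {
    "x": {"field": "month", "type": "ordinal"},
    "y": {"field": "total_sales", "type": "quantitative"}
  }
}
```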


Markdown nodes allow you to document and explain your graph in-place. They accept standard markdown syntax.


Component nodes are sub-graphs that encapsulate functionality and expose an interface of inputs and outputs. You can browse and add components in the Patterns Marketplace, and then hook them up in your graph. They provide powerful re-usable functionality in a few clicks.


Databases

By default, your Patterns organization is provided a managed Patterns DB postgres instance to get started. You can also connect your own databases to Patterns and use them in your Apps, in either read-only or read-write mode, depending on your use case. External tables from your databases can be added directly to any app as an input. Currently Patterns supports Postgres and BigQuery databases, with beta support for most other postgres-dialect-compatible databases.

Secrets and Connections

Secrets give you a place to manage sensitive organization-wide parameters (e.g. API keys) and use them in components and nodes. Similarly, Connections provide an explicit OAuth connection to various vendors that can be used in components.


Schemas

All data that moves through Patterns is associated with a Schema, a description of the data's fields and semantic details, much like a traditional database DDL "CREATE TABLE" statement, but more general. Schemas enable components and nodes to interoperate safely, since each knows the structure of the data to expect. See the Common Model project for more details on the Schema specification. Declaring explicit schemas on your stores is always optional (Patterns can infer them automatically from existing data), but doing so is encouraged for re-usable component development.
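As an illustrative sketch only (the authoritative syntax is the Common Model specification, and the exact keys may differ), a schema names the record type and its typed fields:

```yaml
name: Transaction
description: A single sales transaction
fields:
  id:
    type: Text
  amount:
    type: Decimal
  created_at:
    type: DateTime
```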

Local development and Version Control

Patterns ships with a command-line devkit that lets you pull and push your Apps to the Patterns platform and work locally on the node and graph files in your own development tools. This is also the recommended way to get robust version control for complex, production Apps: pull the files locally and use git.

Data Retention and Versioning

By default, Patterns retains copies of old versions of Table stores for up to 7 days. Upgraded accounts can configure the retention period.


Marketplace

The Patterns Marketplace is a repository of pre-built components and apps that you can use and fork in your own projects. The modular nature of the Patterns platform means that often your use case has already been partially or fully solved and exists in the marketplace, so there's no need to re-invent the wheel.


Dashboards

Patterns Dashboards provide a UI for organizing and displaying the outputs of your App — charts, markdown, data tables — and making them easily shareable. Currently, dashboards are only buildable in the Patterns Studio UI.