What is the Activity Schema?
An activity schema is a single time series data structure, modeled as a table in your data warehouse, that describes events occurring within your business. From this single table, nearly all of your business analytics can be computed with ease. The activity schema approach is in contrast to a dimensional data warehouse modeling technique that structures data as objects with relationships to produce fact and dimension tables.
The structure of an activity schema is simple, universal, and intuitive: Who → What → When
|user123||Opened email||10-23-2022 12:30:20|
|user123||Signed up||10-23-2022 12:31:25|
That is, someone or something, taking an action or producing an event, and a datetime for when it happened. You can optionally add additional columns for more detailed analytics. From this single schema, you can analyze with ease the most common business questions facing your business such as:
- Conversion funnel analysis
- Retention rates
- Cohort performance
- Monthly recurring revenue
- and many many more ...
Why use an EventSchema?
- Faster time-to-value
- Leverage re-useable queries and analysis
- Easier to integrate, model, maintain, and update
- Vastly simplified data catalog and lineage
How do I build an Activity Schema?
Step 1 - Identify the data sources required
The best way to build an activity schema is by leveraging the data that you already have. This data typically comes from the following sources:
- Event collection tools - such as Posthog, Snowplow, Amplitude, Mixpanel, Segment, etc.
- Databases - your production database (MySQL, Postgres, MongoDB, etc.) will already contain most of the data required to build an event schema.
- SaaS APIs - such as Stripe, Intercom, Salesforce, or really any other operational tools that track data about the customer or business operation you wish to measure
Step 2 - Import data into your data warehouse
If this data is not already in a centralized database, you will need to extract it from it's source and import it into a data warehouse. If you want your analytics to be up to date and fresh, you will need to automate this job. This is a core feature of the Patterns platform -- we have webhooks for ingesting arbitrary data, managed integrations with databases, APIs, and event collection tools. You can explore all of them in our marketplace, or contact us if you can't find what you need.
Step 3 - Model your raw data into an activity schema
This is the most difficult part about implementing an activity schema for analytics. Once you’ve got all your data in the same place, you need to normalize your data into the Who -> What -> When format. Because your data will come from a number of different sources, likely with bespoke structures (unless coming from standard SaaS APIs like Stripe or Salesforce), you will need to write custom SQL data pipelines to arrive at an activity schema. Building and automating the execution of data pipelines is another core feature of Patterns.
What can I do once I have an Activity Schema?
Build analytical queries against the activity schema
There are hundreds of analytical questions that the activity schema can answer. However, most businesses have the same objectives and so operate the same way -- acquire, monetize, and retain customers -- most company analytics also look the same:
- Conversion funnel analysis
- Modeling revenue, monthly recurring or other
- Calculating cohort retention and churn by count and by revenue
- Calculating customer lifetime value (LTV)
Here is a Patterns app that you can clone to play around with an activity schema and investigate the calculations for each of the analytical questions above.