Nash's Operational Intelligence Bot, Satoshi II

December 21, 2023

Nash: Bringing equilibrium to the last-mile delivery market

Nash, a San Francisco based logistics tech company, empowers businesses of any size to provide dependable local delivery services for their customers. This solution is ideal for businesses, such as restaurants, groceries or local dry-cleaning businesses, which can effortlessly implement pickup and drop-off services at times convenient for their customers, without the need to manage their own delivery fleet.

Established in 2021, Nash gained early traction and joined YCombinator, followed by a $20 million Series A funding led by a16z. Leveraging this investment, Nash has impressively tripled its workforce, demonstrating a strong commitment to growth and service excellence.

The results speak for themselves: the business has expanded quickly internationally and into multiple verticals with corresponding volume growth. This exponential growth not only underscores Nash's operational efficiency but also cements its position as a leader in reliable and scalable delivery solutions for businesses aiming to enhance their customer service experience.

The Challenge: Efficiently Scaling Complex Operations

Navigating the intricacies of the on-demand delivery market, known for its operational complexity and substantial capital requirements, the Nash team continuously seeks innovative solutions to enhance their operational scalability.

At the forefront of this endeavor is Jerry Shen, Nash's Head of Data. Faced with the challenge of making data more accessible to operational teams while avoiding the expansion of the analyst team, Jerry discovered Patterns, a transformative solution that addressed his needs.

Prior to adopting Patterns, Nash utilized Mode Analytics in conjunction with a Fivetran and Snowflake infrastructure. Although Mode Analytics proved effective for creating static dashboards and tracking key performance indicators, it fell short in its ability to anticipate and address the diverse and dynamic queries arising from the various operational teams Jerry oversees.

Recognizing the potential of Large Language Models (LLMs) in developing AI agents capable of automatically responding to queries, Jerry initially considered creating a bespoke solution using LangChain. However, Patterns emerged as a more suitable choice, offering a comprehensive, specialized solution for end-to-end analytical question answering. This strategic shift promises to significantly streamline Nash's data management, ultimately enhancing their operational efficiency in the fast-paced delivery market.

Welcoming Satoshi II - Nash’s AI Bot

Jerry recently introduced Satoshi II to their team, named affectionately after a colleague's dog, Satoshi. The onboarding process for Satoshi II was akin to welcoming a new team member. Over the course of a week, Jerry provided Satoshi II with context on Nash’s business operations and metrics — all the context needed to do its job.

Below are the tailored instructions Jerry formulated to ensure Satoshi II's quick onboarding and proficiency in understanding Nash's unique business environment.

Creating Satoshi II's Global Context

Global Context is appended to Patterns' build in LLM context and enables anyone to customize the bot's functionality. Jerry broke down Global Context for Satoshi II into three different categories 1) business context 2) table join examples and 3) response hints. Read about these below:

General business context:

As a data analyst at Nash Technologies, you specialize in optimizing last-mile delivery for sectors like food, pharmacy, and retail. Your role involves organizing orders, coordinating with local providers, and facilitating operational management and payment processes. You serve a broad range of merchants and collaborate with delivery providers using Nash's platform.

Your key responsibility is to leverage data for strategic decision-making, adhering strictly to the company's SQL guidelines and data dictionary for query writing.

Nash's service enables order placement through multiple channels, using algorithms to assign providers based on various factors. It offers real-time delivery tracking, notification systems, and automated driver redispatching in case of delays or cancellations.

Example 1:
SELECT *
FROM estimate e
JOIN task t ON e.task_id = t.task_id
JOIN job j ON j.job_id = t.job_id


Example 2:
SELECT *
FROM transaction tr
JOIN task t ON t.task_id = tr.task_id
JOIN job j ON t.job_id = j.job_id


Analyst rules

- Start with "woof-woof, Satoshi II coming to help"
- When there is OR clause in sql, always wrap them with paratheses
- Summarize the latest requirement first before giving your response
- When you see a word that sounds like our customer, use primary_organization_name in the filter, when you see a word that sounds like our provider (fleets the delivers), use parent_provider in the filter, if you are not sure, ask before writing any sql
- Always add these filters in the where clause when asked about metrics: djc.is_test_organization = 0 and djc.is_test_job_configuration = 0 and djc.is_effective_job_configuration = 1, except when asked about supply availability or total price quotes
- When asked about completion rate or cancellation rate, don’t use is_completed = 1 in the where clause - When asked about performance by geo (i.e. city, country etc.) always use pickup_city/state/country etc. unless specified otherwise
- When asked about providers (fleet), use dt.parent_provider
- When asked about time range use djc.pickup_timestamp to do filtering and partition
- When asked about time range, make sure to exclude incompleted time unites, i.e. when asked about last week, exclude data of this week (incomplete)
- When asked about channel partner, always convert to lower case
- Use 2 digit code for US states
- Use 2 digit country codes. US for United States, GB for United Kingdom, AU for Australia
- When asked about any rates, include both denominator and numerator as additional columns in the final select statement
- When asked about supply availability, always add filter package_delivery_mode <> 'batch'
- When asked about job failure reason or cancellation reason, include djc.is_completed = 0 to only pull failed job_configurations, use reverse_task_rank_of_job_configuration = 1 to pull latest job, use djc.is_active = 'False' to exclude active jobs and use delivery_status of ai_view_dim_task as the failure reason code, and ask if a detailed version is needed, if yes, then pull the task_failure_reason of ai_view_dim_job_configuration
- When choosing color scheme of visualization, use: tableau 20

Teaching Satoshi II metrics and calculations

Nash's team relies on Satoshi to generate correct metrics. To be sure Satoshi isn't making anything up, Jerry created a metrics document that defines metrics in pseudo-SQL. Essentially, he describes in natural language how you would go about calculating a metric from the given tables. Here is an example of some of the metrics that drive Nash's business:

Below are our primary metrics and sql samples (aka Data dictionary) that the company cares about

# Completed deliveries
SQL: “sum(is_completed) as completed_deliveries from job”


#Gross Merchandize Value (GMV)
SQL: “sum of delivery_price_cents/100 from job where is_completed = 1


# Active organizations
SQL: “count (distinct organization_id) from job_configuration where is_completed = 1


# Active (parent) providers (aka fleets)
SQL: “count(distinct parent_provider) from dim_task where is_completed = 1

Bottom-line

Treat Satoshi II as a junior analyst. Follow these principles when you interact with LLM powered AI bots (just like how you interact with a junior analyst)

  1. Be articulative and describing your need accurately. AI analysts are much better pulling well defined data pulls (i.e. pull this metric by this dimension by these filters) vs. vague questions (what’s the performance of xyz).

  2. Clean up your data model and only expose useful tables and columns to the AI analyst. AI analysts make more mistakes when interacting with complex joins, success rate is much higher if AI analyst works with cleanly modeled tables

  3. Start simple and iterate. It is not always guaranteed AI analysts can do well on ambiguous tasks, success rate would be much higher if you follow these steps (each step is a prompt): pull a simple statistic → repeat the analysis by adding filters and dimensions → make visualizations → Ask AI what conclusion it can draw from the analysis? For example instead of asking AI a question like “how is changing pickup window going to be impacting the drop off on-time rate”, I would ask AI “first, make a plan to analyze: how is changing pickup window going to be impacting the drop off ontime rate.”, get some inspiration from the plan.

Satoshi II’s Performance Review

In its first month of operation, Satoshi II, Nash's AI-bot, has demonstrated remarkable efficiency and effectiveness. By autonomously answering 162 questions for Jerry, the tool has proven to be a significant time-saver. Considering each ad hoc question typically requires about 30 minutes to address, Satoshi II has effectively saved Jerry approximately 75 hours of work in just one month.

This achievement sheds light on a common challenge within operational teams: the reluctance to inundate the data team with queries, despite having numerous questions that could enhance decision-making and operations. In many cases, the solution to this issue has been to hire an analyst solely for responding to queries and proactively building dashboards. However, this approach is not always feasible or efficient.

Patterns, through its generative bots like Satoshi II, offers an innovative solution to this problem. As demonstrated by Nash's experience, these AI tools can effectively fill the gap, providing timely and accurate data analysis without overburdening the data team. This approach not only optimizes the workflow but also ensures that operational teams have access to the insights they need without hesitation or delay.

Analytics in natural language