← Back to articles

How to Use SQL for Startup Analytics: The Queries Every Founder Should Know


Key Takeaways

SQL is the fastest way to extract the metrics that drive your financial model from your actual data. You do not need to be an engineer. You need to know 10-15 queries that pull the numbers investors will ask for: MRR, churn, cohort retention, CAC by channel, and LTV. Most startup databases (PostgreSQL, MySQL, BigQuery) use the same core syntax. This article teaches non-technical founders enough SQL to go from raw data to investor-ready metrics without waiting for an engineer to build a dashboard.

Author: Yanni Papoutsi - Fractional VP of Finance and Strategy for early-stage startups - Author, Raise Ready Published: 2025-04-04 - Last updated: 2025-04-04

Reading time: \~9 min

Why Founders Should Know SQL

There is a gap in most startups between the financial model and the actual data. The model says churn is 4%. But where did that number come from? Someone pulled it from a dashboard, or estimated it from a report, or "just knows." The problem: when an investor asks for churn by cohort, by segment, or by channel, the founder has to go back to the engineering team, wait 3 days, and hope the query is right. Learning basic SQL closes that gap. Not because the founder should be the data team, but because the founder who can pull their own metrics from the database understands their business at a different level than the one who waits for a report. Every number in your financial model should be traceable to a query. When it is, the model becomes trustworthy.

At the platform, I ran SQL queries directly against the production database to validate model assumptions. When the acquirer challenged a specific churn metric during the exit, I could rerun the query with different parameters in real time. That responsiveness was only possible because the analysis lived in SQL, not in someone's head.

What You Need Before You Start

Access to your company's database (ask your CTO or engineering lead). A SQL client: free options include DBeaver, pgAdmin, or Google BigQuery's web interface. The schema: a list of tables and what they contain. At minimum, you need a customers table, a transactions or subscriptions table, and ideally a marketing spend table.

Start with read-only access. You should never modify the production database. Read-only means you can query data but cannot accidentally change or delete it.

The 8 Queries That Feed Your Financial Model

Query 1: Monthly Recurring Revenue (MRR)

This query calculates total MRR for any given month by summing active subscription values.

SELECT DATE_TRUNC('month', billing_date) AS month, SUM(monthly_amount) AS mrr FROM subscriptions WHERE status = 'active' GROUP BY 1 ORDER BY 1;

This gives you a time series of MRR. If your subscriptions table records amount differently (annual contracts, usage-based), adjust the calculation to normalize everything to a monthly equivalent. Query 2: Monthly Logo Churn

Count customers who were active last month but not this month. WITH active_this_month AS (SELECT DISTINCT customer_id FROM subscriptions WHERE status = 'active' AND billing_date >= DATE_TRUNC('month', CURRENT_DATE)), active_last_month AS (SELECT DISTINCT customer_id FROM subscriptions WHERE status = 'active' AND billing_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month') AND billing_date < DATE_TRUNC('month', CURRENT_DATE)) SELECT COUNT(alm.customer_id) AS churned, (SELECT COUNT(\*) FROM

active_last_month) AS total_start, ROUND(COUNT(alm.customer_id)::DECIMAL / (SELECT COUNT(\) FROM active_last_month) \ 100, 2) AS churn_pct FROM active_last_month alm LEFT JOIN active_this_month atm ON alm.customer_id = atm.customer_id WHERE atm.customer_id IS NULL;

The LEFT JOIN with a NULL check is the standard SQL pattern for finding records that exist in one set but not another. This is a pattern you will use constantly in churn and retention analysis.

Query 3: Cohort Retention Table

This is the most valuable query for your financial model. It groups customers by their acquisition month and tracks what percentage are still active in each subsequent month.

WITH cohorts AS (SELECT customer_id, DATE_TRUNC('month',

MIN(created_at)) AS cohort_month FROM customers GROUP BY 1), activity AS (SELECT customer_id, DATE_TRUNC('month', transaction_date) AS active_month FROM transactions GROUP BY 1, 2) SELECT c.cohort_month, a.active_month, COUNT(DISTINCT a.customer_id) AS active_customers, COUNT(DISTINCT a.customer_id)::DECIMAL / MAX(cohort_size.cnt) AS retention_rate FROM cohorts c JOIN activity a ON c.customer_id = a.customer_id JOIN (SELECT cohort_month, COUNT(\*) AS cnt FROM cohorts GROUP BY 1) cohort_size ON c.cohort_month = cohort_size.cohort_month GROUP BY 1, 2 ORDER BY 1, 2;

This produces the triangle-shaped cohort retention table that investors love. Export it to a spreadsheet, and you have the raw data for your retention curves, LTV calculations, and cohort-based revenue projections.

Query 4: CAC by Channel

If you track the acquisition source for each customer and your marketing spend by channel, you can calculate CAC per channel directly. SELECT c.acquisition_channel, COUNT(DISTINCT c.customer_id) AS new_customers, SUM(m.spend) AS total_spend, ROUND(SUM(m.spend) / NULLIF(COUNT(DISTINCT c.customer_id), 0), 2) AS cac FROM customers c JOIN marketing_spend m ON c.acquisition_channel = m.channel AND DATE_TRUNC('month', c.created_at) = m.month WHERE c.created_at >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month') GROUP BY 1 ORDER BY cac DESC;

NULLIF prevents division by zero for channels with spend but no conversions. Sort by CAC descending to immediately see which channels are expensive.

Query 5: Revenue by Customer Segment

Segment-level revenue tells you where your money actually comes from. Useful for investor conversations and for prioritizing retention efforts.

SELECT c.segment, COUNT(DISTINCT c.customer_id) AS customers, SUM(t.amount) AS total_revenue, ROUND(SUM(t.amount) /

NULLIF(COUNT(DISTINCT c.customer_id), 0), 2) AS avg_revenue_per_customer FROM customers c JOIN transactions t ON c.customer_id = t.customer_id WHERE t.transaction_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '3 months') GROUP BY 1 ORDER BY total_revenue DESC; Query 6: Net Revenue Retention

Compare revenue from a set of customers in one period to the same set of customers in a later period (including expansion, excluding new customers).

WITH base_period AS (SELECT customer_id, SUM(amount) AS base_rev FROM transactions WHERE transaction_date BETWEEN '2024-01-01' AND '2024-03-31' GROUP BY 1), current_period AS (SELECT customer_id, SUM(amount) AS current_rev FROM transactions WHERE transaction_date BETWEEN '2024-04-01' AND '2024-06-30' GROUP BY 1) SELECT

ROUND(SUM(COALESCE(cp.current_rev, 0)) / NULLIF(SUM(bp.base_rev), 0) \* 100, 1) AS nrr_pct FROM base_period bp LEFT JOIN current_period cp ON bp.customer_id = cp.customer_id;

COALESCE handles customers who churned (no revenue in current period = 0). The LEFT JOIN ensures churned customers are counted as zero rather than excluded.

Query 7: Monthly Active Users / Customers

A simple but essential metric: how many unique customers transacted or logged in each month.

SELECT DATE_TRUNC('month', activity_date) AS month, COUNT(DISTINCT customer_id) AS active_customers FROM user_activity GROUP BY 1 ORDER BY 1;

Query 8: Average Revenue Per Account (ARPA)

SELECT DATE_TRUNC('month', transaction_date) AS month, COUNT(DISTINCT customer_id) AS active_customers, SUM(amount) AS total_revenue, ROUND(SUM(amount) / NULLIF(COUNT(DISTINCT customer_id), 0), 2) AS arpa FROM transactions GROUP BY 1 ORDER BY 1;

ARPA trending up means your customers are spending more over time (expansion). ARPA trending down might mean you are acquiring smaller customers or existing ones are contracting. Both signals feed directly into your revenue model assumptions.

*Key insight: The goal is not to become a data engineer. It is to be able to pull the 8-10 numbers that drive your financial model from actual data, verify them, and update them monthly. A founder who can do this operates at a fundamentally different level of credibility than one who relies entirely on reports generated by someone else.*

Practical Tips for Non-Technical Founders

Start with SELECT \ FROM table_name LIMIT 10.** Look at the raw data before writing complex queries. Understanding what your tables contain and how they are structured saves hours of debugging.

Use DATE_TRUNC everywhere. Most startup metrics are monthly. DATE_TRUNC('month', date_column) groups dates into months and is the most commonly used function in startup analytics.

Save your queries. Create a folder of saved SQL files, one per metric. When an investor asks for updated cohort data, you run the saved query instead of rebuilding from scratch.

Ask your CTO to set up a read replica. Running heavy queries against the production database can slow down the application. A read replica is a copy of the database designed for analytics queries. It is a standard setup that any competent engineering team can implement in a day.

Frequently Asked Questions

What if our data is messy or incomplete?

It always is. Start with what you have and improve it over time. If acquisition channel is not tracked, that is a problem to fix now, not a reason to skip the analysis. Every month you wait to start tracking clean data is a month of history you will never recover.

Should I use SQL or a BI tool like Metabase or Looker?

Both. SQL is the foundation: it gives you the raw queries. A BI tool turns those queries into dashboards that update automatically. Learn SQL first so you understand what the dashboard is actually showing. Then set up the BI tool for ongoing monitoring.

Summary

SQL gives founders direct access to the metrics that drive their financial model. Eight core queries cover MRR, churn, cohort retention, CAC by channel, revenue by segment, NRR, active customers, and ARPA. These are the exact numbers investors ask for during diligence. Learning basic SQL is a one-time investment of a few hours that pays back every time you need to pull, verify, or update a metric. The model becomes traceable, the assumptions become verifiable, and the founder becomes someone who knows their numbers from the data up, not from a report down.

Get the complete guide with all 16 chapters, exercises, and model templates.

Get Raise Ready - $9.99
YP
Yanni Papoutsi

VP Finance & Strategy. Author of Raise Ready. Has supported fundraising across 5 rounds backed by Creandum, Profounders, B2Ventures, and Boost Capital. Experience spanning UK, US, and Dubai markets with multiple funding rounds and exits.