Self‑Service Analytics Without the Data Hangover: How We Built a Trustworthy Visualization Platform That Scales

Your BI tool isn’t the bottleneck—your reliability is. Here’s the playbook we use to turn “spreadsheet theater” into a self‑service platform leaders can actually bet the quarter on.

Self‑service isn’t a BI tool choice—it’s a reliability program with a dashboard on top.

The dashboard that broke the quarter

I walked into a war room at a fintech—Sales swore revenue was up 7%, Finance had it down 3%, and the CEO had a board deck due in two hours. Three dashboards, three truths. The root cause wasn’t Tableau vs. Looker turf wars. It was a lack of reliability and shared definitions. Pipelines silently failed, schemas drifted, and every analyst reinvented “active customer.”

Self‑service is easy to promise and expensive to maintain. The trick is shipping guardrails that make “move fast” safe: contracts at the edges, tests in the middle, and a semantic layer on top. Here’s how we build platforms at GitPlumbers that leaders actually trust.

What self‑service actually means (with guardrails)

Self‑service isn’t “anyone can build any chart.” It’s:

  • Certified datasets that are modeled, tested, and documented

  • SLOs on freshness and accuracy with on‑call ownership

  • A semantic layer so “Gross Margin” doesn’t change per dashboard

  • Row‑level security enforced in the warehouse, not sprinkled per chart

  • Discoverability via a catalog with lineage and contacts

  • GitOps for analytics: everything versioned; no production logic trapped in a BI UI

If you can’t check all six boxes, your “self‑service” is just decentralized chaos.

Blueprint: a reliable self‑service stack

A stack we’ve deployed repeatedly (Snowflake/BigQuery flavor):

  • Ingest: Fivetran or Debezium CDC into Snowflake/BigQuery

  • Transform: dbt models with tests and exposures

  • Quality: Great Expectations or Soda for suite‑level checks

  • Orchestration: Airflow or Dagster with OpenLineage

  • Catalog/Lineage: DataHub or OpenMetadata

  • Semantic: dbt metrics layer or Cube for headless BI

  • Visualization: Looker, Metabase, Superset

  • Observability: Prometheus + alerting, incident runbooks

dbt: make correctness boring

# models/marts/orders/schema.yml
version: 2
models:
  - name: fct_orders
    description: "Certified orders fact table"
    config:
      tags: [certified, finance]
    columns:
      - name: order_id
        tests: [unique, not_null]
      - name: total_amount
        tests:
          - not_null
          - dbt_utils.expression_is_true:
              expression: "total_amount >= 0"
      - name: loaded_at
        tests: [not_null]

Tie models to business‑facing exposures so you know what breaks when.

# models/marts/orders/exposures.yml
version: 2
exposures:
  - name: exec_revenue_dashboard
    type: dashboard
    maturity: high
    url: https://looker.yourco.com/dashboards/123
    depends_on:
      - ref('fct_orders')
    owner:
      name: Finance Analytics
      email: finance-analytics@yourco.com

Orchestrate and test in the DAG

# dags/analytics_orders.py
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
from datetime import datetime

with DAG(
    dag_id="analytics_orders",
    start_date=datetime(2024, 1, 1),
    schedule_interval="*/15 * * * *",
    catchup=False,
    tags=["analytics"],
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt build --project-dir /dbt --select fct_orders",
        env={"DBT_PROFILES_DIR": "/dbt"},
    )

    ge_check = KubernetesPodOperator(
        name="ge-check",
        task_id="great_expectations",
        namespace="data-platform",
        image="public.ecr.aws/yourco/ge:latest",
        cmds=["great_expectations"],
        arguments=["checkpoint", "run", "orders_checkpoint"],
    )

    dbt_run >> ge_check

Expectations that catch business drift

# great_expectations/expectations/orders_suite.yml
expectations:
  - expect_table_row_count_to_be_between:
      min_value: 1000
  - expect_column_values_to_not_be_null:
      column: order_id
  - expect_column_values_to_be_between:
      column: total_amount
      min_value: 0
  - expect_column_max_to_be_between:
      column: loaded_at
      # freshness window, supplied at checkpoint run time as an evaluation parameter
      min_value: {"$PARAMETER": "now() - timedelta(minutes=15)"}

Governance in the warehouse, not the BI tool

-- BigQuery: row-level access policy
CREATE ROW ACCESS POLICY region_filter
ON `analytics.fct_orders`
GRANT TO ("group:na-analysts@yourco.com")
FILTER USING (region IN ("US", "CA"));

Access and infra should be code

# terraform for dataset and access
resource "google_bigquery_dataset" "analytics" {
  dataset_id = "analytics"
  location   = "US"
}

resource "google_bigquery_dataset_iam_member" "analyst_viewer" {
  dataset_id = google_bigquery_dataset.analytics.dataset_id
  role       = "roles/bigquery.dataViewer"
  member     = "group:analysts@yourco.com"
}

Alert on data SLOs like you mean it

# prometheus rules: data freshness SLO
groups:
- name: data-freshness
  rules:
  - alert: FctOrdersStale
    expr: (time() - fct_orders_max_loaded_at_seconds) > 900
    for: 10m
    labels:
      severity: page
    annotations:
      summary: "fct_orders freshness SLO violated"
      description: "No new data in 15m; check Airflow run and upstream source."

Expose fct_orders_max_loaded_at_seconds by publishing max(loaded_at) as a custom metric from your DAG, or by scraping it from the warehouse with an exporter.
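If you don’t want to run a full exporter, rendering the gauge in Prometheus text exposition format is a few lines; a stdlib sketch (in practice you’d likely reach for the prometheus_client library instead):

```python
from datetime import datetime, timezone

def freshness_metric(max_loaded_at: datetime) -> str:
    """Render the freshness gauge in Prometheus text exposition format."""
    epoch = max_loaded_at.timestamp()
    return (
        "# HELP fct_orders_max_loaded_at_seconds Unix time of the newest fct_orders row\n"
        "# TYPE fct_orders_max_loaded_at_seconds gauge\n"
        f"fct_orders_max_loaded_at_seconds {epoch:.0f}\n"
    )

print(freshness_metric(datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

Serve that string from any HTTP endpoint Prometheus can scrape, and the alert rule above works unchanged.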

Data contracts and SLAs: stop surprise breakages

I’ve seen Kafka topics mutate from amount to total_amount_cents on a Friday and take out every dashboard. Contracts stop that.

  • Define a JSON Schema (or Avro/Protobuf) for each source event/table

  • Validate at ingestion and again in CI for downstream models

  • Negotiate SLOs with source teams: freshness, completeness, late‑arriving policy

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "order_created",
  "type": "object",
  "required": ["order_id", "customer_id", "amount", "currency", "created_at"],
  "properties": {
    "order_id": {"type": "string"},
    "customer_id": {"type": "string"},
    "amount": {"type": "number", "minimum": 0},
    "currency": {"type": "string", "enum": ["USD", "EUR", "GBP"]},
    "created_at": {"type": "string", "format": "date-time"}
  },
  "additionalProperties": false
}
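At ingestion time the check is mechanical: reject any event that violates the contract before it lands. A stdlib sketch of that gate for the schema above (a production pipeline would use the jsonschema package rather than hand-rolled checks):

```python
REQUIRED = {"order_id", "customer_id", "amount", "currency", "created_at"}
CURRENCIES = {"USD", "EUR", "GBP"}

def validate_order_created(event: dict) -> list[str]:
    """Return a list of contract violations; an empty list means accept."""
    errors = [f"missing field: {f}" for f in REQUIRED - event.keys()]
    errors += [f"unexpected field: {f}" for f in event.keys() - REQUIRED]
    amount = event.get("amount")
    if amount is not None and (not isinstance(amount, (int, float)) or amount < 0):
        errors.append("amount must be a non-negative number")
    if "currency" in event and event["currency"] not in CURRENCIES:
        errors.append(f"unsupported currency: {event['currency']}")
    return errors

ok = {"order_id": "o1", "customer_id": "c1", "amount": 12.5,
      "currency": "USD", "created_at": "2024-01-01T00:00:00Z"}
assert validate_order_created(ok) == []
```

Route rejects to a dead-letter queue with the violation list attached; that turns a silent schema drift into a loud, attributable incident.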

When a producer PR changes the schema, CI runs validation and fails loudly. Downstream, dbt tests enforce nullability and ranges. For evolution, use additive changes + versioned topics or views. We routinely ship a v2 view and deprecate v1 with a 30‑day window.
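The “additive changes only” rule is straightforward to enforce in CI: diff the producer’s proposed schema against the published one and fail on anything that can break consumers. A minimal sketch, assuming JSON Schema documents shaped like the contract above:

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """Flag schema edits that can break consumers or existing producers.

    Additive-only evolution allows new optional fields; it forbids removing
    or renaming fields, changing types, and adding new required fields.
    """
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    problems = [f"removed field: {f}" for f in old_props.keys() - new_props.keys()]
    problems += [
        f"type changed for {f}: {old_props[f].get('type')} -> {new_props[f].get('type')}"
        for f in old_props.keys() & new_props.keys()
        if old_props[f].get("type") != new_props[f].get("type")
    ]
    problems += [
        f"newly required field: {f}"
        for f in set(new.get("required", [])) - set(old.get("required", []))
    ]
    return problems

v1 = {"properties": {"amount": {"type": "number"}}, "required": ["amount"]}
renamed = {"properties": {"total_amount_cents": {"type": "integer"}},
           "required": ["total_amount_cents"]}
assert "removed field: amount" in breaking_changes(v1, renamed)
```

That Friday-afternoon amount → total_amount_cents rename now dies in the producer’s PR instead of in every downstream dashboard.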

The semantic layer: one truth, many tools

If every analyst defines ARR differently, you’re sunk. Centralize definitions: dbt metrics layer or a headless BI like Cube; then expose to Looker, Superset, notebooks. Example with Looker LookML on top of dbt models:

# views/orders.view.lkml
view: fct_orders {
  sql_table_name: analytics.fct_orders ;;

  dimension: order_id {
    primary_key: yes
    type: string
  }

  dimension: created_date {
    type: date
    sql: DATE(${created_at}) ;;
  }

  measure: gross_revenue {
    type: sum
    sql: ${total_amount} ;;
  }

  measure: orders {
    type: count
  }
}

explore: fct_orders {
  label: "Orders (Certified)"
}

Or keep it headless with dbt metrics and query via the Metrics API or Cube. The principle is the same: one place to define metrics, many ways to consume them.

Discoverability, ownership, and trust

Good data is useless if nobody can find or trust it. Ship the social layer.

  • Catalog lineage with OpenLineage into DataHub/OpenMetadata

  • Mark models as certified and show owners, SLOs, and docs

  • Add incident history and status badges to the catalog entry

# airflow.cfg (apache-airflow-providers-openlineage)
[openlineage]
namespace = analytics
transport = {"type": "http", "url": "http://datahub:8080", "endpoint": "api/openlineage"}

Set adoption goals you can measure:

  • Time to first dashboard on a certified dataset: target < 1 day

  • % of dashboards using certified models: target > 80%

  • MTTR for data incidents: target < 2 hours

  • Number of “spreadsheet exports” per week: trending down
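These targets only change behavior if a job actually computes them. A sketch of the certified-adoption KPI, assuming you can list each dashboard’s upstream models (e.g. from dbt exposures; the dashboard and model names below are illustrative):

```python
def certified_adoption(dashboards: dict[str, list[str]], certified: set[str]) -> float:
    """Percent of dashboards whose upstream models are all certified."""
    if not dashboards:
        return 0.0
    good = sum(1 for deps in dashboards.values() if deps and set(deps) <= certified)
    return 100 * good / len(dashboards)

dashboards = {
    "exec_revenue": ["fct_orders"],
    "churn_deep_dive": ["fct_orders", "tmp_churn_scratch"],  # uncertified scratch table
}
print(certified_adoption(dashboards, {"fct_orders"}))  # 50.0 — below the 80% target
```

Publish the number weekly to the same channel leadership reads; watching it climb past 80% is how you know the platform, not the spreadsheet, is winning.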

Rollout that actually works (90 days)

You can’t boil the ocean. We run this sequence:

  1. Week 1–2: Pick one domain (Billing). Define SLOs: freshness 15m, accuracy ≥ 99.5%. Identify owners. Instrument Prometheus metrics.

  2. Week 2–4: Stand up dbt models and tests. Add Great Expectations on the critical joins and totals. Publish to a staging project.

  3. Week 4–5: Add row‑level policies in the warehouse and Terraform IAM. Onboard Finance to certified explores in Looker/Metabase.

  4. Week 5–6: Wire OpenLineage to DataHub and annotate owners, runbooks, and SLOs. Tag certified datasets.

  5. Week 6–8: Move dashboards behind the semantic layer. Migrate existing Finance dashboards—no re‑invented metrics allowed.

  6. Week 8–10: Expand to Product Analytics. Rinse and repeat with contracts and SLOs. Add anomaly alerts for volume/freshness.

  7. Week 10–12: Retire legacy tables; enforce deprecation with dbt state comparison. Start monthly review of certification and incidents.

Results we’ve seen in the wild:

  • 65% reduction in dashboard incidents within 60 days

  • MTTR from “next business day” to < 60 minutes

  • Time to ship a new KPI dashboard: 2 weeks → 3 days

  • Query spend down 20% after deduping “rogue” datasets and pushing RLS to the warehouse

Anti‑patterns to avoid

I’ve watched teams burn months on these:

  • BI‑first: modeling and quality live in dashboards. You’ll never scale.

  • Catalog‑as‑paperwork: no owners, no SLOs, no trust. It’s a wiki graveyard.

  • Everything’s a mart: thousands of slightly different tables. Certify a few, kill the rest.

  • Over‑governance: tickets for every column. Ship guardrails, not red tape.

  • No on‑call: “Data Reliability Engineer” is a hat, not a title—assign rotation and page it.

What we’d do differently next time

  • Start SLOs earlier. Freshness/accuracy targets force real conversations with source teams.

  • Push more checks to producers. Contracts at the edge catch 80% of pain.

  • Invest in enablement: office hours, examples, and “golden paths” beat documentation alone.

  • Track adoption KPIs from day one; it changes behavior faster than policy.


Key takeaways

  • Self‑service fails without reliability. Bake in SLOs, tests, and ownership before you hand analysts the keys.
  • Use a thin semantic layer to keep metrics consistent across Looker, Superset, and notebooks.
  • Codify data contracts at the source to reduce breakages and MTTR.
  • Alert on freshness and volume anomalies with Prometheus; don’t rely on “the CFO noticed.”
  • Ship with guardrails: certified datasets, row‑level security, lineage, and clear owners.

Implementation checklist

  • Define freshness and accuracy SLOs per domain table before onboarding to BI.
  • Implement data contracts (JSON Schema/Avro) at ingestion and verify in CI/CD.
  • Adopt dbt tests + Great Expectations on critical paths; fail fast, fail loudly.
  • Expose lineage (OpenLineage) and a catalog (DataHub/OpenMetadata) with owners and SLAs.
  • Enforce RLS in the warehouse, not the BI tool. Keep governance close to the data.
  • Centralize metrics in a semantic layer; deploy models via GitOps, not the BI UI.
  • Instrument Prometheus alerts for data SLO breaches; page the owning team.
  • Measure business outcomes: time to first dashboard, adoption, MTTR, cost per query.

Questions we hear from teams

How do we pick between Looker, Metabase, and Superset?
Choose based on governance needs and existing skills. If you need strong modeling, governance, and a mature semantic layer, Looker can be worth the spend. Metabase is great for quick wins with permissions and ease of use. Superset + a headless semantic layer (Cube/dbt metrics) is flexible for engineering‑heavy teams. In all cases, enforce RLS in the warehouse and keep definitions out of the BI UI.
Do we need both dbt tests and Great Expectations?
For critical domains, yes. dbt tests are great for schema and column‑level assertions tied to transformations. Great Expectations (or Soda) complements with cross‑table business rules and profile‑based anomaly detection. Many teams start with dbt tests and add a suite where incidents have actually happened.
What SLOs should we start with?
Begin with freshness (e.g., max 15 minutes lag for fct_orders during business hours) and accuracy (e.g., reconciling totals within 0.5% to the ledger). Add completeness (expected rows vs. source) once you have contracts. Make them visible in the catalog and page the owning team on breach.
How do we keep analysts from creating shadow datasets?
Give them certified, well‑modeled explores and the ability to prototype in sandboxes with quotas. Enforce deprecation rules and cost visibility. Most shadow data is a symptom of slow pipelines or missing definitions—fix those first, then require all production dashboards to depend on certified models via dbt exposures.
Can we do this without a data catalog?
You can start, but you won’t scale trust. At minimum, stand up OpenMetadata or DataHub, ingest dbt docs and lineage, and attach owners/SLOs. It becomes the social contract that makes self‑service viable.

Ready to modernize your codebase?

Let GitPlumbers help you transform AI-generated chaos into clean, scalable applications.

Talk to GitPlumbers about fixing your self‑service analytics.

Download the Data Reliability SLO checklist.
