Introducing Thinkbox, the neuro-symbolic reasoning engine behind AirQuery.
Thinkbox

An analytics AI harness, forged for reasoning.

Thinkbox is the framework that turns raw data into a structure AI can actually reason over. At its core is the Thinkmap — a hypergraph that fuses three first-class models into one. Around it: every analytics tool your agent needs to deliver an answer worth trusting.

The Thinkmap

Three models. One graph.

Most "AI for data" tools see only one layer — usually the schema. Thinkbox sees three and binds them together as a single navigable hypergraph. The result is the Thinkmap — the language AI uses to reason about your business, in your terms, against your data.

Layer 01 — Data Models

The literal truth of your warehouse.

Tables. Columns. Types. Primary keys. Foreign keys. Lineage. The raw, exact shape of your data — captured into the Thinkmap with zero interpretation. This is what a database administrator would draw on a whiteboard, encoded as graph nodes the agent can walk.

In our Superstore demo Thinkmap:

  • Two fact tables: fact_sales and fact_returns
  • Nine dimension tables: dim_customer, dim_product, dim_date, dim_region, dim_state, dim_segment, dim_ship_mode, dim_category, dim_sub_category
  • Foreign keys traced from every fact row to the dimension that defines it — orders point at customers, customers point at segments
  • Column types, nullability, default values, and source lineage all bound to the graph as node properties

fact_sales (FACT)
  order_id         string
  customer_id      string
  product_id       string
  ship_mode_id     int
  order_date_id    int
  sales            decimal
  quantity         int
  discount         decimal
  profit           decimal

dim_customer (DIM)
  customer_id      string
  customer_name    string
  segment_id       int

dim_product (DIM)
  product_id       string
  product_name     string
  sub_category_id  int

Layer 02 — Semantic Models

What the numbers actually mean.

Tables and columns by themselves don't say anything about your business. The semantic layer fixes that — it names the entities your team talks about, defines the metrics your reports rely on, and binds each one to the exact data-layer computation that produces it. Once defined here, a metric means the same thing in every answer, forever.

In the Superstore demo:

  • First-class entities: Order, Order Item, Customer, Product — each backed by data tables, each addressable by name
  • Verified metrics: revenue, profit_margin, avg_order_value, avg_discount, total_quantity, order_count
  • Relationships modeled with semantics: Customer PLACES Order, Order CONTAINS Order Item, Order Item REFERENCES Product
  • Every metric has a single signed-off definition — finance and sales no longer argue about whose "revenue" is correct

revenue        SUM(fact_sales.sales) WHERE NOT returned
profit_margin  SUM(profit) ÷ SUM(sales)
order_count    COUNT(DISTINCT fact_sales.order_id)

↓ ↓ ↓

fact_sales:    sales, profit, order_id
fact_returns:  order_id, return_id

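
The binding between a metric name and its data-layer computation can be sketched in a few lines of Python. This is only an illustration, not Thinkbox's actual API: the `Metric` class, the `METRICS` registry, and `compile_metric` are hypothetical names, and `returned` is kept as the placeholder predicate it is in the definition above, since how it is derived from fact_returns is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    name: str
    expression: str               # aggregate expression, copied from the demo definitions
    table: str                    # data-layer table the expression reads from
    where: Optional[str] = None   # optional filter predicate


# Hypothetical registry holding the three demo metrics shown above.
METRICS = {
    m.name: m
    for m in [
        Metric("revenue", "SUM(fact_sales.sales)", "fact_sales", "NOT returned"),
        Metric("profit_margin", "SUM(profit) / SUM(sales)", "fact_sales"),
        Metric("order_count", "COUNT(DISTINCT fact_sales.order_id)", "fact_sales"),
    ]
}


def compile_metric(name: str) -> str:
    """Resolve a metric name to SQL text; unknown names raise instead of being guessed."""
    if name not in METRICS:
        raise KeyError(f"unknown metric: {name!r}")
    m = METRICS[name]
    sql = f"SELECT {m.expression} AS {m.name} FROM {m.table}"
    if m.where:
        sql += f" WHERE {m.where}"
    return sql
```

Because every caller goes through the same lookup, the same name always yields the same SQL, and a name that was never defined fails loudly rather than being improvised.
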
Layer 03 — Applied Ontology Models

The shape of your business world.

An ontology captures the way concepts are organised in your domain: which products roll up into which categories, how regions decompose into states and cities, what counts as a "shipping class," how a customer becomes a "premium" customer. This is the layer of knowledge an AI cannot infer from your schema alone — you have to teach it. Once taught, the agent can reason in your language, not the database's.

In the Superstore demo:

  • Product hierarchy — three top-level Category nodes (Furniture, Office Supplies, Technology), each with their own sub-categories and SKUs
  • Geography hierarchy: Region → State → City, so the agent knows "the Midwest" without you spelling out the states
  • Customer segment taxonomy — Consumer, Corporate, Home Office — each with its own profitability profile baked in as a domain rule
  • Ship-mode service tiers — ordered from slowest to fastest, so "fastest shipping option" resolves correctly without anyone hand-coding it

Product Hierarchy (Category → sub-categories)
  • Furniture: Chairs, Tables, Bookcases
  • Office Supplies: Binders, Paper, Storage
  • Technology: Phones, Accessories, Copiers

Ship Mode service tier (slowest → fastest)
Standard Class → Second Class → First Class → Same Day

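
As a rough illustration of why an ordered tier and an explicit hierarchy matter, the two demo structures above can be written as plain Python data. The names `SHIP_MODE_TIERS`, `PRODUCT_HIERARCHY`, `fastest_ship_mode`, and `category_of` are assumptions made for this sketch and do not describe how the Thinkmap stores its ontology.

```python
# Ordered slowest -> fastest, exactly as in the demo ontology above.
SHIP_MODE_TIERS = ["Standard Class", "Second Class", "First Class", "Same Day"]

# Category -> sub-categories, as listed in the demo product hierarchy.
PRODUCT_HIERARCHY = {
    "Furniture": ["Chairs", "Tables", "Bookcases"],
    "Office Supplies": ["Binders", "Paper", "Storage"],
    "Technology": ["Phones", "Accessories", "Copiers"],
}


def fastest_ship_mode(available=None):
    """Return the fastest tier, optionally restricted to the modes on offer."""
    if available is None:
        candidates = SHIP_MODE_TIERS
    else:
        offered = set(available)
        candidates = [mode for mode in SHIP_MODE_TIERS if mode in offered]
    if not candidates:
        raise ValueError("no known ship mode available")
    return candidates[-1]  # last entry is the fastest because the list is ordered


def category_of(sub_category):
    """Roll a sub-category up to its top-level category."""
    for category, subs in PRODUCT_HIERARCHY.items():
        if sub_category in subs:
            return category
    raise KeyError(f"unknown sub-category: {sub_category!r}")
```

With the ordering stored once, a phrase like "fastest shipping option" becomes a lookup rather than something the model has to guess from free-text labels.
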
Why a hypergraph

Built for relationships a regular graph can't hold.

A regular graph connects two nodes at a time. The Thinkmap is a hypergraph — a single edge can connect many nodes across all three model layers at once. A metric, the tables it sums, the business rule that defines it, and the ontology concept it serves are all bound together in one hyperedge. Ask a question, and the agent traverses these multi-way relationships natively — no joining required, no inference gymnastics.
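
To make the difference from an ordinary graph concrete, here is a minimal hypergraph sketch in Python. It is not the Thinkmap's implementation: `Node`, `Hypergraph`, and the edge label are hypothetical, and the nodes placed in the example edge are an illustrative guess at what a "revenue" hyperedge might bind.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    layer: str   # "data", "semantic", or "ontology"
    name: str


class Hypergraph:
    def __init__(self):
        self.edges = {}                    # edge label -> frozenset of member nodes
        self.incidence = defaultdict(set)  # node -> labels of the edges it belongs to

    def add_edge(self, label, nodes):
        """A hyperedge may connect any number of nodes, not just two."""
        members = frozenset(nodes)
        self.edges[label] = members
        for node in members:
            self.incidence[node].add(label)

    def neighbors(self, node):
        """Every node that shares at least one hyperedge with `node`."""
        found = set()
        for label in self.incidence.get(node, ()):
            found |= self.edges[label]
        found.discard(node)
        return found


thinkmap = Hypergraph()
thinkmap.add_edge(
    "revenue_definition",  # hypothetical edge label
    [
        Node("semantic", "revenue"),
        Node("data", "fact_sales.sales"),
        Node("data", "fact_returns.order_id"),
        Node("ontology", "Category"),
    ],
)
```

One call to `thinkmap.neighbors(Node("semantic", "revenue"))` reaches the columns and the ontology concept together, whereas a pairwise graph would need one edge per pair of members to express the same grouping.
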

Analytics tools

Every building block you need to deliver an answer.

A Thinkmap on its own is just a graph. Thinkbox ships with the analytics tools that actually execute against it — so the agent has everything it needs to turn a question into a defensible result.

Query Planner

Decomposes a question into a step-by-step plan over the Thinkmap, with cost estimates.

Metric Compiler

Resolves a metric name to its verified definition, every time. No drift across teams.

Time-Series Engine

First-class support for windowed aggregations, period-over-period, rolling stats, seasonality.
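
Two of the operations named here, rolling statistics and period-over-period change, are simple enough to sketch in plain Python. The function names are assumptions for illustration; the engine's real interface is not described in this page.

```python
def rolling_mean(values, window):
    """Mean of each full trailing window; the first window-1 points produce no output."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def period_over_period(values, lag=1):
    """Relative change versus the value `lag` periods earlier (None when that value is 0)."""
    changes = []
    for previous, current in zip(values, values[lag:]):
        changes.append(None if previous == 0 else (current - previous) / previous)
    return changes
```

For monthly data, `lag=12` gives a year-over-year comparison and `lag=1` gives month-over-month.
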

Comparison Engine

Same-grain compare across cohorts, regions, products. Tells you what's different, not just what's there.

Anomaly Detection

Surfaces what shouldn't be there. Thresholds, z-scores, change-point detection, contribution analysis.
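
Of the techniques listed, the z-score check is the easiest to show. The sketch below uses only the Python standard library; the function name and default threshold are assumptions, not Thinkbox settings.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Indices of points lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []           # a constant series has no outliers
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```

On short series the outlier itself inflates the standard deviation, so a lower threshold (for example 2.0) or a robust statistic may be needed in practice.
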

Root-Cause Search

Walks the Thinkmap looking for the smallest set of nodes that explain a deviation.
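
A heavily simplified version of this idea, assuming the deviation has already been broken down by one flat dimension rather than walked across a graph, is to pick the fewest segments whose changes account for most of the total shift. The function name and the `coverage` parameter are hypothetical.

```python
def smallest_explaining_set(baseline, current, coverage=0.8):
    """Fewest segments whose deltas cover `coverage` of the total change.

    `baseline` and `current` map segment name -> metric value.
    """
    segments = set(baseline) | set(current)
    deltas = {s: current.get(s, 0) - baseline.get(s, 0) for s in segments}
    total = sum(deltas.values())
    if total == 0:
        return []
    sign = 1 if total > 0 else -1
    # Largest contributors in the direction of the overall change come first;
    # taking them in this order reaches the target with the fewest segments.
    ranked = sorted(deltas.items(), key=lambda kv: (-sign * kv[1], kv[0]))
    chosen, explained = [], 0.0
    for segment, delta in ranked:
        if sign * explained >= coverage * abs(total):
            break
        chosen.append(segment)
        explained += delta
    return chosen
```

A search over the Thinkmap would repeat this kind of step along hierarchy and relationship edges; that traversal is not shown here.
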

SQL Execution

Compiles plans to SQL. Runs against your warehouse. Streams results.

Confidence Scoring

Every answer comes with a confidence score, the evidence behind it, and the parts of the Thinkmap it touched.

Audit Trail

Full reasoning trace stored for every question. Reproducible, citable, regulator-ready.

Evals, built in

Quality you can see improve.

Thinkbox isn't a one-shot model you train and forget. It's a living harness that has to stay accurate as your data, your business, and the questions people ask all change. Thinkbox ships with a full evals framework: the same discipline software teams use to test code, applied to your analytics knowledge. Every change you make to the Thinkmap can be measured, regression-tested, and shipped with confidence.

Golden-set evals

A curated set of questions with known-correct answers. Run it on every change — track accuracy as a percentage over time.
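
In its simplest form, a golden-set run is a loop over question and expected-answer pairs. The sketch below assumes a hypothetical `answer_fn` callable standing in for the agent and exact-match scoring; real answers would likely need a tolerance or a structured comparison.

```python
def golden_set_accuracy(golden, answer_fn):
    """Percentage of golden questions whose answer matches the known-correct value."""
    if not golden:
        raise ValueError("golden set is empty")
    correct = sum(1 for question, expected in golden.items() if answer_fn(question) == expected)
    return 100.0 * correct / len(golden)
```

Recording this percentage for every Thinkmap change gives the accuracy-over-time series described above.
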

Regression evals

Did the metric you just refined break a question that used to work? The eval suite catches it before your CFO does.
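
One plain way to express a regression check, assuming each eval run is stored as a mapping from question to pass/fail (a format invented for this sketch), is to list the questions that passed before a change and no longer pass after it.

```python
def regressions(before, after):
    """Questions that passed in `before` but fail or are missing in `after`."""
    return sorted(
        question
        for question, passed in before.items()
        if passed and not after.get(question, False)
    )
```

An empty list means the change broke nothing that previously worked; questions that were already failing are deliberately ignored.
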

Coverage evals

Are the metrics, entities, and ontology concepts in your Thinkmap actually covering the questions your team asks? Surface the gaps.

Calibration evals

Confidence scores are only useful if they correlate with accuracy. Calibration evals verify a 95%-confidence answer is right 95% of the time.
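
A common way to quantify this is expected calibration error (ECE): bucket answers by reported confidence and compare each bucket's average confidence with its observed accuracy. The source does not say which measure Thinkbox uses, so ECE here is a stand-in, and the function names and input format are assumptions.

```python
from collections import defaultdict


def calibration_report(results, n_bins=10):
    """results: iterable of (confidence in [0, 1], was_correct).

    Returns one (mean_confidence, accuracy, count) tuple per non-empty bucket.
    """
    buckets = defaultdict(list)
    for confidence, was_correct in results:
        index = min(int(confidence * n_bins), n_bins - 1)  # confidence 1.0 goes in the top bucket
        buckets[index].append((confidence, was_correct))
    report = []
    for index in sorted(buckets):
        items = buckets[index]
        mean_confidence = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        report.append((mean_confidence, accuracy, len(items)))
    return report


def expected_calibration_error(results, n_bins=10):
    """Count-weighted average gap between confidence and accuracy; 0 is perfectly calibrated."""
    report = calibration_report(results, n_bins)
    total = sum(count for _, _, count in report)
    return sum(count * abs(conf - acc) for conf, acc, count in report) / total
```

If 20 answers are reported at 0.95 confidence and 19 are right, the error is about zero; if only 10 are right, it is about 0.45.
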

Performance evals

Latency, token cost, and DAG complexity per question. Catch regressions in speed and spend, not just accuracy.

Ontology completeness

Find columns in your data with no ontology binding, metrics with no business definition, entities the model can't recognise. Then fix them.
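
The first two checks reduce to set differences once the bindings are available as data. The sketch below assumes a hypothetical export format: a list of column names, a mapping from column to ontology concept, and a mapping from metric name to its business definition text.

```python
def completeness_gaps(columns, column_bindings, metric_definitions):
    """Report columns without an ontology binding and metrics without a definition."""
    unbound = sorted(c for c in columns if c not in column_bindings)
    undefined = sorted(
        name
        for name, definition in metric_definitions.items()
        if not (definition or "").strip()  # None or blank text counts as missing
    )
    return {"unbound_columns": unbound, "undefined_metrics": undefined}
```

Both lists being empty is the target state; anything left in them is a concrete to-do item for whoever maintains the Thinkmap.
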

Evals run continuously, on every Thinkmap change, and on a schedule against your live data. Quality stops being a hope — it becomes a number on a dashboard, trending up.

Why a harness, not a model

AI without a harness is improvisation.

You can hand an LLM a database and ask it questions. It will produce plausible-sounding answers. Some will be right. Some will hallucinate metrics, misjoin tables, or quietly contradict last month's report. Thinkbox is the harness that makes the LLM's reasoning structured, grounded, and reproducible. Same question, same answer, always. Cited to the metric, the rule, the row.
