
From Transactional AI to Analytical AI

February 14, 2026 · Johnson Kuan

For all the data warehouse enthusiasts out there, this analogy is for you.

AI is in its OLTP era. The world needs AI OLAP.

OLTP = online transaction processing (think Postgres)
OLAP = online analytical processing (think ClickHouse)

Most AI inference out there is transactional processing: one prompt request, one model response. Think chatbots, copilots, agents.

But what if we want to run AI as a big data processor, scanning large volumes of raw, unstructured data and generating structured, database-ready tables for downstream analytics? And by the way, we also want it to be fast, accurate, and cost-effective. Transactional AI inference was not built for this. What we need for this workload is Analytical AI inference. AI OLAP.

A quick history lesson

In the early days of databases, everything ran on a single system. Your point-of-sale transactions, your inventory lookups, and your end-of-quarter reporting all hit the same database. It worked until it didn't.

OLTP systems were optimized for what they were designed to do: process individual transactions quickly and reliably. Insert a row. Update a record. Look up an account. Low latency, high concurrency, one operation at a time.

But when a business analyst needed to ask “what were our top-selling products across all regions last quarter?” that query would scan across millions of rows, lock tables, and bring the transactional system to its knees. The workloads were fundamentally different, and the architecture needed to reflect that.

That's why OLAP was born. Data warehouses. Star schemas. Pre-aggregated cubes. Systems purpose-built not for one record at a time, but for scanning, aggregating, and analyzing massive datasets. Eventually, columnar storage largely replaced the cubes, but the core insight never changed: the same data needed a completely different engine to serve a completely different job.

Sound familiar?

AI inference today looks a lot like databases did before the OLTP/OLAP split.

Right now, the entire AI infrastructure stack—the models, the serving frameworks, the APIs, the tooling—is built around a single pattern: one request in, one response out. A user asks a question, the model answers. An agent gets a task, the model reasons through it. It's transactional. And for that use case, it works beautifully.

But enterprises are sitting on oceans of unstructured data: customer support tickets, compliance communications, user-generated content, call transcripts, contracts, claims. By most estimates, 80% or more of enterprise data is unstructured. And the question they're asking isn't “can AI answer one question?” It's “can AI process all of this?”

That's an analytical workload. And trying to force it through transactional infrastructure is the modern equivalent of running your quarterly reporting queries on your production OLTP database. You can technically do it. But it'll be slow, expensive, brittle, and operationally painful.

What makes the workloads different

The distinction isn't just about volume. It's about the entire shape of the work.

Transactional AI inference is interactive. A human (or an agent) is in the loop, waiting for a response. Latency matters because someone is on the other end. The input is typically a single prompt or a short conversation. The output is freeform, natural language meant to be read by a person.

Analytical AI inference is batch-oriented. No one is sitting there waiting for row 47,312 to finish processing. What matters is throughput, cost-efficiency, and consistency across the entire dataset. The inputs are thousands or millions of records, often with the same analytical question applied uniformly. And the output isn't a paragraph. It's structured data. Labels, scores, extracted fields, classifications, summarizations. Data that flows into dashboards, databases, and downstream systems.

When you frame it this way, the architectural requirements diverge sharply:

                Transactional AI                   Analytical AI
Interaction     Interactive, human-in-the-loop     Batch, system-to-system
Priority        Low latency                        High throughput, low cost
Input           Single prompt / conversation       Single prompt / millions of records
Output          Freeform natural language          Structured data (labels, scores, fields)
Consistency     Per-response quality               Cross-dataset uniformity
Integration     Chat UI, agent framework           Data pipeline, data lake, data warehouse

The gap in the market

Today, if an enterprise wants to run AI on a million documents, here are their options:

Option 1: Loop over a chat API. Take your dataset, write a script, and call GPT/Claude/Gemini for each record. This technically works, but even with concurrent calls you're hitting rate limits, paying conversational model prices for batch processing, managing retries and backoff logic, getting inconsistent outputs across records, and praying your structured output parsing doesn't break at row 500,000. And you're paying for general-purpose models that were built to do everything—creative writing, open-ended reasoning, world knowledge—when all you need is accurate, structured extraction from your specific data. Every API call is burning compute and money on billions of extra model parameters that you don't actually need for the task.

Even with batch APIs from the major providers, you still need to construct a prompt for every record, format JSONL files, manage job submissions, parse outputs, and validate schemas yourself. You're building a DIY analytical pipeline every time. You've essentially duct-taped OLTP infrastructure into an OLAP job.
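
To make that concrete, here's a minimal sketch of what Option 1 typically looks like, assuming the OpenAI Python SDK; the model name, prompt, and ticket data are placeholders:

```python
import json
import time

from openai import OpenAI  # assumes the OpenAI Python SDK (v1+) and an API key in the environment

client = OpenAI()

PROMPT = (
    "Classify this support ticket. Respond with only JSON like "
    '{"category": "...", "sentiment": "...", "priority": 1}'
)

def classify(ticket_text: str, max_retries: int = 5) -> dict:
    """One chat-style call per record, with naive exponential backoff for rate limits."""
    for attempt in range(max_retries):
        try:
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder: any general-purpose chat model
                messages=[
                    {"role": "system", "content": PROMPT},
                    {"role": "user", "content": ticket_text},
                ],
            )
            # Hope the reply is valid JSON; this is the line that breaks at row 500,000.
            return json.loads(resp.choices[0].message.content)
        except Exception:
            time.sleep(2 ** attempt)  # back off and retry on rate limits / transient errors
    raise RuntimeError("record failed after retries")

tickets = ["My invoice was charged twice", "Can't log in after password reset"]  # stand-in data
results = [classify(t) for t in tickets]  # one round trip per record: a million tickets, a million calls
```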

Option 2: Build it yourself. Stand up your own inference cluster, fine-tune a model, build the orchestration layer, handle schema validation, manage distributed processing, and maintain it all. This is what well-resourced ML teams at big tech companies do. For everyone else, it's a 6–12 month project before you process your first production record.

Option 3: Don't use AI at all. Stick with keyword matching, regex rules, and manual review. This is what most enterprises are still doing, not because they don't want AI, but because the infrastructure to use it at analytical scale doesn't exist as a product.

None of these options are ideal. We can do better. The market is missing the analytical layer.

What AI OLAP actually looks like

Just as data warehouses weren't just “bigger databases”, AI OLAP isn't just “more API calls”. It's a fundamentally different architecture optimized for a fundamentally different workload.

An analytical AI engine should be built from the ground up for batch, structured processing of unstructured data. That means purpose-built models that trade conversational fluency for structured output accuracy. Smaller, faster, and more cost-effective because they're not carrying the overhead of capabilities the workload doesn't need. You don't need a model that can write poetry and recall ancient philosophy to classify a support ticket.

It means distributed inference that can process 50,000 records in seconds, not 50,000 API calls over hours. It means schema-aware processing that guarantees your output conforms to the exact structure your downstream systems expect, without the performance degradation that comes from forcing structured output at decoding time. And it means a delivery pattern that fits into existing data infrastructure. Data in, structured results out. Not a chat window.
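
To make "data in, structured results out" concrete, here's a minimal, hypothetical sketch of the consuming side. The schema, field names, and sample rows are illustrative placeholders, not Queryboost's API or output format; the point is that the contract is a table, enforced before anything reaches downstream systems:

```python
from dataclasses import dataclass

# Hypothetical output contract for a support-ticket workload (illustrative only).
@dataclass(frozen=True)
class TicketRow:
    ticket_id: str
    category: str   # e.g. "billing", "login", "bug"
    sentiment: str  # "positive" | "neutral" | "negative"
    priority: int   # 1 (low) .. 5 (urgent)

def validate(row: dict) -> TicketRow:
    """Enforce the schema before anything is loaded into the warehouse."""
    if row["sentiment"] not in {"positive", "neutral", "negative"}:
        raise ValueError(f"bad sentiment: {row['sentiment']}")
    if not 1 <= int(row["priority"]) <= 5:
        raise ValueError(f"bad priority: {row['priority']}")
    return TicketRow(row["ticket_id"], row["category"], row["sentiment"], int(row["priority"]))

# Stand-in for the engine's structured output over the whole dataset.
raw_rows = [
    {"ticket_id": "T-1", "category": "billing", "sentiment": "negative", "priority": 4},
    {"ticket_id": "T-2", "category": "login", "sentiment": "neutral", "priority": 2},
]
table = [validate(r) for r in raw_rows]  # database-ready rows, uniform across the dataset
```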

In fact, if we're being precise, analytical AI inference sits at the intersection of ETL and OLAP. It transforms unstructured data into structured output. That's the extraction and transformation. But it does so at analytical scale, on demand, across millions of records. That's the OLAP. It's the part of the AI data stack that turns raw text into queryable, structured datasets.

The value of OLAP was never the query engine alone. It was that it unlocked an entirely new category of work. Before data warehouses, business intelligence was a niche discipline limited to IT specialists running custom reports. Once you could reliably scan and aggregate millions of records, BI became accessible to the entire organization, and an entire ecosystem of dashboards, reports, and data-driven decision-making emerged.

AI OLAP unlocks the same kind of shift. When you can reliably process millions of unstructured records with AI, you don't just automate existing workflows. You enable entirely new ones. Compliance teams can monitor every communication, not a 5% sample. Customer experience teams can analyze every ticket, not just the ones that get escalated. Trust and safety teams can classify every piece of content, not just the ones flagged by keyword rules.

The split is inevitable

The database world learned this lesson decades ago. You don't run analytical workloads on transactional infrastructure. Different jobs need different tools.

AI is learning the same lesson right now. The industry has spent the last few years building increasingly powerful transactional inference: better chatbots, better copilots, better agents. That work is valuable and will continue.

But the next frontier isn't making chatbots smarter. It's making AI infrastructure that can process the vast majority of enterprise data that never enters a chat window. The unstructured data sitting in data lakes, flowing through pipelines, accumulating in storage buckets. Waiting to be understood.

The transactional era gave us AI we can talk to. The analytical era will give us AI that works for us. Quietly, at scale, on the data that actually runs the business.

The OLTP/OLAP split is coming to AI. And just like in the database world, the companies that get the analytical layer right will define the next era of the industry.

This is what we're building at Queryboost. The analytical AI layer for enterprise data.

Johnson Kuan

Founder & CEO, Queryboost