Business · 8 min read

OpenAI's AI Data Agent: 2 Engineers, 4,000 Users, 3 Months

Two engineers built an AI agent that transformed how 4,000 OpenAI employees access data. The tool, 70% AI-coded, turns hours of SQL work into minutes of plain-English queries.

Why Does OpenAI's Internal AI Data Agent Matter for Your Business?

A finance analyst at OpenAI used to spend hours hunting through 70,000 datasets, writing SQL queries, and verifying table schemas just to compare revenue across different regions. Today, that same analyst types a plain-English question into Slack and receives a finished chart in minutes.

The transformation came from an AI data agent built by just two engineers in three months. Seventy percent of its code was written by AI itself.

Now more than 4,000 of OpenAI's roughly 5,000 employees use it daily, making it one of the most aggressive AI agent deployments inside any company worldwide. This is not just an internal efficiency story. It is a blueprint for how enterprises will compete in the next decade, and OpenAI says anyone can replicate it.

What Scale of Data Chaos Do Enterprises Face?

OpenAI's data platform spans more than 600 petabytes across 70,000 datasets. Emma Tang, head of data infrastructure at OpenAI, leads the team that built the agent. She oversees big data systems, streaming, and data tooling that serves over 4,000 internal users.

The sheer volume creates a fundamental problem. Even locating the correct table can consume hours of a data scientist's time.

Traditional business intelligence tools require technical expertise, knowledge of complex schemas, and institutional memory about which data sources to trust. The AI data agent eliminates these barriers.

It accepts plain-English questions through Slack, a web interface, IDEs, the Codex CLI, and OpenAI's internal ChatGPT app. It returns charts, dashboards, and long-form analytical reports that previously required specialized skills.

How Does This Differ from Traditional BI Tools?

Most enterprise AI agents today operate in departmental silos. You might have a finance bot here, an HR bot there. OpenAI's agent cuts horizontally across the entire organization.

Tang's team launched department by department, curating specific memory and context for each group. But ultimately, everything lives in the same database.

A senior leader can combine sales data with engineering metrics and product analytics in a single query. The team estimates the agent saves two to four hours of work per query.

But Tang emphasized a larger benefit: the agent gives people access to analysis they simply could not have done before, regardless of available time.

What Real-World Use Cases Span Every Department?

OpenAI's finance team queries the agent for revenue comparisons across geographies and customer cohorts. Product managers use it to understand feature adoption. Engineers diagnose performance regressions by asking whether a specific ChatGPT component is slower than yesterday, and which latency components explain the change.

The agent handles strategic, multi-step analysis that would take humans days. Tang described a recent case where a user spotted discrepancies between two dashboards tracking Plus subscriber growth.

"The data agent can give you a chart and show you, stack rank by stack rank, exactly what the differences are," she said. "There turned out to be five different factors. For a human, that would take hours, if not days, but the agent can do it in a few minutes."

This capability extends beyond technical teams. Engineers, growth teams, product managers, and non-technical employees who don't know the ins and outs of company data systems can now pull sophisticated insights independently.

How Does Codex Solve the Table Discovery Problem?

Finding the right table among 70,000 datasets is the single hardest technical challenge, according to Tang. This is where Codex, OpenAI's AI coding agent, plays its most inventive role.

Codex serves triple duty in the system:

  • Users access the data agent through Codex via MCP
  • The team used Codex to generate more than 70% of the agent's own code
  • Codex runs a daily asynchronous process that examines important data tables, analyzes pipeline code, and determines dependencies, ownership, granularity, and join keys

When a user asks about revenue, the agent searches a vector database to find which tables Codex has already mapped to that concept. This "Codex Enrichment" is one of six context layers the agent uses.

The layers range from basic schema metadata and curated expert descriptions to institutional knowledge pulled from Slack, Google Docs, and Notion. The system also includes a learning memory that stores corrections from previous conversations.
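As a rough illustration of the retrieval step, here is a minimal sketch of matching a plain-English question against Codex-enriched table descriptions. The table names, metadata fields, and the bag-of-words "embedding" are all illustrative stand-ins; OpenAI's actual vector database and enrichment pipeline are not public.

```python
import math

# Hypothetical Codex-enriched table descriptions; names and fields are
# illustrative stand-ins, not OpenAI's real metadata.
TABLE_DOCS = {
    "finance.revenue_daily":
        "daily revenue by region and customer cohort, owner: finance, join key: region_id",
    "product.feature_adoption":
        "feature adoption events per user, owner: product, granularity: daily",
    "infra.chatgpt_latency":
        "per-component ChatGPT latency breakdown, owner: infra, granularity: hourly",
}

def embed(text: str) -> dict:
    """Stand-in for a real embedding model: bag-of-words term counts."""
    counts: dict = {}
    for token in text.lower().replace(",", " ").split():
        counts[token] = counts.get(token, 0) + 1
    return counts

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def find_tables(question: str, top_k: int = 1) -> list:
    """Rank enriched table descriptions against the user's question."""
    q = embed(question)
    ranked = sorted(TABLE_DOCS,
                    key=lambda t: cosine(q, embed(TABLE_DOCS[t])),
                    reverse=True)
    return ranked[:top_k]
```

In practice a learned embedding model and a real vector store would replace the toy scoring here, but the flow (enrich tables offline, embed the question, rank by similarity) is the same.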

What Prompt Engineering Forces Better Thinking?

Tang was remarkably candid about the agent's biggest behavioral flaw: overconfidence. The model often says "this is the right table" and immediately starts analysis without proper validation.

The fix came through prompt engineering that forces the agent to linger in a discovery phase. The prompt reads like coaching a junior analyst: "Before you run ahead with this, I really want you to do more validation on whether this is the right table. So please check more sources before you go and create actual data."

The team learned through rigorous evaluation that less context can produce better results. Dumping everything into the system doesn't improve performance; fewer, more curated and accurate context sources deliver superior outcomes.

To build trust, the agent streams its intermediate reasoning to users in real time. It exposes which tables it selected and why, linking directly to underlying query results, and users can interrupt the agent mid-analysis to redirect it.

What About Safety and Access Control?

Tang took a pragmatic approach to safety that may surprise enterprises expecting sophisticated AI alignment techniques. "I think you just have to have even more dumb guardrails," she said.

The system relies on strong access control. The agent always uses your personal token, so you only access what you're already authorized to see.

It operates purely as an interface layer, inheriting the same permissions that govern OpenAI's data. The agent never appears in public channels, only in private channels or a user's own interface.

Write access is restricted to a temporary test schema that gets wiped periodically and can't be shared. User feedback closes the loop. Employees flag incorrect results directly, and the team investigates.
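The "dumb guardrails" Tang describes can be sketched in a few lines. The schema name and verb list below are assumptions for illustration; the real system's checks are not public.

```python
# Hypothetical name for the temporary, periodically wiped test schema.
TEST_SCHEMA = "agent_scratch"
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER", "TRUNCATE"}

def query_allowed(sql: str) -> bool:
    """Dumb guardrail: reads pass through unchanged, because the warehouse
    enforces the user's own token-based permissions; writes are only allowed
    into the temporary test schema."""
    verb = sql.strip().split()[0].upper()
    if verb not in WRITE_VERBS:
        return True
    return f"{TEST_SCHEMA}." in sql
```

Because the agent always queries with the user's personal token, a guardrail like this only has to constrain writes; read authorization is inherited from the data warehouse itself.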

The model's self-evaluation adds another check. At the end of every task, the model evaluates its own performance.

Why Won't OpenAI Sell This Tool?

Despite obvious commercial potential, OpenAI has no plans to productize its internal data agent. The strategy is to provide building blocks and let enterprises construct their own.

Tang made clear that everything her team used to build the system is already available externally. "We use all the same APIs that are available externally," she said. "The Responses API, the Evals API. We don't have a fine-tuned model. We just use 5.2. So you can definitely build this."
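Tang's claim that the stack uses only public building blocks can be made concrete with a sketch of a Responses API request. The field names (`model`, `instructions`, `input`) follow OpenAI's public Responses API, but the model name, the instruction wording, and the helper function are assumptions, not OpenAI's internal configuration.

```python
def build_data_agent_request(question: str) -> dict:
    """Assemble kwargs for a hypothetical client.responses.create(...) call.
    The discovery-phase wording echoes the coaching-style prompt described
    in this article; the exact text is illustrative."""
    return {
        "model": "gpt-5.2",  # assumption: a standard model, no fine-tune
        "instructions": (
            "You are a data analyst agent. Before you run ahead, validate "
            "whether you have the right table: check schema metadata, expert "
            "descriptions, and prior corrections before producing any data."
        ),
        "input": question,
    }

request = build_data_agent_request(
    "Why do the two Plus-subscriber dashboards disagree?"
)
```

With the official `openai` Python SDK, these kwargs would be passed as `client.responses.create(**request)`; the Evals API the team mentions would then be used to grade transcripts like this one offline.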

This message aligns with OpenAI's broader enterprise push. The company launched OpenAI Frontier in early February, an end-to-end platform for enterprises to build and manage AI agents.

It has enlisted McKinsey, Boston Consulting Group, Accenture, and Capgemini to help sell and implement the platform. Codex is now used by 95% of engineers at OpenAI and reviews all pull requests before they're merged. Its global weekly active user base has tripled since the start of the year, surpassing one million users.

How Has Codex Evolved Beyond Coding?

Tang described a shift in how employees use Codex that transcends coding entirely. "Codex isn't even a coding tool anymore. It's much more than that," she said.

Non-technical teams use it to organize thoughts, create slides, and generate daily summaries. One of her engineering managers has Codex review her notes each morning, identify the most important tasks, pull in Slack messages and DMs, and draft responses.

This evolution points to a broader trend. AI agents are becoming general-purpose productivity multipliers rather than narrow task-specific tools.

What Unsexy Prerequisite Determines Success?

When asked what other enterprises should take away from OpenAI's experience, Tang didn't point to model capabilities or clever prompt engineering. She pointed to something far more mundane.

"This is not sexy, but data governance is really important for data agents to work well," she said. "Your data needs to be clean enough and annotated enough, and there needs to be a source of truth somewhere for the agent to crawl."

The underlying infrastructure hasn't been replaced by the agent. Storage, compute, orchestration, and business intelligence layers still do their jobs.

But the agent serves as a fundamentally new entry point for data intelligence, one that is more autonomous and accessible than anything that came before.

What Competitive Warning Should Enterprises Heed?

Tang closed with a warning for companies that hesitate. "Companies that adopt this are going to see the benefits very rapidly," she said. "And companies that don't are going to fall behind. It's going to pull apart. The companies who use it are going to advance very, very quickly."

When asked whether that acceleration worried her own colleagues, especially after recent layoffs at companies implementing AI, Tang paused. "How much we're able to do as a company has accelerated," she said, "but it still doesn't match our ambitions, not even one bit."

What Should Business Leaders Take Away?

OpenAI's AI data agent demonstrates that the bottleneck to smarter organizations is not better models. It is better data infrastructure and governance.

Two engineers built a system serving 4,000 users in three months because they leveraged AI to write 70% of the code. They used publicly available APIs, standard models, and thoughtful prompt engineering. No proprietary technology or massive development teams required.

The lesson for enterprises is clear. Start with data governance. Establish sources of truth. Build access controls. Then deploy AI agents as interface layers that democratize data access across your organization.


The companies that move quickly on this transformation will create compounding advantages. Those that wait will find themselves competing with organizations that operate at fundamentally different speeds.
