Almost all AI in business right now is reactive. You ask it something, it answers, you decide what to do. Chatbots, copilots, RAG, whatever. The human always starts the conversation.
I think that's a local maximum. The thing I keep seeing people talk about on Twitter and in engineering orgs is proactive intelligence. Agents with no UI that just... run. They watch what's happening across your business and do stuff without anyone asking them to.
Karpathy's "autoresearch" project captures the vibe really well. He's got an agent running LLM training experiments in a loop, committing improvements to a git branch, forever, with nobody involved.
That's ML research, but the same idea maps directly to running a business.
Paperclip is another one that's been going around. It's an open-source framework for running entire companies with AI agents.

You define a goal like "build the #1 AI note-taking app to $1M MRR," spin up a team of agents (CEO, CTO, engineers, designers, marketers), give them budgets, and hit go. They delegate work to each other, wake up on a schedule to check on things, and coordinate on their own. You just watch from a dashboard and jump in when you feel like it.
It's early and rough around the edges, but the fact that this exists as an open-source repo right now tells you something about where we're going.
What this looks like day to day
Say you run an e-commerce company. You set up a few agents and go about your day:
- Inventory agent watches stock levels and supplier lead times. A TikTok goes viral on Tuesday and demand spikes 400% on one SKU. The agent notices, checks that the supplier's port is backed up, sees marketing has another push scheduled for next week, and reorders from a backup supplier before you even hear about the TikTok.
- Support agent reads every incoming ticket in real time. It spots that 30 customers in the last hour all mentioned the same defect on the same product batch. Flags it to the quality team, pauses the listing, starts drafting responses.
- Pricing agent tracks competitor prices across a dozen SKUs. One competitor drops their price by 15% on a Friday night. The agent adjusts your margins within the bounds you set, logs why, and moves on.
Nobody kicked any of this off. Nobody's checking dashboards. The agents just run.
The key thing here is that this isn't Zapier. These aren't if-then rules. Each agent is reasoning about context, not matching patterns. The inventory agent doesn't fire because "stock < 100." It fires because it connected a viral moment, a port delay, and an upcoming campaign and decided to act.
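To make that distinction concrete, here's a toy sketch. All the names (`Signal`, `build_context`, `decide`) are mine, not from any real framework, and `decide` is a stand-in for what would really be an LLM call with tools: the point is that the agent assembles everything it knows into one context and reasons over it, instead of firing on `stock < 100`.

```python
from dataclasses import dataclass

# Hypothetical sketch -- these names are illustrative, not a real API.
@dataclass
class Signal:
    source: str
    summary: str

def build_context(signals: list[Signal]) -> str:
    # The agent doesn't watch a single threshold. It assembles every
    # signal it has into one context blob for a model to reason over.
    return "\n".join(f"[{s.source}] {s.summary}" for s in signals)

def decide(context: str) -> str:
    # Stand-in for the LLM call. A real agent would hand `context` to a
    # model with tools (reorder, flag, wait) and act on its choice; the
    # string checks here just simulate that for the sketch.
    if "demand spike" in context and "port backed up" in context:
        return "reorder_from_backup_supplier"
    return "wait"

signals = [
    Signal("social", "viral TikTok, demand spike 400% on SKU-123"),
    Signal("logistics", "primary supplier port backed up"),
    Signal("marketing", "another push scheduled next week"),
]
print(decide(build_context(signals)))  # -> reorder_from_backup_supplier
```

No single signal triggers the reorder; it's the combination that does, which is exactly what an if-then rule can't express ahead of time.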
The world model is the interesting part
Individual agents doing individual tasks? Not that exciting, honestly. The exciting part is when they share a brain. Your sales agent closes a big deal, the supply chain agent already knows. Support starts getting a bunch of complaints about a specific batch, the quality agent picks it up and the comms agent starts drafting a response. All before anyone checks a dashboard.
For decades companies have been chasing the "single pane of glass" dashboard. But dashboards just put data in front of people and hope they look at it. A world model puts data in front of agents that actually act on it. Your job shifts from staring at charts to setting goals and guardrails.
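If you want a mental model for the world model: at its simplest it's a shared store plus pub/sub, with agents as subscribers instead of humans as dashboard-watchers. A toy sketch (the `WorldModel` class and its methods are made up for illustration, not any real product's API):

```python
from collections import defaultdict

# Illustrative sketch of a shared world model -- names are mine.
class WorldModel:
    def __init__(self):
        self.facts = []                       # everything every agent knows
        self.subscribers = defaultdict(list)  # topic -> interested agents

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, fact):
        self.facts.append((topic, fact))
        for handler in self.subscribers[topic]:
            handler(fact)  # interested agents react immediately, no polling

world = WorldModel()
reactions = []

# Supply chain cares about closed deals; comms cares about defects.
world.subscribe("deal_closed",
                lambda f: reactions.append(f"supply-chain: plan stock for {f}"))
world.subscribe("defect_reported",
                lambda f: reactions.append(f"comms: draft response about {f}"))

# Sales closes a deal; supply chain knows the instant it happens.
world.publish("deal_closed", "Acme order, 10k units")
print(reactions)
```

The dashboard version of this is a human noticing the deal in a chart on Monday. The pub/sub version is the supply chain agent reacting in the same second, which is the whole difference the post is about.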
Why now
A year ago, having an LLM monitor your entire support queue around the clock would've been stupidly expensive. Now it costs almost nothing. But cheap inference alone wouldn't matter if the models couldn't reason about messy real-world data or reliably use tools. Both of those crossed the "good enough" line in roughly the same window. The timing just worked out.
The scary part
How much autonomy do you actually give these things? It's easy to say "the agent can issue refunds under $50" until it issues 2,000 of them in one hour because it found a shipping problem you didn't know about. It was technically right. But that's the kind of thing that gets someone fired when a human does it without asking.
You can't just wrap everything in approval flows though. That just rebuilds the bottleneck. You need actual guardrails: spending caps, rate limits, escalation rules, clear lines between "handle it" and "flag it." The teams that figure this out early are going to move at a different speed than everyone else. Like, noticeably.
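Here's roughly what I mean by actual guardrails, as a toy sketch (the class and the specific limits are made up, not from any real product): hard caps that decide whether an action runs or gets escalated, so the 2,000-refund hour turns into an escalation after refund number twenty.

```python
import time

# Illustrative guardrail layer -- names and numbers are mine.
class Guardrails:
    def __init__(self, per_action_cap, hourly_budget, max_per_hour):
        self.per_action_cap = per_action_cap  # e.g. refunds under $50
        self.hourly_budget = hourly_budget    # total spend per hour
        self.max_per_hour = max_per_hour      # rate limit on actions
        self.window = []                      # (timestamp, amount) of recent actions

    def check(self, amount, now=None):
        now = time.time() if now is None else now
        # Drop actions older than an hour from the rolling window.
        self.window = [(t, a) for t, a in self.window if now - t < 3600]
        if amount > self.per_action_cap:
            return "escalate"  # too big to handle alone: flag it
        spent = sum(a for _, a in self.window)
        if spent + amount > self.hourly_budget or len(self.window) >= self.max_per_hour:
            return "escalate"  # budget or rate limit hit: a human looks
        self.window.append((now, amount))
        return "allow"         # inside every bound: handle it

g = Guardrails(per_action_cap=50, hourly_budget=500, max_per_hour=20)
print(g.check(30))   # -> allow (a normal small refund)
print(g.check(200))  # -> escalate (over the per-action cap)
```

The point isn't the specific numbers. It's that "handle it" vs "flag it" is a mechanical check the agent can't talk its way around, instead of an approval queue a human has to babysit.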
Where I think this goes
Give it a couple years and I think the best-run companies won't be the ones with the best analytics teams or the fanciest dashboards. They'll be the ones whose agents have the richest world model and the right amount of autonomy. The CEO job starts looking less like making decisions and more like tuning the system that makes them.
We went from "ask AI a question" to "give AI a task" to "AI just runs parts of your business." Most orgs aren't ready for that last one. But some are, and I'm really curious what happens when everyone else catches on.