Small Input, Massive Output
Part 2 of the High Impact, Low Understanding series.
The prompt-to-production pipeline
Here's the workflow: you type a sentence. Maybe two. AI generates an entire application, a workflow, a report, an automation that touches real business data. You skim the output, or you don't, and put it to work. A one-line prompt just became something your team depends on.
The ratio is the problem. A sentence of effort in, a system of consequences out. Somewhere in that gap lives every bug and every design decision you didn't make but are now responsible for. This isn't just a developer problem. In most traditional orgs, the person typing that prompt isn't a developer at all. They're a department head, a project manager, a business analyst who saw a demo and thought "I could build that."
I wrote about the personal side of this a couple weeks ago. The instant gratification of completing a time-consuming task quickly can quietly chip away at independent thought. But that was about my own discipline. This is about what happens at scale, when entire organizations operate in that gap, and nobody has written a policy for it yet.
You think you're faster. You're not.
In mid-2025, METR ran a controlled experiment that should make anyone pause. Sixteen experienced software developers, people with an average of five years on their respective projects, completed 246 tasks. Half the time they could use AI tools. Half the time they couldn't.
AI tools made them 19% slower.
But here's what got me. Even after experiencing that slowdown firsthand, the developers still believed AI had made them 20% faster. Not a small miscalibration. A complete inversion. They felt faster while measurably losing time to reviewing, debugging, and fixing AI output.
These were experts. People who do this for a living. Now imagine that same confidence gap in a stakeholder who has never written code, has no framework for evaluating what AI produced, and has enough organizational authority that nobody is going to second-guess them.
Confidence scales inversely with competence
Stanford researchers found that developers using AI code assistants wrote less secure code while reporting higher confidence that their code was secure. Read that again. The tool made the output worse and made them more sure of it. And these are people trained to evaluate code.
Now extend that outside engineering entirely. A VP building an automated reporting pipeline, a sales lead spinning up a customer-facing chatbot, an operations manager wiring together a data workflow. These people aren't even in the conversation about code review. They don't know it's a conversation that should be happening.
There's a name for this: automation complacency. People stop checking automated systems because the systems are usually right. When users aren't familiar with a task, they lean on AI and trust its outputs more than they should because they don't know enough to question it. In a traditional business environment with no AI policy and no technical oversight, that complacency goes completely unchecked.
Nobody's reading the output. And in most orgs, nobody's even asking if someone should be.
AI slop's defining formula
There's a term for the broader pattern: AI slop. It started as a way to describe the flood of low-quality AI-generated content online, but the mechanics apply just as well to code, workflows, and reports. AI slop has three defining properties: superficial competence, asymmetric effort, and mass producibility.
Superficial competence: it looks right. The app loads, the report generates, the workflow runs. Asymmetric effort: the person who generated it spent seconds, but evaluating it properly would take hours. Mass producibility: one person can generate more in a day than a team could review in a week.
That asymmetry drives everything. When the cost of producing something approaches zero but the cost of evaluating it stays high, the default behavior is to skip the evaluation. Gartner is already predicting atrophy of critical-thinking skills from GenAI use through 2026. We're not just failing to check the output. We're losing the muscle for it.
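The asymmetry is easy to see with a back-of-envelope calculation. The sketch below uses made-up numbers, five minutes to generate an artifact, three hours to review one properly, a five-person review team, purely to illustrate the ratio; none of these figures come from the studies cited here.

```python
# Toy model of the effort asymmetry. Every number here is a
# hypothetical assumption for illustration, not measured data.

GENERATE_MINUTES = 5      # assumed: prompt, skim, deploy one artifact
REVIEW_MINUTES = 180      # assumed: evaluate one artifact properly
WORKDAY_MINUTES = 8 * 60
REVIEWERS = 5             # assumed team size
DAYS_PER_WEEK = 5

# One person generating for a single day...
generated_per_day = WORKDAY_MINUTES // GENERATE_MINUTES

# ...versus a whole team reviewing for a full week.
reviews_per_reviewer_week = (DAYS_PER_WEEK * WORKDAY_MINUTES) // REVIEW_MINUTES
reviewed_per_week = REVIEWERS * reviews_per_reviewer_week

print(f"Generated by one person in one day: {generated_per_day}")
print(f"Reviewed by the team in one week:   {reviewed_per_week}")
```

Under these toy assumptions, a single day of generation outruns a full week of team review, and the backlog only grows from there.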
The deadly formula
Small input. Massive output. Zero comprehension.
The person generating the solution doesn't have the expertise to evaluate it. The people who could evaluate it either don't know it exists (because it's a shadow project) or don't have the time, because the volume of AI-generated work has already overwhelmed existing review processes. So you end up with a growing stack of business-critical systems that work well enough to depend on but that nobody actually understands.
The scariest version of this isn't in a tech company. It's in a mid-size insurance firm, or a hospital system, or a manufacturing company where the person building the tool is also the person who decides whether it's ready. No code review. No security audit. No policy that says they need one. Just a senior stakeholder who typed a prompt, got a result, and rolled it out to the team.
This isn't about AI being bad at writing code. Sometimes it's great. It's about the feedback loop. The easier it is to produce output, the less we look at it. The less we look at it, the more we trust it. That cycle doesn't correct itself.
Instant feedback is still not the same as instant gratification. We're just getting worse at telling the difference.