The paradox of code

Since ChatGPT was released in late 2022, learning to code has become dramatically easier. I’ve always been interested in learning, but LLMs opened the door for me. I taught myself to code with the help of ChatGPT, Claude, and Gemini. It feels like a genuine revolution in accessibility.

But three years in, I keep wondering: Where is all the new software? And where are all the new features in existing software?

If coding has become possible, let alone easier, for people like me, why haven’t we seen an explosion of new applications? Where are the new features and innovations?

Where are all the other people like me, shipping products they’ve built with AI assistance?

Let’s call it the Paradox of Code.

My coding journey

I couldn’t really code at all three years ago. ChatGPT changed that.

I could ask the same set of basic, embarrassing questions, over and over again, without judgment. I could get the LLM to generate boilerplate code and I’d figure out how to connect it together.

The early ChatGPT context window was tiny – around 200 lines of code before it started to struggle – forcing me to learn about modularity and architectural thinking, just to work around its limitations.

The tools got better. Claude brought massive context windows. Cursor automated the process of updating code rather than having to copy and paste massive blocks of code (and checking whether things were omitted or quietly updated). Claude Code accelerated everything further. Each iteration made me more capable.

I built things I actually use. A clipboard redaction tool. A task manager customised to how I think. A podcast generator that creates audio from URLs or prompts and sends them to a personal podcast feed. An app that monitors my son’s YouTube history and links his viewing habits and interests with the NZ Curriculum. Tools for iterating on personal goals.

I also built two finance applications: Snoopy (a privacy-focused personal finance app that runs entirely on your computer) and Flawcastr (a free browser-based scenario planning tool). These came from my background running Fairhaven Wealth: I knew the problems intimately because I’d lived them as both adviser and user.

These aren’t prototypes. They’re tools I personally use and that I’m now comfortable releasing to others. Getting to this point took me far longer than I expected. My first attempt at Flawcastr was in 2022, and I’m on the fifth or sixth complete rewrite of the app. But I got here. From zero coding knowledge to shipping applications.

If I can do this, why aren’t tens of thousands of others doing the same?

The obvious answers (and why they don’t satisfy me)

My first instinct is to credit myself: maybe I’m more capable than I think, maybe I dedicated more time and resources than most people would (or could!), and maybe I have advantages I’m not seeing.

That’s probably partly true. But it doesn’t explain the scale of the discrepancy.

If AI had genuinely democratised coding, we’d expect to see at least some visible increase in new software. I don’t really see it.

The second explanation: AI is great for prototypes but terrible for production.

This is definitely part of the answer. I experienced it directly: getting a working demo took hours, if not minutes. But getting something that worked reliably, let alone something I could ship to strangers, seemed to take forever.

In some ways, this was dangerous: if it had been harder for me to generate a prototype/mock-up, I probably wouldn’t have tried. But instead it was like giving a dog a bone, and I just couldn’t put it down.

That final stretch of getting the apps ready for production involved handling edge cases, cross-platform compatibility, packaging for distribution, and error handling for unexpected user behaviour. Lots of unglamorous work that takes ages while feeling like you’re making little or no progress.

I started digging into what developers and researchers are finding, and the picture gets more interesting – and more complicated.

A productivity illusion?

Some recent studies suggest that AI-assisted coding might not actually make developers faster!

A 2025 study by METR (with a very small sample size: 16 developers!) had experienced open-source developers complete tasks with and without AI assistance. When AI tools were allowed, they took about 19% longer on average. But they believed they had been faster! Beforehand they predicted that AI would speed them up by roughly a quarter, and even after finishing the tasks they still estimated it had sped them up by about 20%.

The researchers called it a “productivity placebo.” AI feels fast. You get immediate feedback. Code appears on your screen instantly. It’s psychologically satisfying in a way that writing code manually isn’t.

But that dopamine hit doesn’t necessarily translate to finishing projects faster.

Stack Overflow also conducted a developer survey and found similar patterns. The majority of experienced developers reported minimal or no improvement. Common frustrations were that code was almost correct but had subtle errors, and that a lot of time was lost to debugging that wasn’t previously necessary.

This matches my experience. I felt incredibly productive during those early dopamine-hit phases. But when I step back and look at my calendar, these projects took much longer than I could have imagined. The feeling of speed didn’t correlate with actual delivery speed.

I’m not entirely convinced by these studies. They’re early, methodologies vary, and new models and frameworks for using AI continue to improve. “Productivity” in software is hard to measure. But they raise an uncomfortable question: have I been experiencing the same illusion? Did AI actually make me faster, or did it just feel that way? Could I have learnt the traditional, pre-AI way, and if I had, would I actually have been faster with these projects? I can’t run the counterfactual and disentangle the variables.

Is coding speed the actual bottleneck?

Maybe typing code was only ever one limiting factor in software development, among many others: design discussions, code reviews, fixing test failures in deployment, context switching between tasks.

If that’s true, then making the “typing code” part 10× faster might only improve overall delivery time by a small percentage. It’s like getting a faster saw when most of your time is spent measuring, planning, and finishing work.
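One way to put rough numbers on this is Amdahl's law, which gives the overall speedup when only one part of a job gets faster. The sketch below is in Python. The 20% share of time spent typing code is purely an assumption for illustration, not a measured figure, and the function name is made up for this example.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl's law: overall speedup when only one part of the work gets faster."""
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left over, as a fraction of the original total:
    # the untouched work plus the (now shorter) coding work.
    remaining_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / remaining_time


# Assumption for illustration only: typing code is 20% of total delivery time.
speedup = overall_speedup(coding_share=0.20, coding_speedup=10.0)
time_saved = 1.0 - 1.0 / speedup
print(f"Overall speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
```

Under that assumed split, making the typing 10× faster shortens the whole project by roughly 18%. Even infinitely fast typing could never save more than the 20% it occupied in the first place.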

This resonates with my experience. I wasn’t slowed down by typing. I was slowed down by architectural decisions, discovering my initial approach was wrong, rebuilding systems after I understood the problem better, and handling all the integration work that AI couldn’t help with. I would get a working prototype then hit an issue that resulted in a plateau, until I got my head around that issue.

An example relates to making an app available for other people to use. Initially, I created a Python-based app which I packaged as an .exe file. It triggered all sorts of alarms on friends’ computers when they tried to use it, and it looked terrible. When I realised I might have to use functionality available via the browser, I had to get my head around HTML and CSS – and how to deploy the apps so that other people could access them in their browser. In some cases this also meant I had to change the way things operated in the back-end. The original version of Flawcastr was exclusively Python-based. The current browser-based version doesn’t use Python at all.

I rebuilt Snoopy’s categorisation system three times. Not because coding was hard, but because I didn’t understand the problem well enough the first two times. AI helped me implement each version quickly, but it couldn’t tell me version two was over-complicated or that version three would strike the right balance. I had to discover that through use.

Coding speed was never my bottleneck. Understanding was.

The quality question

What if AI-assisted code is faster to write but slower to finish because it requires more debugging and refinement?

Industry research suggests this might be happening. For example, AI-generated code tends to contain significantly more security vulnerabilities than human-written code.

AI-generated code often looks right but has subtle issues. It might use patterns that work but don’t fit the specific context. It might handle the happy path but miss edge cases.
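Here is a small, made-up illustration of the happy-path problem, sketched in Python. Both functions and the input formats are hypothetical. They are not taken from any real app or from actual AI output. The first version looks right and works on clean input. The second handles a few of the things a real user might paste in.

```python
from decimal import Decimal, InvalidOperation


def parse_amount_happy_path(text: str) -> Decimal:
    # Works for "12.50", but raises on "$1,234.50", "" or "(45.00)".
    return Decimal(text)


def parse_amount(text: str) -> Decimal:
    """Parse a currency amount, handling a few common edge cases."""
    # Drop surrounding whitespace, dollar signs and thousands separators.
    cleaned = text.strip().replace("$", "").replace(",", "")
    if not cleaned:
        raise ValueError("amount is empty")
    # Accounting notation: "(45.00)" means -45.00.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a valid amount: {text!r}") from None
    return -value if negative else value
```

Even the second version only covers the cases someone thought to list. Other currencies, other decimal separators, and stray text are all still waiting to be discovered through use.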

I spent months debugging and refining. A lot of that was learning, and accepting that I was initially doing things sub-optimally because I realised I had to limit my focus in order to avoid overwhelm. But some of it was discovering that the AI’s confident-sounding suggestions didn’t actually work in my specific situation.

I’d need to understand why something failed, which meant developing mental models that the AI couldn’t provide. In fact, I needed to develop mental models of where AI would often fail and how to manage that.

I’ve read AI-developed code described as “instant tech-debt”, and that feels correct. It needs to be managed on two fronts: avoiding or minimising it in the first place, and cleaning it up after the fact.

In short: code might be produced faster, but quality assurance is taking longer.

What if LLMs can’t create anything genuinely new?

Here’s a perspective I find both compelling and troubling: LLMs are exceptional at reproducing patterns they’ve seen, but terrible at innovation.

An LLM trained on GitHub can generate another CRUD app, another todo list, another clone of an existing game. This is because it has seen thousands of examples and can replicate these patterns. But asking it to design something genuinely novel, with unique architecture or original approaches? This is a harder ask.

One developer put it this way: “If your system has zero uniqueness, AI can nail it…. But why build something that’s already everywhere? For products that iterate, compete, and stand out, keep humans in the loop. We provide the business smarts, creative flair, and high-entropy info that AI can’t fake.”

This would explain why I managed to build Snoopy and Flawcastr but needed to provide the “vision”. The AIs helped implement a local-first finance app with specific privacy architecture, but I had to specify that architecture. It couldn’t have invented it. Similarly with Flawcastr’s approach to scenario planning: that came from my years of doing financial planning with clients, not from anything an LLM could generate.

Beyond a certain level of complexity, every software project becomes somewhat unique. I’ve found that as a project becomes more complex, I’ve had to hold the reins more tightly in terms of how LLMs work on the codebase.

If this is right, AI might flood the world with more implementations of existing patterns, but not generate new categories of software. The paradox isn’t about the volume of software. It’s about a lack of innovation: new ways of combining and doing things. New ideas are limited by many of the same factors they always were.

It’s early, and maybe that’s all this is

There’s a simpler explanation: it’s only been three years. That’s not much time.

There’s a saying that we often overestimate what we can do in a given day, but we underestimate what we can do in a year or a decade. Similarly, exponential growth looks slow before it starts to look very fast. And perhaps I’m falling into this trap: I’ve overestimated where we “should” be and am underestimating where we will get to.

Even if AI made certain developers significantly more productive, how long would it take for that to translate into visibly different software in the market? Product development involves market research, user testing, iteration based on feedback, building trust and distribution. A solo developer might ship faster, but a two-year project becoming a one-year project doesn’t necessarily change what software exists in 2025.

There’s also uneven adoption. It’s conceivable that AI coding tools help less experienced developers more than senior ones (who can often write clean code from scratch faster than they can fix AI output). So the biggest impact might be enabling newer developers to reach “normal” productivity, not dramatically accelerating experienced teams.

That’s valuable for democratisation, but doesn’t necessarily yield better software faster.

Large organisations move slowly. They’re cautious about quality, security, compliance. They’re not going to overhaul development processes based on tools that are three years old. Industry change is evolutionary, not revolutionary.

Maybe the paradox resolves itself in a few years as practices mature and as more people like me work through the learning curve. Maybe we’re in the early adopter phase and mainstream impact comes later.

Or maybe not.

What I think (weak opinions, weakly held)

I’m writing this at an odd moment. I’ve built applications I use daily. I’m about to find out if anyone else cares enough to use some of the apps I’ve built.

I don’t know if I’m an early signal of a coming wave, or an outlier. My guess is that it’s a bit of both.

After three years of living inside this paradox, my suspicion is that we will eventually see more software and better features as a result of AI. It will just take longer than I initially expected.

I believe we will see new products come out and compete with existing ones, and that this extra competition will result in (a) more innovation and (b) better and cheaper options for consumers.

Even if the only benefit of AI was that it made prototyping dramatically faster, this would be significant. It would mean people who are considering a particular project can get a better sense of whether an idea is worth pursuing. All else being equal, people will focus on better projects, which is a huge win.

To the extent there are other bottlenecks, I think AI will erode many of these as well. I’ve found that a lot of the initial issues I had with upfront technical debt have reduced significantly. I’ve become much better at guiding LLMs so they don’t create as much tech debt, and newer models such as Claude Opus 4.5 and Gemini 3.0 Pro don’t create as much. I’ve also become better at debugging LLM-generated tech debt. This is still a sticking point, but not nearly to the same extent as before. Other bottlenecks exist, but many of them will fall away as well.

Personally, as I’ve worked on each project, I’ve also developed something of a “tech stack” that I can use across projects, meaning that the time from idea to something I can share with others has reduced dramatically. Flawcastr took a long time and many different iterations. Snoopy took me less time and I didn’t need to start from scratch at all. Some of the other projects I’ve worked on I’ve been able to get running in days or hours.

Frankly, I can imagine a time when I could be releasing a new project every week. Instead of publishing a new article on the NZ Wealth & Risk blog each week, I could be publishing a new software project that other people can use.

It’s an empirical question, really, and there’s evidence that AI is having an impact

From an empirical perspective: I am evidence that AI will impact software development.

Three years ago I had an interest in coding but I couldn’t do it. Now, I can code. And not only that: I’m starting a software business, making software tools that wouldn’t otherwise have existed.

AI removed the barrier of “I don’t know which language to use” and “I don’t know syntax” and “I can’t remember how to do X.” It let me focus on problems instead of implementation details. It compressed the learning curve from years to months. It made coding accessible in a way it never was before.

That’s pretty clear evidence in support of AI changing things.

Not everyone is going to pick it up

You could say a similar set of things about blogging or starting a podcast. With those projects, the barriers to entry have fallen over time. Lots of people consider it, a small number of people give it a go, and very, very few people actually do it consistently. I’ve never felt that there were too many good blogs or podcasts. And perhaps that will be the same with software.

Perhaps another bottleneck is how many people have the willingness and ability to work on new projects. Although I would have liked to have dived even deeper into coding over the past few years, I guess I was still uniquely positioned to take advantage of these new tools and ways of doing things. My circumstances were such that I had more time to spend on learning. It also turns out that the line-by-line aspect of coding is what I find hardest, and the bigger picture stuff, like structure and architecture, comes a lot more naturally to me.

I also have the unique advantage of having a variety of personal and professional experiences and interests which have informed the problems I want to work on, and the solutions to these problems. I didn’t need to search for problems. I had a backlog, which has only increased over time.

How many people are in that position? Most domain experts don’t code. Most people learning to code lack deep domain expertise outside of software itself. People with the propensity to learn and build are probably busy working on other things. AI lowered the barrier to coding, but it still requires people to learn how to use the tools.

I don’t know how many people are in that position, but I suspect that number is the real bottleneck.

One thing I do know: if you’re thinking about building something, don’t let “I can’t code” stop you. That barrier is lower than it’s ever been. The question is whether you have something worth building and whether you’re willing to learn, and do the work that AI can’t do for you.

I suspect that describes fewer people than the hype, and my initial intuition, might suggest. But I hope it describes enough for us to see cool new products and features, sooner rather than later.