But I’m not even going to argue about that. I want to raise something no one else seems to mention about AI in coding work. I do a lot of work now with AI that I used to code by hand, and if you told me I was 20% slower on average, I would say “that’s totally fine it’s still worth it” because the EFFORT level from my end feels so much less.
It’s like, a robot vacuum might take way longer to clean the house than if I did it by hand, sure. But I don’t regret the purchase, because I have to do so much less _work_.
Coding work that I used to procrastinate about because it was tedious or painful I just breeze through now. I’m so much less burnt out week to week.
I couldn’t care less if I’m slower at a specific task, my LIFE is way better now I have AI to assist me with my coding work, and that’s super valuable no matter what the study says.
(Though I will say, I believe I have extremely good evidence that in my case I’m also more productive, averages are averages and I suspect many people are bad at using AI, but that’s an argument for another time).
> The problem is, there are very few if any other studies.
Not at all, the METR study just got a ton of attention. There are tons out there at much larger scales, almost all of them showing significant productivity boosts for various measures of "productivity".
If you stick to the standard of "randomized controlled trials on real-world tasks", here are a few:
https://www.youtube.com/watch?v=tbDDYKRFjhk (from Stanford, not an RCT, but the largest scale with actual commits from 100K developers across 600+ companies, and tries to account for reworking AI output. Same guys behind the "ghost engineers" story.)
If you look beyond real-world tasks and consider things like standardized tasks, there are a few more:
They all find productivity boosts in the 15-30% range -- with a ton of nuance, of course. If you look beyond these at things like open source commits, code reviews, developer surveys etc. you'll find even more evidence of positive impacts from AI.
> https://www.youtube.com/watch?v=tbDDYKRFjhk (from Stanford, not an RCT, but the largest scale with actual commits from 100K developers across 600+ companies, and tries to account for reworking AI output. Same guys behind the "ghost engineers" story.)
I like this one a lot, though I only skimmed through it. At 11:58 they talk about something many will find correlates with their personal experience: easy vs. complex tasks in greenfield vs. brownfield projects.
> They all find productivity boosts in the 15 - 30% range -- with a ton of nuance, of course.
Or 5-30%, with "AI is likely to reduce productivity in high complexity tasks" ;) But yeah, a ton of nuance is needed.
Yeah that's why I like that one too, they address a number of points that come up in AI-related discussions. E.g. they even find negative productivity (-5%) in legacy / non-popular languages, which aligns with what a lot of folks here report.
However even these levels are surprising to me. One of my common refrains is that harnessing AI effectively has a deceptively steep learning curve, and often individuals need to figure out for themselves what works best for them and their current project. Took me many months, personally.
Yet many of these studies show immediate boosts in productivity, hinting that even novice AI users are seeing significant improvements. Many of the engineers involved didn't even get additional training, so it's likely a lot of them simply used the autocompletion features and never even touched the powerful chat-based features. Furthermore, current workflows, codebases and tools are not suited for this new modality.
As things are figured out and adopted, I expect we'll see even more gains.
Most of those studies call this out and try to control for it (edit: "it" here being the usual limitations of LoC and PRs as measures of productivity) where possible. But to your point, no, there is still a strong net positive effect:
> https://www.youtube.com/watch?v=tbDDYKRFjhk (from Stanford, not an RCT, but the largest scale with actual commits from 100K developers across 600+ companies, and tries to account for reworking AI output. Same guys behind the "ghost engineers" story.)
Hmm, I'm not an economist, but I have seen other studies that look at things at the firm level, so it should definitely be possible. A quick search on Google and SSRN turned up some studies, but they seem to focus on productivity rather than revenues; not sure why. Such studies depend on the available data, however, so a lot of key information may be hidden, e.g. the revenues of privately held companies, which constitute a large part of the economy.
We know that relying heavily on Google Maps makes you less able to navigate without Google Maps. I don't think there's research on this yet, but I would be stunned if the same process isn't at play here.
Whatever your mind believes it doesn’t need to hold on to, and that is expensive to maintain and run, it’ll let go of. This isn’t entirely accurate from a neuroscience perspective, but it’s in the right ballpark.
Pretty much like muscles decay when we stop using them.
Sure, but sticking with that analogy, bicycles haven’t caused the muscles of people that used to go for walks and runs to atrophy either – they now just go much longer distances in the same time, with less joint damage and more change in scenery :)
>> Whatever your mind believes it doesn’t need to hold on to, and that is expensive to maintain and run, it’ll let go of. This isn’t entirely accurate from a neuroscience perspective, but it’s in the right ballpark.
>> Pretty much like muscles decay when we stop using them.
> Sure, but sticking with that analogy, bicycles haven’t caused the muscles of people that used to go for walks and runs to atrophy either ...
This is an invalid continuation of the analogy, as bicycling involves the same muscles used for walking. A better analogy to describe the effect of no longer using learned skills could be:
Asking Amazon's Alexa to play videos of people
bicycling the Tour de France[0] and then walking
from the couch to your car every workday
does not equate to being able to participate in
the Tour de France[0], even if years ago you
once did.
Oh, but they do atrophy, and in devious ways. Though the muscles under linear load may stay healthy, the ability of the body to handle the knee, ankle, and hip joints under dynamic and twisting motion does atrophy. Worse yet, one may think that they are healthy and strong, due to years of biking, and unintentionally injure themselves when doing more dynamic sports.
Take my personal experience for whatever it is worth, but my knees do not lie.
Sure, only cycling sounds bad, as does only jogging. And thousands of people hike the AT or the Way of St. James every year, despite the existence of bicycles and even cars. You've got to mix it up!
I believe the same holds true for cognitive tasks. If you enjoy going through weird build file errors, or it feels like it helps you understand the build system better, by all means, go ahead!
I just don't like the idea of somehow branding it as a moral failing to outsource these things to an LLM.
Yeah, but what's going to happen with LLMs is that the majority will just outsource thinking to the LLM. If something has a high visible reward with hidden, dangerous risks, people will just go for the reward.
To extend the analogy further, people who replace all their walking and other impact exercises with cycling tend to end up with low bone density and then have a much higher risk of broken legs when they get older.
Well, you still walk in most indoor places, even if you are on the bike as much as humanly possible.
But if you were literally chained to a bike and could not move in any other way, then surely you would "forget"/atrophy in specific ways, such that you wouldn't be able to walk without relearning/practicing.
> Whatever your mind believes it doesn’t need to hold on to, and that is expensive to maintain and run, it’ll let go of. This isn’t entirely accurate from a neuroscience perspective, but it’s in the right ballpark.
A similar phenomenon occurs when people see or hear information, depending on whether they record it in writing or not. The act of writing down the percepts, in and of itself, assists the transfer from short-term to long-term memory.
I know that I am better at navigating with Google Maps than the average person, because I navigated for years without it (partly on purpose).
I know when not to trust it. I know when to ignore recommendations on recalculated routes.
Same with LLMs. I am better with them, because I know how to solve things without their help. I understand the problem space and the limitations. Also, I understand how hype works and why they think they need it (investors' money).
In other words, no, just using Google Maps or ChatGPT does not make me dumb. Only using it and blindly trusting it would.
Yeah, this definitely matches my experience, and guess what? Google Maps sucks for public transit and isn't actually that good for pedestrian directions (often pointing people to "technically" accessible paths like sketchy sidewalks on busy arterial roads signed for 35mph where people go 50mph). I stopped reflexively using Google Maps and now only use it for public transit or drives outside my city. Doing so has made me a more attentive driver, less lazy, less stressed when unexpected issues occur on the road, restored my navigation skills, and made me a little less of, frankly, an adult man child.
It gets worse for projects outsourced to one or more consultancy firms, where staff costs are prohibitively high; now you’ve got another layer of complexity to factor in (risks, costs).
Consultancy A submits work, Consultancy B reviews/tests it. As A increases its use of AI, B will have to match it with more staff or more AI. More staff for B means higher costs and a slower pace. More AI for B means a higher burden of proof, and an A vs. B race condition is likely.
Ultimately, clients will suffer from AI fatigue and inadvertently incur more costs at a later stage (post-delivery).
My own code quality is better with AI, because it makes it feasible to indulge my perfectionism to a much greater degree. Before AI, I usually needed to stop sooner than I would have liked to and call it good enough. Now I can justify making everything much more robust because it doesn’t take a lot longer.
It’s the same story with UI/UX. Previously, I’d often have to skip little UI niceties because they take time and aren’t that important. Now even relatively minor user flows can be very well polished because there isn’t much cost to doing so.
Well, your perfectionism needs to be pointed towards this line. If you get truly large numbers of users, this will either slow down token checking directly or make your process for removing ancient expired tokens (I'm assuming there is such a process...) much slower and more problematic.
It's just funny because there are definitely examples of bad code in that repo (as there are in any real project), but you picked something totally routine. And your critique is wrong fwiw—it would easily scale to millions of users. Perhaps you could find something better if you used AI to help you...
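For what it's worth, here's why that pattern scales: as long as the token column and the expiry timestamp are both indexed, validation is a point lookup and cleanup is a range delete. A minimal sketch in Python (hypothetical schema and names, not the actual code from that repo):

    # Hypothetical token store: both hot paths hit B-tree indexes,
    # so neither degrades meaningfully at millions of rows.
    import sqlite3
    import time

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE tokens (
            token      TEXT PRIMARY KEY,   -- point lookups via the PK index
            user_id    INTEGER NOT NULL,
            expires_at INTEGER NOT NULL    -- unix timestamp
        )
    """)
    # An index on expiry turns bulk cleanup into a range scan,
    # not a full table scan.
    conn.execute("CREATE INDEX idx_tokens_expires ON tokens (expires_at)")

    def check_token(tok: str) -> bool:
        # Validate a token: indexed lookup plus an expiry comparison.
        row = conn.execute(
            "SELECT expires_at FROM tokens WHERE token = ?", (tok,)
        ).fetchone()
        return row is not None and row[0] > time.time()

    def purge_expired() -> int:
        # Periodic cleanup: indexed range delete of already-expired rows.
        cur = conn.execute(
            "DELETE FROM tokens WHERE expires_at < ?", (time.time(),)
        )
        conn.commit()
        return cur.rowcount

Run purge_expired() from a cron job or background task and the table never accumulates ancient tokens; the check itself never has to care.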
I’d love not to have to be great at programming, just as I enjoy not being great at cleaning the sewers. But I get what you mean: we do lose some potentially valuable skills if we outsource them too often for too long.
> In your particular case it sounds like you’re rapidly losing your developer skills, and enjoy that now you have to put in less effort and think less.
Just the other day I was complaining that no one knows how to use a slide rule anymore...
Also, C++ compilers are producing machine code that's hot garbage. It's like no one understands assembly anymore...
Even simple tools are often misused (like hammering a screw). Sometimes they are extremely useful in the right hands, though. I think we'll discover that the actual writing of code isn't as meaningful as thinking about code.
Hahaha well said, thank you. I feel like I’m taking crazy pills reading some of the comments around here. Serious "old man shakes fist at cloud" moments.
I'm losing my developer skills like I lost my writing skills when I got a keyboard. Yes, I can no longer write with a pen, but that doesn't mean I can't write.
Also, I don’t know about you, but despite the fact that I basically never write with a pen, the occasional time I have to, I’m a little slow, sure, but it’s not like I physically can’t do it. It’s no big deal.
Imagine telling someone with a typewriter that they’d be unable to write if they don’t write by hand all the time lol. I write by hand maybe a few times a year - usually writing a birthday card or something - but I haven’t forgotten.
Another way of viewing it would be that LLMs allow software developers to focus their development skills where it actually matters (correctness, architecture etc.), rather than wasting hours catering to the framework or library of the day’s configuration idiosyncrasies.
That stuff kills my motivation to solve actual problems like nothing else. Being able to send off an agent to e.g. fix some build script bug so that I can get to the actual problem is amazing even with only a 50% success rate.
You still have to review and understand changes that your “AI agent” did. If you don’t review and fully understand everything it does, then I fear for your project.
> But I’m not even going to argue about that. I want to raise something no one else seems to mention about AI in coding work. I do a lot of work now with AI that I used to code by hand, and if you told me I was 20% slower on average, I would say “that’s totally fine it’s still worth it” because the EFFORT level from my end feels so much less.
I completely get this and I often have an LLM do boring stupid crap that I just don't wanna do. I frequently find myself thinking "wow I could've done it by hand faster." But I would've burned some energy that could be better put towards other stuff.
I don't know if that's a net positive, though.
On one hand, my being lazy may be less of a hindrance compared to someone willing to grind more boring crap for longer.
On the other hand, will it lessen my edge in more complicated or intricate stuff that keeps the boring-crap-grinders from being able to take my job?
Exactly, but I don’t think you lose much edge, or anything that can’t be picked up again quickly if it’s truly boring easy stuff. I think it’s a net positive, because I can guarantee you there are afternoons where, if I couldn’t have done the boring thing with AI, I just wouldn’t have done it that afternoon at all haha.