Yup, everyone (including me) strongly wants to deny the usefulness of AI, but the fact is that AI is already quite useful, and it's only becoming more useful over time. There are a zillion moral problems with AI, but the usefulness of its output is obvious.
E.g. for years I'd been considering paying someone to make a small app for me that does one specific thing, but recently I asked an AI to do it and boom - it created an app that did exactly what I wanted. It even suggested some good features, which I said yes to, and it made the app even better. And when I think of a new feature, I just say "add this new feature", and it does. Occasionally the generated app doesn't work, and I just say "now the audio doesn't work, fix it" and it does. So far there was only one feature I asked it to do that it failed at.
Is your app as efficient as what an experienced developer would create? If you released the source code, would it have security vulnerabilities? These are just a couple of the more hidden issues that fly under the radar when shipping LLM-generated code.
Is your app as efficient as what an experienced developer would create?
One of the earliest uses we had for LLMs was literally just asking them to optimize several large codebases. Lots of pointless changes suggested; several huge performance wins we had overlooked.
And all done – implemented, tested, and human-reviewed – in about a person-week, compared to at least half a dozen person-months to go through all that by hand.
I mean, sometimes the LLMs generate slow algos. But less often than human coders.
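To make that concrete, here's a hypothetical illustration (not from the codebases mentioned above) of the kind of accidentally-quadratic code such a pass tends to flag, next to the obvious fix:

```python
# Hypothetical example: the classic accidentally-quadratic pattern
# a performance review pass (LLM or human) tends to flag.

def find_duplicates_slow(items: list[str]) -> list[str]:
    """O(n^2): re-scans the earlier part of the list for every element."""
    dupes = []
    for i, item in enumerate(items):
        if item in items[:i] and item not in dupes:
            dupes.append(item)
    return dupes

def find_duplicates_fast(items: list[str]) -> list[str]:
    """O(n): tracks what we've already seen in a set."""
    seen: set[str] = set()
    dupes: set[str] = set()
    for item in items:
        if item in seen:
            dupes.add(item)
        seen.add(item)
    return sorted(dupes)

assert find_duplicates_fast(["a", "b", "a", "c", "b"]) == ["a", "b"]
```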
If you released the source code, would it have security vulnerabilities?
You’re not gonna believe this, but another of the first things we did was ask the LLMs to review the codebase for security issues (and to review any new PRs).
OFC the code also gets reviewed for security vulns like it always has: by old-school automation (e.g. valgrind, fortify, etc.), human review, and red-teaming exercises. I don’t think I’ve seen enough data yet to say whether LLM-generated code has more or worse security issues than human-written code (which, need I remind you, is often highly insecure).
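For a feel of what that kind of review catches, here's a hypothetical example (not from our codebase) of a classic injection bug and its standard fix:

```python
import sqlite3

def get_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: the input is spliced into the SQL text, so a value
    # like "x' OR '1'='1" rewrites the query itself.
    cur = conn.execute(f"SELECT id, name FROM users WHERE name = '{username}'")
    return cur.fetchone()

def get_user_safe(conn: sqlite3.Connection, username: str):
    # Fixed: a parameterized query keeps the query shape constant;
    # the driver handles the value safely.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```

This is the easy class of bug that both LLM reviewers and old-school analyzers tend to catch; the open question in the comment above is about the subtler stuff.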
These are just a couple of the more hidden issues that fly under the radar when shipping LLM-generated code.
Ummm… those would be issues if you didn’t use good orchestration, didn’t have good tools and docs for the LLMs to use, and didn’t follow good software engineering practices to begin with…
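As a sketch of what "good orchestration" can look like in the small: nothing LLM-specific, just gating any patch (human- or machine-written) behind the same automated checks. The tool choices here (pytest, ruff, bandit) are illustrative assumptions, not a prescription:

```python
import subprocess

# Illustrative gate: a generated patch only lands if the same checks
# we'd apply to human-written code all pass. Tool selection is hypothetical.
CHECKS = [
    ["pytest", "-q"],                # run the existing test suite
    ["ruff", "check", "."],          # lint / style
    ["bandit", "-q", "-r", "src"],   # quick security scan
]

def gate_patch() -> bool:
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"rejected: {' '.join(cmd)} failed")
            return False
    return True

if __name__ == "__main__":
    print("patch accepted" if gate_patch() else "patch rejected")
```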
Quite possibly solving the majority of human diseases is rather more than “quite useful”.
2024 Nobel Prize lecture: https://www.youtube.com/watch?v=qX1aYUckvnY
2025 lecture: Deep Protein Space. If this doesn’t blow your fucking mind… you haven’t heard of DNA. https://www.youtube.com/watch?v=_enkgH6Vrxk