Lately I’ve been thinking a lot about the state of “AI” and its implications for us embodied “human intelligences”. Hardly a week (or even a day) goes by without some Silicon Valley titan proclaiming that “AI is smarter than humans” and arguing about whether this is good or bad for us as a species. “It’s a white-collar apocalypse”, “There will be all kinds of new jobs!”, “We are now confident we know how to build AGI as we have traditionally understood it.” What’s missing from these statements are clear definitions for terms like “smarter” and “intelligent”, and when definitions are provided, they conflict with what we already know. Consider Sam Altman’s definition of AGI: “AI systems that can perform most economically valuable work as well as or better than humans.” Or consider the well-respected Turing Test, which judges machine intelligence by whether a human can distinguish a machine’s behavior from a human’s on specific intellectual tasks. That reduces human intelligence to competence at tasks that can be completed at a keyboard. I find the narrow scope of such definitions unsatisfying.
I recently returned from a mission trip to Guatemala, where I worked side by side with local masons who mix concrete and plaster by hand and improvise solutions to deal with tricky build sites and keep homes dry in the rainy season. That was a humbling lesson in the limits of the kind of “intelligence” my PhD and digital skills afford me. Those guys are performing intelligent, economically valuable work. Then there are the nurses at the clinics I’ve visited recently, whose reading of a patient’s physical and emotional state layers cultural and social nuance on top of the complex medical condition of the human body in front of them. In fact, scientists, engineers, technicians, nurses, farmers, and floral designers solve problems all the time in environments full of uncertainty and human need; they are applying forms of intelligence and performing economically valuable work that no LLM can touch. These are embodied, culturally embedded, and morally aware practices—not lines of text on a screen.
“But wait,” you say, “LLMs like ChatGPT and Claude are amazing! Why are you being such a curmudgeon?” I agree. In fact, ChatGPT helped me draft this piece, and although I ended up throwing away most of what it wrote, its ability to research and summarize is excellent. It also pointed me to some resources faster than I would have found them on my own. So is ChatGPT “smarter” than me? I think the more interesting question is “When does ChatGPT, or any other LLM or AI, have an advantage over me?”
What started me down this path was a couple of articles I came across recently. Bruce Schneier and Will Anderson wrote at The Conversation about four axes, what they call “The 4 S’s”, of technology’s advantages over humans. The article is short and worth a read; in brief, they point out that AI often has an advantage over humans when it comes to speed, scale, scope, and sophistication. When those are the barriers, it can make sense to implement AI. When they’re not, introducing AI can feel gratuitous, or even downright annoying; witness auto-completion for text messages, or the many customer service chatbots. Schneier and Anderson point out that companies implement chatbots hoping to benefit from scale, but customers see no benefit in speed or sophistication, and they suffer the loss of human communication in terms of empathy, sincerity, context, and problem-solving ability. Still, there are many contexts where AI surpasses human performance, such as playing chess or Go, predicting protein structures, and identifying promising materials for engineering applications.
However, there are contexts where the perceived speed-up is actually illusory. In July 2025, the folks at Model Evaluation & Threat Research (METR) published a study of 16 experienced developers of large, mature open-source software projects, recording and analyzing their activity as they resolved issues from their projects’ issue trackers. The study randomized, issue by issue, whether the developer could use the AI tooling of their choice. The key finding: the developers generally reported believing that AI had sped them up by 20% or more, when in fact it took them on average 19% longer to resolve the issues. The authors point out that the benchmarks typically used to measure the productivity gains of AI coding tools don’t reflect the kinds of tasks found “in the wild” and thus aren’t helpful, and that even self-reporting by experienced developers is not a reliable guide to productivity impacts. Also of interest is this white paper from GitClear on the decline in code quality with the use of AI coding tools.
Developers generally reported believing that AI had sped them up by 20% or more, when in fact it took them on average 19% longer to resolve issues from large, mature, open-source projects.
Furthermore, there are limits to the level of sophistication even “reasoning” models can attain. In a refreshingly honest piece from Apple, published in June 2025, the authors examine the strengths and weaknesses of standard models (LLMs) and large reasoning models (LRMs) on tasks of varying complexity. They find a hard limit on the complexity of problems that LLMs and LRMs are capable of solving, even when given arbitrarily more computing power.
The real danger of technology is not that it will become too intelligent and take over, but that it will become too convenient and seduce us into delegating the most human parts of our lives.
Andy Crouch, The Life We’re Looking For
So what’s my point in all of this? It’s surely not to reject the amazing tools available to us in the era of LLMs. It’s to recognize them as tools with strengths and weaknesses. And it’s also to remember something that Andy Crouch, an author whose commentary on the relationship of humans to technology I respect, discusses in his book The Life We’re Looking For: superpowers often take something of our humanity when we assume them. When we step onto an airplane to assume the superpower of crossing a continent in a matter of hours, we have to remain very still and give up exercise and mobility for the duration of the trip. When we use our mobile phone to assume the superpower of navigating a city we’ve never been to before, we erode our human ability to find our way on our own (with consequences for cognitive decline, as it turns out; see this book and this article, among others, for nuance on the subject and what to do about it). And perhaps most relevant for this post, when you hand over the job of writing (code, or blog posts, or novels) to an LLM, you are eroding your ability to think about problems. As I’ve said before, learning to code is really learning to think about problems, and writing code is actively engaging with the problem in constructive ways.
This is why I founded Diller Digital, and why I still passionately believe in teaching coding skills. This principle guides the way we teach: starting with foundational principles and building up practical knowledge through examples and exercises that call for increasing independence. It is why, by the end of a class, we are teaching you how to find the answers to your own questions using the knowledge framework we’ve developed together. We value human intelligence—not because it’s flawless, but because it’s rooted in judgment, context, and a lived understanding of the world. We believe machine learning is most powerful when it extends what humans can already do well. We build our courses to empower you to apply these tools responsibly, creatively, and critically.
Mike agrees – https://mikelovesrobots.substack.com/p/wheres-the-shovelware-why-ai-coding