Why AGI is a Pipe Dream and what we should build instead

I have been thinking about where we are with AI in 2026, and I think most models have now reached their intelligence ceiling. They'll keep getting incrementally better over time, sure, but there's a big difference between going from 10% to 80% and going from 80% to 100%.
In most tech projects, the first 80% is relatively straightforward: CRUD operations, database schema design, and boilerplate code. The real challenge lies in the remaining 20%: the core business logic and the actual problem being solved. That part demands deep domain knowledge, experience, and sustained effort, things LLMs cannot simply replicate.
LLMs have gotten bigger, more resource-intensive, and are eating up GPUs like there's no tomorrow! They've swallowed up the entire internet, but they're still nowhere close to replacing human beings. AGI is not coming; it's a get-rich-quick pipe dream concocted by billionaire tech CEOs.
In reality, throwing more compute and data at models is not making them that much smarter. In fact, we are probably going to run out of good-quality public data, thanks to the fall of Stack Overflow and the smaller publishers going out of business.
In this article, let’s talk about a potential solution to many of these AI problems.
AGI is not coming
Even though I’m not a data science expert, I have worked with LLMs for the better part of 3-4 years now and have done tons of research into how they work and the field in general. This gives me some basis to comment on AGI and the future of AI, I think?
AGI is a pipe dream because LLMs are just prediction machines at the end of the day. Don’t get me wrong, the current iteration of this tech is really amazing and works really well in certain use cases, but it’s far from perfect and nowhere near what a human can do.
They have no worldview or real understanding of the data. They are trained on billions of examples, questions, answers, and fragments of information. They operate by breaking sentences into tokens, mathematically comparing inputs to learned patterns, and returning a human-like response through statistical pattern-matching.
At bottom, it’s a fancy algorithm, and that’s why AGI is impossible with the current architecture.
Privately, AI companies have accepted this fact. They say many things in the media, but in reality, models are becoming commodities, and AI companies are just building ecosystems around them, so that they can lock in those subscriptions and keep you in their environment.
Think about it: why build Claude Code, a tool for developers, if AGI is going to replace all those developers? Who, then, is going to pay the subscription fees?
AGI is not coming, and LLMs are becoming commodities; so, what’s next?
AGI is fueling public anxiety
People are scared of AI, let alone AGI. Every time you tune into the news, YouTube, TikTok, or some other media platform, you see big tech CEOs making outlandish claims about how close AI is to becoming AGI, how everyone should be using this tech, and how, if you don’t, you’re missing out and will likely be replaced by AI.
These claims alienate potential users rather than attracting them. Additionally, companies like Microsoft have been criticized for poor implementations of their AI offerings, basically forcing AI into everything, even in contexts where it adds no value and makes the user experience far worse.
This pushes the average Joe towards two extremes: either using the technology inappropriately, such as relying on it for critical medical advice, or dismissing it altogether as yet another unreliable app that only works some of the time.
The truth is that AI is not as important as big tech CEOs would like you to think. Now, don’t get me wrong, it’s a powerful tool and can be really useful when used correctly, but it’s not as impactful as something like mobile phones or even the web.
Mobile phones are part of everyday life; we use them to text, communicate, socialize, consume media, and also as a business tool. The use cases are endless and widespread, and while I was born before mobile phones existed, I still value this device enough to confidently say I just cannot function day-to-day without it!
I cannot say the same about AI. Sure, it makes my life a little easier, but its use cases are very niche, and most people just use it as a Google++ or a social media content generator: hardly life-changing!
I would consider it as a value-add rather than a full-blown platform on its own.
Tiny LLMs are the future
One of the biggest problems with LLMs is that they are GPU-hungry. Your typical LLM, like Claude Opus, has been trained on trillions of tokens containing data from the entire public web and then some.
Yet I’m just using Claude Code to generate some PHP or Python code; do I really need a model trained on trillions of tokens for that? The energy and infrastructure costs are enormous, and it’s hard not to question whether this arms race toward AGI is worth it.
Instead, we should pursue a different approach: small models that act as lean statistical engines, paired with a solid baseline of intelligence and strong tool-calling capabilities.
So basically, a kind of “Flash” or “Mini” model, similar to GPT-5 Mini or Gemini 2.5 Flash. The goal would be a model capable of basic probabilistic generation and lightweight reasoning, without the cost and complexity of a full frontier-scale model like Opus.
This would reduce the GPU resources we need, making these models more scalable and affordable for mass adoption.
Knowledge packs can solve most issues
This brings me to the central question: could tiny LLMs replace larger models like Sonnet or Opus for most use cases? The challenge is that smaller models lack the extensive knowledge corpus of their larger counterparts, constraining their performance across many tasks. They're better suited for narrow, specialized use cases.

This is where knowledge packs come in. A knowledge pack is like a black box of pretrained weights; think of it like a plugin. You don’t need to know what makes up the plugin, or how it even works; all you need is a consistent API that the LLM can talk to, to retrieve the knowledge it needs as the need arises.
For example, a knowledge pack for the PHP programming language might be 20-30GB in size, which can comfortably fit on a modern laptop or computer. When you’re working in Laravel, you would then point the LLM to that knowledge pack, and it would scope its knowledge to that pack.
Whenever it needs to generate code or answer a question, it then only needs to look up information in that pack and not get confused by information that may look similar but is not related.
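To make the plugin analogy concrete, here is a minimal sketch of what a knowledge pack's outward-facing API could look like. Every name here is hypothetical; the point is that the pack is opaque weights behind a small, stable interface the model can query.

```python
from dataclasses import dataclass

@dataclass
class KnowledgePack:
    """Hypothetical knowledge pack: opaque pretrained weights
    behind a small, consistent API the LLM can talk to."""
    name: str
    domains: list[str]  # topics this pack claims to cover
    size_gb: float      # e.g. a PHP pack might be 20-30GB

    def covers(self, topic: str) -> bool:
        # Routing hint: does this pack claim expertise on the topic?
        return any(topic.lower() in d.lower() for d in self.domains)

php_pack = KnowledgePack(
    name="php-laravel",
    domains=["PHP", "Laravel", "Composer"],
    size_gb=25.0,
)

print(php_pack.covers("laravel"))  # True
print(php_pack.covers("CSS"))      # False
```

The model never needs to know what is inside the pack, only whether the pack claims to cover the topic at hand, which keeps the contract between model and pack deliberately narrow.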
To give you a simple example: Let's take a random name: "Bob Barker." If I ask a large LLM, "Who is Bob Barker?", the model will generate a response based on patterns it learned during training from sources like LinkedIn, Wikipedia, Google searches, and countless other documents.
A frontier model might have 100B+ parameters because it was trained to handle everything from PHP programming to celebrity trivia to medical questions. Each time you prompt this model, it processes the input by running computations across all these parameters, requiring significant compute.
Now, imagine instead you have a much smaller base model, say 2B parameters, that was trained primarily on Wikipedia. This model would use a fraction of the computational resources at inference time simply because there are fewer parameters to process. It would also likely give more accurate answers because its training was concentrated on a specific domain, using a much narrower and more focused dataset (almost like fine-tuning).
This is where knowledge packs come in. Instead of one massive model trying to encode all human knowledge in its weights, you'd have a lean base model paired with specialized knowledge modules that can be loaded as needed. When you're asking about Bob Barker, you'd activate the "general knowledge" pack. When you're coding in Laravel, you'd load the "PHP/Laravel" pack and so on.
This isn't quite RAG: you're not doing runtime retrieval from a vector database. Instead, you're swapping in pretrained weights or specialized model components that give the base model deep expertise in specific domains, while keeping the active computational footprint small.
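A toy sketch of that swap-in idea, assuming a hypothetical runtime where packs are loaded and unloaded around a small base model (the class and method names are all made up; a real system would map specialized weights into memory, similar in spirit to swapping LoRA adapters):

```python
class TinyBaseModel:
    """Toy sketch of a small base model with hot-swappable knowledge packs."""

    def __init__(self) -> None:
        self.active_pack = None  # no specialized weights loaded yet

    def load_pack(self, pack_name: str) -> None:
        # In a real system, this would map specialized pretrained
        # weights into memory; here it is just a stub.
        self.active_pack = pack_name

    def unload_pack(self) -> None:
        self.active_pack = None

    def generate(self, prompt: str) -> str:
        scope = self.active_pack or "base weights only"
        return f"[answer to {prompt!r} scoped to: {scope}]"

model = TinyBaseModel()
model.load_pack("php-laravel")
print(model.generate("How do I define a route in Laravel?"))
model.unload_pack()
```

The key property is that only the base model plus one active pack ever occupies memory at a time, which is where the computational savings would come from.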
The only problem is if you loaded a PHP knowledge pack, but the user asks a CSS question. What does the model do?
I think we can extend the tool-calling capabilities of these models to understand knowledge pack routing. Each knowledge pack would contain metadata that the model uses to determine how to route each prompt request. However, this mechanism would be built into the model itself and wouldn't pollute the context window (similar to how MCP can pollute the context window by publishing too many tools or resources).
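A minimal sketch of how that built-in routing might work, assuming each pack ships metadata listing the domains it covers. The pack names, metadata format, and keyword-matching heuristic below are all hypothetical; a real router would live inside the model rather than in application code.

```python
# Hypothetical pack metadata: pack name -> domains it claims to cover.
PACK_METADATA = {
    "php-laravel": ["php", "laravel", "composer", "eloquent"],
    "web-frontend": ["css", "html", "javascript"],
    "general-knowledge": ["history", "people", "geography"],
}

def route_prompt(prompt: str) -> str:
    """Pick the pack whose metadata best matches the prompt's keywords,
    falling back to a general pack when nothing matches."""
    words = [w.strip("?.,!").lower() for w in prompt.split()]
    best_pack, best_score = "general-knowledge", 0
    for pack, domains in PACK_METADATA.items():
        score = sum(1 for w in words if w in domains)
        if score > best_score:
            best_pack, best_score = pack, score
    return best_pack

print(route_prompt("How do I center a div with CSS?"))      # web-frontend
print(route_prompt("Define an Eloquent model in Laravel"))  # php-laravel
print(route_prompt("Who is Bob Barker?"))                   # general-knowledge
```

Because the routing decision is made against compact metadata rather than the packs themselves, it stays cheap and keeps the pack descriptions out of the context window.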
Knowledge packs can fix funding roadblocks
Whether you’re talking about Google AI Overviews or ChatGPT, the reality is that most of this data was, in effect, “stolen” from publishers. Even when models cite the authors, most end-users won’t bother visiting the actual website; everyone is obsessed with instant gratification.
LLM companies trained on public data that publishers, developers, artists, and authors spent thousands of hours creating. Models usually obscure the information enough that it cannot easily be associated with the original work, and when it can be, they cite the content as a source.
These small content creators never see a dime for their hard work, and many have had to either downsize or shut down their websites entirely because they just aren’t getting enough direct traffic anymore. This is only going to get worse as trust in LLMs grows.
Allowing independent publishers to publish their own knowledge packs provides a mechanism for them to get back some of this lost revenue.
I, as an individual, could choose to buy a knowledge pack from an open-source shop, or pay a small monthly fee for the latest content, instead of paying $20-$100 to big tech companies that never contribute back to these communities or individual artisans.
Sure, this would probably inflate the monthly $20 Claude subscription, but with the big players on board, I’m sure we could work out a competitive funding model.
Conclusion
Not to be repetitive, but AGI is not coming!
We need to move away from this big LLM idea and focus on building smaller, narrowly focused tiny LLMs that can be paired with specialized knowledge packs. This approach offers a path toward more sustainable, affordable, and ethically sound AI.
One that reduces computational waste, empowers content creators, and delivers better results for specific use cases. The future of AI isn’t about building a giant AGI model that replaces humans; it’s about using AI as a tool, optimized to help both free and commercial users get real value from it in their daily lives.
It’s about moving past the hype and capitalist ideals and building a future with AI responsibly!



