February | 2023 | Stuff I like

Running ChatGPT costs millions of dollars a day, which is why OpenAI, the company behind the viral natural-language processing artificial intelligence, has started ChatGPT Plus, a $20/month subscription plan. But our brains are a million times more efficient than the GPUs, CPUs, and memory that make up ChatGPT’s cloud hardware. And neuromorphic computing researchers are working hard to make the miracles that big server farms in the cloud can do today much simpler and cheaper, bringing them down to the small devices in our hands, our homes, our hospitals, and our workplaces.

One of the keys: modeling computing hardware after the computing wetware in human brains.

Including — surprisingly — modeling a characteristic about our own wetware that we really don’t like: death.

“We have to give up immortality,” the CEO of Rain AI, Gordon Wilson, told me in a recent TechFirst podcast. “We have to give up the idea that, you know, we can save software, we can save the memory of the system after the hardware dies.”

Wilson is quoting Geoff Hinton, a cognitive psychologist and computer scientist, author or co-author of over 200 peer-reviewed publications, current Google employee working on Google Brain, and one of the “godfathers” of deep learning. At a recent NeurIPS machine learning conference, he talked about the need for a different kind of hardware substrate to form the foundation of AI that is both smarter and more efficient. It’s analog and neuromorphic — built with artificial neurons in a very human style — and it’s co-designed with software to form a tight blend of hardware and software that is massively more efficient than current AI hardware.

Achieving this is not just a nice-to-have, or a vague theoretical dream.

Building a next-generation foundation for artificial intelligence is literally a multi-billion-dollar concern in the coming age of generative AI and search. One reason is that when deploying large language models (LLMs) in the real world, there are two sets of costs to consider: training the model and running it.

Training a large language model like that used by ChatGPT is expensive (likely in the tens of millions of dollars), but running it is the true expense. Running the model to respond to people’s questions and queries is what AI experts call “inference.”

That’s precisely what runs ChatGPT compute costs into the millions regularly. But it will cost Microsoft’s AI-enhanced Bing much more.

And the costs for Google to respond to the competitive threat and duplicate this capability could be literally astronomical.

“Inference costs far exceed training costs when deploying a model at any reasonable scale,” say Dylan Patel and Afzal Ahmad in SemiAnalysis. “In fact, the costs to inference ChatGPT exceed the training costs on a weekly basis. If ChatGPT-like LLMs are deployed into search, that represents a direct transfer of $30 billion of Google’s profit into the hands of the picks and shovels of the computing industry.”

If you run the numbers like they have, the implications are staggering.

“Deploying current ChatGPT into every search done by Google would require 512,820 A100 HGX servers with a total of 4,102,568 A100 GPUs,” they write. “The total cost of these servers and networking exceeds $100 billion of Capex alone, of which Nvidia would receive a large portion.”
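To get a feel for those totals, here is a minimal Python sketch that derives per-unit figures from the numbers quoted above. The function name `implied_ratios` is made up for illustration, and the $100 billion figure is treated as a floor because the quote says the cost "exceeds" it (and it includes networking), so the per-server and per-GPU values are rough lower bounds, not prices.

```python
# Figures quoted from SemiAnalysis in the passage above.
SERVERS = 512_820          # A100 HGX servers
GPUS = 4_102_568           # A100 GPUs
CAPEX_FLOOR_USD = 100e9    # "exceeds $100 billion", so treat it as a lower bound


def implied_ratios(servers: int, gpus: int, capex_usd: float) -> dict:
    """Derive per-unit figures from the quoted totals (hypothetical helper)."""
    return {
        "gpus_per_server": gpus / servers,
        "capex_per_server_usd": capex_usd / servers,
        "capex_per_gpu_usd": capex_usd / gpus,
    }


ratios = implied_ratios(SERVERS, GPUS, CAPEX_FLOOR_USD)
for name, value in ratios.items():
    print(f"{name}: {value:,.2f}")
```

On these inputs the quoted totals work out to roughly eight GPUs per server and at least about $195,000 of capital cost per server, networking included.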

Assuming that’s not going to happen (likely a good assumption), Google has to find another way to approach similar capability. So does Microsoft, which has released its new ChatGPT-enhanced Bing only in very limited availability, for very good reasons that probably include hardware and cost.

Perhaps that other way is analogous to something we already have a lot of familiarity with.

According to Rain AI’s Wilson, we have to learn from the most efficient computing platform we currently know of: the human brain. Our brain is “a million times” more efficient than the AI technology that ChatGPT and large language models use, Wilson says. And it happens to come in a very flexible, convenient, and portable package.

“I always like to talk about scale and efficiency, right? The brain has achieved both,” Wilson says. “Typically, when we’re looking at compute platforms, we have to choose.”

That means you can get the creativity that is obvious in ChatGPT or Stable Diffusion, which rely on data center compute to build AI-generated answers or art (trained, yes, on copyrighted images), or you can get something that is small and efficient enough to deploy and run on a mobile phone but that doesn’t have much intelligence.

That, Wilson says, is a trade-off that we don’t want to keep having to make.

Which is why, he says, an artificial brain built with memristors that can “ultimately enable 100 billion-parameter models in a chip the size of a thumbnail” is critical.

For reference, ChatGPT’s large language model is built on 175 billion parameters, and it’s one of the largest and most powerful yet built. ChatGPT 4, which rumors say is as big a leap from ChatGPT 3 as the third version was from its predecessors, will likely be much larger. But even the current version used 10,000 Nvidia GPUs just for training, with likely more to support actual queries, and costs about a penny an answer.
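The penny-an-answer figure can be tied back to the "millions of dollars a day" claim with simple division. The sketch below is only back-of-the-envelope arithmetic: the helper `answers_per_day` is hypothetical, the one-cent cost is the approximate figure cited above, and the daily cost values are illustrative stand-ins for "millions", not reported numbers.

```python
def answers_per_day(daily_cost_usd: float, cost_per_answer_usd: float = 0.01) -> int:
    """Number of answers a daily inference budget would cover.

    The default of one cent per answer is the approximate figure cited
    in the article; everything else here is illustrative.
    """
    if cost_per_answer_usd <= 0:
        raise ValueError("cost per answer must be positive")
    return round(daily_cost_usd / cost_per_answer_usd)


# Hypothetical daily budgets standing in for "millions of dollars a day".
for daily_cost in (1_000_000, 3_000_000):
    print(f"${daily_cost:,} per day covers {answers_per_day(daily_cost):,} answers")
```

At a penny each, every million dollars of daily spend corresponds to about 100 million answers, which is why even a small per-query cost adds up quickly at search-engine scale.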

Running something of roughly similar scale on your finger is going to be multiple orders of magnitude cheaper.

And if we can do that, it unlocks much smarter machines that generate that intelligence in much more local ways.

“How can we make training so cheap and so efficient that you can push that all the way to the edge?” Wilson asks. “Because if you can do that, then I think that’s what really encapsulates an artificial brain. It’s a device. It’s a piece of hardware and software that can exist, untethered, perhaps in a cell phone, or AirPods, or a robot, or a drone. And it importantly has the ability to learn on the fly. To adapt to a changing environment or a changing self.”

That’s a critical evolution in the development of artificial intelligence. Doing so enables smarts in machines we own and not just rent, which means intelligence that is not dependent on full-time access to the cloud. Also: intelligence that doesn’t upload everything known about us to systems owned by corporations we end up having no choice but to trust.

It also, potentially, enables machines that differentiate. Learn. Adapt. Maybe even grow.

My car should know me and my area better than a distant colleague’s car. Your personal robot should know you and your routines, your likes and dislikes, better than mine. And those likes and dislikes, with your personal data, should stay local on that local machine.

There’s a lot more development to be done on analog systems and neuromorphic computing, however: at least several years’ worth. Rain has been working on the problem for six years, and Wilson thinks shipping product in quantity (10,000 units for OpenAI, 100,000 units for Google) is at least “a few years away.” Other companies, like chip giant Intel with its Loihi chip, are also working on neuromorphic computing, but we haven’t seen that come to market at scale yet.

If and when we do, however, the brain-emulation approach shows great promise. And the potential for great disruption.

“A brain is a platform that sports intelligence,” says Wilson. “And a brain, a biological brain, is hardware and software and algorithms all blended together in a very deeply intertwined way. An artificial brain, like what we’re building at Rain, is also hardware plus algorithms plus software, co-designed, intertwined, in a way that is really … inseparable.”

Even, possibly, at shutdown.

