OpenAI may be synonymous with machine learning and Google is doing its best to get itself off the ground, but both may soon face a new threat: rapidly multiplying open source projects pushing the state of the art and leaving the entrenched but unwieldy companies in their dust. This Zerg-like threat may not be existential, but it’s sure to keep the dominant players on the defensive.
The idea is by no means new – in the fast-moving AI community, this kind of disruption is expected to happen on a weekly basis – but the situation was put into perspective by a widely shared document claimed to be from Google. “We don’t have a moat, and neither does OpenAI,” the memo reads.
I won’t burden the reader with a long summary of this perfectly readable and interesting piece, but the gist is that while GPT-4 and other proprietary models have gotten the lion’s share of the attention and even the income, the edge they’ve gained with funding and infrastructure is looking leaner by the day.
The pace of OpenAI’s releases may seem lightning fast by the standards of ordinary major software releases: GPT-3, ChatGPT, and GPT-4 certainly followed close on each other’s heels compared to versions of iOS or Photoshop. But they still occur on the scale of months and years.
What the memo points out is that a base language model from Meta, called LLaMA, was leaked in pretty rough form in March. Within weeks, people tinkering on laptops and penny-a-minute servers had added core features such as instruction tuning, multiple modalities, and reinforcement learning from human feedback. OpenAI and Google were probably poking around in the code too, but they couldn’t replicate the level of collaboration and experimentation that occurs in subreddits and Discords.
Could it really be that the giant computational problem that seemed like an insurmountable obstacle – a moat – to challengers is already a vestige of another era of AI development?
Sam Altman already noted that we should expect diminishing returns if we throw parameters at the problem. Bigger isn’t always better, of course – but few would have guessed that smaller was instead.
GPT-4 is a Walmart and nobody really likes Walmart
The business paradigm currently being pursued by OpenAI and others is a direct descendant of the SaaS model. You have a high-value piece of software or a service and you provide carefully secured access to it through an API or similar. It’s a straightforward and time-tested approach that makes perfect sense when you’ve invested hundreds of millions in developing a single monolithic but versatile product, such as a large language model.
If GPT-4 generalizes well for answering questions about contract law precedent, great – it doesn’t matter that much of its “intellect” is devoted to being able to copy the style of any author who has ever published a work in the English language. GPT-4 is like a Walmart. Nobody actually wants to go there, so the company makes sure there is no other option.
But customers are starting to wonder: Why am I walking down 50 aisles of junk to buy a few apples? Why am I hiring the services of the largest and most general-purpose AI model ever created if all I want to do is apply some intelligence to comparing the language of this contract with a few hundred others? At the risk of torturing the metaphor (not to mention the reader), if GPT-4 is the Walmart you go to for apples, what happens when a fruit stand opens in the parking lot?
It wasn’t long in the AI world before a large language model, in highly truncated form, of course, was running on (appropriately) a Raspberry Pi. For a company like OpenAI, its jockey Microsoft, Google or anyone else in the AI-as-a-service world, this basically undermines the whole premise of their business: that these systems are so hard to build and run that they have to do it for you. In fact, it seems that these companies have chosen and developed a version of AI that fits their existing business model, not the other way around!
There was a time when you had to transfer the calculations involved in word processing to a mainframe – your terminal was just a monitor. Of course, that was a different era and we have long since been able to fit the entire application on a personal computer. That process has happened many times since then, as our devices have repeatedly and exponentially increased their computational capacity. Nowadays, when something has to be done on a supercomputer, everyone understands that it is only a matter of time and optimization.
For Google and OpenAI, the time came much faster than expected. And they weren’t the ones doing the optimization – and may never be at this rate.
That doesn’t mean they’re simply out of luck. Google didn’t get where it is by being the best – at least not for a long time. Being a Walmart has its perks. Businesses don’t want to have to find the custom solution that gets the job done 30% faster if they can get a decent price from their existing supplier and not rock the boat too much. Never underestimate the value of inertia in business!
Of course, people are iterating on LLaMA so quickly that they’re running out of camelids to name them after. By the way, I’d like to thank the developers for an excuse to just scroll through hundreds of photos of cute, tawny vicuñas instead of working. But few enterprise IT departments will cobble together an implementation of Stability’s in-progress open source derivative of a quasi-legal leaked Meta model over OpenAI’s simple, effective API. They have a business to run!
But at the same time, I stopped using Photoshop for image editing and creation years ago because open source options like Gimp and Paint.net have gotten so incredibly good. At this point, the argument goes the other way. Pay how much for Photoshop? No way, we have a business to run!
What Google’s anonymous authors are clearly concerned about is that the distance from the first situation to the second will be much shorter than anyone thought, and that no one seems to be able to do anything about it.
Except, the memo argues, to embrace it. Open up, publish, collaborate, share, compromise. As they conclude:
Google should become a leader in the open source community and lead the way by collaborating with, rather than ignoring, the wider conversation. This probably means taking some awkward steps, like publishing the model weights for minor ULM variants. This necessarily means giving up some control over our models. But this compromise is inevitable. We cannot hope to both stimulate and control innovation.