AI & RoboticsNews

Lightning AI CEO slams OpenAI’s GPT-4 paper as ‘masquerading as research’

Shortly after OpenAI’s surprise release of its long-awaited GPT-4 model yesterday, there was a raft of online criticism about what accompanied the announcement: a 98-page technical report about the “development of GPT-4.” 

Many said the report was notable mostly for what it did not include. In a section called Scope and Limitations of this Technical Report, it says: “Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.”

“I think we can call it shut on ‘Open’ AI: the 98 page paper introducing GPT-4 proudly declares that they’re disclosing *nothing* about the contents of their training set,” tweeted Ben Schmidt, VP of information design at Nomic AI. 

>>Don’t miss our special issue: The quest for Nirvana: Applying AI at scale.<<

And David Picard, an AI researcher at Ecole des Ponts ParisTech, tweeted: “Please @OpenAI change your name ASAP. It’s an insult to our intelligence to call yourself ‘open’ and release that kind of ‘technical report’ that contains no technical information whatsoever.” 

One noteworthy critic of the report is William Falcon, CEO of Lightning AI and creator of PyTorch Lightning, an open-source Python library that provides a high-level interface for popular deep learning framework PyTorch. After he posted the following meme, I reached out to Falcon for comment. This interview has been edited and condensed for clarity.

VentureBeat: There is a lot of criticism right now about the newly released GPT-4 research paper. What are the biggest issues? 

William Falcon: I think what’s bothering everyone is that OpenAI made a whole paper that’s like 90-something pages long. That makes it feel like it’s open-source and academic, but it’s not. They describe literally nothing in there. When an academic paper says benchmarks, it says ‘Hey, we did better than this and here’s a way for you to validate that.’ There’s no way to validate that here.

That’s not a problem if you’re a company and you say, “My thing is 10 times faster than this.” We’re going to take that with a grain of salt. But when you try to masquerade as research, that’s the problem.

When I publish, or anyone in the community publishes a paper, I benchmark it against things that people already have, and they’re public and I put the code out there and I tell them exactly what the data is. Usually, there’s code on GitHub that you can run to reproduce this.

VB: Is this different than it was when ChatGPT came out? Or DALL-E? Were those masquerading as research in the same way?

Falcon: No, they weren’t. Remember, GPT-4 is based on Transformer architecture that was open-sourced for many years by Google. So we all know that that’s exactly what they’re using. They usually had code to verify. It wasn’t fully replicable, but you could make it happen if you knew what you’re doing. With GPT-4, you can’t do it.

My company is not competitive with OpenAI. So we don’t really care. A lot of the other people who are tweeting are competitors. So their beef is mostly that they’re not going to be able to replicate the results. Which is totally fair — OpenAI doesn’t want you to keep copying their models, that makes sense. You have every right to do that as a company. But you’re masquerading as research. That’s the problem.

From GPT to ChatGPT, the thing that made it work really well is RLHF, or reinforcement learning from human feedback. OpenAI showed that that worked. They didn’t need to write a paper about how it works because that’s a known research technique. If we’re cooking, it’s like we all know how to sauté, so let’s try this. Because of that, there are a lot of companies like Anthropic who actually replicated a lot of OpenAI’s results, because they knew what the recipe was. So I think what OpenAI is trying to do now, to safeguard GPT-4 from being copied again, is by not letting you know how it’s done. 

But there’s something else that they’re doing, some version of RLHF that’s not open, so no one knows what that is. It’s very likely some slightly different technique that’s making it work. Honestly, I don’t even know if it works better. It sounds like it does. I hear mixed results about GPT-4.  But the point is, there’s a secret ingredient in there that they’re not telling anyone what it is. That’s confusing everyone.

VB: So in the past, even though it wasn’t exactly replicable, you at least knew what the basic ingredients of the recipe were. But now here’s some new ingredient that no one can identify, like the KFC secret recipe?

Falcon: Yeah, that’s exactly what it is. It could even be their data. Maybe there’s not a change. But just think about if I give you a recipe for fried chicken — we all know how to make fried chicken. But suddenly I do something slightly different and you’re like wait, why is this different? And you can’t even identify the ingredient. Or maybe it’s not even fried. Who knows?

It’s like from 2015-2019 we were trying to figure out as a research field what food people wanted to eat. We found burgers were a hit. From 2020-2022 we learned to cook them well. And in 2023, apparently now we are adding secret sauces to the burgers.

VB: Is the fear that this is where we’re going — that the secret ingredients won’t even be shared, let alone the model itself?

Falcon: Yeah, it’s going to set a bad precedent. I’m a little bit sad about this. We all came from academia. I’m an AI researcher. So our values are rooted in open source and academia. I came from Yann LeCun’s lab at Facebook, where everything that they do is open-source and he keeps doing that and he’s been doing that a lot at FAIR.

I think LLaMa, there’s a recent one that’s introduced that’s a really good example of that thinking. Most of the AI world has done that. My company is open-source, everything we’ve done is open-source, other companies are open-source, we power a lot of those AI tools. So we have all given that a lot to the community for AI to be where it is today.

And OpenAI has been supportive of that generally. They’ve played along nicely. Now, because they have this pressure to monetize, I think literally today is the day where they became really closed-source. They just divorced themselves from the community. They’re like, we don’t care about academia, we’re selling out to Silicon Valley.

We all have VC funding, but we all still maintain academic integrity.

VB: So would you say that this step goes farther than anything from Google, or Microsoft, or Meta?

Falcon: Yeah, Meta is the most open — I’m not biased, I came from there, but they’re still the most open. Google still has private models but they always write papers that you can replicate. Now it might be really hard, like the chef or some crazy restaurant writing a recipe where four people in the world can replicate that recipe, but it’s there if you want to try. Google’s always done that. All these companies have. I think [this is] the first time I’m seeing this is not possible, based on this paper.

VB: What are the dangers of this as far as ethics or responsible AI?

Falcon: One, there’s a whole slew of companies that are starting to come out that are not out of the academia community. They’re Silicon Valley startup types who are starting companies, and they don’t really bring these ethical AI research values with them. I think OpenAI is setting a bad precedent for them. They’re basically saying, it’s cool, just do your thing, we don’t care. So you are going to have all these companies who are not going to be incentivized anymore to make things open-source, to tell people what they’re doing.

Second, if this model goes wrong, and it will, you’ve already seen it with hallucinations and giving you false information, how is the community supposed to react? How are ethical researchers supposed to go and actually suggest solutions and say, this way doesn’t work, maybe tweak it to do this other thing?

The community’s losing out on all this, so these models can get super-dangerous very quickly, without people monitoring them. And it’s just really hard to audit. It’s kind of like a bank that doesn’t belong to FINRA, like how are you supposed to regulate it?

VB: Why do you think OpenAI is doing this? Is there any other way they could have both protected GPT-4 from replication and opened it up?

Falcon: There might be other reasons, I kind of know Sam, but I can’t read his mind. I think they’re more concerned with making the product work. They definitely have concerns about ethics and making sure that things don’t harm people. I think they’ve been thoughtful about that.

In this case, I think it’s really just about people not replicating because, if you notice, every time they launch something new [it gets replicated]. Let’s start with Stable Diffusion. Stable Diffusion came out many years ago by OpenAI. It took a few years to replicate, but it was done in open source by Stability AI. Then ChatGPT came out and it’s only a few months old and we already have a pretty good version that’s open-source. So the time is getting shorter.

At the end of the day, it’s going to come down to what data you have, not the particular model or the techniques you use. So the thing they can do is protect the data, which they already do. They don’t really tell you what they train on. So that’s kind of the main thing that people can do. I just think companies in general need to stop worrying so much about the models themselves being closed-source and worry more about the data and the quality being the thing that you defend.

VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.


Author: Sharon Goldman
Source: Venturebeat

Related posts
AI & RoboticsNews

Nvidia and DataStax just made generative AI smarter and leaner — here’s how

AI & RoboticsNews

OpenAI opens up its most powerful model, o1, to third-party developers

AI & RoboticsNews

UAE’s Falcon 3 challenges open-source leaders amid surging demand for small AI models

DefenseNews

Army, Navy conduct key hypersonic missile test

Sign up for our Newsletter and
stay informed!