OpenAI CEO Sam Altman recently sat for an in-depth interview in which he recounted OpenAI’s history from 2016 to the present and shared his views on the current state of the artificial intelligence (AI) industry and where it is heading.

**He began by reviewing the founding and development of OpenAI**

In 2016, Altman and about a dozen colleagues sat around a whiteboard in a small office, working out which direction to take in AI. At that point they had not even settled on the concept of a large language model (LLM); they were just trying to get AI to play video games. Altman recalled: “We were just convinced that a certain direction was right, but we had no clear plan, let alone any idea that we would stand on the stage today.” In its early days OpenAI explored many directions, from games to robotics to unsupervised learning models, and that exploration gradually clarified the development path of the GPT series.
# Here’s a summary of some of the main elements:
**The first product was not ChatGPT**
- The first true consumer product was the DALL·E image generation model.
- The first real commercial product was the GPT-3 API, launched around June 2020.
**The secret of the fast release cadence**
- Stick to the “small team + high responsibility” principle so that every person produces real output.
- Avoid bureaucracy: keep teams small, with a few people taking on a lot of responsibility, rather than “dozens of people sitting in conference rooms arguing about details”.
- Keep everyone busy: make sure researchers, engineers and product teams stay busy and productive.
- Company growth must be matched by growth in product output; otherwise the company easily slides into “headcount inflation + proliferation of meetings”.
**Strategic direction for products**
- **OpenAI positions itself as a “core AI subscription service”**, built around ChatGPT.
- Continue to expand the API and SDK and explore becoming a platform.
- Long-term goal: build an “AI platform on the level of a future operating system”.
**Models, compute and algorithms**
- Three pillars: **better models**, **stronger computing infrastructure**, **broader integration into society**.
- Model scale keeps growing, from GPT-3 to GPT-4, and will continue to increase.
- Algorithmic innovation will be the biggest lever; there may be one or two “10x or 100x” breakthroughs ahead.
- Coding ability will become a core capability of AI; future models will need to carry out actions and even complete entire workflows.
- Voice will be OpenAI’s next area of investment, aiming at voice interaction with humans, which could lead to new interaction patterns and even new types of devices.
**The future shape of ChatGPT**
- Long-term goal: **infinite context memory** + **personalization without fine-tuning**.
- Ideal state: all of a user’s life data, conversations and behaviour sit in the context, enabling full personalization.
**How young users use it: ChatGPT as an operating system**

In the interview, Altman specifically mentioned the distinctive way the younger generation uses ChatGPT:
- Older people use ChatGPT as a search engine;
- People in their 20s and 30s treat it as a life advisor;
- Students use ChatGPT as a personal operating system, interacting and collaborating with it in more depth.
He was particularly struck by this generational difference: “Young people consult ChatGPT before almost every major decision, they keep prompts memorized in their heads, and their interaction with the model is very sophisticated.”

**Attitude toward and vision for customization**

Altman said that model customization at this stage is a compromise relative to the ideal state, which is: “A small reasoning model with a context of a trillion tokens, covering all the data of your entire life, that can make use of your personal data without frequent retraining.”

**Industry observations and the challenge of corporate transformation**

On the dilemma large companies face in transforming for the AI era, Altman argued:
- Large companies are bound by old rules, and every technological revolution produces successful start-ups;
- Large companies decide slowly and struggle to adapt to a rapidly changing environment, while young firms can adapt and innovate faster.
He pointed out that “a company is essentially an extension of individual behaviour: young people adapt more readily to rapid changes in AI tools, while large companies often lag behind”.

**Advice to entrepreneurs: resilience and endurance in the face of adversity**

At the end of the interview, Altman shared his advice for entrepreneurs facing adversity:
- The real difficulty is not the day of the crisis itself, but its long aftermath;
- “Emotional resilience and the capacity to withstand adversity require continuous training and development.”
He encouraged founders: “Your psychological capacity to deal with problems grows stronger over time, even as the stakes keep rising.”
**Interview transcript (translated)**
**Moderator:** Our next guest needs no introduction, so I won’t say much. I just want to say that Sam Altman has joined our AI event three times in a row, and we appreciate his support. Welcome, Sam.

Sam Altman: Thank you. Glad to be back. This was our first office.

**Moderator:** Really? Say that again?

Sam: Yeah, this was our first office. It’s nice to be back here.

**Moderator:** Let’s go back to those early office days. You started in 2016. We just had Jensen here, and he said he delivered the first DGX-1 system to you.

Sam: That’s what he said? Yeah, think about how small that machine seems now… though the equipment is still big. But it’s a fun memory.

**Moderator:** How much did it weigh?

Sam: He said about 70 pounds. It’s heavy, but it can still be moved.

**Moderator:** So in 2016, did you ever imagine you’d be standing here today?

Sam: No. At that time there were about 14 of us sitting around a whiteboard discussing what we were going to do. We were a research lab with strong conviction and a sense of direction, but no specific action plan. Not only was there no concept of a “company” or “products” at the time, even large language models were still a long way off. Our goal was to “get AI to play games.”

**Moderator:** It took you six years to launch your first consumer product, ChatGPT. How did you set milestones along the way?

Sam: The first consumer product wasn’t ChatGPT, it was DALL·E. But strictly speaking, the first real product was the API. We tried a lot of directions. We thought, “We have to build systems to see whether we’re on the right path, not just write papers.” So we tried getting models to play video games, control robots, and so on. Then it started with one person, and later a small team, who were interested in unsupervised learning and building language models. That led to GPT-1, and then to GPT-2.
By GPT-3, we thought, “This thing is interesting,” but we didn’t know what to do with it, and we realized we needed more money to build bigger models like GPT-4. Training such “billion-dollar models” is hard to sustain as a pure science experiment unless you are a research institution on the scale of a particle accelerator, and even then it’s difficult. So we started thinking: we want to turn this into a sustainable business, and we also believe this technology will be useful. We had released the GPT-2 model weights, and the response was flat. I had noticed a pattern: a lot of YC (Y Combinator) companies tend to do well when they build API products. And as the models got bigger and harder to deploy, we said, “Let’s write good software to host them.” At the same time, we didn’t want to build a product ourselves; we wanted others to build products on our API. I don’t remember the exact date, probably June 2020, when we released the GPT-3 API. The public didn’t react much, but some people in Silicon Valley noticed. They thought, “Oh, this is interesting.” Some even thought it was an early form of AGI. But the only real businesses built on the GPT-3 API were probably a few “AI writing service” companies. GPT-3 had just barely crossed the threshold of “economic utility” in that area. But we noticed something interesting: nobody could build many products with the GPT-3 API, yet everybody liked chatting with it in the Playground. It was poor at chat back then, because we hadn’t yet applied reinforcement learning from human feedback (RLHF), but people still liked talking to it. Apart from writing, that was probably the only “killer application,” and it was the key clue that led us to build ChatGPT. By the time we got to GPT-3.5, the number of areas where you could build a business on the API had gone from one to about eight, but our core conviction had become clearer: people want to talk to models. We had already done DALL·E, and it was doing well.
But once models could be fine-tuned, we knew very clearly that we wanted to build a “product that lets you talk to the model,” and that became ChatGPT. It went live on November 30, 2022, about six years after we were founded. Now more than 500 million people use it every week.

**Moderator:** You’ve been shipping very fast over the last six months. How do you do it? Many big companies move more slowly.

Sam: A lot of companies get bigger without “doing more”; their output barely changes. They keep the same product lines and the same release cadence, which slows everything down and reduces efficiency. I believe “keeping everyone busy” is the key. We prefer to keep teams small and give them a lot of responsibility. Otherwise you end up with “40 people in the same meeting” arguing over a small feature. There’s a business principle that a good executive is a busy one, because an idle executive is probably just meddling. Researchers, engineers and product managers are the ones who create value. Keeping those people energized and productive is the best approach. Right now we have to build a truly important internet platform. If we really want to be everyone’s “everyday AI assistant”, we have to cover all the services, scenarios, platforms and devices in their lives. That means we have to build a lot of things, and we can’t wait.

**Moderator:** What product from the last six months are you proudest of?

Sam: I think the models themselves are really good. Of course there’s room for improvement, but ChatGPT is a good product because the model is good. We’ve done a lot at the product level, but the “big model” is the core.

**Moderator:** How can other companies avoid being run over by you?

Sam: We want to be the “core AI subscription service” for users. We’ll build increasingly intelligent models, along with some interfaces, future devices and operating-system-like entry points. But we haven’t fully figured out the right shape for the API, the SDK and these interfaces, and we may need a few more attempts.
But once we get it right, we want it to create a lot of wealth and opportunity by helping others build on top of it. Our focus is the models, the subscription and a few key use cases; everyone else is welcome to build the rest.

**Moderator:** From the outside, you’re raising $40 billion at a $340 billion valuation?

Sam: We’ve already announced that.

**Moderator:** What is your next big ambition?

Sam: There’s no grand blueprint. We just keep building models and shipping good products. We don’t plan by working backwards from an end goal. We believe that focusing on each step in front of us works better than reasoning backwards from the finish line. We know we need bigger AI infrastructure, stronger models and better consumer products, and we stay flexible and adjust tactics quickly. We may not yet have figured out the product we’ll build next year. But I have more faith than ever in our research roadmap.

**Moderator:** So you believe more in “moving forward step by step” than in “working backwards from a blueprint”?

Sam: Yes. I’ve heard people talk about how they’re “plotting the path”, like, “We’ll do this, then that, then take over the world”, and then work backwards to today. I’ve never seen people like that actually succeed.

**Audience:** What do you think large corporations get wrong when they try to become AI-native organizations?

Sam: I think this happens in every technological revolution. It’s nothing surprising. The problem is that, as in the past, they are locked into their old path. When the world is changing dramatically every quarter and your information security committee meets only once a year to decide which applications are allowed and how data can be accessed… that’s a disaster. Large companies are trapped by their own processes and culture. They try to pretend that these changes won’t reshape their industry, but in the end they can only “surrender” at the last minute.
It’s not just an organizational phenomenon; it’s also a generational difference. For example, look at how a 20-year-old uses ChatGPT and then at how a 35-year-old uses it, and the difference is striking. It’s like when smartphones came out: kids picked them up right away, while adults took three years to learn the basic functions. This generational gap is especially visible with AI tools today, and companies are just an extension of it.

**Audience:** What is distinctive about how young people use ChatGPT?

Sam: They really use it as an “operating system”. They have elaborate setups that connect ChatGPT to all kinds of files, and they have complex prompts memorized, or saved somewhere to paste in at any time. They even ask ChatGPT before making life decisions. It holds their chat history and the context of their lives, including the people in them, and the “memory feature” deepens that relationship. To sum up:
- Older users use it as a replacement for Google;
- Users in their 20s and 30s use it as a “life advisor”;
- College students use it as an “AI operating system”.
**Audience:** How does OpenAI use ChatGPT internally?

Sam: It writes a lot of our code. I don’t know exactly how much, and I don’t think measuring by “lines of code” is meaningful. Microsoft says 20-30% of its code is written by AI, but line counts aren’t the point. What I can say is that it does write code for “important parts”.

**Audience:** Since most of your revenue comes from consumer subscriptions, why keep the API?

Sam: I hope that in 10 years these things will be integrated. For example, you could sign in to other services with your OpenAI account. Other services could integrate our powerful SDK, or even be embedded in the ChatGPT UI. Because if you have an AI that understands you, with your context, your memory and knowledge of your life, you’ll want to use it in many different places. The current API is a long way from that goal, but I’m sure we’ll get there.

**Audience:** We’re a start-up building at the application layer, and we want to use lower-level components such as a “Deep Research API”. Will you prioritize developers?

Sam: I hope we can eventually build a new protocol, something like HTTP for the internet era. The future internet will be decentralised, made up of many small components and agents. They will call each other’s tools and handle authentication, transfers and data sharing, all built into a common protocol layer. We don’t know exactly what it looks like yet, but it is “emerging from the fog.” We’ll need a few more iterations, but that’s what I’d like the platform to look like.

**Audience:** Do you consider feeding in real-world sensor data (e.g. temperature) to improve AI’s understanding?

Sam: A lot of people are already doing this. For example, some feed sensor data into the API (e.g. calling GPT-4o), and it works very well in some scenarios. The latest models have improved significantly at handling this kind of data; they weren’t good at it in the past, but now they do well. We’ll support this more systematically in the future.
**Audience:** How important do you think voice is? Where does it rank at the infrastructure level?

Sam: Voice is very important. Frankly, our voice product isn’t good enough yet, but that’s okay; our text models weren’t good at first either. We’ll eventually crack voice. I’m sure many people will prefer to interact by voice. When we first released voice mode, I noticed something very interesting: you can tap around on your phone while you talk, like a layered interaction. This “voice + GUI” experience has great potential. We haven’t cracked it yet, but once we do, I think it will not only work well on existing devices but may even give rise to an entirely new category of device.

**Audience:** Is coding just one of your strengths, or is it core to the future?

Sam: Code is core. Today ChatGPT returns text and sometimes images. But ideally it should return an entire program. That is, it can build a complete system at your request, or call APIs to do what you want. I think “writing code” is the central way for AI to act on and influence the world.

**Audience:** Besides data, algorithms and compute, what important factors are undervalued?

Sam: Each of those is really hard. And of course the biggest lever is algorithmic breakthroughs. I think there are still a few 10x or 100x advances to be had; not many, but one or two can make a big difference. So yes, it’s the three main axes: algorithms, data, compute.

**Audience:** How do you balance free exploration with managing project delivery?

Sam: Some projects do require “top-down” coordination, but a lot of people do too much of it. Since OpenAI was founded, we’ve spent a lot of time studying what makes a good research lab. You have to go back through history and look at the great research institutions of the past.
We asked a lot of people; many of those who ran such labs have, of course, passed away, since it has been a long time since labs like that existed. People often ask us, “Why does OpenAI innovate while other labs imitate?” We share the principles we follow, where they came from and what we’ve learned. And they say, “Thanks for sharing, we’ll do it our own way.” And then they fail. We didn’t invent these principles ourselves; we “shamelessly copied” them from the best research labs in history, and they really work.

**Audience:** Do you think large models can help us answer questions in the humanities, such as historical cycles and social bias? Does OpenAI have plans for collaboration?

Sam: Yes, we have collaborative projects with academia, and we also offer some customized support. But most researchers just want access to the models or to base models, and we’re good at providing that. Most of our effort is focused on “making models smarter, cheaper and more widely available”, which is in fact very useful to academia and to humanity as a whole.

**Audience:** What do you think of “AI personalization” and “custom models”? Do you favour “improving the core model” or “post-training and fine-tuning”?

Sam: My ideal is a small but powerful reasoning model with a trillion-token context window that holds your whole life. The model never needs to be retrained and its weights never change, but it holds all your conversations, the books you’ve read, your email, web pages, data streams and company information, and the context is continuously updated. That’s my ideal “custom model”, and all the fine-tuning and post-training we do now are compromises relative to that ideal.

**Audience:** What do you think will be the greatest source of value over the next 12 months?
Sam: I think future value will be concentrated in three areas:
- Infrastructure: large-scale AI factories and compute clusters;
- Smarter models;
- System design that integrates AI into society.

If these three things keep moving forward, other problems will resolve themselves. More specifically: I predict 2025 will be the year of “AI doing things”, with **code** in particular as the main battleground. By 2026, we may see AI helping humans make major scientific discoveries. In 2027, I think AI will really move into the physical world, with robots going from “novelty toys” to “economic actors”.

**Moderator:** A few last quick questions. Will GPT-5 be smarter than all of us here?

Sam: If you think you’re a lot smarter than GPT-4o, then maybe it still has some catching up to do. But GPT-4o is already smart.

**Moderator:** As a founder, you’ve been through turmoil at your company, though that’s a little way behind you now.

Sam: Over time, you’ll face bigger challenges and higher stakes. But you’ll feel less stressed by them. The more hardship you go through, the faster you recover, and the more emotional resilience you build. What’s really hard isn’t the moment of crisis itself but “day 60 afterwards,” when you’re trying to rebuild; that’s the worst. During the crisis itself there’s adrenaline and people supporting you, but the long-term recovery and psychological rebuilding are harder and often neglected.

**Moderator:** Thank you, Sam. You’re actually on paternity leave.

Sam: Yes, but it’s good to be here. Thanks for the invitation.