What is an AI Engineer?

Raza Habib

In this week’s episode of the High Agency Podcast, I had the pleasure of hosting Shawn Wang, better known as Swyx. As the founder of Smol AI, host of the Latent Space podcast, and author of the influential essay "Rise of the AI Engineer," Swyx is at the forefront of defining the evolving field of AI engineering. The conversation covered a wide range of topics, from the emergence of the AI Engineer role to key trends shaping the industry. Swyx's insights, drawn from his extensive experience and network in the AI community, offer valuable perspectives for anyone working in AI development and product management.

Subscribe to Humanloop’s new podcast, High Agency, on YouTube, Spotify, or Apple Podcasts

Swyx sees the AI Engineer as a distinct role that bridges the gap between traditional ML engineering and product-focused development. While he acknowledges that the title might face skepticism and evolve over time, he believes it serves a crucial purpose in the current landscape. Unlike ML Engineers who typically focus on model development and optimization, AI Engineers are more concerned with applying AI capabilities to solve specific product challenges. They often work with pre-trained models and APIs, focusing on rapid iteration and product integration rather than deep mathematical optimizations. Swyx notes that while having ML knowledge is beneficial, the key skills for AI Engineers include understanding AI capabilities, product thinking, and the ability to quickly prototype and iterate on AI-powered features.


Shawn's depiction of the spectrum of software engineering roles and where the AI engineer fits in.

When it comes to building an AI product team, Swyx suggests an interesting composition. He proposes a ratio of about 4 AI Engineers to 1 ML Engineer. This balance allows teams to have the necessary depth in model development while maintaining a strong focus on product application and feature development. Swyx emphasizes that AI Engineers can handle many tasks that ML Engineers might find tedious or less interesting, such as API integration, user interface development, and product iteration. He also stresses the importance of product managers and domain experts in AI teams, noting that their insights into customer needs and market dynamics are crucial for successful AI product development.

Swyx highlighted several key trends that AI professionals should be aware of. First, he pointed to the ongoing commodification of intelligence, where the cost of AI capabilities (measured by benchmarks like MMLU) is decreasing rapidly year over year. He also noted the significant improvements in inference speeds, with some models aiming for thousands of tokens per second. Expanding context windows, reaching up to a million tokens, is another trend opening new possibilities. Swyx emphasized the growing importance of multimodal AI, capable of handling various types of input and output. Lastly, he introduced the concept of "temperature 2" use cases, where AI's creative and sometimes unpredictable outputs are seen as a feature rather than a bug, potentially opening new avenues for innovation in AI applications.
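The "temperature 2" phrase refers to the sampling temperature that most LLM APIs expose: logits are divided by the temperature before the softmax, so values above 1 flatten the token distribution and make unlikely, more surprising tokens far more probable. A minimal sketch of that mechanic (the logit values below are purely illustrative, not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into a probability distribution,
    sharpening it (T < 1) or flattening it (T > 1) first."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(logits, 0.5)  # conservative sampling
flat = softmax_with_temperature(logits, 2.0)   # "temperature 2" regime

# At temperature 2 the top token loses much of its probability mass,
# so low-probability, more creative continuations get sampled often.
```

With these example logits, the top token takes roughly 86% of the probability mass at temperature 0.5 but only about 50% at temperature 2, which is why high-temperature outputs read as more creative and less predictable.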

I found it to be a fascinating conversation and I think you’ll really enjoy it too.

Chapters

00:00 - Introduction and background on Shawn Wang (Swyx)
03:45 - Reflecting on the "Rise of the AI Engineer" essay
07:30 - Skills and characteristics of AI Engineers
12:15 - Team composition for AI products
16:30 - Vertical vs. horizontal AI startups
23:00 - Advice for AI product creators and leaders
28:15 - Tools and buying vs. building for AI products
33:30 - Key trends in AI research and development
41:00 - Closing thoughts and information on the AI Engineer World Fair Summit

Highlights:

05:30 - "A wide range of AI tasks that used to take five years and a research team to accomplish now just require API docs and a spare afternoon in 2023."

18:45 - "Vertical startups actually have the unique insight into their customers that they're making the most money and growing the fastest compared to the horizontal ones."

25:30 - "Move fast and much more so than a typical YC startup mentality. If you're taking like three months to ship a thing, what could you do to make it three weeks?"

39:30 - "The cost of like a 70 MMLU every single year goes down by something like five to 10X. And so like, this is the observed trend."

Intro:

[00:00:00] Shawn Wang: I think there's some advantage to being early, but you do not get the right to win just by being early. The AI engineer is also like sort of a plug or filler gap for all the things that are not traditionally part of the ML engineer skill set. So what do I mean by that? The AI engineer is more of the 0 to 1 phase, whereas the ML engineer is more of the 1 to n phase.

[00:00:17] Shawn Wang: We should adopt a fire-ready-aim approach instead of a ready-aim-fire approach. And where the old ML engineering process was much more deliberative, here you actually win by moving really quickly and getting information from the market, from shipping products, so that you can iterate much faster. I think we have three foundation models launching, by the way. That's sort of, yeah, I have never said that publicly, but

[00:00:39] Raza: you heard it here first.

Podcast:

[00:00:41] Raza: This is High Agency, the podcast for AI builders. I'm Raza Habib. Today's a bit of a special episode of the High Agency Podcast because I'm joined by Shawn Wang, a.k.a. Swyx, founder of Smol AI, the host of the Latent Space podcast, and critically the author of the essay Rise of the AI Engineer, as well as the founder of the AI Engineer [00:01:00] World Fair.

[00:01:00] Raza: And we're recording today two weeks before that AI Engineer World Fair, as well as roughly a year on from when Swyx first released the essay. And so I wanted to, to get Swyx on the show to talk about the AI Engineer essay, what's happened since then, as well as the upcoming conference and trends. So Shawn, it's great to have you on.

[00:01:17] Shawn Wang: Yeah. Thanks so much for having me excited to come on and talk a little bit about the meta behind the podcast and the essay.

[00:01:24] Raza: Yeah, fantastic. To start with, do you mind, I actually wanted to just read a little extract from the start of the essay to give people who maybe haven't read it the context, right?

[00:01:33] Raza: And it was polemic almost, pitching the claim that there's going to be a new type of role emerging, a different type of engineering role as a result of, of LLMs and, and Gen AI. And I think you put it really beautifully. So I'm going to just read a little extract from the start and then I'd love to get your reflections on it.

[00:01:49] Raza: Now that we're a year on. So what you said was: we're observing a once-in-a-generation "shift right" of applied AI, fueled by the emergent capabilities and the open source API [00:02:00] availability of foundation models. A wide range of AI tasks that used to take five years and a research team to accomplish in 2013 now just require API docs and a spare afternoon in 2023.

[00:02:12] Raza: However, the devil is in the details. There are no end of challenges in successfully evaluating, applying and productizing AI. I take this seriously and literally, I think it's a full time job. I think software engineering will spawn a new subdiscipline, specializing in applications of AI and wielding the emerging stack effectively.

[00:02:28] Raza: Just as site reliability engineer, DevOps engineer and data engineer and analytics engineer emerged, the emerging version of this role seems to be AI engineer. So I think that was a good summary of the thesis and beautifully written. So Shawn, my first question to you is, you know, the essay got a lot of attention at the time.

[00:02:45] Raza: Inevitably on Hacker News, there was a mixed reception: some people completely buy it, some people are critical. Reflecting now, 11 months later, what do you think you got right? What do you think has changed? I'd be curious to get your take.

[00:02:57] Shawn Wang: So, first of all, it's very flattering to have it read out by [00:03:00] you, because I definitely feel like you were earlier, actually, to this persona, and you know, I think that that is a lot of the success of Humanloop as well.

[00:03:09] Shawn Wang: Yeah. So I think mostly the general thesis held up. The one thing I probably did better than in the essay was just that diagram. I knew that I needed to have a very, very simple diagram that would kind of stick in people's minds. And

[00:03:23] Raza: this is the picture you have in the essay of kind of a spectrum where on one side you have kind of hardcore ML engineers and on the other side, I think you've got sort of front end developers.

[00:03:33] Raza: There's a new kind of emerging gap. Well, we can link to it from the show notes, but yeah, maybe describe it for people who are listening.

[00:03:39] Shawn Wang: Yeah. I typically describe it as a spectrum from sort of data constrained to product constrained. And then there's a bunch of roles from most ML-oriented to least ML-oriented, like zero ML-oriented.

[00:03:51] Shawn Wang: And there's, you know, there's all sorts of different roles beyond that. And I think that the primary feature that I think people should understand is there's a sort of API line between [00:04:00] the ML side and the product side, right? The API line used to be internal within the company and now increasingly it's between companies, especially as the cost to create a model increases and model labs become increasingly closed.

[00:04:13] Shawn Wang: Actually, it used to be the case that most companies would hire their own ML engineers and ML research scientists and all that. And now it's increasingly outsourced, and you can sort of decouple that because of the rise of foundation models. And concurrently, there is a rise in specialist engineers on the other side of the API line who are up to date and know the stack, because the stack is deepening every single day.

[00:04:34] Shawn Wang: It is a full time job to keep up with the news and the people who are putting effort into it are going to do better than the people who are generalist and dabbling a little bit in AI. And I think that once I had that fundamental insight, I think it was very hard to shake the idea that there would be full timers who want to distinguish themselves.

[00:04:51] Shawn Wang: And that can also come a little bit into the origin story of the essay, which was, you know, really, I was just trying to create a term that everyone would sort of congregate on, on the buy side and sell [00:05:00] side of talent. Because companies were talking to me all the time about like, I want to hire for this profile, but the average engineer hasn't spent nearly the amount of time that we would want for an AI specialist role.

[00:05:09] Shawn Wang: And then on the software side, on the engineering side, people wanted to know which direction to specialize in, and honestly, to have a little bit of legitimacy, because they would never have the background of a research scientist. They don't have a PhD. They don't even have the qualifications of a typical data scientist.

[00:05:26] Shawn Wang: And I realized that all of those things were irrelevant in the new age of foundation models anyway. So here's, here's an opportunity for generational shift because of a platform shift. And I've seen this before. So that was the last part of the essay, which was that similar things that, you know, I've observed in my career for SRE and DevOps and data is happening for AI.

[00:05:44] Shawn Wang: So that's the process I went through really.

[00:05:46] Raza: One of the lines in that intro that really stuck out for me was when you said, you know, a wide range of AI tasks that used to take five years and a research team to accomplish now just require API docs and a spare afternoon in [00:06:00] 2023. And I think there's also this very famous XKCD cartoon from maybe like 2010, 2012, that kind of timeframe.

[00:06:05] Raza: Where, uh, you know, someone's asking an engineer like, oh, can you build an app that uses GPS satellites to tell me my exact location in a national park? And he's like, great, yes. Then the follow-up question is, and can it also allow me to take pictures of birds and know what they are? And the response is, you know, give me 10 years and a research team.

[00:06:22] Raza: I'll come back to you. And the intuition that people often have for, like, what's hard and what's easy in machine learning has historically been off, given that it is now possible to do things in hours or days that previously required a research team and years of work, et cetera. It's clearly not the case that the skills you need to be good at this are, like, a PhD in math research and all of those kinds of things.

[00:06:45] Raza: So what are the skills? Like what is the skill set that an AI engineer does need that's different to an ML engineer?

[00:06:52] Shawn Wang: I don't have a pat answer to this yet. And that's honestly one thing that I refrained from doing when writing the essay. [00:07:00] because every successful specification is necessarily underspecified and I wanted to let people define it themselves and, you know, have, have their own contribution to what it is.

[00:07:11] Shawn Wang: I think it's a spectrum. The more of the ML engineer skills you have, the more successful you're going to be as an AI engineer. It's just not strictly required to get started. But obviously, the more advanced you become as an AI engineer, the more you're going to want to learn of the ML engineer skills.

[00:07:27] Shawn Wang: What does that involve? So for example, you know, to get some kind of AI product up and running really quickly, you just need to prompt, you just need to know how to call a few APIs. But eventually, for example, if you want to put something into production, and if you want to create a moat around your product, you probably want to invest in your ops stack, your fine-tuning stack, and all the inference and eval capabilities that you may need in order to support that.

[00:07:53] Shawn Wang: And that's increasingly more and more ML engineer. So I definitely think the deeper you go into the moat that you might want [00:08:00] to build for an AI product, and the more control over open source you want, the more of those skills you need. I think that those are the primary criteria, right? Like if you're primarily sort of wrapping GPT-4, which is totally fine, and actually you can make some money doing that,

[00:08:12] Shawn Wang: You don't need as much of the ML engineer skills, but the more of the sort of model work that you take on, the more you're going to need to bring that ML engineering in house. Then I think finally, AI engineer is also like sort of a plug or filler gap for all the things that are not traditionally part of the ML engineer skillset.

[00:08:31] Shawn Wang: So what do I mean by that? Like, ML engineers typically did not work on agent stuff. And of course they have been involved in agent research for a while. Of course they have something to contribute there, but they don't have a unique advantage there. Actually, AI engineers, with a completely different base of assumptions and experience and a focus on agents, might be able to do a better job than the typical ML engineer.

[00:08:50] Shawn Wang: I typically call this also a qualitative or anthropological difference between the ML engineer and the AI engineer that I observe. I've been observing the MLOps [00:09:00] community for a while, and the other sort of ML engineering type data science communities. I've been attending Databricks Data and AI Summits and, you know, observed the differences between their summit and my conference.

[00:09:11] Shawn Wang: And it's just a different type of person. Basically, it boils down to: if you're an ML engineer, you work on one-to-n problems, right? You have a baseline, a large amount of data to work with, and you are trying to optimize or build a very specific model for that problem because you have the scale to do so.

[00:09:27] Shawn Wang: You have the data, and you have the kind of problem where reducing, you know, fraud rate from like 8 percent to 5 percent is a big deal. And that makes a lot of money, and there should be someone focused on that. But that's not what an AI engineer would do. The AI engineer is more of the zero-to-one phase, whereas the ML engineer is more of the one-to-n phase.

[00:09:42] Shawn Wang: And it takes probably a different skill set, probably a different mindset to be successful there.

[00:09:46] Raza: So what, so what would be an example of that zero to one phase? Like what would be some, some things you have in mind there?

[00:09:51] Shawn Wang: The classic example is like, can the ML engineer make a Cursor or make a Copilot, right?

[00:09:57] Shawn Wang: Can the ML engineer make a, uh, [00:10:00] Photo AI, or all the other sort of generative image companies that have emerged?

[00:10:04] Raza: But is another way to think about this, that AI engineers and the kinds of people who are building products with LLMs and foundation models today tend to be closer to thinking about product.

[00:10:13] Raza: They're closer to full stack. These people have front end skills. Whereas ML engineers, by virtue of the fact that they need to be much more mathematically sophisticated, are much more specialized. For them, the world almost starts and ends with the model. Whereas like for an AI engineer, the world only starts with the model, right?

[00:10:31] Raza: Actually, the model is kind of given to them. And what they're trying to do is figure out like, how do I make this a useful product?

[00:10:37] Shawn Wang: Yeah, totally. I draw, I draw that in the diagram as the API line, but it's, it's equally just the model, right? Whether or not you're creating the model or you have the model served to you and you pick up the baton and carry on from there.

[00:10:47] Shawn Wang: So yeah, it's a different mentality. Of course they can cross pollinate, of course you can do the skills of the others and be successful at the others, but there, there's a home base where you're most comfortable and I'm finding that these require different personas and the kind of person that's [00:11:00] successful in one doesn't necessarily translate to the other.

[00:11:02] Raza: I think I buy the thesis overall. Like, I'm convinced that there is a new type of persona and skillset emerging that's kind of got a little bit of the ML engineer in it, but also kind of drawing from full stack and is more product focused. You just see this in the tools and the languages people are using.

[00:11:17] Raza: I think JavaScript and, you know, front end tools are way more popular amongst people building, compared to the traditional ML stack or MLOps stack, which is much more Python heavy. And I think that shows in the community. The AI engineer is only one part of, I think, a product AI team.

[00:11:34] Raza: I have my own opinions on this, but I'd be curious, like, if you were staffing a team to be building an AI product, what composition of skill sets would you be looking

[00:11:43] Shawn Wang: for? Oh,

[00:11:43] Raza: boy.

[00:11:44] Shawn Wang: Yeah, I, I, that's a, that's a fascinating question. I think the answer is definitely in there at the MVP stage. And then it's an open question as to how you develop it as you grow.

[00:11:55] Shawn Wang: I do consider there to be, you know, this is in the original essay and then [00:12:00] six months later at the first summit, the three kinds of AI engineer, which we can talk about. But for team composition, we've had internal discussions in the Discord, and I think it's some kind of ratio of, let's say, four to one in terms of AI engineer to ML engineer for more mature teams.

[00:12:16] Shawn Wang: Because AI engineers do the stuff that ML engineers would usually chafe at, because it's not really working on the model. But, you know, some kind of nice ratio where you have a decent amount of product people filling out all the long-tail gaps and working on product stuff, whereas ML engineers, you know, would get the leeway and capacity to work on the model.

[00:12:35] Shawn Wang: I think that makes sense. You know, if you want to expand the scope a little bit in terms of your team composition to designers and product managers and all that, that's your prerogative. But in terms of just your engineering allocation, the general thesis is that, because the bar is lower, the number of AI engineers is going to be higher; it's going to be a multiple of the number of ML engineers.

[00:12:54] Shawn Wang: And that's an opportunity obviously for vendors serving that. But also, I just think it's a factor [00:13:00] of the pipeline: we're not training enough ML engineers. The number of graduates in the programs that would typically qualify people for, like, the data scientist role is just not going to be enough for the sheer amount of demand that companies have for this role.

[00:13:11] Shawn Wang: So therefore you need to scale the ML engineer by supplementing them with a whole bunch of engineers who actually may be self-sufficient in themselves. It depends how deep this role gets. We don't really know the boundaries yet because it's so new.

[00:13:24] Raza: Yeah, I guess we're less than two years on from ChatGPT and 11 months from the essay.

[00:13:30] Raza: It's certainly early days. What about the non-engineers? We touched on it briefly, but how do you see product managers, domain experts, and other people fitting into this?

[00:13:39] Shawn Wang: Yeah, I think actually the product manager and domain expert is going to be very, very key and probably more key than the AI engineer, depending on, you know, how good code assistants and code agents get.

[00:13:50] Shawn Wang: And one thing you cannot replace is insight into the customer and into the product, and no amount of AI engineering can solve a bad product [00:14:00] decision, you know. Like, it's not really about good engineering, it's more about just picking the directions for a good product. And so I think that is probably the last automatable task on that team.

[00:14:11] Shawn Wang: And it's very important. Like, either the AI engineer owns that (I always think that's the ideal, the one-person AI engineer who embodies that sort of product thinking, a little bit of the ML knowledge, and enough engineering to be dangerous), or, you know, on a larger team, yes, absolutely the PM and domain expert are going to be key, because they will provide the inputs for the engineers, just like they always have. Like, that part absolutely does not change.

[00:14:35] Shawn Wang: They need a counterpart who can tell them what's possible and translate their requirements to current capabilities today. They always needed that. That doesn't change, but I think the AI engineer is going to be much more equipped to tell them what state of the art is with foundation models than the ML engineer, just because they are fundamentally wired to do that because they are product thinkers first and foremost.

[00:14:55] Raza: Yeah, that really resonates. I mean, I have this kind of thesis with human loop based on what we've [00:15:00] been seeing as well, which is that we used to live in a world in which product managers were the experts in customer problems. They wrote the spec, they figured out what was needed, and then it was like entirely the role of engineers to translate that to code.

[00:15:13] Raza: And what I feel like we're seeing more and more of, and I think this trend will continue as the models get better, is that product managers and subject matter experts can be much more directly involved in writing prompts and actually creating artifacts that are part of the product. And so instead of just playing this definitional, translational role, they're collaborating very directly with the AI engineers to actually bring the product to life, which I don't think was true for ML.

[00:15:40] Shawn Wang: Yeah, and I think a collaborative tool that, you know, bridges the gaps between the different roles in the team makes sense. Uh, I think that's why Humanloop is going to do particularly well in this area.

[00:15:52] Raza: Yeah, or at least, or at least it's why we're, it's why we're making the bet that we are.

[00:15:55] Shawn Wang: Yeah, it's a genuine role.

[00:15:57] Shawn Wang: You know, the other side of it is that you are not going to be the [00:16:00] only person doing this, right? Like, there's going to be a bunch of you, because this is an insight that's big enough for multiple teams to pursue. And then it's all about execution from there. But that's, you know, that's not really my business.

[00:16:11] Shawn Wang: It's yours. It makes sense though, right? Like this is the shape of how people are going to collaborate. And this is how domain experts should be involved in the work. They always should have done that, but now it's possible. Now it's easier. And that's great for everyone concerned, including the customers.

[00:16:23] Raza: So we've spoken a little bit about the AI engineer role, you know, how it's changed maybe, or it's emergence over the last year, before we talk about the upcoming conference and kind of what you're excited about there and what others might be excited about just briefly. When you introduce a new term like this, inevitably there's going to be some detractors, some critics, some people who try to say, Oh, we don't really need a new role here.

[00:16:46] Raza: Like, what have you found to be the most compelling criticisms? And like, what do you think people have got wrong?

[00:16:51] Shawn Wang: Yeah. The most obvious one actually comes from a friend of mine, Jared, he's VP of AI, I think, at Vercel [00:17:00]. And immediately on the call, you know, I hosted a Twitter space to discuss the essay, and he immediately was like, I don't think there will be an AI engineer.

[00:17:06] Shawn Wang: I think every software engineer is an AI engineer. And I'm like, okay, that is the typical take. That is the sort of diametric opposition, right? Instead of role specialization, it's a feature of every product: everyone will have this, you're not special, this is unnecessary labeling. And I do think that, you know, there's some part of every engineer that, if you're responsible and, you know, up to date with the world,

[00:17:27] Shawn Wang: You should be up to speed on AI. The reality is that I don't think it's worked out that way. There are a lot of AI skeptics. Something on the order of, like, 50 percent of Hacker News still hasn't really adopted, like, Copilot. You know, like, the future is already here but unevenly distributed. And there's just going to be always people that take things more seriously than others.

[00:17:47] Shawn Wang: And that's their prerogative. Like, I do not think everyone should, should work on AI. I think people should work on distributed systems. I think people should work on front end, should work on databases. There's so many other valuable problems in the world that yes, that you should go work on those things.

[00:17:58] Shawn Wang: And yes, you deserve a special [00:18:00] job title for that. AI engineer is going to be low status for a long time. Definitely, it's going to be low status for, compared to the ML engineer and research scientist, right? Just because the barrier of entry is so low, nobody's going to really respect it. People are always going to question it.

[00:18:12] Shawn Wang: And I think that's fine. Like, the pushing and pulling of the boundaries of where the definition is and how much it covers, I think it's fine. The response I have to that is just wait and see. And I think that for me, creating the Schelling point where people can find each other in the job marketplace, like, that's really the main goal.

[00:18:30] Shawn Wang: The secondary goal is that, you know, people define a skill tree, a skill ladder. You know, we're not there yet. We don't have Senior Staff AI Engineer. We do have VPs of AI Engineering, though. One of the speakers at the conference is a VP of AI Engineering at MasterCard, and I was very interested to see that title.

[00:18:47] Raza: It is particularly interesting to see that title at MasterCard, right? MasterCard is not a startup; that's a, you know, very clear incumbent company. Cool to see they've adopted that title.

[00:18:57] Shawn Wang: OpenAI has adopted that title. They're, they're hiring for AI [00:19:00] engineers. You know, I, I have quibbles with their job description, but they obviously, they have their own take on it and that's fine.

[00:19:04] Raza: What's the, what's the OpenAI take on an AI engineer?

[00:19:06] Shawn Wang: They want something like five to seven years of ML engineering experience on the job description. And I would never require that for a role that doesn't involve pre-training. That role that they have listed, it's on the website. I talked to Shamal, who put that role up, and that role is primarily for fine-tuning, for the custom models team.

[00:19:23] Shawn Wang: And do you need the full MLE experience? Do you need five, seven years? Maybe, maybe not. But like, it's a requirement that they put up there. And so it's a point of contention, right? Like, how much do you need? I would say they put themselves more on the ML engineer end of the AI engineer spectrum. And that's fine.

[00:19:37] Shawn Wang: I've always said it's a spectrum, right? So it's a fuzzy nature that I think people get uncomfortable with. They don't like the hype that comes with attaching the word AI. You know, often the criticism is also that ML is when it works, and AI is when it's sort of magic, right? I forget the exact saying for this.

[00:19:54] Shawn Wang: But it's true that I am leaning into hype. So a bunch of people propose alternative titles like LLM [00:20:00] engineer, cognitive engineer, and so on and so forth. But you just have to go to the lowest common denominator. And then the natural thing that people want to say, that rolls off the tongue, is AI engineer.

[00:20:08] Shawn Wang: They don't say AI developer. They want to say engineer because it is engineering. And I think, you know, having explored all the paths, this is the one that I predicted people would go to. And it seems like it's catching on, to the point now where people complain about it on Hacker News because of the regularity, and they find that annoying.

[00:20:25] Shawn Wang: But yeah, that would be it: the legitimacy of, do we need a title like this, or should we just reuse existing concepts? And my only response is, look at demand and supply. Look at the amount of work that actually goes into this thing. Yes, it's going to feel illegitimate now. It's going to feel increasingly less illegitimate every single year.

[00:20:42] Shawn Wang: And the whole point of my conference, my podcast, my newsletter, everything I do is to serve this role and to spec it out.

[00:20:49] Raza: You mentioned that it's maybe a little bit low status today because the barrier to entry is, is lower than for an ML engineer. Is there any advantage to being early? If, if it's going to be a legitimate role in the future, do [00:21:00] you get on the ground floor now and get some, some advantage from that?

[00:21:03] Shawn Wang: I think there's some advantage to being early, but you do not get the right to win just by being early. I'll elaborate on that advantage, because there are a lot of papers, techniques, history of prompting, and APIs out there to keep up on, which people will just treat as assumed knowledge for anyone in this role.

[00:21:25] Shawn Wang: And so the later you join, the more you're going to have to learn. It's not impossible to learn; people have done it before. It's just going to be harder. It's better to learn the baseline now and live with everyone else as you proceed through the timeline at the same pace as the rest of us. But just because you're early doesn't mean you win, right?

[00:21:41] Shawn Wang: There are a lot of flameouts in AI. One of the title sponsors of last year's conference was AutoGPT. They're not necessarily as active anymore, and they flamed out really, really hard. They launched in April last year; I have this sort of one-year trajectory essay that I'm writing about the history of agents.

[00:21:57] Shawn Wang: You know, they got more GitHub stars than PyTorch, Bitcoin, Kubernetes, [00:22:00] and Django combined, selling the promise of AI. And they were early, but were they able to convert that into something lasting? Not really. And I think that's a challenge for a lot of AI products, right?

[00:22:14] Shawn Wang: They're going to get a lot of interest, because we're in this mode where we're optimistic, where we have the budget and even the time and attention to try things out and be anticipatory. But then if it doesn't work out, people move on as quickly as they came. So we have to be aware of that, work on deep, sustainable, hard problems, and be mindful that hype will be here one day and gone the next.

[00:22:38] Raza: So assuming that someone's bought into the idea of being an AI engineer and they want to be part of the community, the World's Fair is coming up in two weeks. Can you tell me a little bit about the event, how it's changed from the last time you organized it, and what someone coming should be trying to get out of it?

[00:22:53] Shawn Wang: Yeah, the event is my highest-stakes expression of my opinion of what AI engineering [00:23:00] should be. Obviously everyone should have their own opinion, but this is my chance to gather the community that has formed around this term. It's also, for me, an expression of creating opportunities, right?

[00:23:11] Shawn Wang: I'm always trying to get people to meet each other, and it's many-to-many relationships rather than just through me, or limited by me. So the most obvious transition is the one from a single-track conference to a multi-track conference. And we definitely overdid that one.

[00:23:27] Shawn Wang: Uh, we went from one track to nine tracks. Didn't really expect that at the start, but just this sheer amount of interest and clear swimlanes of things that people are working on that I thought were worthy of their own conference. I basically sort of picked those things up. Last year, one track, and then this year, we have the sort of standbys, the rag and the, you know.

[00:23:47] Shawn Wang: Code generation stuff, which we also had last year. But then this year we added multi modality as its own track. This year we added evals and ops as their own track, which is super, super popular. Agents we had a bit of last year, but this year we have [00:24:00] much more organized tracks. And then, for example, I wanted to have tracks to address specific criticisms of AI.

[00:24:05] Shawn Wang: For example, a lot of people would say that AI is only adopted by startups. And that's largely true, but there are interesting stories to tell about the Fortune 500 and larger-scale deployments. So I have, straight up, AI in the Fortune 500. And just putting that in the title attracted a different kind of speaker and a different kind of audience, which is great.

[00:24:23] Shawn Wang: I actually thought last year was too startup-focused. That gets you into this sort of navel-gazing Silicon Valley bubble where everyone talks a big game and raises a lot of money but doesn't actually make a ton of revenue. And in a capitalist society, revenue is basically the only thing that keeps you honest.

[00:24:42] Shawn Wang: And so I'm really proud to initiate that track. Not everyone in there is actually formally Fortune 500, but they're at least household names that everyone would care about, like Coinbase and Salesforce. And then finally, for the first time, we're also doing the sort of VPs-of-AI track, right?

[00:24:58] Shawn Wang: That's the one [00:25:00] that you're speaking on, which addresses team- and leadership-level conversations around AI strategy, growing a team, organizing a team, and setting principles for what that should look like. For me, the ultimate win condition for this conference is actually that you don't even go for the talks.

[00:25:15] Shawn Wang: You just go because you know everyone else is going, and you show up and think, I know I should be going for the talks, but my conversations here are so great that I don't need anything else. That's actually a win for me. But these conferences are expensive affairs, and people have to expense them.

[00:25:28] Shawn Wang: Right. So I'm definitely creating something for people to expense, something work-relevant, a place to level up, to find jobs, to hire, to launch their products. I think we have three foundation models launching, by the way. Yeah, I have never said that publicly, but...

[00:25:43] Raza: You heard it here first. Are we going to get the big one?

[00:25:48] Shawn Wang: Not the big, big one. No comment. We have decent ones; I don't think we've reached that tier where people want to give us their big launch. But every year we're getting more legit. This is the second iteration; the first year we had [00:26:00] OpenAI and Anthropic.

[00:26:01] Raza: How many people have you got coming this year?

[00:26:02] Shawn Wang: We're targeting 2,000; we're at 1,500 now. I don't know what the exact number is going to be. And then there's typically the online audience: last year, in person was 500, the online audience was 20,000, and the asynchronous audience was 150K-ish for a single one of our top talks.

[00:26:18] Shawn Wang: And then this year we're just basically trying to make everything four times larger.

[00:26:21] Raza: I think relative to almost anyone in the space, you are incredibly well positioned to speak to a lot of different types of people through the Latent Space Podcast, through your work as a developer relations engineer. I think you have a pretty unique perspective.

[00:26:35] Raza: And so one thing I wanted to ask you about is, what advice do you have for product and AI leaders? I have specific questions there, but I want to give you the generic version of the question first to see where you take it. So if I'm an AI engineer or a product engineer building today, maybe I'm starting a new project.

[00:26:52] Raza: You know, what are the gotchas? What is the best advice that you would give?

[00:26:58] Shawn Wang: Oh boy, I really don't feel qualified to do it. I [00:27:00] can repeat the wisdom of people much smarter, more experienced, and more successful than me, because that is my role as, I guess, a content person and community person.

[00:27:08] Shawn Wang: Okay, I'll give you a common take and I'll give you a harder take. The common take would be that you should move fast, much more so than the typical YC startup mentality. If you're taking three months to ship a thing, what could you do to make it three weeks? That kind of mentality of trying to move fast.

[00:27:28] Shawn Wang: Because if your typical development-timeline assumptions are based on traditional, pre-AI ways of making products, then maybe you should think about what is unlocked by foundation models that you can do differently, that is faster, that is mockable by a really shitty prompt, but whatever, it kind of works; get it out there.

[00:27:47] Shawn Wang: A core part of the thesis of the original essay was that you should adopt a ready-fire-aim approach instead of ready-aim-fire. Where the old ML engineering process was much more deliberative, [00:28:00] here you actually win by moving really quickly and getting information from the market, from shipping products, so that you can iterate much faster.

[00:28:10] Shawn Wang: In a way, the gradient descent is performed in the marketplace of users rather than in model weights, because you don't need to start with data; you start by creating your product.

[00:28:19] Raza: That resonates with what I've seen as well. I've seen a lot of companies ship a V1 that's good enough,

[00:28:27] Raza: and use the data they gather in production, the eval data and the feedback data, to rapidly improve it; without that feedback data it's really hard to do so. So there is a flywheel that you can get going if you're quick to deploy something. How do you trade that off against the caution that some larger companies might have about hallucinations or risks?

[00:28:46] Shawn Wang: Those are very important, but also, I think, probably overstated for a lot of companies, because they can just set the expectation that, hey, this is a beta thing, or this is a generative AI product, wink wink, you all know the risks that come with that, but we're [00:29:00] just trying our best. And that suffices for most companies that are not named google.com

[00:29:03] Shawn Wang: and don't carry the responsibility that google.com has. Most companies, you are not Google. Go ahead and experiment. Get out of your own head. Go try things.

[00:29:11] Raza: So you said your non-spicy take was that people should definitely focus on moving fast, faster than they might in other circumstances. What was the spicy take?

[00:29:21] Shawn Wang: Yeah, it's non-spicy because everyone should be moving faster, AI or not, right? That's a universal truth. The spicier take is something I've been sitting on myself for a while, and it maybe concerns you a little bit: what kinds of startups are performing better and seeing more success in the market?

[00:29:37] Shawn Wang: It's vertical rather than horizontal startups. A lot of people try to build picks and shovels, and obviously some picks-and-shovels creators will win. But actually, the people with the most proprietary data stand out the most. They pursue the markets with the highest margins, because they serve very price-insensitive, non-technical audiences that just have a burning pain point that you can [00:30:00] solve with AI.

[00:30:00] Shawn Wang: They want an AI solution, and no one's building solutions for them. Those vertical startups have unique insight into their customers, and they're making the most money and growing the fastest compared to the horizontal ones. We have a company launching with us that works in AI and construction, right?

[00:30:17] Shawn Wang: And most software engineers don't really like the real world; they're much happier building developer tools, where we just talk APIs and can log everything. But if you're willing to do the nasty work in the real world, then you get appropriately rewarded, and I think that's entirely appropriate.

[00:30:34] Shawn Wang: But at the same time, you also have the domains with the most specificity and unique knowledge, on which you can train your models or whatever; you can pursue high-margin markets; and you're most likely not to be steamrolled by OpenAI when that day comes, when OpenAI does a launch and you go, whoops, my company's gone.

[00:30:49] Shawn Wang: So for many, many reasons, I do think that vertical companies have it right. And I wonder if there's going to be a shakeout among the horizontal players where a lot of them fail, and that's just the brutality of [00:31:00] the market. But that's neither here nor there. I think people might underappreciate how easy it is to get started on the vertical stuff, compared to the horizontal stuff where you're one of many.

[00:31:11] Raza: And which vertical AI startups have really caught your attention? What do you have in mind when you say that? Is it something like Harvey AI in the legal space? You mentioned construction; what are other examples? Because the counterargument that people will often give is that the worry with some of these vertical AI startups is that they'll get eaten by improvements in the models over time.

[00:31:31] Raza: Why is that concern wrong? And what are some examples that you are really excited about?

[00:31:35] Shawn Wang: Okay. So maybe we start with the examples, which is always challenging to recite at the drop of a hat. I think Harvey is definitely well known as an example. I do think Midjourney is vertical; it's serving that sort of creative market. And classically, for those who don't know, Midjourney is making somewhere between $200 and $300 million a year, completely bootstrapped, with a 50-person team.

[00:31:58] Shawn Wang: Something of that of the order the numbers might [00:32:00] be like plus minus 10 percent off But those are the rough magnitude and like that is one of the most successful starters of all time already So like you have to get creative about like what that vertical is But I think increasingly you will see other verticals that they just come up and take on incumbents directly I am thinking of perplexity there as a very successful like anti Google even though they sort of wink wink They're like no, we're friends of everyone Now they're trying to take on Google, and I think they're doing a decent job.

[00:32:25] Shawn Wang: It's an open question how profitable they are, but they've definitely made a dent in public perception. And then for other kinds of verticals, I could name Pieter Levels' PhotoAI, and the one for rooms, I think it's called InteriorAI, right? That focuses on real estate agents doing virtual staging for houses.

[00:32:45] Shawn Wang: Like whenever they're trying to put houses on sale, like they need that. You know, there's a bunch of people serving those, those verticals. I mean, yes. One of the verticals is developer tooling. I think the sort of cursors and co pilots of the world, like focusing on that is like, it's a very clear use [00:33:00] case where like you win when everyone Basically says like, you need to have this or you're behind in that industry.

[00:33:06] Shawn Wang: And I think the Harvey customers would say that. I think the Cursor and Copilot customers would say that. And I think there's room for building the AI companion, or friend, or sort of default tool of every single vertical out there. There's got to be someone doing it for the medical side.

[00:33:21] Shawn Wang: I can't really name them just because I'm not that close to that side, but the same exact thing Harvey is doing, you could apply on the medical side. And Brightwave, our most recent podcast guest, is doing that for hedge funds. It's obvious that people want to do research quickly, and a lot of research analysis is very commoditizable.

[00:33:39] Shawn Wang: Using language models, you can do it a lot quicker. And yes, you're going to hallucinate some, but so do your analysts, by the way, and it's no different. So those are all verticals. And for me, on my very, very small perch in the world, news summarization is a vertical that no one's actually seriously pursued.

[00:33:55] Shawn Wang: Everyone builds it as a feature; I'm building it as a product. It's too easy [00:34:00] to win here, and that's something that really puzzles me: you have to work really, really hard in horizontal, dev-tools-type things, where everything has to be open source, everything has to be distributed and scalable.

[00:34:13] Shawn Wang: And you, you need, you know, you need all these like checkboxes on your, you need to like SOC two and whatever, in order to get your customer. And then like, you just build like one vertical product that like solves the pain point and everyone is raving about it, even though they don't really know anything else about it.

[00:34:26] Shawn Wang: And that's fantastic. Like that's, that's what PMF should look like.

[00:34:29] Raza: You know, you've now worked in dev rel at three different unicorn developer tool companies, and I think you've developed a lot of taste for developer tools. You're working on Smol AI yourself. We just discussed that the horizontal developer tool space for AI is increasingly crowded.

[00:34:46] Raza: There's like a ton of different options. If you're a buyer trying to like navigate this, there's a lot of noise. So what advice would you have for someone like not on the. vendor side, but on the buyer side about like, what is smoke and mirrors [00:35:00] versus what do people really need? Are there tools you're excited about?

[00:35:03] Raza: You know, like what's, what's the stack, I guess, that people should be thinking about building?

[00:35:06] Shawn Wang: I think there is actually a lot that you should buy from the beginning and then selectively build later on, because the community has probably run into more problems than you know of, and it serves you well to buy first and understand your own problems with the protection of what the community has figured out.

[00:35:28] Shawn Wang: And then once you understand where you defer from the community, then you can eject if you want. So obviously you want to think about lock in, like, it's more expensive to build because you're going to slow yourself down if you try to build everything in house, right, that sort of not invented here syndrome is very big.

[00:35:43] Shawn Wang: And I think some things like, you know, you need some kind of evals platform makes absolute sense. You need some, you know, observability and monitoring on your APIs, just like other regular APIs. But, you know, now with an AI flavor, some try to charge you a lot for that. And obviously there's a right price for this, but it does absolutely make sense that you should [00:36:00] buy all these things just to move faster.

[00:36:02] Shawn Wang: Right. Going back to the organizing principle of like, what makes sense in this world. And then you can build later on if you're like, none of these make sense for our case, or we're like seriously overpaying for this. Fine, you only paid for like two months of this. Go back to build, backfill whatever you need.

[00:36:16] Shawn Wang: Only once you've understood the problem and you've explored the features that other people have built to serve other customers, because you're not going to be the only one running into the fact that, oh, you need portable keys or like, you know, time limited keys for your API, you know, key rotation for your OpenAI API keys.

[00:36:30] Shawn Wang: Or you need to track your inputs and outputs so that you can do, you know, evaluations or fine tuning or whatever. Like, everyone has the same issues. You're not special. Just buy it because, you know, you're sharing the development cost with everyone. And then there's the AI product tooling, which is for you building products for your customers.

[00:36:46] Shawn Wang: And then there's the internal productivity tooling, which is a separate thing. And actually, that gets adopted a lot quicker. And that typically comes in the form of developer tooling, right? Like the, in, in terms of, are your people using copilot or cursor or sort of source graph Cody or anything of that sort, that's [00:37:00] the baseline.

[00:37:00] Shawn Wang: That's the most proven one, at least most people in the AI community are using that. But then also there's the other productivity stack of like, you're meeting some writers and what have you, you know, I'm not really in the business of picking productivity tooling, but I do think that there is a lot of.

[00:37:13] Shawn Wang: Advantages to that. Ultimately, the point that we're going to get to here is we should seriously think about virtual employees or like AI employees that would reliably perform parts of the jobs that you would normally assign to a human that role is going to start small now, and then going to grow over time, the people who are most able to take advantage of AI to do that tasks are going to be the most capital leverage, right?

[00:37:33] Shawn Wang: Like that they don't have to manage humans to do that. These things can work all night for you. And that's, it's fantastic. So like, yeah, we haven't really seen that yet. I think the closest I've seen is Lindy AI, uh, from, from my friend, Vogue Gravello. But again, like, you know, it's not really being adopted at massive scale yet.

[00:37:48] Shawn Wang: Like every year I have one, one essay that stands the test of time. This essay is the Sour Lesson, as opposed to the Bitter Lesson by Rich Slutton, which typically talks about the scaling of models and how you should basically never bet against scale. The Sour Lesson is always [00:38:00] actually more about trying to compare humans and AI on a sort of equal footing.

[00:38:03] Shawn Wang: And actually they're, due to Moravec's paradox, they're actually good at different things. And they're just always going to learn differently, develop differently. And you should not try to directly compare humans and AIs. So like, I think the way that we interact with these AI employees, let's call it, is going to be very different than humans.

[00:38:18] Shawn Wang: And actually the analogy isn't going to be that light, like there are going to be some things that we find where they're, that these are killer apps and that we would never give a human in vice versa for the other, other side.

[00:38:27] Raza: I somewhat agree with that. For that reason, I think the idea of human-level AI is probably a mistake, because they're already superhuman in some dimensions and much weaker in others, right?

[00:38:39] Raza: Like if we kind of look at the graph of skills, there are things that humans find trivial that are really hard for AI. And so the moment that AIs are able to match humans on the tasks where they're currently weaker than us. They'll already be superhuman, right? There's not going to be a moment where they're human level because they're already much better than us at many things, retrieval, memory, etc.

[00:38:58] Raza: Yeah, exactly. I'm actually going to [00:39:00] move on because there's one final thing that I want to chat to you about before we wrap, which is just Trent. There is so much noise. There's so much hype. There's so many new papers. There's something released every day. And I think you do a really good job of curating this for the community.

[00:39:15] Raza: You have your automatic summaries, but you also have the podcast and tweets, et cetera. If someone was going to focus on just a small number of trends or like a couple of things that, you know, they're trying to find the signal from the noise. What are the trends that you think people should really be paying attention to that are going on right now that maybe are slightly overlooked, or maybe they're not overlooked, but just in the amount of noise there is, it's just hard to follow them.

[00:39:38] Shawn Wang: Okay, there are a few categories of this. We do occasional essays on Latent Space, and what I would highlight is the Four Wars essay and the research-directions essay, which I'll pick out here; then there are other trends that didn't fit those buckets, which I'll just name so people can go read up on them. The trend is that there are several [00:40:00] key battlegrounds where you as a company, if you're going to be an AI company of any consequence, are going to have a beachhead in the war, or you're not really in the game at all, because otherwise you're fighting a war with no opponent: there's no limited resource, and everyone is aligned in the same way.

[00:40:14] Shawn Wang: So for example, open source is not really a war, at least in the technical domain. It is in the political domain, but not in the technical domain because most people are pro open source in tech. That's not a war. What is a war is fighting over data, fighting over GPUs, fighting over the God model versus the sort of domain specific model, and then fighting over RagnarOps.

[00:40:33] Shawn Wang: And so those are the four battlegrounds that I picked out that I think are sort of key battlegrounds where. There will be winners, but all the participants cannot all win, and that's what makes it somewhat zero sum and interesting as a war. That also means that there are several domains where I don't think there are wars yet, it's just because they're not interesting or contentious enough.

[00:40:51] Shawn Wang: So like code generation, very, very important problem, but no war because everyone's like trying to make their own headway, and no one's really figured out [00:41:00] like what's the next thing after co pilot, right? Maybe it's dev and maybe it's not. Okay, those are the four words you can see our reading on that and then like research directions So moving from the sort of commercial space into the the research space and I think having a filter for what kind of research matters Actually really helps to survive on Twitter because you're just constantly DDoS by all these influencers going like this changes everything and You know this week there was like this like Paper about how like you don't need matrix multiplications anymore.

[00:41:27] Shawn Wang: Oh my god. NVIDIA is going to zero and I'm like what kind of What are you smoking I want that So like having a filter for like what is worthwhile research direction is actually important I put myself on on record as like ranking a list of directions, right? So for us, it's long inference As number one, synthetic data, number two, alternative architectures, number three, mixture of experts and merging a model is number four and online L is number five.

[00:41:53] Shawn Wang: And yeah, those are the main trends. You know, I can always talk about the other trends that we're seeing. So for example, having like, what is [00:42:00] a Moore's law of AI and what are the long term trends that you can bet on? Right. And I think we're trying to define this for one of our podcasts that we're doing, but basically like the cost of like a 70 MMLU every single year goes down by something like five to 10 X.

[00:42:14] Shawn Wang: And so like, this is the observed trend and it's going to go, it's going to keep going down. So like you actually, it makes sense to build a product that loses money today because you're going to be ahead. You just have to wait, kind of wait it out because of Moore's law is going to bail you out of whatever sort of bad inference ideas that you may have

[00:42:28] Raza: for people who might not know.

[00:42:29] Raza: I assume most of our audience would, but just briefly, what, what's MML? You just let people have the context.

[00:42:34] Shawn Wang: I think it's multimodal, but no, it's not, it's not multimodal massively. something multi I don't actually know the end. Language understanding. Language understanding. It's not multimodal. It's, it's, it's something, yeah, it's something about domain specifics.

[00:42:47] Shawn Wang: Basically, it's, it's a, it's a conglomeration of all the professional exams that you could possibly ever want as a human, right? Like sort of from like It's massive multitask are the two M's. Massive multitask, right. Fun fact is created by [00:43:00] Dan Hendricks, who is like a very, very noted AI safetyist. And that's a whole different discussion because it's very ironic that the thing that he created is like the primary tracker for AGI.

[00:43:08] Shawn Wang: Anyway, it's the primary number that every LLM basically compares each other by. Um, and this is not something that you should fixate on because It's very likely within two to three years, we'll move on to something else.

[00:43:18] Raza: Bench benchmarks are very much fly by once they, once they saturate, we create the next one is one way to interpret what you're saying is this kind of Moore's law of AI and then they'll use the specific one you've chosen for now, but it's, it's really a statement more on like the intelligence of the capabilities of the model at professional tasks we care about every year is going down by five or 10 X or something like that.

[00:43:37] Raza: The cost to achieve the same, the same performance. Yeah.

[00:43:41] Shawn Wang: You can also call that like our GPT four level model. Was not possible in 2022. And suddenly in 2022, you could pay something like two to 20 per million tokens for it. Now the cost is 2 per million tokens. It is on its way to 0. 5 or 0. [00:44:00] 25, depending on if you look at deep seek MOE, being a legitimate GPT 4 level model, and it will, it will trend towards zero, so, okay.

[00:44:06] Shawn Wang: That's great. But then our levels are sort of bar for acceptable AI intelligence will increase when GPT 5 drops. And so then when GPC 5 starts on a higher curve, it goes down again. And that's a, that's a very classic sort of cost curve model. That's typically what semiconductors, you know, typically operate on like the different semiconductor process nodes, and they operate on, on that, on that curve.

[00:44:27] Shawn Wang: And actually you want to understand that and build that into your product planning, right? So it's, it's not just costs. I have a few trends here that I haven't really published, but I'll just kind of go through it. It's commodification of intelligence, which is the MMRU cost going down over time. It's also inference speed as well.

[00:44:43] Shawn Wang: So like going from like maybe, you know, 170 to a hundred tokens per second to 500. I've had very credible sources tell me that Grok is aiming for 5, 000 tokens per second. So another 10x from where they are here. And so what, what do you do differently? Right. Every 10X unlocks a different kind of [00:45:00] product.

[00:45:01] Raza: So that's roughly 10 pages of text, a full essay, every second in terms of inference speed.
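That estimate checks out roughly, assuming a page is around 500 words and around 1.3 tokens per word; real tokenizers vary by text and language, so the tokens-per-page figure here is an assumption.

```python
# Quick sanity check on the "10 pages per second" figure at 5,000 tokens/s.
TOKENS_PER_PAGE = 650  # ~500 words/page * ~1.3 tokens/word, an estimate

def pages_per_second(tokens_per_second: float) -> float:
    return tokens_per_second / TOKENS_PER_PAGE

print(round(pages_per_second(5_000), 1))  # the stated 5,000 tok/s target
print(round(pages_per_second(100), 2))    # a more typical rate today
```

At 650 tokens per page, 5,000 tokens per second works out to a bit under 8 pages per second, so "roughly 10 pages" is the right order of magnitude.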

[00:45:07] Shawn Wang: Like, yeah, we use, I used to emphasize that every, every AI API must incorporate streaming because you know, you want to stream out tokens when you, when you're sort of autoregressively generating things, but when you're generating 5, 000 tokens per second, you don't need it.

[00:45:21] Shawn Wang: It's crazy. Okay, so I'll keep going. Context is treading to infinity, right? We used to have, you know, 4, 000 token context models, and we, I thought that was enough, and now we have a million token context levels, models, and I don't know what to do with it, but people are finding use cases for that, right?

[00:45:35] Shawn Wang: It's really interesting. Multimodal everything is another trend that I called out. I mean, this is obvious now because of GPT-4o, but "all modalities in, all modalities out" is a very nice shorthand for it. There are nuances around that. And then finally, variance is a very interesting trend, definitely one of those underrated things that could become more prominent over time.

[00:45:55] Shawn Wang: What this is, basically, is that a lot of the use cases that we talked about for work are temperature-zero [00:46:00] use cases. You receive this God model from the heavens, and the first thing you want to do is lock it down and make it do your retrieval-augmented generation, right? That's the most boring possible use case.

[00:46:09] Shawn Wang: Like, what if you just let it loose and try to make it think of things that you never thought of? You know, that's its whole comparative benefit, and you're chaining it down, trying to force it to do this other very unnatural thing. What if hallucination was a feature and not a bug?

[00:46:23] Shawn Wang: Right. So there's this whole emerging category of what I call temperature-two use cases, which are diametrically opposed to the temperature-zero people. Hallucination is a feature: come on in, this is actually really great, because I never thought of that. And creativity is expensive on my team.

[00:46:36] Shawn Wang: And if I can, you know, spend a few dollars to add more creativity, absolutely I should do that.
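The temperature-zero versus temperature-two distinction comes down to how sampling temperature reshapes the model's next-token distribution. A minimal sketch on toy logits (the scores are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits into probabilities, scaled by temperature.
    Low temperature sharpens toward the top choice (near-deterministic);
    high temperature flattens it, giving unlikely tokens a real chance."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # toy next-token scores
cold = softmax_with_temperature(logits, 0.1)   # near-argmax: the "lock it down" regime
hot = softmax_with_temperature(logits, 2.0)    # much flatter: the "creative" regime
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

At temperature 0.1 the top token absorbs essentially all the probability mass, which is what retrieval-augmented, reproducibility-focused use cases want; at temperature 2.0 the distribution flattens out, which is exactly the "let it loose" behavior Shawn describes.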

[00:46:41] Raza: Yeah, that last one I think is particularly interesting, and one that I resonate with. I think it's not just for creativity. If we want to be able to create new knowledge from these systems, I don't think we can do it with LLMs alone. But the combination of a model that can act as a conjecture machine, that can chuck out possible explanations for things, [00:47:00] coupled with some way to actually test and measure them, I think is how you start to build AI or computer systems that can generate new knowledge.

[00:47:06] Raza: And obviously you want those operating in, as you say, a sort of high-temperature, non-deterministic mode. All right, Shawn, I think that's all we're going to have time for today. I feel like I could talk to you for hours and hours. Um, just before we go, if people want to attend the conference, how do they get tickets?

[00:47:21] Raza: Where do they find it? And where else can they find you on the internet?

[00:47:24] Shawn Wang: Uh, yeah, probably the smartest thing I did when writing the essay was buying the domain. The domain is ai.engineer, so you go to ai.engineer and you'll always see the currently active conference. And I've created a code, HIAGENCY, for listeners who've made it this far.

[00:47:39] Shawn Wang: And when I get last minute tickets, we'll hopefully get this out as soon as possible. But yeah, it's coming. It's definitely the industry conference to be at. You can find me on Twitter at Swyx and you know, my, our podcast, as well as latent dot space. We just love the short domains, right? I hope you get hideout agency, but that's already taken, but we got latent dot space donated by a listener, actually, which is very [00:48:00] fun.

[00:48:00] Shawn Wang: You can find more conversations that we have on our podcast.

[00:48:03] Raza: Fantastic. Well, Shawn, it's been an absolute pleasure. Thanks for coming on. All right, that's it for today's conversation on High Agency. I'm Raza Habib, and I hope you enjoyed our conversation. If you did enjoy the episode, please take a moment to rate and review us on your favorite podcast platform, like Spotify or Apple Podcasts, or wherever you listen, and subscribe.

[00:48:21] Raza: It really helps us reach more AI builders like you. For extras, show notes, and more episodes of High Agency, check out humanloop.com/podcast. If today's conversation sparked any new ideas or insights, I'd really love to hear from you. Your feedback means a lot and helps us create the content that matters most to you.

About the author

Name: Raza Habib
Role: Cofounder and CEO
Twitter: @RazRazcle
Raza is the CEO and Cofounder at Humanloop. He was inspired to work on AI as “the most transformative technology in our lifetimes” after studying under Prof David Mackay while doing Physics at Cambridge. Raza was the founding engineer of Monolith AI – applying AI to mechanical engineering, and has built speech systems at Google AI. He has a PhD in Machine Learning from UCL.