Lessons from Gusto's $9.5 billion journey with Eddie Kim & Ali Rowghani

By Raza Habib, Cofounder and CEO

Gusto's evolution from a startup to a $9.5 billion valuation is impressive, but what's truly noteworthy is their integration of AI into core business processes.

In this week’s episode of the High Agency podcast, Humanloop Co-Founder and CEO Raza Habib sat down with Eddie Kim, co-founder and Head of Technology at Gusto, and guest host Ali Rowghani, early Gusto investor and former YC and Twitter executive, to discuss how Gusto has applied AI to revolutionize ops-heavy processes like payroll and HR admin.

Subscribe to Humanloop’s new podcast, High Agency, on YouTube, Spotify, or Apple Podcasts

In this conversation, we get into the technical elements of Gusto's AI transformation, offering valuable best practices for AI builders seeking to implement AI in complex operational environments.

The Benefits of Creating Centralized AI Product Teams

One of the key decisions Gusto made was to centralize their AI efforts under a dedicated team, but this wasn't their first approach.

Gusto's AI integration journey began with a decentralized, bottom-up approach. Each product team was mandated to build AI features into its existing product lines. While this method yielded valuable insights and incremental product improvements, the team recognized that a more cohesive strategy was needed.

By bringing together a cross-functional team of engineers, data scientists, operations specialists, and domain experts, Gusto was able to build AI products without being distracted by competing priorities. This centralized approach enabled them to quickly identify and prioritize high-impact use cases, while also fostering a culture of collaboration and knowledge sharing.

Identifying Initial AI Use Cases By Targeting Time-Consuming Tasks

Gusto's first significant AI project tackled payroll reporting, a notoriously time-consuming task for users.

“There are multiple steps an admin has to take to get a question answered. Step one is going to the reporting page in Gusto... you generate the report... but you're not done yet because then you have to download that CSV file... you do some pivot tables... and then you synthesize that data to finally get the answer... The report is an intermediate step, but the whole process to answer a basic question could take 10-15 minutes. If you have a complex question that can take hours,” said Kim.

Gusto’s AI solution dramatically shortened this process:

“This was a perfect use case to apply AI. You just ask the question to a model. The model's really good at figuring out from your question what columns to pick... and it actually goes into our reporting and generates that report... That's a pretty complex thing that we've built. It involves multiple models and function calls, but it takes about 40 seconds to do that."

AI Guardrails: Building User Trust While Maintaining Data Privacy

Payroll is an area where people really need peace of mind, and data errors or AI hallucinations can immediately kill user trust. Eddie emphasized the critical nature of data privacy and access, which led the company to prioritize developing custom in-house AI solutions.

"The limitation of off-the-shelf, AI point-solutions solutions is that they just don't have all the contextual information about that specific customer to improve the response, to improve the retrieval process," Kim explained.

To ensure data security, their AI models interact with the same API endpoints as the web application, adhering to existing authorization layers.

The reporting process involves multiple AI models working in tandem: one model interprets the user's question, selects the relevant data columns, and generates an API request to pull the required data; a second model, using OpenAI's Assistants API, analyzes the data and formulates an answer.
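To make the flow concrete, here is a minimal sketch of that pipeline in Python. It is an illustration, not Gusto's implementation: the endpoint URL, column list, and function schema are all hypothetical, and for brevity the analysis stage uses a plain chat completion where Gusto's production system uses the Assistants API.

```python
# Illustrative two-stage reporting pipeline. Endpoint, columns, and schema are
# hypothetical; this sketches the pattern, not Gusto's implementation.
import json

import requests
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REPORT_TOOL = {
    "type": "function",
    "function": {
        "name": "generate_report",
        "description": "Generate a payroll report for the current company.",
        "parameters": {
            "type": "object",
            "properties": {
                "columns": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["employee_name", "gross_pay", "employer_taxes", "pay_date"],
                    },
                },
                "start_date": {"type": "string", "description": "ISO 8601 date"},
                "end_date": {"type": "string", "description": "ISO 8601 date"},
            },
            "required": ["columns", "start_date", "end_date"],
        },
    },
}


def answer_report_question(question: str, session_token: str) -> str:
    # Stage 1: the model turns a natural-language question into report
    # parameters, constrained to columns the reporting API actually supports.
    plan = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
        tools=[REPORT_TOOL],
        tool_choice={"type": "function", "function": {"name": "generate_report"}},
    )
    args = json.loads(plan.choices[0].message.tool_calls[0].function.arguments)

    # Stage 2: pull the data through the same authenticated reporting endpoint
    # the web app uses, so existing authorization decides what this user sees.
    resp = requests.post(
        "https://example.internal/api/reports",  # hypothetical endpoint
        json=args,
        headers={"Authorization": f"Bearer {session_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    csv_data = resp.text

    # Stage 3: a second model analyzes the CSV and formulates the final answer.
    answer = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer the question using only the CSV provided."},
            {"role": "user", "content": f"Question: {question}\n\nCSV:\n{csv_data}"},
        ],
    )
    return answer.choices[0].message.content
```

The design choice mirrored here is the one Kim describes: the model never touches the database directly; it can only call the same authenticated reporting endpoint a human user would.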

Kim emphasized transparency:

"When our AI products generate reports, we also provide a human-readable CSV that was generated to get the answer. We keep the human in the loop to trust but verify…The user can open up the CSV to verify that they got the right answer."

Role-Based Access Control: A Critical Component of Gusto's AI Reporting System

Gusto implemented robust RBAC in their AI reporting system. Eddie Kim emphasized the critical nature of this feature:

"We have to be really thoughtful about the role-based access of those APIs. So one of the design principles is that we in no circumstances want AI to accidentally get information that it shouldn't have access to."

This principle is applied through a carefully designed authorization layer.

"If you're the super admin of the company, you might be able to generate a report that has the salaries of every single employee in your company, but if you're a manager in that company and you're going to a reporting page, you shouldn't be able to do that. You should only be able to, depending on how you set up the permissions on Gusto, you should only be able to generate a report that contains a salary of your direct reports or your indirect reports."

To maintain this level of security, Gusto has integrated its AI reporting model with existing backend APIs.

"That's why we actually have that reporting model go through the same reporting backend APIs as the web app goes through," said Kim.

This approach ensures that the AI system adheres to the same strict access controls as human users, preventing unauthorized data access while still leveraging the power of AI for efficient reporting.
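As a rough illustration of that principle, the sketch below routes every report request, human or AI, through a single authorization gate. The roles, column permissions, and helpers are invented for illustration; Gusto's real authorization layer is more granular.

```python
# Illustrative RBAC gate shared by the web app and the AI reporting model.
# Roles, column permissions, and helpers are hypothetical.
from dataclasses import dataclass, field

COLUMN_PERMISSIONS = {
    "super_admin": {"employee_name", "salary", "employer_taxes"},
    "manager": {"employee_name", "salary"},  # salary is further restricted below
}


@dataclass
class User:
    id: str
    role: str
    report_chain: set = field(default_factory=set)  # ids of direct/indirect reports


def authorize_report(user: User, columns: list, employee_ids: list) -> None:
    """Raise if this user may not see these columns for these employees."""
    allowed = COLUMN_PERMISSIONS.get(user.role, set())
    denied = set(columns) - allowed
    if denied:
        raise PermissionError(f"role {user.role!r} may not select {denied}")
    if user.role == "manager" and "salary" in columns:
        outside = set(employee_ids) - user.report_chain
        if outside:
            raise PermissionError("managers only see salaries in their report chain")


def fetch_rows(columns: list, employee_ids: list) -> str:
    # Stand-in for the real reporting backend.
    return ",".join(columns) + "\n" + "\n".join("..." for _ in employee_ids)


def generate_report(user: User, columns: list, employee_ids: list) -> str:
    # Both the web UI and the AI model call this same entry point, so the AI
    # can never retrieve data the requesting user could not see themselves.
    authorize_report(user, columns, employee_ids)
    return fetch_rows(columns, employee_ids)
```

A manager asking the AI for company-wide salaries hits the same PermissionError the reporting page itself would raise.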

Creating an “AI Ejector Hatch” Using GraphQL Mutations

When it comes to interacting with chatbots and agents, many of us can relate to frustrating customer experiences with chatbots that don’t understand our questions.

Gusto addressed this with an “AI ejector hatch”: graphical cards and confirmation buttons, backed by GraphQL mutations, that let the user steer or exit the conversation at any point.

“We created a graphical user interface with what we call a card. The UI will say, ‘You're going to approve time off for this person, John Doe; you're going to approve his time off request in December, and this is the reason why he requested time off.’ The user then literally has a ‘Yes’ or ‘No’ button. Only when the human admin clicks ‘Yes’ does the copilot complete the function,” said Kim.

“We use GraphQL so the copilot makes a GraphQL mutation to actually approve that time off request. This is a great example of how we can use AI to do the heavy lifting and take you all the way to the last mile… It's a deterministic action executed by the human at the very last mile to ensure accuracy.”
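A minimal sketch of that last-mile pattern: the copilot stages a GraphQL mutation and renders a confirmation card, and only an explicit human "Yes" executes it. The mutation, endpoint, and card shape here are hypothetical, not Gusto's actual schema.

```python
# Sketch of the human-in-the-loop "last mile": the copilot stages a GraphQL
# mutation, and only an explicit user confirmation executes it.
# The mutation, fields, and endpoint are hypothetical.
import requests

APPROVE_TIME_OFF = """
mutation ApproveTimeOff($requestId: ID!) {
  approveTimeOffRequest(requestId: $requestId) { id status }
}
"""


def stage_confirmation_card(employee: str, request_id: str, reason: str) -> dict:
    # What the copilot shows the admin instead of acting on its own.
    return {
        "title": f"Approve time off for {employee}?",
        "detail": reason,
        "on_yes": {"mutation": APPROVE_TIME_OFF, "variables": {"requestId": request_id}},
    }


def on_user_clicked_yes(card: dict, session_token: str) -> dict:
    # Deterministic action, executed only after the human confirms.
    resp = requests.post(
        "https://example.internal/graphql",  # hypothetical endpoint
        json={"query": card["on_yes"]["mutation"], "variables": card["on_yes"]["variables"]},
        headers={"Authorization": f"Bearer {session_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```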

Future Projects: Breaking Down Data Silos

Looking ahead, Gusto aims to leverage AI to break down data silos, enabling seamless access and synthesis of data from various sources. This approach promises to deliver a more cohesive and personalized experience for customers.

Gusto's approach serves as a valuable blueprint for companies looking to integrate AI into their operations. By centralizing efforts, building custom solutions, prioritizing data security, and exploring innovative applications, Gusto demonstrates how AI can transform business operations while maintaining user trust and data integrity.

Chapters

00:00 - Introduction and Background
02:15 - Overview of Gusto's Business
05:59 - Operational Complexity and AI Opportunities
08:51 - Build vs. Buy: Internal vs. External AI Tools
10:07 - Prioritizing AI Use Cases
13:53 - Human-in-the-Loop Approach
19:39 - Centralized AI Team and Approach
22:53 - Measuring ROI from AI Initiatives
32:25 - AI-Powered Reporting Feature
38:46 - Code Generation and Developer Tools
42:52 - Impact of AI on Companies and Society
47:22 - AI Safety and Risks
49:54 - Closing Thoughts

Podcast:

Introduction and Background

[00:00:42] Raza Habib: This is High Agency, the podcast for AI builders. I'm Raza Habib.

[00:00:48] Raza Habib: I'm joined today by Eddie Kim, the co-founder and head of technology at Gusto. He's led them all the way from the time they were a Y Combinator startup, like we are now, to a $9.5 billion valuation. Gusto is known for payroll and HR software for small businesses. But the reason I'm super excited to chat with Eddie is because he's been spearheading their AI initiatives, and I think they've got some really interesting applications of AI at scale. So I'm looking forward to diving into his experiences and the lessons they've learned making it practical.

[00:01:15] Raza Habib: Joining me today as a special guest is Ali Rowghani. Ali Rowghani is a friend. We've done a podcast episode before, actually one of the Y Combinator ones that ended up going viral and got half a million views. So this is not our first time doing this together. But also, I really wanted Ali Rowghani to join me because he has a long career as an operator in tech. He was at Pixar and Twitter, where he was CFO, and he also ran the Y Combinator Continuity Fund for 10 years. The podcast is supposed to be about helping builders in the AI space. I think that while I can bring a really good technical angle, including Ali Rowghani on the show is an opportunity to have someone who comes with the view of business and product, and who can think about not just how to build things, but what we should build and why.

[00:01:56] Raza Habib: So it's a pleasure to have both of you on for the episode. I should also add that Ali Rowghani was an early angel investor in Gusto and sat on their board for a while. He was also just deep in the company too.

Overview of Gusto's Business

[00:02:15] Raza Habib: Eddie, to start with, for our audience's benefit, just to set the context and give everyone the background that's needed, can you tell me a little bit of an overview of Gusto as a business, both what you do and the structure of the company? How should we think about it? How many customers do you have and what's the breakdown of different types of employees? Things like that.

[00:02:26] Eddie Kim: Yeah, so Gusto is a people platform for small businesses, basically anything and everything related to running the people side of a business. That's what Gusto helps with. The way you think about our platform is you join for payroll. We process payroll for more than 300,000 employers in the US. I think more than 6% of US employers are running their payroll through Gusto at this point. And although most of our companies start with payroll on Gusto, they end up doing a lot more on the people platform.

[00:02:54] Eddie Kim: So you could think of other ways that we help that are related to the people aspect of running that business. Those are things like benefits, health insurance, workers compensation, other forms of business insurance, HR tools, we have an applicant tracking system, learning management system, performance management system, time tools for clocking in, clocking out if you're hourly employees, and a bunch of other things that are on our platform. So once you're on our platform, we have the system of record of all the people who work at your company. We can do a lot more by offering all these different what we call, internally, apps on top of our platform.

[00:03:37] Ali Rowghani: You know, it's funny, on the surface, payroll seems kind of ordinary and simple and kind of part of business that's in the background. But running a payroll company, especially focused on small, medium businesses, is actually much, much more complicated, I learned in my history with Gusto, than it may seem. Can you talk a little bit about the complexity of your product, the complexity of the business that you run? You know, I just think that your customer scale is unbelievable, just to start with. So could you talk a little bit about that?

[00:04:07] Eddie Kim: Yeah, I would say payroll is, like, what I've learned is deceptively difficult. It seems simple and somewhat boring on the surface, but when you dive into it, it's actually really, really complicated and also really interesting at the same time. Part of it stems from the fact that in the US you have, you know, not just the federal payroll taxes and not just the state taxes, right? For each of the 50 states who essentially run, from a taxing perspective, as their own countries, basically. You actually have thousands of locals as well. So many cities have their own tax jurisdictions, and they kind of in effect run as their own tiny little countries as well.

[00:04:42] Eddie Kim: Everything is decentralized, and unfortunately, many of the taxing agencies are still really, you know, paper-based, old school. You call them up, you fax them forms. We literally have taxing agencies where we have to burn CD-ROMs with the file, text file on it, and mail it to them in order to communicate with them. And they actually will mail back another CD-ROM with information burned on it. And we get it in our mail, we open it up, and then that's how we kind of do data exchanges with some taxing agencies.

[00:05:12] Eddie Kim: So it's a combination of a very decentralized system in the US, and then also the fact that many of these agencies actually, you know, don't have nice APIs. And then add on top of that, you have lots of situations where employees may live in one state but work in another. For example, it's very common to live in New Jersey and commute into Manhattan for work, and that adds another layer of complexity to how taxes are handled. So it's a really, really kind of exponentially complicated problem that's really, actually quite hard to model the domain.

Operational Complexity and AI Opportunities

[00:05:59] Raza Habib: Eddie, you know, there's a huge contrast, obviously, between payroll and tax and things that people think of as maybe more traditional or even boring, and then the cutting-edge nature of AI and LLMs. But one of the reasons that I think, actually, you guys have been on the frontier and one of the first people to adopt this is my understanding is that it's quite an operations-heavy business by virtue of the fact that you serve small businesses and often very small businesses, at least starting off. Do you mind talking a little bit about that and what your customers are like and the complexity that arises from that?

[00:06:30] Eddie Kim: There are just so many different situations that companies can get themselves into. There's a lot of communication that has to happen, either directly to the small business or directly to a taxing agency about a small business. So in a company like ours, there's, like, kind of a lot of just back office operations that we have. We have about 2,500 employees in the company, I believe. And that seems like a lot, actually, a good percentage of that is in our customer support and in our back office operations.

[00:06:54] Eddie Kim: So AI actually can be used in many ways at Gusto, to sort of improve the efficiency and improve the quality of those operations and the frontline support. So first of all, on the support side, you could imagine, with all these different agencies, a very decentralized system, different rule sets, that the kind of breadth of questions that we get is, it's a huge breadth of questions. It's very different from, say, like an e-commerce site, where you may get, you know, a return-type question and you kind of bucket things. We have a very, very long tail of questions that we get, and it's really difficult to sort of train a human being to know the answer to all those questions.

[00:07:27] Eddie Kim: You employ the traditional techniques where you may kind of assign agents as different pillars that are specialized, and they're trained to support those pillars, but that only gets you so far. This is where things like an internal-facing AI copilot actually really helps. It's really good at sort of retrieving documents from an internal knowledge base that our support people are already using, and then kind of pulling out the most relevant ones related to the email that just came in or what's being said over the phone or chat. AI can really sort of narrow down the problem space quite a bit for the customer-facing agent, to the point where today, actually everyone just uses this AI copilot tool, and they don't use the internal knowledge base that we spent years improving.

[00:08:03] Eddie Kim: Now we still have teams that improve those internal knowledge bases, but they're actually used now more for the copilot, the AI copilot, to essentially retrieve from and then give to the CX agent. So it's still really important, but important in a very different way than it used to be.

Build vs. Buy: Internal vs. External AI Tools

[00:08:51] Raza Habib: So you built the knowledge base for the CS team, but now it's used mostly actually by the AI.

[00:08:56] Raza Habib: And did you build that all internally yourselves, or did you buy a vertical solution? I'm kind of curious, particularly because it feels like customer support might be an area where people would buy a product off the shelf. And I'm wondering if you didn't, it sounds like maybe you didn't. Why didn't you, or how did you make that decision?

[00:09:14] Eddie Kim: So actually, we started out with a third-party solution, and I think those are actually fairly decent to get off the ground. A lot of these third-party solutions, they're already storing your internal knowledge base, and so they can, like, pretty easily create embeddings out of those and put them into a vector database and create you a basic copilot. And I think that's going to actually help out of the box, like many, many companies.

[00:09:35] Eddie Kim: But when you start to try to insert maybe specific context about that particular company, or, you know, past cases that they've had, that's where it's hard to just kind of use an out-of-the-box solution, because they don't have access to that contextual information about the customer that can help improve the retrieval or the completion. And so at some point, as we have at Gusto, you realize that you have to kind of build this more in-house.
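To make the out-of-the-box pattern Eddie describes concrete, here is a minimal retrieval sketch assuming the openai Python SDK: embed the knowledge-base articles, index them, retrieve the closest ones at question time, and let a model answer from them. The articles, model names, and in-memory index are illustrative stand-ins for a real knowledge base and vector database.

```python
# Illustrative copilot retrieval loop: embed knowledge-base articles, index
# them, retrieve by similarity at question time. Articles are placeholders and
# the in-memory index stands in for a production vector database.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

ARTICLES = [
    "How to run an off-cycle payroll ...",
    "Registering for state payroll taxes ...",
    "Approving employee time off requests ...",
]


def embed(texts: list) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])


index = embed(ARTICLES)


def retrieve(question: str, k: int = 2) -> list:
    # Cosine similarity between the question and every indexed article.
    q = embed([question])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [ARTICLES[i] for i in np.argsort(-sims)[:k]]


def copilot_answer(question: str) -> str:
    context = "\n---\n".join(retrieve(question))
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer the support question using the articles below.\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content
```

The customer-specific context Eddie mentions (which products a company uses, past cases) is exactly what gets layered onto this basic loop, to filter and re-rank what gets retrieved.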

Prioritizing AI Use Cases

[00:10:07] Raza Habib: It's funny, because when you just think about Gusto in the abstract, there are probably a dozen ways in which various AI products could make Gusto a better product or a better company, whether it's, you know, new AI-based customer-facing features, or ways to make your support agents more efficient, as you just described, or maybe even, like, if you have telephone support requests, maybe an AI handles those, or some percentage of those. When you guys were initially kind of said, "Okay, we're gonna dedicate resources to improving our business using AI," how did you sort of survey the landscape of opportunities within the company to use AI? And how did you pick what you picked?

[00:10:46] Eddie Kim: That was like one of the hardest things, because, like, when you start to think about what AI can do, your mind starts to explode. And, you know, kind of everything kind of becomes like an AI thing at a certain point. And you're working on something, and you're like, "Oh, this is like a really cool idea." And then, like, sometimes you can distract yourselves. And so I think this is one of the hardest problems, is like picking something and really going deep into it and sort of resisting the urge to kind of pivot and get sidetracked on something.

[00:11:12] Eddie Kim: Initially, I knew things could essentially be broadly bucketed into two buckets. Basically, one is more AI to improve our internal operations. So like non-customer-facing things that are more agent-facing, or back office operations facing, or even engineering facing, when you're looking at coding copilot tools. And those are all about, like, how can we automate a lot of the work that we do manually, or how can we augment the people that are doing the work so they can do it more efficiently? And that has, like, direct impact on the bottom line of the business.

[00:11:39] Eddie Kim: And the other bucket is, of course, like, what are some customer-facing things that you can do using AI? And for us, it was really about kind of defining what's the sort of metrics that we want to try to drive for customers? For us, it was really about, how can we save them time? That's a big theme that we hear from our customers, is like doing things like payroll, running reports, getting set up on benefits, they take a lot of time, and they create a lot of uneasiness on like, "Am I following the law?" So they want to kind of have peace of mind. And so for us, it's really kind of focusing first of all on what's the high-level problem that we want to solve for our customers. How do we measure that? For us, it was primarily like, how can we save our customers time? How can we give them peace of mind?

[00:12:38] Raza Habib: So you guys prioritize customer needs, particularly around saving them time in terms of your AI efforts over "Hey, how do we make our operations more efficient?"

[00:12:47] Eddie Kim: Well, so we kind of effectively divided the team into two, and half the team is essentially working more on our internal operations. How do we make them more efficient? And then half the team is working on the customer-facing. How do we save them time? How do we give them peace of mind?

[00:13:01] Raza Habib: And that implies you guys have a central AI team, like you put together a team and then you split it into...

[00:13:07] Eddie Kim: Yeah, so it's a, you know, relatively new team for us. I'm personally leading the team, and essentially we've sort of, at least for now, centralized our AI efforts. And it's all happening out of this central team. I think in the future, I believe that every engineer at Gusto should ultimately become an AI engineer, and it should be sort of embedded in everything that we do, just like, you know, you may have data science or design as part of every team. I think you will have an AI component in every single team eventually. But to kind of get it off the ground, we've decided to sort of centralize it, like, prove out some things from this team, and then start to kind of help democratize it across the company. And that's kind of happening gradually as we invest in our AI platform and developer tools.

Human-in-the-Loop Approach

[00:13:53] Raza Habib: Something you said when you were explaining that really jumped out at me, which is you said, you know, for payroll, it's an area where people really need peace of mind, like it's a high-stakes application for them. And obviously, one of the challenges with using LLMs is that they still sometimes go off the rails or they hallucinate, or they do things that aren't 100% accurate. How do you guys get sufficient confidence in the performance of the models to be willing to deploy them in what is otherwise like quite a high-stakes task? Or do you just not deploy them in other ways? Like, how do you solve that problem for yourselves?

[00:14:24] Eddie Kim: Yeah, we thought about that a lot, and that's definitely a really important topic for us, because of the peace of mind component. The approach that we've taken, which actually gives us a lot of flexibility to ship things and iterate faster, is a lot of times we will use AI to do like 99% of the work for the customer, and only until the very end, we will actually put the human in the loop, and they'll have to, like, kind of literally click a button to execute the thing. And they'll review it, and then at the end of the day, it's a deterministic thing that happens that's, you know, kind of approved by the user, but AI will help get you there.

[00:14:57] Eddie Kim: So I'll give an example, like, if you want to use AI to approve one of your direct reports' time off request. So let's say you're working at a company that's using Gusto. You're a manager, you have a team, and one of your team members has requested time off. And in Gusto, you could do that, but then the manager has to, like, go in and sort of approve that time off request. So you can imagine, like, a manager can go to our customer-facing AI assistant, it'll say, "Hey, I want to approve Ali Rowghani's time off request." The AI will say, "Okay, well, there's two Ali Rowghanis in the company, like, which one are you talking about?" You kind of disambiguate that by having a conversation. Then it says, "Okay, I know which Ali Rowghani you're talking about, Ali Rowghani. And now, turns out Ali Rowghani has two time off requests, one in August and one in December. Which one do you want to approve?" Then you need to have a conversation. You can say, "Both."

[00:15:59] Eddie Kim: It's kind of a UX experience. It's very fault-tolerant because it keeps going back to the human for clarification and asking questions before it kind of executes anything.

[00:16:05] Eddie Kim: It does, but it actually hasn't done anything yet. Now it finally has the information it needs to approve that time off request. So what happens is, it actually shows a graphical user interface, like a card we call it, and it'll say you can approve time off for this person, for Ali Rowghani, you're gonna approve his time off request in December, and this is the reason why he requested time off. And then you, like, literally have a yes or no button, and then only when you click yes, it actually does it. It makes a GraphQL mutation to actually approve that time off request. So you can see how like AI can do the heavy lifting and take you all the way to sort of the last mile. And then it's not AI, actually, it's a deterministic action at the very last mile.

[00:16:51] Ali Rowghani: Is there an ejector hatch somewhere along the way? So one of the bad customer experiences I've had in the past is when you're given a chatbot interface to support or something, and it just doesn't understand you properly, and it just gets very frustrating because you don't know how to fix it. Like, how do you avoid that? 90% of the time, maybe this is going to make people's experiences better, but I would worry about that 10% of time where they're sitting there at the chat interface and they're like, "No, I really don't want to improve this. Is this the wrong Ali Rowghani or something?" And it's not getting there, you know, what they want. Is there a user experience to eject somehow?

[00:17:23] Eddie Kim: Yeah. So one of the things that we think about building this experience is that it kind of is a hybrid of like conversational and graphical interfaces. It's not a purely conversational interface. So when it's trying to disambiguate which time off request, it'll actually show an experience where it'll list the time off requests, and then you can kind of select which one, and you can also kind of hit "never mind" to eject from it.

[00:17:44] Eddie Kim: So wherever we think it makes sense, wherever we think it's a better user experience because it's going to save more time, it's going to result in more accuracy, we'll actually kind of go back to a graphical user interface. There's a lot of talk these days about how conversational is way better, and that's the future interface. And while I think that's true, I don't think it's great for all situations.

[00:18:03] Eddie Kim: It kind of reminds me of when mobile came out. I think there was a period of time where you try to do everything on mobile. And there are some applications where you can do everything on mobile. It's mobile-first, but there are still a lot of things out there that mobile is not great for. And it's more because, you know, your screen size is smaller, that it's harder to type in there. So you have to be really thoughtful, in my opinion, about an interface, like, what is it really good at? And what is it not good at?

[00:18:27] Eddie Kim: Mobile is going to be great for things where you need GPS, or you always have it on you, things where you kind of do something on the go, but mobile's not going to be good for other things. Same with conversational. Conversational is going to be great for a lot of things. But, you know, if you're trying to type someone's name, and it's really long, my co-founder's name is Tomer. Every time I try to type his name, it auto-corrects it to toner, like printer toner. And so that could lead to mistakes from AI because, literally, you're giving the wrong name.

[00:18:52] Eddie Kim: And so I think that's why we try to think about not just purely conversational, which I think can lead to frustrating experiences. Think about how to intermix graphical experiences with conversational experiences, all backed by AI, but ultimately adding up to the best of both worlds.

Centralized AI Team and Approach

[00:19:39] Ali Rowghani: If I may just ask one final question, Eddie, on this topic of how you guys did the actual prioritization between different use cases. I'm kind of curious to what extent was this top-down? Was this the founders in a room together, kind of deciding this is going to be the direction of the company versus bottoms-up, people at the company getting excited about AI and coming to you? And also, could you just narrate for us the literal process you went through? So I understand how you guys came to the conclusion of customer-facing versus internal, etc. But, you know, is this a team around a whiteboard? Are you putting questionnaires out to the company? Are you looking at analytics? Who was leading it and how did you actually do it?

[00:20:08] Eddie Kim: I think we tried to do bottoms-up. I think I would send out messages to everyone, saying, "Hey, this is a thing. Let's everyone think of something that they can do in your domain that incorporates AI." And I remember a process I actually ran, which was, like, during our quarterly planning process, I actually had a top-down mandate to go bottoms-up: every team has to have one or two things that are AI-related.

[00:20:29] Eddie Kim: And every company's mileage, I think, will vary. But it actually didn't work that well for us. A lot of teams, I think, did stuff, but they were pretty small, I think. And I think you can kind of start to see it in the product where you have little speckles of AI, like, in certain features, where this little star button and you hit it and this AI is just sprinkled through.

[00:20:48] Eddie Kim: And it's like you're doing AI, but you haven't really been thoughtful, and it's a little bit scattershot. So we actually ended up pivoting the approach, which is where we said, "Let's centralize a team. Let's bring some engineers into it. Let's hire some AI ops folks, some AI scientists, and let's just run it from the central team, and let's put a co-founder like me to lead it." And so that's really how we got started. And I think we're getting a lot better mileage doing it this way.

[00:21:13] Eddie Kim: I think there's a combination of being data-driven and talking to your customers and understanding what it is that they want, but then also, like, just kind of using your intuition and your years of experience to have strong hypotheses of what will be of value to your customers, and then just go build it.

[00:21:28] Eddie Kim: I think you can spend a lot of time in AI sort of trying to figure out what exactly to build. But I think this landscape is changing so much, the capabilities are changing so quickly that I actually don't think that's always the most optimal way to do things in AI. I think it's kind of like a rapidly evolving space, and that's why I get excited about it personally, because it's kind of like we're starting a startup within Gusto, and you just gotta ship things very, very quickly, see what resonates, and then learn and then do it again. And eventually you find something that really, really resonates, that you kind of decide, "Okay, we're on to something. Let's invest a lot more into this and expand it as quickly as possible."

[00:22:03] Eddie Kim: I think that's the stage that AI is in, and I think that's why a very senior leader in the company, or ideally, like a founder or co-founder with a centralized, small team, I think, is kind of the most ideal way to run it. But, you know, that's just kind of the Gusto experience.

Measuring ROI from AI Initiatives

[00:22:53] Raza Habib: Eddie, you said you've got part of the team working on customer-facing features to hopefully save them time and give them better assurance that things are going well, and then part of the team is on internal efficiency, etc. I'm curious, have you guys got concrete examples that you can point to already where there's been ROI for the company?

[00:23:10] Eddie Kim: Yeah, I mean, definitely the internal co-pilot tool we know is actually saving a lot of time and also improving the customer experience. This is the tool for the support agents.

[00:23:20] Eddie Kim: And, you know, for support like, in many companies, instrumentation is actually quite good. You measure everything there. You measure things like CSAT (customer satisfaction score), you know, average time to close a case. And so we measure all of those things and I think there's very clear ROI in an internal co-pilot tool, because we see it in the metrics for cases that use this and cases that don't.

[00:23:54] Raza Habib: So both in terms of the efficiency of the agents and the satisfaction of the customer?

[00:23:58] Eddie Kim: Exactly. Yeah. We've had some times where it goes down, and the CX team, they're upset because they can't use it. Almost to the point where I was like, "Well, you used to have this internal knowledge base. That's what you used to use, like, three months ago to search for things," and they don't use that anymore.

[00:24:18] Raza Habib: And that's strong internal product-market fit, if the moment it goes down, people start complaining.

[00:24:22] Raza Habib: Eddie, I just wanted to revisit the fact that you guys built this yourself, because, again, it's something I'm curious about. Do you think that's a stage of market thing that there just isn't a good customer service AI product out there? Or do you think it's something more fundamental that actually to get that good AI customer support agent internally really does require integration into so many different things within the company that it's just gonna be easier for companies to build it themselves?

[00:24:49] Eddie Kim: Yeah, well, I guess first of all, just to be super clear, we started out by using a third party, and we still actually kind of use it for our CX, but we've also built our own internal version. And so we kind of actually, right now have two different versions of it. And what we're doing right now is we're split testing it. Some of our agents have the third-party vendor solution, and some of them have the homegrown solution. Our plan, obviously, is going to be to use the homegrown solution, but we're kind of in this middle phase right now.

[00:25:19] Eddie Kim: I think the limitation of third-party solutions is that they just don't have all the contextual information about that specific customer to improve the response, to improve the retrieval process. This is a very, very basic example, but there are certain help center articles that are applicable to a company that is using benefits. That's very obvious. If they're not using our benefits, our health insurance, then it doesn't make sense to try to retrieve Help Center articles related to our benefits product.

[00:25:48] Eddie Kim: That is one form of customer contextual information that a third party may or may not have. And we have, we could share it with them. But then at some point, you start asking yourself, like, how much do we feel comfortable sharing? Do you want to share what states they're in, how big the company is? And a company like Gusto, you sometimes just start getting into health-related information, PII. And so it can be done. It just, I think, eventually gets a lot harder, and so you kind of find yourself limited in the contextual information about that customer that allows you to improve the retrieval and the completion.

[00:26:27] Raza Habib: So it's really kind of privacy security considerations that sort of tilt the balance towards building something internally, as opposed to using an off-the-shelf tool, because you don't want to give the off-the-shelf tool as much access as it would need to outperform an internal tool that has full access?

[00:26:44] Eddie Kim: Yeah, and the way we're thinking about this in the future is, you know, some of the stuff that saves the customer-facing AI work that we're doing, I'm actually starting to see it converge a little bit with the internal-facing. So a lot of questions that our customer support team will get, AI can actually do it for you.

[00:27:01] Eddie Kim: You can say, "I want to approve someone's time off. I don't know how to do it." You know, in some cases that actually may, is probably not a great example. That actually might result in someone starting a chat with a Gusto agent to kind of help get that done if they can't discover how to do it. But imagine, we can build, we have built, actually an agent that can do that for you in the way that I just described. It kind of guides you through the flow.

[00:27:23] Eddie Kim: You go to get support, get help at Gusto, and then you ask that question, and then our agent might say, "You know, I can actually help you do that." And then you go through that flow where it actually does it for you. That requires the tool to sort of actually take action on your behalf, which obviously would be very, very difficult to trust a third party to do. You kind of essentially have to give that user session to a third party, and then you have to call the API on the user's behalf, which we certainly wouldn't want that to happen. And so those can actually help reduce the support tickets you get, can help solve problems that the customer has but require a level of access that you may not want to give to a third party.

[00:28:10] Raza Habib: That makes sense, great. I think you've done a great job answering the ROI question in terms of saving time for customers, which you said was your other priority. Any early results there that are encouraging from an ROI perspective?

[00:28:22] Eddie Kim: Yeah. So, I mean, one of the first things that we worked on was a reporting model, which actually, you know, Raza has seen an early version of. We've actually measured what it is that our customers spend the most time on at Gusto. And it's two things by far. One is running payroll, that takes a while. And two is they spend the most time on the reporting page, where they may have some question about their company. How much taxes did I pay last quarter, or how many employees do I have?

[00:28:48] Eddie Kim: If you think about what you have to do to get that question answered, step one: go to the reporting page in Gusto. You kind of in your head, you think about, what is the report that I need to generate? What are the columns I need to select? What's the date range, the filter bys, the group bys, I need to select in that reporting interface to get the data that I need that will contain the answer to that question?

[00:29:10] Eddie Kim: So you go do that, you generate the report on Gusto. You're not done yet, because after that, you actually have to then download that CSV file. You may load it into Excel or Google Sheets, use some pivot tables, you do some sums or counts or whatever, and you kind of synthesize that data to finally get the answer that you have. The report is sort of like an intermediate step to actually ultimately answer a business question.

[00:29:32] Eddie Kim: And that whole process for a basic question, it could take 10-15 minutes. If you have a complex question that can take hours. And we know it takes hours because sometimes we talk to our accountants. We talk to our accountants all the time, and they'll, you know, if they have a report they want or a question they want to answer, they tell us they spend a lot of time kind of generating the report, synthesizing the answer.

[00:29:52] Eddie Kim: This is a perfect task for AI. This is one of the first things that we built, which is, you just ask the question to a model. The model is really good at figuring out from your question, what columns to pick, basically what data you need, and it actually goes into our reporting and generates that report. Then it's actually really good at crunching the numbers to kind of give you the answer.

[00:30:11] Eddie Kim: Now that takes about 40 seconds to do that, which seems like forever when you're like, you know, you ask the question, and the thing is spinning and it actually says, "I'm working on it, working on it. I download the report. Now I'm crunching the numbers." Just imagine that takes 40 seconds in a web interface, like that feels like forever. But if you think about how long that would take normally, to go through that process, to do it yourself, like, at least 15 minutes. It's way, way faster.

[00:30:35] Eddie Kim: And so that's just one small example of how it saves a ton of time. This kind of presents a really interesting challenge where you're kind of comparing the response time of AI to like, how quickly you know, normal web request response, which is like 50 milliseconds, but I think that's the wrong anchor point. If you think about a lot of AI is replacing what you might do manually as a human, if you kind of think about it from that anchor point, it's actually saving you a ton of time.

[00:31:35] Raza Habib: How do you communicate that to the customer? Like, do the customers appreciate that? Because I can imagine it being a huge amount of real value to them, and them still being frustrated just because humans being what humans are, we adapt very quickly.

[00:31:46] Eddie Kim: Yeah, I mean, right now we just set expectations: this can take one to two minutes. And I don't think that's a great solution. But if you have better ideas than that, that's the thing with AI, right? It's like a completely new technology, and so everyone's trying to figure out, like, what's the best way to interface with humans, set expectations correctly? Obviously, it's gonna get faster and faster, but I think everyone's just still trying to figure out how to incorporate this into our computer interfaces. And it's going to be a little bit different than, you know, a normal web app or REST API.

AI-Powered Reporting Feature

[00:32:25] Raza Habib: It's easy to forget how unbelievably early all of this still is. Thanks for those examples. I mean, one of the reasons I was most excited to chat to you about it is because there's so much hype. But those are two very concrete examples of, you know, needle-moving ROI activities at a sizable company that I think hopefully show to the world that there is a lot more here than just chat bots and just hype, that actually, this is making real impact for companies. Do you mind if we dive into the weeds a little bit on the technical side for a few moments? And I would love if you wouldn't mind just, you know, picking one of the AI features you've built, whichever one that you think is most interesting, and just talking us through the actual stack of it, how you built it, what is the process for building one of these features in practice?

[00:33:10] Eddie Kim: Yeah, so reporting is really interesting, because, depending on your role in the company, there are certain reports you can generate and certain reports you can't generate. So for example, if you're the super admin of the company, you might be able to generate a report that has the salaries of every single employee in your company. But if you're a manager in that company and you're going to our reporting page, you shouldn't be able to do that. You should only be able to, depending on how you set up the permissions on Gusto, you should only be able to generate a report that contains the salary of your direct reports or your indirect reports.

[00:33:44] Eddie Kim: So we've built a pretty solid authorization layer that reporting has to go through, so that it knows what reports it can generate and what it can't based on the user requesting that report. And as the same with any API call, there's really, we have to be really thoughtful about the role-based access of those APIs. So one of the design principles is that we, in no circumstances, want AI to accidentally get information that it shouldn't have access to. And so that's why we actually sort of have that reporting model go through the same reporting backend APIs as the web app goes through.

[00:34:36] Raza Habib: Similar to how your internal database was built for humans but is now being used by the AI?

[00:34:41] Eddie Kim: Exactly, yeah. So our AI models are basically sitting on top of the same APIs that our web app is sitting on top of. And I think it's very tempting to not do that and just give it direct database access. Obviously could be much more powerful, but there is a non-zero probability that it can pull a column or row that it shouldn't have access to depending on that user. So I think it's really, really important, from a design perspective, to sort of go through that same authorization layers as anything else is going through. And yeah, at times, I think it can be restrictive, but I think it is actually the safest thing to do.

[00:35:16] Eddie Kim: And so when you use the AI reporting functionality, you ask a question, it basically has an understanding of the API endpoints, what columns can be selected, and what time ranges it can select. And then it actually sort of sends over a JSON request to them.

[00:35:44] Raza Habib: And how is that literally being achieved? Are you guys using something like OpenAI function calling with the prompt, or have you fine-tuned a model? How do you achieve that ability to give the model access to these APIs?

[00:35:55] Eddie Kim: So the prompt actually has the kind of columns that it can select. It basically describes what it can do with our API, what it can call. So then it actually makes that call. So then the backend will return a CSV that has the data that the model requested. That CSV looks a little bit different than what a human might get. The report that a human gets has nice titles and, you know, it's more human-readable. The report that the AI model is getting has the same data, but it's formatted in a way that it's easier for it to process.

[00:36:28] Eddie Kim: So now this model has the data. We then have it pass it on to a second model, which then takes this CSV with the original question, and then it'll actually analyze that data to come up with that answer. And there we're using OpenAI's assistants API, particularly because of the code interpreting functionality that it has, so that it can basically do various operations on that CSV to kind of do whatever analysis it needs to answer the question.
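A minimal sketch of that second stage, assuming the v2 Assistants endpoints in the openai Python SDK: upload the generated CSV, attach it to an assistant with the code interpreter tool, and let it compute the answer. The file name and question are placeholders.

```python
# Sketch of the analysis stage: hand the generated CSV to a model with
# code-interpreter access so it can compute the answer. Assumes the v2
# Assistants endpoints in the openai Python SDK; file and question are
# placeholders.
from openai import OpenAI

client = OpenAI()

csv_file = client.files.create(file=open("report.csv", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="Analyze the attached CSV and answer the user's question with numbers from it.",
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {"file_ids": [csv_file.id]}},
)

thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "How much employer tax did we pay last quarter?"}]
)

# Run the assistant; code interpreter writes and executes analysis code itself.
run = client.beta.threads.runs.create_and_poll(thread_id=thread.id, assistant_id=assistant.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)  # the assistant's computed answer
```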

[00:37:04] Raza Habib: Okay, so you're actually using their code runtime. So it's kind of running code through the assistants and then processing the CSV?

[00:37:11] Eddie Kim: Exactly, yeah. And then it gets the answer. And then we finally return to the user the answer to that question, but we also return to the user the CSV that was generated, the human-readable version of that. And that way, you know, kind of going back to sort of the human in the loop, and trust but verify. You know, you could take the answer at face value and run with it, which in the vast majority of cases, that will be just fine. But if you want, you can also open up that CSV, see the data that was pulled, and that dataset, that CSV is usually simple enough that you can actually very quickly verify that it actually got the right answer. So I think it's really important for us to actually share the kind of source along with the answer.

[00:37:52] Raza Habib: Can I just reflect for a moment on how crazy what you said just now actually is, but we just treat it as banal now? Like just one step in this pipeline is the model looking at a CSV, deciding what analysis to run, writing the code for it, running the code, and then returning the output. That's just like one step in the pipeline, and we just take that entirely for granted now. But a few years ago, that in and of itself, I think, is pretty incredible and was, I think, impossible two years ago.

[00:38:19] Eddie Kim: I mean, even the first step of taking a natural language question and it knowing, of the hundreds of columns that you can select in our reporting functionality, which five to select, and what is the time range, to me, like, blows my mind too, that it can do that as well. So I mean both, it's a two-step process here, and both are just kind of like mind-blowing to me. And it's even more mind-blowing when you string them together.

Code Generation and Developer Tools

[00:38:46] Ali Rowghani: I guess one thing, Eddie, I'm interested in, you touched on it a little bit, relating to the trust issue with internal versus external tools. But what other areas of AI have you guys, maybe not tackled yourselves, but are looking at other vendors to help you with? I mean, one area that pops to mind is just the code generation area. You know, there's a lot of startups that are active there. And Microsoft obviously is active there. What do you think about that space? Are you guys working on it? Are you using any third parties, and what do you view as the future?

[00:39:17] Eddie Kim: Yeah, we're definitely using some tools and learning about others. You know, we started out by using GitHub Copilot, and many of our engineers are still using it. We've also kind of built some internal stuff that, you know, is really helpful in our context. So we have, like, for example, a very, very large GraphQL graph. So you can imagine, across all those areas I talked about, payroll benefits, applicant tracking, learning, as the list goes on and on, of all the things that we have, like, our graph is just ginormous, and it actually takes, believe it or not, like, many hours of meetings, sometimes across several days, for an engineer to try to figure out when they're building a feature, what's the GraphQL query or the mutation that they need to make for the backend.

[00:39:59] Eddie Kim: And that's especially the case when there's a functionality that they're building a query that spans across multiple domains or multiple teams. You have to kind of set up a meeting with that team and kind of understand each other's graph a little bit more basically. So one of the things that we built was a semantic understanding of our graph, basically. And we created this giant JSON file that essentially describes every GraphQL endpoint, like, what are the inputs and outputs, and what it's typically used for.

[00:40:24] Eddie Kim: And that's actually built using AI, using some prompts, by fundamentally just kind of having it analyze the file that the code sits in. So it's kind of iterating through every single one of our files, and it's sort of building this JSON that kind of has information about what each GraphQL endpoint is doing, and then we put that into, we embed those, and we put those into a vector database. And now any engineer can sort of ask a question. They could ask the question like, "How would I update an employee's compensation?" And it actually is pretty good at responding with the actual GraphQL query that you would have to make to do something like that.

[00:40:57] Eddie Kim: Or it doesn't even have to be questions that have it return like a GraphQL query. It could just be like, "Hey, where is this file sitting?" Or like, help you navigate a codebase. So that's something that I think is somewhat specific to Gusto that we've had to build in-house. It actually improves developer productivity quite a bit.
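A minimal sketch of the indexing pass Eddie describes, with a hypothetical source layout and prompt: iterate over files, have a model summarize each GraphQL endpoint as structured JSON, and collect the descriptions. In production those descriptions are then embedded and stored in a vector database (as in the retrieval sketch earlier) rather than kept in a list.

```python
# Sketch of the codebase-indexing pattern: a model describes the GraphQL
# endpoints in each source file, producing a catalog engineers can search.
# Paths and the prompt are illustrative, not Gusto's actual setup.
import json
import pathlib

from openai import OpenAI

client = OpenAI()


def describe_endpoints(source: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Summarize the GraphQL queries and mutations defined in this file as a "
                'JSON object with keys "name", "inputs", "outputs", "typical_use":\n'
                + source
            ),
        }],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)


catalog = []
for path in pathlib.Path("app/graphql").rglob("*.rb"):  # hypothetical layout
    catalog.append(describe_endpoints(path.read_text()))

# Next step (not shown): embed each description and store it in a vector
# database, so "How would I update an employee's compensation?" retrieves
# the relevant query or mutation.
```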

[00:41:13] Eddie Kim: And I think that's kind of the future of AI coding tools, where instead of just sort of what is essentially a prediction of what is the next line of code that you're going to write, I feel like the best coding tools out there are going to be able to build a semantic understanding of your codebase, documentation around it, architecture, how it all works, and then you'll be able to query it. And I think the results will be much, much stronger.

[00:41:35] Eddie Kim: Right now, I think a lot of the AI coding tools are sort of primarily doing just basically trying to predict the next line that you're going to write. So it's really great for certain use cases where, you know, like, we use it a lot for writing tests, because those are kind of a lot of very well-defined patterns on how to do that. So it saves a lot of time there. But as soon as you want to do something a little bit more novel, I think it requires the model to have an understanding of how your codebase actually works. So I think that's the future of AI coding tools.

[00:42:35] Raza Habib: And it sounds like it's the difference between being able to automate the boilerplate versus actually having some useful understanding of the problem and situation and contributing in some way. So not just time saving, but actually now the AI system is contributing. I find it fascinating.

Impact of AI on Companies and Society

[00:42:52] Raza Habib: I notice we're probably coming near to our time, and so I wanted to ask Eddie just a few kind of bigger picture questions to wrap up and get your take on it. I'm always curious to see how different people are thinking about what the likely impact of AI will be, both on their companies, but also like society more widely in the future. And so I wonder if you were to try and take a five-year time horizon and you imagine Gusto and society five years from now, what do you think will be different because of AI? What do you imagine will be the same?

[00:43:24] Eddie Kim: One potential way things change significantly because of AI is that sort of the silos that different SaaS products and anything on the web, if you think about the web, everything is very siloed in its own domain. I think a lot of those sort of potentially get broken down and sort of integrated together to deliver a much better user experience.

[00:43:44] Eddie Kim: So just like a small example of that, if you think about Google search, essentially, it's just kind of pointing you to which silo to go to. You have a question about something, and it's basically saying, here's the top 10 pages that might contain the information that you have. So then you go click on the top search result, and then you go into that silo, and then you kind of read it, and you get that information.

[00:44:05] Eddie Kim: You think about what ChatGPT did is, they kind of took all that information, essentially, from those silos, and they kind of brought them together that way. You just ask a question and it gives you the answer. That way, you don't go to all these different silos to figure out what the answer is yourself. And there's a lot of value, that's very, very valuable from the user's perspective, because you don't have to kind of go through all these doors to get something done.

[00:44:28] Eddie Kim: So I think that AI has the potential to sort of bring down these silos and provide a much more integrated experience for the user. Like, the most useful things are going to be things that, you know, you have your information on your bank website, and then you have your information on Gusto, and you kind of want to do something with those two pieces of information, but you can't really do that today. I think AI potentially creates a world where those are not so separated out, and you can kind of do things that require information from both those systems.

[00:44:57] Eddie Kim: And of course, there's this kind of question of how is that going to work from the perspective of the owners of these silos? If you own a site, and you know you have content, you're not happy if OpenAI is scraping it and taking it out of your silo and monetizing it themselves. So that's an interesting question on how I think there's going to be a tension there from what's better from a user experience perspective, and what can AI do, and how much do SaaS companies, or any company want to kind of keep the silos themselves?

[00:45:48] Raza Habib: I don't know the details of this, but apparently, actually, something similar happened when Google search first came around, or when search in general came around and people were indexing websites, and there was a little bit of a debate. I mean, that's why we have robots.txt, right? To say like what can be scraped and not. And at least in that case, the benefit to the users won out over, I think, the benefit to the resource owners.

[00:46:12] Raza Habib: It's not so clear whether that'll fall quite the same way here with AI. I know that something that really stuck in my mind was a little bit after DALL-E came out, for a little while on ArtStation, the only thing that was trending on this online website where artists post art was just the letters "AI" with a big cross through them. Like people were viscerally angry about this. And I suspect there hasn't been quite the same emotional response or the same feeling of theft that there was the first time around. So there are obviously similarities, but also differences, because people feel very strongly that their content is being used and somehow they're not being rewarded for it.

[00:46:45] Eddie Kim: Yeah, totally. I mean, you kind of saw it also play out a little bit with Yodlee and Plaid and this is information that is sitting within bank websites, but it's clearly better if you can kind of get access to it from other web services, like Gusto. And so there's this battle played out on what do the banks want to protect from their data, but actually, what's better from a customer perspective, and what's the bank's data, what's the customer's data? And so I think there's probably going to be a little bit more of this conversation in the future.

AI Safety and Risks

[00:47:22] Raza Habib: Do you have nightmares about AI safety?

[00:47:28] Eddie Kim: I don't. Personally, I think I don't know. At least in my experience, I've seen, I've played around with AI enough to where I know where its limitations are. And so I'm not too worried about AI safety. I think like a lot of new technologies, there are definitely things to be worried about, but I think we find solutions to them, and then everyone feels a lot more safe about it.

[00:47:53] Eddie Kim: So, you know, when cloud computing came about, a lot of people were, rightly, at the time, very concerned about "my data is literally sitting in the same computer as another company, that is literally on the same hard drive." Like that was very concerning, I think, for people. And I remember at the time, getting asked a lot of questions on security questionnaires like, "Do you have access to your physical servers? Do you have guards guarding them?" Things like that. And these are not big concerns anymore, right? Because we've built the technologies to kind of safely protect data.

[00:48:36] Raza Habib: I think there's something interesting and wise in that; it simultaneously accepts the seriousness of the risks while not being worried about them, right? I think that it's very easy to kind of dismiss people who have concerns as doomers, or say that they're unnecessarily worried. And it's easy to forget that in the past, when we have avoided these problems, you think about many, many sorts of things that people worried about and then didn't come to pass. It's easy to say, "Look, you worried about nothing." But the reason that the bad thing didn't happen is often because people intervened and found solutions to avoid it.

[00:49:13] Raza Habib: And, you know, so in some senses, the optimistic take in my mind is the one that says, "Hey, look, there are these problems. There are definitely real risks. But if we take them seriously, we can solve them." I get frustrated by people who dismiss the risks, because I actually think they're the ones who are undermining the good future.

[00:49:26] Eddie Kim: Totally. Yeah, I think we should appreciate the people that are kind of pointing out the risks. And you know what I sometimes like to call, like, paint things red, right? Like, where you have concerns, we should clearly demarcate where we're concerned about it, and then we should do something about it. And I have faith that, you know, over our generations, we've always found solutions to those things and been able to progress, get the benefits of the technology, and kind of still protect ourselves from the things that can go wrong.

Closing Thoughts

[00:49:54] Raza Habib: Fantastic. Well, Ali, unless there's anything else you want to ask?

[00:49:58] Ali Rowghani: No, thank you so much for your time.

[00:50:00] Eddie Kim: Thanks.

[00:50:04] Raza Habib: All right, that's it for today's conversation on High Agency. I'm Raza Habib, and I hope you enjoyed our conversation. If you did enjoy the episode, please take a moment to rate and review us on your favorite podcast platform, like Spotify, Apple Podcasts, or wherever you listen and subscribe. It really helps us reach more AI builders like you. For extras, show notes, and more episodes of High Agency, check out humanloop.com/podcast.

[00:50:30] Raza Habib: If today's conversation sparked any new ideas or insights, I'd really love to hear from you. Your feedback means a lot and helps us create the content that matters most to you. Email me at raza@humanloop.com or find me at RazaHabib on X.

About the author

Raza Habib
Cofounder and CEO
Raza is the CEO and Cofounder at Humanloop. He was inspired to work on AI as “the most transformative technology in our lifetimes” after studying under Prof David MacKay while doing Physics at Cambridge. Raza was the founding engineer of Monolith AI – applying AI to mechanical engineering, and has built speech systems at Google AI. He has a PhD in Machine Learning from UCL.
X: @RazRazcle
