
AI Startup

Redefining the Web’s Future With AI Agents

Dong-A Ilbo | Updated 2026.01.29
Interview with Devi Parikh, Co-founder of Web Agent AI Startup Yutori
 

Artificial intelligence (AI) is evolving into “technology that works on behalf of humans.” At the center of this shift is the “agent.” An agent is an AI that understands a user’s goal and then calls the necessary tools or directly navigates the web to carry out tasks. In this market, where Big Tech companies are rushing in with offerings such as OpenAI’s “Operator” and Anthropic’s “Computer Use,” one Silicon Valley startup is drawing particular attention: “Yutori,” a web agent startup founded by three former members of Meta’s AI research lab.

Paradoxically, the more technology advances, the busier modern people become. They compare product prices, constantly refresh ticketing sites, and jump between dozens of tabs to track news and information. Time scatters amid endless context switching and repetitive clicks. Yutori is taking aim squarely at this problem. Its starting point is the belief that “we should not be interacting with the web this way.”

Yutori co-founder Devi Parikh during an interview at Yutori’s headquarters in San Francisco, USA. Photo provided by Choi Joong Hyuk, CEO of Palo Alto Capital

Yutori co-founder and co-CEO Devi Parikh is a leader in the AI field who previously served as senior director of Meta’s generative AI organization that led the development of the large language model (LLM) “LLaMA.” She has also been a professor at the Georgia Institute of Technology. Her husband and co-founder, Dhruv Batra, headed research on Embodied AI (AI that interacts with the physical world, such as robots) at FAIR (Fundamental AI Research), Meta’s core research division. The third co-founder, Abhishek Das, was Batra’s PhD student and has collaborated with him on research for a decade. He also worked as a research scientist at Meta FAIR. While building their respective careers, they held on to a promise to “start a company together someday,” and once the time felt right, they left their stable positions and took the plunge as a team.

Yutori’s vision is clear: to fundamentally change how people interact with the web. Instead of users clicking and browsing on their own, an AI agent monitors the web 24/7 in the background and retrieves the necessary information. Its first product, “Scouts,” was officially launched in December 2025. It tracks, in real time, the web information users care about (such as product price changes, event ticket releases, or startup fundraising announcements) and notifies them when something changes.

The technology itself is noteworthy. Yutori’s in-house browser-usage model “Navigator” decides what actions to take solely by looking at website screenshots. In benchmarks against computer-use models from OpenAI, Google, and Anthropic, it recorded top-tier performance. By training a model specialized in browser manipulation rather than relying on general-purpose LLMs, the company succeeded in reducing costs to one-tenth within a few months.

Silicon Valley’s response has been enthusiastic. Prominent names have joined as advisors and early investors, including venture firms Felicis and Radical Ventures; Stanford professor Fei-Fei Li, often called the “godmother of AI”; Jeff Dean, Google’s chief scientist who has led Google AI; the CTO of open-source platform Hugging Face; the CTO of AI startup you.com; and Elad Gil. It is unusual for a startup still at the seed stage to assemble such a lineup.

Choi Joong Hyuk, CEO of Palo Alto Capital (left), and Yutori co-founder Devi Parikh. Photo provided by CEO Choi Joong Hyuk

The author met co-CEO Parikh at Yutori’s headquarters in San Francisco in December last year. She spoke about why an AI leader who used to manage teams of hundreds at Meta decided to start again from “zero,” and how Yutori envisions the “future of the web.”

From academia to entrepreneurship, and the birth of “Yutori”

Yutori’s co-founders. From left: Abhishek Das, Devi Parikh, and Dhruv Batra. Photo provided by Yutori

― Yutori’s co-founders were building top-tier careers in AI at Meta GenAI, FAIR, and Georgia Tech. In an interview with the author, Databricks co-founder Professor Ion Stoica said he launched his company because he wanted his research to have an impact on the real world. Despite your academic and professional success, what specific “trigger” led you to found Yutori?

“Several factors came together. All three co-founders were in similar situations. Dhruv and I were professors at Georgia Tech, and all three of us had been working at Meta before leaving together. Dhruv was leading Embodied AI research at Meta.

First, as our roles at Meta grew, we became increasingly distant from hands-on work. At the time, I was a senior director in the GenAI organization, leading multiple teams. It was a fascinating domain that had the highest level of priority within the company. But over time, I started to miss the feeling of directly driving the work. I wanted to get back to the front lines, where I could engage with the details, not just connect and manage people.

Second, there is our long-standing relationship as co-founders. Dhruv and I had been doing research together for nearly 20 years and running a lab at Georgia Tech, while Abhishek had been Dhruv’s PhD student and our collaborator for almost 10 years. We had been talking for a long time about starting a company together someday.

Third, there was Abhishek Das’s decision. He felt his role at Meta had reached a ‘saturated’ state and was planning to start his own company. He suggested we do it together, and as several conditions lined up, we made the decision.

I tend to get excited by new experiences. There is a basic thrill in doing something for the first time, in taking on a new challenge. Starting from zero, building a product, bringing it to market, seeing users find value in it, and then iterating and growing it—that process itself is exciting. It was not one single factor but the intersection of all these elements that became the trigger for founding the company.”

― “Yutori (ゆとり)” is a Japanese word meaning “peace of mind” or “mental room.” It contrasts with Silicon Valley’s culture, which emphasizes hyper-efficiency. Paradoxically, as technology advances, people are becoming busier. What does “true yutori” mean to Yutori, and how is this philosophy reflected in your products and organizational culture?

“In Japanese, ‘yutori’ refers to mental spaciousness, or peace of mind. I do not see this as being in opposition to efficiency. In fact, it describes a state in which one can focus on what truly matters, with less context switching and without distraction or mental clutter. Constant context switching and a frazzled state are essentially the opposite of yutori. Yutori is about enabling people to devote their time and energy fully to meaningful work and, as a result, improving ‘efficiency for what really matters.’

This philosophy is reflected most directly in product functionality. ‘Scouts’ is our first product, and it was officially launched the day before this interview (10 December 2025). It tracks user-specified interests (such as product price changes, restocks, event ticket sales, and startup fundraising announcements) 24/7 in the background and notifies users when changes occur. There is no need to refresh websites, switch tabs, or constantly check social media. The internet is brought to users ‘proactively.’ The product is designed to minimize context switching, and the agent takes on that work on behalf of the user.

The same philosophy is present in the product’s design (UI). We made it clean so that it does not unnecessarily grab attention. Yutori is evident both in ‘what it does’ and ‘how it is built.’

In terms of organizational culture, the influence is more indirect. The very name of the company reflects the co-founders’ belief that ‘being able to focus on what matters’ is a core value. We believe that an obsession with detail, a culture of not making empty promises, and keeping commitments ultimately translate into product quality.”

― Yutori’s lineup of advisors and early investors is striking. It includes mainstream Silicon Valley figures such as Fei-Fei Li and Jeff Dean. As an early-stage startup, how were you able to bring such people on board?

“To be honest, there was some luck involved (laughs). I came to know people like Fei-Fei Li and Jeff Dean through the research community—meeting at conferences or interacting in other contexts. So when we decided to start the company and reached out, they were happy to support us. They are all very kind and generous.

There were also people we did not know personally, but whose design sensibilities or companies we had long admired. They are the kind of people we aspire to match someday. As supporters began to emerge one by one, they introduced us to others, and our network gradually expanded.

We met Elad Gil and Sarah Guo through the ‘No Priors’ podcast they host, where I appeared a few years ago. In a sense, various puzzle pieces came together from different directions. We are simply very grateful that such people are willing to back us.”

Diverse use cases for the agent “Scouts”

― Scouts appears to have a wide range of use cases, from personal tasks such as reserving tennis courts to business use.

“That is correct. People are using it in very diverse ways. On the personal side, I use it to track product prices and discount codes, reserve tennis courts, make restaurant bookings, and buy tickets. Its utility in work contexts is also significant.

The important point is that Scouts does not overpromise with claims like ‘it will book reservations for you’ or ‘it will make purchases unconditionally.’ The product’s functionality is deliberately defined narrowly. It simply monitors information and alerts users when something changes in the future—that is all. However, the domain of what it can monitor is extremely broad. It can track almost every area of the web.

I think this combination has been effective. By narrowing the function, we can deliver it reliably, and people can use it in day-to-day life. At the same time, because the domain is broad, we get important signals about which targets people monitor most often and where deep usage occurs. We are designing our next expansion directions based on this data. For now, we see this combination of ‘narrow function, broad domain’ as a strong starting point.”
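
For technically minded readers, the "monitor and alert on change" function Parikh describes can be pictured with a short Python sketch. This is only an illustration and not Yutori's implementation: the class name `ChangeWatcher`, its `fetch` and `notify` callbacks, and the canned price sequence are all assumptions made for the example.

```python
from typing import Callable, List, Optional


class ChangeWatcher:
    """Remembers the last observed value of one target and reports changes.

    Hypothetical helper for illustration only; not Yutori's code.
    """

    def __init__(self, name: str, fetch: Callable[[], str],
                 notify: Callable[[str], None]) -> None:
        self.name = name
        self._fetch = fetch      # returns the current value, e.g. a price string
        self._notify = notify    # delivers a message to the user
        self._last: Optional[str] = None

    def check(self) -> bool:
        """Fetch once; notify and return True only if the value changed."""
        current = self._fetch()
        # The first observation only sets the baseline, so it never alerts.
        changed = self._last is not None and current != self._last
        if changed:
            self._notify(f"{self.name}: {self._last} -> {current}")
        self._last = current
        return changed


# Toy run: the "site" is a canned sequence of prices instead of a real page.
prices = iter(["$120", "$120", "$99"])
messages: List[str] = []
watcher = ChangeWatcher("headphones price", lambda: next(prices), messages.append)
results = [watcher.check() for _ in range(3)]
```

The watcher itself only reads and reports, which mirrors the narrow function described in the interview. What can be plugged in as `fetch` is left open, which corresponds to the broad domain.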

Technical deep dive: latency and vision models

― Web agents are close to a “video-to-action” problem, where the system has to interpret a continuous stream of visual data and decide how to act. In such an architecture, there is inevitably a trade-off between model complexity and latency. How does Yutori’s “Navigator” address the latency issue from a technical standpoint?

“We use static screenshots instead of video. The agent captures a screenshot of the website, decides on an action, and then takes another screenshot, repeating this process. In other words, instead of processing a continuous video stream, it follows a ‘see the screen → act → capture again’ pattern. Because the agent does not need to process all past frames and can decide the next action based mainly on the most recent screen, latency is reduced.

Another point is that we trained our own model specialized in browser use rather than using a general-purpose LLM. Instead of designing a model that is good at everything—coding, math, general knowledge—we focused exclusively on web navigation. This specialization leads to faster inference and advantages in latency.

Finally, we use only rendered visual information, not HTML (DOM) code. Early on, we assumed a machine did not need to see what humans see and could just read the code directly, but in reality, every website has a different structure and many exceptions, which forced us to deal with each site individually. In contrast, rendered screens are already optimized for humans and thus provide a much more general and stable input format.”
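
The "see the screen, act, capture again" pattern can be sketched as a simple loop. The Python below is a toy under stated assumptions, not Navigator itself: `FakeBrowser`, `Action`, `toy_policy`, and the use of plain strings in place of real screenshots are all hypothetical stand-ins. In a real system, a trained vision model would take the place of the rule-based policy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical action vocabulary)
    target: str = ""


class FakeBrowser:
    """Stand-in for a remote browser; 'screenshots' are plain strings here."""

    def __init__(self, pages: List[str]) -> None:
        self._pages = pages
        self._index = 0

    def screenshot(self) -> str:
        return self._pages[self._index]

    def apply(self, action: Action) -> None:
        # In this toy setup every action simply advances to the next page.
        if self._index < len(self._pages) - 1:
            self._index += 1


def run_agent(browser: FakeBrowser, policy: Callable[[str], Action],
              max_steps: int = 10) -> List[Action]:
    """Loop: capture the screen, choose an action, apply it, capture again."""
    taken: List[Action] = []
    for _ in range(max_steps):
        shot = browser.screenshot()   # only the most recent screen is used
        action = policy(shot)
        taken.append(action)
        if action.kind == "done":
            break
        browser.apply(action)
    return taken


def toy_policy(screenshot: str) -> Action:
    """Rule-based placeholder for a vision model (assumption for the demo)."""
    if "results" in screenshot:
        return Action("done")
    if "search box" in screenshot:
        return Action("type", "search box")
    return Action("click", "search link")


pages = ["home page", "page with search box", "results page"]
steps = run_agent(FakeBrowser(pages), toy_policy)
kinds = [a.kind for a in steps]
```

Because the policy receives only the latest capture, the work per step stays constant no matter how long the session runs, which is the latency argument made above.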

The fork between rogue bots and delegated agents

― One of the fundamental constraints on web agents is the “spear and shield” conflict with security measures such as CAPTCHA. Websites try to block bots, while agents try to navigate them. How does Yutori intend to manage this adversarial dynamic in a sustainable way? Where do you draw the line between what an agent should and should not do?

“In the long term, I think the ecosystem itself will change. In the past, bots were malicious entities that scraped data from the web without permission. Today’s agents, however, serve as ‘proxies’ that act on behalf of users. They perform tasks that users want to carry out on websites. Therefore, the assumption that ‘all bots are bad’ is likely to evolve over time.

Going forward, I expect the ecosystem to differentiate between ‘rogue bots’ that scrape data without permission and ‘delegated agents’ that act as representatives of users. There are already early efforts to create protocols—such as header-based identification—that can distinguish who the agent represents and which website it is being sent to.

Taking a step back, there is no fundamental reason humans should have to keep opening browsers, clicking buttons, and filling out forms. These tasks will be automated and carried out by agents. Some websites will embrace this change early, while others will wait and join after the ecosystem matures. I do not think it will remain an endless cat-and-mouse game.

As the web becomes more agent-friendly, elements like Model Context Protocol (MCP) servers will become more common. Technically, Yutori’s stack is flexible. If a website provides an API or MCP, we will use that first. Only when such options do not exist—for example, on local tennis court booking sites—do we launch a remote browser and use a vision-based model to click through and gather information. As the ecosystem evolves to be more agent-friendly, we will be well positioned to leverage that trend.”
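
The fallback order described here (structured interfaces first, a vision-driven browser only as a last resort) can be written as a small selection function. This is a hedged sketch: `SiteCapabilities`, its fields, the example URLs, and the choice to prefer an API over an MCP server when both exist are assumptions for illustration. The interview says only that an API or MCP is used before the browser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteCapabilities:
    """What a site is known to offer (hypothetical record for this sketch)."""
    api_url: Optional[str] = None
    mcp_url: Optional[str] = None


def choose_access_path(site: SiteCapabilities) -> str:
    """Prefer structured interfaces; fall back to a vision-driven browser."""
    if site.api_url:
        return "api"
    if site.mcp_url:
        # Ordering API before MCP is an assumption, not stated in the interview.
        return "mcp"
    return "vision_browser"


# Example sites with made-up endpoints.
ticketing = SiteCapabilities(api_url="https://example.com/api")
docs_site = SiteCapabilities(mcp_url="https://example.com/mcp")
tennis_club = SiteCapabilities()  # no API or MCP, like a local booking site

paths = [choose_access_path(s) for s in (ticketing, docs_site, tennis_club)]
```

Under this design, a site that later adds an API or MCP server moves off the slower browser path without any change to the agent's task logic.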

Agents that do not log in: the “staircase of trust”

― Many users remain uneasy about giving AI agents their login credentials. What technical and product measures have you implemented to bridge this “trust gap” and ensure security?

“Currently, Scouts only monitors publicly accessible information. It does not log in on behalf of users.

We have adopted a ‘staircase of trust’ strategy: a gradual, step-by-step approach to building trust over time. The idea is to first demonstrate usefulness and reliability in low-risk areas and then gradually earn more delegation of authority and context, creating a virtuous cycle that allows us to deliver greater value.

The risk level of tasks also varies. For high-risk tasks like buying airline tickets, it is better not to complete payment immediately. Instead, the agent should find flights and send a link asking, ‘I found this—do you approve the purchase?’ For low-risk tasks such as ordering a pizza, more automation can be allowed. There are different levels of tasks, and users’ tolerance for risk can expand over time.

Scouts is a read-only service that does not perform actions that change the world, which makes it well suited as the first step in building trust. If we consistently deliver information reliably, users’ trust will grow, and over time they will feel comfortable delegating more tasks.”
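
One way to picture the "staircase of trust" is as a gate that compares a task's risk with the level of delegation a user has granted so far. The sketch below is illustrative only: the `Risk` levels, the `next_step` function, and its return labels are hypothetical and are not drawn from Yutori's product.

```python
from enum import Enum


class Risk(Enum):
    READ_ONLY = 0   # monitoring only, nothing in the world changes
    LOW = 1         # e.g. ordering a pizza
    HIGH = 2        # e.g. buying airline tickets


def next_step(task_risk: Risk, user_tolerance: Risk) -> str:
    """Decide what the agent does, given how much the user has delegated."""
    if task_risk is Risk.READ_ONLY:
        return "report"          # just deliver the information
    if task_risk.value <= user_tolerance.value:
        return "execute"         # within the authority the user has granted
    return "ask_approval"        # send a link and wait for confirmation


# A user who currently allows low-risk automation only.
tolerance = Risk.LOW
decisions = {risk.name: next_step(risk, tolerance) for risk in Risk}
```

Raising `user_tolerance` over time corresponds to climbing a step of the staircase: the same high-risk task that once required approval can later be executed directly.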
