United States v. Google
United States District Court for the District of Columbia

Findings of Fact, Section II. General Search Engines

H. Artificial Intelligence

107. “Artificial intelligence is the science and engineering of getting machines, typically computer programs, to exhibit intelligent behavior.” Id. at 6339:18-20 (Nayak). One application of AI enables computers to understand and solve problems without human intervention.

108. For instance, AI researchers have sought to program “computers to directly understand a document or a passage just based on the words.” Id. at 1909:5-6 (Lehman). These sorts of programs are known as LLMs or machine-learning models. See id. at 2667:25–2668:4 (Parakhin) (“A large language model is the closest that humanity came to producing actual artificial intelligence. It is a system that can look at written text or images, and reason over it and provide answers in a human readable flowing sort of language.”).

109. Beginning in 2015, Google increasingly began to incorporate AI technologies into its search processes. Id. at 6341:18–6342:11 (Nayak). Around that time, Google published “a family of deep neural nets that are called transformers that . . . take an input and spit out an output[.]” Id. at 7403:9-17 (Raghavan). This technology, which is incorporated into signals like MUM, allowed Google to rely on less user data and still improve its ranking of search results. FOF ¶¶ 97–101.

110. For instance, AI technology has accelerated search quality with respect to spelling corrections or semantically related concepts, without relying on user data. Tr. at 3697:7-17 (Ramaswamy). Neeva leveraged machine learning to develop its spell-correction technology, as opposed to relying entirely on user data. Id. at 3781:23–3783:20 (Ramaswamy). And if a user were to query “vacuum cleaner for a small apartment with pets,” Google’s transformer technology helps discern “whether the user wants an apartment, a vacuum cleaner[,] or a pet[.]” Id. at 7405:5-11 (Raghavan); see also UPX197 at 211 (discussing impact of machine learning on relevance).

111. AI technologies have the potential to transform search. Tr. at 3696:11–3697:21 (Ramaswamy) (“AI enables search engines to do things that are not really conceivable in a return-a-set-of-links model, which is what commercial search engines generally do today.”). Recently, Google and Bing have incorporated generative AI technology into their SERPs by providing “AI-powered answer[s],” which do not rely on user data to produce. Id. (generative AI can supplement user data by offering different SERP functionality beyond organic links, such as an “AI-powered answer”). Such answers also can come in the form of AI chatbots, such as Bing’s BingChat (now Copilot) and Google’s Bard (now Gemini). Id. at 8272:9-24 (Reid). The input could be an image or words, and the output may be similarly varied. Id. at 7404:8-11 (Raghavan). Neeva also relied on AI-generated search results to differentiate itself from other GSEs and used AI to develop a search product with less user data. See id. at 3696:11–3697:21 (Ramaswamy).

112. The integration of generative AI is perhaps the clearest example of competition advancing search quality. Google accelerated and launched its public piloting of Bard one day before Microsoft announced BingChat, the integration of ChatGPT’s generative AI technology into Bing to deliver answers to queries. Id. at 8272:4-7 (Reid); id. at 2670:10–2671:9 (Parakhin) (describing BingChat).

113. AI also has applications in search advertising. “Natural language understanding is a subfield of artificial intelligence” that seeks to “understand what it is a user is trying to get done, going back to the intent.” Id. at 7376:1-3 (Raghavan). Google applies natural language understanding to its search advertising to better discern user intent and deliver an optimally responsive advertisement. Id. at 7376:3-21 (Raghavan).

114. Despite these recent advances, AI has not supplanted the traditional ingredients that define general search. See UPX197 at 211 (“There is a lot more to web ranking for which [machine learning] seems much less appropriate.”). And it is not likely to do so anytime soon. Tr. at 7531:23–7532:8 (Raghavan) (“I view this as a journey, not as something that happened overnight. And I think what we in the industry have to figure out is how to use the AI . . . tools to do a better and better job of defining the user’s intent and giving just the perfect answer. And what I’ve seen so far is one more step. I think there’s a few more steps to go, and I expect that in time, for instance, you will see these language models be able to service queries not only from typewritten prompts, but voice queries, image, camera, as well. And that’s a journey that we’re still early on.”); id. at 7530:7-8 (Raghavan) (“It’s not the case . . . that everything we do in ten years will be through” LLMs.); id. at 7530:9-18 (Raghavan) (Google has no plans to stop crawling and indexing the web in the foreseeable future nor will it stop presenting users with organic links on the SERP); id. at 7665:23-25 (Pichai) (“Now with artificial intelligence, I think we are again in the early stages of completely rethinking what’s possible for our users.”).

115. Importantly, generative AI has not (or at least, not yet) eliminated or materially reduced the need for user data to deliver quality search results. Id. at 3697:17-21 (Ramaswamy) (“[T]he middle problem of figuring out what are the most relevant pages for a given query in a given context still benefits enormously from query click information. And it’s absolutely not the case that AI models eliminate that need or supplant that need.”); id. at 1931:21-24 (Lehman) (MUM “definitely” did not replace traditional data-based signals, like Navboost and QBST). When asked to predict how search engines will work in five or 10 years, Google’s former Distinguished Software Engineer, Eric Lehman, testified that while it may be diminished in the future, “there will still be a role for user data[.]” Id. at 1924:18–1925:22 (Lehman). This is in part because “deep learning systems are much harder to understand.” Id. at 6366:21-22 (Nayak). It thus remains vital for Google to “have an infrastructure that [it] understand[s],” i.e., traditional ranking signals. Id. at 6366:21–6367:10 (Nayak) (“[T]here is no sense in which we have turned over our ranking to these systems. We still exercise a modicum of control over what is happening and an understandability there.”).