whether to keep those pages . . . [or] future pages in the index or not”). User data also helps determine where a webpage resides within the larger index. Id. at 10274:22–10275:1 (Oard). Google divides its index into tiers. Id. Each page is assigned to a tier based on how fresh it needs to be, and the fresher tiers are rebuilt more frequently. Id.
93. Retrieval and Ranking. Because humans are imperfect, so too are their queries. Google relies on user data to decipher what a user means when a query is typed imprecisely. For example, user data allows Google to identify misspellings and reformulate queries using synonyms to produce better results. Id. at 8088:15-24 (Gomes) (spelling, synonyms, and autocomplete use query data to improve); id. at 2273:3-15 (Giannandrea) (“reformulation,” which is when a user misspells a query and then re-enters it with another spelling, is important to improve spell check); UPX224 at 914 (Google built its spelling technology by “look[ing] at all the ways in which people mis-spell words in queries and text all over the web, and us[ing] that to predict what you actually mean”).
94. Google scores potentially relevant results to determine the order in which they are placed, or ranked, on the SERP. Scoring is done using a number of signals and ranking systems, which are technologies that attempt to discern the user’s intent and thus identify the most relevant results for a particular query. See UPX204 at 243; Tr. at 1764:1-25 (Lehman). Many of these signals, discussed below, rely on user data.
95. Query-based Salient Terms, or QBST, is a Google signal that helps respond to queries by identifying words and pairs of words that “should appear prominently on web pages that are relevant to that query.” Tr. at 1807:25–1808:10 (Lehman) (e.g., “1600 Pennsylvania Avenue” and “White House”). QBST is a “memorization system[]” that helps the GSE