Wikidata: The Making Of

Denny Vrandečić

Wikimedia Foundation
San Francisco, California, USA
Q18618629
denny@wikimedia.org

Lydia Pintscher

Wikimedia Deutschland
Berlin, Germany
Q18016466
lydia.pintscher@wikimedia.de

Markus Krötzsch

TU Dresden
Dresden, Germany
Q18618630
markus.kroetzsch@tu-dresden.de

ABSTRACT

Wikidata, now a decade old, is the largest public knowledge graph, with data on more than 100 million concepts contributed by over 560,000 editors. It is widely used in applications and research. At its launch in late 2012, however, it was little more than a hopeful new Wikimedia project, with no content, almost no community, and a severely restricted platform. Seven years earlier still, in 2005, it was merely a rough idea of a few PhD students, a conceptual nucleus that had yet to pick up many important influences from others to turn into what is now called Wikidata. In this paper, we try to recount this remarkable journey, and we review what has been accomplished, what has been given up on, and what is yet left to do for the future.

CCS CONCEPTS

• Human-centered computing → Wikis; • Social and professional topics → Socio-technical systems; History of software; • Information systems → Wikis.

KEYWORDS

Wikidata, knowledge graph, Wikibase, MediaWiki

ACM Reference Format:
Denny Vrandečić, Lydia Pintscher, and Markus Krötzsch. 2023. Wikidata: The Making Of. In Companion Proceedings of the ACM Web Conference 2023 (WWW ’23 Companion), April 30–May 04, 2023, Austin, TX, USA. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3543873.3585579

This work is licensed under a Creative Commons Attribution-Share Alike International 4.0 License.

WWW ’23 Companion, April 30–May 04, 2023, Austin, TX, USA
© 2023 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9419-2/23/04.

1 INTRODUCTION

For many practitioners and researchers, Wikidata [68] simply is the largest freely available knowledge graph today. Indeed, with more than 1.4 billion statements about over 100 million concepts across all domains of human knowledge,[1] it is a valuable resource in many applications. Wikidata content is behind answers of smart assistants such as Alexa or Siri, is used in software and mobile apps (see Fig. 1), and enables research, e.g., in life sciences [38, 73], humanities and social sciences [33, 66, 76], artificial intelligence [1, 10, 49, 53, 57], and beyond [3, 46, 51].

However, Wikidata is much more than a data resource. It is, first and foremost, an international community of volunteers who subscribe to the goal of making free knowledge available to the world. It shares this and other goals with the wider Wikimedia Movement[2] to which Wikidata belongs. Indeed, Wikidata is also a project (and website) of the Wikimedia Foundation, along with sister projects such as Wikipedia and Wikimedia Commons, backed by dedicated staff to create and maintain the infrastructure that enables the work of the community.

Figure 1: Apps using Wikidata (from upper left): Wikipedia iOS app, mobile search on e/OS/, in-flight app by Eurowings/Lufthansa Systems, Siri (historical glitch exposing Wikidata IDs), and WikiShootMe tool for Wikipedia editors

The complexity and scale of the endeavor may suggest that Wikidata was the result of a long and carefully prepared strategic plan of the Wikimedia Foundation, possibly in response to demands from the Wikipedia community. There is certainly some truth to that. However, the real history of how Wikidata was conceived and how it eventually developed into its present form is not that straightforward: it involves a group of PhD students (naïve but optimistic[3]), a free software project that brought structured data

  1. All statistics reported are current at the time of this writing. Up-to-date numbers are found at https://www.wikidata.org/wiki/Wikidata:Statistics.
  2. https://meta.wikimedia.org/wiki/Wikimedia_movement
  3. We maintain that these are different qualities.