Wikisource:Community collaboration/Monthly Challenge/How it works
The Monthly Challenge is largely automated and works with a set of templates, Lua modules and bots. The infrastructure may seem daunting but most parts of it are relatively simple in isolation.
Data
Challenge works
The data for the challenges is stored in Lua data tables, arranged on a month-by-month basis. For example, the November 2024 challenge draws its works from Module:Monthly Challenge/data/2024-11. Works can also be drawn from prior years' data tables when they roll over a year end (for example, works that are added in December but are not completed by the end of the year).
The data structure for these tables can be seen at Module:Monthly Challenge/data/2024-11.
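The month-by-month naming can be sketched as follows. This is only an illustration of the title pattern shown above; the helper name `data_module_title` is hypothetical and is not taken from the actual modules or bot.

```python
from datetime import date

# Prefix taken from the example title Module:Monthly Challenge/data/2024-11.
DATA_PREFIX = "Module:Monthly Challenge/data/"


def data_module_title(day: date) -> str:
    """Return the title of the works data table for the month containing `day`.

    Hypothetical helper: it only reproduces the YYYY-MM naming pattern.
    """
    return f"{DATA_PREFIX}{day.year:04d}-{day.month:02d}"


print(data_module_title(date(2024, 11, 15)))
# Module:Monthly Challenge/data/2024-11
```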
This table is generated entirely by hand. Works are proposed for inclusion at Nominations.
Statistics
There is no built-in way to track some statistics within the constraints of the MediaWiki software that Wikisource uses. Because of this, a program runs externally and updates statistics on-wiki in the form of Lua tables. This raw data is then used by templates and Lua modules to present the data to users.
Daily statistics are written to modules like Module:Monthly Challenge daily stats/data/2021-05.
These statistical tables should not be edited manually, as any changes will be overwritten the next time the stats update.
The per-index progress bars are provided via the ProofreadPage Lua library and the {{index progress bar}} template.
Templates
Templates provide most of the "UI" of the Monthly Challenge. A complete list of the templates can be found at Category:Monthly Challenge templates.
Logic templates
- {{Monthly Challenge listing}}: this is one of the core modules and is used on every monthly overview page (for example: May 2021). This module takes the data from the works data table and analyses each index relevant to the current month's challenge. It then arranges the works into sections based on work length and proofreading status, and outputs "tiles" for each work.
- {{Daily proofreading stats table}}: this template formats a list of daily proofreading progress stats into a table.
There are some templates that provide more fine-grained views into the Monthly Challenge data:
- {{Monthly Challenge sprint}} shows a sprint for a given month
- {{Monthly Challenge sprint listing}} shows all works in the sprint for a given month
UI templates
These templates provide parts of the UI for the Monthly Challenge:
- {{Monthly Challenge month cat}}
- {{Monthly Challenge month overview}}
- {{MC-Section/s}}, {{MC-Section/e}}: a section of a month's listing
- {{MC-Cover}}: an individual "tile" in a month's listing
- {{Collaboration/MC}}: the overview of the Monthly Challenge as seen on the front page
Bots
Some parts of the Monthly Challenge cannot be achieved using only what MediaWiki provides through templates and modules. These parts are therefore done by scripts and bots outside the MediaWiki framework.
Statistics
Current proofread progress stats and per-challenge daily stats are provided by an external program that analyses relevant revisions in the Wikisource database, computes the number of pages in the given statuses and writes to the data tables described above.
The statistics are gathered by analyzing the edits made to each page for each index in the month's data table[1]. There are some limitations and qualifications to how this data can be used:
- The total number of pages (a figure that includes pages that don't exist yet) for an index is based on the data stored in the pr_index database table at the time the script is run. This count is used for all the stats, so if the file changes during the month, days before the change will also use the new count.
- Edits will only be considered if they are made during the month. If a page is moved into a challenge work partway through, only edits made to that page since the start of the month count towards daily stats. The overall per-index stats will include the new statuses.
- Works are included based on current membership of the Challenge. If an index is removed from the challenge during the month, stats for previous days will be removed at the next update. If a work is added to the challenge, all edits made to its pages (during the month so far) will be included in daily figures from the start of the month.
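The month and membership rules above can be sketched as a simple filter over revisions. This is a minimal illustration, not the bot's actual code: the `Revision` record, its fields and the `daily_counts` function are all hypothetical, and it counts every revision that leaves a page at the given status, which may differ from how the real tool counts status changes.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    """Hypothetical record of one edit to a page."""
    index: str           # index the page belongs to
    page: str            # page title
    timestamp: datetime  # timezone-aware time of the edit
    status: int          # ProofreadPage quality level after the edit (0-4)


def daily_counts(revisions, challenge_indexes, year, month, status):
    """Count, per day of the month, the revisions that set `status`.

    Only revisions made during the given month (in UTC) and belonging to an
    index that is currently in the challenge are considered.
    """
    counts = Counter()
    for rev in revisions:
        if rev.index not in challenge_indexes:
            continue  # not (or no longer) part of the challenge
        ts = rev.timestamp.astimezone(timezone.utc)
        if (ts.year, ts.month) != (year, month):
            continue  # edit made outside the month
        if rev.status == status:
            counts[ts.day] += 1
    return dict(counts)


utc = timezone.utc
revs = [
    Revision("Index:A.djvu", "Page:A.djvu/1", datetime(2021, 5, 1, 10, tzinfo=utc), 3),
    Revision("Index:A.djvu", "Page:A.djvu/2", datetime(2021, 5, 1, 23, 59, tzinfo=utc), 3),
    Revision("Index:A.djvu", "Page:A.djvu/3", datetime(2021, 4, 30, 23, tzinfo=utc), 3),
    Revision("Index:B.djvu", "Page:B.djvu/1", datetime(2021, 5, 2, tzinfo=utc), 3),
]

print(daily_counts(revs, {"Index:A.djvu"}, 2021, 5, 3))
# {1: 2}
print(daily_counts(revs, {"Index:A.djvu", "Index:B.djvu"}, 2021, 5, 3))
# {1: 2, 2: 1}
```

The second call shows the membership rule: once the hypothetical Index:B.djvu is in the challenge set, its edits from earlier in the month appear in the daily figures as well.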
The current proofread status is determined using the same list of revisions that the daily stats use (as opposed to the q0-4 fields of the pr_index table). There should be no difference in these figures, but since the database table requires the index to be purged to be up to date, there could be minor and temporary differences.
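Deriving the current status from a revision list amounts to taking the latest revision of each page and tallying the results. The sketch below assumes a hypothetical input of (page, timestamp, status) tuples with ISO 8601 UTC timestamps; the function name and the "missing" key are illustrative only. The keys q0 to q4 follow the ProofreadPage quality levels (0 without text, 1 not proofread, 2 problematic, 3 proofread, 4 validated).

```python
from collections import Counter


def current_status_counts(revisions, total_pages):
    """Tally the current quality level of each page of one index.

    `revisions` is an iterable of (page, timestamp, status) tuples, where the
    timestamps are ISO 8601 UTC strings (which sort chronologically).
    `total_pages` is the page count of the index, including pages that do not
    exist yet.
    """
    latest = {}
    for page, timestamp, status in revisions:
        # Keep only the most recent revision seen for each page.
        if page not in latest or timestamp > latest[page][0]:
            latest[page] = (timestamp, status)

    counts = Counter(status for _, status in latest.values())
    result = {f"q{level}": counts.get(level, 0) for level in range(5)}
    # Pages with no revisions at all have not been created yet.
    result["missing"] = total_pages - len(latest)
    return result


example = [
    ("Page:A.djvu/1", "2021-05-01T10:00:00Z", 1),
    ("Page:A.djvu/1", "2021-05-02T10:00:00Z", 3),
    ("Page:A.djvu/2", "2021-05-01T11:00:00Z", 4),
]
print(current_status_counts(example, total_pages=5))
# {'q0': 0, 'q1': 0, 'q2': 0, 'q3': 1, 'q4': 1, 'missing': 3}
```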
Statistics are generated according to server time, which is UTC. Thus, if you want to sneak in last-minute edits and you live in California, you only have until about 4pm (standard time) or 5pm (daylight saving time) your time on the last day of the month before you miss the cut-off!
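To work out the cut-off for another location, convert the first instant of the following month (in UTC) to a local offset. The sketch below uses fixed UTC offsets rather than a time zone database, so the offset has to be supplied by hand; the function names are hypothetical. California is UTC-8 in winter and UTC-7 during daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def month_cutoff_utc(year, month):
    """Return the first instant after the given month, in UTC."""
    if month == 12:
        return datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(year, month + 1, 1, tzinfo=timezone.utc)


def cutoff_in_offset(year, month, utc_offset_hours):
    """Return the month's cut-off expressed in a fixed UTC offset."""
    local = timezone(timedelta(hours=utc_offset_hours))
    return month_cutoff_utc(year, month).astimezone(local)


print(cutoff_in_offset(2021, 5, -7))   # 2021-05-31 17:00:00-07:00
print(cutoff_in_offset(2024, 11, -8))  # 2024-11-30 16:00:00-08:00
```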
Technical
Some details about the bot:
- It's currently running as User:InductiveBot, but it doesn't have to be. It is currently operated by User:Inductiveload.
- The source code is in Python and is found here: https://gitlab.wikimedia.org/inductiveload/iltools/-/tree/master/tools/ws-statistician
- This tool runs on Toolforge: https://admin.toolforge.org/tool/ws-statistician
- There are two parts to the tool:
  - The first part gathers stats by looking at the revisions associated with each page of each index found in the current MC month's Lua data table and creates its own database of revisions and statuses. This runs about every 15 minutes.
  - The second part gathers data from that database, constructs summaries of the MC progress and saves them as on-wiki Lua data for ingestion by other modules and templates. This runs about every 2 hours (chosen to avoid too much edit spam):
    - Daily tracking of the progress of the MC: e.g. Module:Monthly Challenge daily stats/data/2021-09
    - Per-index stats at the time of the run: e.g. Module:Monthly Challenge category stats/data/2021-09 (which are less useful now, since the MediaWiki Lua API allows access to most of this data)
    - Categorisation of current Indexes into categories under Category:Monthly Challenge for easy listing in templates and by other tools
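Saving summaries "as on-wiki Lua data" means turning the computed figures into the text of a Lua module that returns a table. The following is a minimal sketch of such a serialiser, not the code used by ws-statistician: the function names, the table layout and the field names (`q3`, `q4`) in the example are assumptions made for illustration.

```python
def to_lua(value, indent=0):
    """Serialise a Python value as a Lua literal (hypothetical helper).

    Supports None, bool, int, float, str, list/tuple and dict. Non-finite
    floats are not handled.
    """
    pad = "    " * indent
    inner = "    " * (indent + 1)
    if value is None:
        return "nil"
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"'
    if isinstance(value, (list, tuple)):
        if not value:
            return "{}"
        items = [f"{inner}{to_lua(v, indent + 1)}," for v in value]
        return "{\n" + "\n".join(items) + "\n" + pad + "}"
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            f"{inner}[{to_lua(k)}] = {to_lua(v, indent + 1)},"
            for k, v in value.items()
        ]
        return "{\n" + "\n".join(items) + "\n" + pad + "}"
    raise TypeError(f"cannot serialise {type(value).__name__}")


def lua_module(data):
    """Wrap the data in a `return` statement, as a Lua data module expects."""
    return "return " + to_lua(data) + "\n"


# Illustrative data only; the real tables may be laid out differently.
print(lua_module({"2021-09-01": {"q3": 12, "q4": 5}}))
```

The resulting text can be loaded from another module with `mw.loadData`, which is how Scribunto modules normally read data tables of this kind.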
Issue resolution
- If you see that the bot has categorised an Index: page, that means the index has been added to the monthly data list (see Challenge works above). Once the work is removed from that data list, the bot will remove the categorisation.