User:SDrewthbot/trim trailing LF


trim trailing LF


An amendment to the code of the Proofread Page extension fixed one problem and caused another. Rather than stop proofreading or cause too much confusion for users, it was suggested that a bot correct the error quietly, and the bot operation linked to here is that fix. If you are interested in the lead-in discussion, please follow the reference link. This bot will continue to run this task on a regular basis until the MediaWiki application has been fixed. billinghurst sDrewth 05:40, 19 October 2011 (UTC)

Synopsis


The data below relates to fixes initially undertaken for enWS and then, per a request, implemented at laWS. Subsequently there was a mulWS request for this fix to be applied more widely through the xxWS community. … "cross-subdomain bot repairing bug 26028 per oldwikisource:Wikisource:Scriptorium"

The bot will run through the broader WS wiki space under this bot account, though without a bot right, slowly fixing these edits. Note that the fr, de and it communities will not be part of the fix, because local fixes are being undertaken there.

The query being run to determine which Page: namespace pages were edited:

  • https://xx.wikisource.org/w/api.php?action=query&list=recentchanges&rcnamespace=104&rcstart=2011-10-28T21:00:00Z&rcend=2011-10-16T23:00:00Z&rcshow=!bot&rclimit=500&format=xml

Where the Page: namespace is not 104, the rcnamespace value will be modified as required. Runs will appear below as they are undertaken.
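As a rough illustration of how the query varies per wiki, the sketch below assembles the same API URL from its parts. The function name `recentchanges_url` and its arguments are hypothetical and not part of the bot's actual tooling; only the query parameters themselves come from the URL above.

```python
from urllib.parse import urlencode


def recentchanges_url(subdomain, rcstart, rcend, namespace=104):
    """Build the recentchanges query for one Wikisource subdomain.

    Hypothetical helper: 104 is the Page: namespace number used above,
    and callers pass another number where a wiki differs.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": namespace,
        "rcstart": rcstart,   # later bound of the window
        "rcend": rcend,       # earlier bound of the window
        "rcshow": "!bot",     # skip edits already flagged as bot edits
        "rclimit": 500,
        "format": "xml",
    }
    # Keep ':' and '!' unescaped so the URL reads like the one above.
    query = urlencode(params, safe=":!")
    return f"https://{subdomain}.wikisource.org/w/api.php?{query}"


# Example: the br run listed under Runs used namespace 102 instead of 104.
url = recentchanges_url(
    "br", "2011-10-28T21:00:00Z", "2011-10-16T23:00:00Z", namespace=102
)
print(url)
```

Only the subdomain, the namespace number and the two timestamps change between runs; the remaining parameters stay fixed.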

Technical


The query to grab edits is based on a query of the API, where the time is varied to get the next group of edits. This query is used with w:Wikipedia:AutoWikiBrowser. To make the list, utilise the HTML Scraper (with advanced Regex): the URL is plugged in and the filter used is title="([^"]+) with group set to 1. Note that the AWB HTML scraper will grab a maximum of 500 hits, so if the list is going to be bigger than 500, then shorter time periods should be searched in spans to build the full list.
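The list-making step can be approximated outside AWB. The sketch below applies the same filter, title="([^"]+) with group 1, to a response body. The sample XML is invented for illustration, and the de-duplication is an assumption about what is wanted rather than documented AWB behaviour.

```python
import re
from html import unescape

# Same filter as used in the HTML Scraper; group 1 is the page title.
TITLE_RE = re.compile(r'title="([^"]+)')


def page_titles(api_xml):
    """Return the distinct page titles found in an API response, in order."""
    titles = []
    for raw in TITLE_RE.findall(api_xml):
        title = unescape(raw)  # e.g. &amp; becomes &
        if title not in titles:
            titles.append(title)
    return titles


# Hypothetical response fragment; a page edited twice appears twice.
sample = (
    '<rc type="edit" ns="104" title="Page:Example.djvu/12" />'
    '<rc type="edit" ns="104" title="Page:Example.djvu/13" />'
    '<rc type="edit" ns="104" title="Page:Example.djvu/12" />'
)
print(page_titles(sample))
```

Because each request returns at most 500 entries, a busy period would need several shorter time windows, with the resulting lists concatenated.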

The replacement to undertake is

  • \s?\n(\<noinclude\>) with regex set to yes, replaced with $1
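For readers who want to try the replacement outside AWB, here is a minimal Python equivalent. AWB uses .NET regular expressions, where the replacement is written $1; in Python the same back-reference is written \1. The function name and the sample page text are hypothetical.

```python
import re

# Optional whitespace character plus a line feed directly before <noinclude>.
TRAILING_LF = re.compile(r"\s?\n(<noinclude>)")


def trim_trailing_lf(wikitext):
    """Drop the stray line feed that precedes a <noinclude> tag."""
    return TRAILING_LF.sub(r"\1", wikitext)


# Hypothetical page body with the unwanted trailing line feed.
before = "Last line of the page body\n<noinclude><references/></noinclude>"
after = trim_trailing_lf(before)
print(repr(after))
```

Text that has no line feed before the tag is left unchanged, so running the replacement twice over the same page is harmless.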

Update: As it was noted that it would be preferable for there to be a line feed before <references/>, I will add one in. billinghurst sDrewth 13:07, 29 October 2011 (UTC)

Note that at enWS the first error noted was at 23:13 UTC, 16 October 2011, which equates to &rcend=2011-10-16T23:00:00Z, and the fix occurred sometime prior to 21:21, 28 October 2011, which equates to &rcstart=2011-10-28T21:00:00Z.

[toolserver.org/~phe/statistics.php?diff=14&daysago=0]

Difference between Sat Oct 15 and Sat Oct 29

Columns: language; Page namespace (all pages, not proofread, problematic, without text, proofread, validated); Main namespace (all pages, with scans, without scans, disambiguation, percent)
fr8499207436565766148482073276120.36
en3205303283152559531106090215170.29
de784-298070101214702652184700.20
es1122-25-442110921094672700.11
it1248529104525632583150-2010.11
pl533265-20270703114117-410.31
sv30-76-10107233000.04
no218-140922353842-400.38
ca1840001847013-1300.52
ru1175-3209511980150820100.08
hy41233501766975631022.03
da2241500173043100.09
vec00000500000.00
pt12-10009090-0.00
br28-44056772827100.15
sl77-487-10565048504850-0.00
old5320150029191000.14
la24620900370-30-300.00
hr0000004040-0.00
hu000000130130-0.00
et00000000000.00
id0000003030-0.01
vi00000011601160-0.02
el00000015801571-0.01
zh0000001250127-2-0.00
te000000600600-0.00
he000000500500-0.00
total169812930123117212756476245612406212332

Runs


en

  1. 18 Oct c.1200 UTC
  2. 21 Oct c.1200 UTC
  3. 22 Oct c.1500 UTC … &rcstart=2011-10-22T15:00:00Z&rcend=2011-10-21T12:00:00Z
  4. 23 Oct c.1415 UTC … &rcstart=2011-10-23T14:00:00Z&rcend=2011-10-22T15:00:00Z
  5. 25 Oct c.1130 UTC … &rcstart=2011-10-25T10:00:00Z&rcend=2011-10-23T14:00:00Z
  6. 26 Oct c.1200 UTC … &rcstart=2011-10-26T12:00:00Z&rcend=2011-10-25T10:00:00Z
  7. 28 Oct 1000 UTC … &rcstart=2011-10-28T10:00:00Z&rcend=2011-10-26T12:00:00Z
  8. 28 Oct 2100 UTC … &rcstart=2011-10-28T21:00:00Z&rcend=2011-10-28T10:00:00Z

Done billinghurst sDrewth 21:24, 28 October 2011 (UTC)
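The en runs chain together: each run's rcend is the previous run's rcstart, so no edits fall between windows. A small sketch of that bookkeeping follows, using the timestamps from runs 3 to 8 above. The helper name `run_windows` is hypothetical and this is not how the runs were actually scheduled.

```python
def run_windows(checkpoints):
    """Pair consecutive checkpoints into (rcstart, rcend) windows.

    The API lists recent changes newest first by default, so rcstart is
    the later timestamp and rcend the earlier one.
    """
    return [
        (later, earlier)
        for earlier, later in zip(checkpoints, checkpoints[1:])
    ]


# Boundaries taken from the en runs listed above, oldest first.
checkpoints = [
    "2011-10-21T12:00:00Z",
    "2011-10-22T15:00:00Z",
    "2011-10-23T14:00:00Z",
    "2011-10-25T10:00:00Z",
    "2011-10-26T12:00:00Z",
    "2011-10-28T10:00:00Z",
    "2011-10-28T21:00:00Z",
]
windows = run_windows(checkpoints)
for rcstart, rcend in windows:
    print(f"&rcstart={rcstart}&rcend={rcend}")
```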

la

  1. 25 Oct c.1130 UTC … &rcstart=2011-10-25T10:00:00Z&rcend=2011-10-16T22:00:00Z
  2. 29 Oct 0100 UTC … &rcstart=2011-10-28T21:00:00Z&rcend=2011-10-25T10:00:00Z

Done billinghurst sDrewth 03:34, 29 October 2011 (UTC)

mul

  1. 29 Oct 1430 UTC … &rcstart=2011-10-28T21:00:00Z&rcend=2011-10-16T23:00:00Z

Done

br

  1. 29 Oct 1530 UTC … &rcnamespace=102&rcstart=2011-10-28T21:00:00Z&rcend=2011-10-16T23:00:00Z

Done billinghurst sDrewth 04:20, 30 October 2011 (UTC)

sl

  1. 30 Oct 0400 UTC …

Done

no


31 Oct 1130 UTC … https://no.wikisource.org/w/api.php?action=query&list=recentchanges&rcnamespace=104&rcstart=2011-10-28T21:00:00Z&rcend=2011-10-16T23:00:00Z&rcshow=!bot&rclimit=500&format=xml
Done billinghurst sDrewth 14:47, 31 October 2011 (UTC)

hy


31 Oct 1200 UTC …

Done 14:47, 31 October 2011 (UTC)

ca


1 Nov 0245 UTC …

Done billinghurst sDrewth 09:48, 1 November 2011 (UTC)

ru


1 Nov 1000 UTC


pl


Not done 1400+ pages, approaching community for approval for bot. — billinghurst sDrewth 10:17, 1 November 2011 (UTC)

Not required. Undertaken by pl:User:AkBot. billinghurst sDrewth 10:35, 1 November 2011 (UTC)

es


4 Nov 1000 UTC

Not done: 1700+ pages; approaching the community for approval for the bot, or for them to fix the pages themselves. billinghurst sDrewth 11:01, 1 November 2011 (UTC)

Done. Bot status granted, work undertaken.

The following discussion is closed and will soon be archived.


All wikis completed by respective communities, otherwise undertaken by SDrewthbot. Done billinghurst sDrewth 13:27, 4 November 2011 (UTC)