Wikisource:Scriptorium/Archives/2019-01


Antony and Cleopatra (1921) Yale

The File:Antony and Cleopatra (1921) Yale.djvu has two duplicated pages. Pages 13 and 14 of the DjVu file need to be removed so that the following pages (and OCR) are shifted to compensate for the removal. --EncycloPetey (talk) 00:11, 1 January 2019 (UTC)

Done -Einstein95 (talk) 12:39, 5 January 2019 (UTC)

{{Author}} and multiple images

I came across Author:Edward Carpenter, which has 2 images in its Wikidata item: File:Carpenter1875.jpg and File:Day, Fred Holland (1864–1933) - Edward Carpenter.jpg. The current {{Author}} template interprets this as one image: File:Carpenter1875.jpg, Day, Fred Holland (1864–1933) - Edward Carpenter.jpg. As the Author template is locked to admin-only editing, can this be fixed so only one of these images gets shown rather than a red link? -Einstein95 (talk) 10:11, 5 January 2019 (UTC)

Change the "image" (P18) priority. Reverse this if you prefer the other picture! 114.73.65.234 11:17, 5 January 2019 (UTC)

Occult books digitized

FYI: "1,600 Occult Books Now Digitized & Put Online, Thanks to the Ritman Library and Da Vinci Code Author Dan Brown". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:59, 3 January 2019 (UTC)

Anything "masonic" amongst that? ShakespeareFan00 (talk) 17:44, 4 January 2019 (UTC)
@ShakespeareFan00: You mean like these?
-Einstein95 (talk) 10:41, 5 January 2019 (UTC)
Yes, assuming they are out of copyright globally. ShakespeareFan00 (talk) 10:42, 5 January 2019 (UTC)
@ShakespeareFan00: Having lost my original edit where I went into detail about the copyright status on each item, I'll give a brief summary:
  • The religious and masonic symbolism of stones - {{pd/1923}}
  • Freemasonry: its outward and visible signs - {{pd/1923}}
  • Knight Templar Masonry - {{pd/1923}}
  • Some deeper aspects of masonic symbolism - {{pd/1923}}
  • Cabul, or Christian symbolism in freemasonry - Unknown, no year given
  • An encyclopedic outline of masonic, hermetic, cabbalistic and rosicrucian symbolical philosophy - {{PD-US-no renewal}} (a later edition was renewed, but not this one)
-Einstein95 (talk) 20:42, 7 January 2019 (UTC)
Okay, did you check author dates for 70 pma jurisdictions? ShakespeareFan00 (talk) 22:52, 7 January 2019 (UTC)

My researches say:-

  • William Wynn Westcott (d. 1925) British.
  • Unknown author, publication of British origin?
  • Unknown author (a note in the fly-leaf sheets gives an 18th-century date). Transcription, as this appears to be a handwritten manuscript.
  • Arthur Edward Waite (2 October 1857 – 19 May 1942)
  • Cristopher C. Gill, No date on publication
  • Manly Palmer Hall (March 18, 1901 – August 29, 1990) (So not yet out of copyright per 70 p.m.a rules.)

So that to me says only one of those could reasonably be uploaded to Commons. Oh well. ShakespeareFan00 (talk) 23:06, 7 January 2019 (UTC)

templatestyles inserting paragraph breaks

I've found that {{nowrap}}, after being edited to implement the <templatestyles /> extension, now inserts a paragraph break on Page:Chesterton--The Napoleon of Notting Hill.djvu/63 in the text:

"Auberon! for goodness' sake {{...}}" cried Barker

If the template is rolled back to before this change, it displays fine. -Einstein95 (talk) 04:20, 7 January 2019 (UTC)

@Einstein95: This looks like the issue tracked in Phabricator (linked in the box above). The problem seems to be that TemplateStyles is spitting out a <style /> or <link /> element that isn't actually permitted inside a <p></p>-paragraph. The HTML parser therefore closes the paragraph, inserts the styles, and then opens a new paragraph for the following content.
This is going to affect all templates using TemplateStyles whose notional role is "inline", but will only be visible in some contexts. For example, using {{nowrap}} at the natural end of a paragraph will technically exhibit this problem, but it will be invisible because the erroneous end of the paragraph will coincide with the actual end of the paragraph. Uses of {{nowrap}} in text that, for whatever reason, is wrapped in <div></div> instead of <p></p> will work just fine. And so forth.
The only immediate solution I see is to revert {{nowrap}} to the previous revision which uses inline styles until the WMF developers come up with something more permanent (I've suggested a few possible solutions in Phab, but they may be unworkable and will certainly take a while to implement even if they work and the devs like them). And absent loud protests I've reverted the template for now. --Xover (talk) 11:41, 7 January 2019 (UTC)
Thank you very much for that, I was unaware that there was an open Phabricator bug. I'm guessing there's a number of templates that should hold off having these. -Einstein95 (talk) 20:37, 7 January 2019 (UTC)
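The behaviour Xover describes can be illustrated with a toy model. This is only a sketch under stated assumptions: the `render` function, the token format and the CSS string are hypothetical and are not the MediaWiki parser or the real {{nowrap}} stylesheet. It assumes, as described above, that a style element is not allowed inside a `<p>` and so forces the open paragraph to close, while a `<div>` wrapper accepts it.

```python
def render(tokens, wrapper="p"):
    """Toy model of inline content rendered inside <p> or <div>.

    tokens is a list of (kind, value) pairs, kind being "text" or "style".
    Assumption: a style element may not sit inside <p>, so the open
    paragraph is closed before it and a new one opens for what follows.
    """
    out = []
    is_open = False
    for kind, value in tokens:
        if kind == "style" and wrapper == "p":
            if is_open:
                out.append("</p>")  # premature end of the paragraph
                is_open = False
            out.append(f"<style>{value}</style>")
        else:
            if not is_open:
                out.append(f"<{wrapper}>")
                is_open = True
            out.append(f"<style>{value}</style>" if kind == "style" else value)
    if is_open:
        out.append(f"</{wrapper}>")
    return "".join(out)


# Placeholder CSS; the real stylesheet is not shown in this thread.
tokens = [
    ("text", "Auberon! "),
    ("style", ".nowrap{white-space:nowrap}"),
    ("text", "for goodness' sake"),
    ("text", " cried Barker"),
]
in_paragraph = render(tokens, wrapper="p")
in_div = render(tokens, wrapper="div")
```

In this model `in_paragraph` contains two `<p>` elements (the visible break), while `in_div` stays a single `<div>`, which matches the observation that the problem does not show up in div-wrapped text.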


Another

Some works called "Another" refer to the title of a previous work. Example: "A Letter to Her Husband, Absent upon Public Employment" is followed by "Another", i.e. another letter to her husband. Another example: "(To His Book) Another" is a poem called "Another" but which was originally preceded by a poem called "To His Book".

In disambiguating poems called "Another", is it wise to use the name of the preceding work as the disambiguation, as the editor of "(To His Book) Another" has done? Is it wise to refer to the name of the preceding work at all? Or should we ignore this unusual feature of works called "Another", and merely disambiguate by first lines as is normally done for poems that share the same name, author, and year of publication? —Beleg Tâl (talk) 22:18, 8 January 2019 (UTC)

Public Domain tags

Is there any plan to move/rename Template:PD/1923 and friends to "PD/US" or something similar since the static 1923 date is now obsolete? Phillipedison1891 (talk) 16:27, 5 January 2019 (UTC)

No. Refer to the discussion near the top of this page, with linked discussion on Commons. --EncycloPetey (talk) 16:41, 5 January 2019 (UTC)
I think you mean #Adapting Template:pd/1996 or a new template, yes? -Pete (talk) 20:17, 5 January 2019 (UTC)
Yes. As the discussion points out, it's not that the 1923 date is "obsolete", but that there is an emergent new group of works in PD through expiry of copyright renewals this year. --EncycloPetey (talk) 20:21, 5 January 2019 (UTC)
I have gone ahead and edited Help:Public Domain to use a {{#expr:{{CURRENTYEAR}}-95}} expression in place of the 1923 date. This is consistent with both the Hirtle chart and the apparent plain meaning of 17 USC §304(b). I understand that EncycloPetey is wary about doing this, but I'm afraid I do not understand. Is there someone else who does? Phillipedison1891 (talk) 16:29, 10 January 2019 (UTC)
The pre-1923 date applies to works regardless of all other considerations. Works published in 1923 that entered the public domain this year were works that had registered a US copyright and renewed it, but the renewal has now expired. The end result is public domain, but the reasons are very different, so we have discussed using a new template to mark these cases instead of adjusting an existing template.
But any conversation on this topic should happen in the previously existing thread instead of here. There is no reason to run two simultaneous threads on the same topic. --EncycloPetey (talk) 17:24, 10 January 2019 (UTC)
I am continuing this section just so I can try to clarify and understand what you are saying. For the past several decades, works first published before 1923 have been in the public domain because their copyright renewal had already expired before the act of 1976 took effect. That act, taking into account the 1998 extension, effectively extended the term of 1923 and later works under the old pre-1976 regime to 95 years after publication. It then took 20 years for the 95 year term to "catch up" with the pre-1923 works which were already in the public domain. As of now, however, all works published more than 95 years ago are public domain, and for the same reason - their copyright may have been renewed, but the renewal has subsequently expired.
Given this, I am unclear what you think the benefits would be of separate templates - one for pre-1923 works and one for 1923-[current year - 96] works. Also note that someone (not me!) has already changed Template:PD/1923 and friends to use a dynamic {{#expr:{{CURRENTYEAR}}-95}} date - the only reference to a hard-coded 1923 at this point is in the name. Phillipedison1891 (talk) 18:04, 10 January 2019 (UTC)
I agree; all works published more than 95 years ago are now public domain in the US. Those 1923 works now entering the public domain are no different than the 1922 works that entered the public domain in 1998. They may have had different histories, but that's of no importance to us.--Prosfilaes (talk) 23:34, 10 January 2019 (UTC)
Given this, would anyone object to updating Help:Public Domain to at least reflect the correct criteria to determine whether a work is public domain in the US? EncycloPetey mentioned that appropriate template names are still being discussed, but this seems like an issue that could be addressed separately. Phillipedison1891 (talk) 02:24, 11 January 2019 (UTC)
What do you mean by "reflect the correct criteria"? If you mean "make the same edit" you previously made, then I do object. --EncycloPetey (talk) 02:28, 11 January 2019 (UTC)
I don't understand your objections; 1923 is completely irrelevant to US copyright law now. It was only a line because works published between 95 years ago and 1923 were grandfathered out of the copyright extensions; there's no reason to count them separately now.--Prosfilaes (talk) 02:52, 11 January 2019 (UTC)
But some of your edits include works published in 1924, which is still relevant for US copyright, yes? Changing the recommendations and changing the templates will affect some works we previously hosted. So what is your plan for identifying and correcting the templates on those works to meet the recommendation changes?
It is astounding to me that a thread on this topic has sat around virtually un-commented upon for months, and suddenly everyone thinks a crisis has arrived and immediate action must take place. How about we discuss and do things right the first time for a change? --EncycloPetey (talk) 03:04, 11 January 2019 (UTC)
What works are you talking about? If you want to discuss, you need to discuss. I'm not worried about getting things perfectly right the first time; that's often how things never get done.--Prosfilaes (talk) 03:18, 11 January 2019 (UTC)
Any time you want to read the discussion on this topic, and join in, feel free to do so. I'm not going to duplicate the discussion here. But it's pretty hypocritical to tell me I need to discuss, not participate in the discussion, and act without discussion yourself. --EncycloPetey (talk) 03:39, 11 January 2019 (UTC)
This is the only place on this page that mentions Help:Public domain. The discussions about that page aren't completely dependent on other changes to the templates.--Prosfilaes (talk) 06:17, 11 January 2019 (UTC)
"But it's pretty hypocritical to tell me I need to discuss" = welcome to typical wikimedia admin behavior: abrupt, censorious, non-consensus. Slowking4SvG's revenge 03:55, 12 January 2019 (UTC)
  • Bucket of cold water time. Be very careful. Out of copyright in USA does not mean out of copyright worldwide.
w:Siegfried Sassoon's 1923 poetry collection Recollections may be about to lose copyright protection in USA, but it will remain in copyright in life+70 countries until the end of 2037. Narky Blert (talk) 22:59, 13 January 2019 (UTC)
Yes, we know that. The English Wikisource only worries about status in the US. Users need to check the details if they are worried about copyright status outside of the US.--Prosfilaes (talk) 03:07, 14 January 2019 (UTC)
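The two rules contrasted in this thread can be written out as arithmetic. What follows is a minimal sketch, assuming the simplified readings given above: the US test is "published more than 95 years ago", as in the {{#expr:{{CURRENTYEAR}}-95}} expression, and a life+70 term runs to the end of the 70th calendar year after the author's death. The function names are hypothetical, Sassoon's death year (1967) is not stated in the thread, and the sketch ignores renewal, notice and unpublished-work rules.

```python
def us_cutoff_year(current_year, term=95):
    """Mirror of {{#expr:{{CURRENTYEAR}}-95}}: works first published
    before the returned year pass the simplified 95-year test."""
    return current_year - term


def is_us_public_domain(publication_year, current_year):
    """Simplified US test: published before the cutoff year."""
    return publication_year < us_cutoff_year(current_year)


def last_protected_year(death_year, term=70):
    """Last calendar year of protection under a life+term regime."""
    return death_year + term


print(us_cutoff_year(2019))             # 1924
print(is_us_public_domain(1923, 2019))  # True
print(is_us_public_domain(1924, 2019))  # False
print(last_protected_year(1967))        # 2037 (Sassoon, as noted above)
```

Under these assumptions a 1923 publication passes the US test in 2019 while a 1924 one does not, and a work by an author who died in 1967 stays protected in life+70 countries through 2037, which is the gap Narky Blert points out.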

From outreach:GLAM/Newsletter/December_2018/Contents/Australia_report, by User:Pru.mitchell, President of Wikimedia Australia:

2019 Australia's Year of the Public Domain

What a happy new year! Changes to Australian copyright law taking effect from 1 January 2019 (Copyright Amendment (Disability Access and other Measures) Act 2017) mean at last we have new material entering the public domain. A key change is that unpublished materials such as diaries, letters and shipping manifests will be subject to the same copyright term as their published counterparts, and orphan works have been given a term of 70 years.

Wikimedia Australia looks forward to working with GLAM partners and copyright organisations through the year to maximise awareness and use of these resources.

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:01, 12 January 2019 (UTC)

Note, this does not seem to affect either {{PD-Australia}} or {{PD-AustraliaGov}} —Beleg Tâl (talk) 16:19, 12 January 2019 (UTC)
the change to unpublished, would suggest a review of deleted australian files over 70 years old. and search for unpublished at internet archive, and at GLAMs. Slowking4SvG's revenge 15:50, 14 January 2019 (UTC)

Missing document in series — Acceptable solution?

I have been adding a serialized set of U.S. government documents to WS: the weekly announcement of listings in the National Register of Historic Places. The source of scans for this is a set of PDFs available on the National Park Service website. In the header template for each issue, I have been providing appropriate links to the previous and next issues in the series.

Recently I came to a gap in the series, where it appears that the original paper document from 1984 did not get scanned with the others. I know this is a gap in the series because:

  1. The body of each issue states a date range covered by that issue. I have the issues with the previous and next date ranges, and a gap between.
  2. I have identified from other databases some of the content that would almost certainly have been in the missing issue.

I do not at this time know how to find the missing issue, or if a copy even exists any more.

I have been unable to find any WS style guidance covering this circumstance. To cover the gap in the sequence, I created this item in the mainspace: a placeholder providing the editorial comment that a document is probably missing from WS's reproduction of the series. I felt this was preferable to giving the impression that WS is providing a complete series by linking directly from the previous issue to the next issue with no acknowledgement in between. If the missing document is ever located, I assume the placeholder will be deleted and replaced with an appropriate transcription.

Does this solution work for the community? Is there some contrary style guidance that I missed? Ipoellet (talk) 01:02, 13 January 2019 (UTC)

@Ipoellet: Well, I won't speak for anyone but myself, but this approach seems eminently reasonable to me. However, why not simply contact the NRHP and ask about the missing list? They clearly intend to publish it, and it is both public information, in the FOIA sense, and public domain (in the licensing sense). They'll probably appreciate being made aware of the gap. --Xover (talk) 10:35, 13 January 2019 (UTC)
I probably will contact them, but I don't hold out a ton of hope. I've contacted them about missing pages (not whole issues) before, and their response was dismissive of finding copies of the Weekly List in particular. They simply referred me to a different publication series (the Federal Register) where I should be able to find the same content but in a different format. I have developed the vague sense that the original paper Weekly Lists are simply lost, or so deeply buried in archived files as to amount to nearly the same thing, and that the posted PDFs are all they have now. Ipoellet (talk) 21:14, 13 January 2019 (UTC)
i agree. we are living with open government = pdf. and we care about the data more than they do. we have not had events with NPS yet. NPS records are at NARA, maybe we could do some legwork and find it. [5]; [6] they are closed now but we could add it to the list, and ping dominic. -- Slowking4SvG's revenge 16:14, 14 January 2019 (UTC)
@Ipoellet: The usual practice is to simply omit the missing document, leaving red links to the place where the document will be added when a source is found. —Beleg Tâl (talk) 14:12, 13 January 2019 (UTC)
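The date-range check Ipoellet describes in point 1 above (each issue states the range it covers, so a missing issue shows up as uncovered days between neighbours) could be sketched as follows. This is an illustration only: the `find_gaps` helper is hypothetical and the dates are invented placeholders, not the actual Weekly List ranges.

```python
from datetime import date, timedelta


def find_gaps(ranges):
    """Return (first_missing_day, last_missing_day) pairs between
    consecutive issues, given (start, end) date ranges per issue."""
    gaps = []
    ordered = sorted(ranges)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        expected_start = prev_end + timedelta(days=1)
        if next_start > expected_start:
            gaps.append((expected_start, next_start - timedelta(days=1)))
    return gaps


# Invented example ranges: three weekly issues with one week missing.
issues = [
    (date(1984, 1, 1), date(1984, 1, 7)),
    (date(1984, 1, 8), date(1984, 1, 14)),
    (date(1984, 1, 22), date(1984, 1, 28)),
]
gaps = find_gaps(issues)
```

Each reported gap marks a span where a placeholder page or a red link, whichever the community prefers, would go.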


Failed mass message delivery

In the mass message log there is a failed delivery of "Delivery of "No editing for 30 minutes 17 January" to Wikisource:Scriptorium failed with an error code of readonly". Does anyone know what this is and how it applies to us (if at all)? --EncycloPetey (talk) 17:32, 16 January 2019 (UTC)

See below! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:55, 16 January 2019 (UTC)

No editing for 30 minutes on 17 January

You will not be able to edit the wikis for up to 30 minutes on 17 January 07:00 UTC. This is because of a database problem that has to be fixed immediately. You can still read the wikis. Some wikis are not affected. They don't get this message. You can see which wikis are not affected on this page. Most wikis are affected. The time you can not edit might be shorter than 30 minutes. /Johan (WMF)

18:38, 16 January 2019 (UTC)

Image display issue with pdf files

A new problem has come up with pdf files, which was not there earlier. Ocr-ed pdfs from Internet Archive, or those created with ABBYY FineReader, have display problems in Commons, as we already know. But now, pdf files from HathiTrust and Google Books (i.e., books digitised by Google) have display problems in Commons/Wikisource with images, but not with text. Example from Google Books: https://bn.wikisource.org/s/gd92. Example from HathiTrust: c:File:A_Lost_Lady,_by_Willa_Cather.pdf. You can see from the thumbnails that the cover of the first upload was not visible, being an image. Other image pages were not visible either, except "digitized by Google" in the footer. So I had to repair the file. Therefore, if you are proofreading a work digitised by Google and you encounter a vacant space where an image could be present, please consult the original source file. Hrishikes (talk) 10:59, 15 January 2019 (UTC)

it might be a better practice to take hathitrust over to Internet archive first, and use IAuploader rather than upload wizard. (and here would be another candidate [14]) Slowking4SvG's revenge 20:46, 15 January 2019 (UTC)
I've seen this a lot with older books scanned by Google, both PDF and DJVU, many of which are available at IA and Hathitrust. I've recently tried to only use colour scans, assuming that b/w scans may have this and other similar issues. —Beleg Tâl (talk) 00:49, 16 January 2019 (UTC)
Google removes public domain images (but not simple drawings), from their donations to IA because they hope that some day they can be copyrighted.— Ineuw talk 04:08, 16 January 2019 (UTC)
Do you have a source for that? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:56, 16 January 2019 (UTC)
i’m sure they would say "out an abundance of caution" and "precautionary principle", we preemptively censored the images. see also [15] no need to project false hopes. and google books has become a cul-de-sac [16] Slowking4SvG's revenge 03:30, 17 January 2019 (UTC)
Actually, I had posted about a different issue. In my examples, the images are actually present in the books (see here) but not displayed in Commons (see here). This is a display issue of Mediawiki Pdf Handler. This occurs in ocr-ed pdfs. If ocr layer is removed, there is no display problem. If the file is downloaded from Commons, the image will be visible on offline viewing. Hrishikes (talk) 05:02, 17 January 2019 (UTC)

Index:The History of Ink.djvu

Anyone good with images? Extraction of the "plates" seems to be what's temporarily stalling this. ShakespeareFan00 (talk) 11:06, 17 January 2019 (UTC)

On it —Beleg Tâl (talk) 16:05, 17 January 2019 (UTC)
@ShakespeareFan00: commons:Category:The History of Ink (1860). Will do the rest within the next few days. —Beleg Tâl (talk) 18:08, 17 January 2019 (UTC)

50% scan-backed

At some point in the past few days we have quietly reached 50% of our mainspace pages transcluded from proofread scans. This has been achieved through a combination of adding new works via the ProofreadPage process and finding scans to back up works that we have acquired through other means over the years. Beeswaxcandle (talk) 04:27, 5 January 2019 (UTC)

That's great news! Tweetworthy, I'd say. Is there somebody to nudge about that? (I see no recent tweets on either @wikisource or @wikisource_en.) It would be good to call attention to our scan-backed standards. A short blog post might be nice, too... -Pete (talk) 20:58, 15 January 2019 (UTC)
Terrific news. :) I've made a tweet: https://twitter.com/wikisource_en/status/1085301928966316032 (And sorry for the recent silence on the Twitter front; been a busy few months.) Sam Wilson 22:26, 15 January 2019 (UTC)
a better milestone will be a million proofread pages, which will happen in a month or two. Slowking4SvG's revenge 04:56, 18 January 2019 (UTC)

FileExporter beta feature

Johanna Strodt (WMDE) 09:41, 14 January 2019 (UTC)

Files can also be transferred to Commons with FTCG. It contains original upload log, but not original edit history. Original file also gets deleted if user has admin privilege in source wiki. Hrishikes (talk) 08:40, 19 January 2019 (UTC)


Pictures in Rhymes

Happy New Year.
Can this work be uploaded without the watermarks? Meaning can it be? And can someone do it? Thank you.
Pictures In Rhyme by Kennedy, Arthur Clark; Greiffenhagen, Maurice, 1862-1931

Publication date 1891
Usage Public Domain Mark 1.0
Topics Poetry
Collection folkscanomy; additional_collections
Language English
CONTENTS
etc.
--Level C (talk) 16:07, 2 January 2019 (UTC)

Books aren't "uploaded" to Wikisource. Books this old, which are in public domain in their home country, are uploaded to Commons. And no, the watermarks cannot be stripped from the images. However, a Wikisource copy built from this scan would not transcribe the watermarks. --16:50, 2 January 2019 (UTC)
Thanks for the info! --Level C (talk) 23:49, 22 January 2019 (UTC)
ok here you go, i will fix up the index tomorrow. c:File:Pictures In Rhyme.djvu; Index:Pictures In Rhyme.djvu. -- user:slowking4 16:01, 17 January 2019 (UTC)
Thank you for your work. This is awesome. --Level C (talk) 23:49, 22 January 2019 (UTC)

Changes to {{Lang}}

With reference to Wikisource:Scriptorium#Language_tagging, I made an attempt to modify {{Lang}}, adding inline=yes as the default option, which generates a <span></span>, and inline=no to generate a <div></div> instead.
I would very much appreciate opinions on the choice of names and the proposed coding, and a review from someone expert in templates and from the community in general. Thanks— Mpaa (talk) 21:35, 14 January 2019 (UTC)

Better have the discussion at the template talk page maybe. I posted here the news for visibility.— Mpaa (talk) 21:52, 14 January 2019 (UTC)
It might be simpler to have two templates, like we have for all other similar situations: {lang} for inline (span) and {lang block} for (div). This way the template won't require an extra parameter or template burden in lengthy documents. This is what we already do with {{smaller}} and {{smaller block}}, for example. --EncycloPetey (talk) 22:40, 14 January 2019 (UTC)
Someone in previous thread preferred one single template. I am more inclined to your approach. Let's see a few more comments.— Mpaa (talk) 23:16, 15 January 2019 (UTC)
I also am inclined to {{lang}} and {{lang block}} as separate templates. They can both share common inner workings if desired. —Beleg Tâl (talk) 00:46, 16 January 2019 (UTC)
@Mpaa:I agree with two separate templates as well. --Jan Kameníček (talk) 13:02, 17 January 2019 (UTC)
Separate templates, Would it be possible to apply something like this to the iwtrans templates as well? ShakespeareFan00 (talk) 14:48, 17 January 2019 (UTC)
@Mpaa: Provided you mean Andy's comment in the thread above, and provided I understood them correctly, the objection was to duplicating code in two separate templates: having one template that supports both modes, and a lightweight wrapper template that just invokes the full template with the right parameter(s) set, is wholly in agreement with that stance. From an end-user perspective it's two different templates; but there's only one set of template code to maintain. I do not believe anyone has actually objected to having separate templates in the end-user sense. --Xover (talk) 14:55, 17 January 2019 (UTC)
But having one template that calls another template. . . isn't that increasing template burden again? This is a template that has the potential to be transcluded thousands of times in a single work, or even in a single Mainspace page. Template burden is a real concern in that sort of situation. We already have some ToC templates that break in large tables of contents. --EncycloPetey (talk) 15:10, 17 January 2019 (UTC)
In that edge case, use the primary template, with the parameter, not the wrapper. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:25, 17 January 2019 (UTC)
When this template is used, it will be used heavily. We have several current projects that are bilingual dictionaries, glossaries, and the like. I did a work that was a thesis on additions to the standard Greek lexicon, and we have several reference works which include multiple non-English words throughout. So what makes you think that this is an "edge case"? Isn't your proposal effectively to complicate the most complex uses of this template? --EncycloPetey (talk) 19:28, 17 January 2019 (UTC)
@EncycloPetey: So far as I am aware, the two big technical limitations with templates are the recursion limit and the total expanded size, with total number of transclusions on a single page a distant but not insignificant third. I'm not sure what you mean by "template burden", but a template calling a single other template is unlikely to be much different from just simply calling a single template. If there is some issue there that I'm not aware of (absolutely a possibility), I'm sure we can investigate moving the common code into a Lua module that both templates call. Last I heard, the WMF guidance was that performance should never be a consideration for any community discussion: resource limits can be raised, more hardware added, or infrastructure code made more efficient. --Xover (talk) 19:17, 18 January 2019 (UTC)
The WMF guidance you speak of is WP and Commons guidance. Projects like WS and WT are very, very different from the "expected" model based on what happens at WP and Commons. So much so that the usual guidance doesn't apply. --EncycloPetey (talk) 20:59, 18 January 2019 (UTC)
Fair point. But still, based on my (necessarily limited) understanding, I don't see any big problem looming as a result of one template invoking another. The parser and caching should effectively take care of that, and we can always move the common logic to Lua if necessary (I'd say preferably, but I'm not sure everyone here agrees with that). --Xover (talk) 07:24, 19 January 2019 (UTC)
I have added {{lang block}}, which internally calls {{lang}}. If this turns out to be too heavy, it can be refactored appropriately.— Mpaa (talk) 20:16, 20 January 2019 (UTC)
I would really appreciate it if someone could implement the backend logic in Lua (or solve the following issue). The linter on {{lang}} gives an error due to the fact that the {{#ifeq:{{{inline|yes}}}|yes|<span|<div}} does not really create a span tag, as the closing ">" arrives only later. It does not generate errors or functional issues when used, but it is annoying as it might give that impression. Thanks.— Mpaa (talk) 21:20, 20 January 2019 (UTC)
@Mpaa: Where are you seeing this linter error? --Xover (talk) 09:33, 21 January 2019 (UTC)
On the template page itself. I know it has no effect but when trying to understand where a lint error comes from, it is good to have them clean.— Mpaa (talk) 20:14, 22 January 2019 (UTC)
I'm sorry for being dense, but where exactly on the page? I'm not seeing it anywhere on Template:lang, nor when I preview an edit to it. --Xover (talk) 20:36, 22 January 2019 (UTC)
You need to enable the linter checker gadget.— Mpaa (talk) 20:53, 22 January 2019 (UTC)
In terms of terminology, a "Gadget" is something you can go enable in the preferences with a simple checkbox. Based on a little detective work, what you're using is the "user script" w:User:PerfektesChaos/js/lintHint with a little custom configuration (see the "Configuration by JavaScript" section and enable all namespaces).
In any case, I can't really see why the linter is complaining here. While the source of the template is indeed apt to confuse any parser, the linter really shouldn't see it until after it's been preprocessed sufficiently that what it sees is a complete tag pair. Note also that what it's complaining about is the end tag, because it thinks it hasn't seen a matching start tag, and not the start tag itself. Which, unless there's an actual bug in the code that I'm not seeing, is pretty weird behaviour.
As you say, it doesn't show up when transcluded, so I don't think it's something we should worry overmuch about. However, if someone with the right permission bits can import some dependencies and supporting templates/modules from enwp, I can take a stab at reimplementing the common stuff in Lua to avoid the problem. --Xover (talk) 15:32, 23 January 2019 (UTC)
I forgot how I enabled it. When I try to understand what gives an error, spurious messages just confuse. Sometimes it is already hard enough to tame divs and spans ...— Mpaa (talk) 20:38, 23 January 2019 (UTC)
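The arrangement settled on in this thread (one implementation with an inline switch, plus a thin wrapper for the block form) can be sketched outside wikitext. The following is a rough Python analogy, not the template code: the function names and the bare `lang` attribute are assumptions, since the actual output of {{lang}} is not shown here.

```python
from html import escape


def lang(code, text, inline=True):
    """Analogue of {{lang}}: a span by default, a div when inline is False.

    The tag name is chosen once and reused for both the opening and the
    closing tag, so the pair always matches.
    """
    tag = "span" if inline else "div"
    return f'<{tag} lang="{escape(code, quote=True)}">{text}</{tag}>'


def lang_block(code, text):
    """Analogue of {{lang block}}: a wrapper that only sets the parameter."""
    return lang(code, text, inline=False)


inline_html = lang("el", "λόγος")
block_html = lang_block("el", "λόγος")
```

Choosing the tag name in one place is also roughly what a Lua module could do, which would sidestep the split `<span` / `<div` opening tag that the linter complains about.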

Plan S for free content: feedback request

The Plan S open-access initiative aims to make modern academic articles free content. It is requesting feedback about itself on these questions:

  1. Is there anything unclear or are there any issues that have not been addressed by the [Plan S] guidance document?
  2. Are there other mechanisms or requirements funders should consider to foster full and immediate Open Access of research outputs?

It seems to me as though people here may have useful answers. Feedback is open until the 8th of February.

The plan launched in September and has a large proportion of European research funders and a couple of US ones onside; if you are affiliated with a research funder, they might want to look into it. The best comment on Plan S I've heard so far comes from Elsevier (which doesn't really like the financial transparency provisions, for starters). An Elsevier spokesman said "If you think that information should be free of charge, go to Wikipedia" ("Als je vindt dat informatie gratis moet zijn: ga naar Wikipedia"). I'm not sure if he knew about the journals published here.

Is there a good place to point academics who want to ga-naar-Wikipedia? I'm thinking of a how-to for people unfamiliar with wiki authoring who want to, say, post a post-print to Wikisource. Our metadata could make that last very findable, if properly formatted with something like Wikiversity:Template:Article info. HLHJ (talk) 00:32, 21 January 2019 (UTC)

User:Daniel Mietchen? see also m:WikiCite -- Slowking4SvG's revenge 01:18, 22 January 2019 (UTC)
Thanks, Slowking4. A comment is being drafted at Wikiversity:Talk:WikiJournal User Group#Plan S RfC, contribs welcome. HLHJ (talk) 05:48, 27 January 2019 (UTC)

Google Cloud Vision + books

>Google is also providing Wikipedia free access to its ... Cloud Vision API, which will ... let editors automatically digitize books so they can be used to support Wikipedia articles too.
Google Gives Wikimedia Millions—Plus Machine Learning Tools. Wired. 2019-01-22.

Anyone know where I can read more about the WMF/Wikisource's plans with this? czar 12:43, 23 January 2019 (UTC)

Yes:

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:52, 23 January 2019 (UTC)

Wikisourcers already have access to two OCR systems of Google: Google Cloud Vision OCR and Google Drive OCR. The Drive ocr is better. Users can include both on their edit toolbar by customising gadget preference or own common/global.js. Hrishikes (talk) 14:03, 23 January 2019 (UTC)
yes, this was a big event for the Indic languages, less so for english, although works well for marginal pages. you could develop a google books to commons tool, but we currently upload to commons via internet archive. Slowking4SvG's revenge 18:12, 23 January 2019 (UTC)
At present, three ocr systems can be used in English Wikisource: Tesseract 3, Google Cloud Vision, Google Drive. I have all three in my edit toolbar, but the Drive ocr gives the best output. Tesseract 4 has been developed (I have it for offline use), but onsite, it has not been updated. Secondly, books from Google Books can be transferred directly to Commons using url2commons, e.g. this file. Hrishikes (talk) 02:02, 24 January 2019 (UTC)
and you should change to c:Template:Book which has the right metadata fields. information template is limited. Slowking4SvG's revenge 13:58, 27 January 2019 (UTC)
Is there a limit in usage of Google Drive OCR in case one would like to automate the task?— Mpaa (talk) 20:26, 24 January 2019 (UTC)
Yes, some limit is there, but usually that limit is difficult to reach. The OCR4wikisource tool uses Google Drive OCR for automating the task. Thousands of books in various Indic wikisources have been ocr-ed using this tool. Hrishikes (talk) 04:47, 25 January 2019 (UTC)
If WMF are in touch with Google, is there a process where feedback can be sent about incomplete or unintentionally degraded scans present in Google Books, or through which suggested improvements of the OCR process could be suggested? ShakespeareFan00 (talk) 09:13, 27 January 2019 (UTC)

Page:Baron Trump's marvellous underground journey.pdf/22 and others...

The PDF handler has again failed here, and produces a blank page image. I know the page is there because I CAN see it if I grab the PDF directly. Can someone please produce a list of what PDF formats MediaWiki DOES actually support, so these semi-random page display failures can be removed, by down-converting files if needed? ShakespeareFan00 (talk) 13:32, 26 January 2019 (UTC)

Have you reported this on Phabricator? —Beleg Tâl (talk) 14:05, 27 January 2019 (UTC)
I haven't because it's usually the file that needs repair not the handler .. My request for a list of "supported" PDF formats stands. ShakespeareFan00 (talk) 14:09, 27 January 2019 (UTC)
Phab might be able to give you a list of supported formats as well. Idk if anyone here knows that level of detail about mediawiki file support. I sure don't. —Beleg Tâl (talk) 15:27, 27 January 2019 (UTC)

Page numbering fails when defining custom strings in Index page list..

s:Things_Mother_Used_to_Make/Pickles failed to display page numbers unless a very specific format is used when defining the page numbers on the relevant Index: page. Suggestions are welcome. ShakespeareFan00 (talk) 14:16, 27 January 2019 (UTC)

Okay for whatever reason the page-numbering scripting can't apparently cope with simple brackets. What a shame :( ShakespeareFan00 (talk) 14:21, 27 January 2019 (UTC)
"43*" works as a page number "43(*)" doesn't.. Why? ShakespeareFan00 (talk) 14:22, 27 January 2019 (UTC)
Also, if this is indeed a limitation, there's NO warning given when trying to update the Index page concerned. ShakespeareFan00 (talk) 14:23, 27 January 2019 (UTC)

Another example: The Return of Sherlock Holmes, 1905 edition/Chapter 1. Page numbers display up to where a page number with a bracket for an image plate appears, and then they do not. 14:38, 27 January 2019 (UTC)

I presume the numbering system was not designed to function with asterisks or parentheses in page numbers, since those characters are neither numerical, nor are they normally included as part of a page number. Change the "(8)" to "Img" or something else that does not use parentheses. --EncycloPetey (talk) 14:42, 27 January 2019 (UTC)
Right, that's what I was thinking as well. Do you happen to have a list of Index pages that use parentheses inside a <pagelist /> tag? Would it be possible to generate one? ShakespeareFan00 (talk) 14:46, 27 January 2019 (UTC)
It's still broken - The_Return_of_Sherlock_Holmes,_1905_edition/Chapter_1 it picks up the Img tag for Djvu page 21, and the blank for page 22, and then fails to pick up the number for page 23. The script needs to be overhauled. ShakespeareFan00 (talk) 14:53, 27 January 2019 (UTC)
Mentioned at Phabricator - https://phabricator.wikimedia.org/T214797 ShakespeareFan00 (talk) 15:08, 27 January 2019 (UTC)
The workarounds suggested by @Billinghurst: and others in the now closed ticket appear to resolve the issue. However, I'd like a clear indication that these workarounds are the correct ones, and likely to remain stable in the long term, before deploying them over the 5000 or more affected pages. A concern that was noted previously by others was the deployment of "fixes" that hadn't been fully debated. ShakespeareFan00 (talk) 12:28, 28 January 2019 (UTC)
I'm still puzzled, however, given that Mrs. Beeton's Book of Household Management/Chapter I had the same situation of a blank page within the transclusion, but no page numbering display error... Perhaps others can assist in figuring out the more precise circumstances of when the glitch happens?

ShakespeareFan00 (talk) 12:40, 28 January 2019 (UTC)

Well it should all display nicely - https://en.wikisource.org/w/index.php?title=Index%3AMrs_Beeton%27s_Book_of_Household_Management.djvu&type=revision&diff=9073472&oldid=8876133 updated all the non blank pages to have unique id. ShakespeareFan00 (talk) 13:02, 28 January 2019 (UTC)
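The request above for a list of Index pages using parentheses inside a <pagelist /> tag could be approached with a small scan like the one below. This is only a sketch: it assumes the text of the tag has already been fetched by some other means, and the function name and the sample tag are made up for illustration. It flags any label value containing "(" or ")", which is the pattern reported as breaking the page number display.

```python
import re

# Matches one key="value" or key=value attribute inside a <pagelist ... /> tag.
ATTR_RE = re.compile(r'([\w.]+)\s*=\s*(?:"([^"]*)"|([^\s/>]+))')


def labels_with_parentheses(pagelist_tag):
    """Return (page_range, label) pairs whose label contains '(' or ')'."""
    flagged = []
    for match in ATTR_RE.finditer(pagelist_tag):
        key = match.group(1)
        # group(2) is the quoted form, group(3) the unquoted form.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if "(" in value or ")" in value:
            flagged.append((key, value))
    return flagged


# Hypothetical tag, loosely modelled on the image-plate case discussed above.
sample = '<pagelist 1to20="-" 21="(8)" 22="-" 23=9 />'
print(labels_with_parentheses(sample))
```

Running such a check over every Index page would still need a way to enumerate those pages (a dump or a bot account), which is outside this sketch.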

Work by Communist Party of China

Since it has long been unclear to many whether works by the Communist Party of China are copyrightable, I find it necessary to seek clarification and establish consensus. While the CPC may hold copyright to some of its works, those should not include orders, or any document released by the CPC that is of severe public interest. Under US copyright law, do we consider an Edict of the CPC as an Edict of Government, given the status quo of Chinese politics? Considering article 5 of the Copyright Law of the PRC, do we consider some works by the CPC as "other documents of legislative, administrative or judicial nature;"? After the 2018 constitutional amendment, the party leadership is written in; does that mean that copyrighting works by the CPC would be unconstitutional in its nature? Viztor (talk) 19:47, 25 January 2019 (UTC)

Your question would be better directed at a forum on Commons. We have very few editors here who work in Chinese, so the issue of CPC copyright seldom comes up. --EncycloPetey (talk) 02:22, 28 January 2019 (UTC)
Many Chinese Wikisourcers consider works by Communist Party of China possibly copyrightable, being discussed at their scriptorium. If in doubt, leave it out.--Jusjih (talk) 04:26, 28 January 2019 (UTC) (your East Asian cultural bridge)
Chinese Wikisource has zh:Wikisource:English Scriptorium.--Jusjih (talk) 00:15, 29 January 2019 (UTC)

18:15, 28 January 2019 (UTC)

When Father Carves the Duck

There are odd characters appearing in this text. Can anyone identify them and determine why they might have ended up in the text? --EncycloPetey (talk) 18:39, 28 January 2019 (UTC)

@EncycloPetey: Are you sure you have the correct text? This is just standard-issue ASCII to me... —Justin (koavf)TCM 19:57, 28 January 2019 (UTC)
On my Mac it looks like basic ASCII, but on the PC, it has odd-looking characters in almost every line of text. --EncycloPetey (talk) 20:54, 28 January 2019 (UTC)
Yeah, cannot reproduce this. Anybody else? —Justin (koavf)TCM 21:28, 28 January 2019 (UTC)
It's U+2028, see Stackoverflow. I see it on Chrome on PC (the browser is going to be important here.) It's the Unicode specific line separator, but since good old ASCII line breaks stuck around, it doesn't seem to be handled right. I don't know the exact source, but it doesn't smell of attack or corruption (as long as all the line breaks that should be there are), so they can just be deleted.--Prosfilaes (talk) 23:54, 28 January 2019 (UTC)
For anyone who is not seeing this, are you seeing blank lines between every line? It seems wrong for a system to just ignore them, but that's what I'm getting from EncycloPetey's comment about the differences between Mac and PC.--Prosfilaes (talk) 23:57, 28 January 2019 (UTC)
Ooooooh. Shows in Chromium as well. Removed them. Thanks! —Justin (koavf)TCM 23:58, 28 January 2019 (UTC)
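For anyone meeting the same problem in another text, the U+2028 characters identified above can be located and removed with a few lines of Python. This is a minimal sketch: the function names and the sample string are illustrative only, and it assumes the ordinary line breaks are already present, as was the case here. Note that it splits on "\n" explicitly, because Python's str.splitlines() itself treats U+2028 as a line boundary and would hide the character.

```python
LINE_SEPARATOR = "\u2028"  # Unicode LINE SEPARATOR


def find_line_separators(text):
    """Return (line_number, column) for each U+2028 found in text."""
    hits = []
    # Split on "\n" only; str.splitlines() would also split on U+2028.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == LINE_SEPARATOR:
                hits.append((line_no, col))
    return hits


def strip_line_separators(text):
    """Delete every U+2028, leaving ordinary newlines untouched."""
    return text.replace(LINE_SEPARATOR, "")


# Made-up sample with a stray U+2028 before each normal newline.
sample = "first line\u2028\nsecond line\u2028\n"
print(find_line_separators(sample))
print(repr(strip_line_separators(sample)))
```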

Replacing a work -

I've noted some missing scans, and low image quality, in: Index:Travel Pictures The Record of a European Tour.pdf

There's a version that has the missing pages here:- https://archive.org/details/travelpicturesre00bhav

and being on Archive.org most likely also has a text layer, which the current scans don't.

Can someone advise on this further? ShakespeareFan00 (talk) 12:18, 27 January 2019 (UTC)

Since the bad scan has very little proofreading done so far, I would just import the better scan and start working on it. Make sure to coordinate with User:Wikilover90 who started that transcription project. You can put the bad index on WS:PD when it's fully abandoned in favour of the better scan. —Beleg Tâl (talk) 14:02, 27 January 2019 (UTC)
Agreed. And in case anybody reading this discussion is unaware, I highly recommend the IAupload tool for importing Internet Archive texts. Just log in with Wikimedia credentials, paste the IA identifier (in this case, travelpicturesre00bhav), and upload. -Pete (talk) 22:05, 30 January 2019 (UTC)

Page:Gurney - Things Mother Used to Make.djvu/24

I'd like a second view. Here the recipes are currently formatted such that there are no breaks, the {{Center}} heading being used as such between recipes.

Is this ideal, or would it be better to put new lines in to separate individual recipes? ShakespeareFan00 (talk) 01:40, 28 January 2019 (UTC)

ShakespeareFan00 (talk, contribs), here is how I would organize it. Feel free to revert, however. ―Matthew J. Long -Talk- 04:00, 31 January 2019 (UTC)

Blank page transclusion, and page number suppression...

In this edit for : https://en.wikisource.org/w/index.php?title=Mrs._Beeton%27s_Book_of_Household_Management/Chapter_IV&oldid=6039250

The inclusion of a blank page in the transcluded sequence means that the page number for the content page subsequent to the blank page is suppressed. This is, I've been informed, as-designed (although disappointing) behaviour.

This was repaired as - https://en.wikisource.org/w/index.php?title=Mrs._Beeton%27s_Book_of_Household_Management/Chapter_IV&direction=next&oldid=6039250

However, it is felt that this is confusing for potential readers (and contributors) who might not be aware of the technical minutiae of how the numbering script works locally, and of the need to explicitly exclude specific blank DjVu pages directly.

Would it be possible for someone technically minded to write scripts that:

  1. Generate a list of article pages whose transcluded sequences include blank pages, so that the "exclude" parameter can be used, which is the solution suggested previously.
  2. Use the list generated, to amend transcluded sequences (typically a <pages /> tag) to explicitly mark using the "exclude=" parameter, pages that are marked as having no content, and which do not contain any wikitext?

Alternatively, the page numbering script could be amended, to cope with this.

Given the repetitive and repeatable nature of the edits required, I feel this task should ideally be done by a bot, like certain other routine maintenance tasks. 11:33, 30 January 2019 (UTC)
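To illustrate the second step requested above, here is a rough sketch of how a bot might amend a <pages /> tag. It is not an existing tool: the function name is invented, the set of blank page numbers is assumed to come from the list built in step 1, and the comma-separated form of the exclude value is an assumption that should be checked against the ProofreadPage documentation before any real run.

```python
import re

PAGES_TAG_RE = re.compile(r"<pages\b([^>]*?)/>")
FROM_RE = re.compile(r'\bfrom\s*=\s*"?(\d+)"?')
TO_RE = re.compile(r'\bto\s*=\s*"?(\d+)"?')


def add_exclude(wikitext, blank_pages):
    """Add exclude= to each <pages /> tag whose range covers a blank page."""

    def fix(match):
        attrs = match.group(1)
        if re.search(r"\bexclude\s*=", attrs):
            return match.group(0)  # already has exclude=; leave untouched
        start = FROM_RE.search(attrs)
        end = TO_RE.search(attrs)
        if not (start and end):
            return match.group(0)  # no numeric from/to range to compare
        lo, hi = int(start.group(1)), int(end.group(1))
        inside = sorted(p for p in blank_pages if lo <= p <= hi)
        if not inside:
            return match.group(0)
        excl = ",".join(str(p) for p in inside)
        return "<pages" + attrs.rstrip() + ' exclude="' + excl + '" />'

    return PAGES_TAG_RE.sub(fix, wikitext)


# Hypothetical example: scan page 12 is blank and sits inside the range.
before = '<pages index="Book.djvu" from=10 to=14 />'
print(add_exclude(before, {12, 30}))
```

Tags that already carry an exclude= value are deliberately skipped, so that a human can review those cases rather than have a bot overwrite them.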

A bot sounds feasible for this, but I don't know who operates bots on enWS these days.
I have a long term ambition to attack MediaWiki:PageNumbers.js with an eye to splitting the page numbering bits away from the layouts bits, fixing various issues related to page numbering (see a previous thread on here), and generally tidying up (making it easier to maintain, hopefully). However, that job has to start with turning it into a Gadget so it can be disabled from the preferences. Right now it gets loaded unconditionally for everyone so if you tried to make a new version the two would always be fighting over which one does what to the page numbers and layouts. And since I can claim very little expertise in programming for a MediaWiki environment, I would need the assistance and support of more experienced users (preferably with the +sysop and +interfaceadmin bits, since only admins can edit most of the relevant files).
If anyone is interested in that please do let me know! I make no promises regarding timeframes or results, but it's a personal itch I'm willing to put some effort into scratching. --Xover (talk) 07:15, 31 January 2019 (UTC)

Category:National portals are inconsistent

We have Portal:Côte d’Ivoire but not Portal:Ivory Coast. For some reason, we have Portal:Cape Verde rather than Portal:Cabo Verde, Portal:Swaziland rather than Portal:eSwatini, and Portal:East Timor rather than Portal:Timor-Leste. I recommend moving all the latter three. Also, Portal:Sao Tome and Principe instead of Portal:São Tomé and Príncipe. Justin (koavf)TCM 21:41, 30 January 2019 (UTC)

Those all seem to be the officially declared names in English, with one exception. I find it weird to have w:Ivory Coast but w:Eswatini, but that's Wikipedia's problem. However, I would go with Eswatini, for consistency with Wikipedia and because the capitalization standard is inconsistent and capitals at the front is better English.--Prosfilaes (talk) 05:18, 31 January 2019 (UTC)
@Prosfilaes: With one exception...? —Justin (koavf)TCM 06:32, 31 January 2019 (UTC)
Eswatini instead of eSwatini. Rather, that's at least ambiguous.--Prosfilaes (talk) 07:28, 31 January 2019 (UTC)
Support moving them, but make sure you leave a redirect. Regarding the Eswatini vs eSwatini, see w:Talk:Eswatini#eSwatini vs. Eswatini, where it is noted that Eswatini is the official name in English (eSwatini is the official name in Swazi) while common usage is apparently evenly split between the two, and Eswatini is more in line with normal English orthography. —Beleg Tâl (talk) 12:04, 31 January 2019 (UTC)

License text

Many publishers are negligent or deliberately deceptive in reporting to their users the copyright status of materials in the public domain, or under a free license. One of the ways Wikimedia sets itself apart, in general, is by giving readers (and potential reusers) clear information about licensing. But in one respect, I think we fall short. At the bottom of every Wikisource page is the following text:

"Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply."

On Wikisource, this claim is almost always inaccurate. Most of the material here is in the public domain; other material may have any of several free licenses.

Changing this text would require close consultation with the WMF legal department. But before reaching out to them, I'd like to know: what do Wikisource users think would be the ideal resolution? -Pete (talk) 20:33, 28 January 2019 (UTC)

Most websites that I've seen have handled this by saying "unless otherwise specified, text is available &c." —Beleg Tâl (talk) 02:26, 29 January 2019 (UTC)
Yes, that phrasing is common; but I don't think it's a good fit for Wikisource, since Wikisource is not a place for original composition by project members. If a main space page does not specify otherwise, I would guess that in most cases, it's either (a) public domain without a banner, or (b) a copyright violation that hasn't been caught. I'd guess (though I'm not sure how to measure) that the number of cases in which an unmarked main space page is CC BY-SA are vanishingly rare. -Pete (talk) 01:11, 30 January 2019 (UTC)
I think it's still a good fit for Wikisource, you just need to flip your thinking around. Every hosted text requires a license tag; every item of content on this site that isn't a hosted text is CC BY-SA, so a fully accurate statement would be "Unless otherwise indicated, or unless someone screwed up, all text is CC-BY-SA" - but I think "unless someone screwed up" can be safely taken as given. —Beleg Tâl (talk) 03:46, 30 January 2019 (UTC)
I think the discussed text should not appear in the main namespace at all. The license should be made clear by a license tag only. It is misleading if the license tag states it is in public domain and at the same time the text below states it is licensed under CC-BY-SA + some unspecified additional terms. The license tag makes the statement below redundant. --Jan Kameníček (talk) 09:41, 30 January 2019 (UTC)
I strongly disagree with this. There is lots of meta text in mainspace: headers, notes in headers, license tags, banner text, interface text, &c all of which is CC-BY-SA unless otherwise noted. —Beleg Tâl (talk) 23:56, 1 February 2019 (UTC)
A lot of that is too short to attract a copyright anyway. "This work was published before January 1, 1924, and is in the public domain worldwide because the author died at least 100 years ago." is not at all copyrightable. Notes should not generally be long enough to be copyrightable. I suppose much of the stuff is outside our control, but we should be clear where the CC-BY-SA applies.--Prosfilaes (talk) 00:32, 2 February 2019 (UTC)
I agree with Beleg Tâl. The text isn't optimal outside the Wikipedias (where most text really is CC-BY-SA), but it's still a good default for all the text on enWS that isn't tagged with an explicit license. To wit, this text here is, by way of the terms displayed in the edit field, "irrevocably agree to release your contribution under the CC BY-SA 3.0 License and the GFDL". What's missing is really the "Unless otherwise specified" part to make it clear that public domain works have not magically become subject to new copyright by virtue of us transcribing them.
I also agree with Peteforsyth insofar as it being worthwhile to address the issue. It is a minor issue, but not an entirely insignificant one.
I also think the boilerplate at the bottom there is technically somewhat in the local project's control (i.e. I'm pretty sure it's configured text, not hardcoded) so provided WMF Legal has no actual objection to whatever we decide, it should be realistically possible to get it changed in our lifetimes. --Xover (talk) 06:26, 31 January 2019 (UTC)
Ah, I see I was not as explicit as I could have been. I'm talking specifically about main space, which I think is by far the most important. (I realize it's probably not currently easy to have the text vary based on namespace, but I'd imagine that's something that could be changed if a need is demonstrated.)
With that in mind, I suppose the header info, which occasionally includes extensive "notes", might indeed include text that is CC BY-SA. So maybe the footer message should take that into account. -Pete (talk) 00:21, 1 February 2019 (UTC)
As a purely technical matter, and provided my quick leafing through the docs didn't lead me astray, the text in question is configured using the variable $wgRightsText. This is a global configuration that affects all pages in all namespaces on a given wiki. It is also not, that I have found, changeable by the local community (you need to request the change from the WMF). However, there exists an extension named PerPageLicense that claims to let you specify a license on a per-namespace or even per-page basis. The extension is not installed on enWS, and cannot be installed by the local community (must be requested from WMF; who are likely to ask pointed questions, drag their heels, and involve WMF Legal first). But that being said, there's no obvious technical reason why it can't be installed. Which means—and again I'm speaking from a purely technical perspective here—we could request the sort of change you're envisioning provided there's a strong enough consensus in favour of 1) changing the status quo at all, and 2) what that change should be. --Xover (talk) 06:46, 1 February 2019 (UTC)
Correction: Thanks to an exceedingly helpful gentleperson answering questions elsewhere, I've found that the license boilerplate is actually stored in MediaWiki:Wikimedia-copyright and is actually editable (in the technical sense) by Administrators and Interface Administrators on WMF wikis (for non-WMF MediaWiki installations the previous description holds true). We would still, I think, have to run any change by WMF Legal, but Wikinews and Wikidata have custom license text there already so it's not unheard of. For per-namespace configuration the previous is still correct, except that there's a decided risk the mentioned extension will not work with the special WMF setup. If that's the case we would need to have actual development work done to adapt it (or reimplement the functionality in MW directly), which sounds like it might take a long time and be hard to get bumped up the priority list. --Xover (talk) 08:08, 1 February 2019 (UTC)

Index:A Naturalist on the Prowl.djvu

Misaligned text layer, with respect to the scans. I will categorise any more of these I find in Category:Scans with misaligned text layer if anyone wants to make repairs. ShakespeareFan00 (talk) 16:28, 24 January 2019 (UTC)

Done.— Mpaa (talk) 21:50, 6 February 2019 (UTC)

Index:The_Moon_(Pickering).djvu

Missing image scans. Can someone insert placeholders so that the text can be proofread, and the other images used? ShakespeareFan00 (talk) 14:39, 28 January 2019 (UTC)

Done.— Mpaa (talk) 22:55, 8 February 2019 (UTC)

Is this a cache issue?

The reference images at Index:The Living Flora of West Virginia and The Fossil Flora of West Virginia.pdf all seem to be off by one page since I deleted the Google scan title page from the beginning of the PDF. I'm assuming the index is drawing from a cache of the old pages but even after clearing my caches both here and at commons I still have the misalignment problem. Can anyone help? Abyssal (talk) 15:42, 31 January 2019 (UTC)

Deleting a page will automatically shift everything in the scan by one page. Is this what you mean? If not, could you be more specific about what you're seeing? --EncycloPetey (talk) 20:38, 31 January 2019 (UTC)
I'm pretty sure EncycloPetey's explanation is correct; deleting page one will shift all other pages. The better approach is to *replace* page one with a blank page, which you could do by duplicating an existing blank page. (On Linux, I use PDF Shuffler for this, but I don't know what operating system you're using; there are probably free software options for any OS.) You can still do that, and re-upload; I think this is your best bet.
However, there also seem to be major caching issues around this stuff, so your initial instinct is not unfounded. I've found it takes several days for caches to catch up, and purging all relevant pages through the MediaWiki software doesn't help. It'll even go back and forth from "incorrect" to "correct" in the process...see this report. What I didn't mention there is, since it "fixed itself," it also screwed itself up again, and now is working properly again. So you might re-upload, and then wait a week, to save yourself a possible headache :) -Pete (talk) 21:41, 31 January 2019 (UTC)
I am seeing no misalignment, after random checking, in the file mentioned by @Abyssal:. Hrishikes (talk) 02:09, 1 February 2019 (UTC)
The places where I (still) see incorrect OCR, @Hrishikes: pages 24 and 25 (DJVU) and perhaps others. It may not be a big deal, if it's just a few pages, as they can just be OCR'd or transcribed "manually." Still, it would be nice to have a clearer sense what goes on in cases like these. (Also, I was incorrect about my example -- it has not "fixed itself" after all. I may have been looking at the wrong page when I wrote that.) I'm inclined to start a phabricator task, but I'm not sure I even understand the problem well enough to describe it accurately. -Pete (talk) 20:29, 7 February 2019 (UTC)
@Peteforsyth: -- In case of misalignment of ocr, it can be done locally, there are multiple options at present. I have done some; please check the quality. Hrishikes (talk) 06:14, 8 February 2019 (UTC)
We typically have Proofread Page Extension issues with PDFs, which is one reason why DjVu files are preferred. If a correct DjVu file of this text can be uploaded, and the work done thus far transferred over to an Index page for the DjVu file, the whole process will likely proceed more smoothly. --EncycloPetey (talk) 20:34, 7 February 2019 (UTC)
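The difference between deleting the first page and replacing it with a blank, as described earlier in this thread, can be shown with a toy model. The sketch below is purely illustrative and does not touch any real PDF: the scan is modelled as a list of page labels, the existing transcriptions as a mapping from scan position to the page each was made from, and all names are invented.

```python
def delete_page(pages, position):
    """Remove the page at a 1-based position; later pages shift down by one."""
    return pages[:position - 1] + pages[position:]


def blank_page(pages, position, blank="(blank)"):
    """Replace the page at a 1-based position; later pages keep their place."""
    return pages[:position - 1] + [blank] + pages[position:]


def misaligned(pages, transcriptions):
    """Return positions whose transcription no longer matches the scan."""
    bad = []
    for position, expected in sorted(transcriptions.items()):
        actual = pages[position - 1] if position <= len(pages) else None
        if actual != expected:
            bad.append(position)
    return bad


scan = ["Google notice", "title", "preface", "chapter 1"]
# Transcriptions already saved against scan positions 2 to 4.
done = {2: "title", 3: "preface", 4: "chapter 1"}

print(misaligned(delete_page(scan, 1), done))  # every saved page is now off
print(misaligned(blank_page(scan, 1), done))   # nothing moves
```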

3000 Validated Works

Late on Thursday 17 January 2019 (UTC) validation of Amazing Stories/Volume 01/Number 02 was completed. This was our 3000th work to reach this status. To see a list of the previous milestone works go to Portal:Proofreading milestones. Beeswaxcandle (talk) 17:59, 18 January 2019 (UTC)

Well done to all the editors involved. Your work is very much appreciated. —Beleg Tâl (talk) 18:51, 18 January 2019 (UTC)
Yes, very well done! But it occurs to me… Shouldn't these numbers be displayed prominently on the Main Page? We currently have 3,059 validated works, 1,995 proofread works, just under a million proofread pages (according to Slowking above), etc. Some stats are displayed on the Community Portal but that's centred on pages and maintenance: we should have some bragging numbers on the front page for visitors! enwp brags about its 5,801,659 articles at the very top of their w:Main Page, but Wikisource's numbers are even more impressive considering the far smaller number of editors that have achieved them! --Xover (talk) 08:53, 10 February 2019 (UTC)
the front page banner links to https://en.wikisource.org/wiki/Special:Statistics . there are better sources, such as https://stats.wikimedia.org/wikisource/EN/Sitemap.htm ; https://wikisource.org/wiki/Wikisource:ProofreadPage_Statistics ; https://stats.wikimedia.org/wikisource/EN/SummaryEN.htm ; https://en.wikisource.org/wiki/Wikisource:Community_portal/Proofreading_statistics -- Slowking4SvG's revenge 17:16, 20 February 2019 (UTC)