Wikisource:Scriptorium/Archives/2021-06


Mass rollback request

Can someone just rollback every single edit I made after https://en.wikisource.org/w/index.php?title=Page:Chronological_Table_and_Index_of_the_Statutes.djvu/25&oldid=11369066

Somehow, in trying to update something, it just broke here: Chronological Table and Index of the Statutes/Chronological Table/Edw4

As I no longer have the time, patience or expertise to track down precisely what went wrong, the simplest answer is just to have the whole effort rolled back en masse, despite it for the most part actually working.

If someone has the time to find the typing error (with no documentation, comments etc.), because that is almost certainly what has caused the breakage, you are welcome to, but I have had it with trying to actually contribute until I can rely on being able to do things without continually creating headaches for myself or other contributors to resolve.

I would, however, like to thank other contributors here for their support on past efforts.

ShakespeareFan00 (talk) 18:42, 5 June 2021 (UTC)

Reverting is easy. Do you need help trying to figure out this table? —Justin (koavf)TCM 18:48, 5 June 2021 (UTC)
Yes. What I'd like to know is why the transclusion wasn't working properly. It is almost certainly a typo on my part, somewhere. ShakespeareFan00 (talk) 19:05, 5 June 2021 (UTC)
Also Chronological Table and Index of the Statutes/Chronological Table/Cha2. It would be nice to know WHY. ShakespeareFan00 (talk) 19:29, 5 June 2021 (UTC)
What seems to be going on in Page: namespace as well is various interactions of blank characters (like spaces and line-feeds), with various parts of the header and sectionalisation. I've in the past asked for what the precise handling is to be DOCUMENTED, but no one has done so yet :( Sigh ) ShakespeareFan00 (talk) 19:37, 5 June 2021 (UTC)
I found the issues. I'd not been setting up the transcludes correctly. No need to roll back now as I am getting it working very nicely :) ShakespeareFan00 (talk) 20:42, 5 June 2021 (UTC)
Chronological Table and Index of the Statutes/Chronological Table/23Geo2 - What went wrong here? ShakespeareFan00 (talk) 22:37, 5 June 2021 (UTC)
Note to self : Check that you've paired tags properly... ShakespeareFan00 (talk) 10:37, 6 June 2021 (UTC)
This section was archived on a request by: — billinghurst sDrewth 07:33, 9 June 2021 (UTC)

Merge Portal:Dodd, Mead & Company and Portal:Dodd, Mead, and Company

For some reason these were made as two separate portals when they are the exact same company. PseudoSkull (talk) 18:04, 13 June 2021 (UTC)

@Billinghurst: this is probably a WD mess too. PseudoSkull (talk) 18:07, 13 June 2021 (UTC)
Merged. No WD mess; no-one had attached it. I really encourage people to attach portals in WD using the WP article as a guide. — billinghurst sDrewth 22:54, 13 June 2021 (UTC)
This section was archived on a request by: — billinghurst sDrewth 22:54, 13 June 2021 (UTC)

The Wonderful Visit

The Wonderful Visit should be moved to The Wonderful Visit (1895) to disambiguate from The Wonderful Visit (Atlantic Edition). Languageseeker (talk) 04:25, 9 June 2021 (UTC)

Already done. billinghurst sDrewth 07:14, 27 June 2021 (UTC)
This section was archived on a request by: — billinghurst sDrewth 07:15, 27 June 2021 (UTC)

Stella Dallas (Prouty)

@Billinghurst: Please move this to Stella Dallas (Prouty, 1923) because Stella Dallas (Prouty) needs to become a versions page. Stella Dallas (Prouty, 1925) is going to be published on Wikisource within the next few days. PseudoSkull (talk) 20:39, 26 June 2021 (UTC)

Done Typically we would wait until the other version exists, and that is our preference. Noting that I added publisher name to title as they are so close, and as different publishers there is obviously something else at play. — billinghurst sDrewth 02:31, 27 June 2021 (UTC)
This section was archived on a request by: — billinghurst sDrewth 07:15, 27 June 2021 (UTC)

Error on Main Page

Can someone help fix the "Lua error" in Module:Monthly Challenge statistics? Many thanks for debugging a component of our main page. 廣九直通車 (talk) 09:30, 1 June 2021 (UTC)

@廣九直通車. The "responsible parties" have been notified, and knowing them they'll probably fix it in short order (modulo IRL, timezones, etc.). :) Worst case I can try tracing my way through the code at some point over the next 24–48 hours (superficially it doesn't look like it'll be too too hard to fix, but…). Xover (talk) 11:04, 1 June 2021 (UTC)
I think I may have fixed this accidentally by prodding a cronjob before seeing this message. I will attempt to figure out what was actually exploding and hopefully it'll all go off on time next month. Inductiveloadtalk/contribs 11:28, 1 June 2021 (UTC)

Revert my edits on Index:Works of Thomas Carlyle - Volume 03.djvu

Can an administrator revert the edits I recently made on Index:Works of Thomas Carlyle - Volume 03.djvu? I tried updating the source file and then shifting the pages, and it didn't work out.

Page:Works of Thomas Carlyle - Volume 03.djvu/13
Page:Works of Thomas Carlyle - Volume 03.djvu/11
Page:Works of Thomas Carlyle - Volume 03.djvu/10
Page:Works of Thomas Carlyle - Volume 03.djvu/20
Page:Works of Thomas Carlyle - Volume 03.djvu/21

Languageseeker (talk) 03:25, 2 June 2021 (UTC)

@Languageseeker: It's not clear to me what you want done, or what you need an admin for. You can undo your own edits on any page, and you can even undo most page moves (just so long as the automatically created redirect doesn't have any additional edit history). Is the issue that the source file has been updated and the Page: pages are no longer aligned with the scan? Xover (talk) 05:19, 2 June 2021 (UTC)
This got taken care of. Thanks! Languageseeker (talk) 20:24, 2 June 2021 (UTC)
If you want to see your page moves, look at special:log/move/Languageseeker; they should have a "revert" link (they do for admins). Reverting will create a redirect, and if you want that deleted then you will need to request it with {{sdelete}}. — billinghurst sDrewth 11:03, 3 June 2021 (UTC)

Over 3000 pages processed in the first-ever Monthly Challenge

The May Monthly Challenge is now complete and the numbers are in: 3122 pages were processed (marked no text, proofread or validated), which is more than 50% over the tentative goal of 2000 and represents an average velocity of over 100 pages/day. The following works were fully proofread:

And the following validated:

As well as quite some progress on other volumes in the challenge. New works in June include:

Thank you to everyone taking part, and here's to 4000 pages in June (for you northern hemisphere people tempted by nice weather: ignore that horrid glowing yellow sphere in the sky. It'll give you cancer, stay in and read books on the Internet!). Inductiveloadtalk/contribs 15:53, 1 June 2021 (UTC)

This is a fantastic initiative, thanks to those who organised. Given WS can be overwhelming and difficult to find your way around at first, I think this has great potential to make contributing easier and more rewarding. Nickw25 (talk) 10:18, 5 June 2021 (UTC)


Plato volumes

I was sifting through the philosophy section and saw that the books of Plato's dialogues, at least the 5-volume Jowett translation (whose indices are all empty, Index:The Dialogues of Plato v. 1.djvu and so on), have at least one redundant copy, Index:02 Jowett Plato Facsimile Vol2.pdf, from which Euthyphro and Apology (Plato) are transcluded into mainspace. Should the proofread and transcribed pages from the PDF be moved to the more complete 5-volume DjVus, and the mainspace pages be transcluded from the moved pages? Thanks, EggOfReason (talk) 00:59, 6 June 2021 (UTC)

I would check to see which scan is of the best quality and migrate content to that copy. --EncycloPetey (talk) 21:55, 9 June 2021 (UTC)

line spacing on Polytonic block

Could someone correct the line spacing on Template:Polytonic block? It produces a larger gap between the first and second line in a block than between subsequent lines, as shown below.

ὄλβου καὶ πλούτου δώσω περικαλλέα ῥάβδον,
χρυσείην, τριπέτηλον, ἀκήριον ἥ σε φυλάξει,
πάντας ἐπικραίνουσ᾽ οἴμους ἐπέων τε καὶ ἔργων
τῶν ἀγαθῶν, ὅσα φημὶ δαήμεναι ἐκ Διὸς ὀμφῆς.

Thanks. --EncycloPetey (talk) 17:29, 10 June 2021 (UTC)

@EncycloPetey: This is another fun manifestation of MediaWiki's rather trigger-happy approach to P-tags. Basically, if the template looks like this:
<div>{{{1}}}</div>
then the output looks like this:
<div>
  Line 1<br>
  <p>
    Line 2<br>
    ...
  </p>
</div>
Whereas if the template looks like this:
<div>
{{{1}}}
</div>
then the output looks like this:
<div>
  <p>
    Line 1<br>
    Line 2<br>
    ...
  </p>
</div>
The MediaWiki skin adds top/bottom margins of 0.5em to P-tags by default, so the former ends up placing that margin between Line 1 and 2.
I have changed the template to align with most other block templates like {{larger block}} which also add the newline after <div>. Inductiveloadtalk/contribs 19:20, 10 June 2021 (UTC)

Should poor quality of an image be preserved in works?

If there is a version of an image that is in color, but the original text contains a version of the image in black and white, should the black and white image be used over the colored one? Or should we be aiming for textual accuracy by showing the image that originally appeared there, in its original state?

A specific example: @Languageseeker: recently added a colored version of the frontispiece to Resurrection Rock (1920). I really appreciate that he found the original painting of this image and it is beautiful, and I think it should be on Wikimedia Commons for sure. But my only concern is that it does not appear in color in this version of Resurrection Rock.

Another example is The Bloom of Monticello, which contains facsimiles of paintings in lower quality than the originals. In that work, I kept the images as they originally appeared. PseudoSkull (talk) 01:48, 6 June 2021 (UTC)

Depends on the work. In one work of recent origin (of a technical nature) I used color replacements because it was only the scan that was in monochrome.

ShakespeareFan00 (talk) 10:39, 6 June 2021 (UTC)

I think about what the author's intent would have been. Would the author have wanted full-color, high-quality reproductions but been limited by the technology, or did they incorporate such images for artistic reasons? As a reader, I would rather see a high-resolution color version and I think that most authors would have preferred the same. Languageseeker (talk) 12:16, 6 June 2021 (UTC)
@Languageseeker: We also keep typographical errors though (except in rare circumstances), which were not intentional on the author's/publisher's part. PseudoSkull (talk) 18:02, 6 June 2021 (UTC)
@PseudoSkull: I've struggled with the issue as well. I think both sides have valid arguments. For me, the major distinction is between technological limitations and errors. Black and White reproductions of paintings resulted from technological restrictions while typographical errors did not. I think it's important to note that in this case, the painting was commissioned specifically for Resurrection Rock. Languageseeker (talk) 13:54, 10 June 2021 (UTC)

It really depends on the purpose of the text. In a book about painting, I would use higher quality color images, if they are available. If the image is intended to show detail in a part of the painting, I might stick with the original. If the image is a minor feature of the text, again I might use the original. --EncycloPetey (talk) 21:58, 9 June 2021 (UTC)

It depends on the author's intent, and Wikimedia Commons may consider hosting both the colored and the black-and-white versions. Some scans are photocopied from color to black and white; if that is proven, the colored image may render the black-and-white one redundant.--Jusjih (talk) 04:28, 11 June 2021 (UTC)

RunningHeader

Template Runningheader no longer supports leading spaces in its parameters. {{Runningheader| left| center| right}} gives:

left
center
right

and not

left
center
right

Is there a way to fix the template or the pages with this issue? --M-le-mot-dit (talk) 17:36, 10 June 2021 (UTC)

@M-le-mot-dit: Try &nbsp; Tommy J. (talk) 19:00, 10 June 2021 (UTC)
@M-le-mot-dit: In general, whitespace around parameters is pretty fragile (it's handled differently for positional vs named parameters, for a start). What are you trying to achieve? It's possible index CSS may be more appropriate if it's spacing-related. Inductiveloadtalk/contribs 19:11, 10 June 2021 (UTC)
In fact these spaces are not useful; it was just clearer to separate parameters in long expressions. I now have to fix hundreds of pages in Index:All the Year Round - Series 2 - Volume 1.djvu and I wonder whether a bot could help me. --M-le-mot-dit (talk) 09:22, 11 June 2021 (UTC)
@M-le-mot-dit: if something is sitting in the header, my thought would be to just leave it. Someone can fix it when they validate the pages and there is no point in wasting good time for something that does not affect the transclusion. — billinghurst sDrewth 11:20, 11 June 2021 (UTC)
@Billinghurst: thanks for your suggestion. I'll fix the validated pages and leave the proofread pages for future validation. --M-le-mot-dit (talk) 13:02, 11 June 2021 (UTC)
@M-le-mot-dit: Oh, I see, sorry I misunderstood the issue. I have adjusted the template to avoid this by using explicit named params to strip the whitespace. Inductiveloadtalk/contribs 13:23, 11 June 2021 (UTC)
@Inductiveload:. Excellent! Thanks for this improvement, because I don't remember if I have done the same elsewhere. --M-le-mot-dit (talk) 13:28, 11 June 2021 (UTC)

I see what @M-le-mot-dit: was attempting, and why, I think. Leading spaces can be used with named parameters

{{RunningHeader|left= LEFT |center= CENTRE |right= RIGHT }}
LEFT
CENTRE
RIGHT

but not positional (unnamed) ones. Fixing this is a trivial matter for a bot cleanup; doing it manually is not trivial. CYGNIS INSIGNIS 11:54, 11 June 2021 (UTC)

It is a design feature of templates that named parameters work better in managing whitespace, and that positional parameters without the explicit parameter names (1=|2=|3=) just play differently. My point was that in headers in the page: ns we truly don't need to fuss. — billinghurst sDrewth 13:26, 11 June 2021 (UTC)
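To illustrate the named-versus-positional difference discussed above, here is a minimal Python sketch of how template arguments end up with or without their surrounding whitespace. It is a toy model, not MediaWiki's parser: the function name parse_template_args is hypothetical, and it assumes a flat call with no nested templates, links, or pipes inside values.

```python
def parse_template_args(call: str) -> dict[str, str]:
    """Toy model of whitespace handling for template arguments.

    Assumption: a flat call such as "{{Name| a|x= b }}" with no nested
    templates, links, or pipes inside the values.
    """
    inner = call.strip()[2:-2]            # drop the surrounding {{ and }}
    _name, *raw_args = inner.split("|")
    args: dict[str, str] = {}
    position = 0
    for raw in raw_args:
        if "=" in raw:
            key, value = raw.split("=", 1)
            # Named (including explicitly numbered) arguments are trimmed.
            args[key.strip()] = value.strip()
        else:
            position += 1
            # Unnamed positional arguments keep their whitespace verbatim.
            args[str(position)] = raw
    return args


positional = parse_template_args("{{RunningHeader| left| center| right}}")
numbered = parse_template_args("{{RunningHeader|1= left|2= center|3= right}}")
print(positional)  # {'1': ' left', '2': ' center', '3': ' right'}
print(numbered)    # {'1': 'left', '2': 'center', '3': 'right'}
```

In this model, writing 1=, 2=, 3= explicitly is enough to have the leading spaces dropped, which matches the "explicit named params" approach mentioned earlier in the thread.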

@M-le-mot-dit: There are some places where the running header has been used in the body rather than in the header or footer, and therefore will be presented in the transcribed work.

@Billinghurst: Is there a way to search for uses of this template in the transcluded text for repair, without having to search for uses that are not transcluded? --EncycloPetey (talk) 15:03, 11 June 2021 (UTC)

Inductiveload has fixed the issue in the template itself; you can see in my example in the top of this section that there is no more difference with or without leading space. So there is no problem even when RunningHeader is used in the body of a discussion. --M-le-mot-dit (talk) 18:03, 11 June 2021 (UTC)
@EncycloPetey: I see a template count of (transclusions: 1,230,437, links: 125) all up, and in Main ns (transclusions: 4,031, links: 0), and we don't know whether they are using the template name or a redirect, so we would have to work out what transcluded these into the 4000 main ns pages. With regard to header/body/footer, remember they don't actually exist, they are javascript display artefacts of wikitext, so no, I don't think that it is worth chasing down for a handful of possible cases of <pre> text displays. This belongs to our proofreading and validation process. — billinghurst sDrewth 01:22, 12 June 2021 (UTC)
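For anyone who does want to list the main-namespace pages that transclude the template, one possible starting point is the MediaWiki API's embeddedin list. The sketch below only builds the query URL and makes no network request; the helper name embeddedin_query_url is hypothetical, and the template title is an assumption that would need adjusting for redirects.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikisource.org/w/api.php"


def embeddedin_query_url(template: str, namespace: int = 0, limit: int = 500) -> str:
    """Build an API URL listing pages in one namespace that transclude a template."""
    params = {
        "action": "query",
        "list": "embeddedin",      # pages that transclude the given title
        "eititle": template,
        "einamespace": namespace,  # 0 is the main namespace
        "eilimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = embeddedin_query_url("Template:RunningHeader")
print(url)
```

Results are paged, so a real script would need to follow the continuation token the API returns. The list also does not say whether the template sits in the header, body, or footer of the underlying Page: pages, so it narrows the search rather than answering the question.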

Request for eyeballs and editing => Wikisource:Do not move Index:, Page:, File: pages

Hi. I am preparing this document as a bit of an explanation essay and a bit of guidance. It is a rough draft and I would appreciate people adding to the document or adding annotations for components that need clarifying/modifying. Be as harsh/finicky as you like with it, it is formative and needs to be understandable to newer users. — billinghurst sDrewth 02:17, 14 June 2021 (UTC)

#wikisource IRC channel moved to Libera.chat

  • The #wikisource IRC channel recently moved from Freenode to Libera.Chat.
  • Register an account on Libera.Chat and join us there!
  • More information at m:IRC/Migrating to Libera Chat
  • Links around the place were changed recently.

billinghurst sDrewth 01:53, 13 June 2021 (UTC)

@Billinghurst: or whoever. Is this channel logged and moderated? CYGNIS INSIGNIS 17:01, 14 June 2021 (UTC)
@Cygnis insignis: We haven't logged the channel at this stage. Moderated? No, as a WMF-grouped channel it can be as required. It has m:wm-bot sitting in it, and is simply a channel. Last couple of days it has been Inductiveload educating me in css conversions of my works as I simplify the direct code in my previous ToC and put that into Index:(workname).css, and get them better "ready for export". (Must admit it makes ToC so much easier to navigate and proofread in the Page: ns.) Thanks Inductiveload. — billinghurst sDrewth 23:52, 14 June 2021 (UTC)

Suspected OCR errors

I've posted a list of suspected OCR errors on validated pages at User:Мишоко/Suspected OCR errors. Much work to be done for those interested. Мишоко (talk) 08:26, 14 June 2021 (UTC)

@Мишоко: Very good! Inductiveloadtalk/contribs 08:55, 14 June 2021 (UTC)

transclusions as single page

I just created the index Index:My Disillusionment In Russia.djvu as a single page. The practice here is to subpage sections, even to the most discrete part (eg. tertiary works, and at least dictionary). A concern might be the generation of page structure for export, I don't know enough about that. Comments? CYGNIS INSIGNIS 15:20, 14 June 2021 (UTC)

@Cygnis insignis: On-wiki: it's hard to go right to a given chapter because there is no TOC.
WRT export:
  • This means there will be no Chapters in the document TOC. In conjunction with the lack of in-text TOC, there is no way at all to find a chapter in the epub (or PDF). There is no fundamental reason we can't deal with this: there is a task to add the ability to add an in-page TOC that points to sections on the current page to the epub TOC.
  • If you do do this, you should probably add a {{page break}}, otherwise the chapters all run together, whereas you probably want each chapter to be a new page (this is normal in pretty much all real books and also ebooks). Inductiveloadtalk/contribs 15:35, 14 June 2021 (UTC)


Transclusion Problem

In The Adventures of Huckleberry Finn (1884)/Chapter 4, the first word is cut off on the transclusion, but not on the page scan. (It says Well in Page ns, but not in the transclusion.) Can somebody please take a look. Languageseeker (talk) 02:39, 15 June 2021 (UTC)

It displays for me, same in both main and page: nss. — billinghurst sDrewth 11:53, 15 June 2021 (UTC)
Thanks for checking. It seems to work for every Layout, except Layout 2 (the default one). Which one are you using? Should I take a screenshot? Languageseeker (talk) 12:56, 15 June 2021 (UTC)
@Languageseeker: May be due to the negative indent. Try something like "height1=177px|width1=100%|height2=30px|width2=300px|height3=440px|width3=324px" and no indent. --M-le-mot-dit (talk) 15:13, 15 June 2021 (UTC)
The ELL is sitting up above the image in layout 2. People are just trying to be too clever. Keep it simple. There should be three images, not the one. The title, the image, and the W. Then you can use {{drop initial}}. We need something that works on computer monitors and phones, so trying to think that you can extract one image from a book and get image is just fooling oneself. KISS! — billinghurst sDrewth 16:31, 15 June 2021 (UTC)
{{overfloat image}} and {{flow under}} should be burnt at the stake, and if anyone wishes to hold to them while they are burning ... (well). They won't export nicely. — billinghurst sDrewth 16:37, 15 June 2021 (UTC)

Find-and-replace across a whole work

Do we have a tool that can do find-and-replace operations across all the pages in a single work? I'm thinking of something analogous to VisualFileChange.js on Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:44, 11 June 2021 (UTC)

No automated bots that users can set and forget. Generally I would do a search and bot it. We can just load the special:prefixindex set of pages and bot them. Not hard. It is more a question of whether we can get a good regex with few false positives. If you want something done then WS:BR. With regard to the script, why don't you just load it in your common.js and see if it will work for you? — billinghurst sDrewth 13:33, 11 June 2021 (UTC)
visualfilechange can do it on commons. but it was designed as a deletion tool, so not included here. maybe loading the javascript will work here --Slowking4Farmbrough's revenge 12:19, 16 June 2021 (UTC)
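As a rough illustration of the "search and bot it" approach and of the false-positive concern, here is a minimal Python sketch that previews a regex replacement over a set of page texts without saving anything. The page titles and texts are hypothetical stand-ins, and the helper name preview_replacements is an assumption; fetching and saving the real pages would be left to whatever bot framework is in use.

```python
import re


def preview_replacements(pages: dict[str, str], pattern: str, replacement: str) -> dict[str, str]:
    """Return only the pages whose text would change, mapped to the new text."""
    regex = re.compile(pattern)
    changed: dict[str, str] = {}
    for title, text in pages.items():
        new_text = regex.sub(replacement, text)
        if new_text != text:
            changed[title] = new_text
    return changed


# Hypothetical page texts standing in for the Page: namespace of one work.
sample_pages = {
    "Page:Example.djvu/1": "It was teh best of times.",
    "Page:Example.djvu/2": "Nothing to fix here, not even in 'Otehr'.",
}

# \b word boundaries keep the pattern from matching inside longer words,
# which is one simple way to cut down on false positives.
changes = preview_replacements(sample_pages, r"\bteh\b", "the")
print(changes)  # {'Page:Example.djvu/1': 'It was the best of times.'}
```

Reviewing the returned dictionary before any page is saved gives a chance to spot bad matches while they are still cheap to fix.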

Wikimania 2021: Individual Program Submissions

Dear all,

Wikimania 2021 will be hosted virtually for the first time in the event's 15-year history. Since there is no in-person host, the event is being organized by a diverse group of Wikimedia volunteers that form the Core Organizing Team (COT) for Wikimania 2021.

Event Program - Individuals or a group of individuals can submit their session proposals to be a part of the program. There will be translation support for sessions provided in a number of languages. See more information here.

Below are some links to guide you through;

Please note that the deadline for submission is 18th June 2021.

Announcements - To keep up to date with the developments around Wikimania, the COT sends out weekly updates. You can view them in the Announcement section here.

Office Hour - If you are left with questions, the COT will be hosting some office hours (in multiple languages), in multiple time-zones, to answer any programming questions that you might have. Details can be found here.

Best regards,

MediaWiki message delivery (talk) 04:19, 16 June 2021 (UTC)

On behalf of Wikimania 2021 Core Organizing Team

OCR bot

Do I recall reading that someone has a bot that can populate pages with OCR text, and perhaps running headers, ready for proofing? Index:A Catalogue of the Birmingham Collection - 1918.pdf has over 1100 pages, and it would be good to get some help starting on them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:58, 15 June 2021 (UTC)

@Pigsonthewing: Does the OCR not appear automatically once you edit the page? Inductiveloadtalk/contribs 18:22, 15 June 2021 (UTC)
Yes; that's not what I'm talking about Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:44, 15 June 2021 (UTC)
@Pigsonthewing: Also not what you asked, but are you aware that the running header and footer can be inserted automatically as you go? A 'bot' could do that and save the page, is that what you want? I don't consider that as useful as the other methods of proofreading. CYGNIS INSIGNIS 13:31, 16 June 2021 (UTC)
@Cygnis insignis: I suspect what he is asking about is 'automatically' using Help:Gadget-ocr to create the pages with its improved OCR, instead of just saving the garbage that is typically embedded in the file. Jarnsax (talk) 17:32, 16 June 2021 (UTC)
@Pigsonthewing: I ran the OCR and fixed the running header on the first 50 pages or so for you. Definitely a work that will benefit from one person doing the majority of the actual transcription, for consistent table formatting. Feel free to poke at me for another batch.. far less tedious when you can do smaller batches somewhat automatically. Jarnsax (talk) 17:56, 16 June 2021 (UTC)
Having run 110 or so pages of this, while a bot to perform the task might be useful in some cases, this isn't one of them. Tesseract is obviously quite confused by the page layout, seeing it as 'multi-column' instead of 'dictionary' about half the time, and so the text for the running header isn't reliably placed. To get a 'reliable' OCR would probably require manually futzing with the PDF to define text fields... not even remotely worth it. Jarnsax (talk) 19:32, 16 June 2021 (UTC)
yeah, this work will require a lot of table code. given the text layer typical output, might want to paste in a spreadsheet, to move cells around, and then into https://magnustools.toolforge.org/tab2wiki.php (a bot will not help). --Slowking4Farmbrough's revenge 01:27, 18 June 2021 (UTC)

Editor parameter in "header" and its use on subpages

Hi. I would like for us to consider that the editor in {{header}} have advice that it should typically only be used at the top level (or volume level) of a work, and not need to be added to every subpage of a work, eg. for article level. We need to be keeping the subpages cleaner and only have the information that is specifically pertinent to the subpage, not filled with secondary information that is available higher in the work. — billinghurst sDrewth 01:26, 18 June 2021 (UTC)

Proposal: allow bots the reupload-shared right


This will allow bots to be used to import files from Commons using the mw:Manual:Pywikibot/imagetransfer.py script (which is currently broken, but I'm working on a fix).

Currently this right is disallowed except for admins.

@Xover: ping since you pointed me to this script. Inductiveloadtalk/contribs 21:29, 2 June 2021 (UTC)

 Support Languageseeker (talk) 21:39, 2 June 2021 (UTC)
 Support I'm slightly ambivalent. There was a reason why this permission was only assigned to +sysop when added, and we don't do a great job of managing bot permissions currently. On the other hand there is limited harm that can be done with it, and we do have some vetting of +bot. Not having it would also prevent InductiveBot (and other bots) from localising files from Commons or require it to have +sysop just for those tasks. So on balance I land on support. Xover (talk) 07:10, 3 June 2021 (UTC)
  • not support proposal in current form. Firstly the pywikibot script doesn't even work for admins, and if an admin has the right to go and do it, then it is a pretty simple task once one is logged into toolforge; you don't need to activate a bot right.

    To the proposal, we don't have that many requests, and we don't have any backlog, so what is the justification for such a significant change? I do not wish to have all bots with unlimited right to transfer files from Commons, and if we were to progress I would want to see controls over that ability. Remembering that what this is doing is also removing a file from Commons, and that should never be an unregulated right. — billinghurst sDrewth 11:45, 3 June 2021 (UTC)

  • I've fixed (well, pending) the script.
  • Also, the script only deletes the file if moving to Commons and if the user has delete at the source wiki. Since I'm not asking for the delete right, (and I don't even have it myself at Commons) this is doubly moot if running a non-sysop bot account and moving from Commons.
  • The justification is not having to perform batch import actions as a sysop, but instead under a normal bot account. Since bots have a process to gain their flag, it's not like just anyone can do this; that's the control over the ability. Fundamentally, if someone wants to upload the files here, they can still do so with a bot account by messing with the filename (and/or touching the file hash), so bot users already have the ability to make a mess and don't because then you get a rude message on your talk page and/or a -bot.
  • There's nothing wrong per se with doing it as a sysop on Toolforge, it's just a bit overpowered, IMO, and a (small) faff when I have my bot OAuthed locally (though now I have two sets of tokens, so it's not so bad). Inductiveloadtalk/contribs 12:06, 3 June 2021 (UTC)
So you are saying that while it is called imagetransfer it is actually a replication, not a transferral. Okay, that is reassuring. With regard to the right, it would be better to have a group created and have the ability to have that right allocated, either to a bot, or to an individual. That gives a better control and overt permission to do an act, rather than as a hidden action (remember bots are generally hidden from RC). I would much rather have a light procedure in place where a 'crat (or maybe a sysop) grants the right to an account on public application, then it can be set to expire, and we don't have issues with bots just doing things. — billinghurst sDrewth 13:24, 3 June 2021 (UTC)
To be fairrrrrr the first line of the script doc is Script to copy images to Wikimedia Commons, or to another wiki.
A separate group makes sense too, but it would be best if it can be sysop-granted, I don't think it needs 'crat oversight (unless uploading these images is specifically harmful in a way I haven't realised?).
I mean, I can just upload them on my own account, but it means I have to bot-flag myself and then anything I do in the meantime is b. Or I spam RC with the images (in this case, >100). Neither of which is ideal, IMO. Unless we do say that image transfers shouldn't be done with a bot flag, and RC spam is OK in this case, just like local uploads, which is fine by me too, I don't really mind.
If "bots doing questionable things" is an issue, IMO we have bigger problems than just the possibility of bot operators nefariously copying files from Commons. Inductiveloadtalk/contribs 13:49, 3 June 2021 (UTC)
Answer: Bots' actions are typically hidden from normal processes, so a modicum of caution/risk management and oversight is always best. I have seen bot access abused, and while I don't think that it will happen here, a light touch approval process is not a high hurdle. Plus this way, it can be given to those without bot rights however we so choose. — billinghurst sDrewth 14:47, 3 June 2021 (UTC)

Proposal 2: create separate group

Okay suggest that the proposal becomes: Creation of a group called "upload shared" with the sole right of reupload-shared that can be added and removed by administrators and bureaucrats. We can then work out our procedures for how that is applied. We can say now explicitly that administrators can apply it to their bots as an extension of their rights; further detail to be confirmed by consensus, if community supports. — billinghurst sDrewth 14:34, 3 June 2021 (UTC)

Annotation: Override files on the shared media repository locally (reupload-shared) which is required to move a file from Commons to enWS as normal rights will stop that happening due to the existence of the file at Commons. — billinghurst sDrewth 14:37, 3 June 2021 (UTC)
 Support fine by me. Inductiveloadtalk/contribs 22:39, 4 June 2021 (UTC)
 Comment Proposing to close this request as having the community consensus for creation of a user group "upload shared" with both the sole right and addition/removal capability as described. — billinghurst sDrewth 01:41, 14 June 2021 (UTC)
Requested at phab:T285130. billinghurst sDrewth 08:22, 18 June 2021 (UTC)

Mobile version => collapsed licences?

I am wondering for our works and our author pages whether we should collapse the visible licences. — billinghurst sDrewth 04:31, 18 June 2021 (UTC)

author categories

Can I get a pointer to discussion of the categories in the Author ns that concluded the construct category:X as authors was a good idea? CYGNIS INSIGNIS 18:54, 17 June 2021 (UTC)

Check the archives for here over the last 18 months, the conversations were open for a long period. The basis was that we were getting a mix of authors and biographies in the categories, so they have been named overtly. — billinghurst sDrewth 11:58, 19 June 2021 (UTC)

Blackletter looking horrid (UnifrakturMaguntia)

Is anyone else currently finding blackletter difficult to read?

  • Hard to read Hard to read
  • Hard To Read Hard To Read
  • HARD TO READ HARD TO READ
  • HARD TO READ HARD TO READ

When I proofread works I have used it over the umpteen years and not had a problem. Looking at it today, this representation is awful. I have no idea where we look in the ULS system for the font history of UnifrakturMaguntia. — billinghurst sDrewth 11:51, 19 June 2021 (UTC)

we are at the mercy of a free font provider http://unifraktur.sourceforge.net/ there are public domain fraktur fonts, but haven't found an accessible one. http://www.morscher.com/3r/fonts/fraktur.htm we need a special character set. german converts to latin, it is only english that continues to do mainly on title pages. (i thought the point was to make it hard to read) --Slowking4Farmbrough's revenge 14:43, 19 June 2021 (UTC)

Please provide input here or on Meta and during an upcoming Global Conversation on 26-27 June 2021 about the Movement Charter drafting committee

Hello, I'm one of the Movement Strategy and Governance facilitators working on community engagement for the Movement Charter initiative.

We're inviting input widely from users of many projects about the upcoming formation of the Movement Charter drafting committee. You can provide feedback here, at the central discussion on Meta, at other ongoing local conversations, and during a Global Conversation upcoming on 26 and 27 June 2021.

The Movement Charter drafting committee is expected to work as a diverse and skilled team of about 15 members for several months. They should receive regular support from experts, regular community reviews, and opportunities for training and an allowance to offset costs. When the draft is completed, the committee will oversee a wide community ratification process.

Further details and context about these questions is on Meta along with a recently-updated overview of the Movement Charter initiative. Feel free to ask questions, and add additional sub-sections as needed for other areas of interest about this topic.

If contributors are interested in participating in a call about these topics ahead of the Global Conversation on 26 and 27 June, please let me know. Xeno (WMF) (talk) 16:53, 19 June 2021 (UTC)

The three questions are:

  1. What composition should the committee have in terms of movement roles, gender, regions, affiliations and other diversity factors?
  2. What is the best process to select the committee members to form a competent and diverse team?
  3. How much dedication is it reasonable to expect from committee members, in terms of hours per week and months of work?

Proposal: add a "project marker" to new works (e.g. PotM, MC, Wikiprojects) in New texts

Despite a small drama a couple of weeks ago over this subject, I think it would be worth having a way to show some new works have come out of a community project, such as WS:POTM, Monthly Challenge, or any other Wikiproject that produces a proofread work. Specifically, it should link to the project in question to drive traffic to that project and allow the project participants to see their project's successes advertised.

Perhaps some fairly discreet inline tag that is obviously not part of the work title: PotM (just an example, please don't bikeshed the exact formatting!). Inductiveloadtalk/contribs 10:16, 3 June 2021 (UTC)

@Inductiveload: Are you thinking of something that is logged within Wikidata through their d:Help:Badges as is the desired means to mark proofread and validated works? Or were you just wanting something local. Something that differentiates projects, or specific per project. Were you thinking something built into header templates? Root level, root level and subpages; or root talk page?

Further, what is your justification for driving people to projects for completed works? What do you think that it will achieve? Why do you see that project-produced works deserve that over single works? I don't personally see that completed project works particularly need any special recognition post completion, well nothing more special than any other completed work. I would say feel welcome to develop something that sits on talk pages that allows works to be linked to projects, to be called up easily, otherwise there is nothing special about these works compared to any other work. Well nothing that warrants more recognition than something completed by a person or a couple of people outside of a project. At the moment our works are tagged with badges as delivered from WD records for Proofread/Validated, FT, and I would prefer to keep our header tagging to the badges, and to tag quality of works. — billinghurst sDrewth 11:31, 3 June 2021 (UTC)

@Billinghurst: Sorry, that wasn't clear, I mean in {{new texts}}, not on the mainspace pages or in the headers.
The idea is to show people that there are organized subprojects working on $whatever and that the projects are active (because otherwise they wouldn't be producing a new work) and available to join in. Inductiveloadtalk/contribs 11:42, 3 June 2021 (UTC)
@Inductiveload: I think that having some display options of what you are proposing in "new texts" sandbox would give the community an idea of the sort of thing, and the elegance that can be produced. It may also be opportune for us to think what else that template could easily and neatly do without overcrowding it, or making it ugly. — billinghurst sDrewth 11:59, 3 June 2021 (UTC)
Something (very roughly) like Template:New texts/sandbox. Obviously exact styling is flexible, but it's a bit soon for nitpicking over that. Inductiveloadtalk/contribs 12:16, 3 June 2021 (UTC)
 Support I think this will help bring attention to community collaborations and reward users who participate in them. Languageseeker (talk) 03:34, 5 June 2021 (UTC)
 Support I think this helps promote the value of contributions to the collaborations and demonstrates to newcomers in particular that their contributions can quickly make it into a compiled product. One of the challenges with WS is that you can make contributions to an index that years later still isn't completed or transcluded. For some people, this extended time from contribution to realised product perhaps reduces engagement (I know it does for me). Nickw25 (talk) 10:19, 5 June 2021 (UTC)
 Comment On frwikisource, works in the New Texts section of the page produced by Mission 7500 are marked with: {{e|{{vert|Mission 7500}}}} which produces: Mission 7500 . I do not know how long this marker has been in place, but it has been there since at the very most, January 2019. CVValue (talk) 00:56, 21 June 2021 (UTC)

15:49, 21 June 2021 (UTC)

em-dash at page wrap

How can we suppress a space at the start of a page, when the preceding page ended with a character such as an em-dash? for example, the start of page 26 on /Report renders as "Steam.— Should" when "Steam.—Should" is wanted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:10, 21 June 2021 (UTC)

@Pigsonthewing: You need to use {{peh}} (page-end-hyphen) to do that. It's the same way you'd keep the hyphen in the word "over-eager" being split over a page break. See H:HYPHEN for more details. Inductiveloadtalk/contribs 18:37, 21 June 2021 (UTC)

Editing news 2021 #2

14:15, 24 June 2021 (UTC)

New tool: Wikisource Image Uploader

The upload form.

There is a new tool available for more easily uploading images for use in Wikisource works: https://ws-image-uploader.toolforge.org/. Imaginatively, it is named Wikisource Image Uploader. There is also a local gadget for linking directly to the tool with prefilled fields from the Page or Index namespaces. More documentation at Help:Gadget-ImageUploader.

If you notice any bugs or feature opportunities, please file an issue at https://phabricator.wikimedia.org/project/board/5221. Or just let me know directly :-)

If you can't think of something to use it for here is some inspiration: Category:Pages with missing images. Inductiveloadtalk/contribs 01:01, 22 June 2021 (UTC)

Some questions: Is there a policy change about uploading images to the Commons and not Wikisource?— Ineuw (talk) 12:07, 25 June 2021 (UTC)
No, but some images cannot go to Commons due to licensing, and the tool supports that (though it's not quite smart enough to know what templates will or will not work at Wikisources, but at least {{information}} exists here). If you click "Wikisource", you will get a warning about not uploading to WS if you can upload to Commons instead: phab:F34527103. Inductiveloadtalk/contribs 12:21, 25 June 2021 (UTC)
Thanks, so nothing changed.— Ineuw (talk) 13:19, 25 June 2021 (UTC)

Server switch

SGrabarczuk (WMF) 01:19, 27 June 2021 (UTC)

500 works marked Ready for export

Today, the 500th work was checked and marked as Category:Ready for export: The Book of Scottish Song (1843). Clocking in at over 2500 pages on my e-reader, it's a hefty tome, and, moreover, is fully validated.

As well as being available for download via the usual download buttons, the books in Category:Ready for export are also presented as an OPDS catalogue at https://ws-export.wmcloud.org/opds/en/Ready_for_export.xml. If you have a capable e-reader, this provides a "book store" that you can browse directly from your device.
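For anyone who wants to browse such a catalogue from a script rather than an e-reader, here is a minimal sketch. OPDS catalogues are Atom XML feeds, so the standard library is enough to list the titles and download links. The sample feed below is invented for illustration (the titles and example.org URLs are hypothetical and are not taken from the real catalogue), and `list_entries` is just a name chosen for this sketch.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by OPDS 1.x catalogues.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Link relations starting with this prefix mark downloadable files in OPDS.
ACQUISITION_PREFIX = "http://opds-spec.org/acquisition"

# Hypothetical sample feed, invented for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Ready for export</title>
  <entry>
    <title>Example Work One</title>
    <link rel="http://opds-spec.org/acquisition"
          type="application/epub+zip"
          href="https://example.org/one.epub"/>
  </entry>
  <entry>
    <title>Example Work Two</title>
    <link rel="alternate" type="text/html"
          href="https://example.org/two.html"/>
  </entry>
</feed>"""


def list_entries(feed_xml):
    """Return (title, [acquisition hrefs]) for each entry in an OPDS feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        hrefs = [
            link.get("href")
            for link in entry.findall("atom:link", ATOM_NS)
            if link.get("rel", "").startswith(ACQUISITION_PREFIX)
        ]
        results.append((title, hrefs))
    return results


if __name__ == "__main__":
    for title, hrefs in list_entries(SAMPLE_FEED):
        print(title, hrefs)
```

To try it against the real catalogue, the feed text would first need to be fetched from the URL above; that step is left out here so the sketch runs offline.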

If you are interested in helping add to our collection of verified ready for export books, see Help:Preparing for export. If you have any questions at all, please feel free to drop me a note (especially if the help page is unclear, so I can fix it!). For those of you feeling competitive, French Wikisource has over 5,700 works in fr:Catégorie:Bon pour export. Jus' sayin'.

Also, I'd like to thank the Community Tech team for the overhaul of WS-Export that made the exported formats more readable and the export process faster, easier and (much) more robust in general. Thank you for your hard work on a difficult problem. Inductiveloadtalk/contribs 22:35, 28 June 2021 (UTC)

Page namespace gutter lost between text and image

In the page namespace the display has lost the gutter/padding between the text box and the image. Can we please get it back, the abutting of the two is disconcerting, and the whitespace improved readability. — billinghurst sDrewth 13:42, 28 June 2021 (UTC)

@Billinghurst: This is a side effect of a mostly technical change (modernising old HTML markup in PRP) that went out last week and has caused a couple of other regressions. I'm tracking the issue and have reported this problem. I expect this particular issue to be an easy fix, but due to a deployment train freeze this week (see #Server switch above) it probably won't go out until next week. Xover (talk) 16:12, 28 June 2021 (UTC)
Yes, sorry about this, it was my fault. I made a fix for this today and Sohom has merged it, so it'll be live after the next train. Sam Wilson 04:29, 29 June 2021 (UTC)

"Memoranda" blank pages for notes in texts

I just encountered a text that is in the public domain that has two sections, both labelled in the table of contents, that are both blank pages intended for taking notes within the book.

I looked for Wikisource guideline pages that would tell how to deal with this sort of thing, and I found nothing. Author intent was to allow for blank space for note-taking, and IMO we should be including them when labelled as memoranda sections for this reason. This really is more important for exports that can be printed than being viewed in the transcluded wiki pages.

I went ahead and created the pages as proofread (1, 2, 3, 4). I wanted to see if we can get consensus to include this material to stay true to the author/publisher intent in the original text, and if so we should mention somewhere in the guidelines pages that "Memoranda" pages are acceptable with something like {{dhr|30}} pasted onto them. PseudoSkull (talk) 18:08, 29 June 2021 (UTC)

I wouldn't have bothered, and I wouldn't have marked them as proofread. I don't think that it needs a consensus on what to do, I am happy to leave it to the initial contributor as long as it is neither confusing to the user, nor making the work problematic. — billinghurst sDrewth 13:03, 30 June 2021 (UTC)

Scan Localisation requests ...

(1869–1957) Not PD-UK.


ShakespeareFan00 (talk) 13:43, 30 June 2021 (UTC)

@ShakespeareFan00: I will attempt to import.
Re. but the scans are a US edition: this doesn't matter. A 1912 work always has an acceptable license at enWS because it's over 95 years old. It doesn't matter where the work was published: that only matters for Commons. Inductiveloadtalk/contribs 13:49, 30 June 2021 (UTC)
  • A number of these should be acceptable on Commons:
    • As first published in the U.S., but simultaneously elsewhere:
      • One Increasing Purpose
    • As first published in the U.S.:
      • Rebels and Reformers
    • As possibly (according to Commons, not sufficient) published in the U.S.:
      • The Way of Martha and the Way of Mary
  • The edition of The ABC of Relativity is still copyrighted, however, and should be deleted. (It maintains a distinct 1958 copyright date.) Some of the others may have been simultaneously published, but I am not sure; When We Were Very Young does not appear to have been, however. TE(æ)A,ea. (talk) 14:10, 30 June 2021 (UTC)

The body of the letter has been transcribed, but the signatures and dates below it have not, and I'm having trouble reading them. If anyone who's familiar with 18th century handwriting wants to transcribe/proofread(/validate) about half a page of signatures, I'd really appreciate it. The page to proofread is Page:Cook letter.jpg. —CalendulaAsteraceae (discusscontribs) 02:52, 16 June 2021 (UTC)

Hi, @CalendulaAsteraceae, that's a challenge I couldn't resist! I deciphered several of the names, and also made a table of the Fellows I could identify at Page talk:Cook letter.jpg. — Pelagic (talk) 12:02, 1 July 2021 (UTC)
@Pelagic: Thank you! —CalendulaAsteraceae (discusscontribs) 18:35, 1 July 2021 (UTC)

Index:A night in Acadie

This index includes the stories of A Night in Acadie. The old versions of the stories have no information about the date of the edition (besides the main page) and no sources, and their items on WD are empty. Can I overwrite these old texts after transcluding and move the stories to adequate subpages? I commit to updating the information on Wikidata. I know we said "don't overwrite and community will decide what with old texts". But nothing prevents us from deciding before proofreading and transcluding. Tommy J. (talk) 16:31, 23 June 2021 (UTC)

@Tommy Jantarek: What is usual for those short stories is that they were originally published in a journal. That piece of liter(ature) was published here in 2006, taking the Main for the name, not saying where it came from, etc. The wikidata item for A Night in Acadie should stay with the Main namespace. The version in the Main should be moved and a {{versions}} page placed there. Both the unsourced version and yours should get a new wikidata item and be "editions of" at the original wikidata.
None of this needs to happen until your sourced version is completed. On that, I have seen where a published, non-scan backed version becomes matched with a scan here, so maybe @Xover: can help with that, or maybe I was just seeing things....--RaboKarbakian (talk) 12:16, 24 June 2021 (UTC)
@RaboKarbakian: What is usual for those short stories is that they were originally published in a journal - but nothing about it is in the header, so we don't know exactly. Must I necessarily create new versions? What am I going to do? I'm going to transcribe these stories from the scan, later overwrite the old versions with the new ones, move them to adequate subpages, and update the Wikidata items as "versions". But I need community agreement, because we determined that we don't overwrite old texts without sources and The community can then have the conversation about what to do with works. Tommy J. (talk) 16:03, 24 June 2021 (UTC)
@Tommy Jantarek: The reason for that "don't overwrite" thing is that the existing wikipage is connected to a wikidata item for a specific edition of a work. When we replace the wikitext in the wikipage with new text we change the contents in a way that can't be detected by the software. The result is a real mess.
Instead you can proofread your nice scan-backed version and transclude it to new wikipages in mainspace, properly disambiguated. Once done we can either speedily delete the old version if it purports to be the same edition, or propose it for deletion at WS:PD. If the community wants to keep the old unsourced / non-scan-backed version in spite of its problems, then we can create a {{versions}} page and host both side by side. We can theoretically propose it for deletion before proofreading the new version, but I very much doubt the community will go for that. Xover (talk) 19:02, 1 July 2021 (UTC)
@Xover: I understand. And this is the reason why I am asking about these old texts in this case, since they are mostly not connected to WD items, and where they are, the items are empty. This is also the reason why I proposed to overwrite the text and update these items. That is why I am asking now, rather than creating 23 new texts and debating the old ones later. But I will do whatever you all advise. Tommy J. (talk) 19:23, 1 July 2021 (UTC)

Tech News: 2021-26

16:32, 28 June 2021 (UTC)

OCR tool

[Since any further notifications by Comm Tech don't seem obviously forthcoming. This is my best understanding, so if I make a mistake, Comm Tech, please correct me.]

Firstly, I'd like to say thank you to the Comm Tech team for the new OCR tool, which represents a substantial amount of technical effort and design work that we have mostly not noticed here, because it's been largely test-driven at the Indic Wikisources bnWS, hiWS and taWS, which were not as well served as we have been by the OCR tooling we know and love here.

Secondly, I'd like to remind everyone that the tool as it currently stands is only the first pan-Wikisource implementation and will continue to be improved over the remainder of the OCR tool project. And once that project is complete, the tool will still be able to be adjusted. Notably, though the on-wiki tool as it stands does not bring much to the table that our existing non-default gadgets did except for not needing users to opt into gadgets, there are plans to add more features to the on-wiki UI (as I understand it, the entire feature set of the https://ocr.wmcloud.org tool will eventually be available directly from the Proofreadpage UI), as well as more features for the OCR back end as well. I'd ask everyone to be patient as the work proceeds. Community feedback has been requested at and will continue to be sought at meta:Talk:Community_Tech/OCR_Improvements. Bugs and feature requests can be filed at phab:tag/wikimedia_ocr, or reported here and someone will file it for you.

Thirdly, I'd like to reassure people that, as your local friendly Interface Admin, I will be attempting to help integrate the new tool into the WS workflow effectively, and I am not intending to replace the existing OCR tool buttons with the new tool's UI unless there is strong community consensus to do so. The backend of the Google tool may be switched over, but it will still be available in the toolbar. The "phetools" black OCR button will only be switched over if the new tool provides equal or better results and speed (or if the phetools OCR service breaks and can't be fixed). Hopefully, this will eventually happen just to reduce the technical debt of the OCR system, but there's no point if it doesn't bring an advantage to the user.

Finally, on a note of slight admonition, can I suggest a slightly more engaged approach by the Comm Tech team when rolling out new things? If nothing else, just a quick heads up and explanation would be well-received. The vast majority of Wikisource users are not involved in the software development side of things, and having things change suddenly and without notice or ability to control it can make people feel anxious. This is a pattern of upset often seen when any software tool or website changes UI or workflow, but even a small amount of sensitivity goes a long way to making people feel part of the process of improvement, rather than the unwilling or unwitting subjects of capricious change for change's sake. Not that WS development is capricious, but very much web development is and without good communication, it's easy to assume that mysterious changes to WS are the same to users who are not involved in the development process. This is especially important when the work is done by formal teams like Comm Tech that have substantial back-channels that don't include general wiki users.

This is also important for another reason: if you tell us what you have done, we can say nice things about it and that will help you feel like your hard work is appreciated!

Again, thank you to the Comm Tech team for the work so far, and I am excited to see the new tool expand its abilities over the next iterations. Inductiveloadtalk/contribs 13:45, 30 June 2021 (UTC)

"engaged approach" yeah, this is a common pain point. the techs do not do comms or management well, so they do not try, when in an open community comms are vital to success. (it explains a lot of the WMF - wikimedia community conflict.) thanks for interfacing, i guess we will need some ambassadors / community diplomats to do the comms. Slowking4Farmbrough's revenge 00:40, 6 July 2021 (UTC)
@Inductiveload@Slowking4
Hello, thank you so much @Inductiveload for writing this up. My first release as a Product Manager of Community Tech has been a humbling experience.
I am new to the foundation and still understanding how rich the ecosystem of communications is here. I want to own this mistake. I wrongly assumed that updating the project page and talk page (which I also mistakenly forgot to properly sign with three ~ in one of my replies) would suffice. The goal is to prevent this from happening again. I have been retroactively building a list of other places and channels we could have leveraged to bring the contributors along with the changes prior to release.
Here is the working list I have so far, and I'd love to understand what other channels we could have leveraged.
Places we updated prior to release:
  • Project Page
  • Project Talk Page
  • Wikisource email list
  • Signal groups
Places we should update prior to release in the future:
  • Tech News
  • Scriptoriums
~ let's flesh this out -- where else can we update?
We hope to fulfill more wishes, and are also hopeful that those releases will come with more visibility and trust. Please let me know if I missed any channels and thanks again for this very useful feedback. NRodriguez (WMF) (talk) 15:23, 12 July 2021 (UTC)
@NRodriguez (WMF): FYI, the email list is hardly used, and I didn't even know there was a Signal group (which is rich coming from me because there's an "unofficial" Discord server, which is still pending bridging to the official IRC). People won't see changes to the project page at meta on their local watchlists at the Wikisources either. Because the only place people are guaranteed to be present is on-wiki, Scriptoriums are the standard place for "long form" messages, and there is a global message delivery service for that purpose: I think this is meta:MassMessage#Global_message_delivery and the Wikisource distribution list, though I have personally never used it.
Tech News is good for simple notes, but they always keep it very brief, so if you have more than a sentence or two, a mass message is probably the way forward. Inductiveloadtalk/contribs 15:46, 12 July 2021 (UTC)
@Inductiveload got it! Great to know re: the incoming discord server. Great to know about the MassMessage option and the Wikisource distribution list. Thanks again, NRodriguez (WMF) (talk) 20:51, 12 July 2021 (UTC)
thanks for reaching out. not really your fault, as comms are diffuse with many channels, most abandoned. (kinda like tools) and the community is cranky, thinking it is perpetual september. but i thought mass message was the standard method. we need a learning pattern / SOP of how to engage community - Alex is expert in this. --Slowking4Farmbrough's revenge 21:00, 12 July 2021 (UTC)
I have been working closely with CommTech over the past year for Wikisource related work and I admit that we should have been more communicative. Especially, since it was about a change to the Wikisource extension. We will definitely keep all the points in mind and I agree @Slowking4: about the need for a learning pattern or an SOP. --SGill (WMF) (talk) 10:04, 14 July 2021 (UTC)

Changing the default layout when reading text

When opening a text to read directly on wikisource, the layout is very large on any modern screen. It is hard to read.

It is usually considered that the optimal number of characters per line should be around 50 to 80 to make it inclusive and accessible to all readership. The current layout makes lines around 130 characters large. I think it is far too much and I struggle finding the next word when I reach the end of the line.

Would you agree to change the layout to a more user-friendly width? --Cassiodore89 (talk) 21:53, 25 June 2021 (UTC)
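As a rough illustration of the numbers in this comment, here is a small sketch that estimates characters per line from a column width. It assumes an average glyph is about half an em wide, which is only a rule of thumb and varies by font; the function names and the 65em and 35em widths are hypothetical, chosen to land on the figures quoted above.

```python
def chars_per_line(column_width_em, avg_char_width_em=0.5):
    """Estimate characters per line for a column of the given width.

    avg_char_width_em is an assumed average glyph width (about half
    an em); real fonts differ, so treat the result as approximate.
    """
    return round(column_width_em / avg_char_width_em)


def in_recommended_range(chars, low=50, high=80):
    """Check an estimate against the 50 to 80 character range cited above."""
    return low <= chars <= high


if __name__ == "__main__":
    for width in (65, 35):  # hypothetical column widths in em
        n = chars_per_line(width)
        print(width, "em ->", n, "chars, in range:", in_recommended_range(n))
```

Under that assumption, a 65em column comes out at about 130 characters per line, while a 35em column gives about 70.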

@Cassiodore89: Have you tried changing the dynamic layout in the left side bar? If you're seeing wide text, it probably says "Layout 1". The default is wide, but Layout 2 is a much narrower layout: phab:F34527500.
Also, the default Vector skin will one day change to a fixed-width column layout for the entire UI, which you can use by unchecking "Use Legacy Vector" in your settings under "Appearance". Inductiveloadtalk/contribs 22:04, 25 June 2021 (UTC)
Thanks @Inductiveload:! Layout 2 is indeed much better. I think it would deserve to be the default.
Yes, I use the new Vector, but still the text is large with layout 1. Thanks. --Cassiodore89 (talk) 22:25, 25 June 2021 (UTC)
also try out timeless skin, which works better (for me) on smaller screens. Slowking4Farmbrough's revenge 23:31, 25 June 2021 (UTC)
  •  Comment @Cassiodore89: The cookie will remember our preference. Layout 1 is the version without restrictions that works best with all displays/devices and uses the user's default fonts and font sizes. Anything else has constraints, and therefore left to users to apply as they see fit. — billinghurst sDrewth 16:47, 26 June 2021 (UTC)
    @Inductiveload: I always think that all the blue links in the sidebar makes the display options disappear into the list. I sometimes wonder on the value of changing the color of the display options, and the a:link to a more distinctive colour. Even if we say something like. Display options (toggle) to be overt. Maybe we could change the background-color for pBody to lightgreen for that portlet box, or something that highlights the box. — billinghurst sDrewth 17:13, 26 June 2021 (UTC)
Thanks @Billinghurst:. I think this is where we disagree: I do not think Layout 1 works best with all displays/devices: the text is too wide in all modern widescreen displays, which is the display of, I assume, 95% of users reading wikisource on a computer. The problem here is not the font family or size, but the size of the column. For example, the French version of wikisource kept the default font (as in Layout 1), but limits the width of the column on wide screens (as in Layout 2). I think this is more readable. --Cassiodore89 (talk) 15:05, 27 June 2021 (UTC)
IMO, at least, the biggest problem with the readability of Layout 1 is that it uses a sans-serif font. Sans fonts are inherently harder to track with the eye, especially in long lines of running text, and with modern high-res LCD displays the old 'use sans fonts for the web' rule is completely moot.... that was just for CRT monitors that couldn't sharply render serifs.
That, and not using kerning. Jarnsax (talk) 20:06, 27 June 2021 (UTC)
(not trying to stir up the 'sans vs. serif' debate here, actual point is to agree with others that Layout 1's sans text can render too wide to be readable for a lot of people) Something to actually consider might be adding a max-width on the order of 60-80ems to it, to help out people with high-res displays. Jarnsax (talk) 20:49, 27 June 2021 (UTC)
@Jarnsax: Use layout 2. In layout 1, we don't do anything with fonts, nor width, we impose nothing, we let it run and leave the user in full control. If you think that you can design a better layout, then go for it, and we can put it in for testing, there is that ability to add a test layout. — billinghurst sDrewth 13:47, 28 June 2021 (UTC)
  • Question Question Cassiodore89 and Jarnsax, I'm curious, why not make the browser window narrower to control line length? Is it that the desktop background is distracting, or some other reason? (Not asking to be nit-picky, but because the max-width in New Vector has been controversial.) — Pelagic (talk) 00:40, 1 July 2021 (UTC)
@Pelagic: It's certainly possible, but inconvenient (for me at least). Since things look distinctly different in Page: namespace as on normal pages (different font, no layout), a window that is 'too wide' when looking at an 'article' is usually way 'too narrow' in Page namespace, and I end up constantly messing with it. I think the more relevant point, though, is that most people are not going to know anything about the relationship between line lengths and readability.... if they see extremely long lines, they are just going to find the text difficult to follow and not know why. I think 'wikimedia in general' deals with this (and that sans-serif running text is really hard to track) by making the interline spacing really wide (compare to 'normal' spacing in the sidebar), while sticking to the (dubious) 'use sans fonts online' rule. I'm trying to avoid any font battle, I just think it would be a better choice (I don't really see a drawback) to prevent anyone from seeing text that is just unreadably wide. There actually is a 'limit' to page width on wikisource...the header and body stop getting wider, but it gives lines nearly 150 characters long, and an 'optimal' value is about half to a third of that (talking about Layout 1). Layout 2 actually has kind of the opposite problem...if viewed in a (very) narrow window, the text never reflows: it gives you a scroll-bar instead. I'm assuming it's set to 'fixed width' instead of 'maximum width', but that's unlikely to be an issue, I think, I only noticed because I was specifically looking to see how it wrapped words. Jarnsax (talk) 01:08, 1 July 2021 (UTC)
@Pelagic: As Jarnsax states, it is certainly possible, but inconvenient. My problem is not that there is no (complicated) way where I could eventually reach a better design (with more clicks, changing browser size, increasing zoom of my browser or whatever else): my problem is that there is a design issue that users (like me) with a wide screen (like most people) have to solve BEFORE being able to reach the purpose of the site. Someone who came here just to read a book will first have to fix the design to then be able to read. This is a problem. I think this is partly why the English version of wikisource is now less popular (see the stats) than the French version of wikisource, where this design issue (and other ones) was solved a long time ago. You suggest a DIY solution, I would prefer it to be fixed for everyone.
Also, I complain about the wide default layout, not because of personal taste, but because I think it is a design issue. This is why I would love to have it fixed for everyone, and not through a DIY trick. I believe it is a design issue because it is too wide, by default, according to all the publications I could read on the optimal line length. It makes the reading experience more difficult for most readers. This also means that this design is not the most inclusive we could have. --Cassiodore89 (talk) 18:49, 4 July 2021 (UTC)
Thanks, @Jarnsax. Good point about the Page: view benefiting from a wider window (I see that as a consequence of having the text and scan side-by-side).
Instead of designers deciding what they think is the "optimum" width, could it be made adjustable to user taste? If changing the window width isn't appealing, then maybe an on-page slider? Whilst it's understandable to want something that works well for most people without having to fiddle settings, I suspect "one-size-fits-most" may be an unattainable goal.
I had a look at French Wikisource as @Cassiodore89 suggested. The line length is good, like our Layout 2 and 4. But it feels like so much of the page is wasted.
Traditionally, printed media would deal with large pages by breaking them up into columns, but multi-columns are terrible on a vertically-scrolling page. As far as I know, most PDF and e-book viewers (including web-based ones) are still paginated. Would MediaWiki benefit from a page-oriented reading view? I imagine the dev. work to make that happen would be significant. And most online page-turner interfaces I've seen aren't great to use.
What would be cool is a horizontal layout that flows to the current window height, and scrolls the page sideways across however many columns it needs. (Microsoft had a lot of side-scrolling UIs in Windows 8, but they weren't well received.) Coding such a thing is way beyond my abilities, however. :(
Pelagic (talk) 07:05, 18 July 2021 (UTC)
@Pelagic: The layout currently is user-selectable via the "Layout XX" button. The UI for this is indeed atrocious and undiscoverable, but a new fancy UI will be quite a bit of work and I'd really rather not have to do half the job twice when the UI library change happens. At the same time, I'd like to see more features, including a separate font and size option and perhaps customisation of widths.
You can also choose to not allow pages to override your personal settings in Preferences.
Columns are a major, major problem on paginated displays (i.e. nearly all physical reader devices, as well as print/PDF contexts) for the following reason. On paper, it's easy to typeset this:
Line 1    Line 4
Line 2    Line 5
Line 3    Line 6
-----------------
Line 7    Line 10
Line 8    Line 11
Line 9    Line 12
-----------------
Note that between "Line 6" and "Line 7", the text reverts back to the left column. On a long HTML page, the same thing in 2 columns is like an endless single piece of paper:
Line 1    Line 7
Line 2    Line 8
Line 3    Line 9
Line 4    Line 10
Line 5    Line 11
Line 6    Line 12
This is "OK" on a screen (still not ideal for a very long list because you have to scroll up again to the top), but once you try to display it on pages, it goes badly wrong:
Line 1    Line 7 <-- line 3 -> line 7?! where's line 4?
Line 2    Line 8
Line 3    Line 9
----------------
Line 4    Line 10
Line 5    Line 11
Line 6    Line 12
For this reason, as far as I am aware, e-reader devices do not bother with column formatting as such; they just fall back to single-column vertical layouts. That is usually what you want anyway: they have small screens and large text, and they don't need columns to let printers save paper and ink on long works, since a 2000 page ebook weighs and costs the same as a 1000 page ebook with half-size text.
This hasn't stopped the (mis)use of constructs like {{multicol}} which use tables for columns and therefore trash export output (as opposed to {{div col}}, which will at least degrade to a single column), and we still don't have a good, easy-to-use solution for "page-parallel" texts like the Loeb Classical Library or "paragraph-parallel" texts like some treaties.
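For comparison, here is a minimal sketch of CSS-based columns, assuming a {{div col}}-style template relies on the CSS multi-column properties (the class name and the 25em value are hypothetical, not taken from the actual template). Because the number of columns is derived from the available width, a narrow container, or a renderer that ignores the property, ends up with one ordinary column, which a table-based layout cannot do.

```css
/* Hypothetical class name, for illustration only. */
.example-columns {
    /* As many columns as fit at about 25em each; a narrow screen gets one. */
    column-width: 25em;
    column-gap: 2em;
}

/* Paginated output: fall back explicitly to a single column. */
@media print {
    .example-columns {
        column-count: 1;
        column-width: auto;
    }
}
```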
With respect to a page-based view on wiki, I think that would be quite difficult, mostly for the same reasons: HTML works mostly on an "endless page" model, and forcing it into a page-based model makes lots of things not work. Which is why lots of works don't export well, even after various CSS features like {{div col}} are dropped. On the other hand, it would work rather well for texts like Loeb Classical Library, which are very, very difficult (impossible?) to present in any way other than page-by-page. That said, since most of our texts do come from paginated sources, a page-based mode that uses the same page breaks is likely less of a challenge, but I'm pretty sure it would fall over in many cases, probably involving tables or other "weird" things.
And all that said, it's also a major engineering effort in general to set that up in the first place, since it has to sit on top of the existing UI. Inductiveloadtalk/contribs 08:36, 18 July 2021 (UTC)
@Inductiveload: Surely we can change the colour of the display options, text, links, box as something easy to get some highlight to the options. — billinghurst sDrewth 13:03, 22 July 2021 (UTC)
@Billinghurst: sure, as a quick fix perhaps something like this in global CSS:
#d-textLayout {
    font-weight: bold;
}
Any CSS is fine by me (you can have boxes, underlines, even an icon), whatever people think is acceptable. Inductiveloadtalk/contribs 13:07, 22 July 2021 (UTC)
I think that we can be bolder and change links' colours. Maybe even put a help link next to "display options" and link it through to Help:Layout. — billinghurst sDrewth 13:33, 22 July 2021 (UTC)

Extract Text button

I have started seeing an extract text button on my interface, I'm sure it wasn't there yesterday. Is there any way to turn it off? Sorry I don't know how to upload a screenshot of it here. Sp1nd01 (talk) 18:52, 24 June 2021 (UTC)

@Sp1nd01: That is the new WS OCR tool (which is a Community Tech project). I'll let someone from CommTech field the overall question (@Samwilson: I guess that's you!) but if you must disable it ASAP, you could use
.ext-wikisource-ExtractTextWidget { display: none; }
Inductiveloadtalk/contribs 19:16, 24 June 2021 (UTC)
Thank you, I edited my common.css file and it has removed the button. I just found its location to be distracting. Sp1nd01 (talk) 20:54, 24 June 2021 (UTC)
Hi there, this is part of a recent release to improve our OCR experience. I would love to understand what about it was distracting? Was the button inhibiting your ability to interact with the page? Feel free to check out more details in our project page here. NRodriguez (WMF) (talk) 22:40, 24 June 2021 (UTC)
@NRodriguez (WMF): Hello, we have problems with this button on dewikisource. Would you please join our discussion on s:de:Wikisource:Skriptorium#Hilfe_benötigt_für_ein_von_mir_nicht_näher_zu_erklärendes_OCR-Problem. Thanks --Mapmarks (talk) 22:46, 24 June 2021 (UTC)
@NRodriguez (WMF): It should be set as an option in Preferences, either in Special:Preferences#mw-prefsection-editing > General options or gadgeted. A gadget is typically how we have managed the OCR tools in the past, though if it is now for all WSes, sticking it with the other editing options is reasonable. I am unsure whether it would default on or default off, though happy to hear opinion. I would hardly ever require it, and would be comfortable to toggle it in my preferences as required. — billinghurst sDrewth 02:52, 25 June 2021 (UTC)
It should be configurable, but I think it probably makes more sense as a button next to the "H/V layout" and "show header/footer" in the PRP toolbar section so it can be toggled more easily. Inductiveloadtalk/contribs 08:54, 25 June 2021 (UTC)
Hi @Billinghurst @Inductiveload @Sp1nd01, thanks so much for paying close attention to all these changes and coming to us with feedback! Right now, we're in the middle of a "transition" to the new functionality that you are seeing-- which introduces the ability to transcribe pages on all Wikisources for all contributors with an "Extract Text" button and the OCR options available. However, there is still some residue from the gadgets still being in the toolbar and configurable for users. The plan is to remove the gadgets next week, because as you pointed out everyone will have access to the transcribe functionality. For those who do not use the transcribe functionality with OCR, we conducted research to make sure the new button placement would be non-disruptive, intuitive, and accessible to all contributors. Here is the Phabricator ticket for the gadget-removal work, which we expect to tackle next week. Happy to answer any questions and hear your feedback. Hope to see you at our next wishlist! NRodriguez (WMF) (talk) 20:24, 25 June 2021 (UTC)
@NRodriguez (WMF): I'm curious what sort of research was conducted - who were the subjects of this research? And why was it done "for those who do not use the transcribe functionality" rather than for those who do use it? — Dcsohl (talk) (contribs) 16:11, 26 June 2021 (UTC)
@NRodriguez (WMF): "what about it was distracting" It introduces another facet to the user interface; it looks different and is placed differently to everything else on the page. It should be a toolbar button, or menu item, or a tab, or something else that fits with the pre-existing page design. It also overlays (on my system, at least) part of the page image; nothing should do that, uninvited. Contrary to the ideals expressed above, this is far from "non-disruptive, intuitive". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:49, 25 June 2021 (UTC)
@NRodriguez (WMF): Do you intend to respond? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:29, 10 July 2021 (UTC)
@Billinghurst @Dcsohl @Inductiveload @Ineuw @Mapmarks @Pigsonthewing @Slowking4 @Sp1nd01 Hey everyone, apologies for the prolonged response. We have been following overlapping discussions inside this ticket: https://phabricator.wikimedia.org/T285712. I tried to synthesize all the moving pieces and will be talking them over with the designer and the team to see how we can best address all needs. NRodriguez (WMF) (talk) 14:45, 12 July 2021 (UTC)
@NRodriguez (WMF): with respect, the discussions are overlapping and have lots of moving pieces because:
  1. there is poor issue tracker hygiene resulting from over-broad umbrella tasks being used to track what should be smaller work items
  2. the smaller details are entirely hidden from users on the wiki (I assume Comm Tech is using a private communication back-end like Slack to co-ordinate work?), so we on-wiki don't really know what's going on, which, combined with slow replies and a slow update cadence leads to
  3. both phab:T283897 and phab:T285712 have been mostly one-way dumping grounds for people wondering what's going on (and providing back-seat development because not only is it unclear what's going on, it's also not clear to what extent contributions are useful or wanted, what the "private" state of the OCR tool is, what work is in-flight, etc) which all goes back around to #1 - massive confused issue tracker tickets and no obvious movement.
This is a classic control theory example of the community's feedback loop being more or less totally open, resulting in powerful feedback impulses as the "system" tries to orient and correct itself. Just a little bit of comms and an incremental approach help to close that loop and decrease intense signals around sudden input changes.
The UI-first approach isn't helping either, IMO. Generally, I'd recommend burning down the back-end tasks first, using the simplest possible UI for now and leaving "innovative" UI for later, because between fighting with soon-to-be-removed-anyway OOUI hydra and managing community bikeshedding input, UI work will always expand to fill all available time. Plus all technical merit will be totally overshadowed by the UI issues (source: all comments about the OCR tool made so far). Just as in space science any exploration program which "just happens" to include a new launch vehicle is, de facto, a launch vehicle program [1], any software project that "just happens" to include a shiny new UI is, de facto, a UI project. And computer users are at least as grumpy as spacecraft certification authorities! :-) Inductiveloadtalk/contribs 16:22, 12 July 2021 (UTC)
@NRodriguez (WMF): I went looking for a place to lodge my experiences but ended up here because....
Three things:
  1. Having it as a preference option would have been nice; I would have used it for its vertical ocr'ing option which I think I saw. But re-enabling it via css hack was a pain, so the preference toggle, please!
  2. All in-house ocr is annoying to me because it doesn't add a space at the ends of the lines, so if not mindful of that, making paragraphs of ocr lines gets "sentencesthat look like this andare not worth the effortif you forget to addthem manually."
  3. The pulsing button is just too much! It reminded me of the charging Tesla, in the drive way, sucking up juice from some large plug in the garage. Eww. (a little bikeshedding)--RaboKarbakian (talk) 16:34, 12 July 2021 (UTC)
@NRodriguez (WMF): Thank you. You asked a question here, and here is where I (and others) answered it. I don't think it is fitting for you to now suggest I (or others) need to use another venue for my concern to be acknowledged. Nonetheless, I have viewed that Phabricator ticket, and note that it says that a "con" of the status quo is that the button "Could potentially block document text" despite my having already informed you here that it does do so. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:29, 12 July 2021 (UTC)
@Pigsonthewing and @RaboKarbakian, I hear that. Going to post what is on the Phabricator discussion here to make it easier to follow updates:
Hey all, I believe there is a lot of echoed consensus on the new requirements @Xover and @Inductiveload posed. From the discussion in the English Scriptorium, as well as other comments and tickets I've seen, I do believe there will always be some competing needs. However, I am optimistic about these new requirements from @Xover:

  •  OCR should always be available in the normal editor toolbar
  •  OCR should be triggerable with a single button click (no menu/dropdown)
  •  Engine selection and advanced options should be available, but not required to run OCR with defaults

We will preserve the onboarding pulsating button on the proposed new OCR toolbar button, which should disappear after a user first encounters it.
Having the OCR/Extract text button in the toolbar, should take care of the painpoint around blocking text @Pigsonthewing-- thanks for flagging this!
As for your third concern @RaboKarbakian, if you click "ok" on it the first time you are exposed to it, the pulsating button should disappear.
We think it will be helpful for onboarding new contributors who may not be aware of the transcription tools available to them, but we are aligned that it should not annoy established contributors and that they should be able to dismiss this onboarding after they are first exposed to it. We use this pattern in other places.
We hope that placing the functionality in the toolbar will mitigate your concerns for 1. and 2.
@Inductiveload I hear you on the process feeling like a bit of a black-box and the poor hygiene on task-tracking. We will brainstorm solutions. As for the tickets that centralize community feedback, my goal was to centralize all the moving threads in one task and cut more tickets based on the resolution there-- but I can see how that could very easily come off as a dumping ground. The team will be having a retrospective meeting about how to prevent the mishaps in the future, and this thread helps me learn so thanks again all! NRodriguez (WMF) (talk) 17:33, 13 July 2021 (UTC)
@NRodriguez (WMF): Thank you; that all sounds very positive. As for onboarding new users, I urge you and Wikisource colleagues to review the comic-strip style graphic shown to new users on Wikimedia Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:18, 13 July 2021 (UTC)

┌─────────────────────────────────┘
How can this feature be removed? This addition broke my proofread edit page. The height of the textarea is over 600px and I can't see what I am doing. — Ineuw (talk) 12:25, 25 June 2021 (UTC)

i would ask developers to be nice and announce interface changes, before people have to query you about it. this is a distributed community, and changing task flows requires communication. maximum user flexibility, including turning features off, tends to increase user acceptance. --Slowking4Farmbrough's revenge 23:53, 25 June 2021 (UTC)
  •  Comment NRodriguez Please do not impose onto people what is not necessary, and what has always been a choice to members of the community, not mandated. Give the ability to turn it off and I will be comfortable. Don't be retrograde. — billinghurst sDrewth 16:42, 26 June 2021 (UTC)

Please, help!

I still can't find a way not to display that nasty new button in the scanned image. Can anyone please explain how to handle this? I really don't need a button like this. --Dick Bos (talk) 18:39, 21 July 2021 (UTC)

@Dick Bos: As mentioned way at the top, add
.ext-wikisource-ExtractTextWidget { display: none; }
to User:Dick Bos/common.css. Jarnsax (talk) 18:49, 21 July 2021 (UTC)
@Jarnsax Thanks a lot! It works. I had already tried it on "common.js", but that didn't work out. Thanks again. --Dick Bos (talk) 18:56, 21 July 2021 (UTC)
@Dick Bos: I realized after saving that that it might sound rude, since he didn't actually say where to put that snippet. Sorry if so. Jarnsax (talk) 19:02, 21 July 2021 (UTC)
@Jarnsax No problem at all! --Dick Bos (talk) 06:55, 22 July 2021 (UTC)
FYI, this was moved to the toolbar today (cf. phab:T285712), so expect to see that arriving on-wiki next Wednesday (unless it is backported before then).
You can see what it will look like at the Beta Wikisource (which gets all new code deployed more or less immediately): https://en.wikisource.beta.wmflabs.org/w/index.php?title=Page:Wind_in_the_Willows_(1913).djvu/47&action=edit Inductiveloadtalk/contribs 21:06, 21 July 2021 (UTC)
@Inductiveload (and @NRodriguez (WMF)) It is good to see that the new button now will be integrated, more or less, in the "normal" interface of the edit-page.
Apart from that, in general it is really a good idea to introduce these kinds of innovations first in some "beta"-environment, before causing widespread troubles for "old-style" end-users like me. Good luck. --Dick Bos (talk) 06:55, 22 July 2021 (UTC)
Technically it was introduced in a beta environment: https://en.wikisource.beta.wmflabs.org gets all new code applied immediately it is merged. That said, there was no message at enWS (or any WS) to let people know to look at betaWS, even at meta:Talk:Community_Tech/OCR_Improvements only the rollout to Indic WSes (the main early beneficiaries of the new OCR tool) was mentioned. I was not pinged on that message, either. I get the feeling that since phab:T285311 was done through a config change and therefore happened out-of-sync with release trains, that messaging got extra confused (normally you would have the opportunity to use Tech News, at least).
Certainly, a Mass Message notification to local Scriptoriums to tell people "hey, look at betaWS" is the gold standard if you're seeking feedback or planning a change that users should know about. If then deemed necessary, local admins can add to MediaWiki:Watchlist-announcements if it's useful to tell all users, not just Scriptorium-readers. Inductiveloadtalk/contribs 09:04, 22 July 2021 (UTC)
The button is now in the toolbar. @Ineuw, @Billinghurst, @Sp1nd01, @Tommy Jantarek, @RaboKarbakian, @Dick Bos, @CVValue: people with the CSS set to disable it may now want to re-enable the display of the button, as it does provide the menu with access to the advanced OCR page (which you can use to change languages or OCR models, page segmentation modes, plus any other features that will be added over time). Inductiveloadtalk/contribs 11:11, 29 July 2021 (UTC)
@Inductiveload: thanx for the message. I re-enabled the display. That means, now there are two OCR-buttons in my toolbar (and 99,9% of the time I don't need one)! But at least now I know how to hide it. Greetings, --Dick Bos (talk) 13:23, 29 July 2021 (UTC)
@Dick Bos: The other OCR buttons can be enabled/disabled in your Gadget preferences (it's just the new one that for various technical reasons can't be controlled that way). Xover (talk) 13:27, 29 July 2021 (UTC)
@Xover. Thank you for that. I found it. --Dick Bos (talk) 13:32, 29 July 2021 (UTC)

Index:A Study in Colour - Augusta Zelia Fraser.pdf

I have removed the first (Google) page from this PDF. The first page (which was blank) needs to be deleted, and the remaining pages (which have been created) need to be shifted appropriately. The OCR will also need to be adjusted. TE(æ)A,ea. (talk) 18:35, 6 June 2021 (UTC)