Wikisource:Scriptorium/Archives/2021-02

Please do not post any new comments on this page.

This is a discussion archive first created in February 2021, although the comments contained were likely posted before and after this date.

See current discussion or the archives index.

Tech News: 2021-05

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Problems

IPv6 addresses were written in lowercase letters in diffs. This caused dead links since Special:Contributions only accepted uppercase letters for the IPs. This has been fixed. [1]
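The case issue above can be illustrated with a minimal Python sketch. It only shows that lowercase and uppercase spellings denote the same IPv6 address and how one might upper-case a link target; it is not MediaWiki's code, the helper names are hypothetical, and the exact form Special:Contributions expects is not stated here.

```python
import ipaddress


def same_ip(a: str, b: str) -> bool:
    """Return True if two textual IP addresses denote the same address."""
    return ipaddress.ip_address(a) == ipaddress.ip_address(b)


def upper_form(ip: str) -> str:
    """Return the address with its hex digits upper-cased (unchanged for IPv4)."""
    return str(ipaddress.ip_address(ip)).upper()


if __name__ == "__main__":
    lower = "2001:db8::ff00:42:8329"  # documentation-range example address
    print(same_ip(lower, lower.upper()))
    print(upper_form(lower))
```

Comparing parsed addresses rather than raw strings avoids the kind of case mismatch described in the item.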

Changes later this week

You can soon use Wikidata to link to pages on the multilingual Wikisource. [2]
Often editors use a "non-breaking space" to make a gap between two items when reading but still show them together. This can be used to avoid a line break. You will now be able to add new ones via the special character tool in the 2010, 2017, and visual editors. The character will be shown in the visual editor as a space with a grey background. [3][4]
Wikis use abuse filters to stop bad edits being made. Filter maintainers can now use syntax like 1.2.3.4 - 1.2.3.55 as well as the 1.2.3.4/27 syntax for IP ranges. [5]
The new version of MediaWiki will be on test wikis and MediaWiki.org from 2 February. It will be on non-Wikipedia wikis and some Wikipedias from 3 February. It will be on all wikis from 4 February (calendar).
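To illustrate the difference between the two IP range notations mentioned for abuse filters, here is a small Python sketch using the standard ipaddress module. It is not the AbuseFilter implementation; the function names are made up for this example, and it assumes that a CIDR written with host bits set (like 1.2.3.4/27) is meant as the enclosing block.

```python
import ipaddress


def in_explicit_range(ip: str, first: str, last: str) -> bool:
    """True if ip lies in the inclusive range first - last."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last)


def in_cidr(ip: str, cidr: str) -> bool:
    """True if ip lies in the CIDR block; host bits in cidr are masked off."""
    network = ipaddress.ip_network(cidr, strict=False)
    return ipaddress.ip_address(ip) in network


def cidr_blocks(first: str, last: str) -> list:
    """List the CIDR blocks that together cover exactly first - last."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


if __name__ == "__main__":
    print(in_explicit_range("1.2.3.40", "1.2.3.4", "1.2.3.55"))
    print(in_cidr("1.2.3.40", "1.2.3.4/27"))
    print(cidr_blocks("1.2.3.4", "1.2.3.55"))
```

The explicit range does not line up with a single CIDR block, which is presumably why the dash syntax is convenient for filter maintainers.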

Future changes

Minerva is the skin Wikimedia wikis use for mobile traffic. When a page is protected and you can't edit it you can normally read the source wikicode. This doesn't work on Minerva on mobile devices. This is being fixed. Some text might overlap. This is because your community needs to update MediaWiki:Protectedpagetext to work on mobile. You can read more. [6][7]
Cloud VPS and Toolforge will change the IP address they use to contact the wikis. The new IP address will be 185.15.56.1. This will happen on February 8. You can read more.

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

22:38, 1 February 2021 (UTC)

As the way Tech News is written is the feature most commonly pointed out in both positive and negative feedback, I have been making an en-gb version of Tech News for some months now. It's not necessarily British, but it merges extremely short sentences together and allows for terms that anyone well-versed in the English language would understand. "Well-versed", for example, is a term you probably wouldn't find in the regular Tech News. Here's the above issue in en-gb. I also occasionally add some additional explanation or the situation for a particular issue on English-language projects. I haven't really advertised it, and pageviews are generally disappointing. I'd like to hear some feedback. Do you like it? Do you have suggestions? Alexis Jazz (talk) 06:18, 2 February 2021 (UTC)
@Alexis Jazz: I like information that's not pared down to the smallest common denominator, and I would like my Tech News to be written for a fairly technical audience. I would also like to see more information for the items that are of interest to me, and the way you've expanded the item on the Minerva changes is a good example. However, I don't think you're doing this effort any favours with the br-eng schtick (it took me a while to figure out what you were on about), and I probably won't go looking for an alternate version of the text. I scan Tech News when it lands on my watchlist, so whatever is in that MassMessage is what I'll read. If there's to be any point to your effort it needs to replace Tech News' writing, not supplement it. --Xover (talk) 07:50, 2 February 2021 (UTC)

"Fine print: Tech news in British English (en-gb) does not necessarily use British spelling. It is only British in spirit, unoversimplified when compared to the regular English version, allowing more complex terms (e.g. "issues" instead of "problems") and constructions." is offensive bullshit. Desist. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:40, 2 February 2021 (UTC)
Certainly, Henry Watson Fowler would disagree that British English is in opposition to plain English, along with George Orwell! Inductiveload—talk/contribs 13:30, 2 February 2021 (UTC)

Made a central feedback thread at m:Talk:Tech/News/2021/05/en-gb. Maybe should have done that from the start. Alexis Jazz (talk) 15:02, 2 February 2021 (UTC)

Wikisource highlights

I wanted to highlight two points above, which are (or in the latter case may be) particularly relevant for en.Wikisource:

You can soon use Wikidata to link to pages on the multilingual Wikisource. [8]
Minerva is the skin Wikimedia wikis use for mobile traffic. When a page is protected and you can't edit it you can normally read the source wikicode. This doesn't work on Minerva on mobile devices. This is being fixed. Some text might overlap. This is because your community needs to update MediaWiki:Protectedpagetext to work on mobile. You can read more. [9][10]

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:59, 2 February 2021 (UTC)

The overlapping text thing doesn't apply to Wikisource, as one could have gathered from the en-gb version of the newsletter. Alexis Jazz (talk) 15:08, 2 February 2021 (UTC)

Editor's note

Newbie here. I was working on this page and noticed a reference to page 69 of Hittell's History of California, which doesn't appear to be on Commons or Wikisource as yet. Is there a way for me to add an editor's note or similar pointing the reader to this page on the Internet Archive? —unsigned comment by AleatoryPonderings (talk).

Comment Not directly helpful to you right now, but perhaps what we could do in this case is have a template that links to the work via Wikidata with something like {{work link|Q105297614}}. If a WS page exists as a sitelink there, then the link leads to the sitelink (i.e. enWS's own History of California). If it does not, the link leads to Wikidata and/or pops up a dialog with whatever authority control data exists at Wikidata and some useful links like "visit the IA", "import from the IA", "see OCLC record", "create this page", etc. (which would need a JS gadget).

In the meantime, we usually just redlink the work title and cross our fingers and hope that one day it'll get linked up if the page is created. Inductiveload—talk/contribs 15:41, 4 February 2021 (UTC)

@Inductiveload: {{Wikidata entity link}} looks like it would work well as a stopgap measure in this case? Unless there's policy against that, I might just drop that in to the relevant page. AleatoryPonderings (talk) 18:05, 4 February 2021 (UTC)

Aha, such a thing already exists: {{Wikidata link}}! It will link to WS if possible, and then fall back to enWP and finally Wikidata. I don't think there would be a policy against it, since it's a lot better than a redlink IMO, but I could be wrong. Inductiveload—talk/contribs 18:25, 4 February 2021 (UTC)
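The fallback order described above (Wikisource first, then enWP, then Wikidata) can be sketched in Python as follows. This is only an illustration of the logic, not how {{Wikidata link}} is implemented; the function name is hypothetical, and the sitelinks dictionary is a hand-written stand-in for whatever sitelink data the Wikidata item actually holds.

```python
from urllib.parse import quote

# Sites to try, in order of preference, keyed by Wikidata site ID.
FALLBACK_SITES = [
    ("enwikisource", "https://en.wikisource.org/wiki/"),
    ("enwiki", "https://en.wikipedia.org/wiki/"),
]


def resolve_link(qid: str, sitelinks: dict) -> str:
    """Return the best available URL for an item, falling back to Wikidata."""
    for site_id, base_url in FALLBACK_SITES:
        title = sitelinks.get(site_id)
        if title:
            # Wiki URLs use underscores in place of spaces.
            return base_url + quote(title.replace(" ", "_"))
    return "https://www.wikidata.org/wiki/" + qid


if __name__ == "__main__":
    # Assumed sitelink data, for illustration only.
    print(resolve_link("Q105297614", {"enwikisource": "History of California"}))
    print(resolve_link("Q105297614", {}))
```

With no sitelinks at all, the reader lands on the Wikidata item rather than on a redlink, which is the trade-off being discussed in this thread.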

Can I offer a differing opinion? If we are indicating a work that we wish to have, a red link is better in the body of the work, as it indicates that we don't have the work and aligns with Wikisource:Wikilinks. We definitely would create the author page, and list and link to the scan there. We can also add a note in the notes field of the subpage of the work if it is considered pertinent to do so. Our task is producing English language works, and obscuring that we don't have the work, or where the link leads, is contrary to my understanding of our current philosophy for main ns.

Machine readable

The first book I uploaded is impossible for the machine to read, and I think that's the prerequisite for further editing. Can someone give me a tip on how I can continue? Thank you. https://en.wikisource.org/wiki/File:The_Renaissance_In_India.djvu --Riquix (talk) 15:16, 4 February 2021 (UTC)

@Riquix: thanks for the upload! This was because you did not include a license template. That is mandatory when uploading files, as we can only accept works that are public domain in the US at Wikisource.

Because Aurobindo died in 1950, this work should have been uploaded to Commons, as it is PD in both the US (because it was published before 1926) and in India, because the author has been dead for more than 60 years. I have now moved it - the filename remains the same. You can now create Index:The_Renaissance_In_India.djvu as normal. Inductiveload—talk/contribs 15:33, 4 February 2021 (UTC)

Wiki Loves Folklore 2021 is back!

Please help translate to your language

You are humbly invited to participate in Wiki Loves Folklore 2021, an international photography contest organized on Wikimedia Commons to document folklore and intangible cultural heritage from different regions, including folk creative activities and many more. It is held every year from the 1st till the 28th of February.

You can help in enriching the folklore documentation on Commons from your region by taking photos and recording audio and video, and submitting them to this Commons contest.

Please support us in translating the project page and a banner message to help us spread the word in your native language.

Kind regards,

Wiki Loves Folklore International Team

MediaWiki message delivery (talk) 13:25, 6 February 2021 (UTC)

A table of a geometrical progression

Please help to format a table in reference: Page:William Blackstone, Commentaries on the Laws of England (3rd ed, 1768, vol II).djvu/218. I can't figure out how to do it properly. Ratte (talk) 19:19, 6 February 2021 (UTC)

See User:Beeswaxcandle/Sandbox4 for four possibilities. I'm not happy with the last, and wouldn't use it. Depending on my mood at the time, I'd go for either the second or the third, as they both adequately reproduce the authorial intent. Beeswaxcandle (talk) 02:20, 7 February 2021 (UTC)

Thank you very much for all four possibilities! Each one is good. Ratte (talk) 11:12, 7 February 2021 (UTC)

@Ratte: Gave it my best shot minutes ago. (My approach involved a little something they call {{dotted cell}}, which I've ushered into my toolkit thanks to its occasional need in the Malagasy grammar primer I'll finish covering by Monday.) --Slgrandson (talk) 03:53, 7 February 2021 (UTC)

Thank you, you’re amazing! Using dotted cell is a very elegant solution; such an option never even crossed my mind. I have also used your approach on another table. Ratte (talk) 11:12, 7 February 2021 (UTC)

Commons categories—editions vs works and wikidata items

Hi to all. I have recently noticed that there has been some movement of categories at the edition level at Wikidata to the work level. This should not be happening where we have populated the Commons category with images from one edition. I would like to recommend that we look to better name such categories at Commons so that it is clearly evident that they relate to a year of publication or the edition. I think that there is value in looking at what we have done historically at Commons and starting on a means to do some clarification. If we don't do this, the consequence is that we lose the automated Commons categorisation in our headers when they are attached to the work, not the edition. — billinghurst sDrewth 01:21, 8 February 2021 (UTC)

Portal vs Author

Is it ok to create a Portal and have it redirect to an Author page? --RAN (talk) 01:54, 7 February 2021 (UTC)

No, that would be a cross-namespace redirect and those are speedy deleted under rule M3. One or the other, but not both. Beeswaxcandle (talk) 02:22, 7 February 2021 (UTC)

If you are moving from one to the other you can utilise a substituted {{dated soft redirect}} as a temporary measure so that the bots catch up. — billinghurst sDrewth 06:15, 8 February 2021 (UTC)

Tech News: 2021-06

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

The Wikipedia app for Android now has watchlists and talk pages in the app. [11]

Changes later this week

You can see edits to chosen pages on Special:Watchlist. You can add pages to your watchlist on every wiki you like. The GlobalWatchlist extension will come to Meta on 11 February. There you can see entries on watched pages on different wikis on the same page. The new watchlist will be found on Special:GlobalWatchlist on Meta. You can choose which wikis to watch and other preferences on Special:GlobalWatchlistSettings on Meta. You can watch up to five wikis. [12]
The new version of MediaWiki will be on test wikis and MediaWiki.org from 9 February. It will be on non-Wikipedia wikis and some Wikipedias from 10 February. It will be on all wikis from 11 February (calendar).

Future changes

When admins protect pages the form will use the OOUI look. Special:Import will also get the new look. This will make them easier to use on mobile phones. [13][14]
Some services will not work for a short period of time from 07:00 UTC on 17 February. There might be problems with new short links, new translations, new notifications, adding new items to your reading lists or recording email bounces. This is because of database maintenance. [15]
Last week Tech News reported that the IP address Cloud VPS and Toolforge use to contact the wikis will change on 8 February. This is delayed. It will happen later instead. [16]

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

17:42, 8 February 2021 (UTC)

Ellipses and asterisks

One curiosity I've noticed in old texts (eg, this page of "Some Fundamental Legal Conceptions as Applied in Judicial Reasoning") is that ellipses in quotations are rendered in asterisks (* * *) as opposed to periods (. . .). Is this the sort of thing that should be rendered as it was in the olden days, or is it OK to silently modernise it? I'm thinking the asterisk-ellipsis might confuse the contemporary reader (extrapolating from the fact that I was initially confused until it clicked). AleatoryPonderings (talk) 18:38, 8 February 2021 (UTC)

Both ways are quite OK, imo depending on what the contributor prefers. I personally would stick to the original, showing the readers that historically other ways of expressing ellipsis were used too. It comes from my experience as a reader: when I read old texts, I enjoy them more when they contain original typography, which gives them a sort of historical touch :-) --Jan Kameníček (talk) 18:59, 8 February 2021 (UTC)

@AleatoryPonderings: I'd go with the original. Also, I'd use {{...|3|*}} as that inhibits line-breaks between two of the dots/stars. Inductiveload—talk/contribs 20:34, 8 February 2021 (UTC)
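As a rough illustration of why such a template helps, the Python sketch below builds a spaced ellipsis whose marks are joined by non-breaking spaces (U+00A0), so a line break cannot fall between them. This is an assumption about the general technique, not a description of how the {{...}} template is actually implemented, and the function name is hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space


def spaced_ellipsis(count: int = 3, char: str = ".") -> str:
    """Join `count` copies of `char` with non-breaking spaces."""
    return NBSP.join([char] * count)


if __name__ == "__main__":
    # repr() makes the non-breaking spaces visible as \xa0
    print(repr(spaced_ellipsis(3, "*")))
```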

Sharing en.ws main page in Facebook

I have shared the link to the en.ws main page on the Czech Wikipedia Facebook page, but the displayed thumb contains an extract of the text of a Wells novel (which was featured on our main page some time ago) instead of the current featured text. Is there anything that could be done about it next time? --Jan Kameníček (talk) 12:07, 8 February 2021 (UTC)

<shrug> Facebook??? <shrug> — billinghurst sDrewth 13:19, 8 February 2021 (UTC)

Well, I thought that they take the thumb from us and so the problem is in updating something here… --Jan Kameníček (talk) 14:18, 8 February 2021 (UTC)

@Jan.Kamenicek: I think it's a good thing to put Wikisource out there in front of new eyes! I don't know anything about Facebook's systems, but a quick google leads me to think we should be embedding something called "Open Graph" metadata in our pages (specifically in the <head> section). Also apparently, there's a way to refresh the "preview", but you might need to "own" the domain to do that. Whoever operates https://www.facebook.com/Wikisource/ might know (and also make it possible to view the page without logging in?)

There is a tool to show what Facebook "sees" when it looks at Wikisource: https://developers.facebook.com/tools/debug/?q=https%3A%2F%2Fen.wikisource.org (needs login). Pressing "Scrape again" there refreshed the page, but it seems not to have propagated to the Wikipedie page yet (not sure if it will).

Inductiveload—talk/contribs 15:09, 8 February 2021 (UTC)

Further digging on why we have no image but enWP does: enWP uses an extension called mw:Extension:PageImages to provide the image via the og:image tag, so when someone social shares a link on certain platforms, the link gets a nice image to go with it. enWS doesn't have this installed. Should we ask for it to be installed? Inductiveload—talk/contribs 15:38, 8 February 2021 (UTC)
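For anyone who wants to check which Open Graph tags a page exposes without logging in to Facebook's debugger, a minimal Python sketch using only the standard library might look like this. The sample HTML and the image URL are made up for illustration; the code only looks for meta tags whose property starts with "og:" and says nothing about how Facebook itself picks a preview.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def open_graph_properties(html: str) -> dict:
    """Return a dict of Open Graph properties found in the HTML text."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """<html><head>
<meta property="og:title" content="Wikisource, the free library">
<meta property="og:image" content="https://example.org/scribe.jpg">
<title>Wikisource</title></head><body></body></html>"""

if __name__ == "__main__":
    print(open_graph_properties(SAMPLE_PAGE))
```

A page with no og:image entry in the result would be relying on the sharing platform to guess an image, which matches the behaviour described in this thread.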

That would be great, but it also depends on how difficult it is. I personally post a link to en.ws on Facebook about once a year, so yes, it would help me, but I am not sure if it is enough for you to bother about… On the other hand, we never know who else wants to link us to a social networking site, and maybe there are more people (not necessarily contributors) whom it would help as well. --Jan Kameníček (talk) 16:33, 8 February 2021 (UTC)

I think I've managed to work around it by using a larger image in our header template on the Main Page, which is big enough for Facebook to see it as a default image candidate. By making it a JPG, it's about 30kB smaller than the 200px PNG we were using before. There is also a description for Wikisource in general when you share the main page (articles should use the lede of the text on that page, which sadly doesn't always go brilliantly). Inductiveload—talk/contribs 17:35, 8 February 2021 (UTC)

@Quiddity (WMF): Can you add to this discussion with your gorgeous hat on? Thanks. — billinghurst sDrewth 01:10, 9 February 2021 (UTC)

Hallo. I will ping the Reading Web team to take a look at this, and they should be able to either answer or find the best place/person to ask. There are a large number of related Phabricator tasks (e.g. phab:T213505 and phab:T56829 and the many tasks they connect to), and I cannot easily determine what the current short-term or mid-term options are. (Plus I try to never click Facebook links, so I can't see the examples above!). Cheers, Quiddity (WMF) (talk) 23:56, 9 February 2021 (UTC)

@Billinghurst: (or other Commons admins): it might be a good idea to protect File:Accueil scribe invert.png and File:Accueil scribe invert_300.jpg, since they're both often used at Wikisourcen and they're open for abuse at Commons. Inductiveload—talk/contribs 17:43, 8 February 2021 (UTC)

Done permanent protection for both images at Commons. You may wish to also consider protecting the empty files here at enWS so someone doesn't upload into the space (though vaguely think that there may be some system components that inhibit that.) — billinghurst sDrewth 01:08, 9 February 2021 (UTC)

Tarzan and the Ant Men (1924) done transcribing

I am now done transcribing Tarzan and the Ant Men. There are a couple of pages with images I marked as problematic. The rest of the text needs proofreading/validation. This was one of the requested texts from 1924 and I hope I was helpful. (SurprisedMewtwoFace (talk) 22:01, 9 February 2021 (UTC))

Great! If you believe you transcribed the individual pages well and checked typos, you can mark them directly as "proofread", so that another person who may have a look at it later could mark it as "validated". I went quickly through a couple of pages and they look fine. The only minor problem I have noticed was that sometimes you used curly quotation marks (such as at Page:Tarzan and the Ant Men.pdf/171) and sometimes straight ones (such as at Page:Tarzan and the Ant Men.pdf/347), which should ideally be unified (no matter which of the two possibilities you choose), but it is not a big issue. --Jan Kameníček (talk) 22:26, 9 February 2021 (UTC)

I'll unify with straight quotation marks, as I suspect that's the main issue and I think Wikisource prefers straight quotes. Thanks for your advice! (SurprisedMewtwoFace (talk) 22:38, 9 February 2021 (UTC))

@SurprisedMewtwoFace: Well, different contributors have different preferences; you can really choose, as long as you are consistent within the given work :-) --Jan Kameníček (talk) 22:46, 9 February 2021 (UTC)

Improving old editions by Wikisource contributors

I would like to ask about the general opinion of changing some parts (e. g. pictures) of old book editions by WS contributors in our main namespace. An example is at Page:Field Book of Stars.djvu/25 where the original picture from the 1907 book was replaced by a modernised version by RaboKarbakian. To me it seems like creating a new edition of the work instead of transcribing the original 1907 edition. The problematic interventions imo include:

Colouring originally black and white pictures
Adding legends to pictures which were not present in the original
Changing the orientation of the labels with names of stars was imo also not necessary and the original idea of the author, i. e. to enable the reader to rotate the picture, was lost by it (although readers cannot rotate monitors, they can still rotate mobiles or they may want to print it).

Another example from the same book is here, notice especially the added legend.

As a result readers may get a false idea of what the original pictures in the book really looked like.

Although neither WS:Annotations nor Help:Annotating speak about altering original pictures, I would say that this falls under the annotations policy. However, before I intervene, I would like to know other opinions. --Jan Kameníček (talk) 01:11, 10 February 2021 (UTC)

Tending to agree that these changes are undesirable. Surely it would have been possible to make a faithful representation of the original in SVG, and we want a faithful representation of the original. I think your first example is definitely covered by the (proposed) policy, because the added legend is "additional text that is not part of the source work". BethNaught (talk) 08:39, 10 February 2021 (UTC)

That is true. But I think that the colours, whose aim is probably to make the picture more comprehensible or something like that, are a sort of annotations too. --Jan Kameníček (talk) 08:47, 10 February 2021 (UTC)

It is the result of a clash of Wikimedia projects. Commons has guidelines for making suitable images. I requested that the images for that book be redone as SVG. In my mind, at the time of the request, were sharp, readable, and scalable line drawings, and what I got were the beautiful, legended, and colorized images, with improved readability for the color-blind as per Commons guidelines.

First of all, I couldn't say no to the artist for several reasons. The first being beauty, which I am so weak for. The second would be the good guidelines there -- I would need a different set of guidelines, (authored by people who are not looking at beautiful as I was) for the requirements of images for wikisource to be made and posted at the commons, eventually to be translated for artists of different languages. Third, this particular book is timeless(?) or only a little dated. In several hundred thousand years, a different star will be "the north star" but even then, the constellations will be mostly the same.

So I have taken one more step towards the modernization of this Field Book by using {{SIC|binoculars|an opera-glass}} to put the name of the old device into the tooltip and to display the name of the modern device.

It is a rare book here that is both technical and (mostly) current. It is also an example of how separated the Wikimedia projects are. I think that there can be a style that would flip between the faithful old version and the mildly modernized new version, which might be useful for other projects. By other projects, I mean in particular math books and technical drawing books. I found a bunch, looking with SVG on my mind. Mathematics too is (mostly) timeless.

Summary:

Couldn't say no, guidelines (and beauty)
Need for guidelines
Need for style (innovation suggestion)
The need for this one indulgence regardless of decrees against innovations.--RaboKarbakian (talk) 14:07, 10 February 2021 (UTC)

@RaboKarbakian: Thanks for the explanation. The book is really interesting and I understand you are trying to make it as perfect as possible. The clash of the projects is not as big as it might seem, because (as discussed above) Wikisource enables contributors to make annotated versions which may contain some kinds of improvements to the original works. So the correct solution is to make a version as faithful to the original as possible and with original pictures (called A Field Book of Stars), and in addition to it the annotated version with the coloured pictures, added legends and possibly other improvements allowed by WS:Annotations (which would go into e.g. A Field Book of Stars (Annotated)). What do you think of that?

There is only one problem: it is probably not possible to make two versions of the same book based on the same scan (each scan can have only 1 index), and so it would probably not be possible to make the annotated version scan-backed and the text would have to be just copypasted. Or does anybody have any other advice? --Jan Kameníček (talk) 14:39, 10 February 2021 (UTC)

As for the binocular/opera glass: {{SIC}} should be used for typos, "The purpose of the template is not for indicating a different or obsolete spelling, nor for attaching definitions, synonyms, commentary…". For this purpose {{tooltip}} may be used and it should also be done only in the annotated version.

As for "colorized for improved readability for color-blind": I personally doubt that turning the sharp black ecliptics into light yellow (on white background) helps people with sight problems. The same goes for turning sharp black lines into light grey ones. As for "beauty": that is really subjective; the original black and white pictures seem much more beautiful to me, and imo they also suit the old book better than the loud flashy colours. But I am not writing this to discourage you from the work; this book is very useful and it is great that you decided to work on it! Just the improvements should be done only in the annotated version. --Jan Kameníček (talk) 15:01, 10 February 2021 (UTC)

BTW Re: two versions of the same book based on the same scan this is tracked in phab:T259963. We did previously have a {{modern}} template but it created an enormous mess in the wikitext and was killed. Inductiveload—talk/contribs 15:09, 10 February 2021 (UTC)

Can we call it "traditional" rather than "correct" solution? I used the word "innovation". I think it is possible to switch between images via a style, making it possible to present one book both ways. Although, I admit it would take a "style jockey" much better than me to do this.

"Correct" was done, by everyone, right up to the up-cycling of the SIC template by me. I really would have needed guidelines at the commons to make things "correct" for here.

Another thing that should be in the guidelines there is something I picked up from Gutenberg: that JPEG is better for ereaders than PNG, especially if you are like me and want to read/use these books on ereaders. Commons guidelines are that all non-photographic images be rendered as PNG, but the losslessness of the format really has an effect on the size of the epub or mobi (probably PDF also). It is another occasion at Commons where my "knowing No" was silenced by existing guidelines.--RaboKarbakian (talk) 15:05, 10 February 2021 (UTC)

"Beauty" is subjective, except that it was the reason I was stymied. "My weakness" is also subjective. The conflict of guidelines is real, objective, and correct. Wikimedia is what Wikimedia is.--RaboKarbakian (talk) 15:14, 10 February 2021 (UTC)

@RaboKarbakian: I hope I did not offend you by anything, it was not my intention. As a non-native speaker I sometimes fight with expressing what I want to say and forget about the form in which I present it. By "correct" I meant in accordance with Wikisource policy.

As for "switching between images, making possible to present one book both ways": I am not technically very skilled so I do not know how to do it and I am afraid that only a tiny fraction of our readers could do it, so it would not help us at all.

As for Commons guidelines, I do not know of any that would really forbid doing what we need. It is possible that they prefer PNG, but they have never deleted any non-photographic JPG file I uploaded there (and there were really many), so you do not have to be afraid of that. Some of their guidelines may prefer coloured SVGs, but if a contributor makes it black and white, nobody will delete it because of that either.

Nevertheless, in Wikisource we have to comply with Wikisource guidelines above all, and these tell us to have one version faithful to the original; optionally we can have an annotated version too. If Commons did not want to accept some image that we need here, we can upload the image directly to Wikisource, but this will surely not be necessary. --Jan Kameníček (talk) 16:36, 10 February 2021 (UTC)

When it comes to PNG vs JPG, the issue is really that for images with very adjacent "flat" coloured areas and sharp edges, JPG compression is a poor fit, because the compression artifacts that are stronger near sudden colour changes are visible over the adjacent flat areas. For example this is a zoomed-in version of a JPG'd bitonal image next to a lossless image:

Engravings and illustrations like these star maps show the same kind of properties as "non-photographic" graphics. For example, here's the difference between a JPG and PNG with compression damage in red:

This is why they are generally recommended to be saved as PNG. Furthermore, the JPG compression is irreversible, so every time it's edited and re-saved, further damage creeps in. Also, a greyscale PNG is often about the same size as a JPG (depends on the image). However, for the purposes of e-books, it might (might) indeed be better if the export tool re-encodes from the Commons PNG to JPG, as there are probably some savings to be had, and on an e-reader screen the difference is unlikely to be noticeable. If a user really wants a full-quality image, they will still need to come and get it anyway, as export images are nearly always much, much smaller than the available Commons image. Inductiveload—talk/contribs 17:47, 10 February 2021 (UTC)

[Image gallery "A Sample": the original image, and the version made in accordance with the guidelines]

New download button in mainspace

The new 'download' button is now live on pages in the main namespace. There were some tweaks to Vector.css that were making it overlap the bottom border of the title, so I've removed them; let me know if there are any issues with that (most changes were for IE, which is no longer supported by MediaWiki). There's a plan to make it possible to control when the download button appears. — Sam Wilson 23:07, 10 February 2021 (UTC)

@Samwilson: wonderful, thank you! Is it supposed to appear in mobile, or is that a separate task? Also, it should appear in the Translation namespace. Inductiveload—talk/contribs 23:12, 10 February 2021 (UTC)

@Inductiveload: Good point about Translation; there are other namespaces that we will support too, but first we have to figure out which ones (probably it's a matter of first stopping ProofreadPage adding Index and Page to $wgContentNamespaces, and then we can use that, and display the download button wherever the sidebar links are used). —Sam Wilson 23:24, 10 February 2021 (UTC)

@Samwilson: why are license templates not included in the exported work (PDF at least)? For non-PD works, we'll now be offering people downloads without the necessary free content license information. BethNaught (talk) 23:17, 10 February 2021 (UTC)

@BethNaught: The {{license}} template stopped being exported in August 2019; that's unrelated to the recent work with WS Export. Worth talking about though! I think we could do something about adding license information to the About page that's appended to every export. —Sam Wilson 23:24, 10 February 2021 (UTC)

@Samwilson: Ok, that's ridiculous IMO but not your fault. Thanks for the diagnosis. FYI I raised phab:T274452 before seeing your reply. BethNaught (talk) 23:27, 10 February 2021 (UTC)

What should I do in the preferences if I don't want to see this button? How to hide it? Ratte (talk) 10:47, 12 February 2021 (UTC)

@Ratte: There's no facility for that just now, but it is likely this can be achieved with user stylesheets or a local gadget. But the download button is ongoing work so it is premature to implement something for this just now. --Xover (talk) 11:25, 12 February 2021 (UTC)

The following user CSS will work for now (unless the ID changes), but implementing a proper toggle depends on whether that's something the extension will build in natively, or whether it's something we end up addressing locally via JS.

#mw-indicator-\~ext-wikisource-download {
  display: none;
}

Inductiveload—talk/contribs 16:21, 12 February 2021 (UTC)

Trump's 2nd impeachment trial


Just wondering if there are any plans at Wikisource to capture the proceedings of this impeachment? Many USA stations have been broadcasting the complete proceedings for the 3rd day. Thanks in advance, Ottawahitech (talk) 21:34, 11 February 2021 (UTC)

@Ottawahitech: Not sure if there are any active efforts, but the best source is probably the Congressional Record PDFs, for example, yesterday's, which can be uploaded to Commons Category:Congressional Record Volume 166 and an Index created as usual. Inductiveload—talk/contribs 22:07, 11 February 2021 (UTC)

Thanks for responding. I asked after I saw at least one editor at Second impeachment trial of Donald Trump putting in a lot of effort to, what seems to me, manually duplicate the effort of documenting these proceedings. Since Wikiquote is short of volunteers it would sure be nice if another WMF wiki could help in this respect. Opinions? Ottawahitech (talk) 22:45, 11 February 2021 (UTC)

We can and do accept the Congressional Record, which is solidly {{PD-USGov}}, and since the recent ones are PDFs with text layers, with additional text transcripts (e.g. here), it's not too hard to add extracts. Ideally, they would be scan-backed with the PDFs, which are the authoritative text source.

I'll be happy to assist in induction, if needed. Inductiveload—talk/contribs 16:09, 12 February 2021 (UTC)

Thanks again @Inductiveload. Does wikisource maintain any audiovisual records? The testimony presented at the impeachment trial, very effectively IMO, is mostly audiovisual testimony. Ottawahitech (talk) 16:29, 12 February 2021 (UTC)

Note that commercial video broadcasts are likely to be protected by copyright and cannot be used, unless they are just rebroadcasting video captured by a PD-USGov source. Video recordings are generally eligible for independent copyright. Audio is… complicated. Best to stick with the Congressional Record is what I'm saying. --Xover (talk) 16:38, 12 February 2021 (UTC)

Thanks for commenting @Xover. It was my understanding that the audiovisual recordings are done by the US government itself and provided to the various media outlets. I am not sure though. Ottawahitech (talk) 16:51, 12 February 2021 (UTC)

As for "Audio is… complicated. Best to stick with the Congressional Record": yes, it might be more complicated, but that should not stop the effort to have it incorporated. I know Commons has an audiovisual category, though I don't know much about it. Ottawahitech (talk) 17:00, 12 February 2021 (UTC)

@Ottawahitech: Video from the Senate may be captured by CSPAN, or the news networks may have set up their own cameras. We'd need to determine which it is to be sure. Regarding audio, I was referring to the copyright situation. Audio recordings are sometimes eligible for independent copyright and sometimes not, so it'd be complicated to figure out the copyright situation. Merely uploading audio or video is relatively straightforward, but that would be an issue for Commons. Our bread and butter here on enWS is text, so the PDF transcripts are the most apposite source. We can transcribe audio and video too, but our tooling isn't optimal for that. --Xover (talk) 17:06, 12 February 2021 (UTC)

Films on the main page


It has been suggested in some other discussion to introduce a new section on the main page containing new films. I do like this idea not only because of the current admirably large supply of film transcriptions by PseudoSkull. It is great that they are so many and so they deserve to be promoted in their own section. Other reasons are:

the New Texts section is getting quite monothematic with only a minority of non-film texts, which do not stay there long before they are pushed away by other new films
our readers have no chance to distinguish films from other texts before they open each of them, which is slightly uncomfortable for those seeking only something to read
at the same time quick passers-by who just glance over our main page without clicking any entry may not notice we have some films here, and while some of them may not be interested in the other content, the films could catch their attention, if they had a chance to notice them

At the same time the section Current collaborations is currently placed in a box which is unnecessarily large and partly empty. So I suggest to move the section Highlights to the left under Featured Text, making the Current Collaboration container smaller, and placing the new section with films under New Texts.

The name of the new section can be discussed, I suggest New Films or New Film Text Transcriptions. --Jan Kameníček (talk) 10:44, 9 February 2021 (UTC)

Support I agree that film presentation here is at certain places a bit confusing, and a lot of that is because there isn't enough specific labelling of information about the films, or that they are even films at all. So I do think that films should have a separate section on the Main Page, as this will provide clarity to people looking for new books versus people looking for newly added films (as films and books are two completely separate kinds of media).

For the record as well, I am not the only one popping film transcriptions out, there are also M-le-mot-dit and TE(æ)A,ea. who have been contributing admirably in this area. While films had not received very much attention from contributors at Wikisource until just recently, there is some level of interest now to the point that I think considering adding a section to the Main Page for them is reasonable.

I also think that there should be a new template like Template:New text/film item, or a modification to the current template to support films better. As I have noted in the past, saying that a film is simply "by D.W. Griffith" because D.W. Griffith was the director is too vague. No other source on film words things this way. I think templates, including our header template, should say "Directed by D.W. Griffith" or "Produced by Biograph Company", or something of that nature, instead.

On another note, I think that a mention of our film content should also be in our Highlights section, if we are also to have an entire section for new films recently added. PseudoSkull (talk) 11:14, 9 February 2021 (UTC)

Strong Support: Perhaps we don't need as many of the films as texts, since they're currently very short and can cycle quickly. Having them under their own heading would mean we wouldn't need a special film indicator on the new items.

With reference to the formatting of the item (director, etc), you can always use the nowiki option, but a Template:New text/film item is probably easier to use.

Film absolutely should be in the highlights, but Portal:Film isn't looking too smart at the moment, I think this is an opportunity for it to get a haircut. Inductiveload—talk/contribs 17:56, 9 February 2021 (UTC)

To assist with the idea of giving this a box under the collaborations, I've trimmed back the overly long list of previous PoTM works to 10. In general, I'm supportive of moving films out of the new texts area. Beeswaxcandle (talk) 21:22, 9 February 2021 (UTC)

Support Some form of featuring these in a way more appropriate to their different format. --EncycloPetey (talk) 06:14, 13 February 2021 (UTC)

Her Benny page 220 blurred.


Page 220 of Her Benny appears blurred. I have been using the copy of the Book from Hathi Trust to add the images, and this copy has a clean page here. Is it possible to overwrite the blurred page with the clean page? Thanks Sp1nd01 (talk) 10:21, 10 February 2021 (UTC)

While I have no objections to replacing the page by somebody skilled with DjVus, it is imo not absolutely necessary. It might be enough to present the link to the other copy in the discussion page. --Jan Kameníček (talk) 10:56, 10 February 2021 (UTC)

We can certainly replace the page if needed. But since this work is already proofread I don't really see the point. Fixing the DjVu makes the most sense (and has the most value) before starting proofreading a work. Fixing it after that is mainly if there are problems that makes proofreading difficult. --Xover (talk) 13:03, 10 February 2021 (UTC)

I understand, thanks for the suggestion. I've added the link to the clear page on the page discussion for reference. Sp1nd01 (talk) 14:21, 10 February 2021 (UTC)

you could overwrite the page image over on commons, by inserting image in a publisher program, but not for faint of heart. Slowking4 亞 Rama's revenge 16:33, 10 February 2021 (UTC)

I'd already given the URL of the source of the text for my transcription, in both my edit summary and an HTML comment. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:07, 13 February 2021 (UTC)

Download links appearing in Page: namespace


Tracked in Phabricator: Task T274027

Very recently a "print/export" box has started appearing when editing a page in the Page: namespace. This is a nuisance as it then pushes the "templatescript" and "page tools" boxes so far down my screen that when using them the page image goes off the top of my screen. This is increasing the time it takes to proofread a page. I can't conceive of any reason why a page in this namespace would be exported, particularly while editing it. Is it possible to have the box restricted to not appear while editing in the Page: namespace? Beeswaxcandle (talk) 22:27, 5 February 2021 (UTC)

@Beeswaxcandle: the download links are now provided by mw:Extension:Wikisource, not local JS. @Samwilson: is there a configuration hook for this?

As a short-term workaround, this would work in personal or global CSS:

.ns-104 #p-wikisource-export-portlet {
    display: none;
}

Inductiveload—talk/contribs 22:56, 5 February 2021 (UTC)

@Inductiveload, @Beeswaxcandle: Yes, this isn't terrific is it? I wanted to know what namespaces to display the links on, and $wgContentNamespaces seemed a reasonable cross-wiki choice. However, ProofreadPage adds the Index and Page namespaces to that. I think possibly sticking with $wgContentNamespaces makes sense, but removing Index and Page — that way we still get Translation, Work, etc. The same list of namespaces could also be used for the download button, which currently is only going to show on mainspace. I've made a ticket. — Sam Wilson 04:57, 6 February 2021 (UTC)
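The idea of keeping $wgContentNamespaces but dropping Index and Page amounts to a simple set difference. A minimal Python sketch of that filter follows; it is only an illustration, since the real logic would live in the PHP extension. Only the ID 104 for Page: is attested in this thread (through the `.ns-104` selector); the other namespace IDs, the sample list, and the function name are assumptions made for the example.

```python
# Illustrative namespace IDs. Only PAGE = 104 is attested in this thread
# (via the ".ns-104" CSS selector); the rest are assumptions for the sketch.
MAIN = 0
AUTHOR = 102
PAGE = 104
INDEX = 106
TRANSLATION = 114

# Hypothetical stand-in for $wgContentNamespaces after ProofreadPage has
# added its working namespaces.
content_namespaces = [MAIN, AUTHOR, PAGE, INDEX, TRANSLATION]

# Working namespaces that should never offer export links.
PROOFREAD_WORKING_NAMESPACES = {PAGE, INDEX}


def export_namespaces(content_ns):
    """Return the content namespaces minus the ProofreadPage working ones."""
    return [ns for ns in content_ns if ns not in PROOFREAD_WORKING_NAMESPACES]


print(export_namespaces(content_namespaces))
```

Note that a namespace such as Author: still passes this filter, so a plain set difference may not be enough on its own.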

@Samwilson: Removing Index: and Page: nss from Content namespaces would be my preference. I cannot think of why we would want any working namespaces in Content search. It would be preferable that any wiki that wants them added files a Phabricator task to make the case, rather than having them as a default. — billinghurst sDrewth 05:19, 6 February 2021 (UTC)

$wgContentNamespaces doesn't affect on-wiki search, just indexing hints for external search engines (i.e. Google). But I agree, conditionally on the docs being up to date, none of the things conditioned on $wgContentNamespaces seem relevant or desirable for Index: and Page:. @Tpt: Is there a particular reason Proofread Page needs these as $wgContentNamespaces? --Xover (talk) 08:42, 6 February 2021 (UTC)

have you tried timeless skin? (as it pushes sidebar menus to the top as a dropdown) Slowking4 亞 Rama's revenge 15:23, 6 February 2021 (UTC)

Is there a particular reason Proofread Page needs these as $wgContentNamespaces? That's a good question. I don't think there is but I might be wrong. $wgContentNamespaces is mostly used as a namespace whitelist on special pages so I don't think it would break ProofreadPage itself if we remove the Index: and Page: namespaces from this list. Anyway, we probably don't want the export link to be displayed on some namespaces that are definitely content namespaces, like Author:. Tpt (talk) 17:35, 11 February 2021 (UTC)

Thanks! @Samwilson: Based on this it doesn't look like $wgContentNamespaces would work for this purpose, and I am unsure a namespace-based approach will work at all unless you make it entirely configurable per-project. But every page that should get the links would by definition contain an invocation of <pages … />: can you trigger off that? It could certainly be done client-side, but I don't know what visibility extensions have at what stages of the pipeline. --Xover (talk) 18:05, 11 February 2021 (UTC)

@Tpt, @Xover: That's a good point about Author namespace etc. I don't know about just doing it when there's a <pages /> element though, because that'd exclude works that are manually done (like lots of Translation pages are). Maybe it catches the bulk of works though, and then we can have some auxiliary method such as a magic word to enable the button on other pages? It does feel like it'd be nice to not have to have a per-wiki config, especially if it doesn't catch everything anyway. — Sam Wilson 00:44, 12 February 2021 (UTC)

Yes, but … There is (inevitably?) a problem with the pages command approach. Not everything with pages commands is complete. As I write this, I have at least 13 works in progress that have pages commands, but are incomplete. There's also the problem with the 42% of our works in the mainspace that are not scan-backed and therefore cannot have a pages command at present. Beeswaxcandle (talk) 04:39, 12 February 2021 (UTC)

Hmm. Ok, thinking out loud trying to find an angle where it makes sense…

What if we start from an assumption that no sidebar links are shown and no download button is shown.

Anything with a <pages /> tag is at least nominally scan-backed and should get both sidebar links and button by default. If they are {{incomplete}}, or have other magic word-containing maint. templates, they have in effect been manually tagged as inappropriate for export (for the target audience/purpose). Maybe the toolbox links should be there so that WS contributors who know what they're doing/getting can easily use them, but the big honkin' download button is suppressed to avoid "false advertising" to visitors.

Texts that do not have a <pages /> tag but the local community wants to make available for epub export need to be manually tagged as such; either in bulk by placing them in a special category, or individually by including a magic word. Wikisources (like enWS) that have a lot of non-scan-backed works can choose to bot-add the category to all of them (I would argue against doing that, but the technical facility gives the community the choice). For these works too I would like to differentiate so that one category turns on the download links in the sidebar and a different one turns on the download button, and with equivalent magic words for individual control.

I think that gives all links by default for scan-backed works, except where tagged as deficient. Links by default on pages we choose to tag as eligible even if not scan-backed. For either category we have individual overrides using magic words. And we can differentiate between just the sidebar links and the prominent download button. And it should work for all the Wikisources, not just enWS, without a lot of project-specific config for eligible namespaces.

And since wikipages without <pages /> don't get the links without manual tagging, just done as bulk tagging, the logic in the software is much simpler (no divining based on $wgContentNamespaces, and no "Do we show on sub-pages?").

@Samwilson, @Beeswaxcandle, @Tpt, @Inductiveload, @Billinghurst: Makes sense? Anything I didn't think of? Thoughts? --Xover (talk) 08:09, 12 February 2021 (UTC)
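The rules proposed above can be sketched as a small decision function. This is a rough Python illustration under stated assumptions, not how the extension works: every field and function name is hypothetical, and the choice that an explicit opt-in overrides a "deficient" tag is one possible reading of the proposal.

```python
from dataclasses import dataclass


@dataclass
class PageFlags:
    """Hypothetical per-page signals; none of these names exist in the extension."""
    has_pages_tag: bool = False     # page contains a <pages /> invocation
    tagged_deficient: bool = False  # e.g. carries {{incomplete}}
    links_opt_in: bool = False      # category or magic word enabling sidebar links
    button_opt_in: bool = False     # category or magic word enabling the button
    links_opt_out: bool = False     # magic word suppressing sidebar links
    button_opt_out: bool = False    # magic word suppressing the button


def export_ui(flags):
    """Return (show_sidebar_links, show_download_button) for one page."""
    # Default: nothing is shown.
    links = False
    button = False

    # Scan-backed pages get both by default...
    if flags.has_pages_tag:
        links = True
        # ...but the prominent button is held back on works tagged as deficient.
        button = not flags.tagged_deficient

    # Manual opt-ins cover works without a <pages /> tag.
    links = links or flags.links_opt_in
    button = button or flags.button_opt_in

    # Individual opt-outs win over everything else.
    if flags.links_opt_out:
        links = False
    if flags.button_opt_out:
        button = False

    return links, button


print(export_ui(PageFlags(has_pages_tag=True)))
print(export_ui(PageFlags(has_pages_tag=True, tagged_deficient=True)))
print(export_ui(PageFlags(links_opt_in=True)))
```

Keeping the two outputs separate mirrors the wish to control the sidebar links and the prominent button independently.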

@Xover: That sounds pretty good. So the features needed are: a) show the button whenever the pages tag is used; and b) two magic words, for opting in and out. Also: does it make sense to show the download button and the sidebar links with different rules? Additionally, phab:T272254 is about customizing which links are shown, and if that means we have to have an on-wiki config, then it could also be used for other purposes. — Sam Wilson 01:00, 15 February 2021 (UTC)

Promotional aside, the "Quick Access" tool allows keyboard-driven access to all the sidebar tools, whether or not you can see them in the current viewport. Inductiveload—talk/contribs 22:38, 6 February 2021 (UTC)

Reinstate license templates in exported texts


In August 2019, Kaldari stopped license templates from being exported on the grounds that they're not part of the original work.

I think this should be reverted for the reason I noted above: if a non-PD work is exported--that is to say it is copyrighted but under a free license--then typically we will be required to indicate that license to reusers. This may be different from our site licenses of CC BY-SA 3.0 and GFDL. Therefore PDF downloads for e.g. a CC BY 4.0 work, not including the template, may be a copyright violation.

Not sure how much consensus is required for this and Kaldari should have an opportunity to weigh in, so posting here. Thanks, BethNaught (talk) 23:41, 10 February 2021 (UTC)

The licenses certainly should be exported, not least because it's a hard requirement of some licenses, but it's also good digital citizenship.

However, I think it probably makes sense to have the licenses use a class like "ws-license" to allow the exporter to identify them unambiguously. This firstly allows it to be written into ebook metadata in a machine readable way, but also allows it to be placed in a dedicated section of the TOC for easy access (perhaps next to the "about this book" section). Inductiveload—talk/contribs 23:53, 10 February 2021 (UTC)

That's an interesting point. I note that {{license}} already has licenseContainer and licenseBanner classes, would those do? But then, how does that relate to {{translation license}} or {{license container begin}}? (Both have licenseContainer, it turns out.) Just throwing it out there, I really should go to bed now... BethNaught (talk) 23:59, 10 February 2021 (UTC)

P.S. [17] would also need undoing under this proposal. BethNaught (talk) 23:59, 10 February 2021 (UTC)

@BethNaught, @Inductiveload: If a text is freely licensed, wouldn't it normally mention that as part of the work? If there are any examples of the situation you are describing, that might help inform our discussion. I'm not entirely opposed to the idea of including the license template, but if we do, it would be nice if it were included in a standardized way that didn't interfere with the formatting of the original work. Personally, I would prefer that it be included on its own dedicated page and not inside a damn box. Perhaps the template could have different CSS for the exported version (via the ws-noexport class). Let me know your thoughts on this and I'd be happy to work on it further if we can come to a consensus on the best way to implement it. Kaldari (talk) 00:38, 11 February 2021 (UTC)

@Kaldari: not really, plenty of works are released under a license that's not in the text itself. For a start, most of the Translation namespace is CC-BY-SA by default under the general WS contributor license, and anything with OTRS clearance is also licensed "out-of-band".

I think the first step will be to get the exporter aware that "things with x class go in a separate license section somewhere at the end". Then it's a matter of the CSS to style (or not) as needed. Inductiveload—talk/contribs 00:46, 11 February 2021 (UTC)
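As a rough illustration of "things with x class go in a separate license section", here is a Python sketch that moves license blocks to the end of a well-formed XHTML fragment. It assumes the `licenseContainer` class mentioned in this thread is the marker; the `ws-export-license` section class, the function names, and the sample markup are hypothetical, and the actual exporter may handle this quite differently.

```python
import xml.etree.ElementTree as ET

LICENSE_CLASS = "licenseContainer"  # class reported on {{license}} in this thread


def _collect(parent, found):
    """Gather (parent, element) pairs for every top-most license block."""
    for child in list(parent):
        if LICENSE_CLASS in (child.get("class") or "").split():
            found.append((parent, child))  # do not descend: keep nested blocks together
        else:
            _collect(child, found)


def _detach(parent, child):
    """Remove child from parent, keeping any text that followed it."""
    siblings = list(parent)
    idx = siblings.index(child)
    if child.tail:
        if idx == 0:
            parent.text = (parent.text or "") + child.tail
        else:
            prev = siblings[idx - 1]
            prev.tail = (prev.tail or "") + child.tail
    child.tail = None
    parent.remove(child)


def split_license_blocks(xhtml):
    """Move license blocks into a trailing section of the document root."""
    root = ET.fromstring(xhtml)
    found = []
    _collect(root, found)
    if found:
        # Hypothetical wrapper class for the dedicated license section.
        section = ET.Element("section", {"class": "ws-export-license"})
        for parent, child in found:
            _detach(parent, child)
            section.append(child)
        root.append(section)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<div><div class="licenseContainer licenseBanner"><p>CC BY 4.0</p></div>'
    '<p>Chapter text.</p></div>'
)
print(split_license_blocks(sample))
```

Once the blocks sit in their own section, styling them (or not) becomes a CSS matter, and the same hook could feed ebook metadata.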

20 cent opinion The license and the wikisource components sit outside the work, though should be present and exported. They should be presented with a separation point from the work itself. Page pagination with a section break of some sort sounds fine to me. — billinghurst sDrewth 03:54, 11 February 2021 (UTC)

@BethNaught, @Inductiveload, @Billinghurst: I reverted my 2019 change, but added a page break before the license so that it doesn't get dumped into the content of the original work. Once the caches clear, the change should take effect in exports. Please let me know if anything looks amiss. Kaldari (talk) 03:48, 13 February 2021 (UTC)

Thank you, I've checked a couple of works and it looks good so far. BethNaught (talk) 09:02, 14 February 2021 (UTC)

Tech News: 2021-07


Problems

There were problems with recent versions of MediaWiki. Because the updates caused problems the developers rolled back to an earlier version. Some updates and new functions will come later than planned. [18][19]
Some services will not work for a short period of time from 07:00 UTC on 17 February. There might be problems with new short links, new translations, new notifications, adding new items to your reading lists or recording email bounces. This is because of database maintenance. [20]

Changes later this week

The new version of MediaWiki will be on test wikis and MediaWiki.org from 16 February. It will be on non-Wikipedia wikis and some Wikipedias from 17 February. It will be on all wikis from 18 February (calendar).

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

17:56, 15 February 2021 (UTC)

Found another version of The Lost World (1925)


@PseudoSkull: I have found another version of The Lost World (1925) without the inserted credits and music credit. The only challenge I now face is getting the older file removed from Wikimedia Commons so I can upload the new one. What should I do? (SurprisedMewtwoFace (talk) 22:29, 15 February 2021 (UTC))

I am sorry for the late response. For some reason, I didn't get the ping and I'm not sure why. Anyway, thanks for finding this. You could name it whatever you want, as it could always be renamed later. The important thing is not necessarily the naming, but that we have the film archived in the WMF space, so it can be presented on Wikisource and other WMF sisters. @SurprisedMewtwoFace: PseudoSkull (talk) 17:01, 17 February 2021 (UTC)

You can also nominate the old file for deletion on Commons, press CTRL+F to find "Nominate for deletion" or similar in the Tools sidebar if I remember correctly. PseudoSkull (talk) 17:02, 17 February 2021 (UTC)

Don't fuss with the existing file; give the new file a unique and explanatory name, and all will be fine. Different versions of files are acceptable at Commons. — billinghurst sDrewth 03:54, 18 February 2021 (UTC)

Uploading Open Access Scientific papers


This has probably been asked many times. I am the lead author of Taxonomy based on science is necessary for global conservation, an open access paper from the journal PLOS Biology. As I am an author on this paper and it is Open Access, can it be uploaded to Wikisource? A second paper I am interested in uploading is Principles for creating a single authoritative list of the world’s species, of which I am also an author and which is also Open Access. This second paper mentions Wikispecies, and a couple of editors have asked me to upload it to Wikisource. However, I wish to check on policies for this first. Cheers Faendalimas (talk) 17:37, 18 February 2021 (UTC)

@Faendalimas: At a quick glance, these appear fine for upload. PLOS Biology meets the previously published and notability requirements of our scope, and since "Open Access" in this case means specifically {{CC0}} it also meets our copyright policy. Note that "Open Access" is an orthogonal issue to copyright, so it is the specific license that matters; and not all "Open Access" articles will be acceptable (the weaker forms use non-commercial licenses that are not acceptable here).

I'll leave our standard welcome message on your talk page shortly, with some useful links to get started. The process in short is: upload the PDF to Commons; create an Index: page for it here; transcribe the text page by page in the Page: namespace; and then transclude it for presentation on a wikipage derived from the article's title. Feel free to ask for assistance: our tooling is specialised and not the most intuitive for people trying to use it for the first time. --Xover (talk) 18:14, 18 February 2021 (UTC)
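To make the Index:/Page: step more concrete, here is a small Python sketch of how the working page titles relate to the uploaded file name. It follows the pattern visible in titles such as Page:State directed emigration.djvu/14 elsewhere on this page; the file name and helper names are hypothetical.

```python
def index_title(filename):
    """Title of the Index: page for an uploaded scan."""
    return f"Index:{filename}"


def page_title(filename, page_number):
    """Title of the Page: page holding the transcription of one scan page."""
    if page_number < 1:
        raise ValueError("scan pages are numbered from 1")
    return f"Page:{filename}/{page_number}"


def page_titles(filename, page_count):
    """All Page: titles for a scan with the given number of pages."""
    return [page_title(filename, n) for n in range(1, page_count + 1)]


# Hypothetical file name, for illustration only.
scan = "Example paper.pdf"
print(index_title(scan))
for title in page_titles(scan, 3):
    print(title)
```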

@Faendalimas: Peer-reviewed and a suitable public domain licence are our requirements, so yes. Conflict of interest in main ns is not particularly an issue for us due to the governing principles. Preferably you could do author pages if possible, so we don't have the works sitting in isolation. Modern science is not a well-populated space for us, so we may also need to do some categorisation work too. Asking you to do wikidata links is not necessary to be said. Just remember that we typically publish editions, and that may impact how you think about the work. — billinghurst sDrewth 21:30, 18 February 2021 (UTC)

@Billinghurst: The second paper I have started on with help from some more seasoned editors here, it already has a wikidata item as do all the authors of the paper. I will look at an author page for each of the authors also. Many of my own works are open access as I am no fan of paywalls, so a number are probably able to be uploaded here, though noting Xover's point on this, it will depend on the individual journal. Cheers Faendalimas (talk) 21:46, 18 February 2021 (UTC)

@Faendalimas: the primary thing is that it's licensed in a way that means our users can reuse it for any purpose. So the journal and/or all authors need to agree on a license that's suitable. Generally CC0, CC-BY or CC-BY-SA are the most common these days. Any commercial restrictions (eg. CC-BY-NC) are not acceptable here as it restricts the ability to reuse the material.

If all the authors agree (and are allowed) to release under a license different to the journal, they can use the commons:OTRS system to file a formal declaration of that, if it is not in the work itself. Inductiveload—talk/contribs 22:04, 18 February 2021 (UTC)
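The licensing rule of thumb above can be expressed as a quick check. This Python sketch is only an illustration: it handles a handful of Creative Commons short codes, treats NoDerivatives (ND) as unacceptable alongside NonCommercial (NC) on the assumption that it restricts reuse in the same way, and is no substitute for reading the actual licence or the copyright policy. All names are hypothetical.

```python
# Licence families named as common and acceptable in the comment above.
ACCEPTABLE_BASES = {"CC0", "CC-BY", "CC-BY-SA"}

# Elements that restrict reuse. NC is named above; ND is an assumption here.
BLOCKING_ELEMENTS = {"NC", "ND"}


def is_acceptable(license_code):
    """Return True if a CC short code such as 'CC BY 4.0' allows free reuse."""
    parts = [p for p in license_code.upper().replace(" ", "-").split("-") if p]
    if BLOCKING_ELEMENTS & set(parts):
        return False
    # Drop version numbers such as "4.0" before comparing.
    base = "-".join(p for p in parts if not p[0].isdigit())
    return base in ACCEPTABLE_BASES


for code in ["CC0", "CC BY 4.0", "CC-BY-SA", "CC-BY-NC", "CC BY-ND 4.0"]:
    print(code, is_acceptable(code))
```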

Problems with opening DJVU files and their Index pages


I am not able to open any index of a DJVU file. I always receive the message "Incorrect file format. The provided file is not a DjVu document". I receive also the same message when I try to open the djvu file directly. Does anybody know what is happening? --Jan Kameníček (talk) 23:48, 20 February 2021 (UTC)

I am having no issues with Index:The Indian Biographical Dictionary.djvu or Page:State directed emigration.djvu/14 — billinghurst sDrewth 00:23, 21 February 2021 (UTC)

I am still experiencing the problem, including the above linked index page of the dictionary (not the "Page:" page) but it is limited only to Firefox, now I can see that I do not have the problem in Chrome, so it is probably related only to something connected with my browser. Thanks. --Jan Kameníček (talk) 09:19, 21 February 2021 (UTC)

Tech News: 2021-08


Recent changes

The visual editor will now use MediaSearch to find images. You can search for images on Commons in the visual editor when you are looking for illustrations. This is to help editors find better images. [21]
The syntax highlighter now works with more languages: Futhark, Graphviz/DOT, CDDL and AMDGPU. [22]

Problems

Editing a timeline might have removed all text from it. This was because of a bug and has been fixed. You might need to edit the timeline again for it to show properly. [23]

Changes later this week

The new version of MediaWiki will be on test wikis and MediaWiki.org from 23 February. It will be on non-Wikipedia wikis and some Wikipedias from 24 February. It will be on all wikis from 25 February (calendar).

Future changes

There is a user group for developers and users interested in working on Wikimedia wikis with the Rust programming language. You can join or tell others who want to make your wiki better in the future.


00:17, 23 February 2021 (UTC)

Articles from The Conversation


In the series I am trying to do there is a short article in The Conversation (this is the original article as published by them) that I want to include. The article itself is free to be redistributed. Their licence info is on the right hand side, and they include permission to use their logo. However, the images are not ours. This was kind of like an interview where we explained how this seminal series came about. The pictures are all stock images: one from a museum, one from the EPA, and the other two I do not know. They put the images in, not us. When the story is downloaded as a pdf it includes the images. I have no interest in using the images here, but they are in the pdf. I can upload the pdf, as I have the rights for that, but I cannot separate out the images and do the same as far as I can tell. I can leave the images out, as they are not actually relevant to anything in the story, or I can replace them with images I do have rights to. What would people's advice be? Thanks Scott Thomson (Faendalimas) ^talk 10:16, 22 February 2021 (UTC)

this blog has a Creative Commons Attribution/No derivatives license (https://theconversation.com/us/republishing-guidelines), which is a problem, as commons will delete it. also need to research the license on each photo. we could talk about a "ND fair use" but need a consensus. unclear what the value added is, given search engines can find the text layer of the blog. (we tend to work on public domain texts) the PLOS biology article has a different license, and lacks the images. Slowking4 亞 Rama's revenge 00:32, 23 February 2021 (UTC)

We have no issue with author: ns pages having links to such works where we cannot host them. — billinghurst sDrewth 04:45, 23 February 2021 (UTC)

Thanks for the comments. Honestly I do not want to use the images; they have been added as fill, amongst some ads also if you look at the online version, because it is as you say a form of a hybrid blog/online magazine. The images were not ours; the one from the QM is likely CC-0 at a guess, as it's a stock image that many Museums provide for press usage. The others I would have to check. This will be used as part of the seminal series I am creating, which will also have a page on Wikipedia about the IUBS Working Group for the Governance of Global Lists, a newly formed international commission that I am Secretary for. This "blog" gives some good easy to digest background and discussion on how this all came about and hence is useful for some context to the general reader. I hope it will all make more sense when all the papers are up and the Wikipedia Page created and I categorize it all to tie it all together. I have created this; it's at the stage of needing to be proofread, which I will do today. I have deleted refs to the images at this point. Cheers Scott Thomson (Faendalimas) ^talk 17:11, 23 February 2021 (UTC)

Tarzan and the Ant Men (1924) done proofreading


@Jan.Kamenicek: I have finished proofreading https://en.wikisource.org/wiki/Index:Tarzan_and_the_Ant_Men.pdf There are only two problematic pages here: one that is an image, and one where I cannot create the Minunian glyphs represented on the page without an image. (Gutenberg's pdf has the glyphs as an image, so they're readily available online.) This work should be done by someone more skilled with image manipulation on Wikisource than I. All the rest can be validated now. My work on this text is now complete; thanks for all the help you provide. (SurprisedMewtwoFace (talk) 20:13, 22 February 2021 (UTC))

@SurprisedMewtwoFace: Well done! I have added the image, but I am afraid I am not able to help with the glyph :-( --Jan Kameníček (talk) 22:39, 22 February 2021 (UTC)

@Jan.Kamenicek: Update: I have now rendered and implemented the Minunian glyphs on the page using .jpgs of them from the Gutenberg file. I have also rendered the cubed numbers correctly (the original file on Wikisource messed them up) using the Gutenberg file as a reference for how they should look. I think https://en.wikisource.org/wiki/Page:Tarzan_and_the_Ant_Men.pdf/205 looks quite good now and have marked it as Proofread as well. Since the entire text is now Proofread, my work on it ends and others will have to validate it. Thanks for the work on the other image for me! (SurprisedMewtwoFace (talk) 23:36, 22 February 2021 (UTC))

@SurprisedMewtwoFace: Perfect! I have just improved some formatting in the front matter of the book. Usually it is not necessary to wait for validation and the work can be transcluded into the main namespace immediately after proofreading. However, if you wish, I will go through the rest of the book first. If you want, I can do the transclusion too. --Jan Kameníček (talk) 10:07, 23 February 2021 (UTC)

@Jan.Kamenicek: It looks excellent! Could you please do the transclusion for me? I don't think you need to validate the entire text beforehand, I've gone through it twice and it looks quite good to me now, certainly good enough for people to read for their enjoyment. Your help will be much appreciated, and I'll enjoy seeing it in the main space! (SurprisedMewtwoFace (talk) 20:38, 23 February 2021 (UTC))

Long s and Ligatures


Although I'm fairly new here, it seems that there is no consensus on the inclusion of long S and ligatures during proofreading. Currently, there appear to be no requirements for either. The help page states that "Some works, especially older ones, use ligatures, diacritics, and alternate letterforms. Whether or not to transcribe this formatting is left up to the transcriber." It appears to me that the most accurate proofreading would require their preservation. Could we have a vote on whether we should make transcribing "ligatures, diacritics, and alternate letterforms" a requirement? If the vote passes, then could we make templates for all the ligatures and then add them to the editing toolbar? Languageseeker (talk) 20:52, 22 February 2021 (UTC)

The only ligatures that are accepted here are æ Æ œ Œ. All the others (ct, ff, fl, fi, &c.) are print artefacts that are kerned automatically in modern fonts. When the ligature glyph is used for these, text-string searches are broken, and thus they are not accepted. Diacritics such as accents, diereses, and macrons should be reproduced (this is easily done through the extensive lists of characters in the drop-down menu). Which leaves only alternate letter-forms from the list you quote. We have gone through several long discussions on the long-s character, which is the main alternate letter-form that we see (with the two forms of the lower-case kappa being the second). The consensus from these discussions is that the decision as to which form to use is up to the initial transcriber/proofreader of a work, but that it must be consistently applied throughout that work. Other editors assisting and/or validating the work should not change the form decided on to the other. Beeswaxcandle (talk) 01:06, 23 February 2021 (UTC)
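The search problem described above can be illustrated with a short Python sketch. It is only an illustration, not how Wikisource search actually works: it assumes a transcriber typed the Unicode presentation-form ligature "ﬁ" (U+FB01) and the long s "ſ" (U+017F), and the helper name `fold_letterforms` is hypothetical.

```python
import unicodedata


def fold_letterforms(text: str) -> str:
    """Fold compatibility characters (fi ligature, long s) to plain letters.

    Hypothetical helper for illustration; NFKC is a standard Unicode
    normalization form, not something specific to Wikisource.
    """
    return unicodedata.normalize("NFKC", text)


# A line transcribed with the ligature glyph U+FB01 and long s U+017F.
transcribed = "The \ufb01rst Congre\u017fs"

# A plain-text search for "first" misses the ligature form...
print("first" in transcribed)                    # False
# ...but finds it once the letterforms are folded.
print("first" in fold_letterforms(transcribed))  # True
```

Note that NFKC leaves æ and œ untouched, since Unicode treats them as letters in their own right rather than as compatibility forms.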

@Languageseeker: We do not mandate that ligatures like æ / Æ are used, and if someone through the work did ae / AE, that is acceptable and the wording is meant to cover. We want consistency through the work.

To use long s in the Page: namespace the template is {{long s}}/{{ls}}, which displays the long-s character in the Page: ns and a standard letter when transcluded to the main namespace. Long s is allowed in the main namespace on a case-by-case basis where the long s is significant to the work. For example, a modern work made to look like an older work, or evidence that long s is purposeful to the actual work; again, consistency through the work. Community reached a consensus on this PoV. Otherwise it is exactly as Beeswaxcandle says. We are not looking to be a replica site, we are looking to reproduce works of those times. — billinghurst sDrewth 05:31, 23 February 2021 (UTC)

Request to upload The Greek Anthology


Tracked in Phabricator: Task T275100

Following the discussion at User talk:FabioDiNinno#Anthologia Palatina I would like to ask for the DJVU version of The Greek Anthology to be uploaded to Commons. The file has already been uploaded there by somebody as PDF, but its quality is very bad. Unfortunately, I cannot use the IA uploader because it does not allow a file to be uploaded as DJVU once it has already been uploaded as PDF :-) I tried to convert it to djvu using some online converters, which sometimes works, but this time the result was very poor. --Jan Kameníček (talk) 22:24, 17 February 2021 (UTC)

@Jan.Kamenicek: Done: File:The Greek Anthology, Vol. 3.djvu — billinghurst sDrewth 01:14, 18 February 2021 (UTC)

@Billinghurst: Thank you very much for uploading it and my big apologies for not thanking for your help immediately… Actually, I thought I did so, but apparently I did not :-( --Jan Kameníček (talk) 16:36, 24 February 2021 (UTC)

@Jan.Kamenicek: all good, think you clicked the thank button which is enough from my perspective. (It is why it is there) — billinghurst sDrewth 22:06, 24 February 2021 (UTC)

@Samwilson: I think that we should remove that restriction in IA-upload, or at least only have it as a warning? With Fæ's uploading of the PDFs, it has become a horrible nuisance in so many senses to be forced to use the pdfs with inferior images, and inferior scans. — billinghurst sDrewth 01:25, 18 February 2021 (UTC)

I have added a phabricator ticket with three options in order of preference (depending on complexity of task) if IA's uri fragment exists: 0) decline if same file type; 1) challenge, if file type differs, accept if forced, or 2) accept if file type differs, 3) just accept. — billinghurst sDrewth 01:25, 18 February 2021 (UTC)
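As a rough sketch of the first two options in that list (decline when the same file type already exists, challenge when the type differs, accept if forced), the decision could be modelled as below. This is not IA-upload's actual code; the names `Decision` and `decide_upload` and the `forced` flag are hypothetical and used only for illustration.

```python
from enum import Enum


class Decision(Enum):
    DECLINE = "decline"
    CHALLENGE = "challenge"
    ACCEPT = "accept"


def decide_upload(existing_types, requested_type, forced=False):
    """Decide what to do when an IA item may already be on Commons.

    existing_types: file types already uploaded for this IA item (assumed input).
    requested_type: the type the user wants to upload now, e.g. "djvu".
    forced: True once the user has confirmed the override (hypothetical flag).
    """
    requested = requested_type.lower()
    existing = {t.lower() for t in existing_types}
    if not existing:
        return Decision.ACCEPT      # nothing uploaded yet
    if requested in existing:
        return Decision.DECLINE     # same file type already exists
    # A different file type exists: ask first, accept once confirmed.
    return Decision.ACCEPT if forced else Decision.CHALLENGE


print(decide_upload({"pdf"}, "djvu"))               # Decision.CHALLENGE
print(decide_upload({"pdf"}, "djvu", forced=True))  # Decision.ACCEPT
print(decide_upload({"pdf"}, "PDF"))                # Decision.DECLINE
```

The simpler options in the list (accept whenever the file type differs, or just accept) would amount to dropping the challenge branch.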

@Billinghurst: yes, actually I started work on this yesterday. There were three tickets for it; I merged into phab:T269518. I was thinking of just adding a warning that the IA item is already referenced (and link to the file it's referenced from). Do you think we'd get too many duplicates uploaded if it was that simple? Would it need an extra checkbox to allow overriding or something? — Sam Wilson 01:35, 18 February 2021 (UTC)

I would give them sort of positive action/choices buttons … "DO IT", "MEH, DOESN'T MATTER". I don't think that we will get too many uploads that shouldn't be. No one wants to repeat an extant transcribed and transcluded work. — billinghurst sDrewth 04:01, 18 February 2021 (UTC)

Actually, I consider it very positive when there are duplicates of the same file in PDF and DJVU in Commons and it should be encouraged. We do not upload the files only for Wikisource, but for other users too, some of whom may prefer PDFs, other DJVUs. I have already uploaded there several duplicate files in the two formats. --Jan Kameníček (talk) 07:36, 18 February 2021 (UTC)

The Lost World (1925) New version available


@PseudoSkull: The new version of The Lost World (1925) is ready. You can find it at https://commons.wikimedia.org/wiki/File:TheLostWorldSilent.webm This should clear up any of the problems with the music credit and anything else available on the other file. Hope you find it useful! (SurprisedMewtwoFace (talk) 04:58, 24 February 2021 (UTC))

Two versions of The Adventures of Sherlock Holmes


We currently have two different versions of The Adventures of Sherlock Holmes: one that, according to the talk pages, seems to be proofread from facsimiles of the original publications from The Strand Magazine (though we have no links to these scans, as this seems to be an early text from before the current sourcing and proofreading mechanism), and a recently uploaded one sourced from a PDF scan of a book that collected the stories, published circa 1892.

However, these aren't exact duplicates of each other. The magazine version has a number of illustrations not present in the book publication. I haven't compared the text content, so I'm not sure if there are other differences (such as typos) between these editions, but based on the illustration differences there would be value in keeping both versions and just distinguishing between the editions.

I'm fairly new and not sure if there's a policy for situations such as this. @Slgrandson has begun replacing the content of the magazine version pages with the PDF one, but I'm not sure if this is the best way of doing things. What if we instead moved the magazine edition pages, appending something like "(The Strand Magazine)" to the page names, reserved "The Adventures of Sherlock Holmes" for the book version, and made a distinction between the magazine and book editions on pages that link to The Adventures of Sherlock Holmes or the individual stories? Or is there already a naming policy/precedent for cases like this? Veikk0.ma (talk) 18:51, 26 February 2021 (UTC)

@Veikk0.ma, @Slgrandson: Yes, indeed. We do not generally replace one edition with another, but preserve both. In this case we would have a versions page at The Adventures of Sherlock Holmes that pointed to disambiguated names for the two editions, typically disambiguated by year (e.g. The Adventures of Sherlock Holmes (1892)). We only replace one edition when it is problematic for other reasons, for example that it is an amalgamated text that cannot be matched to any specific edition (or that it is just raw OCR, or other such problems). --Xover (talk) 18:58, 26 February 2021 (UTC)

@Veikk0.ma, @Xover: WP:AGF applies here; otherwise, after weighing my options some twelve hours ago, here's how things might play out from here. To compensate:

The base page of Adventures is likely here to stay...
while, for the eight segments affected so far, things might get a bit complicated based on the above evidence that slipped right past my radar at the time.
- First, the original non-scan, Strand-sourced text will be restored, and the titles will follow the schema of "Title (The Strand Magazine)" by the next page move. (Although, I've just realised, WS does already have a base page for The Strand Magazine's coverage—which might as well prompt further renaming of those versions once we locate [and port over] the original copies.)
- Next, the original WS links for the segments (e.g. "A Scandal in Bohemia") will be recycled and converted into "Versions of..." rundowns, each both listing the Strand version and its Adventures counterpart.
- Thirdly, the Adventures subpage links will likewise be renewed with the transclusion code from the latest edits, fetched from the history.
- Finally, the remaining four segments will be retitled with the "(The Strand Magazine)" tag, while the old name will point to "Versions of..." and fresh new Adventures subpages will be created.

I've been recently aware that conflated Gutenberg texts are one thing altogether—but a unique case like this one (no pun intended), right now, is another. (Enough to hold up my workflow from the time I got that message.) Of course, this sort of stuff can happen once you're an overeager, prodigal proofreader such as myself. —Slgrandson (talk) 20:23, 27 February 2021 (UTC)

Missing pages, unsure how to proceed


The following discussion is closed:

Replacement scan uploaded.

So after a brief discussion at Author talk:David Hume, I uploaded two volumes of the same work to the Commons: File:Hume - Essays and Treatises on Several Subjects - 1809 - Vol. 1.djvu and File:Hume - Essays and Treatises on Several Subjects - 1809 - Vol. 2.djvu. After uploading both of these files, I proceeded to make Index pages for each work, only to find (very disappointingly) that Vol. 2 was missing about 4 pages. I documented the issues on the Vol. 2 Index Talk page.

I'm unsure how best to proceed. Part of me feels that I could/should just delete both files and try again with a different edition of the same work that is in fact complete, as no edition of Essays and Treatises on Several Subjects appears on wikisource yet at all. I selected the 1809 edition because it was the first complete edition with good-looking scans and no use of the "long s", but the scan quality on volume 2 is actually kinda bad the further you get into it, and of course the missing pages. There is not another scan to my knowledge of the same edition+volume (1809, vol. 2) on Archive.org or similar, though I wouldn't be surprised if the missing pages are exactly the same as the same numbered pages in a later edition.

Advice? Thoughts? -- Mathmitch7 (talk) 15:31, 8 February 2021 (UTC)

@Mathmitch7: A wild Google scan appears: it's super messed up! There's a nicer 1809 v.2 scan at Hathi (also scanned by Google but not compressed half to death), but it'll take a short while to download and convert to DjVu. Inductiveload—talk/contribs 16:06, 8 February 2021 (UTC)

@Mathmitch7: updated version uploaded: Index:Hume - Essays and Treatises on Several Subjects - 1809 - Vol. 2.djvu. Enjoy :-) Inductiveload—talk/contribs 20:53, 8 February 2021 (UTC)

@Inductiveload: omg THANK YOU, I truly love the wikisource community -- Mathmitch7 (talk) 19:50, 10 February 2021 (UTC)

This section was archived on a request by: Xover (talk) 18:34, 3 March 2021 (UTC)

Collective work inclusion criteria


[This is a proposal stemming from the #Policy on substantially empty works section below.]

Since there has been no more input for a month, here we go. This is only a proposal, so any part of it can be changed, or the whole idea rejected. Inductiveload—talk/contribs 10:58, 25 August 2020 (UTC)

Inclusion criteria for articles

Some works are composed of multiple parts that can stand alone as independent pages. These works are generally encyclopedias, biographical dictionaries, anthologies and periodicals such as magazines and newspapers and so on. Such "collective works" have slightly different criteria for inclusion in the main namespace. The aim of these criteria is:

To allow individually-useful articles, or sets of articles, to be transcribed to the main namespace without requiring active transcription of hundreds of pages of unrelated articles

To nevertheless make it easy for other users to "drop in" and add more articles to the work.

To be eligible for inclusion, a component of a collective work (e.g. a single magazine article), should satisfy the following criteria:

The component should be "non-trivial" in scope and importance. For example, only a title page or single-paragraph "notice to subscribers" in a magazine is unlikely to be considered useful on its own. However, it would still be part of a full transcription of the rest of the parent unit (e.g. a magazine issue).

The work should be scan-backed.

Main namespace pages should be created for the work at the top level and any intervening levels (e.g. Volume and Issue/Number ranks should exist). Sometimes, the Issue/Number rank redirects to a section on the Volume page.

Front matter of each intervening level the "parent unit" (e.g. a magazine volume and issue) should be transcribed and transcluded

A table of contents is required for the parent unit in question. Use {{AuxTOC}} if the original work doesn't contain a TOC.

Appropriate infrastructure around the work should exist. This might include internal plain link templates ("lkpl"), dedicated article link templates for use on author pages, formatting templates for repeated formatting elements, etc. All templates should be fully documented.

The article should be linked to from any relevant author pages and suitable portals

Oppose. An article is a complete work. The only requirement for inclusion should be that it actually is an article. This proposal would result in, for example, the deletion of huge numbers (at least hundreds) of perfectly good short stories and similar articles created over more than a decade for no good reason. I can see no reason for demanding every piece of front matter, which might consist of large quantities of indexes, adverts and other material of no great importance but massive bulk and technical difficulty. Insisting on scan backing would be extremely damaging if a particular article is or should be used as a source for Wikipedia. The need to provide online copies of sources to maintain and improve Wikipedia is overwhelmingly more important than the luxury of scan backing. Requiring the creation of templates would be a crushing burden, because most people do not know how to create them. It is in any event wholly unnecessary. Whether the article is linked to is irrelevant to inclusion. I can understand the desire for a main page that links to the article (and even that would take a lot of effort to effect in some cases where a lot of articles have already been created), but the rest is just obstructive. The problem with this proposal is that it would create a massive crushing burden that is wholly unnecessary and produces no useful benefit to the project or readers. It is burdensome restrictions for the sake of restrictions. James500 (talk) 20:18, 29 August 2020 (UTC)
Support. Without a system like the one you have described in place, sub-pages of works could be created wantonly without any means of completing the works from which they were derived. If an article, which is a selection from a larger work, is created without any infrastructure, it will be very difficult for other Wikisourcerors to complete the work which has been started, as they will have to find and upload a scan and set up the complicated not-article material without the aid of the person who created the first article. The new system will also make it easier for other contributors to work on smaller parts of a larger work, without worrying about demanding formatting concerns. TE(æ)A,ea. (talk) 12:30, 30 August 2020 (UTC).
- Content creation should not be described as "wanton". There are means of completing the works from which the sub-pages were derived. If a periodical article is created without so-called infrastructure, it is very easy for other Wikisource editors to complete the work which has been started. It only becomes difficult when someone goes on a deletion spree. And it is massive numbers of nominations that cause problems. James500 (talk) 18:33, 30 August 2020 (UTC)
  - This page is a fine example of what I refer to. A novel contributor, with no previous involvement with this work, or one like it, would have to generate an entire system for reproducing (transcluding) articles from that work. The example I provide is more complete than other pages, and is much more complete, in relation to the whole work, than a single article. It would be very difficult to add to larger works, where the basis is merely articles or other pages in the state of which I complain. TE(æ)A,ea. (talk) 21:21, 30 August 2020 (UTC).
    Oh sheesh is that happening again. Fully agree with you TE(æ)A,ea that it is wanton and of little value. That content does not belong in main namespace. Main namespace is for transcribed work. Constructs and curation belong in portal namespace. I have created the portal and moved the non-mainspace material. — billinghurst sDrewth 23:17, 30 August 2020 (UTC)
    - That page was created more than a year ago. Nothing is "happening again". You did not move the bibliographic information from the mainspace page to the portal. I had to add it to the portal myself. If that important bibliographic information had been deleted by mistake, that is an example of how seriously disruptive the proposed deletion criteria could be. The word "wanton" is needlessly offensive. The primary meaning of the word "wanton" is "sexually promiscuous" and it is applied to other things by analogy. Please do not use that word. James500 (talk) 00:49, 31 August 2020 (UTC)
      - what's "happening again", is the periodic pearl clutching of the deletionists, who are opposed to an open project, and seek to provide a tl;dr of the "one right way" to do transcription. if a text is useful, and people can work to organize it, then we should include it. put a maintenance category, and move on. making up exclusion rules is a waste of time with the prospect of a growing backlog, or filters turning away newbies. take a look at german wikisource, if you want to know how that turns out. [24] Slowking4 ⚔ Rama's revenge 21:38, 1 October 2020 (UTC)

Comment @Inductiveload:
The proposal, as is, would inhibit the ad hoc transcription of articles from "The Times", e.g. The Times/1914 and things linked from {{The Times link}}. Is that in or out of scope for your proposal? Maybe there should be a declaration of some governing principles first: what is looking to be achieved, and indications of what is trying to be stopped. Then we can get onto a structure. I know that we created {{header periodical}} to capture where we have more sporadic collections of articles from newspapers. [Now I could be convinced that such constructions are better to be in the portal namespace rather than main ns.]
Some examples of pages considered problematic would be useful for context. If the proposal is an effort to have articles from a periodical becoming part of a hierarchy of the periodical, ie. subpages, then YES, I fully support that, in contrast to a random root level pages without context to the publication. If the proposal is to set up a fully qualified structure for every periodical where we just want to reproduce one article, then NO. This is self-interest as I regularly want to reproduce an obituary for an author to establish biographical information and we are never going to get all that requisite newspaper construct data, and we are virtually never going to get the scans.
For any newspaper article I have transcribed I will generally do "Periodical name/YYYY/Article name" to give it grounding, and the article would have some "notability". The Times I did an extra hierarchy level. I will accept that there will be early works that I transcribed that may be incomplete by that standard and I would not transcribe them that way today. — billinghurst sDrewth 15:31, 30 August 2020 (UTC)

To be eligible for inclusion, a component of a collective work (e.g. a single magazine article), should satisfy the following criteria:

The component should be "non-trivial" in scope and importance. For example, only a title page or single-paragraph "notice to subscribers" in a magazine is unlikely to be considered useful on its own. However, it would still be part of a full transcription of the rest of the parent unit (e.g. a magazine issue).

~~The work should be scan-backed.~~

Main namespace pages should be created for the work at the top level and ~~any intervening levels~~ a suitable, logical subpage hierarchy developed ~~(e.g. Volume and Issue/Number ranks should exist). Sometimes, the Issue/Number rank redirects to a section on the Volume page.~~

~~Front matter of each intervening level the "parent unit" (e.g. a magazine volume and issue) should be transcribed and transcluded~~

A means to navigate the subpages of the work is required; a table of contents is preferred, though alternatives exist. ~~A table of contents is required for the parent unit in question. Use {{AuxTOC}} if the original work doesn't contain a TOC.~~

Appropriate infrastructure around the work should exist. This might include internal plain link templates ("lkpl"), dedicated article link templates for use on author pages, formatting templates for repeated formatting elements, etc. All templates should be fully documented. (additional) Parent template exist to make this readily easy.

The article should be linked to from any relevant author pages and suitable portals; (additional) orphaned pages are not acceptable.

- If an article is orphaned, that is certainly a reason to add links to the relevant author page or portal. It is not a reason to delete the article. Issues that can be addressed in a very straightforward way by adding links to other pages are not suitable for use as deletion criteria. Why would you delete the page instead of just adding the links? This kind of thing belongs in a style guide. I suggest the words "eligible for inclusion" are the problem with some of these criteria. James500 (talk) 01:33, 31 August 2020 (UTC)
  We are wanting to get people to link. We don't delete a work for lack of a linking, we are not that petty. What that criteria does is limit the transcription and addition of the trivial, linking indicates that it requires some relevance. — billinghurst sDrewth 14:57, 31 August 2020 (UTC)
- @Billinghurst: I mostly agree with your formulation - that's more flexible in the case of newspapers. @TE(æ)A,ea.: has already given an example, but there are several more examples in the #Policy on substantially empty works below.
- I do still think we should be requiring the front matter, but perhaps only when we have scans. Usually, it's just a title page or issue banner, it usually provides the date and number as in the original and it prevents the main-space page being just a floating TOC: e.g. The Chinese Repository/Volume 1 and The Chinese Repository/Volume 1/Number 1, versus, say, The London Quarterly Review/39 (which doesn't have a scan, so it's kind of fair enough in this case, but if it had a scan, it should get the front matter).
- I was going to disagree with the removal of the scan section, but if it is downgraded to "if possible", since the current global policy is pretty much "scans if at all possible", it doesn't need to be repeated.
- For clarification: by "Parent template exist to make this readily easy." do you mean things like Template:Authority/lkpl? Inductiveload—talk/contribs 11:11, 31 August 2020 (UTC)
  I was meaning template:article link primarily as it is more what we have used for journals. template:authority/link is more aligned to dictionaries and the like. But yes, one of those as the parent template, or used directly. If we have a scan, then yes to front matter, so we can qualify in the regard of its existence.
I have a question; let's take Golfers Magazine. I expect that there will be exactly one article ever transcribed from this--Ask the Egyptians, by Rex Stout, an obscure short story by a not so obscure author. I'm glad to provide scans; I think we should demand scans for stuff that wasn't originally published digital. And it will get tucked under a Golfers Magazine/Volume 28/Issue 3/Ask the Egyptians. But how much work do you expect here? I would begrudgingly create a ToC for the issue, but messing with templates seems completely unnecessary.--Prosfilaes (talk) 14:03, 31 August 2020 (UTC)
Personally I think that scans are nice, maybe preferred, not mandatory. Sometimes getting scans is either not possible, or just problematic. I have numerous newspapers to which I can get access through subscription sites, but producing scans to upload is just MEH! especially if I just want an obituary reproduced. (Noting that where I just want a rough transcription or a snippet, these days I put it on an author talk page.) Have a poke at Category:Obituaries for a range of sources that others and I have used.
For your example, I would have gone for "Golfers Magazine/YYYY/article name" and then slapped down {{header periodical}} at the root level, as we get more years, then we can break it down further. — billinghurst sDrewth 14:57, 31 August 2020 (UTC)

@Prosfilaes:, what I think would be nice here might be:
- The top level page, pretty much as it is. Doesn't look like there's much more to say about this work.
- I can't really see any sensible templates (note "might include" in the proposal) to create for this work. It's not a dictionary so it doesn't obviously need a lkpl, and it's not big enough to merit an article link template of its own. Perhaps if all the headers are identical, there could be a formatting helper, but not critically needed.
- Personally, I'd like to see the cover if there is one and it's "nice" like this one (obviously not a library binding), and the issue header on the issue sub-page, but I can see the argument that it's a bit pointless if there is no intention to transcribe the rest of the issue. The TOC (which already exists in the original work) is something I'd prefer to see if possible, but I do get that it's a bit of an imposition in this case, where only one article is "interesting".
- A list of the known scans somewhere (90% of periodicals seem to do this in the mainspace, but that's evidently controversial). It looks like Hathi has an incomplete list and the IA has another Google-fied copy of v.12, so in this case probably just what Hathi has. A lot of the time a mish-mash is needed to get a set of links. Uploading is strictly optional - obviously preferred, but we all know how much of a pain it is, and page-listing and checking periodicals is pretty masochistic, so it's absolutely not needed.
- Again personally, I prefer "Golfers Magazine/Volume 28/Issue 3/Ask the Egyptians" than "Golfers Magazine/1916/Ask the Egyptians" since we might as well put things in the correct place ahead of time and it provides the obvious place for things like front matter. But I know that's not how it's always done, especially for newspapers where the content is often even more sparse, proportionally speaking, than magazines. Inductiveload—talk/contribs 15:54, 31 August 2020 (UTC)
  @Inductiveload: If we can get that data, then that is definitely preferred, and I would think that for journals we would encourage it. For newspapers, I doubt that we are going to get the coverage, and they are just a lot harder due to how those beasts are constructed. Probably a case of differing guidance, and different tolerances. — billinghurst sDrewth 14:23, 20 September 2020 (UTC)
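The two subpage naming schemes discussed in this thread ("Periodical name/YYYY/Article name" and "Periodical/Volume N/Issue N/Article") can be compared with a small Python sketch. The helper names are hypothetical and no such tool is implied to exist on Wikisource; the year 1916 is the one used in the example above.

```python
def subpage_title(*parts: str) -> str:
    """Join non-empty title parts with '/' to form a subpage title."""
    cleaned = [p.strip() for p in parts if p and p.strip()]
    if not cleaned:
        raise ValueError("at least one title part is required")
    return "/".join(cleaned)


def by_year(periodical: str, year: int, article: str) -> str:
    """Scheme 'Periodical name/YYYY/Article name'."""
    return subpage_title(periodical, str(year), article)


def by_volume_issue(periodical: str, volume: int, issue: int, article: str) -> str:
    """Scheme 'Periodical/Volume N/Issue N/Article'."""
    return subpage_title(periodical, f"Volume {volume}", f"Issue {issue}", article)


print(by_year("Golfers Magazine", 1916, "Ask the Egyptians"))
# Golfers Magazine/1916/Ask the Egyptians
print(by_volume_issue("Golfers Magazine", 28, 3, "Ask the Egyptians"))
# Golfers Magazine/Volume 28/Issue 3/Ask the Egyptians
```

The year-based form needs only the publication date, while the volume/issue form leaves an obvious place for front matter at each level.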

Caveat: I was hoping I would find the time to really dig into this and contribute something with some thought behind it, but I keep being disappointed, so instead I'm just going to do the drive-by thing. Sorry!
I Support Inductiveload's proposal as written. I disagree with Billinghurst's proposed softening, in particular regarding scans. We need to start getting a hard scan requirement (with the obvious exceptions) into policy, and partial works like this are where the requirement is most urgent, as it is a de facto requirement for other contributors to be able to work effectively on completing the work. I am open to, and lean towards, removing the templates requirement. Templates are very hard for most people, and a somewhat tall order even for long-term Wikimedians, and I don't consider bespoke templates to be a critical factor.
I also support soft application of this policy, the same way we allow for {{incomplete}} and {{missing image}}. Billinghursts concern regarding gigantic efforts required for front and end matter (long tables of contents, indices, etc.) is a legitimate one, but I think this is better handled by softing application than softening the policy. If the text is put in a sub-page structure, is scan-backed, and the front matter is coarsely there, I can live with something like a hypothetical {{toc part missing}} or {{issue toc missing}}. With all the coarse structure in place, filling in detail is eminently doable by crowdsourcing.
I also stress that I don't consider the establishment of this policy a bright-line immediate cause for deleting existing texts. I oppose an explicit grandfather clause in this policy, but I !vote in favour of it in the context that our practice is not to proactively mass-delete historical texts just because we raise the standard for quality. I do, however, expect that individual texts that do not meet this new policy will be proposed for deletion piecemeal over time, as people happen to run across them, with no progress toward meeting the standard, or are too pathological to fix (which should certainly be the first approach whenever possible). And my expectation is that in those discussions those texts will either be improved to comply with this policy or they will be deleted in accordance with this policy. I also very much expect contributors who disagree with this to express their disagreement politely and constructively: prioritising different factors (e.g. quality over quantity) is in no way shape or form cause for name-calling or ascribing ulterior motives to other contributors. --Xover (talk) 13:22, 22 November 2020 (UTC)

Re. I am open to, and lean towards, removing the templates requirement: note that as written, this was intentionally worded as "may include". Not all works need their own templates and {{article link}} and plain links instead of lkpl are sufficient. Both are fairly trivially substituted later if a need becomes apparent. For example, the Journal of Classical and Sacred Philology probably does not need a "lkpl" template, because articles rarely refer to each other in the text. Notes and Queries very often does refer to other issues, so {{NAQ lkpl}} is useful.

Re. With all the coarse structure in place, filling in detail is eminently doable by crowdsourcing. this is exactly the outcome I am hoping to facilitate. Piecemeal crowd-sourcing is the only way periodicals are going to get any work done on them, because there are literally millions of pages of them, but any one article is a valid work in its own right. However, having the pieces nuked (or proposed for nuking) because they're not completed to a level that's not actually documented to be required is somewhat discouraging to drop-inners. Inductiveload—talk/contribs 01:36, 13 February 2021 (UTC)

Re my supposed softening. Unless you are going to provide scans of The Times and thousands of other newspapers and going to align the articles from within a page in strips for individual proofreading, then you need to allow for scanless. If maybe this only needs to be at the daily/weekly periodicals level (newspapers), rather than for journals, then I am okay with that. I do agree that we should always strive for a whole journal, or a whole scan, however, with newspapers this is just too high a bar to impose. If you look at our existing transcriptions of scans of newspapers they can be just as butt ugly, and unuseful; particularly around the advertising material, the columns, etc. and on a whole page determining when that page is actually proofread. Sites like Trove have enough issues and they have line by line systems based on article by article. — billinghurst sDrewth 05:48, 13 February 2021 (UTC)

While there should be no requirement for creating link templates, where link templates are created for the purpose of linking to a work in either the short or long form, I believe that they should be used. Firstly they allow for reverse tracking of uses of works, secondly they allow for checking of the build of a work, and they allow for uniform display of works in namespaces. And for link templates they will allow for us to quickly update to a standard form of citation if we ever get to that point. Then my pipedream is that we can utilise WD to generate citations without having to manually curate; to do that we need the link templates.

Re lkpl one will often find that they are useful for interwork links in main ns where these works are cited within other publications, it is not solely for intra-work links. That said all these templates can come later. — billinghurst sDrewth 05:48, 13 February 2021 (UTC)

Not having read the intervening discussion, I Support this approach to adding content, but it should be a "helpful guidance" rather than a policy, in the same way that adding works without a scan is discouraged, but not against any policy. —Beleg Tâl (talk) 02:15, 23 February 2021 (UTC)

@Beleg Tâl: to be clear: the reason this is presented as policy is that it has a bearing on multiple deletion requests at WS:PD such as American Jurist and Law Magazine/Volume 1. There is no policy right now that says anything about such pages, so they are technically perfectly valid. This then causes understandable friction when they are proposed for deletion. There is also no threshold indicated for when such a work is clear of the threat of deletion, which is actually my bigger concern, because if someone wished to improve this AJLM, there's no "safe" benchmark standard to aim for. For example, is Journal of Classical and Sacred Philology/Volume 1 (my proposed "model" for an incomplete journal below) "safe"? Inductiveload—talk/contribs 09:27, 23 February 2021 (UTC)

@Inductiveload: that is an understandable reason for wanting to establish policy. I think your proposal is overthinking it however. A few lines in WS:D#Precedent and WS:WWI#Excerpts should suffice for this purpose. Your proposal is better suited to be an improvement to Wikisource:Periodical guidelines, which is a guideline rather than a policy.—If I have time, I'll stop by the discussions in WS:PD and see what's up. —Beleg Tâl (talk) 13:48, 25 February 2021 (UTC)

@Beleg Tâl: Calling it a guideline is fine by me, as the line is rather blurry to me anyway. Moreover, like I said, my main motivation is to establish a baseline for works to aim for, and specifically not to set a rule to enable easier deletions (which is why I started this bun-fight, because I don't think deletion is the right method to force improvement, but it's currently our de facto "oi, fix this" signal, which isn't very friendly).

I just want "the roolz" written down somewhere, with some semblance of consensus, somewhere other than a discussion archive. I'd also like to introduce the concept of using Portals for periodicals at both Wikisource:Portal guidelines, and Wikisource:Periodical guidelines, neither of which mention Portals for periodicals at all. But first we need to settle on how Portals are supposed to be used for periodicals, and how they relate to their mainspace counterparts. The main examples are Portal:Popular Science Monthly and Portal:Weird Tales, neither of which I would call exemplary (though the subportal lists of PSM are exactly what I think portals are for). Inductiveload—talk/contribs 14:48, 25 February 2021 (UTC)

I read the intervening discussion and I generally Support this proposal. AnotherEditor144 ^{t - c} 08:24, 1 March 2021 (UTC)

Support I think that this site can stand out from other sites by providing verifiable etexts. I know that in scholarship you need to be able to see the original text to confirm that the etext is accurately transcribed. So, I think that all texts should be scan-backed. I'm also against hacking up texts and just posting snippets. It ends up being a disorganized scrapbook. At the same time, I don't think that we need to accurately reproduce the exact text layout of a newspaper. It's ok to have something like NYT/March 26, 1911/Fire Kills Women at Triangle Factory as long as I can click on the page and see the original scan. Do we need to transcribe and transclude TOCs? Absolutely, they are an integral part of the work. I'm also against a grandfather clause because I believe that we need to gradually purge all etexts that are not scan-backed. If we are completely scan-backed, it will be easier to attract the interest and support of academic institutions that are looking for an alternative to expensive databases. If we have a mix, then we're half Project Gutenberg, a quarter of random texts pulled from some website, and only a fraction of decent, scan-backed texts. Languageseeker (talk) 02:35, 5 March 2021 (UTC)

No-content mainspace pages

This one is probably even more controversial so it's a separate proposal:

Collective works are commonly referenced by other works. Due to this, it is permitted to pre-emptively create the top-level main namespace page to collect incoming links, even when there is no content ready for transclusion. This also allows labour-intensive research into location of scans to be preserved and presented to users even when no transcribed work has been completed. The following is required for such a work:

A header with a brief description including active dates, major editors, structure (e.g. series) and so on

Redirects from alternative names (e.g. when a work has changed name or is referred to by other names)

A listing of volume scans should be added, and it should be as complete as possible, based on availability of scans online. As always, creating Wikisource index pages is preferred, but external scans are acceptable.

Creating sub-pages (volumes or issues) should follow the article inclusion criteria. This means a sub-page should not be created if there is no content.

Oppose As above these restrictions are an unnecessary burden that would produce no real benefit and presumably result in a lot of deletions. We do not need lists of editors. We do not need a complete list of volumes. (There may be hundreds of volumes of a particular periodical that have scans. For example, a page with links to scans of twenty volumes should not be deleted because the creator failed to link to scans of another eighty volumes.) Lack of redirects is not a reason to delete these pages either. James500 (talk) 20:37, 29 August 2020 (UTC)
Support, mostly. Generally speaking, I think that if a periodical changed its name, then there should be a separate page under the new name; however, redirection pages from alternate titles would be preferable. The other requirements are not overmuch burdensome, and would make useful a page that is otherwise empty, due to a lack of transclusions. TE(æ)A,ea. (talk) 12:30, 30 August 2020 (UTC).
- None of our periodical pages includes the names of the editors, as far as I am aware. Not one. Under this proposal, every single periodical we have would be deleted. Further, it is not possible to include the names of the editors when they are anonymous. James500 (talk) 18:24, 30 August 2020 (UTC)
  - @James500: "every single periodical we have would be deleted" - or we could make the effort to improve such works as we find them. Generally, an excerpt from Wikipedia or some other source would do just to provide some context. E.g. The Condor vs The Journal of Jurisprudence, which has the dates, but not other useful info, not even the country. For example, even a quick trawl would allow one to write something like "The Journal of Jurisprudence was a Scottish law journal published in Edinburgh from 1857 to 1891. The first successful Scottish law journal, it covered all aspects of the Scottish legal system and included editorials, biographies and short articles as well as case law and reporting of legislation. It merged with the Scottish Law Magazine in 1867. It was largely replaced by the Juridical Review in 1891.". The editors aren't particularly obvious here (so they're not "major editors"), but sometimes editors are important to the work's history and are explicitly noted, e.g. All the Year Round or The New-England Courant.
  - Basically, if a page has zero or near-zero transcribed content, in my mind it can edge over the line into acceptable as long as it's providing useful auxiliary bibliographic information, which might also include collation of various names. This is somewhere WS can actually provide value-add - nowhere else online, as far as I know, provides a venue for this information (IA/Google metadata is terrible, OCLC is not very good at periodicals, Hathi can't be downloaded from easily, none are editable, often a complete scan list uses various sources, etc). However, "it was a periodical and here's a handful of raw external links, kthxbai" doesn't quite cut it, even for someone who thinks these pages can be useful like me.
  - I've said it before several times, but the aim here is not, not, not to get all the pages like The Journal of Jurisprudence deleted, but instead figure out what needs to happen to keep them. To me, a decent blurb and a tidy list of volumes and scans will do it, but that's far from consensus. As it stands, as far as I can tell, the only reason half of Portal:Periodicals isn't getting unceremoniously dumped into Portal space (something I personally would like to find an alternative outcome to) is no one really wants to deal with it. We can fix that by coming up with a minimum level which the pages should meet and then fixing them up. Inductiveload—talk/contribs 12:37, 31 August 2020 (UTC)
- @TE(æ)A,ea.: about the names, above is an example, where The Journal of Jurisprudence absorbed the Scottish Law Magazine in 1867. Though technically after the merge TJJ became The Journal of Jurisprudence and the Scottish Law Magazine (e.g. here, but not the title pages), it was still the same work. So in my mind, we could have The Scottish Law Magazine running up to 1867 and then The Journal of Jurisprudence for 1857–1891, with notes about the merge in both headers.
- Another example of a work that changed name, but remained the same fundamental work is Monthly Law Reporter, which was just The Law Reporter for the first 10 years, and even kept the volume sequencing over the name change (though it added a "new series" number). So The Law Reporter should probably be a redirect. Inductiveload—talk/contribs 12:37, 31 August 2020 (UTC)
  - The Scottish Law Magazine [and Sheriff Court Reporter] was originally called the Scottish Law Journal and Sheriff Court Record. It has a page already which includes the volumes up to 1867. James500 (talk) 15:10, 1 September 2020 (UTC)
    - @James500: Then a link to it should have been in the description already. I have added it and expanded the description as above. Feel free to add more details. Inductiveload—talk/contribs 15:50, 1 September 2020 (UTC)

Comment Periodical main namespace pages should not contain the curated information of scans, etc.; that is the job of the Portal: namespace. Main namespace should only contain published information for works that we have prepared. So under your proposal, the main ns can exist, and it should contain contents of works that we have transcribed, and there should be a corresponding portal: or there can be a constructed Wikisource: project page where there is a project to do the work. This was discussed years ago, and we have been moving those constructs to portal namespace for years. If there is zero content at the page, and we are unlikely to have it, then it can be redlinked, or maybe if it is that obvious then we don't need a link at all. Examples would be useful. — billinghurst sDrewth 15:42, 30 August 2020 (UTC)
- You are the only person moving these pages into the portal space. I would like to see a link to the alleged discussion you refer to. James500 (talk) 18:24, 30 August 2020 (UTC)

@Billinghurst: I personally don't see huge value in simply shunting just scan links to Portal and leaving them there:
- It eventually leads to having two parallel volume lists, one with links and one without, sometimes with divergence.
- It tends to end up with "scratchpad-level" content in Portal, which is supposed to be a nice presentation space.
- Portals are badly integrated and will probably not be noticed by casual users, or even many Wikisource editors. Especially as the Portal headers never seem to actually link to the mainspace works that exist, but we can fix that.
I suggest Portals like Portal:Punch provide some useful value-add, whereas Portal:Notes and Queries does not (yet), and its current content, if anywhere, should be on a WikiProject, just on the mainspace talk page, or even nowhere now all the volumes are uploaded. If the consensus truly is to shunt this all to Portal and move back once there's content, then fine, but I do wonder if that's truly the most ideal strategy. From a pure "only reproduced content in mainspace" angle, perhaps, but does that serve readers best? Inductiveload—talk/contribs
@Inductiveload: Main namespace is content for the reader. There is nothing worse for a reader to go to a page and have to drill down multiple pages to find that there is no content just some dashed skeleton of hierarchy. Main namespace is not built to drive transcribers and transcriptions, that is our other content spaces. We can create a page there once we have content to display what we have to read, and point to the portal for what we have to transcribe. It is the reason we put in place the portal namespace. — billinghurst sDrewth 15:08, 31 August 2020 (UTC)
I also wish to avoid the really ugly situation of people uploading a work, creating the front page, and then just leaving it for other people. That facadism of a work is just problematic, and we know that nothing happens to it. It is why we developed {{ext scan link}} and {{small scan link}} for use in the author namespace to do that role of managing that list build. So portal and author namespaces play that role and keep main namespace cleaner and more functional. — billinghurst sDrewth 15:15, 31 August 2020 (UTC)

@Billinghurst: I'm not saying that we should be creating pre-emptive "empty" hierarchies. I'm saying that I don't really see the point of shunting all the scan links off to a portal where they will basically never be found by anyone who isn't extremely familiar with Wikisource and the mainspace/portal split. If a casual reader is after, say, Volume 22 of The Atlantic Monthly, for which we have neither scans nor content, do we serve them better by placing a scan link to the IA on the mainpage next to the redlink so that they can at least find what they wanted, or is it better to have no redlink at all, skip Volume 22 in the list and maybe put the IA link at a portal? If the latter, I'm fairly certain 95%+ of people will just not find that link at WS. We can certainly adopt a stance of if it doesn't exist here, we don't even want casual readers to be presented with an external resource, but that seems slightly walled-gardenish for an open project.
"Facadism" is annoying, and it (or the perception of it) is what has brought us to this point via the proposals at WS:PD. As an example from that page, I don't find the concept of the page American Law Review intrinsically offensive in mainspace, even without any content (though perhaps it's a little untidy as-is), but I don't really see the point of American Law Review/Volume 1 as it stands (only a title page and redlinked TOC, though it's a single article away from being useful to me).
- Notably, I find "facadism" of a collective work much less annoying than, say, only having the preface to a novel. Collective works can have individually-useful things slotted in bit by bit, and if there's a framework around the work, it's even easy to do.
And if we do want to ditch this proposal and be strict with Portals in this way, then 1) it needs to be documented that that's how it works (Wikisource:Portal guidelines and Help:Portals don't mention use of Portals for this purpose at all, they focus more on thematic curation) and 2) most existing periodicals need to be converted over: many people reasonably imitate existing structures, we can't blame them for that.
And do we allow redirection from a non-existent mainspace page to the portal so it can be found via "normal" linking until such time as there is content? Inductiveload—talk/contribs 17:09, 31 August 2020 (UTC)

The word "facadism" is needlessly offensive and should be deprecated in favour of something that doesn't sound like it refers to habitual dishonesty. I would urge that care be taken when coining neologisms to consider how these words might be taken. James500 (talk) 15:32, 1 September 2020 (UTC)
What? It means that there is a face only. Nothing more. There is nothing offensive about it and I don't even see where you can draw that inference. You are digging too deep or looking for insult. Front-pageism is meh! So unless you can find a better term can you please AGF. — billinghurst sDrewth 18:58, 1 September 2020 (UTC)

Oppose I disagree with Inductiveload's position, and agree with Billinghurst's (provided I have understood them both correctly, which is not a certainty). We should significantly raise the bar in this area for mainspace pages, and anything that is not (part of) an actual published work should be shunted to other namespaces. I acknowledge the downsides to that approach that Inductiveload brings up, but I think we should find other ways to ameliorate those. I also agree that the main purpose in setting a higher bar is to have a clear and predictable standard for contributors to aim for to enable keeping a work, with deletion being an admission of failure (i.e. deletion is a sometimes necessary, but never a desirable, outcome). I disagree that shunting content to other namespaces is a bad thing, as it is a great way to preserve content that would otherwise be deleted. Maintaining clear purposes for the namespaces makes possible technical innovation in the long term, through better integration with Wikidata and similar measures. --Xover (talk) 13:44, 22 November 2020 (UTC)

@Xover: Re: Maintaining clear purposes for the namespaces: I think part of my problem here is that Portal namespace is overloaded with two kinds of content: curated "exhibition-style" information and a dumping ground for lists of links shunted from mainspace, where they are all-but-invisible to the average user. IMO, either all the "volume list" pages should be in one namespace or the other. For example, The Times and Portal:New York Times are basically the same thing, but one is in Portal space and one is not. And very rough lists probably should go somewhere in Wikisource-space if they're so rough they're not suitable for public display.

As an aside, this is somewhere I think a cross-namespace redirect (if the mainspace page doesn't exist yet) isn't a summary hanging offence. Inductiveload—talk/contribs 17:33, 8 January 2021 (UTC)

@Inductiveload: I agree, I think, with all your points here; but I fall down on the other side of the line on them. Portal:'s purpose is a bit overloaded, but I prefer that to overloading mainspace's purpose. I think The Times and Portal:New York Times is a bit of a distinction without a difference, and I have no clear idea of what we'd actually put at The Times that would be materially different (in terms of the principles we're discussing here) from Portal:New York Times, but I'd rather have a bright-line rule for mainspace with common-sense exceptions (after community discussion) for works like The Times.

Or put another way, my highest priority is raising the quality bar for our main presentation namespace. I also care about the quality of other user-visible namespaces (Author:, Portal:, Translation:), and about practical issues like organization of work, findability of scans and bibliographic info for not-yet-proofread works, and barriers to entry and effort required to contribute; but all with a lower priority than maintaining quality in mainspace.

My immediate instinct regarding the duality of Portal: is not to "pollute" mainspace, but to find some good way to clean up Portal:. Typically by thinking up some better alternative along the lines of a new namespace or pseudo-namespace (like WikiProjects) for those purposes. Ideally with some form of technical innovation that would make that alternative desirable, not just tolerable, for the relevant stakeholders. Perhaps there's an opportunity for tooling to manage scan links and bibliographic data in a structured format, possibly even integrated with Wikidata? Overlapping with the WikiCite/Worldcat-killer/VIAF-replacement effort perhaps? This might even fit into a grander vision of tooling and integration for structured data on enWS, where everything that's in {{header}} in mainspace pages today would be editable in a GUI, backed by Wikidata, and inherently structured; and where we have a defined and tool-supported workflow from creation of Author: pages, populating them with works, adding scans, creating indexes, proofreading transcluding, promoting, etc. There is a lot of potential there, and a lot of it can be solved piecemeal: maybe a better alternative for "scan-list pages" could be the first piece of that puzzle? --Xover (talk) 09:28, 9 January 2021 (UTC)

I have no comment on other aspects of these proposals, but I think portals should not be used as the main linking places for volumes of specific magazines. I vehemently oppose having portals for the titles of magazines for example, in order to link to relevant volumes of that magazine. If we're going to, in the mainspace, call these volumes for example "The New Yorker/Volume 1" then having the main page be "Portal:The New Yorker" is a bit contradictory. Also, it would make searching the volumes easier, not harder to have a list of volumes in the mainspace rather than in other namespaces. I know that to the long-time editors of Wikisource and other wikis it seems trivial to ask someone to search for the portal when you need the volumes of a magazine listed. But think about the majority of our readers, who are not familiar enough with how wikis work to think "hmm, there must be a portal for this, in another namespace". I imagine most people who come to our site stumble upon it via Google search fairly randomly, and don't spend all that much time digging around here. So making it harder by moving these things to other namespaces is counterintuitive especially for readers who are generally unfamiliar with the site's practices or how the site is laid out. PseudoSkull (talk) 12:11, 29 January 2021 (UTC)
@PseudoSkull: They are not meant to be the place to where you link. All links to works are meant to be works in the main namespace. The problem is the contentless linking to some jumbled construct, so separate real transcribed ready content from some attempted misrepresentative dump of a load of links. — billinghurst sDrewth 12:20, 29 January 2021 (UTC)
Anything in main ns is meant to have been proofread, and ready to read. It is meant to be standard, it is not meant to have links off to here and there, it is meant to conform with our standard outputs. These pages with tens of external links, or tens of small scan links as the focus are not presentation material. If you are adding a structured or link in any work then it is meant to be main ns. Portal links are typically through portal parameter in a header, it would not be normal to put a link in the body of a work to the portal ns. WS:Links essentially says that. To me it is a discussion of where do we have an encouraging and coordinating space for these large multi-volume works. Previously, it has been the opinion of the community that coordination is WS:WikiProject or Portal: sort of depending what you are doing, and as work is proofread, that is done in main ns. This may mean that there are bits in main (proofread works) and Portal: nss (coordinating components) as work continues. — billinghurst sDrewth 12:40, 29 January 2021 (UTC)

@Billinghurst: Not to say that policy or precedent aren't important, but you mostly pointed me to policy precedent and did not address my concern that our reader base (not our editing community) would be negatively affected by having links for the same work in a different namespace. For example, the page Popular Science Monthly would not be "contentless linking to some jumbled construct". Everything on that page would link to volumes of Popular Science Monthly, not some jumble of other magazines along with it. (This particular page should not be linking to Index pages though as it actually is now, it should be linking to the actual transcluded works, especially since it appears that Popular Science Monthly coverage is pretty complete. So it unfortunately is not a very good representation of my ideal scenario, but I don't know of any more complete magazine coverages here.)

I will agree that our magazine/newspaper/etc. coverage needs a massive lot of work and a lot of our mainspace magazine/newspaper pages have very little content which is an issue, but this issue should be solved by more work from the community, not by moving the pages to a place where they're harder to find. One should consider a page with very little content a sign that the coverage should be improved, not that it should be moved to some other namespace because it's so incomplete. These proposals seem to have this focus on the incomplete pages and not the final results of those pages which would be complete eventually (hopefully). In the case of a complete or nearly complete magazine coverage especially, definitely having all the volumes linked to in the mainspace is necessary. PseudoSkull (talk) 14:00, 29 January 2021 (UTC)

@PseudoSkull: What do readers come here to do? READ. What is produced in those pages to read? Nothing, but a volume list and a title list. What is its value? What is its purpose?

In our early years people came here and dumped scans of OCR'd text that was scraped from IA and then walked away. We are still cleaning up that unholy mess. It was useless, ugly, and unproductive and just created work for others, and made the site look poor. The original works were never out of scope.

This scenario is pretty similar. These bulk loads and dumps into the main namespace are not much different. I am all for someone coming in and working on a project to get things set up properly to present proofread data to our readers, and pathways for our proofreaders. But it takes the effort and the diligence and the proofreading, and the rigour of the community to not have a polluted main namespace that is full of someone's ideas with the sole reasoning that someone may wish to transcribe it at some point. Show me the pathway to a quality proofread page. Also if you want to know the history of PSM and the hard work that has been done there, then come to me and talk. If you want to know about setting up a project to transcribe and transclude 63 volumes of Dictionary of National Biography then come to me and talk. If you want to talk about setting up compilations then come to me to talk. If you want to know about fixing up works and ongoing maintenance for people who come here with a good idea, dump some text and then walk away and have to have it remedied and tidied and managed, then come to me and talk. I have an account with a few edits, and have a couple of bots with a few edits too.

I am all for displaying good proofread work in a logical manner, and for pathways to proofreading. One shouldn't just give us shit or shit to tidy up, and not expect some kickback because they have had a "good" idea or a feeling. — billinghurst sDrewth 21:27, 29 January 2021 (UTC)

Idea: can we work on a very small periodical (only a few issues) and bring it up to "Wikisource standard", so that 1) we can all see what is "right" and 2) we have something to use as reference when discussing things? Perhaps Journal of Classical and Sacred Philology, which appears to have only 4 volumes.
Unless anyone knows of a periodical that's already "perfect" (not necessarily complete) by Wikisource standards? Inductiveload—talk/contribs 13:10, 29 January 2021 (UTC)
- Further re. best practices (@Billinghurst: as the mover): what should be done with incoming links to things like Irish Builder, which was moved without redirect to Portal:Irish Builder? As it stands, several pages, including a redirect, now have redlinks, and the Portal is a functional orphan (only linked to from WS:PD and its own subpage). What is best practice for directing would-be readers to the Portal page (since cross-namespace redirects are a CSD)?
- And should d:Q6070563 link to the portal or what? Inductiveload—talk/contribs 12:15, 2 February 2021 (UTC)
  The linking policy describes actions for links; the content policy describes action for content. The red linked redirects should be deleted per the deletion policy. Done. Why would we be directing _readers_ to the portal pages when there is no content? — billinghurst sDrewth 22:12, 2 February 2021 (UTC)
  Directing prospective readers to what is, to my knowledge, the only (free) list of volumes with links to scans of a periodical (and certainly the only one that a user can expand themselves) on the entire Internet does not seem a totally pointless endeavour to me. Shunting the content to Portal pages just to leave them functionally orphaned doesn't seem ideal to me.
  
  It's fine to point at policy, but if the policy is actively removing valid content (maybe not valid mainspace content, maybe imperfect, but valid content nonetheless) from the web, I might suggest that the policy needs work. Which is why this topic exists up here in Proposals. Inductiveload—talk/contribs 09:08, 3 February 2021 (UTC)
  The answer is to build proper content, not practice façadism. The content was moved and is still readily available. On that landing page the text says ...
  Searching Search for "Irish Builder" in other texts. If the text you are looking for is not English, see its corresponding language Wikisource. If the text is not a source text, check one of the other Foundation wikis.
  The link search of "Irish Builder" finds the work. The same search in the search box finds the work. This community has had this conversation over and over and over about some people coming and creating a front page, and then doing nothing else with the works. It has not been evident that this has been a successful strategy, compared with putting in the effort to create the framework, and building from there. Build the content, work with the users to build the content. I tried with this user, but they were not interested in building proofread content. I was basically told I didn't know what I was doing. I moved on, life is too short. — billinghurst sDrewth 10:33, 3 February 2021 (UTC)
  Sorry, but I don't agree that expecting the average non-editor to know that there might be a useful resource on a "portal" page which is unlinked from anywhere they might plausibly find it organically and throwing them to the search tool is a good idea.
  
  I'm not talking about any specific user, I'm not even talking about a specific work: I want to make best practice clear, and so far, I do not see a satisfactory description of how to structure these things. Hand-waving at nearly 2 decades of archives split between WS:S, WS:S/H, WS:PD and who knows where else and saying "we've said this before" isn't helpful. If you want people to edit according to your expectations, you must make those expectations clear. And it is far from clear what the expectations are. I don't know what they are, that's why I'm asking. We don't have, as far as I know, a single periodical work that we might call an exemplar. Even the PSM has a confused mainspace/portal paradigm.
  
  I mean, there isn't actually even anything that lays out any expectations for proofreading of pages transcluded to mainspace, as was made abundantly clear to me here, with reference to a (mostly stalled, almost entirely unhelpfully titled, and, other than the place it was pointed out to me, never before or since referenced) 2011 discussion at Wikisource:Scriptorium/Archives/2011-11#Variance_to_a_rule_is_requested. So we really should not be surprised when people think that anything goes, because, in practice, it does (and that's just the ones that are tagged with {{incomplete}}).
  
  As I mentioned higher up, there is also zero mention of the use of portals for this purpose in Help:Portals or Portal guidelines, and there is also no mention of portals on Wikisource:Periodical guidelines. Inductiveload—talk/contribs 11:39, 3 February 2021 (UTC)

- One periodical I haven't seen mentioned yet is The Atlantic Monthly, which excepting the dozen or so extra articles at the bottom, is generally backed by scans, sorted into volumes and issues, has pretty good TOC coverage, and even attempts to show TOCs for periodical works that appeared in chunks across multiple issues. -- Mathmitch7 (talk) 18:42, 2 March 2021 (UTC)
Support, generally. I've always thought of Weird Tales as a best in class example of a periodical page.--Prosfilaes (talk) 21:36, 3 February 2021 (UTC)
That is a nice periodical, but the associated Portal:Weird Tales has a stale issues list (which is an example of what happens if you duplicate information in two places) and there's also a blurred line between "synthetic" content and "real" content (for example the manual TOCs on the year mainspace pages like WT/1923 and the external links). Inductiveload—talk/contribs 11:47, 4 February 2021 (UTC)

Comment Periodicals are major undertakings, and unless you approach them as such then expecting magic to occur with them is just ridiculous. I come back to the point that just getting volumes of works dropped into place, without all the project and curation aspects of such large undertakings, is highly problematic and leads to rubbish outcomes, and the problems we are encountering. Dropping them into portal is not ideal but it is better than dropping them into main namespace. That people think that making them available is going to be helpful or get a quality product is naïve thinking. That people ignore the experienced people who have tidied these up previously because they have a better idea, just is wearing on those who have been there before. I keep having to come back to fixing up works that are just dropped into main namespace and it is entirely shitful the inconsistency and the management that needs to be undertaken to fix these up. Excuse me if I get frustrated with explaining it, but it is not up to me to write policy or have to do the documentation, it is not my skill set. Make your decisions, do you want quality product in main namespace, or do you want dribbles of inconsistent content. — billinghurst sDrewth 23:15, 12 February 2021 (UTC)

I have to disagree with "it's not up to me to write policy or have to do the documentation". As admins, it is exactly our job to do this and facilitate other users to add to the collection. And if we don't, we can't blame people when they do it not to our liking. And it's even less surprising people aren't "doing it right" when, after 5000 words in this section alone, there's still no actual explanation of what the current expectations even are, so I can't even write the documentation, because I have no idea what it should say. If you're sick of these conversations, perhaps that's because WP:SHY people will keep asking questions that they don't see an answer to, and the WP:BOLD people will just do it a way they find reasonable.

Also re "Periodicals are major undertakings, and unless you approach them as such then expecting magic to occur with them is just ridiculous": Like it or not, the default state of any periodical will always be incomplete, firstly because periodicals are generally enormous corpuses and secondly because the individual articles stand alone, and sometimes only one article is actually available and of specific value.

As I said, there's nothing (policy, guideline or otherwise) that even recommends not to transclude half a book of raw, unproofread OCR, and throwing people into nearly 20 years of Scriptorium/WS:PD archives actually results in finding discussions that support a case that it's allowed.

There has still been no comment on a proposal to discuss a specific, concrete example of a small periodical that we can whip into an exemplar state for an incomplete journal, which can then be used as a template for all the other, and future, periodicals, as well as a decent set of guidelines at a putative Periodical or Wikiproject Periodicals. "Don't put no-content top-level pages in mainspace." Fine, that message is received, loud and clear. Then what should be done? And what happens in mainspace when there's a single article? What about the portal at that point? Where does Wikidata point to? In the previous section you said you'd just use {{header periodical}}, for an unstructured list of content. At what point do we move to a more structured approach? Etc, etc, etc. Inductiveload—talk/contribs 00:38, 13 February 2021 (UTC)

Community writes documentation, it is neither an adminship role nor responsibility

I am not blaming people for not knowing. Though don't expect me to be chipper when a person is specifically told the process and they flip you the bird, and we end up with this conversation again

What do you mean it hasn't been said? It has been said and said => Main namespace takes published works. It takes content, per Help:Namespaces. And the community expectation is beyond a table of contents, a cover page, or a cover page and a preface.

I know that periodicals are a never-ending story. That is why they need their framework, and the work to set them up. Which is why someone dumping a list of volumes and external links into main namespace is not the process. There is no magic. That is also why the only work that has progressed so far is PSM.

I have no interest in the matter of delivering that work into an exemplar status. If I had an interest I would have expressed it. If I had a particular interest in presenting any periodical in a creation to termination work, then I would have done so. I don't. In previous years, I have helped plenty of people in guiding them in setting up particular works in guidance of this community. Developed templates for them for the display of their works. Even given them my dashed experience and dratted opinion.

The article in Journal of Classical and Sacred Philology and the ToC are how I would expect a work to start where one is working with a scanned index. [I wouldn't expect to see {{small scan link}} used in the body of main namespace.] There have been multiple decisions through WS:PD that the minimum that we expect of a work is that there not be a subpage alone, there needs to be a basepage. We long ago said that basepage holder text was insufficient for a work. [Each of these decisions clearly has had, and should continue to have, latitude where someone is actively working and developing a work.]

If main ns exists and has reasonable content, then I would be pointing WD to main ns. If there is nothing there, then I would point it to the Portal: ns. A cascading approach to the links is reasonable and already part of WD practice.

{{header periodical}} is removed when there is structured content. Its purpose is to capture unstructured works to ease curation, and works at basepage or subpage

As I remember it, I argued that I would prefer that all these periodical builds go through Wikisource:WikiProjects as it was an organisational effort and the more that we have this argument, the more strongly I wish I had fought that fight. One of the reasons that I ceded was that it was argued that the journals and publications existed and could be developed with complementary information, and that it was reasonable to interwiki link to enWP; that, and we didn't have many personnel to run wikiprojects—they are intensive.

Back at the end of the 2000s, and early 2010s we said that we wanted to concentrate on a quality product in the main namespace. No misrepresentation of what we had. We got rid of a stream of pages that had a header only and no content. We decided that we would embark on having scan-backed works, and that scans would be encouraged. That we could proliferate with index and page namespaces at the backend even if works sat there forever unprogressing. And I couldn't tell you whether it is written down or not in our help pages. I know that I have worked with 100s, maybe 1000s of people over the years to assist them to work that way.
Further, to this day works that are not scan-backed and are grossly incomplete are still problematic to remove in this community, usually with the commentary, "there is a scan for that". It is the nature of our community.
All that said, we have main namespace that we try to have to be the purest and the most presentable. Secondarily we have Author: namespace which we look at as our main curated space. Followed by Portal: and Category: namespaces which we sort of do.
What should happen? People should be bringing in works they are going to undertake in the index: and page: namespaces, and start undertaking them. People should not be dumping links to internet archive into main namespace that they think others should be working on, or might work on; and rinse and repeat that. We need transcribers, and people to do the work. We could dump tens of thousands of pages into main ns with links to archive.org and that would look totally shit, and help no one find anything.
As to those pages that have been created in main ns without content, we can continue to move them to another namespace to be curated; or we can delete them, or someone can bring them up to the standard of transcluded works with content. I am happy to do the first or second suggestions to resolve.
You don't want me writing policy. The few skills that I have lie elsewhere. — billinghurst sDrewth 05:19, 13 February 2021 (UTC)

"And the community expectation is beyond a table of contents, a cover page, or a cover page and a preface." It might be, but that expectation is unwritten, and we all know what happens with unwritten rules: they end up in the Supreme Court and there is drama. And when it is stated "this is just how we do things, buck up your ideas" and there is literally nothing to back that statement up (other than 100k words shotgunned through 20 years of archives, mostly in discussions that ended without consensus), I can see how people are not jumping for joy when they're in the dock.

So what you would like to see is a redlink if there is no content (and summary deletion of any attempted redirects to the Portal to guide people to what we do have). Then, once a single article is created, the list is duplicated between main and portal namespace, but without the scan links? Like this:

Main namespace	Portal namespace
Volume 1 (1854)	Volume 1 (1854) (transcription project)
Volume 2 (1855)	Volume 2 (1855) (transcription project)
Volume 3 (1857)	Volume 3 (1857) (transcription project)
Volume 4 (1859)	Volume 4 (1859) (transcription project)
	And then, by-subject, by-author, related things and other curated content (like Portal:Punch has the makings of).

Where the acceptability of the actual mainspace content is currently uncodified (and is de-facto "anything goes, even raw OCR" until it's trawled through WS:PD, which is what started this) and could be covered by an outcome for #Inclusion_criteria_for_articles. Do I have that right?

I want to set up a Wikiproject Periodicals to mirror the projects at WD (where they are signally uninterested in the data that would help us organise our periodicals better), but I would like to clarify the "done thing" first, because as far as I know, we don't have any periodical we can point to and say "this is how we do it". Not even the PSM is quite right in terms of Portal/Mainspace. Inductiveload—talk/contribs 13:21, 13 February 2021 (UTC)

I am backing out as I am not sure whether you are taking the piss. [I tried to prevent this train wreck at the beginning when I saw it and was basically told to FRO by the user, so ...] I also have enough on my plate and have tasks taking my time and needing my focus, and this is disrupting it. So ... I want proofread work only in main namespace; all other things that are aids, helpers, pointers, whatever belong elsewhere. I don't think that a morass of redlinks on any page is the answer. [Noting that I don't think that is the answer in any namespace, and I disagree with that approach in author: ns, but that is just probably me as a dinosaur.] I am very happy with the approach of a project as that is where all coordination belongs for long term works per Wikisource:WikiProjects, and as I did at Wikisource:WikiProject Biographical dictionaries. I don't think that hard hard hard rules are the answer either, this is about a practice, and one that explains what we do and why we do it, which we did try to cover at help:namespaces. The issue is always that people just want to jump in and do their thing, they don't want to read pages and pages of help text. Which was why we set direction and guided newbies. [Now I need to back out of this conversation.] We need to encourage people to find works under progress in the Index: / Page: namespace though I think that is primarily through WS: ns somehow, and incomplete works in the Page: ns are perfect, they progress at their own rate. — billinghurst sDrewth 00:37, 14 February 2021 (UTC)

Oppose in general the creation of "pre-emptive" title pages where no actual content is present. However, I oppose even more the placement of such title pages in Portal space when they contain nothing more than would appear in Mainspace if the subpage links were blue instead of red (and perhaps the temporary presence of {{scan}} or {{small scan link}}). Generally, title pages should not be created at all until there is at least one portion of the work present. Of course, if a part of the work is present, even if it is just one article within an entire periodical, then the title page should be created (in Mainspace). —Beleg Tâl (talk) 02:09, 23 February 2021 (UTC)

Also, if the only part of the work that has been added is the front matter (i.e. title page and TOC), then we should treat it like any other work in progress, and the fact that it is a collective work is largely irrelevant. —Beleg Tâl (talk) 02:19, 23 February 2021 (UTC)

@Beleg Tâl: what do you think we should do with pages like Canadian Law Times where there is no content (and, what should one do if one had the same for another journal)? I understand that having no content is not ideal in any namespace (duh), but the volume link list itself is a useful resource that takes quite a bit of care and effort to ferret out and compile and is useful to anyone who does want to help later. WS is, as far as I know, the only community-editable place for this kind of information. Also, having something stable to link to is, in itself, helpful for collecting incoming links (e.g. end of here), IMO. My proposal is to allow this to remain at Canadian Law Times (in mainspace), because otherwise it is shunted to Portal (leaving redlinks despite the presence of the list), then back (partially? without scan links? duplicated entirely? moved?—this isn't clear) to mainspace on creation of a single article. Inductiveload—talk/contribs 09:12, 23 February 2021 (UTC)

@Inductiveload: Consider any "normal" work that is set up this way - say, a novel - where the only content is a list of chapters, none of which has actually been added. Depending on the circumstances, we would either delete it as G1 No meaningful content or history, or we would retain it as a work in progress. In some cases, we might move it to User (not Portal) space. Generally our discussions have resulted in keep if an Index page is present, and delete otherwise. I don't see why we would treat Canadian Law Times differently. —Beleg Tâl (talk) 13:27, 25 February 2021 (UTC)

@Beleg Tâl: my thinking (which I do understand is not universally shared) is that the list of volumes, dates and scan links (internal or external) is, in and of itself, a resource with non-zero value. It takes quite some time to seek out the links (perhaps synthesising a complete list from Google Books, IA and Hathi and others) and present them. Moreover, that list, as a user-editable collection of links, is entirely unique on the Internet (the only challenger AFAIK is OnlineBooks serials, which doesn't have entries for many periodicals, and can't be edited), so it's a place WS can actually add value (certainly of more value, IMO, than the hundreds of copy-pasted PG texts). I'd like to leverage Wikidata to help here, but they seem almost entirely uninterested in coming up with a model for storing this data.

This is unlike an empty novel with redlinked chapters, because the correct place for that bibliographic information (the date and scan link) is the author page(s), but periodicals don't have authors, as such. In addition, incoming links to periodicals are much more common. Even the Canada Law Journal, a not-very-notable journal, has something near 100 incoming references that could be linked, and that's just in the pages that have text layers saved. Not many non-existent novels have that. And, moreover, periodicals are also different in that single articles ("chapters" in the novel analogy) are complete works in themselves.

Now, if we say the volume list should go in Portal space, that makes good sense. However, it has not been explained in all the verbiage above, at what point the Portal becomes a mainspace page, and when it does, what happens to the Portal, and whether we clone (and, unless something clever is done with LST or a template, inevitably diverge from) or delete the volume list.

I'll happily go through all our periodicals single-handedly and update them to whatever, but I simply do not know what to update them to. Inductiveload—talk/contribs 14:30, 25 February 2021 (UTC)

@Inductiveload: I do see your point. I think that there is a spectrum of how these pages could be presented.

Something like Canadian Law Times really is nothing more than a list of redlinked chapters. Although the list of external links is beneficial, it's not really in scope. I personally put such lists in my User space. If there were an Author page, I would put them there. Perhaps I would add it to Portal:Periodicals as a single line item with a collapsible note of some sort. Or, if I really wanted to provide value, I would upload the scans to Commons and get an Index page started so that the work will have at least some content. But as it stands, neither empty TOC pages nor lists of external links are within scope here, so I would consider it a noncontroversial deletion as it stands.
At the other end of the spectrum we have something like the old version of Sacred Books of the East. Observe that the page includes significant bibliographic information, including links to the text in the form of subpages, links to external scans, links to Author pages, lists of the works contained in the compilation and links to their Versions pages, and more. This would have made an excellent Portal page. However, I was able to retain most of that data in the current version of Sacred Books of the East through a combination of judicious wikilinks in the scan and supplementary use of {{AuxTOC}}. (It also helps that some of the subpages do exist.)

To sum up: a list of volumes, like a list of chapters, is not in scope per se, nor is a list of external links. Either delete it or work on it so that it is in scope as a multi-volume work (or put it in User space I guess). A Portal likewise should not be just a list of volumes/chapters and external links, it "should bring together everything Wikisource has to offer about the subject" (Wikisource:Portal guidelines). For example, look at Portal:Exeter Book and Portal:The Bible - they contain contents but also other relevant bibliographic information.

(sorry, kind of rambly) —Beleg Tâl (talk) 15:07, 25 February 2021 (UTC)

@Beleg Tâl: as you can probably tell, I kind of think they should be in scope, but I'm not dead set that they have to exist in mainspace. In fact, I have a very sparse similar thing in my sandbox that needs more work before going anywhere. Like I said, I'm just not sure at what point you move what to mainspace. E.g. if the Journal of Classical and Sacred Philology was out of scope and at risk of summary deletion when it was just a list of four volumes, is that entire subtree now in scope because a single article exists? This is a kind of a reverse Sorites paradox. And then should the volume list + scan links exist there or elsewhere? Should the redlinked volumes be removed from mainspace until they also have any content? Redlinked issues in a volume with partial issues?

Ideally, the Portal would contain things like a thematic index, a by-author index and, like PSM, recurring section collections. That's the natural value-add that we should be aiming for, but which might (might) be better served by something like Listeria or some other data-driven process than relying on hundreds or thousands of manually-maintained portals. Inductiveload—talk/contribs 15:24, 25 February 2021 (UTC)

Portal and Author names

Is there a strict rule about the name of an Author entry? I created Author:Eddie August Schneider (2,220 Ghits) and another editor insists on moving it to Author:Eddie August Henry Schneider (453 Ghits), twice now. I could see moving it if we had to disambiguate between two people of the same name. "Eddie August Henry Schneider" only appears on a birth certificate. Is there a hard rule, or is the editor enforcing their personal taste as if it were Wikilaw? They also keep truncating the description of the person from a few sentences to a fraction of a sentence; is this a rule or are they again enforcing their personal taste? I could see if the other editor was contributing to the transcription project, they would have more of a say on the esthetics. --RAN (talk) 22:40, 23 February 2021 (UTC)

Wikisource:Naming conventions. You are talking about what I am doing and what we have been doing for years, so clearly there is some decent background to what I am doing. The description field is not meant to be an essay or cut-down biography; it simply identifies the person, and may have a little explanation, especially relating to other components on site. With names we expand to full names, and you can have a redirect from a shortened version. Please follow the conventions of our other author pages. — billinghurst sDrewth 00:27, 24 February 2021 (UTC)

Wikisource:Naming conventions reads: THIS IS A DRAFT!!!, and it was created by you, so yes, you are selectively imposing your personal style preferences on my entries, and pretending it is Wikilaw. Please stop. You are also migrating them to names that may not even be correct; you are taking my alias entries from Wikidata, and assuming that they are the proper full name, when they may only appear in a single document. The example you give in THIS IS A DRAFT!!! is with "E. E. Cummings"; my entries that you are changing are not pen names, you are changing from their proper name to a rare variation of their name because it is longer. I also don't see anything saying that the description has to be cut from a few sentences to a fraction of a single sentence. Wikidata already has a problem trying to connect to the correct person here; the less information here, the more likely they will be linked to the improper person. For instance VIAF and LCCN may have a dozen entries on an author with a common name, because there is no way to connect them, they are so information sparse. If it is your personal preference why don't you concentrate on entries that you have created? Since you are concerned about author space for Author:Eddie August Schneider, I can migrate him to Portal:Eddie August Schneider so all the author entries can be consistent. I would also like to hear from other people. --RAN (talk) 01:00, 24 February 2021 (UTC)
Meh! I don't record conversations of the community, I get hassled. I do record conversations of the community and I get hassled.
If you bothered to read the history of the document, I transferred the information to the document from a conversation that took place here 11 years ago and has been the guidance that we have been utilising. You will also see that there is information on the corresponding talk page. You will also see that there have been multiple other people editing that document. That guidance and convention has been in place for ten years, and came about due to issues that we were having with author pages at that time, and multiple versions of pages. In the end the community made the conventions document. So, no, I am not imposing my personal style preference.
The description line is just that, it takes the relevant data, it is not a pen biography. If the person is an author they belong in Author: ns, if they are not an author they belong in Portal: ns. Nothing is perfect. The author page is linked to WD and to WP, it has birth year, death year, and occupation categorisation. I understand some of the difficulties of WD, and the issue is not those with VIAF data, and there are numerous ways to connect them. There is also the ready ability to merge items at WD when there are duplicates. — billinghurst sDrewth 01:33, 24 February 2021 (UTC)
Do not impugn reputations just because you don't like their interpretations. Would you like me to reflect on your editing? Please be civil. — billinghurst sDrewth 01:35, 24 February 2021 (UTC)

I have not impugned anyone's reputation. You misrepresented THIS IS A DRAFT!!! as !Wikilaw to justify moving something from what already was a full name to a fuller name, just because it existed. THIS IS A DRAFT!!! says nothing about moving entries to the longest possible version of someone's name. I simply said you were imposing your personal preference, and misrepresenting it as !Wikilaw. As I pointed out "Elizabeth II" does not exist as "Elizabeth Alexandra Mary Windsor", which also means you are selectively moving my entries. --RAN (talk) 13:31, 24 February 2021 (UTC)

Comment I have removed "this is a draft" from Wikisource:Naming conventions. As that is problematic and the only argument that is being brought. — billinghurst sDrewth 01:38, 24 February 2021 (UTC)

I started this thread to hear from other people, I already know your opinion. Why don't you give others a chance to express their opinions? If we go by strict naming rules we will have to change "Elizabeth II" to "Elizabeth Alexandra Mary Windsor". Even excluding nobility, I can easily see a dozen people that have a longer birth name at Wikipedia or Wikidata. --RAN (talk) 05:28, 24 February 2021 (UTC)

If someone has authored anything that has been published (which it appears that Schneider has), then they have an Author page. If the person is significant but has not authored anything, then they have a Portal page. This has been covered with you on multiple occasions that I can recall. The naming convention of using the author's full name (and redirects from the shorter forms) was developed over time and therefore there will be examples that are waiting to be moved. Those that appear in Recent Changes come to our attention first. The suggestion of moving HM is a red herring and does not derogate from leaving Eddie Schneider at his full name in the Author: namespace. Beeswaxcandle (talk) 06:12, 24 February 2021 (UTC)

While I would personally prefer we adopt a policy somewhat closer to enwp's COMMONNAME for person pages, Billinghurst is entirely correct that the enws policy is to use full names, and it has stably been so for over a decade following community discussions of the issue. If you believe there is a problem with the specific name being used for a particular person then do please feel free to bring up that issue for discussion, but railing at the policy and the person you disagree with is unlikely to persuade anyone of much of anything. --Xover (talk) 06:55, 24 February 2021 (UTC)

I am not against the policy, I have always advocated for using a full name. As I pointed out above, "Eddie August Schneider" is his full name. "Eddie August Henry Schneider" only appears in a single document and should not be used. Can I change "Elizabeth II" to "Elizabeth Alexandra Mary Windsor" in accordance with the stated policy? Now answer why we cannot have a few sentences to describe a person, why it has to be chopped down to 4 words? Does it take up too much space? --RAN (talk) 07:45, 24 February 2021 (UTC)

Description should only be brief because the detail belongs on the Wikipedia article. We only need enough to explain nationality and occupation(s). In other words, a single sentence that covers what is being put in the author categories. If there are significant relationships with another author, then links to those are appropriate (see Author:Robert Browning as an example). Beeswaxcandle (talk) 08:14, 24 February 2021 (UTC)

Where is that written down as a rule? A few sentences is "brief"; no one is advocating copying and pasting the entire Wikipedia article here. Look at Author:Alfred Neave Brayshaw, where there is nothing about him; why not have a few sentences? Look at Author:André Breton, why does he get a few sentences instead of a few words? I can't understand why you fight to the death over such a trivial issue. --RAN (talk) 13:18, 24 February 2021 (UTC)

What is that "single document"? --Xover (talk) 08:24, 24 February 2021 (UTC)

I can see use of the full name in two documents from a really quick check. Used in a news article, "The Des Moines Register", 28 Jun 1934, Thu, Page 12, for his wedding. Used in a baptismal record, "Evangelical Lutheran Church in America Archives; Elk Grove Village, Illinois; Congregational Records", in 1911. Seems reasonable enough to use the extended name. — billinghurst sDrewth 10:41, 24 February 2021 (UTC)

Can I change "Elizabeth II" to "Elizabeth Alexandra Mary Windsor"? --RAN (talk) 13:21, 24 February 2021 (UTC)

Instead of arguing over minutiae, why don't we work together on what needs to be done? There are millions of entries waiting to be transcribed, and as I pointed out in a previous post there are half a dozen different ways that newspaper entries are being named and indexed, and they eventually need to be harmonized. There is plenty of opportunity for cooperation. --RAN (talk) 14:34, 24 February 2021 (UTC)

Without getting into the specifics, I'd like to validate RAN's frustration. There are many things in Wikisource that people treat as though they're formal policy, but are not clearly documented as such anywhere, and in some cases not clearly documented at all. It makes things difficult for those of us who have not been deeply involved in these discussions for decades. That said, it's not necessarily any individual person's fault. IMO the frustration is valid, but I don't think anybody is trying to snow anybody else here. It's a good idea to have a common standard for how we title author pages, and clear guidelines for how much text about the author goes into them. Seems worthwhile to have some discussion about that, and document the outcomes. I'll try to come back with my take on the specifics after reading through the comments more thoroughly. -Pete (talk) 18:03, 24 February 2021 (UTC)

If I have not made it clear, this is an amazing project, everyone deserves praise for their hard work. The project reminds me of the medieval monks preserving literature during the dark ages between the Fall of Rome and the Renaissance. --RAN (talk) 00:11, 26 February 2021 (UTC)

See Wikisource:Author names for the most-often used point of discussion on the names used on Author pages. --EncycloPetey (talk) 00:44, 26 February 2021 (UTC)

That page reads: "This is an essay; it contains the advice and/or opinions of one or more Wikisource contributors. It is not a policy or guideline, and editors are not obliged to follow it." Again, we are fighting to the death over minutiae, while millions of texts await transcriptions. Some people are weaponizing these poorly worded drafts of rules, and essays to rationalize implementing their personal preferences onto the work of others. --RAN (talk) 07:00, 4 March 2021 (UTC)

Tall s's

Page:The History of the Island of Dominica.djvu/141

Why on earth are old tall s's being retained? Outside of historical discussions of typography, no one, absolutely no one prior to Wikisource says to transcribe this way. It makes word indexing and searches confusing if not impossible altogether. Deisenbe (talk) 09:49, 24 February 2021 (UTC)

@Deisenbe: The template used in those pages displays a proper s in main namespace and is indexed in main namespace with a standard s. — billinghurst sDrewth 10:49, 24 February 2021 (UTC)

Fwiw, I vaſtly prefer retaining the tall ſs where they appear. Most web browsers index it as an "s" if you do a ctrl-f on the page, and changing their appearances from ſ to s in the encoding removes information rather than reproducing it. Changing it seems like an annotation?
Ultimately, the general practice tends to be making that decision on a work-by-work basis, which I'm fine with. For example, Bible (King James Version, 1611) has a work-wide style of using the long ſ, the tironian et ⁊, the double hyphen ⸗, and combining certain letters (u and v, i and j), but doesn't reproduce the particularly uncommon characters of the capitulum ⸿ and the rotunda ꝛ. Seems fine for other works/editors to decide differently. -- Mathmitch7 (talk) 18:18, 2 March 2021 (UTC)

It's not annotation; it's not adding comment on the work. It's simply recognizing a certain glyph as representing an s, instead of hard-encoding a position-dependent glyph variant.--Prosfilaes (talk) 08:05, 3 March 2021 (UTC)

Well then we get into a real philosophical question of who is representing what and how. Does the "ſ" ink-mark on the page represent an imaginary letter "s" in an imagined ideal form of the text, that I as a wikisource editor should also attempt to represent clearly? Or is it just an ink-mark, and is it my job as wikisource editor to represent the ink-mark itself? My understanding of parallel discussions on re-creating images is that WS tries to emulate the document scanned, not emulate what that document is trying to say. I see an "ſ", my preference is to transcribe an "ſ". That being said, the purpose of the WS project is to create usable source texts that are verifiable, and an "s" is arguably more usable in most cases. I am personally pretty comfortable saying "an ſ is an s" and not claiming that as merely my interpretation. My point is that sometimes that decision might be interpretation, rather than emulation. Again, my preference is for this to be a decision made on a work-by-work basis.
There are all sorts of typographical innovations that have been made through the centuries, and how best to transcribe them isn't always obvious. I ran across an obvious eszett (ß) in an English text the other day, used in a way that the same work had elsewhere used an "ſs" or "ſſ". I transcribed it as "ß". Was that the right call? Who's to say? But it seems more correct to me than changing it to an ſſ, ſs, or ss merely to standardize orthography within a work or across them. -- Mathmitch7 (talk) 17:13, 3 March 2021 (UTC)

We don't represent ink marks; that's what PNGs are for. Text files represent abstract characters. Neither "А" nor "Α" is acceptable for use in English, no matter what the ink marks are, since they are Cyrillic and Greek characters, not Latin. Using "ß" in English text is just wrong; in Unicode, it specifically represents the German character, not the sometimes similar ligature found in other scripts.--Prosfilaes (talk) 03:10, 4 March 2021 (UTC)
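
The points made in this thread about abstract characters can be checked with a few lines of Python. This is only a minimal sketch using the standard `unicodedata` module; the helper names `fold_long_s` and `describe` are made up for illustration and are not part of any Wikisource template or tool. It shows that the long s folds to a plain "s" (either by direct replacement or by NFKC compatibility normalization), and that visually similar capitals are distinct code points in different scripts.

```python
import unicodedata


def fold_long_s(text: str) -> str:
    """Replace the long s (U+017F) with a plain 's', e.g. for search or indexing."""
    return text.replace("\u017f", "s")


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


sample = "va\u017ftly"  # "vastly" spelled with a long s

# Two ways of getting a searchable form of the word.
print(fold_long_s(sample))
print(unicodedata.normalize("NFKC", sample))

# Latin A, Cyrillic A, Greek Alpha, sharp s, long s: similar shapes, different characters.
for ch in ("A", "\u0410", "\u0391", "\u00df", "\u017f"):
    print(describe(ch))
```

Note that NFKC also rewrites other compatibility characters, so the explicit replacement is the narrower choice if only the long s is meant to change.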

Move proofread pages after fixing DJVU

The following discussion is closed:

Pages shifted as requested.

I missed that the pages were out of sync originally because they were equal numbers of additions (for the issue front matter) as deletions here Index:The New Monthly Magazine - Volume 097.djvu. I uploaded a fixed DJVU with the missing pages but now need the proofread pages to be moved to be back in sync. Thanks! MarkLSteadman (talk) 13:41, 17 February 2021 (UTC)

@MarkLSteadman: Build a list somewhere of what needs to be moved (Index talk???). You can do either start and finish wikilinks, or a list of pages with the increment value of pages that are progressed. Any built list is best ordered with later pages first, as that will prevent any overwrites. — billinghurst sDrewth 21:41, 18 February 2021 (UTC)

To move 373, 372, 371, 246, 245, 244, 243, 242, 241, 240, 220, 219 all to increment by two pages. MarkLSteadman (talk) 18:33, 19 February 2021 (UTC)
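
The "later pages first" advice above can be illustrated with a short sketch. This is not the tool that was actually used for the move; `plan_moves`, the file name `Example.djvu`, and the page numbers are hypothetical, and the sketch only builds the list of (old title, new title) pairs. Sorting in descending order means each target page has already been vacated by the time something is moved onto it.

```python
def plan_moves(index_title, pages, increment):
    """Return (old_title, new_title) pairs for Page: namespace moves.

    Pages are processed from highest to lowest so that, with a positive
    increment, no move lands on a page that still has to be moved itself.
    """
    moves = []
    for page in sorted(set(pages), reverse=True):
        old_title = f"Page:{index_title}/{page}"
        new_title = f"Page:{index_title}/{page + increment}"
        moves.append((old_title, new_title))
    return moves


# Hypothetical example: three consecutive pages shifted forward by two.
for old, new in plan_moves("Example.djvu", [219, 220, 221], 2):
    print(f"{old} -> {new}")
```

Had the same pages been moved in ascending order, page 219 would have been moved onto 221 before 221 itself had been shifted out of the way.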

@Billinghurst: Is this done or still todo? (asking mostly in case it slipped through and I don't want to wade in there and make a mess). --Xover (talk) 18:38, 3 March 2021 (UTC)

Done — billinghurst sDrewth 10:40, 4 March 2021 (UTC)

Thank you. MarkLSteadman (talk) 13:19, 4 March 2021 (UTC)

This section was archived on a request by: Xover (talk) 16:24, 17 March 2021 (UTC)

Harmonizing

Is it worth harmonizing these: Category:Obituaries in The New York Times by adding in the year, or taking out the year in the one with a year? And removing the word "obituary" from the ones with it? --RAN (talk) 06:42, 5 February 2021 (UTC)

@Richard Arthur Norton (1958- ): It seems that the preferred format for New York Times articles (at least when they are backed by a scan) is to include the full date if it is known, as The New York Times/YYYY/MM/DD/Title. If no scans back them, they should at least be listed on Portal:The New York Times with complete information so that they can eventually be migrated to a scan-backed version (a cursory view of pages in the obituary category seems to show that they're listed on the portal page). As long as all the articles are accounted for, I don't think it really matters if they all follow the same convention or not, as New York Times article coverage is pretty spotty anyways. Personally, I'd move the one with a year in its title back to its old location and leave it for the time being. -- Mathmitch7 (talk) 16:00, 2 March 2021 (UTC)

@Mathmitch7: What do you think of removing the word "Obituary" from the titles? I don't see that anywhere else, and as I look at the actual scan, it is not present. I think the editor added it so it is recognized as an obituary, but we have the category giving the same information. Also, have you noticed that there are a half dozen different ways that newspapers are aggregated? If you go to Category:Newspapers of the United States and click on a few, you can see six different ways that articles are aggregated: there are manual bulleted lists, automated aggregated lists, manual sortable tables, there are empty calendar indexes like New York Post, and a few one-off experiments. There are lists and charts with annotations and summaries of the articles, and ones with just titles. Do you have an opinion on what is optimal, or should we let people keep experimenting for a while? Article titles have no standard, some have years, some have full dates. Articles themselves are a mixture of djvu files, jpg index pages, raw unformatted ASCII text, and formatted Unicode/HTML text. --RAN (talk) 21:25, 2 March 2021 (UTC)

@Richard Arthur Norton (1958- ): I think removing the word "Obituary" makes sense except in cases where it is apparently the article title in the text itself. One thing that I recently came across was some old discussion on a proposed redirect policy, which points out that as wikisource is a source for other wiki projects like Wikipedia, moving (particularly moving without leaving a redirect) can create issues when those projects depend on, transclude, or link our source text. This seems like an issue that could especially come up wrt obituaries. Not that we're responsible for WP's links being accurate, but something to keep in mind as we contemplate this kind of change. I say move the pages to the same title without "Obituary" for now and leave other moves (to dates subpages, eg) for when somebody (perhaps, you) takes it upon themselves to clean up NYT articles more generally. -- Mathmitch7 (talk) 16:25, 3 March 2021 (UTC)

I didn't even think of cross wiki linking. I don't even think there is a way to search for it. My way of linking involves only linking to the Wikidata entry for the news article that is stored here. The article titles stored at Wikidata get changed during the move automatically. So I would link to [[wikidata:Q105939027|Bishop of Mombasa is Dead]], Qids are stable. From the Bishop of Mombasa's entry at Wikidata I would link Described_by_source=Q105939027. --RAN (talk) 02:50, 13 March 2021 (UTC)

Our newspapers and periodicals are a mess, and in desperate need of tidying up, but there is no universally accepted "best" paper to use as a model. I think harmonising the titles (particularly moving articles to a subpage of their parent work, rather than floating untethered in mainspace) is a good first step as at least they are all together. Advice from Wikisource:WikiProject Newspapers is Newspaper Name/YYYY/MM/DD, but usage of this is spotty. The use of Portals for article content needs to be standardised too, but that's a far bigger job. Inductiveload—talk/contribs 09:01, 3 March 2021 (UTC)
- The New York Times/YYYY/MM/DD/Title looks good. Yet when will we harmonize? Should New York Times/1922/Dr. Wilbur F. Crafts Crusader, Dies at 73 be prefixed "The"?--Jusjih (talk) 03:49, 13 March 2021 (UTC)
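
As a rough illustration of the Newspaper Name/YYYY/MM/DD/Title convention mentioned in this thread, the sketch below builds a subpage title from a publication date. The function name `article_title`, the date, and the headline are all hypothetical; this is not an existing Wikisource tool, just a way to show how the parts fit together.

```python
from datetime import date


def article_title(newspaper: str, published: date, headline: str) -> str:
    """Build a subpage title of the form Newspaper Name/YYYY/MM/DD/Title."""
    # The date format spec zero-pads the month and day.
    return f"{newspaper}/{published:%Y/%m/%d}/{headline}"


# Hypothetical date and headline, for illustration only.
print(article_title("The New York Times", date(1922, 1, 2), "Example Headline"))
```

A title built this way keeps every article under its parent work, which is the "first step" suggested above; it does not address headlines that themselves contain a slash.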

On future texts I'd like to work on

Afternoon, fellows. In case you're keeping track, it's been a month and a half since I came back to WS—after an eight-year lull induced by Google+ and whatever else. (Already did a Thornton W. Burgess story about an otter family on the final weekend in January; spent more than a fortnight on an 1880s Malagasy grammar [because Madagascar isn't that well-represented here]; and at this writing, have begun transcribing that 18th-century history on my homeland by Thomas Atwood [because coverage of the Nature Island here is similarly all but nonexistent].)

That said, there are no fewer than three works presently on my wanted list, all of which I've tried to track down with the help of my home county's inter-library loan service. In order of publication, along with status checks:

The Fairies That Run the World and How They Do It (1903, Ernest Vincent Wright): At least a couple of decades before Gadsby (a.k.a. the novel with [almost] no "e"s), Mr. Wright dabbled in poetry. While the later Thoughts and Reveries of an American Bluejacket is on IA and the earlier The Wonderful Fairies of the Sun is already on WS, The Fairies That Run the World is practically MIA at our usual sources, being so extremely scarce to begin with; Hillsborough County, FL's ILL team tried their best last month, but found no loaning institutions. (For those with pretty pennies and eBay accounts, it's all yours for at least US$185—way out of my current budget. A white whale if I ever needed one; wanna pony up some funds?)
Čudnovate zgode šegrta Hlapića (1913, Ivana Brlić-Mažuranić): Some time ago, this contributor proposed a Croatian-to-English translation in said namespace. Sad to say, you'll have to wait a while for a new, scan-backed version of those Brave Adventures. Help wanted, thousands of miles away: Anyone in that work's native Croatia with a pre-1920s copy and a scanner to deliver it with. Otherwise, see above for how well that turned out where I am.
Doctor Dolittle's Circus (1924, Hugh Lofting): Already discussed thanks to a less-than-promising IA PDF of a Project Gutenberg AU version. Now up for pickup at my hometown library—but with a catch: We'll be likely dealing with a reprint à la Gatsby, whose U.S. PD début was the catalyst for my comeback in the first place. (Recall that I was aiming for a first-edition copy.)

Honorable mention to...

Happy Jack (1920, Thornton W. Burgess): Which (and whose original cover) has surely seen better days—because what GBooks and Hathi have got has clearly shown its age at the dawn of the 2020s, not helped by the more or less dismal quality that plagued early scans of theirs. (Originally announced on my user page as one of the four titles marking Burgess' début at WS; The Adventures of Jerry Muskrat has since replaced it in my queue.)

To Xover (talk • contribs), SurprisedMewtwoFace (talk • contribs), Jan.Kamenicek (talk • contribs), and anyone else interested: If you've got anything further on the matter, then please ping back. Take care, see you in the backlists, and happy Ash Wednesday. (Which a lot of folks outside the Catholic movement are hardly aware of.)

"Never ends for this Captain, does it?"

—Slgrandson (talk) 17:16, 17 February 2021 (UTC)

@Slgrandson: The University of Illinois at Urbana-Champaign (who provide lots of scans to the IA/HT) have a copy of Fairies in the rare book stacks. They might throw you a bone with their digitisation service considering it's not a very common book (only 3 entries at Worldcat, all in the US) (or they might want their palms crossing with silver like the NLS). Inductiveload—talk/contribs 17:51, 17 February 2021 (UTC)

@Slgrandson: I would be happy to help you with Dr. Dolittle's Circus once you upload the scan of it. I know one of the Gatsby's is a reprint, so I don't think we'll have too much of a problem even without a first edition. I look forward to working with you on it! (SurprisedMewtwoFace (talk) 19:49, 17 February 2021 (UTC))

@SurprisedMewTwoFace: I already got an ILL copy of Circus—and am likely to begin scanning with my SD card tomorrow. Can confirm it's a 1950s reprint. (Just a reminder so that the archiving bot doesn't catch it yet.) —Slgrandson (talk) 01:25, 18 March 2021 (UTC)

If you want a first edition, several libraries near me have it but it might take me a bit to get them, take pictures with my phone etc. MarkLSteadman (talk) 00:32, 18 February 2021 (UTC)

Removing references to Wikisource in Wikipedia

While I'm at it, I have removed the reference to Wikisource "The Barbarism of Slavery" from w:Charles Sumner. WS's text is completely undocumented, has no authority, and users of Wikipedia are not helpfully referred to it. They would be better referred to archive.org, where at least you can see the publication information for what you are looking at.

Many texts exist in multiple, differing versions. There's only one version of "The Barbarism of Slavery"? OK, but say so. Otherwise WS is unreliable.

Undocumented texts like this are worse than useless. They are outright harmful, because they give a misleading impression that something reliable has been created. I'm not doing this systematically, but I will continue deleting links in Wikipedia to undocumented texts in Wikisource. I think by doing so I am helping WP users.

I had exactly the same argument with Project Gutenberg when it started. Undocumented transcriptions of texts create more problems than they solve. Deisenbe (talk) 10:08, 24 February 2021 (UTC)

@Deisenbe: If you are following enWP's processes, especially with consultation at w:Wikipedia:Reliable sources/Noticeboard, then I am not certain that we would have an issue if you can demonstrate that the work is unreliable. enWP is enWP and they have their guidance for editors to follow. The early additions here are early additions, and they are not how we would do a work today; that said, if we don't have evidence that a work is truly problematic, they stand as they are as unsourced, supplied transcripts. Please utilise {{no source}} if a work has no source; please use {{fidelity}} if you think that the transcript is an issue. If you think that a work is truly problematic then consider nominating it for deletion at WS:PD. — billinghurst sDrewth 10:59, 24 February 2021 (UTC)

Obligatory soapboxing: this is why we need to start raising our quality bar, and most critically to start requiring works to be scan-backed and Proofread. Deisenbe puts it more directly than most, but the points are well made and valid regardless of whether most reusers of our texts are able to articulate them. Our balance is far too far in the direction of contributor convenience and preserving sunk cost (expended effort) no matter what, and we need to start shifting it toward better quality and a higher bar if we're ever going to make any appreciable progress (our backlogs are growing by the day). --Xover (talk) 12:18, 24 February 2021 (UTC)

this is what they do at german wikisource, and we are ten times their size (proofread pages), and widening the gap. is that what you want? Slowking4 亞 Rama's revenge 15:59, 25 February 2021 (UTC)

While it's obviously suboptimal to have unsourced versions kicking about, they're still better, IMO, than no version (unless there are actual concerns about fidelity, which is extremely rare). In this case, the source text is evidently the 1863 edition: Index:The Barbarism of Slavery - Sumner - 1863.pdf. Of course, the original IP contributor should have made a note of the edition and provenance back in 2007, and failing that, should have been prompted for one at the time, and {{no source}} applied until such source was provided.

In my opinion, it's more constructive to report such issues here and request research be done to isolate the exact edition and/or scans (or do it yourself if so inclined), than to just delete the links from enWP. As a community, we are generally pretty amenable to making our works useful to enWP, but due to immense backlogs of unsourced junk in the mainspace (driven by the mismatch between how easy it is for a new user to drive-by with a dump and the effort and learning curves of doing it right <insert moan about lack of documented expectations>), we simply can't fix everything up front. But, if something is specifically reported to us because it's linked to by enWP and needs a bit of digging, I don't think I'm overstepping to say we'd happily do it when we can. Inductiveload—talk/contribs 18:40, 24 February 2021 (UTC)

"Undocumented texts like this are worse than useless." no, deletionists are worse than useless: some of us are here to write an encyclopedia, and some of us are here to delete one. pontificating tl;dr on "partner" projects about quality issues, is a profound culture problem that is a cancer eating away at the community. and in this case the scan is ready to be migrated. after all, we transcribed the 12000 EB1911 article references, that were cut and paste in wikipedia with only an endnote. i leave it to others to interact with the adversarial, and their issues. Slowking4 亞 Rama's revenge 15:53, 25 February 2021 (UTC)

Don't forget to update the Wikidata entry The Barbarism of Slavery (Q19079234) with the publication information, I connected it to the author. --RAN (talk) 00:18, 26 February 2021 (UTC)
The Wikisource copy needs to be placed on a separate data item, because it is an 1863 published edition, and not an 1860 copy (the date on the main data item). When we have a sourced edition of a work, it should be placed on its own data item so that the publication information can be added to that data item. It would be cross-listed on the main data item using the property "has edition" on the work's data item, and "edition of" on the edition's data item. --EncycloPetey (talk) 00:34, 26 February 2021 (UTC)
I leave that to the more experienced editors working on publications, the speech itself could have its own entry at Wikidata, and an entry for each published edition. I usually create one entry, and let the more experienced editors break them into smaller pieces. Recently at Wikidata, people have been breaking churches into: the building, the congregation, and the cemetery. --RAN (talk) 02:40, 26 February 2021 (UTC)
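
The work/edition split described above can be pictured with a small data model. This is only a sketch in plain Python, not the Wikidata API: the `Item` class, the `link_edition` helper, and the placeholder identifier `Q_EDITION` are assumptions for illustration, and the two statement labels are taken from the comment above ("has edition" on the work, "edition of" on the edition).

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A simplified stand-in for a Wikidata item."""
    qid: str
    label: str
    year: int
    statements: dict = field(default_factory=dict)


def link_edition(work: Item, edition: Item) -> None:
    """Cross-list a work and one of its editions in both directions."""
    work.statements.setdefault("has edition", []).append(edition.qid)
    edition.statements.setdefault("edition of", []).append(work.qid)


# The work item for the speech, dated 1860 on the main data item.
work = Item("Q19079234", "The Barbarism of Slavery", 1860)
# Hypothetical separate item for the 1863 edition; "Q_EDITION" is a placeholder, not a real Qid.
edition = Item("Q_EDITION", "The Barbarism of Slavery (1863 edition)", 1863)

link_edition(work, edition)
print(work.statements)
print(edition.statements)
```

In this model the Wikisource sitelink and the 1863 publication details would sit on the edition item, while the work item keeps the 1860 date.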

I've restored the Wikisource link in Wikipedia and added the page name. The previous copy is stored in User:Ineuw/Sandbox10.— Ineuw (talk) 21:36, 22 March 2021 (UTC)

Copydumps

Inspired by the above discussions, I would like to suggest that "copydumps" be added to WS:D#Precedent as they are frequently nominated on WS:PD and are generally uncontested. By "copydumps", I mean works that consist of copy-pasted OCR text, generally including page breaks and so forth - the sort of text that ends up in Category:Texts requiring OCR fixes. It is usually much faster to delete these works and then proofread them from scratch, than to try and clean them up before a match-and-split. There are occasional exceptions to this, which is why I am suggesting to add them as deletion precedent and not as a category for speedy deletion. —Beleg Tâl (talk) 14:31, 25 February 2021 (UTC)

Support if it's just raw OCR, there's basically no benefit. My usual request that a scan is found and linked from the authors/portals to accompany the red links via ext/small scan link. Inductiveload—talk/contribs 19:33, 2 March 2021 (UTC)
Support --Xover (talk) 18:26, 3 March 2021 (UTC)
Comment If you are talking solely about a copy and paste of an OCR text from archive.org, then I always feel we can have a quick review process at WS:PD, rather than a long tortured discussion. At this stage I would prefer that any existing work at least could come through PD for a quick review as there can be link removals, etc. I am always comfortable challenging any new addition like that, though feel that summarily deleting can be disenchanting for new users, and much prefer to move to their user space and discuss. I definitely don't think that they should be encouraged, nor attempted to be converted, other than straight replacement with transcribed and transcluded work. — billinghurst sDrewth 03:18, 5 March 2021 (UTC)
- This is exactly how WS:D#Precedent should work. It can't be used for a speedy, but they are a shorthand for a full discussion of repetitive cases. There's still scope within WS:PD proposal for the work to be fixed and kept. The discussion should also not be insta-closed with this, there must be time allowed for responses and/or remedial work. If the work is newly-added, working with the contributor directly to remedy issues is much better than slapping them in the face with a deletion proposal at all. I'd hope this is only used in cases of old-and-stale OCR dumps. Dragging any recently-active work to WS:PD because it needs improvement is a last resort when all other avenues of improvement have failed. Inductiveload—talk/contribs 08:20, 9 March 2021 (UTC)
  - @Inductiveload: The discussion should also not be insta-closed Actually, it could be. The point is notification so that people have a chance to notice and open an insta-undelete discussion. Closed threads are not archived for two weeks or somesuch so there's a minimum time of visibility, and the deletion is findable in the archives if someone later on wonders where a page went. Not that anything is typically closed in any timeframe to which "insta" would be an apposite descriptor, but it does happen (things that are speediable but are brought to PD are sometimes insta-closed). --Xover (talk) 08:43, 9 March 2021 (UTC)
    - I'd certainly hope that nothing that's not long-abandoned is summarily closed without at least a chance for the contributors involved to have a say and/or chance to sort it out. As you say, WS:PD doesn't normally shake out as fast closes anyway, so it's not likely to be a major issue. My point is I don't think being covered by a WS:D#Precedent trumps the usual one-week minimum for discussion.
    - Also, expecting new users to know that, although we just nuked their page, it's not gone-gone and they can get it undeleted with no hard feelings on the proviso that they do X, Y and Z is unlikely to actually result in zero hard feelings all round. Then again, I've been saying in the "articles criteria" section above that using WS:PD as a "hi, please fix this" forum is already an overly adversarial process with a too-strong subtext of "if you don't comply, we're going to nuke it, so shape up, buster". Inductiveload—talk/contribs 08:58, 9 March 2021 (UTC)
      - I don't think being covered by a WS:D#Precedent trumps the usual one-week minimum for discussion. That is, by design, exactly what it does. That's not to say I don't agree with your cautions about how they should be employed (with great power…), nor that deliberately (mistakes do happen) abusing the process and criteria shouldn't be suitably trouted, but those are behavioural issues.
        On the other hand I also think you're exaggerating the "adversarial" issue: experience shows that we do not in fact have any problem with the situations you describe. Rather the opposite. We have some edge cases in recent history but those would have been adversarial no matter what (for entirely different reasons). The vast majority of what ends up on WS:PD is very old (and its age is itself a complicating factor in those discussions) or is problematic for entirely different reasons. It is a much bigger problem that nearly anything that ends up on WS:PD has a significant chance of someone jumping in claiming they will bring it up to standard and then never really following through (doing just enough to keep it from being deleted, and that's an exceedingly low bar). I have several such on my todo list that I will eventually have to finish myself because bringing them back to WS:PD would really come across as adversarial (and I really don't want the grief). We certainly need to be sensitive to how a deletion discussion feels for those contributors who care about whatever is up for discussion, but that can't be a trump card that prevents much needed cleanup.
        Let's keep things in perspective, is what I'm saying. --Xover (talk) 10:22, 9 March 2021 (UTC)
        I think we're (as per) on the exact same page, and I am now, (as per) drifting gently off-topic. I was referring to the slightly fractious result of the "incomplete" works discussions, which started with a multiple listing of works added by a single editor on WS:PD, with the "implied threat" of deletion becoming more contentious than the actual issue at hand (OTOH, sadly, none of the works did get improved, so it probably would eventually have still escalated to a WS:PD entry in this case). As I have said before we don't have an active "Wikiproject Fixup" or "Makeover Taskforce" or whatever (sounds like these are candidates: I have several such on my todo list), so WS:PD is the de facto venue for "remedial action clearly needed, but I don't want to do it myself" entries.
        
        For raw-OCR copydumps, and, in particular, old copydumps, WS:PD is the right venue. If it's only just happened, I'd generally say "first contact" via a talk page is more friendly (but you know that and, I think, that's not what you're talking about). Well-meant copydumps are a fairly common first attempt at a constructive edit. If the user vanishes or refuses to engage in improvement, and it's not something anyone else wants to handle, then WS:PD is the final resort. Inductiveload—talk/contribs 10:46, 9 March 2021 (UTC)
  - @Billinghurst: As Inductiveload said, "This is exactly how WS:D#Precedent should work." I am not proposing that we change anything about how we handle copydumps or WS:PD; what I am proposing is that we update WS:D#Precedent so that editors will be better informed about the precedent that has already been established on WS:PD for how we handle such works. —Beleg Tâl (talk) 00:54, 31 March 2021 (UTC)
    - - "Also, expecting new users to know that, although we just nuked their page, it's not gone-gone and they can get it undeleted with no hard feelings" you should not expect no hard feelings when deleting other editors' work, without an attempt to find a scan before deletion. the "i as an admin can still see it" is no defense. go on down the "sword of damocles" road, see how many hard feelings you will create. you need to create a quality circle to fix non scan-backed works. deletion is not a quality improvement process, and is therefore power tripping only. Slowking4 亞 Farmbrough's revenge 23:36, 31 March 2021 (UTC)