Wikisource:Scriptorium/Archives/2023-12
Please do not post any new comments on this page.
This is a discussion archive first created in December 2023, although the comments contained were likely posted before and after this date. See current discussion or the archives index.
Nuclear Test Reports - add by file or just transcribe
Hi folks! I just had a question to clarify Wikisource/community policy on something. I have a series of publicly accessible nuclear test reports from the Defense Nuclear Agency (Operation Crossroads, Castle, Ivy, Ranger, Hardtack, Argus, etc) that I'd like to add to Portal:United States Department of Defense or another related portal, since I find them interesting and think they'd be enlightening. Link to [Crossroads Report]
I have worked w/ documents before (see CIA NIS) that are pdf files dropped and transcribed page-by-page; I also know we transcribe things directly onto pages (see: UN resolutions, Acts of Congress, etc) as well. What is the policy on this? Do I add files and transcribe from there, or can I transcribe things directly onto pages?
I figured I'd ask now before I do something and then get yelled at, just want to do it the right way.
Thanks in advance! JoeSolo22 (talk) 05:14, 1 December 2023 (UTC)
- We prefer to have backing scans transcribed page-by-page, especially if the work was originally (or simultaneously) issued in print form. --EncycloPetey (talk) 05:19, 1 December 2023 (UTC)
- The ones without scans are essentially historical efforts. There is a very strong preference for starting from the scan for new texts. cf. Help:Adding texts. Xover (talk) 05:53, 1 December 2023 (UTC)
Tech News: 2023-49
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Recent changes
- The spacing between paragraphs on Vector 2022 has been changed from 7px to 14px to match the size of the text. This will make it easier to distinguish paragraphs from sentences. [1]
- The "Sort this page by default as" feature in VisualEditor is working again. You no longer need to switch to source editing to edit
{{DEFAULTSORT:...}}
keywords. [2]
Changes later this week
- There is no new MediaWiki version this week. [3][4]
- On 6 December, people who have enabled the preference for "Show discussion activity" will notice the talk page usability improvements appear on pages that include the __NEWSECTIONLINK__ magic word. If you notice any issues, please share them with the team on Phabricator.
Future changes
- The Toolforge Grid Engine shutdown process will start on December 14. Maintainers of tools that still use this old system should plan to migrate to Kubernetes, or tell the team your plans on Phabricator in the task about your tool, before that date. [5]
- Communities using Structured Discussions are being contacted regarding the upcoming deprecation of Structured Discussions. You can read more about this project, and share your comments, on the project's page.
Events
- Registration & Scholarship applications are now open for the Wikimedia Hackathon 2024 that will take place from 3–5 May in Tallinn, Estonia. Scholarship applications are open until 5 January 2024.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 23:50, 4 December 2023 (UTC)
Tech News: 2023-50
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Recent changes
- On Wikimedia Commons, there are some minor user-interface improvements for the "choosing own vs not own work" step in the UploadWizard. This is part of the Structured Content team's project of improving UploadWizard on Commons. [6][7]
Problems
- There was a problem showing the Newcomer homepage feature with the "impact module" and their page-view graphs, for a few days in early December. This has now been fixed. [8][9]
Changes later this week
- The new version of MediaWiki will be on test wikis and MediaWiki.org from 12 December. It will be on non-Wikipedia wikis and some Wikipedias from 13 December. It will be on all wikis from 14 December (calendar). [10][11]
Future changes
- The 2023 Developer Satisfaction Survey is seeking the opinions of the Wikimedia developer community. Please take the survey if you have any role in developing software for the Wikimedia ecosystem. The survey is open until 5 January 2024, and has an associated privacy statement.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 02:12, 12 December 2023 (UTC)
I changed the colour of this template to match our other auxiliary content templates like {{AuxTOC}}. If anyone thinks this could be improved further please go ahead, or let me know and I'll give it a go. —Beleg Âlt BT (talk) 15:14, 12 December 2023 (UTC)
- Before and after for convenience. —Justin (koavf)❤T☮C☺M☯ 15:16, 12 December 2023 (UTC)
Anyone know why the header for The Golden Ass of Apuleius is generating a link to a "Commons gallery" when there is no gallery, but just a category? --EncycloPetey (talk) 22:31, 12 December 2023 (UTC)
- The reason this is happening is that there is a separate category c:Category:The Golden Ass linked at the Wikidata item d:Q1044767. But the "version, edition or translation" item d:Q72754084 links to a different Commons category: c:Category:The Golden Ass of Apuleius (Adlington 1893).
- I'm not sure of the fix, but the logic seems to be something like the below pseudocode:
if category_one at version_edition_or_translation_item and category_two at work_item:
    commons_gallery = category_two
- Template:Header would be the place to start. PseudoSkull (talk) 05:48, 13 December 2023 (UTC)
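For illustration, a minimal sketch of the behaviour EncycloPetey describes as preferable—preferring the edition-level category over the work-level one. The helper names and data shapes are hypothetical, not the actual Template:Header code, and JavaScript is used purely for illustration:

    // Sketch under stated assumptions: each argument is whatever Commons
    // category, if any, is linked from the corresponding Wikidata item.
    function pickCommonsCategory( editionCategory, workCategory ) {
        // Prefer the category on the "version, edition or translation" item;
        // fall back to the work item's category only when the edition has none.
        return editionCategory || workCategory || null;
    }

    // With the items above this returns the edition category:
    // pickCommonsCategory( 'Category:The Golden Ass of Apuleius (Adlington 1893)',
    //                      'Category:The Golden Ass' )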
- Thanks. I knew about the other category, but not the code generating the link. It seems odd to link a second category for the work (which may include all manner of items, in multiple languages) when the category for the individual edition exists, and stranger yet to label it as a "Gallery". --EncycloPetey (talk) 06:04, 13 December 2023 (UTC)
(New) Feature on Kartographer: Adding geopoints via QID
Since September 2022, it is possible to create geopoints using a QID. Many wiki contributors have asked for this feature, but it is not being used much. Therefore, we would like to remind you about it. More information can be found on the project page. If you have any comments, please let us know on the talk page. – Best regards, the team of Technical Wishes at Wikimedia Deutschland
Thereza Mengs (WMDE) 12:31, 13 December 2023 (UTC)
Side-by-side Wikisource translations
I keep running across Wikisource translations (everything in the Translation: namespace) with the original and translation side by side using various means (they also typically have various levels of annotations, but that's a separate issue). The most recent one I saw is Translation:Charter of the Courts of Justice 2000/2008-02-23.
I think this is probably a "voodoo" practice stemming from people copying some extant translation—the earliest translations on enWP predated Proofread Page—and a lack of guidance on hoe to do this. These days we tend to not transclude substantial chunks of non-English texts even if they are part of the original published work, and the policy requires that translations be scan-backed (must use Proofread Page) [struck the PRP bit to avoid confusing the issue, based on comments below. --Xover (talk) 23:26, 16 December 2023 (UTC)] and the non-English text be already Proofread on the appropriate language Wikisource.
I've been hesitant to do anything with these when I've run across them, but like the ones already there, sloppy practices tend to spread, and I don't think we should let the Translation: namespace devolve into an anything-goes, we-don't-care dump.
On the other hand, removing the side-by-side parts may feel intrusive for contributors who have put a lot of effort into the translation.
What are people's thoughts? Should we clean up older texts to modern standards (I mean case by case, and as they pop up for other reasons; I don't think any mass or bulk effort is called for here)? Should we tighten enforcement of policy for new Wikisource translations to avoid amassing more of these?
NB! The Translation: namespace is really hard for patrollers to manage because almost nothing there is policy-compliant. In order for patrollers to be able to do that effectively they'll need fairly clear backing from the community, hence this discussion. We have policies; we just generally haven't ever enforced them (beyond the most egregious cases). Xover (talk) 13:43, 16 December 2023 (UTC)
- AFAIK, we have never required our user-translations to use Proofread Page, and I don't see that in Wikisource:Translations anywhere. For poetical translations with short poems, side-by-side has almost been the norm. We have huge chunks of important poetry in the Translations namespace done side-by-side. All of our translations of the poetry of Catullus are done that way, with line numbers, which is hugely helpful for students of Latin.
- I've done something similar with the cantigas d'amigo of Martín Codax. For his poems, an in-scan translation would be extremely messy because they were published in a Spanish antiquarian bookseller's pamphlet based on the Pergaminho Vindel. You can see how Galician Wikisource scan-backed theirs here, with two original parchment texts but not using Proofread Page. The source scans have different text between pages because the first image is of the P.V., which contains all seven of his poems and their music, and the second image is from the Cancioneiro da Biblioteca Nacional compiled in 1278, which is an anthology of the nation's troubadour songs, a sort of Domesday Book of music. But neither of these was the source text I chose to use; I used the publication of the P.V. in Las siete canciones de amor, which is housed on the Spanish Wikisource, since the descriptive text is in Spanish, even though the songs transcribed are in Old Galician, since the languages of medieval Spain are "messy". This transcription at es.WS is done using Proofread Page, however the images in the work are reconstructed images, where the lacunae in the P.V. have been hand-filled by the antiquarian bookseller who published it, using the aforementioned Cancioneiro da Biblioteca Nacional to complete the text. All of this leads to a reliable base text, but the images themselves are "user-created" and are therefore not a reliable source for supporting a Proofread Page translation.
- I could continue with a lament about why I haven't yet attempted to translate some of the more important Catalan, Portuguese, and Spanish poets of the medieval period because of the challenges of finding a usable version of a short medieval poem in an entire 500 page text. But the themes here are that (a) poetry does not lend itself to Proofread Page, since poems are seldom on a page of their own without the end or start of another poem, and (b) using Proofread Page means proofreading the entire page, even if only the one poem is desired, and (c) ancient and medieval poetry even more so because the source texts are often not scan-backed with Proofread Page on the "home" Wikisource, even if they are image-backed, and (d) the "home" Wikisource gets messy for languages when you talk about ancient and medieval works. --EncycloPetey (talk) 17:02, 16 December 2023 (UTC)
A scan supported original language work must be present on the appropriate language wiki, where the original language version is complete at least as far as the English translation.
And for languages without an obvious home Wikisource that home would be mulWS. But apart from generally acknowledging that there are always going to be edge cases where pragmatic approaches are needed (your example of hand-filled lacunae may be one such case), I don't understand your arguments. It's easy to agree that line numbers are useful for students of Latin. But so are glosses on words useful for students of Shakespeare, and using modern spelling (to take the most obvious examples). Xover (talk) 17:53, 16 December 2023 (UTC)
- The quote says (1) there must be a scan backing the original language text, and (2) the English translation cannot extend beyond what has been completed on the original language copy, but it does not say that our local copy must be in the Page namespace in an Index scan. I can see how that might be read into what we have as policy, but our policy does not require that. --EncycloPetey (talk) 19:02, 16 December 2023 (UTC)
- Ok, I've struck that parenthetical above to avoid saying any more than is explicit in the policy text. But how else would you interpret a requirement for scan-backing? Let's disregard the fact this was the intent of that rule when that policy was established in 2013 (well after we'd migrated to Proofread Page, and scan-backing using PRP was a base assumption; this 2011 translation was an example in the discussion): what would you say was a reasonable interpretation of the practical way to implement a requirement that texts be "scan supported" given our current standards? Xover (talk) 23:52, 16 December 2023 (UTC)
- I do agree that the translation namespace is in an enormous mess and we need to apply strict rules to stop its spreading. Our current rules demand that a scan-backed proofread version in the original language Wikisource exists. (I personally would add another rule that the Proofread extension has to be used for the translation itself too–in fact I would consider it even more important than the current rule). Side-by-side translation is a bad practice as the patrolling users have to check not only whether the text was translated well, but also whether the original language transcription is correct, which is very time-consuming without direct access to a text proofread in the Proofread extension, especially when the checking is performed in a language in which the patrolling user is not fluent. The result is that people tend to avoid patrolling frequent changes to the texts in the translation namespace, and so even if an added original translation were good, there is a big danger of its gradual corruption. So whenever somebody tries to add such a side-by-side text, we should politely ask them to rework it in accordance with our rules, and if the user were not willing to do so, it should be deleted. --Jan Kameníček (talk) 18:24, 16 December 2023 (UTC)
- Side by side text has a noble history in print of making originals accessible to people with limited understanding of the language, and making people with limited understanding of a language more comfortable in their study of texts in that language. There aren't that many changes in Translation space--somewhere around 200 so far for December--so it doesn't take that much time to check them. There's certainly not enough edits for a "gradual corruption" of anything. I oppose making more rules for translators to jump through.--Prosfilaes (talk) 20:59, 16 December 2023 (UTC)
- 200 in 2 weeks is a lot. It would be interesting to know how many of them were thoroughly checked (i.e. both whether the added foreign language text corresponds to the original in all details and whether the translation corresponds to the foreign language text). I personally gave this checking up a long time ago as for me it definitely is too time-consuming. --Jan Kameníček (talk) 21:46, 16 December 2023 (UTC)
- Side-by-side is great and is, along with word glosses, line numbers, and critical commentary, among the common ways editions are published. However, we do not add these to editions that were not originally published with such apparatus. I am not suggesting adding any new rules; just asking whether we should try to actually enforce the rules we already have. Xover (talk) 23:23, 16 December 2023 (UTC)
Technical needs survey RFC ongoing on Commons
There is a Technical Needs survey that is ongoing on Commons here: Technical needs survey. Given that Wikisource is so bound to Commons, this is a good opportunity for making proposals for technical improvements in tools, bots etc that would be a benefit to our project here.
One thing that's on my mind is that issue in IA-upload that causes any upload sourced from JP2 files (instead of direct DJVU downloads from IA) to have a black page that gets added to the start of the uploaded djvu file in Commons. I don't know any better way of describing the issue, but it throws off the OCR sync, and the only solution seems to be to always select the option to skip/remove the first page from the upload. The issue existed at least until last month when I last encountered it. Ciridae (talk) 13:26, 18 December 2023 (UTC)
Tech News: 2023-51
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Tech News
- The next issue of Tech News will be sent out on 8 January 2024 because of the holidays.
Changes later this week
- The new version of MediaWiki will be on test wikis and MediaWiki.org from 19 December. It will be on non-Wikipedia wikis and some Wikipedias from 20 December. It will be on all wikis from 21 December (calendar). There is no new MediaWiki version next week. [12][13]
- Starting December 18, it won't be possible to activate Structured Discussions on a user's own talk page using the Beta feature. The Beta feature option remains available for users who want to deactivate Structured Discussions. This is part of Structured Discussions' deprecation work. [14]
- There will be full support for redirects in the Module namespace. The "Move Page" feature will leave an appropriate redirect behind, and such redirects will be appropriately recognized by the software (e.g. hidden from Special:UnconnectedPages). There will also be support for manual redirects. [15]
Future changes
- The MediaWiki JavaScript documentation is moving to a new format. During the move, you can read the old docs using version 1.41. Feedback about the new site is welcome on the project talk page.
- The Wishathon is a new initiative that encourages collaboration across the Wikimedia community to develop solutions for wishes collected through the Community Wishlist Survey. The first community Wishathon will take place from 15–17 March. If you are interested in a project proposal as a user, developer, designer, or product lead, you can register for the event and read more.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 16:17, 18 December 2023 (UTC)
Broken pages on December 20
Hi all. For anyone seeing error pages that look like the site is down. This was due to a problem with the new version of MediaWiki deployed this evening (UTC). The devs and operations people are working on it, but the issues may persist for a while longer. As I'm writing this they know roughly what caused it, have rolled back to an earlier version, and are debating how best to fix the fallout (cached error pages, essentially). Xover (talk) 19:52, 20 December 2023 (UTC)
- It appears the fallout has been cleaned up now. They've also implemented some protections against the same problem occurring again in the future, and plan to work on further making this area of the infrastructure more robust after the holidays. Xover (talk) 06:59, 21 December 2023 (UTC)
Transclusion checker - Translations namespace
It looks as though the Transclusion checker tool borks when the text is transcluded to the Translations namespace. See Index:Ivan Cankar - Hlapci.pdf and Translation:The Serfs. The full text appears to be transcluded by a manual check, but the checker tool does not verify this. --EncycloPetey (talk) 04:56, 22 December 2023 (UTC)
- @EncycloPetey: Should be fixed now. Xover (talk) 20:56, 22 December 2023 (UTC)
Split author and work disambiguation pages where they have the same name
- Previous discussion (2023-07)
- (Discussion renamed from "Proposal: Explicitly allow Author cross-namespace redirects to mainspace, if the redirect is to a disambiguation page of both authors and works")
I think that, in cases where the mainspace is used to disambiguate both works and authors, an Author-namespace redirect should be allowed to be created to redirect to that mainspace page. For example:
- Author:John Brown -> John Brown
- Author:Nathan Hale -> Nathan Hale
- Author:Oscar Straus -> Oscar Straus
(The reason these mainspace pages exist in the first place is because of a technical rule at Wikidata that does not allow multiple disambiguation pages for the same name.)
My reasoning for this is primarily structural. It's clearly unhelpful and confusing to have a red link at the Author page of a common name like "John Brown" or "Nathan Hale". If someone tries to link to an author page for one of those names, and finds it's a red link, our linking scheme and hierarchy doesn't make it entirely clear where to go from there. And the gadget that notifies editors when a page they link to is a disambiguation page wouldn't work for such links, for example.
In terms of where to "rule" this formally, I'm not sure. The closest thing we have to "a list of rules" about disambiguation and redirects are our Help:Disambiguation and Help:Redirects page (both help pages). But, "cross-namespace redirects" are listed as a speedy deletion criteria at Wikisource:Deletion policy#Speedy deletion.
I don't think that it should be taken out as a speedy deletion criterion, but I would like to note that the speedy deletion criteria are not absolute rules set in stone either. There are plenty of cases of cross-namespace redirects that are accepted here, for example, and have barely been challenged (and I even used a few in preparing this post). But I wanted to bring a proposal here (again), because some other administrators seem to disagree in this specific case that these author->main disambig redirects should exist.
I'm skeptical that we need some policy written, though. I'm looking for community consensus in this discussion to determine that this is, indeed, a valid edge case for cross-namespace redirects (as I believe). PseudoSkull (talk) 15:53, 22 December 2023 (UTC)
- With regard to "a technical rule at Wikidata that does not allow multiple disambiguation pages" - there are seperate object types on Wikidata for "Wikimedia disambiguation page" and "Wikimedia human name disambiguation page", which handily takes care of this issue. My suggestion, therefore, is to split these mixed-use disambig pages back into the namespaces where they belong. —Beleg Tâl (talk) 16:35, 22 December 2023 (UTC)
- @Beleg Tâl: Actually, that makes sense, since the item for Wikimedia human name disambiguation page is also given the following data:
has use: author disambiguation
Wikisource category (of the Category:Human name disambiguation pages item): Category:Author disambiguation pages
- So, I can split the disambiguation pages that way. @Billinghurst, @EncycloPetey: does this sound okay to you? PseudoSkull (talk) 17:15, 22 December 2023 (UTC)
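To make the split concrete, here is a minimal sketch of how a script might check which disambiguation type a Wikidata item carries, assuming MediaWiki's standard mw.ForeignApi and the wbgetentities module. The QIDs are the item types named above as they exist on Wikidata; verify them before relying on this:

    var DISAMBIG = 'Q4167410';              // "Wikimedia disambiguation page"
    var HUMAN_NAME_DISAMBIG = 'Q22808320';  // "Wikimedia human name disambiguation page"

    function getDisambigType( qid ) {
        var api = new mw.ForeignApi( 'https://www.wikidata.org/w/api.php' );
        return api.get( {
            action: 'wbgetentities',
            ids: qid,
            props: 'claims'
        } ).then( function ( data ) {
            // Collect the instance-of (P31) values; assumes value snaks.
            var claims = ( data.entities[ qid ].claims || {} ).P31 || [];
            var types = claims.map( function ( c ) {
                return c.mainsnak.datavalue.value.id;
            } );
            if ( types.indexOf( HUMAN_NAME_DISAMBIG ) !== -1 ) { return 'author'; }
            if ( types.indexOf( DISAMBIG ) !== -1 ) { return 'works'; }
            return null;
        } );
    }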
- I haven't looked at any of this in depth, but my stance going in is that we should not mix author disambiguation and work disambiguation on a single wikipage. To the degree that practice is the root cause of this issue I am all for finding a way to avoid that. Cross-namespace redirects are, in general, bad because they violate ontological hard boundaries. Whenever a perceived need for one crops up it is very likely that we've messed up elsewhere. Redirects between Help: and Wikisource:, for example, are most likely because we haven't kept the two's purposes cleanly separated. Xover (talk) 17:45, 22 December 2023 (UTC)
- My opinion is the same as Beleg Tâl and Xover on this issue. I would like to see works disambiguated in the Mainspace and authors in the Author namespace. I would prefer to keep Work and Author disambiguation separate, with at most a "see also" type of link between them when both exist. --EncycloPetey (talk) 18:55, 22 December 2023 (UTC)
- Comment (I never understood the purpose of the Help namespace anyway. How is it supposed to be different from the Project namespace?) PseudoSkull (talk) 20:45, 22 December 2023 (UTC)
- Help: should only have the "how to" articles in it; while Wikisource: is for policy and project co-ordination. Beeswaxcandle (talk) 20:59, 22 December 2023 (UTC)
Comment @PseudoSkull: Please take the courtesy to link to the previous attempt that you made on this matter. It is a bit rude to expect that others need to go and rehash their arguments just as you consider it important and the community did not agree with you at the time. — billinghurst sDrewth 23:08, 22 December 2023 (UTC)
- And NO I don't think that it is right that after a few comments that you think that we have a consensus and you can go and undo many years of policy based on the commentary so far. This needs a proper RFC, not this discussion. You are asking for a change of practice long held. Please show a little respect, especially at this busy time of year. — billinghurst sDrewth 23:12, 22 December 2023 (UTC)
- (edit conflict) What is a "proper" RfC? In the past nine years, only two topics have been opened at Wikisource:Requests for comment, and in the entire history of that page, only one topic has ever been closed. The dearth of activity demonstrates that the community does not take that page seriously. Nearly all discussion on policy has taken place in the Scriptorium, or on one of its satellite pages. --EncycloPetey (talk) 23:26, 22 December 2023 (UTC)
- @Billinghurst: Linked (it was just due to forgetting). And the course of this discussion has now changed, from suggesting a cross-namespace redirect to solving the problem entirely with a more namespace-friendly solution. I just pinged you to know if you're in favor of splitting John Brown into two disambiguation pages, John Brown and Author:John Brown, based on the Wikidata evidence provided above. It seems to me like a straightforward and workable solution, and several other editors seem to agree (in fact I'm not even the one who originally suggested it). I'm not aware of any rules that explicitly disallow this type of split, so what justifies a formal RFC vote about that? But if it is against some longstanding policy, sure, I'll bring it to RFC promptly. PseudoSkull (talk) 23:24, 22 December 2023 (UTC)
- I agree - changing the policy on cross-namespace redirects would warrant a more official proposal, but splitting the multi-namespace disambiguation pages does not. Since the former proposal is no longer being suggested, I think this is ok as it is (but you might want to consider renaming the discussion lol) —Beleg Tâl (talk) 16:26, 23 December 2023 (UTC)
- Comment I have created a test case (out of rather immediate necessity), at Samuel Butler. Compare Author:Samuel Butler, and see the Wikidata items: for "human name" (authors), and general disambiguation item (works). PseudoSkull (talk) 02:48, 23 December 2023 (UTC)
Exports even more unreliable than usual lately
PDF export links such as this one usually work with flying colors, but instead I get this error:
Wikimedia Cloud Services Error
This web service cannot be reached. Please contact a maintainer of this project.
Maintainers can find troubleshooting instructions from our documentation on Wikitech.
I am aware that these servers have not been known to be the best in the world, but out of the last 10 or 15 transcriptions I've finished (over ~ the past few weeks), only one export actually worked...which is pretty unusual. For simple works like The High School Boy and His Problems, with a small number of pages and chapters, little formatting, and only one image, this is usually a cinch for the exporter, but over the past few weeks it's crapped out on almost every download attempt I've tried.
Maybe there's a particular problem with the servers right now that needs to be addressed? PseudoSkull (talk) 16:55, 18 December 2023 (UTC)
- @PseudoSkull: Known issue. ws-export has had more downtime than uptime over the last week, and uptime over the last 90 days is something like 80%. The cause is apparently PDF exports leaking system resources. I have suggested just disabling PDF exports for now, as a short-term workaround. Longer term there's an effort in progress to rewrite ws-export such that it can rate-limit jobs (and, I hope, run jobs on external compute nodes so one job can't take down the whole service). Xover (talk) 19:48, 18 December 2023 (UTC)
- @Xover: Is there a particular place this issue is being tracked or discussed publicly right now? I've stopped doing PDF downloads after transcriptions for now to respect the servers. But I would like to keep track of the status of this issue. PseudoSkull (talk) 16:41, 23 December 2023 (UTC)
- @PseudoSkull: Oh, sorry, this one slipped my mind. The most apposite task is probably T335553. Xover (talk) 08:13, 27 December 2023 (UTC)
The start of a conversation about born-digital texts
NB! This thread is not a proposal and not a vote about anything!
In a currently open deletion discussion several contributors have expressed various forms of scepticism or reservations about born-digital texts. This resonates with my own scepticism about such texts, which comes to the surface whenever I run across an example of them that I feel is problematic for one reason or another. So I think the time may be ripe to start a conversation about it.
I'm not suggesting we try to make policy in this conversation, or really decide anything much. Iff any change is to be made as a result it'll be in a separate proposal, and based on something much more concrete and structured (e.g. a specific policy).
In any case… Right now born-digital texts are essentially treated identically to traditionally published texts (out-of-copyright printed books, to boil it down to its most core definition) as far as our policies and practices are concerned. But there are fundamental differences between the two categories of texts and our tooling and workflows (and policies) are built based on assumptions that something exists on paper somewhere as the ultimate source.
I don't think anyone would go so far as to blanket ban born-digital texts. But I think maybe it's worth considering whether to define the scope for born-digital texts in terms of limiting what can be included, and under what circumstances / to what standards. I, for my own part, have been wondering whether we should start treating born-digital texts as an exception rather than co-equal to traditional texts. That is, "We don't allow born-digital texts, unless X, Y, and Z."
I think it's fairly obvious that were we to go that far we would need at the very least some commonsense exceptions, and probably some form of a grandfather-clause.
One sensible and, I think, not too controversial carveout from the status quo is the mirroring of content originally created on our sister projects: Wikipedia articles (including articles in The Wikipedia Signpost), Wikibooks books, Wikinews articles, and so forth. I think it would even make sense to say "wikis" in general. These are already way into the gray area close to WS:WWI's "stuff that keeps changing", but not explicitly and not as a bright-line rule.
Another possible carveout might be blog posts, even from reputable publishers (the example in the deletion discussion referenced is published by the London School of Economics). While superficially similar to a magazine or newspaper, blogs in general tend to have very different quality standards, ad hoc editorial standards, tend to change quietly and invisibly, etc. The better way to deal with blogs is the WayBack Machine / Internet Archive or other web mirroring services. We're just not set up to do a good job of such content.
And that's more or less the crux of the issue for me: for born-digital texts, Wikisource and our tools and practices add no or very little value, and almost always require some level of compromise to our standards or approaches.
So I'd like to hear the sentiments of the broader community on this. Are we happy with the status quo? Or if not, what exceptions and grandfather clauses would we need if we were to raise the bar for born-digital texts?
Or maybe born-digital texts should just be clearly labelled, have their own manual of style, requirements, etc.?
Is this a case-by-case type of issue? Do we need specific policy for it? A separate "Born-digital texts policy"? Or an adjustment to WS:WWI?
I have my own thoughts and opinions about this issue, but not so firmly held that I'd care to stake out a clear "We as a community should do this specific thing." So thoughts, opinions, advice would be very welcome. Including "Leave well enough alone!" or variants. I am not at all sure we should do something here right now (there are definitely downsides), so if that's where the community is at that'd be valuable input too. Xover (talk) 17:39, 25 December 2023 (UTC)
- @Xover: Thanks for the in-depth analysis. For the record, I don't think that we should blanket-ban all born-digital texts, especially those that originated from government agencies. 7,200 Lost U.S. Silent Feature Films (1912-29), for example, is a text that was born digital, but it has a single editor, was made on behalf of a federal government agency (the Library of Congress), and was published in PDFs (with a new version released when it's updated). This, to me, is very different from a blog post or, especially, a text born directly from a MediaWiki page.
- Then we have to consider exceptions due to notability. Me at the zoo, for example, could be considered a born-digital text. It's also a pretty informal video. And such YouTube videos are not things I'd normally think were appropriate for Wikisource...except for the fact that it's the first ever YouTube video, has its own Wikipedia article, etc. So clearly the video has established notability.
- These lines are hard to draw. But I think born-digital texts that are government document, that have multi-project established notability, or that have some kind of a static nature (especially where they originate in a file such as a PDF or a video), are things we should consider here.
- For wiki pages and blogs, I think those should be a hard line IMO, unless there's some extremely exceptional case (probably related to notability) that I haven't considered yet. But, in terms of the rest of the Internet, we have to consider quite a lot, because the Internet is huge, despite having only been around for about a quarter of a century. And it's hurting my brain already. Cheers, and Merry Christmas to all! PseudoSkull (talk) 19:31, 25 December 2023 (UTC)
- Thanks. I think video is an entirely separate conversation, mainly because I think the factors are so different from textual works, so none of what I wrote above should be assumed to apply to video as a category of work. But your examples made me think: maybe the way to navigate towards some kind of balance is to judge a born-digital work's "distance" from a traditional paper book? If it has a formal publisher (the feds, or Oxford University, etc.), is subject to some standard etc., and is published in some persistent form (i.e. has some similar properties to a printed book), then it fits better for us? If the Government Printing Office sends one copy of a PDF master to the printers and puts one copy on their website, there's no need for us to go scan the printed book vs. just grabbing the PDF. Contrast this with a webpage that's continuously updated in small and large ways, with no versions, revision history, or dates and editions. This may be too limiting as a policy, but maybe it's a useful lens for thinking about the issue? Xover (talk) 19:46, 25 December 2023 (UTC)
- I definitely oppose a ban on all born-digital texts, for the simple reason that all laws nowadays are born-digital. In fact, I think that laws are a gold standard on a hypothetical “born-digital spectrum” which could/should be policy/general instructions. They (1) come from a source with roots in printed material, (2) are in document form (as opposed to a normal Web-page), and (3) are stable long-term, or at least have clearly defined and separate versions. My main reluctance in pushing this as a policy is that there will likely be some edge cases: for one example, I think that “End Poem” would fail basically all of my “tests,” but I still think it should be included here. For such edge cases, I think a hypothetical policy could allow for other ways to make up for the failure to meet the conditions. For example, even if editing is technically possible (violating (3)) a commitment against editing could allow for inclusion. However, such a policy should be very harsh, such that exceptions are rare. Now, to justify my restrictions. (1) is necessarily to keep in line with the traditionally accepted methods of publication. What is now a legally published blog post was formerly just a thought or discussion, unless it got into a local newspaper as an editorial. In addition, the medium of print ensures long-term stability; you can’t edit a published editorial, for example. (2) is related to (1), in ensuring that Wikisource fulfills its role. The role is enshrined in our Index: and Page: namespaces, which are designed to handle multi-page documents. For Web-pages, there really is no direct equivalent in document form: either you have one very long image, or a multi-page PDF where the Web-page is broken up into several smaller pieces. Neither of these are adequate as a representation of such a Web-page, and could be handled better without any involvement of File:, Index:, or Page:. For that reason, I think the document restriction is reasonable. (3) also harkens back to print restrictions: where modifications are difficult, and result in different versions, Wikisource can represent each version. But where modifications are easy, and result in no versions-like differentiation, Wikisource can be stuck with a prior version without knowing about an update. For this reason, and also what I have said in defending restriction (2), on-line archival systems like the Wayback Machine are the superior method of preserving old Web-sites. TE(æ)A,ea. (talk) 21:19, 25 December 2023 (UTC)
- @TE(æ)A,ea.: Quick comment: Since End Poem is from a video game (Minecraft), it exists within an originally published video game, so that should be considered to pass our publication requirements. It was mainstream to distribute games through physical disks and cartridges up until very recently in gaming history (think late 2010s, early 2010s), so while technically "digital" I'd still consider Minecraft a physically published work, since I know for a fact (though I'm not a Minecrafter myself) that Minecraft was published on disc for several consoles and devices. PseudoSkull (talk) 22:09, 25 December 2023 (UTC)
- PseudoSkull: Our copy of “End Poem”—the one formally released by the author—was buried in a blog post; as that is the formal release, I would consider the blog post the source, but the video game the point of notability—although in any case, “End Poem” is an unusual case, the sort of “edge case” of which I was thinking in my longer comment. TE(æ)A,ea. (talk) 00:06, 26 December 2023 (UTC)
- I think we are always going to need safety-valves for the really odd cases, because in aggregate there are quite a few of them. But I think the community has been pretty good at spotting these when they crop up, and have carved out ad hoc exceptions for them. Xover (talk) 08:08, 27 December 2023 (UTC)
- I likewise oppose a full ban on digital-born texts. Reasons have been well-stated already, so I will simply add that there are an increasing number of peer-reviewed scientific journals with electronic-only publication, to reduce costs. While most of that content won't be available for us to host for decades, some of it is public domain now. --EncycloPetey (talk) 22:12, 25 December 2023 (UTC)
- To add to this, many recent government edicts are full-on digital releases. So, any ruling should certainly keep those and scientific journals from being caught in the crossfire. PseudoSkull (talk) 03:09, 27 December 2023 (UTC)
Meta categories for occupations
I would like to question the legitimacy of having occupation categories, such as Category:Composers, as meta categories that don't allow pages. We can have works that are not about specific composers, but are about composers in general, or "composer" as an occupation. I was thinking "The Great American Composer", an essay about American composers (in general), should be added to that category, but it is not a biography or an author. PseudoSkull (talk) 15:32, 27 December 2023 (UTC)
- @PseudoSkull: So... a subcategory named "Works about composers"? Fully diffused categories are a common way to organise things in order to keep the top few levels of categories manageable. Xover (talk) 16:09, 27 December 2023 (UTC)
- @Xover: That sounds fine to me—whatever works. PseudoSkull (talk) 16:10, 27 December 2023 (UTC)
- And "Composers in fiction" should be separate as well, for fictional works with a central character who is a composer? --EncycloPetey (talk) 17:39, 27 December 2023 (UTC)
Journal of a Residence in Circassia Book Problem
Hello.
Journal of a Residence in Circassia is a book written by London E. Moxon, who died in 1858, so it has zero problems with copyright. However, I have no idea how to actually make a new page about this book.
Here is the original text:
https://archive.org/details/journalofresiden01belluoft/mode/2up Blahhmosh (talk) 19:43, 27 December 2023 (UTC)
- There are already scans of this book on Commons, albeit from a different library. Create the Index pages:
- and begin transcribing. --EncycloPetey (talk) 20:13, 27 December 2023 (UTC)
Date should not be linked
I have created a {{Hyakunin Isshu link}} based on invoking {{Article link}}, and for some reason the date "13th c." is being linked in the implementation, even though it is not linked in any of the examples provided in the documentation. Example of the issue below:
- "tago no ura ni", poem 4, by [[Author:Yamabe no Akahito|Yamabe no Akahito]] in Hyakunin Isshu, (ed.) by Fujiwara no Teika (13th c.)
Please, can someone determine (and possibly correct) the cause of the problem? --EncycloPetey (talk) 23:33, 27 December 2023 (UTC)
- @EncycloPetey: it seems that if one of these is not specified, the date is linked:

    if args['series'] or args['volume'] or args['issue'] then
        ...
    else
        if date_str then
            local target = table.concat(parts, '/', 1, #parts)
            local date_link = make_link(target, date_str, true)
            out = out .. ', ' .. make_span('wst-artlink-midlevel wst-artlink-date', date_link)
        end
    end

- Mpaa (talk) 17:33, 28 December 2023 (UTC)
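In other words, the date link is only built in the fallback branch, i.e. when none of series/volume/issue is given. A sketch of the behaviour EncycloPetey wants—the date emitted as plain text unless a link is explicitly requested—might look like this (JavaScript purely for illustration; the real module is Lua, and makeLink here is a stand-in for its make_link helper):

    // Stand-in for the module's make_link helper (hypothetical).
    function makeLink( target, label ) {
        return '[[' + target + '|' + label + ']]';
    }

    // Emit the date as plain text unless linking was explicitly requested,
    // instead of unconditionally linking it in the fallback branch.
    function formatDate( dateStr, target, linkDate ) {
        if ( !dateStr ) {
            return '';
        }
        return linkDate ? makeLink( target, dateStr ) : dateStr;
    }

    // formatDate( '13th c.', 'Hyakunin Isshu', false ) -> '13th c.'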
- sigh; then I guess I'll need to make a link template from scratch, as none of the existing multi-purpose base templates appear to be usable. --EncycloPetey (talk) 17:53, 28 December 2023 (UTC)
Access to WSJ articles?
Earlier today I saw an article on the wsj website titled: How an Israeli Airstrike on a Hamas Commander Also Killed Scores of Civilians.
It was a stunning analysis of an airstrike that took place on October 31, almost two months earlier. It had tons of images of crowds of men and buildings, some that tell a whole story. It had quotes from many men, several of whom, I would hazard a guess, no one reading this had ever heard of before. There was also one fairly elaborate quote from a woman who was identified as an aunt of someone, but her name was not revealed in the story.
I expect this article will probably be used to add information by those few on the English Wikipedia who have access to the WSJ website, but the majority of editors and readers will never see it. Where in wiki-land will this article be accessible to simpletons such as myself? Will it be here, Commons, or anywhere else?
Thanks in advance, Ottawahitech (talk) 15:41, 28 December 2023 (UTC)
- @Ottawahitech: It seems unlikely to me that a Wall Street Journal article from 2023 is freely licensed in any way, so I don't think that it can be hosted anywhere at all in the WMF sphere. PseudoSkull (talk) 19:15, 28 December 2023 (UTC)
- It is possible that you can access it via the Wikipedia Library: https://wikipedialibrary.wmflabs.org/, e.g. via EBSCO or ProQuest. MarkLSteadman (talk) 05:11, 30 December 2023 (UTC)
'Mark proofread' gadget stopped working properly
In the past day or so this gadget has stopped working properly. In some cases it is putting a red box around 'problematic' or 'no text' pages. For pages marked 'proofread', it only seems to be highlighting correctly the first 30 or 40 pages but after that it stops. Chrisguise (talk) 04:30, 30 December 2023 (UTC)
- Is this related to the thread Transclusion checker - Translations namespace that I posted above? --EncycloPetey (talk) 04:37, 30 December 2023 (UTC)
- @EncycloPetey: Only insofar as it prompted me to rewrite the Gadget so that it could support the Translation: namespace as well. @Chrisguise: Can you give an example of an Index: page on which it is not behaving correctly? I'll also need a list of ~3–5 pages in that Index: that are either getting incorrectly marked, or are not getting marked when they should be, each with a description of what you're seeing. The name of your web browser and its version and which operating system you're running is going to be useful if the problem doesn't manifest everywhere. And if you're at all technically inclined and comfortable doing so, have a look in your web browser's JavaScript Console for error messages that could be connected to this. If you're not comfortable futzing about with that console then don't worry about it for now. Xover (talk) 10:05, 30 December 2023 (UTC)
- @Xover@EncycloPetey A simple example would be the current PotM Index:Frenzied Fiction.djvu. In the index page everything up to /122 is transcribed to one point or another (5 'no text', 3 'not proofread' and the balance split between 'proofread' and 'validated'). In this case there are no red borders on the 'no text' or 'not proofread' ones. However, there aren't any on the ~54 'proofread' pages. Whilst I have transcribed some of these, I haven't done them all. Examples are /78, /79, /80.
- Windows 11 Home Version 10.0.22631 Build 22631. Firefox 120.0.1 (64-bit). The JavaScript console for the index page has one item with a white exclamation mark in a red circle: 'Uncaught TypeError: Images is undefined' prefetchImages ext.gadget.Preload_Page_Images-script-0.js:25
- jQuery 3
- xhr index.js:298
- I will look for another index with different problems. I have also tried Edge with similar results. Chrisguise (talk) 10:34, 30 December 2023 (UTC)
- Hmm. The JavaScript error you see is a problem with a different Gadget (Preload Page Images; it was broken due to upstream changes in PRP/MW, so I rewrote it and forgot to limit it to the Page: namespace. Index:-namespace pages obviously have no "next" image to preload), and it should be fixed now. That pages marked "No text" are not highlighted is deliberate, as "No text" is presumed to be an "end state" like "Validated". However, the pages in status "Proofread" (marked so by a different contributor) and the pages in status "Problematic" should definitely be getting a red outline. And indeed, that's what happens in my web browser. So something is definitely broken, but whatever it is varies by user account or web browser or similar. A few more example Index: pages would be useful for testing. And if anybody else could test and report whether and if so what problems they see (include browser + OS) that might be useful too. Xover (talk) 10:54, 30 December 2023 (UTC)
- @Xover@EncycloPetey Another example. Index:The return of the soldier (IA returnofsoldier00west2).pdf. This is a work which I have contributed to and which has all pages proofread. There are no instances of incorrect red borders around 'no text' pages but none of the 'proofread' pages has a red border, although I have done nothing on, for example, /154, /155, /156, /157, /158, etc. Chrisguise (talk) 11:25, 30 December 2023 (UTC)
- Hmm. I wonder if this might be because we're hitting an API request limit. Admins have higher limits than regular users, which might explain why you're seeing this problem but it works fine for me. If other contributors who are admins do not see this problem but other contributors who are not admins do see it then that would tend to indicate that that's what's going on. Xover (talk) 11:57, 30 December 2023 (UTC)
- @Chrisguise: I've added some logging for errors returned by the API. Could you test again and see if you see any new messages in the JavaScript console? If an error or warning is returned by the API the gadget will log the returned error or warning object to the console, and it will probably be displayed as an expandable data structure (disclosure triangles to show branches). To start with it'll be the main error message and possibly error code that are relevant. Iff this is an API error that should tell me what's going on. Xover (talk) 12:10, 30 December 2023 (UTC)
- Another example:
- I have checked the JavaScript console on the two index pages referenced above and there are no red flags, just a lot of entries along the lines of 'Referrer Policy: Ignoring the less restricted referrer policy “origin-when-cross-origin” for the cross-site request: https://commons.wikimedia.org/w/load.php?modules=ext.gadget.CropTool', each of which seems to be related to the gadgets I have selected in my preferences.
- Another example is Index:Tales of my landlord (Volume 4).djvu, which I have contributed to only a little. All pages are at least proofread. However, only those pages at 'proofread' up to /42 have a red border, with nothing thereafter (e.g. /43, /46, /47, /48 etc.), with the exception of /352 and /353, which do (correctly) have a red border. Chrisguise (talk) 14:55, 30 December 2023 (UTC)
- Index:The Music of the Spheres.djvu On this one, which is a work in progress, /8 to /13, which are at 'not proofread' have red borders, but other pages at this status do not (e.g. /90, /97, etc.) Chrisguise (talk) 15:00, 30 December 2023 (UTC)
- Index:The Crowne of all Homers Workes - Chapman (1624).djvu. This index has a lot of 'problematic' pages, as the images have not yet been done. Of the ones at 'problematic', /4, /10, /12, /30, /178, and /196 have red borders but none of the others do. Chrisguise (talk) 15:22, 30 December 2023 (UTC)
- This is very strange. It sounds like the pages that do get marked are marked correctly, but not all pages get checked. That would typically be because the API calls to get the status fail after one or two batches, but if that were the case I would expect there to be either a local error (generated by the web browser) or a returned error from the API (which I am now logging) in the JavaScript console. The absence of anything in the logs makes this kinda mysterious. I have been unable to reproduce this in any browser I have available. In an effort to get somewhere at least, I have tweaked the gadget to optimize the API requests so it now requests less data, and less data that is expensive for the server to generate. This really shouldn't matter, but since we're flying kind of blind here I figure it's worth a shot. Xover (talk) 16:00, 30 December 2023 (UTC)
- FWIW, I'm getting the same errors, despite being an admin, so that doesn't seem to be the issue. And there are plenty of other indexes for which the check runs just fine (albeit much slower than in the past). The Index Index:Hundredversesfro00fujiuoft.djvu no longer runs properly, but Index:The Innocents Abroad (1869).djvu completes successfully. --EncycloPetey (talk) 17:35, 30 December 2023 (UTC)
- They're both ok for me (I have never seen errors). @Xover now the transclusion check is broken when I press "check", I think your last modification has broken the query. Mpaa (talk) 18:12, 30 December 2023 (UTC)
- @Mpaa: D'oh! That's because I'm a dummy. I did indeed break the transclusion check in the last change, and in probably the dumbest possible way. It should be fixed now though. Xover (talk) 18:18, 30 December 2023 (UTC)
- Ok, now I have no idea what's going on. It works perfectly for me on every index, and the query now should be as efficient as the old one so it's unlikely we're running into any rate limit. Xover (talk) 18:20, 30 December 2023 (UTC)
- It's now working on each Index I've tried. --EncycloPetey (talk) 19:49, 30 December 2023 (UTC)
- @EncycloPetey @Mpaa @Xover No change for me I'm afraid. Chrisguise (talk) 20:31, 30 December 2023 (UTC)
- @Chrisguise: I (finally) found a specific bug in the code that gave very strange symptoms and could be the cause of the problems you've been having. Could you try again, and if it fails to verify that you see "[transclusion-check]: loading with revision 11" in the JavaScript console (MediaWiki uses aggressive caching for Gadgets so it sometimes takes quite a long time from I make a change until it's active for everyone). Xover (talk) 20:58, 30 December 2023 (UTC)
- @EncycloPetey @Mpaa @Xover I can see "[transclusion-check]: loading with revision 11" in the JavaScript console and the gadget does now appear to be working properly. Thanks. Chrisguise (talk) 21:36, 30 December 2023 (UTC)
- Yay! Sorry about the mess; this was very definitely a bug in my code. I was finally able to reproduce the problem by removing the logged-in check and making the Gadget default so that I could check it when not logged in. I have no idea why it worked perfectly for me when I was logged in, but failed for others, and I definitely don't understand why I was suddenly able to reproduce it when not logged in. In fact, I don't understand how the old code ever worked for anyone. API requests are divided into batches when there is more than some limit of data to fetch, and the need to fetch more data is signalled by including a "continue" parameter in the returned data. To get more data you resend the original request but add on the magic values you found in that "continue" parameter. There can be multiple magic values in there, one for each type of data you're fetching (transcluded pages, categories, revisions, etc.). The bug was that when first loading an index the Gadget only asks for information about revisions (to get information about who last changed page status), but when this request needed more batches the code would instead request information about transcluded pages and categories but still include the "continue" magic values for the revisions request. This really should have never worked for anyone and should have failed with an obvious error message. I have no clue how that seemingly worked perfectly for me (and Mpaa etc.). In any case, I've removed most of the debug logging and removed it from the default gadgets, so we don't spam everyone's logs so much. I'll probably have to refactor this code again to make it less susceptible to this kind of bug, but I'll let it sit in peace for a while first. Xover (talk) 22:17, 30 December 2023 (UTC)
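To make the continuation mechanics concrete, here is a minimal sketch of correct batch handling, assuming the standard mw.Api from MediaWiki core (handleBatch is a hypothetical callback, not part of the actual gadget). The crucial point is that every follow-up request repeats the original parameters and merges in only the returned "continue" tokens:

    function queryAllBatches( baseParams, handleBatch ) {
        var api = new mw.Api();
        function step( cont ) {
            // Resend the ORIGINAL parameters each time; only the "continue"
            // tokens from the previous response are added on top.
            return api.get( $.extend( {}, baseParams, cont ) ).then( function ( data ) {
                handleBatch( data );
                if ( data.continue ) {
                    return step( data.continue );
                }
            } );
        }
        return step( {} );
    }

The bug described above amounted to swapping baseParams for a different property set between batches while still sending the stale revision "continue" tokens.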
- @Xover I always saw a single request looking with Inspect->Network. Mpaa (talk) 00:10, 31 December 2023 (UTC)
- @Mpaa: Yeah, I suppose that could be the explanation. Admins may have higher API limits, affecting how many results can be returned in a single request, than normal users like Chris. And it's also possible that non-logged-in users have even smaller limits. That could explain why Chris saw problems that you and I didn't, and why I finally saw them when I was logged out. It doesn't quite explain why Petey was able to reproduce it while logged in, but that might just be differences between indexes (number of pages, say).
In any case, big thanks to all of you for catching this problem and helping me fix it! Xover (talk) 10:33, 31 December 2023 (UTC)
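For anyone who wants to test the limits theory on their own account: the relevant user right is apihighlimits, which raises how many results a single API request may return. A quick check from the JavaScript console (a sketch, not part of the gadget):
<syntaxhighlight lang="javascript">
// Sketch: check whether the current account has the "apihighlimits"
// right, which raises the per-request API result limits.
new mw.Api().get( {
	action: 'query',
	meta: 'userinfo',
	uiprop: 'rights',
	formatversion: 2
} ).then( function ( data ) {
	var rights = data.query.userinfo.rights;
	console.log( 'apihighlimits:', rights.indexOf( 'apihighlimits' ) !== -1 );
} );
</syntaxhighlight>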
- This is very strange. It sounds like the pages that do get marked are marked correctly, but not all pages get checked. That would typically be because the API calls to get the status fail after one or two batches, but if that were the case I would expect there to be either a local error (generated by the web browser) or a returned error from the API (which I am now logging) in the JavaScript console. The absence of anything in the logs makes this kinda mysterious. I have been unable to reproduce this in any browser I have available.
In an effort to get somewhere at least, I have tweaked the gadget to optimize the API requests so it now requests less data overall, and less of the data that is expensive for the server to generate. This really shouldn't matter, but since we're flying kind of blind here I figure it's worth a shot. Xover (talk) 16:00, 30 December 2023 (UTC)
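Purely as an illustration of the kind of trimming meant by "less data" here (the actual gadget parameters may differ):
<syntaxhighlight lang="javascript">
// Illustrative only: the fewer properties a query requests, the cheaper
// it is for the servers to generate and the smaller each response batch.
var before = { action: 'query', prop: 'revisions',
	rvprop: 'ids|flags|timestamp|user|size|comment|content' };
var after = { action: 'query', prop: 'revisions',
	// Only the last editor and when they edited; no content, no comments.
	rvprop: 'user|timestamp' };
</syntaxhighlight>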
- @Chrisguise: I've added some logging for errors returned by the API. Could you test again and see if you see any new messages in the JavaScript console? If an error or warning is returned by the API, the gadget will log the returned error or warning object to the console, and it will probably be displayed as an expandable data structure (disclosure triangles to show branches). To start with, it's the main error message and possibly the error code that are relevant. If this is an API error, that should tell me what's going on. Xover (talk) 12:10, 30 December 2023 (UTC)
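With mw.Api the returned error object is available in the rejection handler, so the added logging presumably looks something like this sketch (assuming an api client and params object from the surrounding request code; not the gadget's actual code):
<syntaxhighlight lang="javascript">
// Sketch: surface API errors and warnings in the JavaScript console.
// mw.Api rejects its promise with an error code plus the full result.
api.get( params ).then( function ( data ) {
	if ( data.warnings ) {
		console.warn( '[transclusion-check]', data.warnings );
	}
	// ...process the result as usual...
} ).catch( function ( code, result ) {
	console.error( '[transclusion-check]', code, result );
} );
</syntaxhighlight>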
- Hmm. I wonder if this might be because we're hitting an API request limit. Admins have higher limits than regular users, which might explain why you're seeing this problem but it works fine for me. If other contributors who are admins do not see this problem, but contributors who are not admins do see it, that would tend to indicate that this is what's going on. Xover (talk) 11:57, 30 December 2023 (UTC)
- @Xover @EncycloPetey Another example: Index:The return of the soldier (IA returnofsoldier00west2).pdf. This is a work which I have contributed to and which has all pages proofread. There are no instances of incorrect red borders around 'no text' pages, but none of the 'proofread' pages has a red border either, even though I have done nothing on, for example, /154, /155, /156, /157, /158, etc. Chrisguise (talk) 11:25, 30 December 2023 (UTC)
- Hmm. The JavaScript error you see is a problem with a different Gadget (Preload Page Images; it was broken due to upstream changes in PRP/MW, so I rewrote it and forgot to limit it to the Page: namespace. Index:-namespace pages obviously have no "next" image to preload), and it should be fixed now.
That pages marked "No text" are not highlighted is deliberate, as "No text" is presumed to be an "end state" like "Validated".
However, the pages in status "Proofread" (marked so by a different contributor) and the pages in status "Problematic" should definitely be getting a red outline. And indeed, that's what happens in my web browser. So something is definitely broken, but whatever it is varies by user account or web browser or similar. A few more example Index: pages would be useful for testing. And if anybody else could test and report whether, and if so what, problems they see (include browser + OS), that might be useful too. Xover (talk) 10:54, 30 December 2023 (UTC)
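For reference, restricting a user script to the Page: namespace is a one-line guard. A sketch (104 is the ProofreadPage Page: namespace number on English Wikisource):
<syntaxhighlight lang="javascript">
$( function () {
	// Bail out early unless we are in the Page: namespace (number 104).
	if ( mw.config.get( 'wgNamespaceNumber' ) !== 104 ) {
		return;
	}
	// ...preload the "next" page image here...
} );
</syntaxhighlight>
Gadget definitions in MediaWiki:Gadgets-definition also support a namespaces= option, which achieves the same restriction without any code in the script itself.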
- @EncycloPetey: Only insofar as it prompted me to rewrite the Gadget so that it could support the Translation: namespace as well.
@Chrisguise: Can you give an example of an Index: page on which it is not behaving correctly? I'll also need a list of ~3–5 pages in that Index: that are either getting incorrectly marked, or are not getting marked when they should be, each with a description of what you're seeing. The name and version of your web browser, and which operating system you're running, will be useful if the problem doesn't manifest everywhere. And if you're at all technically inclined and comfortable doing so, have a look in your web browser's JavaScript console for error messages that could be connected to this. If you're not comfortable futzing about with that console, then don't worry about it for now. Xover (talk) 10:05, 30 December 2023 (UTC)
Main namespace name of Index:Jesuit education, its history and principles viewed in the light of modern educational problems.djvu
I have no experience of my own with such a long file name. I was thinking of naming the main namespace page as Jesuit Education, but I want to defer to community practice. — ineuw (talk) 18:26, 31 December 2023 (UTC)
- @Ineuw: Just my personal opinion, but I myself would always use the name Jesuit Education, because subtitles are just too complex and look really nasty as titles. Take, for example, Wikipedia's common practice of leaving out the subtitle and going with the simpler name. MediaWiki isn't really designed to handle much baggage in terms of titles (since thousands of redirects are constantly necessary), so I think the simpler title is better.
- Unfortunately, though, from a policy standpoint, there isn't really anything stopping you from creating the page as Jesuit Education ; its History And Principles viewed in The Light of Modern Educational problems, which is a really nasty lenience on the part of our titling policy that I'm still trying to think of a way to convince the community at large to fix. PseudoSkull (talk) 18:38, 31 December 2023 (UTC)
- Of course I would prefer the short name as well. My question was about web searches, and about the accuracy of its identification on Commons. — ineuw (talk) 18:55, 31 December 2023 (UTC)
- @Ineuw: The short / main title is definitely preferable here. Keep in mind that it is the wikipage name we're talking about; the actual title is what's put in the |title= parameter of the {{header}} template. Think of the wikipage name as a file name: it's preferable if it is descriptive so you know what's inside, but it's the contents that count. In principle we could even have automatically generated hexadecimal identifiers in the URL. But as things are, I would give priority to things like making the page name easy to type for a human, easily findable through the search function, not being surprising relative to the actual text on the page, etc. Xover (talk) 19:55, 31 December 2023 (UTC)
- Comment @Xover: That "hexadecimal identifier [sort of thing]" will probably be the version's Wikidata item in the future. Probably far future, given the nature of change on this site... In any case, this would make handling titles and disambiguation much more straightforward, since the subtitle and title can be clearly categorically separated on Wikidata, and then we can use that data to display every work's title/subtitle however we'd like dynamically, rather than have them be inputted manually into multiple places. PseudoSkull (talk) 20:01, 31 December 2023 (UTC)
Do preview settings in CSS affect previews of all namespaces?
The preview pane is set in User:Ineuw/vector.css to be displayed on the left at 45% of the screen width. This setting affects previews in all namespaces, including this post's preview. Is it supposed to be so? Just asking. — ineuw (talk) 18:45, 31 December 2023 (UTC)
- @Ineuw: The contents of User:Ineuw/vector.css are loaded on every single page when you have the Vector skin active. What the CSS applies to is determined by the selectors you put in there. Xover (talk) 19:50, 31 December 2023 (UTC)
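For example, to keep a 45% preview rule from applying everywhere, the selector can be keyed to the body classes MediaWiki adds for each namespace. A sketch, assuming the rule targets the standard #wikiPreview container (the exact rule in the actual vector.css may differ):
<syntaxhighlight lang="css">
/* Applies on every page: every preview gets the narrow left float. */
#wikiPreview {
	float: left;
	width: 45%;
}

/* Scoped: only in the Page: namespace (ns-104 on English Wikisource). */
.ns-104 #wikiPreview {
	float: left;
	width: 45%;
}
</syntaxhighlight>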
- Much thanks. — ineuw (talk) 20:19, 31 December 2023 (UTC)
Turkey in the Straw request for someone interested in music
As the transcription of Steamboat Willie is going to officially be published here at Wikisource in just a few hours, I noticed that the main song used in that film, "Turkey in the Straw", has no available transcriptions here whatsoever. While "Turkey in the Straw" (c. 1834) has been in the public domain worldwide for quite a long time, it now has a special relevance to the new public domain year of 2024, which brings works from 1928 into the US public domain. I am hereby requesting that someone transcribe some version of the song sooner rather than later, since it will remain a red link on what I suspect will be a very high-traffic transcription in the coming months. PseudoSkull (talk) 22:12, 31 December 2023 (UTC)
- @Beeswaxcandle: Does this seem up your alley? PseudoSkull (talk) 00:50, 1 January 2024 (UTC)
- “Turkey in the Straw” is a tough case, because it’s a new version (different lyrics and title) of the older (and original) song “Zip Coon.” So, we could look for some sheet music for “Turkey in the Straw” (like IMSLP’s copy here), or we could use an old copy of “Zip Coon” (like the one here), and create “Turkey in the Straw” as a redirect to “Zip Coon.” PseudoSkull, which approach would you prefer? (There’s also the four-page copy uploaded per-page here, which might be a good point from which to start.) As a matter of fact, would a portal for the multiple lyrics (“Turkey in the Straw,” “Zip Coon,” and “Nigger Love a Watermelon, Ha! Ha! Ha!”) be appropriate? TE(æ)A,ea. (talk) 01:06, 1 January 2024 (UTC)
- @TE(æ)A,ea.: Wow! I actually wasn't aware that the song was this racially motivated... it reminds me of the "Catch a tiger by his toe" rhyme and its history. Well... In any case, I think a versions page for the original song (whichever is the original) is the solution I'd prefer. At the very least, "Turkey in the Straw", as well as the one with the less than considerate lyrics, can be considered adaptations of the original, I think. And as far as the question of whether to transcribe "Zip Coon" or "Turkey in the Straw", I think (ideally) a version of both should be transcribed; but if I had to pick one, then a "Turkey in the Straw" one, since that's the one that readers may be interested in, in relation to Steamboat Willie. PseudoSkull (talk) 01:23, 1 January 2024 (UTC)
- PseudoSkull: (Funnily enough, despite the racial titles and lyrics, the song really is known primarily for the non-racial “Turkey in the Straw” lyrics. The other lyrics, so far as I can tell, were not published as sheet music, but only as a record.) Given that, I’ll start transcribing the already-uploaded four-page copy on Commons; transclude it at “Turkey in the Straw;” and leave the portal &c. work to you. TE(æ)A,ea. (talk) 01:39, 1 January 2024 (UTC)
- @TE(æ)A,ea.: Thank you. And how do you feel about the second song, "Steamboat Bill" (1910)? PseudoSkull (talk) 08:06, 1 January 2024 (UTC)
- PseudoSkull: (1) The “four pages” on Commons are actually pages 3–6 of another copy, so I uploaded the one I saw on IMSLP. So far, I’ve finished typing up the right-hand line. (2) For “Steamboat Bill,” IMSLP’s scan looks to be of high quality, so I will upload and proofread that copy after I’ve finished up “Turkey in the Straw” here. TE(æ)A,ea. (talk) 16:35, 1 January 2024 (UTC)
- PseudoSkull: “Steamboat Bill” is also now complete. TE(æ)A,ea. (talk) 00:31, 2 January 2024 (UTC)
Return of the Soldier
The File:The return of the soldier (IA returnofsoldier00west2).pdf needs to be localized to Wikisource. It is still under copyright in the UK. --EncycloPetey (talk) 01:21, 13 December 2023 (UTC)
- @EncycloPetey: The book was published in New York in 1918. It is a US work and is in the public domain in the US, which is sufficient for it to qualify for hosting on Commons. Ciridae (talk) 05:25, 13 December 2023 (UTC)
- Commons doesn't see it that way. If it was also published in the UK, and the book is still under copyright in the UK, then it can't be housed there. We've had similar situations before where the copy on Commons was deleted. For example, The Murder on the Links. --EncycloPetey (talk) 05:53, 13 December 2023 (UTC)
- We'd have to do the move (to WS) ourselves, then, as we can't rely on Commons admins to do that for us. The file will simply end up being deleted on Commons. Is there a tool or process for that? Ciridae (talk) 06:40, 13 December 2023 (UTC)
- @Ciridae: Ask a bot-equipped admin. Billinghurst and I have often done this. Xover (talk) 07:16, 13 December 2023 (UTC)
- OK. Thanks for letting me know. Ciridae (talk) 09:31, 13 December 2023 (UTC)
- General comment. Yes, I do it, either by request or when I see it needs doing. Noting though that it's not by bot anymore: the tools at Commons broke, and it was just too freaking ugly and convoluted for a backend bot, to no benefit. So I do it manually now. — billinghurst sDrewth 03:39, 14 December 2023 (UTC)
- @EncycloPetey: This appears to have been first published in February 1918 in two parts in Century Magazine (New York), and then in book form by The Century Company in New York in early March. It was published in the UK in June the same year by Nisbet & Co. When a work is first published in the US, or published in the US within 30 days of first publication, it is a US work for copyright purposes according to Berne and thus also Commons licensing policy. The nationality of the author is immaterial here. Xover (talk) 07:14, 13 December 2023 (UTC)
- @Xover: a tangential question: How do you find out the publication history of any given work, using this one as an example? Ciridae (talk) 09:41, 13 December 2023 (UTC)
- Research. In this case in US and UK newspapers where you can find ads, reviews, notices, etc. that mention the work, and thereby you can infer when publication happened and where. You usually can't get down to less than about a week's precision with any kind of certainty, and not everything will be noticed in newspapers, but often you can find enough to determine coarse limits like in this case. Xover (talk) 12:05, 13 December 2023 (UTC)
- The American publication in February and the British publication in June 1918 would be more than 3 months apart, so would the UK also be considered a "source country" for copyright purposes? The author died in 1983, so her works would remain copyrighted in many countries that reject the rule of the shorter term. --Jusjih (talk) 20:34, 5 January 2024 (UTC)