Wikisource:Scriptorium/Archives/2024-10
Please do not post any new comments on this page.
This is a discussion archive first created in , although the comments contained were likely posted before and after this date. See current discussion or the archives index.
Need help linking to a page on mulWS
I'm trying to link mul:Avesta with our page Avesta, but I can't do it since Wikidata does not recognise mul as a valid language code, and the interlinking option on mulWS itself is broken. Can anyone help? Mårtensås (talk) 21:39, 20 October 2024 (UTC)
- Mårtensås: Try it in the (wikilinks) area at wikidata called "Multilingual sites" which also contains Commons and Species.--RaboKarbakian (talk) 23:00, 20 October 2024 (UTC)
- This section was archived on a request by: —Justin (koavf)❤T☮C☺M☯ 02:25, 21 October 2024 (UTC)
The Jack Smith Brief concerning Jan 6th..
Firstly is it license compatible with Wikisource? Secondly is there a source for it? ShakespeareFan00 (talk) 13:18, 3 October 2024 (UTC)
- ShakespeareFan00: As he is acting in the capacity of government office, the brief is PD-USGov. It can be found here. TE(æ)A,ea. (talk) 13:47, 3 October 2024 (UTC)
- Thanks.
The next consideration would be whether the document contains material by others that falls outside the scope of the PD-USGov license (such as submissions by non-Federal employees included as annexes).
I do think, however, that if it's PD-USGov and within Wikisource scope, it's a document that Wikisource should attempt to transcribe.
ShakespeareFan00 (talk) 16:09, 3 October 2024 (UTC)
Wikisource News
The latest edition of WS:News is out. Please enjoy. You are welcome to unsubscribe from these notifications by removing your name from this list. MediaWiki message delivery (talk) 15:56, 3 October 2024 (UTC)
Wrong licensing again
On the page for The Woman and the Priest, the licenses applied are {{PD/US|1936}} for the original (and this is correct) and {{PD-US-expired}}, which displays incorrectly. It is using the date of death of the original author (which is known) but should not use any date, because the date of death of the translator is not known. Because the translator's date of death is not known, no date was inserted into the template. But because no date of death was supplied, the template is automatically pulling the known date, even though it shouldn't. I thought we had decided to deactivate automatic dates because of similar problems before? --EncycloPetey (talk) 18:00, 3 October 2024 (UTC)
- CalendulaAsteraceae: This was your work. I still don’t understand why everything is being moved to Lua. TE(æ)A,ea. (talk) 18:16, 3 October 2024 (UTC)
- I've disabled Wikidata pulls for creator death years in Module:License Wikidata.
- I can't say I especially want to have the argument about using Lua, but possibly now that the immediate issue has been resolved I should just suck it up and have the argument anyway. (I mean, not right now, since I have to get back to work, but soon.) —CalendulaAsteraceae (talk • contribs) 21:01, 3 October 2024 (UTC)
- That said, maybe the Lua argument should get its own thread? The question of when and whether to use Lua for things is separate from the question of which things it's used for. It is true that I wouldn't have tried to pull death and publication dates from Wikidata if we didn't have Lua modules or similar, because the idea of trying to implement that in wikicode makes me want to claw my own eyes out and never touch a computer again, but like I said, what we do and how we do it are separate questions. —CalendulaAsteraceae (talk • contribs) 04:47, 4 October 2024 (UTC)
Problems with pdf file pages displaying
I am experiencing a problem with PDF file pages displaying (it doesn't seem to be affecting DJVU). If I start to work on any existing PDF, either creating a new page or editing an existing one, the page image displays correctly. However, after two or three pages, the page image no longer displays. The 'transcribe text' tool still returns text. This effect has happened on a range of PDF files, not just one, and with files that have been trouble free up to now.
I have tried a different browser and the same effect occurs. Chrisguise (talk) 07:39, 5 October 2024 (UTC)
- Ongoing discussion about this at WS:S/H#PDF images not loading — Alien 3 3 3 09:17, 5 October 2024 (UTC)
- Thanks. Chrisguise (talk) 09:21, 5 October 2024 (UTC)
Invitation to Participate in Wiki Loves Ramadan Community Engagement Survey
Dear all,
We are excited to announce the upcoming Wiki Loves Ramadan event, a global initiative aimed at celebrating Ramadan by enriching Wikipedia and its sister projects with content related to this significant time of year. As we plan to organize this event globally, your insights and experiences are crucial in shaping the best possible participation experience for the community.
To ensure that Wiki Loves Ramadan is engaging, inclusive, and impactful, we kindly invite you to participate in our community engagement survey. Your feedback will help us understand the needs of the community, set the event's focus, and guide our strategies for organizing this global event.
Survey link: https://forms.gle/f66MuzjcPpwzVymu5
Please take a few minutes to share your thoughts. Your input will make a difference!
Thank you for being a part of our journey to make Wiki Loves Ramadan a success.
Warm regards,
User:ZI Jony 03:19, 6 October 2024 (UTC)
Wiki Loves Ramadan Organizing Team
Sentence case or Title Case for portals
We have Portal:Military Science (Title Case) and Portal:Language and literature (Sentence case). Which should we be standardizing on? I want to create Portal:Baldwin robbery, or should it be Portal:Baldwin Robbery? RAN (talk) 16:40, 7 October 2024 (UTC)
- Just taken a quick skim through Category:Portals. On the whole, the trend is towards sentence case for portals like your proposed creation. Beeswaxcandle (talk) 17:42, 7 October 2024 (UTC)
- For what it's worth, Wikisource:Style_guide#Page_titles prefers sentence case as a "general guideline" but does not explicitly mention namespaces. —Justin (koavf)❤T☮C☺M☯ 18:00, 7 October 2024 (UTC)
- I agree, as I look at more and more, a majority use the Sentence Case. --RAN (talk) 18:05, 7 October 2024 (UTC)
- We've tended to use sentence case, and try to match whatever the Library of Congress Classification uses, but there is not 100% consistency. The main subject headings in The LoCC are in all-caps, so it can't be used as a guide for that level, but it can be used at other levels. We have Portal:Dictionaries and general reference in sentence case, but Portal:Philosophy, Psychology and Religion in title case. The latter would look odd in sentence case, as it is a list of three coordinate items, and I think that is why title case was used in that instance. So the general trend has been sentence case because that's what is usually used by the LoC, but in some situations where sentence case would look odd or misleading, title case gets used. --EncycloPetey (talk) 18:29, 7 October 2024 (UTC)
- I am glad we avoided the ALL CAPS used by some sources, it comes across as shouting. There are so many ways to bring attention to a word without ALL CAPS. --RAN (talk) 21:11, 7 October 2024 (UTC)
Tech News: 2024-41
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Weekly highlight
- Communities can now request installation of Automoderator on their wiki. Automoderator is an automated anti-vandalism tool that reverts bad edits based on scores from the new "Revert Risk" machine learning model. You can read details about the necessary steps for installation and configuration. [1]
Updates for editors
- Translators in wikis where the mobile experience of Content Translation is available, can now customize their articles suggestion list from 41 filtering options when using the tool. This topic-based article suggestion feature makes it easy for translators to self-discover relevant articles based on their area of interest and translate them. You can try it with your mobile device. [2]
- View all 12 community-submitted tasks that were resolved last week.
Updates for technical contributors
- It is now possible for <syntaxhighlight> code blocks to offer readers a "Copy" button if the copy=1 attribute is set on the tag. Thanks to SD0001 for these improvements. [3]
- Customized copyright footer messages on all wikis will be updated. The new versions will use wikitext markup instead of requiring editing raw HTML. [4]
- Later this month, temporary accounts will be rolled out on several pilot wikis. The final list of the wikis will be published in the second half of the month. If you maintain any tools, bots, or gadgets on these 11 wikis, and your software is using data about IP addresses or is available for logged-out users, please check if it needs to be updated to work with temporary accounts. Guidance on how to update the code is available.
- Rate limiting has been enabled for the code review tools Gerrit and GitLab to address ongoing issues caused by malicious traffic and scraping. Clients that open too many concurrent connections will be restricted for a few minutes. This rate limiting is managed through nftables firewall rules. For more details, see Wikitech's pages on Firewall, GitLab limits and Gerrit operations.
- Five new wikis have been created:
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 23:42, 7 October 2024 (UTC)
Keymapping issue help needed.
For a few months now I have been unable to manually enter the following keys in my editing interface. For { I see ̪ and for } I see ˈ . Also < sometimes, but not always, gives some other character (it's working OK at the moment, so I can't tell what character it was displaying).
This isn't a fault with my keyboard because if I open notepad the characters display as expected, and it is not specific to a browser because I see the same issue in edge and firefox, so perhaps it is some issue with my interface / profile settings.
Can someone advise how I can reset my interface to see if that clears the issue? I'd like to retain any customisations I may have made in the past (but can't remember what they would have been) so is some kind of backup possible first? Thanks Sp1nd01 (talk) 19:26, 7 October 2024 (UTC)
- Are you on a Mac? There are language settings on Macs that allow one keyboard to type as if it were a specialist keyboard in another language. I have (for example) sometimes accidentally set my keyboard into Czech mode, which can cause issues like the ones you're describing. How you change the language on a Mac will depend on your current OS and on certain other settings. It may be as easy as clicking on the flag in the top of your monitor window and switching to the flag of your usual language. --EncycloPetey (talk) 19:41, 7 October 2024 (UTC)
- I forgot to mention I use Windows 10, but I have just checked the Language settings and Keyboard Settings and the only Language installed is English so I don't think that is causing the problem unless there is some additional setting hidden somewhere. It's not a major problem for me I just have to do additional copying and pasting when I need to use those characters. Thanks Sp1nd01 (talk) 08:12, 8 October 2024 (UTC)
- Does this happen on any website, or just Wikisource (how about other Wikis)? I don't think the issue is with your user js, but it's possible it might be some combination of the gadgets or your editing preferences you have enabled. Or do you use the same browser extensions on both Edge and Firefox, or programs like WinCompose or AutoHotKey, which can change your input based on which program you're using? --YodinT 09:08, 8 October 2024 (UTC)
- This happens when you use "International Phonetic Alphabet - SIL". Click in the bottom right of the edit field to see the little keyboard icon, or press ctrl-M to return to "Native keyboard". You may also "Disable input tools". // M-le-mot-dit (talk) 09:32, 8 October 2024 (UTC)
- This was always just an issue on Wikisource for me, I had not seen it anywhere else.
- On editing, it was showing the "International Phonetic Alphabet - SIL" option. I had not noticed that option before, so I don't know why it had changed unless I accidentally clicked on it at some time. I have changed it back to "Native keyboard" and all is now working normally again. Many thanks everyone for the assistance! Sp1nd01 (talk) 10:13, 8 October 2024 (UTC)
Redirect
Do we ever have a redirect at Portal:Person when the person is at Author:Person? --RAN (talk) 03:40, 9 October 2024 (UTC)
- No. We do not use cross-namespace redirects. --EncycloPetey (talk) 04:11, 9 October 2024 (UTC)
Shortcuts
Today, I noticed that keyboard shortcuts were available ("ctrl-i" for italics and "ctrl-b" for bold). Are there any other new ones? // M-le-mot-dit (talk) 09:39, 9 October 2024 (UTC)
- ctrl-u yields an underline. Cremastra (talk) 20:15, 9 October 2024 (UTC)
Three part story
I have a three-part story where each third was printed in the paper in successive weeks. It was never published as one piece. Would we stitch together the three parts into one entry, as well as keep the three parts? RAN (talk) 03:26, 9 October 2024 (UTC)
- We just had a discussion about handling this kind of publication. Check the archives for "Serialized works in periodicals (voting)". --EncycloPetey (talk) 04:11, 9 October 2024 (UTC)
- Thanks! I copied the format for the one I asked about. --RAN (talk) 21:37, 14 October 2024 (UTC)
Regex
I'm not managing to add regexes to my editing window as I documented at Wikisource:Regex. The UI is just absent. Any advice? HLHJ (talk) 21:10, 13 October 2024 (UTC)
- I think that installing w:en:User:Ebrahames/Advisor at User:HLHJ/common.js will give you a regex search and replace box. —Justin (koavf)❤T☮C☺M☯ 01:03, 14 October 2024 (UTC)
- Thank you! Just a box? I'd really like the ability to write my own regexes and save them and run them with a single click, which is what the Meta:TemplateScript promises... Has anyone got it to work? HLHJ (talk) 02:18, 14 October 2024 (UTC)
- Have you switched on the gadget in your preferences? Beeswaxcandle (talk) 06:08, 14 October 2024 (UTC)
- ...No. Thank you.
- I have expanded Wikisource:Regex accordingly; it now has step-by-step instructions for making it work. I will add some less trivial cookbook examples if people think the documentation useful.
- To everyone: Regex is like a really advanced search-replace, where you can include patterns like "any number from 1 to 300" or "whatever letters came before the first number in the line" or "all lines containing only one character". Is there any repetitive editing task you'd like to semi-automate? Please, suggest it, and if possible, I'll write up how to do it. HLHJ (talk) 19:31, 14 October 2024 (UTC)
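As a rough illustration of the three patterns described above, here is a minimal Python sketch. The function and constant names are hypothetical, the reading of "letters before the first number" as "the letters immediately preceding the first digit" is an assumption, and the editing gadget itself runs JavaScript regexes, whose syntax for these particular patterns happens to be the same.

```python
import re

# Assumption: a "number" is a whole number written without leading zeros.
# Alternatives cover 300, then 100-299, then 1-99.
NUMBER_1_TO_300 = re.compile(r"\b(?:300|[12]\d\d|[1-9]\d?)\b")

# Assumption: capture the run of letters that sits directly before the
# first digit of the line; \D*? skips any earlier non-digit text.
LETTERS_BEFORE_FIRST_NUMBER = re.compile(r"^\D*?([A-Za-z]+)(?=\d)")

# A line that consists of exactly one character (any character but newline).
ONE_CHARACTER_LINE = re.compile(r"^.$", re.MULTILINE)


def numbers_1_to_300(text):
    """Return every standalone number from 1 to 300 found in text."""
    return [int(m) for m in NUMBER_1_TO_300.findall(text)]


def letters_before_first_number(line):
    """Return the letters directly before the first digit, or None."""
    match = LETTERS_BEFORE_FIRST_NUMBER.search(line)
    return match.group(1) if match else None


def one_character_lines(text):
    """Return the content of every line that holds a single character."""
    return ONE_CHARACTER_LINE.findall(text)


if __name__ == "__main__":
    print(numbers_1_to_300("0 7 42 300 301 1000"))
    print(letters_before_first_number("see ch12, p. 4"))
    print(one_character_lines("a\nbb\n-\n\nccc"))
```

The word boundaries in the first pattern are what keep it from matching the "30" inside "301"; without them the pattern would happily match parts of larger numbers.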
Preliminary results of the 2024 Wikimedia Foundation Board of Trustees elections
Hello all,
Thank you to everyone who participated in the 2024 Wikimedia Foundation Board of Trustees election. Close to 6000 community members from more than 180 wiki projects have voted.
The following four candidates were the most voted:
While these candidates have been ranked through the vote, they still need to be appointed to the Board of Trustees. They need to pass a successful background check and meet the qualifications outlined in the Bylaws. New trustees will be appointed at the next Board meeting in December 2024.
Learn more about the results on Meta-Wiki.
Best regards,
The Elections Committee and Board Selection Working Group
MPossoupe_(WMF) 08:26, 14 October 2024 (UTC)
Tech News: 2024-42
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Updates for editors
- The Structured Discussion extension (also known as Flow) is starting to be removed. This extension is unmaintained and causes issues. It will be replaced by DiscussionTools, which is used on any regular talk page. A first set of wikis are being contacted. These wikis are invited to stop using Flow, and to move all Flow boards to sub-pages, as archives. At these wikis, a script will move all Flow pages that aren't a sub-page to a sub-page automatically, starting on 22 October 2024. On 28 October 2024, all Flow boards at these wikis will be set in read-only mode. [10][11]
- WMF's Search Platform team is working on making it easier for readers to perform text searches in their language. A change last week on over 30 languages makes it easier to find words with accents and other diacritics. This applies to both full-text search and to types of advanced search such as the hastemplate and incategory keywords. More technical details (including a few other minor search upgrades) are available. [12]
- View all 20 community-submitted tasks that were resolved last week. For example, EditCheck was installed at Russian Wikipedia, and fixes were made for some missing user interface styles.
Updates for technical contributors
- Editors who use the Toolforge tool Earwig's Copyright Violation Detector will now be required to log in with their Wikimedia account before running checks using the "search engine" option. This change is needed to help prevent external bots from misusing the system. Thanks to Chlod for these improvements. [13]
- Phabricator users can create tickets and add comments on existing tickets via Email again. Sending email to Phabricator has been fixed. [14]
- Some HTML elements in the interface are now wrapped with a
<bdi>
element, to make our HTML output more aligned with Web standards. More changes like this will be coming in future weeks. This change might break some tools that rely on the previous HTML structure of the interface. Note that relying on the HTML structure of the interface is not recommended and might break at any time. [15]
In depth
- The latest monthly MediaWiki Product Insights newsletter is available. This edition includes: updates on Wikimedia's authentication system, research to simplify feature development in the MediaWiki platform, updates on Parser Unification and MathML rollout, and more.
- The latest quarterly Technical Community Newsletter is now available. This edition includes: research about improving topic suggestions related to countries, improvements to PHPUnit tests, and more.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 21:21, 14 October 2024 (UTC)
Substitution of images
If we have the original image, say from the Library of Congress, would we display that one, cropped the same way, to replace the poor high-contrast scan in a news article entry? RAN (talk) 19:07, 12 October 2024 (UTC)
- Well, imo it can sometimes be done in this way if the only difference is the quality of the scan, i.e. if it is absolutely the same image, cropped in the same way, of the same colours (as the same image reprinted in various books can have small differences in colours caused by different reprinting techniques), etc. Can you give an example? --Jan Kameníček (talk) 19:19, 12 October 2024 (UTC)
- BTW: Images reprinted in newspapers usually have a special "newspaper" low-resolution appearance, which should imo be preserved and such images should not be replaced by "better" images e. g. from books. --Jan Kameníček (talk) 19:23, 12 October 2024 (UTC)
- It has long been the practice to replace lower-quality images with higher-quality originals, where the only reason the lower-quality version is lower-quality is because of printing restraints. This is the case, for example, where an image (originally in color) is printed in black-and-white in a book (or newspaper) without color printing. See, e.g., The Vampire (Summers). TE(æ)A,ea. (talk) 19:48, 12 October 2024 (UTC)
- I am afraid that the link is exactly the example of a bad practice, because the two pictures have been cropped differently. The picture that was added to our transcription is missing parts of bodies of the women on the right. --Jan Kameníček (talk) 19:53, 12 October 2024 (UTC)
- I'd just replaced a very poor overinked impression of a woodblock with a much better impression of the same woodblock from another edition of the book, here. If this is a problem, please let me know. HLHJ (talk) 03:35, 13 October 2024 (UTC)
- This is imo absolutely OK, because the bad quality is not connected with this edition as such, but only with this particular specimen. --Jan Kameníček (talk) 10:06, 13 October 2024 (UTC)
- Great, thanks. Physical impressions of images (aquatints, etchings, etc.) all tend to deteriorate with number of impressions, and are sometimes altered quite significantly when they get retouched. And I've seen 21st-century reprints which completely mess up the original images, reprinting high-res glossy photo plates at 300dpi, or with glaringly unoverlookable compression artifacts, so that you can't actually see the features described in the captions. I've even seen modern books that managed to overtrim their margins and chop off the edges of the misaligned text. Obviously, where possible, we should start from a good printing job! HLHJ (talk) 17:37, 13 October 2024 (UTC)
- As the uploader of the image in question, I am not happy with it either. However, the image is protected in the EU, so options for locating copies were limited, and I used what was available. If a superior copy becomes available, it should be used instead. --EncycloPetey (talk) 20:05, 15 October 2024 (UTC)
- File:Louis_Julius_Freudenberg_I_(1894–1918)_composite_obituary_in_the_Hudson_Observer_on_November_22,_1918.jpg versus File:Louis Julius Freudenberg I (1894–1918) obituary in the Hudson Observer on November 22, 1918.jpg I have transcribed the article already, but not formatted. It should always be mentioned in the notes section that the image has been replaced with a higher resolution version. --RAN (talk) 03:48, 13 October 2024 (UTC)
- I am personally not really happy about similar replacements because of 1) the frame of the picture missing 2) the dotted newspaper structure of the image missing and 3) overall much better original reproduction of the image than the original reproduction in the newspaper which gives the reader a false impression that the original newspaper reproduction was better than it was. For example the contrast of the image in the newspaper was not of such a good quality as the replacement and I believe we should not improve original publications. However, given the very bad quality of the scan too, I think it could be temporarily accepted until somebody finds a better scan (if somebody ever finds a better scan at all...). --Jan Kameníček (talk) 10:01, 13 October 2024 (UTC)
- Thinking about it again, keeping the original newspaper image seems a better solution to me. However, if you still decide on the replacement, I could live with it :-) --Jan Kameníček (talk) 10:11, 13 October 2024 (UTC)
- I just wanted to know for the future. I will keep it as is for now. All the original photos for the WWI New Jersey War dead are available from the state archive, but it would be a lot of work to scan them. I first have to create Wikidata entries for the people. --RAN (talk) 01:15, 14 October 2024 (UTC)
- One specific example that is routinely replaced is images of signatures. —Justin (koavf)❤T☮C☺M☯ 01:17, 14 October 2024 (UTC)
- Not a good practice imo either. A person's signature usually changes a bit with time, and so replacing a signature with its older or newer version is quite misleading too. It is always much better to extract its actual version from the document. --Jan Kameníček (talk) 05:55, 14 October 2024 (UTC)
- In the only example I can find, it was swapped out for the svg version of the same signature from the same source from the same date. --RAN (talk) 21:34, 14 October 2024 (UTC)
OCR rotated?
I can rotate pages to proofread them, but the OCR always runs on the unrotated page. Could it be made to run with the page oriented as in the display, please? It would save effort on pages where all the text runs vertically. HLHJ (talk) 02:20, 14 October 2024 (UTC)
- With the toolbar enabled, in the upper right-hand corner just above the scan page, there are buttons that will zoom and rotate. With the toolbar disabled, there are key combinations that will rotate it, but I have only ever done this accidentally and have forgotten which key combination does it. It would be handy to remember it.--RaboKarbakian (talk) 10:56, 14 October 2024 (UTC)
- Sorry, right answer to a different question. The way I handled this was to upload a rotated page to Commons, and then manually put the address (of the original image, linked under the thumbnail at Commons) into the OCR tool at the web site. The web site can be found by choosing "Advanced options" or some such when using the OCR button. At the OCR page, they have crop options but not rotate, unfortunately.--RaboKarbakian (talk) 11:00, 14 October 2024 (UTC)
- About the forgotten keys, when you click on the facsimile, you may use
- r rotate cw.
- R rotate ccw.
- s pan up
- w pan down
- a pan right
- d pan left
- f mirror horiz.
- M-le-mot-dit (talk) 13:06, 14 October 2024 (UTC)
- That is very ingenious. I mean no offense when I call it a kludge. But for many applications, transcribing the text by hand without any OCR would be faster. It seems like changing the software to rotate the image and then send it to the OCR would save a lot of human time.
- Actually, why don't we have an OCR that generates markup and Wikisource templates (and links to Commons images auto-uploaded using ws-image-uploader) instead of just plain text? We could feed it the raw images, the image metadata, and the OCR text; we have a good body of training data. The Transkribus software all seems to be under compatible licenses. Shall we ask the WMF for this? If we could double or triple our digitization speed, it would be well worth it.
- I have just been formatting a TOC with {{dotted TOC line}} templates, and it was very repetitive and took about an hour, and nearly as much again trying to get the syntax working. And it's still indenting in the wrong direction and I've probably messed it up slightly in several other ways. It's the sort of thing software could do better than I can. HLHJ (talk) 02:56, 15 October 2024 (UTC)
- That Table of Contents looked very good! After accomplishing a few of those, I suspect you should be able to regex them. There are a lot of TOCs that are very similar, and a software solution could be found, for sure, because wherever regex is, software can go there and do that. Some TOCs and, indeed, many of the other tables (like those found in the scientific journals) are not so cut and dried as most tables of contents. The problem with a software solution is that the art of making tables will be lost and those tables without a software solution might never get authored. I fear a world where the robot overlords are not so much evil rule keepers, but are the only things that can do things and the poor humans cannot break out of their cookie cutter lives and evolve. I like doing tables, personally, and actually no. I don't like doing tables so much as having it done; so grumpy in the middle and such.
- Bad OCR vs typing: it was easier to type in patents than it was to fix the OCR, but I am a fast typist. The bad OCR was due to the bad scans, and more current OCR did not help much. My guess is that an OCR output that also contains layout guesses will make Validation more like editing a work and less like perfecting a work. I heard of complaints from Gutenberg that when OCR got really good, it became boring for proofers. They don't do layout at the same time so it is different there. I like good OCR, mostly because here the layout can be done at the same time. Proofing is a task for nit pickers and without the nits, the pickers get bored.
- The kudos on your TOC were real, as it was not even one of the most usual of them. You might consider something from Category:Texts with missing tables to hone your skill and craft your regex and consider software solutions (yes or no). Sorry to ramble on.--RaboKarbakian (talk) 13:42, 15 October 2024 (UTC)
- Thank you for the kudos! I will certainly be adding some regex cookbook entries on TOCs and ordinary tables. I've also noticed that some tables are pseudo-tables, with the body actually only being one row, which would make it pretty impossible to read across a row if you are using a screenreader, so I may write something to convert that, or convert the tab- or space- or comma- separated table you can readily get by copy-pasting a pseudotable into Gnumeric and out into plain text again.
On the philosophical front, I am not worried about robots becoming the only ones who can do things, because history.
Spinning enough thread to make one suit of clothes per person per year used to be a full-time job for half the adult population, assisted by a lot of child labour, everywhere where clothing was required. The spinning wheel more than tripled the labour efficiency of spinning. The industrial revolution's backbone was the automation of spinning, which now takes up a comparatively negligible portion of the world's labour.
- I am sure there are some people who would find better OCR tools spoilt the fun. But
- They don't have to use them.
- There are also people who like the result more than the process, and they will do more if it costs them less time and tedium.
- There are people who like a lot of things about the process, but not the bits the better tools could take over.
- I'm in that last cat, but also sometimes in the second! HLHJ (talk) 18:31, 16 October 2024 (UTC)
- About TOCs, don't use {{dotted TOC line}}, {{dotted TOC page listing}}, and some others, because they make a separate HTML table for each line, resulting in unnecessarily huge output. Rather, use {{TOC begin}}, {{TOC end}} and the various TOC rows (listed at TOC begin).
- Also, never link TOCs to pagespace, but always to mainspace, because it's from the TOC that readers will access, from the root mainspace page, its subpages. I haven't done that, as I'm not actively working on that book and I don't know what titles things should have.
- About OCR, training an OCR model is a lot of work. We'd also have to do something very different, as normal OCR engines have much trouble recognizing things that are obvious to humans. Take for example font size. On pages like this one, the OCR engine, which is fed the text page by page, has nothing to compare to, so can't differentiate it from a book just printed in small type. — Alien 3 3 3 14:55, 15 October 2024 (UTC)
- Rather than using either TOC template set, I've been using a plain table and index styles to format my TOCs. That's the option I've found the best so far, both because it keeps the amount of wikicode down (fundamentally, the TOC row series is a wrapper for table row syntax with some classes and styles attached), and because all the formatting can be done in one place -- also, did you know you can do dot leaders in pure CSS? Arcorann (talk) 23:12, 15 October 2024 (UTC)
- Last time I checked, dot leaders had been planned by the W3C but had not actually been implemented by browsers. — Alien 3 3 3 05:38, 16 October 2024 (UTC)
- It seems to still be so, as "leader('.')" is filtered out by Firefox as an invalid value for content. — Alien 3 3 3 05:59, 16 October 2024 (UTC)
- This is interesting, but since a newbie cannot be expected to know all this, nor can it be gleaned from the template docs, is there something like Help:Table of contents (formatting)?
- I am absolutely not suggesting that we train an OCR from scratch; that would indeed be a lot of work. But the Transkribus OCR is under a wiki-compatible license (as, I trust, are the others?). I'm suggesting that we add markup capabilities: adding {{c|}} around centered text, scanning two columns of text sequentially rather than cross-cutting them, scanning tables into Mediawiki tables, and so on.
- It seems like it would be possible to set a character height on the scanned images as standard, cross-page, and that would let a program determine if the text was larger, xx-larger, xxx-smaller, etc. Computers are better at measuring large numbers of similar things quickly and precisely; I'm happy to do the things that require human judgement, and troubleshoot, because those IMO are the interesting bits :). HLHJ (talk) 18:38, 16 October 2024 (UTC)
- (Sorry for the docs issue, we do have a big one, but eh)
- About the best current thing for issues like that is w:Document Layout Analysis, but that only detects the placement of blocks of text.
- The question, then, is how do you define centered? You could say it's "equal margins on either side", but then, margins of what? Lines? If so, OCR often breaks up lines, so that wouldn't get you anywhere. Blocks? Then it'd be adding {{c}} to poems, for example. Or even suppose perfect line detection: images are always cropped, not always equally; the original "natural" margins present on either side of the text aren't necessarily equal; in multiple-column text there are multiple centers; &c. So the margins aren't a good guide. It looks like a simple issue, but it is often a headache (even for humans; I for one have often encountered cases where you can't really tell if something is centred or not). — Alien 3 3 3 19:37, 16 October 2024 (UTC)
- Thank you for the link, I hadn't seen that! Adding c to poems wouldn't be too bad. Easy to replace manually. Just something basic that would do the headers and page numbers would be useful. Or something that let us manually select rectangular areas (as with Commons image annotations, which see little use) and OCR them in sequence, that would do multicolumn text. We could even label areas as images and not OCR those. A modified manually-set-x-by-y-grid version would do tables. HLHJ (talk) 23:19, 16 October 2024 (UTC)
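As a rough illustration of the plain-table approach to TOCs mentioned earlier in this thread, combined with the advice to link TOC entries to mainspace rather than pagespace, a sketch might look like the following. The work title "An Example Book", the chapter titles, the page numbers and the class name example-toc are all hypothetical; the class would need matching rules in the index's styles, which are not shown here.

```wikitext
{| class="example-toc"
|-
| I. || [[An Example Book/Chapter 1|The Beginning]] || 1
|-
| II. || [[An Example Book/Chapter 2|The Middle]] || 15
|-
| III. || [[An Example Book/Chapter 3|The End]] || 42
|}
```

Because the whole TOC is a single table, each entry is one row rather than one table per line, which is the output-size concern raised about the dotted TOC templates.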
- Have you considered making a Phabricator request re: rotating the image on the advanced options page? Arcorann (talk) 23:18, 15 October 2024 (UTC)
- That's a good idea, I'll try to get around to it, unless someone else does. :). HLHJ (talk) 14:56, 16 October 2024 (UTC)
- It's been done: task T87017. Pinging User:ShakespeareFan00, who made the request. HLHJ (talk) 18:33, 16 October 2024 (UTC)
Seeking volunteers to join several of the movement’s committees
Each year, typically from October through December, several of the movement’s committees seek new volunteers.
Read more about the committees on their Meta-wiki pages:
Applications for the committees open on 16 October 2024. Applications for the Affiliations Committee close on 18 November 2024, and applications for the Ombuds commission and the Case Review Committee close on 2 December 2024. Learn how to apply by visiting the appointment page on Meta-wiki. Post to the talk page or email cst@wikimedia.org with any questions you may have.
For the Committee Support team,
-- Keegan (WMF) (talk) 23:09, 16 October 2024 (UTC)
Tech News: 2024-43
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Weekly highlight
- The Mobile Apps team has released an update to the iOS app's navigation, and it is now available in the latest App store version. The team added a new Profile menu that allows for easy access to editor features like Notifications and Watchlist from the Article view, and brings the "Donate" button into a more accessible place for users who are reading an article. This is the first phase of a larger planned navigation refresh to help the iOS app transition from a primarily reader-focused app, to an app that fully supports reading and editing. The Wikimedia Foundation has added more editing features and support for on-wiki communication based on volunteer requests in recent years.
Updates for editors
- Wikipedia readers can now download a browser extension to experiment with some early ideas on potential features that recommend articles for further reading, automatically summarize articles, and improve search functionality. For more details and to stay updated, check out the Web team's Content Discovery Experiments page and subscribe to their newsletter.
- Later this month, logged-out editors of these 12 wikis will start to have temporary accounts created. The list may slightly change - some wikis may be removed but none will be added. Temporary account is a new type of user account. It enhances the logged-out editors' privacy and makes it easier for community members to communicate with them. If you maintain any tools, bots, or gadgets on these 12 wikis, and your software is using data about IP addresses or is available for logged-out users, please check if it needs to be updated to work with temporary accounts. Guidance on how to update the code is available. Read more about the deployment plan across all wikis.
- View all 33 community-submitted tasks that were resolved last week. For example, the South Ndebele, Pannonian Rusyn, Obolo, Iban and Tai Nüa Wikipedia languages were created last week. [16][17][18][19][20]
- It is now possible to create functions on Wikifunctions using Wikidata lexemes, through the new Wikidata lexeme type launched last week. When you go to one of these functions, the user interface provides a lexeme selector that helps you pick a lexeme from Wikidata that matches the word you type. After hitting run, your selected lexeme is retrieved from Wikidata, transformed into a Wikidata lexeme type, and passed into the selected function. Read more about this in the latest Wikifunctions newsletter.
Updates for technical contributors
- Users of the Wikimedia sites can now format dates more easily in different languages with the new {{#timef:…}} parser function. For example, {{#timef:now|date|en}} will show as "26 December 2024". Previously, {{#time:…}} could be used to format dates, but this required knowledge of the order of the time and date components and their intervening punctuation. #timef (or #timefl for local time) provides access to the standard date formats that MediaWiki uses in its user interface. This may help to simplify some templates on multi-lingual wikis like Commons and Meta. [21][22]
- Commons and Meta users can now efficiently retrieve the user's language using {{USERLANGUAGE}} instead of using {{int:lang}}. [23]
- The Product and Tech Advisory Council (PTAC) now has its pilot members with representation across Africa, Asia, Europe, North America and South America. They will work to address the Movement Strategy's Technology Council initiative of having a co-defined and more resilient technological platform. [24]
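As a small illustration of the difference described in the #timef item above, the following wikitext sketch contrasts the older #time call, which needs an explicit format string, with #timef. The format string j F Y is supplied here only as an example, and the "both" format name with the French language code in the last call is an assumption not taken from the announcement; both should be checked against the parser function documentation before use.

```wikitext
<!-- Older approach: the editor supplies the component order and punctuation -->
{{#time:j F Y|now|en}}

<!-- Newer approach: MediaWiki picks the standard date format for the language -->
{{#timef:now|date|en}}

<!-- Assumed: "both" requests the date and time together, here in French -->
{{#timef:now|both|fr}}
```

The point of the newer form is that a template no longer has to hard-code a separate format string for each language it supports.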
In depth
- The latest quarterly Growth newsletter is available. It includes: an upcoming Newcomer Homepage Community Updates module, new Community Configuration options, and details on new projects.
- The Wikimedia Foundation is now an official partner of the CVE program, which is an international effort to catalog publicly disclosed cybersecurity vulnerabilities. This partnership will allow the Security Team to instantly publish common vulnerabilities and exposures (CVE) records that are affecting MediaWiki core, extensions, and skins, along with any other code the Foundation is a steward of.
- The Community Wishlist is now testing machine translations for Wishlist content. Volunteers can now read machine-translated versions of wishes and dive into discussions even before translators arrive to translate content.
Meetings and events
- 24 October - Wiki Education Speaker Series Webinar - Open Source Tech: Building the Wiki Education Dashboard, featuring Wikimedia interns and a Web developer in the panel.
- 20–22 December 2024 - Indic Wikimedia Hackathon Bhubaneswar 2024 in Odisha, India. A hackathon for community members, including developers, designers and content editors, to build technical solutions that improve contributors' experiences.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 20:52, 21 October 2024 (UTC)
I added Show Boat as a book to Wikisource
Please help me format it. Thankfully it's in the public domain because it was published in 1926. Blahhmosh (talk) 04:18, 21 October 2024 (UTC)
- I added a few others. Please proofread. Blahhmosh (talk) 04:35, 21 October 2024 (UTC)
- Also, is the Amerika: The Missing Person translation in the public domain? I know the original German version is, but the American one was published in 1938 or something. Blahhmosh (talk) 04:50, 21 October 2024 (UTC)
- Also, how does "Hunting for Hidden Gold" work? How do you just record the original version as well as the other versions? Blahhmosh (talk) 04:52, 21 October 2024 (UTC)
- Note that Wikisource no longer accepts second-hand transcriptions, e.g. from Project Gutenberg. All new works must be proofread to a source text. See Wikisource:What_Wikisource_includes#Second-hand_transcriptions. For example, Show Boat is available to proofread here Index:Show boat - 1926.djvu and An American Tragedy here: Index:An American Tragedy Vol 1.pdf. MarkLSteadman (talk) 05:12, 21 October 2024 (UTC)
- I see. How do I submit new PDF files for transcription? Blahhmosh (talk) 07:12, 21 October 2024 (UTC)
- You can upload them to Wikimedia Commons there, and then follow the procedure described over there. — Alien 3 3 3 07:30, 21 October 2024 (UTC)
- So are digital versions of the books pdf banned? Blahhmosh (talk) 08:02, 21 October 2024 (UTC)
- If that's your question (I'm not sure I understood), PDF isn't banned, it's just that it works less well, and has many issues, so DjVu is more convenient. You can perfectly well upload a PDF and work on it, it's your choice. — Alien 3 3 3 08:10, 21 October 2024 (UTC)
- No, I'm saying sometimes the PDF doesn't contain the actual images of the book and instead contains just plain text of the book in a standard font (Times New Roman, Arial, Courier, etc.) that isn't the font used in the original book. Should we use those? Blahhmosh (talk) 13:46, 21 October 2024 (UTC)
- No, "source texts" here mean a scan of the physical edition of that book. — Alien 3 3 3 13:47, 21 October 2024 (UTC)
- What if it's the physical edition of that book but you can copy the text of the book? Blahhmosh (talk) 13:54, 21 October 2024 (UTC)
- Source text means that it is proofread against a known published version, which is almost always a physical copy (or for recent government works, a PDF release). Although scans are strongly preferred, they are not strictly required per policy; what is required is a clear record of what the source is. Also given the ubiquity of high-quality digital cameras, it shouldn't be too hard to image the pages (even if not able to process them) so that there is a record for someone in the future. MarkLSteadman (talk) 04:31, 22 October 2024 (UTC)
- Re taking a physical work --> typing it up --> releasing it as say a PDF on say Internet Archive --> retranscribing it at WS, that typically counts as secondhand / self-published. Several reasons why they are problematic as they effectively create a new "edition": 1. They can introduce issues with any omissions / decisions / additions causing divergences with the physical work, even if just in things like pagination 2. editor's copyright not being released / verified 3. source text uncertainty, are any divergences in the text caused by lack of record in exactly which is the source edition, which may itself introduce copyright concerns. 4. They typically end up being duplicative of a future proofread against the scans version of the text anyways (which then can be used for referencing page numbers, authority control, source text comparison etc.), and will be deleted. MarkLSteadman (talk) 05:02, 22 October 2024 (UTC)
'Wikidata item' link is moving, finally.
Hello everyone, I previously wrote on the 27th September to advise that the Wikidata item sitelink will change places in the sidebar menu, moving from the General section into the In Other Projects section. The scheduled rollout date of 04.10.2024 was delayed due to a necessary request for Mobile/MinervaNeue skin. I am happy to inform that the global rollout can now proceed and will occur later today, 22.10.2024 at 15:00 UTC-2. Please let us know if you notice any problems or bugs after this change. There should be no need for null-edits or purging cache for the changes to occur. Kind regards, -Danny Benjafield (WMDE) 11:29, 22 October 2024 (UTC)
I messed up a title
When I made the page Index:The Last Post (1928), it should've been Index:The Last Post (1928).pdf. What do I do? Blahhmosh (talk) 18:33, 22 October 2024 (UTC)
- I've moved it for you. It's easiest if an admin does this because we can suppress the automatic redirect that would happen if you moved it. Beeswaxcandle (talk) 18:42, 22 October 2024 (UTC)
- I see. Also, based on the nature of the .pdf file, is it valid for Wikisource? Blahhmosh (talk) 18:45, 22 October 2024 (UTC)
- It is, again, a second-hand transcription, as it says on page 3: "This ebook is the product of [...] Standard Ebooks", "based on a transcription by Faded Page Canada", so no. What would be valid in this case, for example, would be to go get the original page scans from Google Books (mentioned still on page 3). — Alien 3 3 3 18:50, 22 October 2024 (UTC)
Where to start and end a book's file
I've noticed that traditionally, files of books are assembled from cover to cover. I'm considering deviating from this tradition by creating a djvu file of this hathitrust book that starts from the first page that contains transcribable material and ends at the last page containing such. So I wonder if there is any purpose to having empty pages at the start and end besides to act as a placeholder. Is there a need to preserve a book's integrity that necessitates having the entire book? Prospectprospekt (talk) 23:43, 22 October 2024 (UTC)
- Having all pages from cover to cover shows that nothing was omitted. If you start chopping off pages, then how can the readers be sure that you haven't arbitrarily decided to delete something important? Such as the toc or preface or errata notes or anything else. Some books include advertisement pages of questionable value, but if you remove them, then this would look suspicious. --Ssvb (talk) 05:54, 23 October 2024 (UTC)
- To me, the empty pages in themselves aren't of much interest (though I'd leave them anyway for integrity), but it's much easier to check if something else has been removed in violation of WS:NPOV if they have been left there. If the empty pages have been left, verification is as simple as checking the number of pages, but if we remove them, it gets much more complicated, and it's not even just subtracting the number of empty pages at the beginning and end, as some may also remove the backs of plates, so either way you have to check all the pages. — Alien 3 3 3 06:59, 23 October 2024 (UTC)
- Agree, empty pages should not be removed from the scans of the book. Besides the reasons above I will add some more: 1) the scans uploaded to Commons serve not only Wikisource but anybody, and I can imagine that somebody might like to create an exact facsimile of the original publication including the cover etc., and so they would miss the omitted pages then. 2) Although we do not transcribe e.g. the library tags attached to books, for somebody it might be useful to know in which library this particular specimen was stored, so we should not cut it off from the scan. --Jan Kameníček (talk) 09:20, 23 October 2024 (UTC)
- Prospectprospekt: That file is somewhat unusual. The actual, printed item is /3 to /52; /1, /2, /53, and /54 are all a cover which was added to the pamphlet by the library which owns the item. In this case, properly, those four pages should be excluded; but I generally do not exclude them because I do not think it is necessary to do so. Jan Kameníček, does it change your opinion to know that the covers (of this work) were not original to the publication? TE(æ)A,ea. (talk) 17:43, 23 October 2024 (UTC)
- Well, there still remains my "library argument", which, I admit, is not too strong, so although I personally would not remove these pages, I would not object too much if somebody else would. --Jan Kameníček (talk) 17:50, 23 October 2024 (UTC)
Manual news article aggregation (manual indexing) versus automatic news article aggregation (automatic indexing)
See: Jersey Journal (manually curated, always missing entries) versus The Washington Post (newspaper) (automatically curated, always complete) to see the difference. I have identified at least 6 different ways that news articles are manually aggregated, in formats ranging from calendars to various table layouts to lists by year. Is there a hard rule that prevents us from having both manual and automatic curation? The best analogy would be Commons, which has Commons:Category:Abraham Lincoln for automatic aggregation and Commons:Abraham Lincoln for manual aggregation. I don't see why we cannot have both methods to satisfy both needs. We could have Portal:Jersey Journal or Periodical:Jersey Journal for manual indexing and a link to Jersey Journal for the automated list, just as is done at Commons; or, we could have The Jersey Journal versus Jersey Journal with one automatic and the other hand curated, and a link between the two. A third option would be a hybrid where both appear on the same page like here: New York Tribune. RAN (talk) 17:12, 23 October 2024 (UTC)
HELP IS NEEDED ON POPULAR SCIENCE MONTHLY
Due to formatting issues and other problems. Help would be appreciated due to its gargantuan size. Booklover09097 (talk) 19:26, 18 October 2024 (UTC)
- Can you clarify? —Justin (koavf)❤T☮C☺M☯ 23:38, 18 October 2024 (UTC)
- (Moved from Wikisource:News/2024-10 where it was placed incorrectly. --Jan Kameníček (talk) 22:00, 18 October 2024 (UTC))
- This project, being very close to my heart, begs the question. What are you talking about, and what are the specific issues? — ineuw (talk) 04:20, 24 October 2024 (UTC)
how accurate is transcribe?
Whenever I'm on a page and the text is juttery and clunky, I press transcribe and the text looks pretty good. But how accurate is the transcribe button in relation to the text? Booklover09097 (talk) 09:29, 19 October 2024 (UTC)
- That is mostly dependent on the quality of the image. With lower resolutions, it gets garbled.
- It also depends on the OCR engine used. For on-site OCR, Google OCR is the best one for character accuracy, although it has trouble with columns.
- What you see when you create a page is the OCR that was embedded in the file before upload. The best engine off-site is probably Tesseract. The people who did that might have not used the best OCR (or it was not available at the time).
- Even though it is sometimes pretty good, there is no guarantee of accuracy, and editors are expected to check. — Alien 3 3 3 09:51, 19 October 2024 (UTC)
- @Booklover09097: Which book are you talking about? There are a lot of PDF files exported from https://archive.org (IA), which have been optimized for extreme size reduction at the expense of their quality. But thankfully IA typically also has the original high quality images of the scanned book pages for download, and they can be used to create better-quality PDF or DjVu files. --Ssvb (talk) 13:10, 22 October 2024 (UTC)
- In my experience with IA and familiarity with OCR technology, five factors influence the quality: image clarity, scanning method, scanning equipment, scanning software, and optics. Many early documents at IA were scanned manually. With automation, the technical particulars became available on the IA download page.
- One additional note. Initial IA scanning equipment used one OCR software for English, and another for accented Latin languages. Since English academic documents reference other languages, look closely before applying Tesseract OCR. It is not always wanted. — ineuw (talk) 05:12, 24 October 2024 (UTC)
Tech News: 2024-44
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Updates for editors
- Later in November, the Charts extension will be deployed to the test wikis in order to help identify and fix any issue. A security review is underway to then enable deployment to pilot wikis for broader testing. You can read the October project update and see the latest documentation and examples on Beta Wikipedia.
- View all 32 community-submitted tasks that were resolved last week. For example, Pediapress.com, an external service that creates books from Wikipedia, can now use Wikimedia Maps to include existing pre-rendered infobox map images in their printed books on Wikipedia. [25]
Updates for technical contributors
- Wikis can use the Guided Tour extension to help newcomers understand how to edit. The Guided Tours extension now works with dark mode. Guided Tour maintainers can check their tours to see that nothing looks odd. They can also set emitTransitionOnStep to true to fix an old bug. They can use the new flag allowAutomaticBack to avoid back-buttons they don't want. [26]
- Administrators in the Wikimedia projects who use the Nuke Extension will notice that mass deletions done with this tool have the "Nuke" tag. This change will make reviewing and analyzing deletions performed with the tool easier. [27]
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 20:56, 28 October 2024 (UTC)
I'm kind of new to wikipedia and need some help
I would like to know how to better edit this code and what exactly it is for:
{{header
| title = {{subst:}}
| author =
| section =
| previous =
| next =
| year =
| notes =
}}
WikiEducationalVol (talk) 03:19, 27 October 2024 (UTC)
- Hi @WikiEducationalVol: It seems that you might be confusing us for Wikipedia. We are actually not Wikipedia—we are their sister site Wikisource.
- You recently submitted an article about "Adolescence", which I deleted because it is not within our project scope. We host a collection of transcriptions of already-existing texts, mostly old books and government documents. But your article about the definition of adolescence might be better suited for Wikipedia itself or maybe even Wikibooks. Try those communities next. SnowyCinema (talk) 04:45, 27 October 2024 (UTC)
- to answer your question, this is the header template for completed transcribed works, for example Beethoven (Rolland). we have a works namespace rather than article namespace. it is backed by a side by side transcription stitched together at an index page, for example Index:Rolland_-_Beethoven,_tr._Hull,_1927.pdf. --Slowking4 ‽ digitaleffie's ghost 00:09, 28 October 2024 (UTC)
- @WikiEducationalVol: If you're used to the Wikipedia way of thinking, try thinking of this template as a sort of little infobox for each book or article.
- The title parameter is for the full title of the work as originally published. (This might be different from the title of the page in some instances!)
- The author parameter is for the author of the work.
- The year parameter is for the year the work was originally published.
- The section parameter is for the title of the chapter (if you are working on a chapter subpage; don't fill this out on the main page).
- The previous parameter holds a wikilink to the previous chapter, so you can go back to the previous chapter if you want.
- Similarly, the next parameter holds a wikilink to the next chapter, so you can skip ahead.
- Lastly, the notes parameter holds any other info you might want the reader to know. This might be a brief summary of the work, or a comment describing how the formatting of this version differs from that of the original.
- Duckmather (talk) 21:06, 29 October 2024 (UTC)
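To make the parameter list above concrete, here is a sketch of how the header might be filled in on a chapter subpage. The work, author, year and chapter names are entirely hypothetical, and the relative-link style ([[../]] for the parent page, [[../Chapter 1/]] for a sibling subpage) is an assumption about common practice rather than something stated in this thread; the template documentation is the place to confirm it.

```wikitext
{{header
 | title    = [[../]]
 | author   = Jane Doe
 | section  = Chapter 2
 | previous = [[../Chapter 1/]]
 | next     = [[../Chapter 3/]]
 | year     = 1900
 | notes    = Hypothetical example; footnotes have been moved to the end of the chapter.
}}
```

On the main page of the work, the section, previous and next parameters would normally be left empty, as noted in the list above.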
Scans are now migrated to the talk page
Scans are now migrated to the talk page, when did that start? See: Talk:The_Indianapolis_News/1937/4_American_Pilots_Quit_Spanish_War_as_Loyalists_Fail_to_Pay
Also, is every news article here at Wikisource supposed to get an entry at Wikidata? RAN (talk) 20:24, 28 October 2024 (UTC)
- @Jan.Kamenicek: Any comments? SnowyCinema (talk) 01:51, 29 October 2024 (UTC)
- If you do not want the scan to appear on the text page anymore, the best thing would be to create a Wikidata entry for each news article and the image will appear there and we will link to the text here from Wikidata, the link will then appear in the upper right corner. See for instance: Wikidata:Q86172138 --RAN (talk) 05:02, 29 October 2024 (UTC)
- Main namespace is supposed to contain the transcribed text, sometimes accompanied by original illustrations of the text. The best place for scans is the page namespace. It is redundant to have both the transcribed text and the scan in the mainspace page. It is not being done with other works and there is no reason why it should be done with news articles. Such practice is not supported by any of our rules or help pages. For example Help:Digitising texts and images for Wikisource#Images and illustrations contradicts this approach by stating that images should be extracted from the work and uploaded as separate files, not like here. Proper work with scans is described at Help:Proofread, work with .jpg scans is described at Help:Index pages#Using individual image files. I have moved thumbs of a few such scans to the talk pages so that they are not lost if anybody wants to use them for proper scanbacking. As for Wikidata entries, they are not required but their creation is certainly supported. --Jan Kameníček (talk) 11:23, 29 October 2024 (UTC)
- BTW: One more thing should be said, and that is general appreciation for the work with transcribing interesting and useful news articles. --Jan Kameníček (talk) 11:30, 29 October 2024 (UTC)
Final Reminder: Join us in Making Wiki Loves Ramadan Success
Dear all,
We’re thrilled to announce the Wiki Loves Ramadan event, a global initiative to celebrate Ramadan by enhancing Wikipedia and its sister projects with valuable content related to this special time of year. As we organize this event globally, we need your valuable input to make it a memorable experience for the community.
Last Call to Participate in Our Survey: To ensure that Wiki Loves Ramadan is inclusive and impactful, we kindly request you to complete our community engagement survey. Your feedback will shape the event’s focus and guide our organizing strategies to better meet community needs.
- Survey Link: Complete the Survey
- Deadline: November 10, 2024
Please take a few minutes to share your thoughts. Your input will truly make a difference!
Volunteer Opportunity: Join the Wiki Loves Ramadan Team! We’re seeking dedicated volunteers for key team roles essential to the success of this initiative. If you’re interested in volunteer roles, we invite you to apply.
- Application Link: Apply Here
- Application Deadline: October 31, 2024
Explore Open Positions: For a detailed list of roles and their responsibilities, please refer to the position descriptions here: Position Descriptions
Thank you for being part of this journey. We look forward to working together to make Wiki Loves Ramadan a success!
Warm regards,
The Wiki Loves Ramadan Organizing Team 05:11, 29 October 2024 (UTC)
Commision/Commission bug
If you look at Van Cise exhibits to the Commission on Industrial Relations regarding Colorado coal miner's strike and click the "Source" button, you will get to Index:Van Cise exhibits to the Commission on Industrial Relations regarding Colorado coal miner's strike.djvu, which doesn't exist. The actual index is at Index:Van Cise exhibits to the Commision on Industrial Relations regarding Colorado coal miner's strike.djvu (note the missing "s" in "Commision" [sic]). There's a similar problem on some of the pages. For example, clicking the up arrow on Page:Van Cise exhibits to the Commision on Industrial Relations regarding Colorado coal miner's strike.djvu/1 also leads you to the nonexistent "Commission" index page.
I see two ways out of this:
- Move the file, the index page, and all individual pages to use the "Commission" spelling, and make sure that no pages are still using the old "Commision" spelling.
- Use the "Commision" spelling, and somehow fix all the redlinks (maybe they're due to ProofreadPage going wonky somehow?).
Duckmather (talk) 21:01, 29 October 2024 (UTC)
- The problem was caused in Commons about 2 years ago when User:Armbrust moved the file to the new name without taking care of our index page. I have moved the index and all the individual pages to the new title so now it should be fixed. --Jan Kameníček (talk) 22:12, 29 October 2024 (UTC)
- @Jan Kameníček: Thanks! Now I can get back to validating it. Duckmather (talk) 02:27, 30 October 2024 (UTC)
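The first option listed above (moving everything to the "Commission" spelling) amounts to a title rewrite over the Index: page and each Page: subpage. A minimal Python sketch of that mapping follows. The function names `corrected_title` and `move_plan` are hypothetical, the page count of 3 is an arbitrary illustration, and the code only computes old/new title pairs; it does not move anything on the wiki and is not how the fix was actually carried out.

```python
# Sketch only: compute (old title, new title) pairs for a misspelled work.
OLD = "Commision"   # misspelling present in the existing titles
NEW = "Commission"  # corrected spelling
BASE = ("Van Cise exhibits to the Commision on Industrial Relations "
        "regarding Colorado coal miner's strike.djvu")


def corrected_title(title: str) -> str:
    """Return the title with the misspelled word replaced (hypothetical helper)."""
    return title.replace(OLD, NEW)


def move_plan(base: str, page_count: int) -> list[tuple[str, str]]:
    """List (old, new) titles for the Index: page and Page: subpages 1..page_count."""
    titles = [f"Index:{base}"]
    titles += [f"Page:{base}/{n}" for n in range(1, page_count + 1)]
    return [(title, corrected_title(title)) for title in titles]


if __name__ == "__main__":
    # page_count=3 is an arbitrary value chosen for the demonstration.
    for old, new in move_plan(BASE, 3):
        print(old, "->", new)
```

Because the corrected spelling does not contain the misspelled one as a substring, running the rewrite twice gives the same result, so a plan like this can be regenerated safely to check that no page still uses the old spelling.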
Help with handwritten letter
I'm working on Index:T. C. E. Laugesen to Carl Laugesen, am mostly done but would appreciate someone validating my work and helping decipher a word on page 3 I couldn't work out. —CalendulaAsteraceae (talk • contribs) 09:04, 31 October 2024 (UTC)
- @CalendulaAsteraceae: You should update the Wikidata entry with 30719036 (findagrave) and 996N-7YW (familysearch) and 6000000007926526330 (geni). It looks like he is missing an entry at Wikitree, here is his dad: https://www.wikitree.com/wiki/Laugesen-8 you can create an entry there and migrate the image I added to Commons. --RAN (talk) 16:23, 31 October 2024 (UTC)
Page access request
Hello, I have a small request. I've been addressing some specific priority syntax errors here on Wikisource, and have dropped two error types down to near zero. The Tidy Font Bug (78 remain), and Misnested tags (42 remain). 77 and 41 of these are on Full protected pages, and I wondered if I could have access to these Tidy font and these misnested pages for a brief time to address these issues. I have 2 years of experience on Wikipedia with handling these (and other) tracked syntax errors in a respectful and knowledgeable manner, and currently have a temporary adminship (Sept-Dec) on Wikivoyage, where I addressed 99.99% of their 30k syntax errors in 5k edits (Aug-Sept). I've asked Xover and Encyclopety on their talk pages about the possibility of my accessing these few pages, but neither have been very active here since my messages, and have not replied, so I figured the next step was to ask here since it had been a few days. I am happy to discuss or answer any questions admin may have. Thanks, and hope you have a great weekend. Zinnober9 (talk) 19:54, 25 October 2024 (UTC)
- Crossposted to WS:AN since no reply here after a week, and only an admin could grant this request. Zinnober9 (talk) 05:41, 3 November 2024 (UTC)
Wikisource: We preserve publishers typos
I think that any person, bot or other software reply mechanism here should never say the words "as published" until current policy is at least softened; as it is a lie.
We preserve publishers typos | As published |
---|---|
exceptions: | exceptions: |
|
|
Feel free to add to either list.
Perhaps there are more. While it sounds (and reads as) so 'leet to say "as the publisher" it is simply not true and over the border which makes it a lie. A simple softening of the policy, so that the occasional editor cannot drop in, validate a page that has one image on it and then ravage the style sheet, would perhaps give you back that 'leet feeling you get when you utter that lie. Without the softening of policy on those point, it is simply a lie.--RaboKarbakian (talk) 20:21, 22 October 2024 (UTC)
- Also, I beg of you. Please find for me an English text from between 1650 and 1750 that is not using a serif type face!!--RaboKarbakian (talk) 20:23, 22 October 2024 (UTC)
- There's nothing like holding an actual book from the 17th century. However, that's quite different from holding it as published, which no one has done in centuries. Good color PDF scans can preserve some of the qualities of an old book, but miss out on a lot of others. It seems quite weird to say "type family" as opposed to "type face"; you think you can just replace Caslon or Baskerville with Times New Roman? Given that Caslon was the old-school and Baskerville was the new wave when they were competing, how does even replacing one with the other qualify as "as the publisher"?
- We are not making digital facsimiles. If you want a digital facsimile, use the PDF. I don't see your second column of exceptions as basically changing anything as to the truthhood or falsehood of "as the publisher".--Prosfilaes (talk) 21:38, 22 October 2024 (UTC)
- You are completely correct about that, if we are talking about a text file -- which is the one format the exporter does not do! I am asking simply that the policy be changed to be more "in general" and not so "against". I am also not insisting that the policy be changed so that everything on that list has to be reflected. I would prefer the occasional editor to be a little less enabled. My style sheet just said "serif" because it is not a facsimile.--RaboKarbakian (talk) 22:40, 22 October 2024 (UTC)
- What did I say about a text file? (Which, by the way, drops stuff that's integral to most works, like italics.) Serif/sans-serif is an irrelevant distinction. As you say, a work published in 1750 is going to be in a serif typeface. But you're ignoring many of the other features a work published in 1750 would have, e.g. [28], like the font size, the very different looking fonts, the signatures and tail word. You're also ignoring choices of publishers that are distinct choices, like which serif font to use, in exchange for removing the default font of the reader.
- I'm against making working on pages more complex; I've found PGDP's total separation of proofing from formatting to simplify things a lot, and the more formatting we add is just going to make it worse. I'm also against making more per-project idiosyncrasies. I see preserving publisher typos as more making a standard, undisputable format for pages, and not terribly important in and of itself.--Prosfilaes (talk) 20:32, 25 October 2024 (UTC)
- (For images that start and end chapters, I always add them, and have trouble understanding why apparently no one does.)
- Leaving ls apart as that's another debate, to me what you listed is perfectly compatible with the fact that we transcribe the work as published, not the physical book as published.
- After type families, paragraph indentation, and margins, the same arguments would also lead us to replicate the relative height and width of letters, the width of the page, heck, even the color of the paper. At that point, it'd be much more reasonable to get a 600 or more DPI scan, feed it to the OCR, which respects layout as it places the text on the page, and at this resolution if we take the right engine it's going to be near-perfect.
- Also, what would you mean by a "softening of policy", which would make "the occasional editor to be a little less enabled"? You said above that you do not think policy should make those things mandatory, but then, do what else? — Alien 3 3 3 06:49, 23 October 2024 (UTC)
3 3 16:03, 23 October 2024 (UTC)
- love the passion about transcription; don't like the "as it is a lie." in any project, there will be compromises between verisimilitude and usability, and calling compromises lies is unhelpful. --Slowking4 ‽ digitaleffie's ghost 13:21, 23 October 2024 (UTC)
- Slowking4, or should I say "font-family:UnifrakturMaguntia"? Personally, I miss the rants of Rama's revenge; and perhaps I am just filling in for the lack of those. That said, I was indenting paragraphs when in elementary school and I was not the only one doing that. The indents help for reading. They are a challenge in markup tho', no doubt. What I did try to say was that "As the publisher published it" is a lie, because it is really just "all of the publishers typos" and anything else might get an editor harassed because there is policy against it. So, I am suggesting that if we are to continue on as is, that we stop lying and simply proclaim "we preserve publishers typos and misspellings".
- And I don't want to be misunderstood that new policy should insist upon that list; I want the option without the potential harassment. 'Tis a huge and capable layout engine; policy wants it to be used like a bulldozer to go the 20 feet to get the mail.--RaboKarbakian (talk) 15:52, 23 October 2024 (UTC)
- @RaboKarbakian: From the technical perspective, a configurable layout with switchable paragraph indentations (on/off) and switchable typos preservation (on/off) in the browser is very realistic and relatively easily doable. For example, see the Wikisource:Scriptorium/Archives/2024-08#Dynamic_Layouts_and_Template:SIC_/_Template:Errata_possible_interaction discussion topic. Currently we don't have these features in the browser because the consensus of the Wikisource contributors is firmly against having them. Exporting to EPUB/PDF is another part of the puzzle, because right now there's only one non-configurable way to do that as well. But this is again not set in stone and it's the community's desire to preserve the status quo that is the decisive factor. --Ssvb (talk) 11:21, 24 October 2024 (UTC)
- Ssvb: Too many images, tables, and formulas appear in the middle of paragraphs for indentation to be considered "easy" or "realistic" technically. You would still have to leave a mark where the "new paragraph" does not indent, and that puts the "technical doable" into the same problems that people have with this. Automatic paragraph indentation is confusing to see. Heck, sentences get interrupted for image and the like. I just cannot agree with the "easily doable" part.--RaboKarbakian (talk) 12:16, 24 October 2024 (UTC)
- (Also, not all paragraph starts are indented, see for example this.) — Alien 3 3 3 12:18, 24 October 2024 (UTC)
- This doesn't seem difficult to solve: just needs one template that marks up non-indented paragraphs; this could just add a css class that does nothing if the CSS displays it as paragraphs with gaps between, but prevents any indentation if the reader selects "original paragraph indentation mode". In fact, we already have {{No indent}} and {{Nodent}} for just these situations. Pretty sure there's also a template that prevents the gap between paragraphs for continuations of the same paragraph (e.g. that have been interrupted by poems, tables, etc.) – but I can't find it at the moment! --YodinT 16:26, 24 October 2024 (UTC)
- Putting every non-indented paragraph in a template would make for quite a lot of lot, wouldn't it?
- Also, indentation is not always the same, depends on the period, publisher, etc, so we'd have to add something to the index styles too, which would make more stuff to do.
- The most problematic part would probably be updating all we've done so far to make it compatible with the changes. — Alien 3 3 3 16:29, 24 October 2024 (UTC)
- Most books I've come across that have indented paragraphs only have a few exceptions to that rule as far as I've seen, so not a huge amount of work while proofreading to add {{ni}} in those cases. And I think the idea would be to make this opt-in, so in most cases (including all the books currently transcribed), they'd just display as they currently are. If an editor wants to give readers the options to view it with the original paragraph indentation (or other options, like long-s, original margins, etc. etc.), the editor could add those options to the Index CSS, and just add the {{ni}} exceptions as they were proofreading (again, not too much more work, and entirely their choice). Editors could choose to go back through the works they've already done, and add indentation options, etc. if they wanted, but again this would be completely optional, just allowing those who want to to do so, but no obligation for editors who aren't interested in this. And, as mentioned below, if editors added this option, readers could still choose whether to view it either as it currently is (i.e. modern paragraph spacing, no long-s – this could be the default option for logged-out users), or with something closer to the original typography (could even give more granular toggles, so font style as one option, page margins another, etc.). --YodinT 17:09, 24 October 2024 (UTC)
A dissenting opinion here: I'm personally more interested in producing an accurate digital version of the text itself than this approach, but I think it's both technically feasible, and also not a terrible idea to allow editors to create essentially "vectorised facsimiles" (i.e. the precise fonts and typography used, page margins, etc. etc.) – if this was provided for as a separate stylesheet for example (so /styles.css for the normal web edition, and /facimilie.css for this), it would be straightforward for a parser to let the reader choose which version they wanted to see (another option could be annotated versions; again all using the same Page:s). This would let editors produce whichever version they wanted (facsimile, text, or annotated), without having to revert/ban/tell editors that it has to be done in a certain way, and producing standardised texts that the majority of readers would find useful regardless of which approach the editor uses. --YodinT 14:16, 23 October 2024 (UTC)
- Yodin When I export my highly stylized works to epub, most of the style goes away, and it is usually a good experience to read these things there. Having the exporter export to text would allow picky readers to impose their own style to it, or not. Exporting to text would also (hold your horses here!!): preserve publishers typos, which is what we do here (by consensus). The 18th century Arabian Nights I have been working on--there is a late 19th century version that is so much more readable: so that to me, having the earlier one "modernized" and streamlined for reading is silly. Having it look like the museum piece that it is is kind of nice in a documentation sense.
- A howto for setting your personal browser's style would settle most concerns, without the need for multiple style sheets.
- Also, the long-s option. Really, people should be required to log in to turn them into s. That way, we get the email addys for getting the donations.--RaboKarbakian (talk) 15:52, 23 October 2024 (UTC)
- There's already a howto at Help:Layout. But such howto for setting your personal browser's style is beyond the abilities of the vast majority of the Wikisource users. Moreover, many of the existing wiki templates would benefit from becoming a bit more CSS-aware to enable such customization. --Ssvb (talk) 11:41, 24 October 2024 (UTC)
- Would be great to have something along the lines of French Wikisource, which has a tab at the top of the page, next to "Page | Source | Discussion" that allows readers to automatically switch between original spelling and modernised spelling (e.g. this page), and even a toggle to highlight the changes that have been made. In our case it could be things like original typography (long S etc.) instead; could even have an option to toggle between original typos and SIC corrected spellings. --YodinT 13:28, 24 October 2024 (UTC)
- Yodin At French wikisource, the "Source" just links to the Index page, and this wiki has the same link. "modern" is also dated, like tomorrow it will be different things that "modern" describes; so in some ways, modernization is an editorialization of the spelling and its punctuation and such of that time it was transcribed. I really really like "As it was published", which was probably thoroughly modern at its time.--RaboKarbakian (talk) 15:33, 24 October 2024 (UTC)
- Yep, the modernisation option is next to Source (the Index: link) and the talk page (Discussion) tab at the top of the page. It absolutely is editorialisation, but follows predictable rules (this isn't the same in English), and they update the "modernisation" algorithm when there's spelling reform. But the main thing is that it still completely preserves the original "as it was published" version of the text as well, and just allows an automatic option for people who want to read the texts using current spelling conventions. That's what I'd like to see here: an option for readers to easily choose whether they want to see the long-s, original fonts, etc. etc., and original as-is typos, or switch these off. Handling annotations the same way (rather than copy-pasting the text, and adding hyperlinks/footnotes to this copy, which will be extremely difficult to sync with the original if further proofreading/validation improves the quality of the original text) – it seems to me it would be much easier to use templates to markup annotations in Page: space, and switch them off by default – but that's another discussion! --YodinT 16:03, 24 October 2024 (UTC)
- Yodin I was wrong and I would strike my paragraph except that I enjoyed the rant about "modern". Also, that French module is very cool. If we use it here, maybe I might still be around for the "Post-modernization" module!! As it is for me here {{ls}} never displays long s; no matter the preference toggle, no matter the namespace; so I find myself being very firmly on the other side of "No options, this way" pasting the long s so that I can see it that way. I think that in the page namespace it always displayed the s, and that was also not helpful for editing. Also, once, I used one of the wikimedia fonts (via @font in the stylesheet) and since then, my browser displays the wrong font size, always; well, not at first (with a vanilla configuration) but at second; just like something is grabbing it and using its configuration instead of mine. I think these and (many, many) other problems are all related, but the long s one did me in. Another thing, I really hate using those words "I was wrong" just so you know.--RaboKarbakian (talk) 19:51, 24 October 2024 (UTC)
- Generally I support the as-it-was-published attitude, but I am sceptical we will be able to reach an agreement or change the en.ws approach towards all the mentioned subtopics within one discussion. Maybe we should discuss individual problems like indentation, long s, fonts, etc. one by one. BTW: I do miss paragraph indentation here very much, and do not like the modern inter-paragraph spacing that replaces it at all. --Jan Kameníček (talk) 15:50, 24 October 2024 (UTC)
- Jan Kameníček: For group projects, especially those that beginners have been directed to start with, the simpler the better. Individual projects or those having just a few contributors should not have to suffer policy intended (heh, I typoed "indented" first here) for beginners. Another thing, How and where to discuss things where capable and interested hackers might be that can enable things. Poor CalendulaAsteraceae will be coding until the post modern module is needed and maybe still won't be done with everything that is wanted. Also, some of the best coders I know have little interest in policy discussions and might even run from anything using the word "consensus". Phab tickets seem to sit there; although it might just be the tickets I look at. I'ma gonna call what we have now .--RaboKarbakian (talk) 19:51, 24 October 2024 (UTC)
- We also do not include at all times decorative elements/flourishes that may appear in news articles, because we do not have stock svg versions of all of them. That would be the same as "images that start and end chapters", but with news articles, especially from the 1800s. We have some simple rule elements, but not all. We also do not include boxes. Some news articles or advertisements appear in a box. --RAN (talk) 23:04, 12 November 2024 (UTC)
ſ to Template:ls
- purpose: I want to use a bot to replace the ſ with {{ls}}
- scope: Arabian Nights Entertainments (1706)
- programming language or tools: I've never done a wiki bot before so I'm not sure yet, I'm open to ideas.
- degree of human interaction involved: semi-automated I think?
Eievie (talk) 05:57, 31 October 2024 (UTC)
- Eievie:
- use of {{ls}} is not mandatory; some extra things come from using it.
- there is a bot that runs here whose purpose is to release drag on templates: {{ae}} and {{black-letter}} are instances that I know of. {{ae}} gets reverted to its utf equivalent æ and black-letter just gets removed.
- the project has been accomplished more than 90% by one person. It is the custom at en.wikisource to follow the precedent set by the main contributor, unless it is a book that is within a collection of works that should have similar typographic customization. An example of this is a recent conversion of '' to ‘’ in the Lang Coloured Fairy Books.--RaboKarbakian (talk) 17:49, 11 November 2024 (UTC)
- (I'm not aware of a bot removing {{bl}}. Which one would it be?) — Alien 3 3 3 18:01, 11 November 2024 (UTC)
- The style guide says to use {{ls}} (see Wikisource:Style guide/Orthography). If its usage isn't actually encouraged, then the guide needs to be changed. Eievie (talk) 19:44, 11 November 2024 (UTC)
- Oppose—without inserting into this discussion any personal opinion of mine on the issue of {{ls}} versus ſ, it is a quite contentious issue in the enWS community. It would be better not to fuel that flame. If you must, maybe a broader discussion on the issue (across all texts) would be in order instead of on an individual work. SnowyCinema (talk) 01:25, 12 November 2024 (UTC)
- Wikisource:Style guide/Orthography makes it look like a settled issue, like there's policy — or at least guidelines — that say use {{ls}}. If that's not true, if it's actually contentious, then the style guide should be updated. Otherwise it's super misleading. I'm newish to this site and I read that guide and was left thinking, "Ok, so that's a preestablished policy/guideline, so I should implement it when possible." Eievie (talk) 02:02, 12 November 2024 (UTC)
- This is a bit offtopic, but if standardizing characters to follow a single precedent while editing page-by-page, see WS:Regex. You can type long-s and short-s both as a "s", and then automatically convert the ones that should be {{ls}}, or convert all the {{ls}}s to ordinary "s". Either can be done with two clicks (one to open the tool). HLHJ (talk) 03:23, 12 November 2024 (UTC)
- I'm familiar with it; I did Arabian Nights Entertainments volume 1 that way. There are a lot of volumes though, which is why I asked about bots. Eievie (talk) 03:31, 12 November 2024 (UTC)
- Eievie: At the very least, your edits broke volume 1 (did you not even notice all of the newly-introduced red links)? In any case, because you have not “fixed” the rest of the Arabian Nights, please revert your changes to the first volume so that all of the text has a consistent style. TE(æ)A,ea. (talk) 03:48, 12 November 2024 (UTC)
- The main person behind that work and I are discussing the use of {{ls}} privately, and we will settle this between us. But the main person behind the page did not automatically want the {{ls}} removed. Eievie (talk) 05:04, 12 November 2024 (UTC)
- Eievie I wanted to wait until I was not angry to address this. By that time, I believe I could intelligently state my reasons here. If you paste any user name (example: [[User:Eievie|Eievie]]) they will get a notification. Also, see Wikisource:Scriptorium#Wikisource:_We_preserve_publishers_typos where I was probably annoyed, mostly from pasting all of those darn ſ.--RaboKarbakian (talk) 17:44, 12 November 2024 (UTC)
- I'm fine dropping this whole thing — I'm just asking that Wikisource:Style guide/Orthography#Phonetically equivalent archaic letter form be changed then. I maintain that trying to implement an explicitly stated site style guideline is not unreasonable. If it's not something people are actually supposed to do on this site, I need there to not be instruction pages saying that's how things ought to be done. Since the question of bot usage is long over, can this thread also be ended and someone point me to what thread handles questions of altering policy and making it clear? Eievie (talk) 21:21, 12 November 2024 (UTC)
So, even though this has been dropped, I just learned from Eievie that purpose was not to script a wikibot (as indicated here) but to make it easier to proof already existing pages. See User_talk:Eievie#rh_vs._c I suggest that an admin deny this request, point Eievie to Category:Proofread and recommend that if there is anything on one of those pages, to simply pick a different one.--RaboKarbakian (talk) 19:33, 13 November 2024 (UTC)
- Completely aside from the content discussion here, a regex that operates on a whole work would be very useful. I've been dealing with OCRs that make errors so consistently that an autoreplace on the entire work would save a lot of time, and sometimes you format a work one way, and then realize that you really ought to replace all instances of one template with another, which must also happen when old templates get replaced and deprecated; there are lots of non-controversial use cases. It would be necessary to have a way to revert the whole thing with a click, though. HLHJ (talk) 19:42, 17 November 2024 (UTC)
- For non-controversial requests (e.g. € to e), you can always ask at WS:BR. — Alien 3 3 3 19:47, 17 November 2024 (UTC)
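The whole-work regex idea raised in this thread (one replacement applied to every page, with a way to revert it all at once) can be sketched in Python. This is only an illustration under stated assumptions: pages are modelled as a plain dict of title to wikitext, the page titles in the demo are made up, and `replace_in_work` and `revert` are hypothetical names. It is not the WS:Regex tool, not an existing bot, and it does not edit the wiki.

```python
import re

LONG_S = "\u017f"        # the character ſ
LS_TEMPLATE = "{{ls}}"   # template named in the discussion above


def replace_in_work(pages: dict[str, str], pattern: str, repl: str):
    """Apply one regex to every page of a work.

    Returns (new_pages, backup), where backup holds the original text of
    each page that actually changed, so the run can be undone in one step.
    """
    regex = re.compile(pattern)
    new_pages: dict[str, str] = {}
    backup: dict[str, str] = {}
    for title, text in pages.items():
        changed = regex.sub(repl, text)
        if changed != text:
            backup[title] = text
        new_pages[title] = changed
    return new_pages, backup


def revert(pages: dict[str, str], backup: dict[str, str]) -> dict[str, str]:
    """Restore every page recorded in backup to its original text."""
    restored = dict(pages)
    restored.update(backup)
    return restored


if __name__ == "__main__":
    # Hypothetical page titles and text, for demonstration only.
    work = {
        "Page:Example.djvu/1": f"The {LONG_S}ultan {LONG_S}poke.",
        "Page:Example.djvu/2": "No long s here.",
    }
    converted, saved = replace_in_work(work, LONG_S, LS_TEMPLATE)
    print(converted["Page:Example.djvu/1"])
    print(revert(converted, saved) == work)
```

The same function covers the opposite direction by passing `re.escape(LS_TEMPLATE)` as the pattern and the plain character as the replacement. Keeping a backup of only the changed pages is one possible design for the "revert with a click" requirement; a real tool would more likely rely on the wiki's own page history.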
Android app for Wikisource
Hi, is there an Android app for Wikisource? How does it work? I have been advised that there is no infrastructure for push notifications for Android apps for sister wikis and I would be interested to know more. Related: phab:T378545. Thanks! Gryllida (talk) 23:14, 29 October 2024 (UTC)
- There is no app for Wikisource at all. For any platform. There is only the website.
- This isn't a terribly popular website, so it's probably not worth the time to develop an app to help editors—even though I would love it. —FPTI (talk) 06:50, 25 November 2024 (UTC)