Wikisource:Scriptorium/Archives/2012-09
Please do not post any new comments on this page.
This is a discussion archive first created in , although the comments contained were likely posted before and after this date. See current discussion or the archives index. |
Announcements
mw:MediaWiki 1.20/wmf9 deploying to all wikis by August 29
—w:Wikipedia:Wikipedia_Signpost/2012-08-20/Technology_report#In brief |
— billinghurst sDrewth 14:55, 21 August 2012 (UTC)
Proposals
Page Triage
I think we should request to have the page triage extension enabled here. It is currently live on en.WP, and you can see it at Special:NewPagesFeed on en.WP. Please check it out and consider supporting having it enabled here so we can file a bug request. BirgitteSB 17:01, 12 July 2012 (UTC)
- I like, looks like a good tool Jeepday (talk) 00:22, 13 July 2012 (UTC)
- I second it, it looks like a much more flexible way to view new pages, which otherwise get swept away and lost. Inductiveload—talk/contribs 01:29, 19 July 2012 (UTC)
- At the very least, it looks like a lot more fun than the current new pages list! --Eliyak T·C 01:53, 19 July 2012 (UTC)
- I support this too; it looks good. Hopefully it can be adapted for some Wikisource needs (eg. "No header" warnings like the "Orphan" and "No categories" I'm currently seeing in red text) but it isn't even finished for Wikipedia yet so I won't hold my breath for that. - AdamBMorgan (talk) 11:21, 19 July 2012 (UTC)
- The current w:Special:NewPagesFeed is already fully usable, so I would vote for its installation. It seems to offer at least all the functionalities of MW:Help:Patrolled edits + w:fr:Wikipedia:LiveRC. JackPotte (talk) 20:26, 19 July 2012 (UTC)
- Thank you all for taking the time to look it over. Since there is clear support I have filed a bug to enable it. I am sure we will have more bugs to tweak it for WS-related tasks once we test it out. This feature is being developed as a live deployment since it does not supersede anything. So it might be good to get on the list to figure out our issues while there is still a team assigned to this. bug 38512--BirgitteSB 01:40, 20 July 2012 (UTC)
- Hey all; thanks for this (I'm afraid I only just saw it!). It's not currently in a state where we're deploying it anywhere and considering it "done", but I'm glad to know there is some interest from non-enwp projects :). I'll let you know when we've got a final release, and we can talk about what needs to be done to localise it. Ironholds (talk) 11:06, 26 August 2012 (UTC)
Sidebar changes
The following discussion is closed:
The following changes will be made to the sidebar:
- Change: Random page → Random work (NB: link changed)
- Change: Random book → Random transcription
- Add: Subject index
--Erasmo Barresi (talk) 14:23, 4 August 2012 (UTC)
- Closed again after a month, same result. - AdamBMorgan (talk) 11:56, 30 August 2012 (UTC)
There is a proposal to change the links in the sidebar. This has come out of a discussion on Wikisource talk:Maintenance of the Month. The three changes are:
- Change: Random page → Random work (limit to basepages in the main namespace)
- Change: Random book → Random transcription
- Add: Subject index (amended from "Portals")
At present, the functions of "Random page" and "Random book" are not clear from their names. The results of "Random page" also give a mix of different namespaces, usually a page in the Page namespace due to simple quantity. This is probably not the intent of this link nor the behaviour users may expect (especially new and casual users). The change described here not only alters the wording but limits the result to the Main namespace; this should be a little more intuitive. "Random book" produces a result in the Index namespace but, as stated, this is not currently clear from its name, which could be confused for works in the main namespace. The final item just adds the main portal page to the sidebar as a new link. - AdamBMorgan (talk) 11:55, 26 July 2012 (UTC)
- support for the first proposal. To keep within the existing spirit, why not link the third to Special:Random/Portal. If you don't want more Randoms then I suggest this is more useful for the newcomer than Special:Random/Index. Chris55 (talk) 15:43, 27 July 2012 (UTC)
- I like the two Random links, but I'm not sure "Portals" (in and of itself) is as intuitive a title as we believe it to be. "Subjects" or "Topics" would convey the meaning a bit better to the average reader, I think, though at any rate I do think we should have a link to it on the sidebar (and I think a direct link to the Portals page is better than a random portal link). I would also put the Portal link (whatever it ends up being called) above the three random links. EVula // talk // 15:51, 27 July 2012 (UTC)
- Support as developing, though I do think Portals is a Wiki(pedia) term that most users either are or should become accustomed to. JeepdaySock (talk) 16:53, 27 July 2012 (UTC)
- They'll see it soon enough when they click on the link! I support Subjects. Chris55 (talk) 10:51, 30 July 2012 (UTC)
- What about "Subject index", which is a fairly standard library term? (It isn't entirely accurate but close enough). This makes me think the Author index should be added as well but that can wait for now to prevent "proposal creep." - AdamBMorgan (talk) 11:35, 30 July 2012 (UTC)
- There may only be enough room for "Random Subject" - which was my proposal. I think there is merit in it - it shows a lot more of what Wikisource has. It takes a lot of effort to go down from the top-level portal, whereas it's quite easy to go up. Chris55 (talk) 18:32, 30 July 2012 (UTC)
I've re-opened this proposal to get further consensus and input following related comments on Scriptorium. I suggest reviewing closure near the end of August. - AdamBMorgan (talk) 00:21, 5 August 2012 (UTC)
- I have amended the "Random work" link in this proposal because I have only just found the Special:Randomrootpage special page. This should exclude subpages, which I think will be the best use for this link (and the most expected behaviour). I hope this does not negatively affect the approval of this proposal. - AdamBMorgan (talk) 17:16, 16 August 2012 (UTC)
Additional proposal
To prevent the first proposal being derailed, this is a parallel proposal for two more changes to the sidebar:
- Add: Random subject
- Add: Author index
A random portal link has been suggested and an author index would complement the portal, or "subject", index (besides, both French and German Wikisources have this in their sidebars). - AdamBMorgan (talk) 17:24, 16 August 2012 (UTC)
BOT approval requests
Help
Other discussions
Was there a software update that we weren't told about?
There seems to have been a software update. The changes I've noticed are:
- The "rollback" link now tells us how many edits will be rolled back; and
- The buttons on the Proofread Page are now loading in a different sequence (at least in monobook). The standard set of editing buttons are first, then those from my js, then the special buttons for Proofread page (toggle header, zoom and vertical/horizontal). (This change is a &^%$ nuisance. I've clicked the left most button to toggle the header some 20,000 times in the past three years and suddenly today I'm adding bold text when I do so. Any ideas?)
Are there any other changes that we need to know about? Beeswaxcandle (talk) 03:29, 9 August 2012 (UTC)
- Updates are being applied roughly every two weeks for some time now. The notices for each update seem to have stopped but if you dig through the archived WS:S pages, you can find the link back to the main MW page tracking the updates. See below.
- fwiw.... I'm seeing the same button behavior in PR mode and it doesn't seem to be skin specific as I'm running Vector not Monobook. I forget who it was that complained about this earlier but I'm sure it's near the same archived pages as the last notice mentioned above is. -- George Orwell III (talk) 04:43, 9 August 2012 (UTC)
- User:Ineuw(talk) has long been struggling with the button problem. I am not updated on the latest status of his investigations (Inductiveload helped him, at least to explain why it happened)--Mpaa (talk) 12:04, 9 August 2012 (UTC)
- Beeswaxcandle, thanks for this post and the replies. At least it proves that I am not (so) crazy. In addition to the earlier problem, this latest button location switch drives me (more) nuts. I am using Vector and was wondering about monobook, but now I know that it happens regardless of the skin. Posted a detailed Bugzilla report 38218 over a month ago, but nothing has happened since. Interestingly, this exists in both Wikipedia and the Commons editors - using the same editor as here. For the time being, I transferred most of the toolbar macros to AutoHotkey, but this latest issue stumps me. In short, didn't follow up with Inductiveload because I didn't want to tie up more of his precious time when I already reported the issue on Bugzilla. Besides, with his sheep herding, summer in England, and the Olympics, he has enough on his hands. :-) — Ineuw talk 19:15, 9 August 2012 (UTC)
- When was our software updated to 1.20wmf9 ???? Did anyone know about this? It seems that we have become the guinea pigs for Wikimedia. — Ineuw talk 22:24, 9 August 2012 (UTC)
- Why, Ineuw, do you not like guinea pigs? Aside from bad manners of those with power, what difference does it make that we were not told when it will be done anyway? Roll with the punches and keep on going forward. Kind regards, William Maury Morris II (talk) 22:37, 9 August 2012 (UTC)
Here's the link where the updates are tracked - MediaWiki -- George Orwell III (talk) 23:06, 9 August 2012 (UTC)
- Thanks for the link, however this doesn't point to a solution, or a way to bring the issues to the attention of the developers to act upon. I find it ridiculous that new software updates are released every two weeks without attending to the problems introduced in previous releases. There are 129 bugs that are supposed to be addressed in this version, none of which relate to the toolbar bug introduced in version 1.18. — Ineuw talk 08:18, 10 August 2012 (UTC)
- 'Squeaky wheel gets the grease'?
Mentioning the problem - any problem - here solves very little because this forum is not the exclusive point where debate, discussion and development takes place. If it was, it would be monitored more diligently but "we" can't even seem to unite under that single banner so I'm afraid things are not likely to change in the future. -- George Orwell III (talk) 11:19, 10 August 2012 (UTC)
- Also worth watching Special:Version so you can see which variety of the core change is in place. — billinghurst sDrewth 12:23, 10 August 2012 (UTC)
- Yes, if you have a bug related to MediaWiki and its extensions, open a bug on Bugzilla. The big problem is that there is no paid developer who works on Wikisource-related extensions (especially ProofreadPage and DoubleWiki). Beau and I work on them a little, but we don't have a lot of time and we are too tired from our "real work" to spend a lot of time on things that we don't enjoy. So, please, lobby to get a paid developer who gives a day each week, and be nice to us. Tpt (talk) 17:52, 10 August 2012 (UTC)
- Thanks for all the replies and I have absolutely no problem with the developers. I am grateful for any help, which I have often received. I am also surprised that this popular software doesn't have more paid/salaried developers. I wouldn't even try to guess the number of commercial sites that use the wiki software. Tpt's post did lift the veil of the mystery and my ignorance of how things get done and that in itself is most helpful. — Ineuw talk 18:42, 10 August 2012 (UTC)
- There are paid developers for Mediawiki; there are no developers who are paid to work on some of our favorite extensions, which are not used by many other installations of Mediawiki. One big piece of news from Wikimania, which Charles mentions below, is that for the first time ever there is a paid developer who is responsible for the ProofreadPage extension. Now this is still not his primary responsibility and "ProofreadPage extension" does not mean "anything wanted at Wikisource". From other things I have picked up, the idea seems to be that all staff developers have each taken responsibility for an extension that is not being developed by staff, to the end that they will be responsible for seeing that the volunteer development is reviewed and that they should aim to spend 20% of their time working with the volunteer developers more closely. The previous situation was that volunteer developers wrote updates and bug fixes which sat for months and months without even being reviewed. So there were fixes to bugs sitting there that were just never actually applied, with no feedback given, driving people mad. So things are looking much better for ProofreadPage, but perhaps it was not realized how bad off we were starting from. However your toolbar issue may not be part of ProofreadPage. I really don't know how to figure out the software origins of things like that at all. If you can figure out what extension it is part of, you might try looking at mw:Main Page for the extension which may list who maintains it. Talking to them directly might move things along quicker than just watching the Bugzilla page.--BirgitteSB 17:17, 11 August 2012 (UTC)
- My sincere gratitude for Birgitte's reply because it confirms that my effort at finding the answer to the very same question on mediawiki was in the correct direction, namely: which extension do toolbar issues belong to, and where to post questions and initiate a discussion on mediawiki. There is a page for custom toolbar buttons which mentions differences between the terminology used in the new style editor and the (deprecated) legacy editor which posters in this conversation are having problems with. For this, I would like to talk with Inductiveload before I post additional info. The ProofreadPage extension and toolbar customization seem to be separate, as well as the {{PAGESINCATEGORY}}. I moved this to a subsection at the end of this post. It's also the intent to move my bug list below to the proper extension talk page on mediawiki (as required), as soon as I find the appropriate talk pages.— Ineuw talk 03:49, 12 August 2012 (UTC)
Comment on {{PAGESINCATEGORY}}
To note that there is some commentary about the release and release cycle at w:Wikipedia:Wikipedia_Signpost/2012-08-06/Technology_report#In brief and we should explore the changes for {{PAGESINCATEGORY}} — billinghurst sDrewth 06:14, 11 August 2012 (UTC)
- Has anyone had a discussion with Amir Aharoni about the general situation? There was some chat at Wikimania about the deal with ProofReadPage and Amir's responsibility; and I put my oar in at one point. The top-down theory is that ProofReadPage is supported. If the main point is that MediaWiki 1.20 is a moving target, then maybe someone should be providing feedback. Charles Matthews (talk) 07:46, 11 August 2012 (UTC)
- Charles Matthews, thanks for bringing it to my attention. I also noticed that Amir Aharoni is involved with aspects of extension maintenance, and he is also on my list to contact. — Ineuw talk 03:49, 12 August 2012 (UTC)
Year parameter in the header template
There are a few problems with the year parameter in the header template, both in terms of appearance and function. I've summarised them at Template talk:Header#Year parameter. Some of this is how the year appears and some of it is how the matching category is added. (NB: In related news, I've also recently changed the way {{author}} handles dates. A few of the problems are similar.) - AdamBMorgan (talk) 02:29, 11 August 2012 (UTC)
- As I have commented on the linked page, I would prefer that we didn't add that sort of complexity to the template, rather just fix the problem. Trying to have that added complexity for such a small number of works, and those that can be fixed seems arse about to me. — billinghurst sDrewth 14:18, 11 August 2012 (UTC)
mwf1.20wmf9 bug list
I don't know in which discussion section this post belongs, and which of these issues (re)qualify for a bug report, so I decided to post this here and ask, anyone in the know to move this post to its proper place:
- Custom toolbar button position change. This was discussed in the above post previously, and a Bug report was already filed on July 6th, 2012. Just thought that this should be mentioned in this new post.
- Toolbar macros no longer function in headers and footers. This feature was lost and then corrected twice before in previous versions.
- In User:Inductiveload/Custom toolbar buttons.js, the mw.toolbar.insertButton functions don't work in my scheme of things, but mw.toolbar.addButton does, as in User:Ineuw/vector.js.
Again, I will gladly create the bug reports but at this point, I don't know which of these qualify for one. — Ineuw talk 03:17, 11 August 2012 (UTC)
To anyone having issues with the toolbar button layout, after a long discussion with Aharoni, I posted the bug on Bugzilla HERE and HERE. — Ineuw talk 13:26, 21 August 2012 (UTC)
"pages" command no longer works at Welsh Wikisource
Can someone help me figure out how to repair the <pages> command over at Welsh Wikisource? The pages cy:Beibl (1588) and cy:Beibl (1620) and their subpages use it, and it used to work fine on them last spring when I was working on them. Now, even though I haven't changed anything since then, I get either an "Error: No such index" message or nothing at all, but the corresponding indices cy:Indecs:Beibl Cyssegr-Lan 01 Intro+Genesis.pdf and cy:Indecs:Y bibl cyssegr-lan.djvu do exist. Any ideas or suggestions? Thanks! Angr 09:01, 11 August 2012 (UTC)
- Your Index: and Page: namespaces are not defined. Compare cy to en. A bugzilla will need to be submitted to get the namespaces created, and the pages updated for those creations. We did that for nlWS recently, see their request and resolution at bugzilla:37482.
- Bugzilla:39264, though you will need to chase it. — billinghurst sDrewth 14:50, 11 August 2012 (UTC)
- Thanks. I wonder why it used to work. What do you mean by "you will need to chase it"? Angr 17:11, 11 August 2012 (UTC)
- "Noisy wheel" scenario. — billinghurst sDrewth 14:05, 12 August 2012 (UTC)
- Added myself to the bug, so that we can show there's a bit of interest in this getting fixed. EVula // talk // ☯ // 15:10, 13 August 2012 (UTC)
- Thanks. The weird thing is, I'm pretty sure the Index: and Page: namespaces did exist in the past, because I wouldn't have created indices and Page-namespace pages without them. I don't understand a lot about software, but I know enough not to use pseudo-namespaces. Angr 16:05, 13 August 2012 (UTC)
Upload file
Although I can now see the same link as others in the left menu which leads to commons:Special:Upload, I personally find the link on commons which leads to commons:Special:UploadWizard far more useful as I'm usually uploading multiple rather similar files. Do other people agree and could we change the local link? Chris55 (talk) 20:25, 13 August 2012 (UTC)
- Please do change it.--BirgitteSB 23:35, 13 August 2012 (UTC)
- I also agree.--Erasmo Barresi (talk) 20:35, 15 August 2012 (UTC)
- Absolutely! Directing uploads to the commons is a great idea. — Ineuw talk 21:57, 15 August 2012 (UTC)
- 'Fraid I don't know how to change special pages. I assume we still want to give people the option to load into WS if they ignore the soft redirect: but that leaves a change within a special page. Could someone who knows the black art help please? Chris55 (talk) 23:45, 15 August 2012 (UTC)
Done Beeswaxcandle (talk) 00:12, 16 August 2012 (UTC)
Use of "without text" page status
When a page has been marked as "without text", it currently shows up with a heading "This page does not need to be proofread." I've been reviewing a large number of "almost finished" projects and one of the more common problems is that irrelevant matter is marked "not proofread" or "problematic" thus preventing (or complicating) the advance of the project status to "done" or "proofread but not validated". Common examples are advertisements, many of which are entirely divorced from the subject matter of the book, book covers, and even library information pages about the book. Most of these have text and it's clearly not helpful to be over-literal about the tag "without text", which can even be interpreted to cover pages which have only pictures (which are important).
Help:Page Status confirms this interpretation when it says: "Without text is for blank pages, or other pages that do not require double proofreading". Hesperian has proposed the use of the Category:Not transcluded as an alternative way of marking pages but that has several problems: it's not part of the core proofreading tool, it doesn't show up when the page is viewed (e.g. this page), and it's notoriously hard to get people to make category entries. In addition, I don't see any good reason to stop people transcluding some of these pages if they really want to. Adverts can be fascinating and book covers can be works of art, but more often they are neither.
Therefore I suggest we should stick to the use of "without text" to signify any pages that are not necessary for the completion of the proofreading process. The <pagelist> entry in the index page can be used to clarify this, by marking pages as "advert" or "adv". With adverts it's important to be consistent: I've seen examples where one or two ads have been validated and the rest ignored. In these cases either all should be proofed, or marked as unnecessary. Chris55 (talk) 09:07, 17 August 2012 (UTC)
- I like the idea of special markings for advertising pages. There is a lot of interesting history in them, and while they are not strictly required for the work to be complete, they do add to the ambiance of the work. I would also imagine calling them out would provide an easy access for anyone who is interested in the advertisements themselves. Presumably we could draw in an editor whose only interest is advertisements. JeepdaySock (talk) 10:44, 17 August 2012 (UTC)
- I agree that adverts and similar pages should be marked as "without text".--Erasmo Barresi (talk) 19:24, 17 August 2012 (UTC)
- Strongly disagree with the underlying premise that every page in an Index must be marked in some way before a work can be advanced to "done" or "proofread but not validated". Marking all the advertisements and lists of other works by this author as "without text" just to get the Index off a maintenance list is going about the problem in the wrong way. It's the list that's wrong, not the fact that the pages with the adverts etc. haven't been created or are marked as "not proofread" or "problematic". I do not support marking pages with text that is in some way related to the work, the author of the work or the publisher of the work as "without text." Please stop marking these pages in this way, until community consensus has been reached. Beeswaxcandle (talk) 22:44, 17 August 2012 (UTC)
- I haven't been marking adverts as "without text". The comment I've been using is "advert not proofread". Are you arguing that code 0 in the proofread page extension should apply only to pages which have no text? Should it not be applied to pages which only have the library markings, or to badly reproduced covers or to pages on which Google or Microsoft have put their stamp? I think you're being over-literal.
- It's extremely difficult currently to sort out index files which have genuine deficiencies such as un-proofread pages and missing images from indexes in which the differences are irrelevant. If you can suggest a better way than the one I've suggested, I'm all ears. Chris55 (talk) 23:12, 17 August 2012 (UTC)
- I see this much the same way Beeswaxcandle does. "Without text" should be for blank pages and blank pages only as found to be or recognized as part of the original content. Hand written notes, university or library markings, Google disclaimers and similar that were obviously added after "printing" took place should be the only exceptions to that & Adverts don't fall into that category imho. All too much attention is being paid lately to ancillary name-spaces and what takes place under them for that matter. The Index: & Page: name-spaces exist to facilitate proof-reading or to verify transcription integrity afterwards and should not be treated as some finished product of proof-reading itself. -- George Orwell III (talk) 23:37, 17 August 2012 (UTC)
- My view accords exactly with George's. Hesperian 01:19, 18 August 2012 (UTC)
- Isn't the <pagelist /> tag the key to distinguishing the Index files with real issues from those with gaps due to pages that we're choosing not to create because we won't be transcluding them into the Mainspace? The time of creating an Index is the time to make sure that no pages are missing. Sorting out the print-page numbers and prefatory pages and putting them into the pagelist tag is most easily done at the same time. I suggest that the relevant help page is updated with instructions on standardised words to use for particular page types. Then the pagelist tag could be queried as a part of producing the maintenance lists. Beeswaxcandle (talk) 04:22, 18 August 2012 (UTC)
- Just to be clear - all the pages in the source file should be reflected in some way in the <pagelist /> assignments. The <pagelist /> was once the way to work-around the lack of built-in flexibility of one God damn thing or another in one God damn namespace or the other for a long time and then the developer behind this fabulous disaster up & left like a fart in the wind, leaving us holding the bag. Recently, somebody finally put 2 and 2 together, and added include/exclude/onlyinclude to the <pages /> tag to facilitate a more proper work-around for selective page range transclusion than the old (or customary?) way of fudging the <pagelist /> to the nth degree. -- George Orwell III (talk) 05:05, 18 August 2012 (UTC)
- First let me say that what we don't have at the moment is a set of criteria for moving forward projects from "To be proofread" to "To be validated" and "Done". So these actions are done by anyone, without anything to guide them. They are important steps - more important than the proofreading of individual pages and I'd suggest that all such transitions need to be reviewed by other editors.
- I have tried to get people to show some interest in the proofreading process and so far haven't had any success. There are thousands of unfinished proofreading projects on Wikisource and most have been static for several years. I've been using lists of unproofed texts with a few unfinished pages without anyone taking much notice. It's when I recently moved on to those which need verification that I've come across many pages which haven't been touched - both pages without text and those that have ads. But among them there are quite a few projects systematically missing illustrations and with unproofed pages. For instance, this file was marked as "Done" more than a year back with illustrations missing from 6 pages, 18 months back. At the beginning of this month Billinghurst marked it as "To be validated" though his comment suggests he thought he had set it as "To be proofread".
- Cases such as these are very hard to spot. There were something like 100 projects marked as "To be validated" which had pages unfinished. Not a huge number as not many have even reached that stage, but enough to make the scanning quite time-consuming. I had previously been through a similar number with a few pages to be proofread and advanced half of them by a stage. I repeat my question: if we don't mark these pages as "not requiring proofreading", how do we make it possible to review them and tidy them up? I accept that my actions have somewhat of a box-ticking flavour, but it's a means to an end: to be able to get a handle on the far larger number of other projects which are still in a mess. There is only one state defined by the proofreading extension and it currently covers 2 types of pages: those with no (original) text and those that, for whatever reason, are not considered necessary to be proofread. We could ask the maintainers to separate the two, but I think we can all see that the pagelist provides an adequate means of doing this. Can you adjust your listings to take this into account without marking the adverts as not to be proofread, Hesperian? That seems possible but difficult to do reliably. Chris55 (talk) 10:16, 18 August 2012 (UTC)
- Your question—"if we don't mark these pages as 'not requiring proofreading', how do we make it possible to review them and tidy them up?"—is meaningless to those of us who regard these pages as warranting being proofread and validated, and therefore legitimately blocking their indices from promotion. Hesperian 11:11, 18 August 2012 (UTC)
- My 2 cents. Can't we disregard the single page status and, if any such pages are present, categorize the Index instead with a Category such as "Index containing unproofread advert" or something like this? Placing a Category on each and every page is not reliable, but putting such a requirement on the index page when stepping the index to "To be validated" and "Done" should not be difficult to remember. Maybe also a reminder could be placed in the drop-down menu when doing such actions. This way I think we can move works ahead in status while being able to find through queries which complete indexes have unproofread text for whatever reason. I do not believe working on how we describe the page in <pagelist/> is reliable.--Mpaa (talk) 11:34, 18 August 2012 (UTC)
- The idea that all adverts should be proofread is a new idea, which I hadn't heard anyone suggest until now. That it should be a requirement is imho ridiculous. If people want to do it and even transclude the results I don't care. Chris55 (talk) 12:51, 18 August 2012 (UTC)
- It's not a new idea. Just new to you. Hesperian 03:15, 19 August 2012 (UTC)
- All the same I don't think anyone here is objecting to the promotion of indices while these non-transcluded advertisements remain outstanding. People are simply objecting to falsely labelling the advertisements as "without text". If you don't want to proof the advertisements, by all means promote the index, but leave the advertisements as "not proofread", in case someone else wants to proof them later. For the record, here is a featured text in which the publisher's catalogue has been proofed, validated and transcluded. I like it. Hesperian 03:21, 19 August 2012 (UTC)
- I agree with Hesperian's point of leaving ads as "not proofread" and I agree with AdamBMorgan's statement that sometimes the ads are very interesting and can add a lot of good information (not Adam's wording). Too, sometimes I enjoy proofreading ads and I have one now that is not validated. Some ads get overly fancy with art that can only be duplicated using uploaded images. Kind regards to all, Maury ( William Maury Morris II (talk) 03:39, 19 August 2012 (UTC)
- Isn't the <pagelist /> tag the key to distinguishing the Index files with real issues from those with gaps due to pages that we're choosing not to create because we won't be transcluding them into the Mainspace? The time of creating an Index is the time to make sure that no pages are missing. Sorting out the print-page numbers and prefatory pages and putting them into the pagelist tag is most easily done at the same time. I suggest that the relevant help page is updated with instructions on standardised words to use for particular page types. Then the pagelist tag could be queried as a part of producing the maintenance lists. Beeswaxcandle (talk) 04:22, 18 August 2012 (UTC)
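The idea of querying the pagelist tag could be sketched roughly as follows. This is only an illustration, assuming the tag uses attributes of the form `5=1` or `149to152="Adv"`; the label "Adv" and the function name `pages_labelled` are hypothetical stand-ins for whatever standardised words a help page might settle on.

```python
import re

# Matches attributes such as 5=1, 2to4=- or 149to152="Adv" (assumed syntax)
ATTR = re.compile(r'(\d+)(?:to(\d+))?\s*=\s*("[^"]*"|[^\s/>]+)')


def pages_labelled(pagelist_tag, label):
    """Return the scan positions whose pagelist label equals `label`."""
    hits = []
    for start, end, value in ATTR.findall(pagelist_tag):
        if value.strip('"') == label:
            first = int(start)
            last = int(end) if end else first
            hits.extend(range(first, last + 1))
    return hits


# Hypothetical pagelist for a short work with four pages of adverts at the end
tag = '<pagelist 1=Cover 2to4=- 5=1 149to152="Adv" />'
print(pages_labelled(tag, "Adv"))
```

A maintenance listing could then skip the positions returned for an agreed label without changing the status of the pages themselves.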
( tl;dr it all) In short we modified the completion statements on an Index page to allow for works to be marked as completed, when the primary component of the work had been proofread and validated. The blank/problem/not/proof/valid set applies on a per page basis (the individual pages are categorised) and that should remain so. Advertisements are not required to be done to complete a work, though a nice bit of frippery to include if able. Note though that they have importance for other works by an author, sorting through and differentiating authors and name variations, and looking at their periods of work, etc. They provide a good historical record, and are of ephemeral value. My comment re adverts would be that they probably should not be included unless they are proofread.
- I am unsure if we even have a problem to begin with. As Billinghurst said, the statement allows for it to be marked "completed" by acknowledging what a "work proper" constitutes. That being said, text printed in the publication at the time of printing, be it content or advertisements, is text that is subjected to the proofreading process. - Theornamentalist (talk) 17:17, 19 August 2012 (UTC)
- Please provide links to the statements you are quoting. As it stands your statement "Advertisements are not required to be done to complete a work" stands in contradiction to Hesperian's statement: "those of us who regard these pages as ... therefore legitimately blocking their indices from promotion". Not all of us have been around forever or can find these statements. Chris55 (talk) 18:30, 19 August 2012 (UTC)
- [1] see, "when is an index "done"?" which may help somewhat, and shame on you for being new. - Theornamentalist (talk) 18:43, 19 August 2012 (UTC)
- Thanks, I found the quote from Hesperian particularly interesting: "I have no objection to cases like Index:William Blake, a critical essay (Swinburne).djvu being marked as 'Done', because the 16 'Not proofread' pages are advertising end matter not transcluded into the mainspace work." Two years later, drudges like me are still having to look through those files. [I notice an attempt to declare them without text was defeated even then.] Is it time, I wonder, to make some policies and enshrine them properly so the case-lawyers don't get full time employment? Chris55 (talk) 10:10, 20 August 2012 (UTC)
- Just curious; how does the status of a dozen or so pages prevent you from downloading / viewing the work in question as an epub / ebook? -- George Orwell III (talk) 11:03, 20 August 2012 (UTC)
- It doesn't. What I'm concerned about is the green flash on works such as Manual of the New Zealand Flora with its impressive contents list. It isn't till you click on Source that you realise that only 129 of its 1200 pages have been proofread and only a few more created. You can't download the rest. That sign is totally deceitful. At least the sign on the special page tells the truth. Chris55 (talk) 11:41, 20 August 2012 (UTC)
- Which is why we add {{incomplete}}, which I have added to the work. Unfortunately there is no means to apply the Special Index ribbon to a work. It has been asked for and hasn't been able to be achieved as far as I know. —unsigned comment by Billinghurst (talk) .
- But back to the subject at hand: that green flash is indeed misleading ("deceitful" implies intent so I won't use it here). But labeling ads that haven't been proofread as "without text" prima facie contributes to that problem; at any rate it does nothing to help. Hesperian 13:55, 20 August 2012 (UTC)
- "When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean——neither more nor less."
- "The question is," said Alice, "whether you can make words mean so many different things."
- "The question is," said Humpty Dumpty, "which is to be master——that's all."
- (and just in case anyone misinterprets, I'm not referring to the word 'deceitful' which I used with care.) Chris55 (talk) 16:25, 20 August 2012 (UTC)
- 'Deceitful' might not be too far off actually - 'misleading' might have been better though. I'm getting a clearer picture of the issues at hand and one of them seems to be the practice of transcluding before proofreading is done. Attempts have been made to curtail this practice but never quite with the threat of deletion & I think that is going to have to be part of the discussion moving forward. It's one thing to transclude as chapter-by-chapter becomes proofread, which to me is fine as long as it takes place rather regularly, but the old bit where the toc & front matter gets transcluded and the bulk of the material to follow does not can come across as a bit more than just disingenuous or deceitful to the new wave of epub and ebook downloaders that land on en.WS. So I do see your point in that context & agree it is a fundamental problem, yet it goes back to my observation earlier where the focus should be more on what is taking place in the main name-space and not so much on the status of the Index: or Page: work-benches at the same time. -- George Orwell III (talk) 12:56, 21 August 2012 (UTC)
- The green bar on Manual of the New Zealand Flora is correct. The one page that is transcluded to the base page has been validated. The "impressive" contents list has all been proofread once. The rest of the contents list will only be added when the pages that relate to the particular order have been proofread (there are over 100 orders in Cheeseman's book). It will gradually be completed, however I figure that if we can make some parts of the various reference works we are working on available to the general reader then we are fulfilling our mission. Beeswaxcandle (talk) 05:55, 22 August 2012 (UTC)
- Thankyou, Beeswax, I finally understand why those colours are all over the place! I'll add an appropriate explanation somewhere in the help files. But it does raise the question whether it is a useful annotation, particularly on the "base page" (where in a number of cases I've found there isn't any transclusion). Unfortunately the idea of a base page is one which doesn't occur in the software itself, so it's not clear one could treat it differently. Chris55 (talk) 09:07, 27 August 2012 (UTC)
“ | T=[[Category:Index Validated]][[:Category:Index Validated|Done—All pages of the work proper are validated]] |
” |
The discussions for this are in the archives for Scriptorium — billinghurst sDrewth 01:04, 20 August 2012 (UTC)
HathiTrust scraping required
Can someone who has access to Hathi see if there is actually a full copy of the Mennell's work The coming colony. Practical notes on Western Australia. at http://babel.hathitrust.org/cgi/pt?id=uc1.b304901 If there is, can we get it poked over to archive.org for conversion to djvu, or something! Thx — billinghurst sDrewth 13:00, 19 August 2012 (UTC)
- I'm in the U.S. and I can "see" 182 pages total so it appears to be all there. Without a partner login however, I can only download it a page at a time like everybody else. Scraping would be the way to go it seems... unless somebody knows a way to automate the single PDF page downloading - I can easily merge them into a single PDF file (& without the watermarks) if it matters. -- George Orwell III (talk) 14:17, 19 August 2012 (UTC)
- GO3, using that link immediately above I see 144 pages and then come ads. The last of the text is; "..minate it. A small species of porcupine, and the flying fox, are found in the Northern Districts." Also, someone stated that you can only download one image per day but I do not think that is correct. You can download them one after another and then place the individual pages into a .PDF file. That is a slow process though. What is this about "Australia", did someone find Au_Gold? WMM2 ( William Maury Morris II (talk) 01:39, 20 August 2012 (UTC)
- Correct; anyone can download a page at a time all day long without restriction. Even the "scraping" runs had pretty good throttling times between image captures. HathiTrust is a goldmine of legitimate public domain stuff that primarily Google digitized originally. Since Google has become a bloated fat-cat online entity, they don't feel obligated to go back and put these works into full-view for the public at large. I recall Google had some ridiculous number of copyright violation complaints per month and I don't think that has changed. -- George Orwell III (talk) 10:54, 20 August 2012 (UTC)
- I thought that some kind soul had a login. — billinghurst sDrewth 16:15, 19 August 2012 (UTC)
- Yeah, well, you guys can see more than I can ... "Full view is not available for this item ... due to copyright © restrictions." They play their own make-believe "get out of jail free" card. C'est la vie. <shrug> — billinghurst sDrewth 13:34, 20 August 2012 (UTC)
- Billinghurst, do you want the book (The Coming Colony) that the above link connects to or not? If you do I will download the pages for you and place them in a .PDF file. Each page is in .pdf format and as George pointed out this will take a while, but within 2 days or less I believe I can get it. I haven't tried this slow process so perhaps faster. What I get will show "Google" on the bottom of the pages. A kind soul, William Maury Morris II (talk) 16:34, 21 August 2012 (UTC)
- Worse. It will have both 'Digitized by Google' & 'University of So-and-so' along the middle to bottom right (much like the normal Google Books scans do) and a hidden timestamp and mention of the Trust offset 180 degrees from the text within the left hand margin reading bottom to top. All of that is not an issue if you have Adobe 10 or higher but it is extremely time consuming to do even with just the 'Digitized by Google' watermarks alone.
I'd be willing to do the cleaning & prep it for text layer/DjVu conversion after all the pages are compiled into a single PDF if it matters any (I'm struggling with bandwidth v. time as it is). -- George Orwell III (talk) 17:05, 21 August 2012 (UTC)
- Great Caesar's Ghost! [Perry White from 'Superman'], all that you have stated reminds me of reading about the good ole days when the people were not allowed to read the Bible by decree of the Catholic Church. I have Adobe 10.1.4 and you probably still have that antique version of 10.1.3 :-) Well, what can we do for good ole Billinghurst? What do you suggest? I will try to download all pages and you can clean them, how is that? It's do-able, is it not? We can always collect just the text and refine that on pages. Correct? Respectfully, Maury (William Maury Morris II (talk) 17:18, 21 August 2012 (UTC)
- Thanks for the heads up on the update - I don't believe that it made any difference concerning any of this or you would have mentioned it; right? The key with HathiTrust is not to focus on (or use?) their page assignments (think our pagelist here). If you go by what they offer in their "viewing tool" you'll wind up missing pages. If you go by the seq=### in the url itself, you can see all the pages in the work and not just what they thought was relevant & assigned a page number for. I believe this is why you said you saw ~144 pages while I said I saw ~182 pages (or 182 positions [sequence #s] to be more exact here).
And sure, I'd be willing to work towards making another insufferable Aussie text available on en.WS - at least it would be one that we wouldn't have to investigate for scanning flaws or missing pages afterwards (ducks quickly). -- George Orwell III (talk) 17:56, 21 August 2012 (UTC)
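Going by sequence number rather than assigned page number could be sketched like this. It is only a rough illustration: the base address and item id are the ones quoted earlier in this thread, while the `&` separator, the function name `seq_urls` and the total of 182 positions are assumptions taken from the discussion, and nothing here actually fetches a page.

```python
from urllib.parse import urlencode

# Viewer address as quoted in the thread
BASE = "http://babel.hathitrust.org/cgi/pt"


def seq_urls(item_id, total):
    """Build one viewer address per sequence position, 1..total."""
    return [
        f"{BASE}?{urlencode({'id': item_id, 'seq': seq})}"
        for seq in range(1, total + 1)
    ]


urls = seq_urls("uc1.b304901", 182)
print(len(urls))
print(urls[48])  # the address for seq=49
```

Walking the list in order visits every scanned position, including blanks and unnumbered leaves that the page-number view may skip.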
- George, I go page-by-page and I number them as 001, 002, 003 according to their own page number. It's easy but it takes awhile. However, do you see a page 024 (24)? The one I see is a blank and should not be. Remember, this is gOoGle. -- William Maury Morris II (talk) 18:09, 21 August 2012 (UTC)
I see...
- seq=49, num=23 in url and p. 23 in scan.
- seq=50, num=23 in url and beginning of Chapter 6 in scan (assume this is p.24)
- seq=51, num=23 in url and a blank page in scan.
- seq=52, num=24 in url and a blank page in scan.
- seq=53, num=25 in url and p. 25 in scan.
... so yes - 2 blanks exist but I still don't see any missing text content assuming the last sentence on the first page of Chapter 6 (assume p.24) agrees with the first sentence of p.25 for you as it does for me. Now seq 51 & 52 might have been an image and its blank - I didn't check for that.
Otherwise you see what I mean about ignoring their assigned num[bering]? Assuming Chap 6 is indeed p.24, the fact the number 24 doesn't appear in the scan means their "pagelist" didn't know how to rectify the omission and probably jammed 2 blank pages in there for no good reason. -- George Orwell III (talk) 18:33, 21 August 2012 (UTC)
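The mismatch listed above can be checked mechanically. A minimal sketch, using only the five seq/num pairs reported in this thread; the function names are made up for illustration.

```python
from collections import defaultdict

# (seq, num) pairs exactly as reported in the list above
observed = [(49, 23), (50, 23), (51, 23), (52, 24), (53, 25)]


def seqs_by_num(pairs):
    """Group sequence positions under the page number assigned to them."""
    groups = defaultdict(list)
    for seq, num in pairs:
        groups[num].append(seq)
    return dict(groups)


def repeated_nums(pairs):
    """Return only the assigned numbers shared by more than one position."""
    return {
        num: seqs
        for num, seqs in seqs_by_num(pairs).items()
        if len(seqs) > 1
    }


print(repeated_nums(observed))
```

Any number that turns up against several positions marks a spot where the assigned numbering should not be trusted and the scans themselves need a look.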
- Curious - I cannot find a table of contents in the front matter scans. Normal? -- George Orwell III (talk) 19:06, 21 August 2012 (UTC)
- George, please see http://en.wikisource.org/wiki/User:William_Maury_Morris_II/sandbox6 which also will show on the Watchlist. This is the text 1-26 and the way I see the images I have been d/l in between thunderstorms here. Does this look correct by the numbers? It does look like an interesting book but I hope it isn't 800+ images! William Maury Morris II (talk) 21:51, 21 August 2012 (UTC)
- OK looks the same as the Trust version but what is the point? That plain text is not part of the PDF but their own OCR ran afterwards - it probably won't come along with the pdf pages (if it does all the better) that I thought you were downloading. If you still question the "real" content of the book see if thumbnail view works for you. That should show you all 182 pages including the blanks, etc.
- Btw - I don't know if there are images or not - my old browser doesn't always work as advertised on HathiTrust's whacky system. I didn't see any but I really wasn't looking for any either. -- George Orwell III (talk) 22:35, 21 August 2012 (UTC)
- GO3, My concerns are that no matter what manner (including thumbnail view) I use to view these images I see only those I've mentioned being 144. No way do I see 180+ and I see the complete book including the library stamp. Also, did you want these as .png images? That is what I have been saving them as. Then where do you want me to upload the file, to IA? William Maury Morris II (talk) 03:41, 22 August 2012 (UTC)
- I did not mention in the slightest way, shape or form anything other than PDF(s). However, if you have a single file of some sort at this point upload it here to en.WS and I'll take a look at it. -- George Orwell III (talk) 03:49, 22 August 2012 (UTC)
- I know that you did not mention another format, but being new to the process on that site and learning that file save as .png is a lot faster, I collected a few dozen. However, it has been storming (thunder + lightning) here, as I think I stated above, so I was off the computer. Since you have limited time and bandwidth let's try this. I will upload 31 .pdfs in a single file (Australia-01) so that you can look at it and let me know if I am doing this right or not. You can download these 31 pdfs in a small single file, separate them into individual .pdfs and be cleaning them off-line as I continue collecting the rest of the pages in .pdf format. Now to figure how to upload to WS... All for Billinghurst William Maury Morris II (talk) 06:02, 22 August 2012 (UTC)
Scraping progress
- Saving as anything other than PDF flattens the document into one layer or image: no possible hidden text-layer, watermarks become a permanent part of the page, and the timestamp & any metadata become embedded. This makes removing them 100x harder than it already is if you just click on the left-hand side's 'download page as PDF' at HathiTrust.
Let's just drop this and move on to stuff that we can actually accomplish. -- George Orwell III (talk) 14:18, 22 August 2012 (UTC)
- Every file I uploaded is a .PDF file. All of the .PDF files were placed into one larger .PDF file. I never uploaded any image file such as .png &c. You have not yet looked at that one .PDF file or you would know this. I downloaded each .PDF file from the left exactly as you are suggesting. My understanding at this point is that either you have not looked at the uploaded file consisting of individual .PDF files or that you just deem that it is too much work and/or too time consuming which it is but it also is a good book. If you want to quit before you start that is your right. I will probably continue and upload to Commons, or here, then work out the rest. Somehow there is a way. I don't quit before I start. For billinghurst. Godspeed, William Maury Morris II (talk) 21:30, 22 August 2012 (UTC)
Then focus on what was actually discussed, provide a link to what you've up loaded here temporarily to en.WS, refrain from further preaching over a volunteer out-of-bounds project and we'll get along fine & finish this, k? -- George Orwell III (talk) 04:39, 23 August 2012 (UTC)
- I am not preaching—I stated facts. What I have done is for Billinghurst (who asked for help) because he was the first administrator I came to know on WS. He helped me a lot with placing books on Wikisource circa 2006/09 while using his manners and was never a grump about any of it. I have just now completed uploading the entire book with each page being a .PDF file, on Internet Archives. William Maury Morris II (talk) 08:18, 23 August 2012 (UTC)
http://archive.org/details/TheComingColony.PracticalNotesOnWesternAustralia
Why would you upload it to IA & its DjVu conversion with all the garbage in place - oh well; I thought I said to upload it here.
- Because it was stated at the beginning, " can we get it poked over to archive.org for conversion to djvu, or something! Thx — billinghurst sDrewth 13:00, 19 August 2012 (UTC)" and I had thought you were bowing out so I followed the above. William Maury Morris II (talk) 16:02, 23 August 2012 (UTC)
Billinghurst, would you like me to trim the excess watermarks and timestamps anyway? This would mean a second upload to IA to create a hidden text-layer and an all new entry for the work. -- George Orwell III (talk) 11:30, 23 August 2012 (UTC)
- The last message I read from you stated that you were bowing out ("Let's just drop this and move on to stuff that we can actually accomplish"). Therefore, I didn't look back and kept working on gathering the book pages as I said I would, thinking that you had indeed discarded the idea of working on this book. It took a while but I collected all pages in .PDF and then placed those into a larger, but yet small, .PDF file and then placed it on IA just as I said I would. I can easily upload it to WS and I will do that next since you are not bowing out. William Maury Morris II (talk) 14:10, 23 August 2012 (UTC)
- Okay, again, just as I stated that I would. William Maury Morris II (talk) 14:24, 23 August 2012 (UTC)
I was cleaning the upload to IA anyway and have removed both watermarks and timestamps already (uploaded over previous @ en.WS). The problem now is that the pages need to be cropped down to remove the excess whitespace per page. This, through no fault of anybody except the Trust themselves, means the current hidden text-layer stored in the upper right half of the left margin will need to be cut (hover your mouse pointer over it to see it change to a cursor). I've tapped my Adobe guy to see if there is any way to save it without whacking it just to recreate again. More when it develops. -- George Orwell III (talk) 15:23, 23 August 2012 (UTC)
- I know you are doing a lot of work although apparently you are very fast with it. IF I knew how to do what you are doing I would not mind doing that aspect of the work. I think the process you are doing should be included on the WS:HELP page so that you would not always have to do this particular work. I am going to check on Amazon.com to see if they have a book on this. I am retired and have nobody "at work" to show me the many ways to work with .PDF Files and I have never seen nor read a book on .PDF files. You are more valuable as an administrator helping people. William Maury Morris II (talk) 16:02, 23 August 2012 (UTC)
- Fast? It took me almost all morning to get that far and there is a ways to go still. And, I don't think double clicking on an 'Object' (a component within a PDF such as the Digitized by Google watermark) until it turns highlighted blue & hitting the keyboard delete button hardly warrants a tutorial.
- Since then I've been waiting for a better solution than the one I usually use with these damn documents re: hidden text-layer in margin instead of under the actual text, and have come to realize that PDF pages 13 & 14 are duplicates of PDF pages 11 & 12. If you can verify that, I can at least delete those before making more edits if & when I get my reply. -- George Orwell III (talk) 20:41, 23 August 2012 (UTC)
- Here's the best that I could muster... feel free to further them in any way to achieve desired results. Of course a move to Commons is needed.
- File:The_Coming_Colony.pdf (last upload)
- File:The coming colony Mennell 1892.djvu (see Index)
- George Orwell III, please move the file to wherever it is supposed to go. I do not know how to "move" a file to Commons. It looks like the version with the "Index" is set up to be edited. William Maury Morris II (talk) 00:34, 25 August 2012 (UTC)
- I wouldn't know how to move the file(s) in the traditional sense to Commons even if Billinghurst had reviewed either one and found it/them acceptable already. Why don't we wait and let him sign off on it/them before doing anything else. If he's OK with one or the other as is - I'm pretty sure he can handle the needed moving and clean-up. -- George Orwell III (talk) 00:47, 25 August 2012 (UTC)
- That's fine and especially since neither of us knows how to "move" the files. However, I am not sure that he is even aware of what we have been doing for him. He keeps many irons in the fire. I am also busy learning 3-dimensional photography (not that old red/blue) from one of my sons using the "HD Hero2" camera from gopro.com It's a fantastic 3-D world. It would really be be something to have old illustrated books text with 3-dimensional images (not anaglyph) on Wikisource for free downloads. My son is deep into that world + animations with others. This is the son that created, owned, and sold a nationwide ISP. Thank you for that tip on removing "diGitiZEd bY gOOgle" from the .pdf files William Maury Morris II (talk) 03:54, 25 August 2012 (UTC)
Front matter, covers, etc.
Hey all,
Although I could mostly predict how most will respond to this from the long-timers who I've had the pleasure to work with and who have helped me along the way, I would like to bring up the mostly trivial issues of front matter, covers, and the like. Across the site, there are inconsistencies with what is presented on the landing page for a work. Some use a back link to "front matter," some simply don't transclude covers, forewords, "work by this author", some include pieces in what they find relevant to the work, and some include everything. I happen to fall in the latter group, and without going into it too much, I see value in these things as it really gives me a sense of being "in" the book. I think it would be neat if we could meet somewhere besides "left up to the workers," which has left me with the sensation of walking on eggshells. Not a rule, but I think we will benefit from a general guideline. - Theornamentalist (talk) 18:18, 19 August 2012 (UTC)
- For the record, I like to include all front and back matter. I usually put it in the base page, separated by {{page break}}s, but I don't mind it being put in a subpage of its own, especially if it is very long. I started a stub help page, Help:Front matter, earlier this month for September's help page drive but it doesn't have much content right now. Although I planned this to be a help page when complete, in its present state it could become a guideline instead (or could be complemented by a guideline). - AdamBMorgan (talk) 03:41, 20 August 2012 (UTC)
- I too generally put the front matter on the base page—unless the preface is long or the TOC is complex. However, I tend to amend the sequence and put the title page before the frontispiece so that those with small screens can see that they have landed in the right place without scrolling past an image. I haven't given much thought to transcluding covers—mainly because they're either an uninspiring monochromatic wash of colour that contributes nothing to the æsthetics of our version of the work or they have a library bar code that I don't have the artistic skills to conceal. Beeswaxcandle (talk) 04:54, 20 August 2012 (UTC)
- I include covers as front matter with pulp magazines (including the back covers). I would include a book if it was an interesting and/or illustrated cover but I wouldn't bother if the cover was just blank cloth/paper/leather. If a cover image included a bar code it would be unfortunate but I would still use it if there was no alternative available. One pulp magazine so far had a ripped cover on the scan but I was able to use a different cover image because it was available on Commons. This might be possible for book covers as well. - AdamBMorgan (talk) 16:34, 20 August 2012 (UTC)
- Same here; I won't upload every cover, but only based on my idea of interesting. I have a pretty low bar for interesting covers: typically the title/author/some silly graphic. This is all a result of the cover for Black Beauty, which was marked no text (with no uploaded image) a day or so ago. I added it because I thought it looked nice, but the thought that others found it uninteresting and thus did not upload it did cross my mind. I guess what I would like to get at is where and when it is appropriate, because I could imagine someone else (the original workers) envisioned it without it when stylising the page. I know this is all trivial. But I look at something like Bunny Brown and His Sister Sue at Camp Rest-a-While and think the nice little illustration belongs there; did Beeswax not think so? Would they care if I added it? Not give a crap either way? Would Xxagile care about moving this cover and deleting /front matter for First Lessons in Geography? - Theornamentalist (talk) 20:44, 20 August 2012 (UTC)
- The problem with the cover for Bunny Brown is that it is a series cover (same image for every book in the series) and part of the distinguishing title is obliterated by the library label. That said, I'm not going to object if someone wants to add that cover to the mainspace base page. Beeswaxcandle (talk) 06:00, 22 August 2012 (UTC)
One-click "Without text"
Tagging an empty page as "Without text" potentially requires
- Clearing the header box (of the index-specified default)
- Clearing the edit box (of any spurious OCR)
- Clearing the footer box (of the index-specified default)
- Clicking the "without text" radio box
- Clicking "Save"
That seems like an awful lot of work just to save an empty page — and I've been doing that a lot lately — so I wrote a script to turn the "without text" radio box into a one-click button that does all of the above. If you want it, add the following to your .js page:
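addOnloadHook(function () {
    // The quality radio buttons only exist on the Page: namespace edit form.
    var qualityContainer = document.getElementById('wpQuality-container');
    if (qualityContainer === null) return;
    var quality0 = qualityContainer.getElementsByClassName('quality0');
    if (quality0.length === 0) return;
    // The first child of the "quality0" element is taken to be the "without text" radio button.
    var withoutText = quality0[0].childNodes[0];
    withoutText.addEventListener('click', function () {
        // Clear the body, header and footer boxes, then save the page.
        document.getElementsByName('wpTextbox1')[0].value = "";
        document.getElementsByName('wpHeaderTextbox')[0].value = "";
        document.getElementsByName('wpFooterTextbox')[0].value = "";
        document.getElementsByName('wpSave')[0].click();
    });
});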
Hesperian 01:12, 20 August 2012 (UTC)
- Thanks for sharing.--Mpaa (talk) 06:55, 20 August 2012 (UTC)
Helping Wikipedia users to cite Wikisource
My query is pretty much as per Wikisource:Scriptorium/Archives/2011-11#Wikipedia_citation-template_style: Can we yet get a (basic even) preformed WP template into Special:Cite/1911_Encyclopædia_Britannica/Grace,_William_Gilbert? Moondyne (talk) 13:02, 22 August 2012 (UTC)
- Maybe. The Wikipedia citation would need to be added to MediaWiki:Cite text and this looks pretty straightforward. The complicated part is going to be Wikipedia. Wikipedia uses several citation templates related to Wikisource and each is slightly different. My attempt to standardise and unify them into {{cite wikisource}} was reverted, although that template still exists. In this case, you would want Wikipedia's {{cite EB1911}}. It should be possible to create a template to pick the right citation template based on the basepage but I'm not sure if we are meant to be putting templates into the MediaWiki namespace. (Another complication is metadata: there is no simple way that I know of to extract this from a page, so there will be no author or year in the citation. This won't be a big problem with {{cite EB1911}}, however, as it doesn't use much metadata.) - AdamBMorgan (talk) 15:04, 22 August 2012 (UTC)
- Trial version implemented and tested. It will only select special templates for 1911 Encyclopædia Britannica, 1922 Encyclopædia Britannica and Catholic Encyclopedia (1913) at the moment. I can add support for a few more templates later. All citations include blank parameters that can be filled in manually, ignored or deleted. Is this what you wanted? - AdamBMorgan (talk) 19:38, 22 August 2012 (UTC)
- Yes, that's cool. Can you include a blank page=| also? Moondyne (talk) 15:01, 23 August 2012 (UTC)
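For illustration only, the idea of picking a citation template from the base page can be sketched in JavaScript. This is not the trial version described above, which lives in MediaWiki:Cite text; the names CITE_TEMPLATES and buildCitation are hypothetical, the parameter names in the output are assumptions, and only the {{cite EB1911}} mapping is taken from this discussion. Other works would need their own entries.

```javascript
// Hypothetical lookup table: Wikisource base page -> Wikipedia citation template.
// Only the EB1911 entry comes from the discussion; add others as needed.
var CITE_TEMPLATES = {
    '1911 Encyclopædia Britannica': 'cite EB1911'
};

// Build a wikitext citation for a Wikisource page title.
// Parameter names (wstitle, title, page) are assumptions for this sketch.
function buildCitation(title) {
    var normalized = title.replace(/_/g, ' ');
    var slash = normalized.indexOf('/');
    var base = slash === -1 ? normalized : normalized.slice(0, slash);
    var sub = slash === -1 ? '' : normalized.slice(slash + 1);
    // Use a work-specific template when the base page is known and there is a subpage.
    if (Object.prototype.hasOwnProperty.call(CITE_TEMPLATES, base) && sub) {
        return '{{' + CITE_TEMPLATES[base] + ' |wstitle=' + sub + ' |page= }}';
    }
    // Otherwise fall back to the generic template with blank parameters.
    return '{{cite wikisource |title=' + normalized + ' |page= }}';
}
```

Called with '1911_Encyclopædia_Britannica/Grace,_William_Gilbert', this returns a {{cite EB1911}} citation with the article name filled in and a blank page parameter left for the editor, as requested above.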
Category cleanup work
- Update: I've been doing some category cleanup work -- there should (at least at present) no longer be any uncategorized categories. Feel free to be in touch if you want any input or advice about how to go forward with this type of initiative. ;) Cheers, -- Cirt (talk) 17:57, 22 August 2012 (UTC)
Transferring Files via Dropbox
My sons and I have used Dropbox for easy transfer of files of all kinds. Just consider it after reading about it. This eliminates downloading/uploading to areas like Internet Archive, aka archive.org. Use the alias you use here if you want. The program is free up to X# gigabytes. William Maury Morris II (talk) 16:58, 23 August 2012 (UTC)
- Wikipedia: http://en.wikipedia.org/wiki/Dropbox_(service)
- Dropbox.com "Tour" https://www.dropbox.com/tour
- In what scenario, related to Wikisource, did Dropbox save you any "downloading/uploading to areas like Internet Archive"? --LA2 (talk) 18:18, 23 August 2012 (UTC)
- Use your imagination, Lars. William Maury Morris II (talk) 03:56, 25 August 2012 (UTC)
- I would like you to actually answer his question, since I'm as equally unimaginative as Lars is and don't know how I would use Dropbox, which I use daily, to assist in my Wikisource work. Given my affinity for interwiki/cross-wiki edits, it may not be as relevant to me... but since you didn't bother explaining what you meant, I don't know. EVula // talk // ☯ // 18:36, 25 August 2012 (UTC)
- WMM2, That wasn't a very helpful answer. A clearer reply would help others (like myself), who may like to take advantage of the service in the context of Wikisource. — Ineuw talk 20:43, 25 August 2012 (UTC)
- I think he was only trying to provide a useful resource for handling large files, say, if I have scanned something but need help with file conversion, ordering pages, etc., and someone can help me, that we can use that service instead of IA to send it to each other when a mailbox limit has been reached. Of course, speculation, but that's how I took it. - Theornamentalist (talk) 21:08, 25 August 2012 (UTC)
- Perhaps my statement was too hasty. I don't recall if I was in a hurry at that time or not. I assumed people here, especially those smarter than I with technology, could get good use of that program. It was a passing thought meant to help others. I never intended any "quip" and I disliked that overall comment beyond the word "quip" as it was never intended by me to be a quip. Theornamentalist you are on center of what I meant and unlike a couple of the other comments others stated, you used manners and I felt not even the slightest badgering from your comments--which I would never accept. Ineuw, I am somewhat surprised at your first statement which was not needed. You of all people in this conversation should have realized what my initial statement of the program could provide. I say this because of your excellent cleaning of many images on the Darien Expedition article. In our conversation we covered the problem of not being able to send files to one another because your mailbox was limited. You worked out an easy solution but we did discuss the fact that you could not send the cleaned images to me for the book to be placed into a .PDF File created with my Adobe Acrobat program. Had you and I been using dropbox at that point we could have easily sent the image files to me to be placed in a .PDF file. The statement of "Use your imagination" by me to Lars was not intended to offend anyone. He is a smart person and I really did think he would be able to use that "dropbox" program in more ways than I could think of on Wikisource and aside from Wikisource. My intent was only to point out that program so each and every person would possibly know, or figure out, an alternative method of working with large files aside from uploading to IA to achieve a common goal. Too, I was thinking of two people working on a project. George Orwell III and I recently had a task getting and then cleaning a book Billinghurst requested.
What I wrote was meant to help others--each according to needs and their imagination in solving any potential problems with files whether they be large files or combining files and bypassing the limitations of sending files via email. I placed one version on IA and George had to download it from there, work on removing watermarks, &c. and placed more than the version I uploaded to Wikisource. In the back of my mind came the thought that perhaps, in some way, dropbox could have saved the two of us from some extra work and frustration. Too, now (if not deleted) there are several versions stuck on Wikisource to be deleted and one to be moved to Commons. I thought of replying only to both Ineuw & Theornamentalist via email but finally decided to post here with the possibility, the risk, of creating an argument or more insisting from others. I will not accept the feeling of being badgered into answering anything from anyone. Theornamentalist, I thank you for using manners but then you always use good manners. I would not have made any further statements here on this subject if not for you and those good manners of yours. You gave me the benefit of any doubt. Like Ineuw, you both can contact me via e-mail on any question. —William Maury Morris IITalk 02:15, 26 August 2012 (UTC)
Layout Choice
Hi. Does the layout choice work? It disappeared for me. I also noticed that some scripts that were working until a few days ago now are not working any longer.--Mpaa (talk) 22:13, 24 August 2012 (UTC)
- Still working here - both inline & otherwise - dohs XP w/IE. -- George Orwell III (talk) 22:30, 24 August 2012 (UTC)
- I do not know what happened. The link in the margin to select layouts disappeared. I can't figure out why and how to restore it. I can't see also the small page number link to pages in page ns, but that is probably due to the kind of default layout that is loaded … Anyone else experiencing the same problem?--Mpaa (talk) 17:05, 25 August 2012 (UTC)
- I've just checked and I'm having the same problem but only when using Chrome. When I open the same page in Internet Explorer, I see both the Display options and the floating page links. - AdamBMorgan (talk) 17:47, 25 August 2012 (UTC)
- Working fine for me in Chrome. While on the topic, can we remove the forced font in layout 2? - Theornamentalist (talk) 21:09, 25 August 2012 (UTC)
- Forcing that font in Layout 2 was done by proposal & "consensus" awhile back so I don't think reversing it will be easy. I do think that change was made in haste since the font changes should have been isolated to just the content and not the header, etc.
fwiw.... you can override the defaults by adding the desired Layout 2 parameters to your local .js file. -- George Orwell III (talk) 23:56, 25 August 2012 (UTC)
- FYI, working again.--Mpaa (talk) 09:25, 26 August 2012 (UTC)
Something is broken
Neither the "revision history" of this page nor Special:Contributions/JeepdaySock shows the edit made at Wikisource:Scriptorium#Bot and signed "JeepdaySock (talk) 15:50, 28 August 2012 (UTC)"…. Jeepday (talk) 00:25, 29 August 2012 (UTC)
- Wikisource:Scriptorium/Help is a separate page with its own history transcluded into this one. I can see it there and on the contributions page (edit id: 4039672, timestamp 11:50, 28 August 2012). Does it appear for you? Prosody (talk) 01:25, 29 August 2012 (UTC)
- Yes, I see it on Scriptorium/Help history, and am now seeing it on Special:Contributions/JeepdaySock. Not sure what happened. Thanks for taking a look. JeepdaySock (talk) 10:31, 29 August 2012 (UTC)