Wikisource:Scriptorium/Archives/2009-05

Announcements

Wikipedia AFD: The Free Bible

I have nominated w:The Free Bible for deletion on Wikipedia. The article is about Bible (Wikisource), the free Bible translation here on Wikisource, so regulars here are most likely to be able to opine informatively and/or want to rescue the article if it can be saved. John Vandenberg (chat) 00:32, 2 April 2009 (UTC)

I would love to keep the page. To be honest, in general, I think Wikipedia is way too delete happy. I guess the question is if it is significant enough or not. I say keep it. --Mattwj2002 (talk) 01:48, 2 April 2009 (UTC)

Licensing update progress notice

The licensing update proposal to dual license all Wikimedia Foundation wikis under both the GNU Free Documentation License (GFDL) and the Creative Commons Attribution-ShareAlike License (CC-BY-SA) is moving into its final phase. This proposal has been put forward by the Foundation and made possible by recent changes in the GFDL. Adopting the new licensing scheme is contingent on community approval. In several days a site notice for all editors will announce the start of three weeks of community voting on this proposal. In the mean time we would invite you to visit the update proposal and its associated FAQ if you want to learn more. We would also appreciate your help finishing the translation effort for the core documents associated with this process. —Anonymous DissidentTalk for the Licensing Update Committee 20:53, 2 April 2009 (UTC).

Proposals

Transwiki Bot Request

Hello, everyone. My name is Hersfold, and while this is my first time here, some of you may know me from the English Wikipedia. Just recently, I've set up a bot, w:User:HersfoldBot, to automatically handle Transwiki requests between the English Wikipedia and the English Wiktionary using the Special:Import API. The bot was fully approved between those two projects a few days ago, and so far has been running without any problems. Since Wikipedia gets a large number of articles that don't really belong there, I am interested in expanding the scope of the bot to import articles to other projects as well. Currently, our largest transwiki backlog is for articles slated to be transferred here (see w:Category:Copy to Wikisource), as your import log shows that none have been completed (or at least logged) for nearly a year. Before I begin adding to the bot's code to allow this, I wanted to know if you as a community would be open to this idea.

The bot operates by using the Special:Import feature to transfer the entire attribution history of an article to the destination project, instead of simply copy-and-pasting the most recent version into the edit box. Should the import be successful, it will log the transwiki in the appropriate locations on both sites, clean up the transwiki templates on both ends, and move onto the next article. When run, the bot will be under my supervision. The bot reports all of its actions to my screen as it runs. Should a problem occur, any user can easily stop the bot themselves. Administrators on either project may block the bot on their end, and any user may leave a message on the bot's talk page to make it stop (although it will only check the talk page and block status on Wikipedia and the current destination project). The bot will not allow itself to run while it is blocked or while it has new messages.

Technically speaking, the bot is coded in Java using MER-C's Java library and is set up to run directly from my desktop whenever necessary. It normally runs at about a five-to-ten-second throttle speed (depending on what sort of mood it's in and the server load), and will stop all actions for 30 seconds when the database lag becomes more than 5 seconds long. While the only permission it strictly requires to operate is the import flag (which would need to be granted by a steward), I noticed that according to your bot policy it would require a bot flag as well if it continues to operate at these speeds. Tech-savvy users are welcome to browse through the bot's source code, which is available for viewing at w:User:HersfoldBot/Source; others may simply wish to view the bot's work so far at Wikipedia and Wiktionary. I am more than happy to answer any questions you may have about this; also, if there are any particulars I should know about your processes before coding begins (assuming you would like the bot to work here as well), please let me know. Thanks for your time! Hersfold (talk) 07:00, 28 March 2009 (UTC)
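For readers who want to picture the throttling behaviour described above, here is a minimal sketch in Python. It is not the bot's actual Java code: the function name `run_throttled`, the `get_lag` callback and the injectable `sleep` are all assumptions made for illustration. Only the numbers (a throttle of about five seconds, a 30-second pause once database lag exceeds 5 seconds) come from the description.

```python
import time
from typing import Callable, Iterable, List

THROTTLE_SECONDS = 5.0    # lower end of the five-to-ten-second throttle described above
MAX_LAG_SECONDS = 5.0     # pause when database lag is above this
LAG_PAUSE_SECONDS = 30.0  # length of each pause while lag stays high


def run_throttled(
    tasks: Iterable[Callable[[], None]],
    get_lag: Callable[[], float],          # hypothetical callback returning current lag in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Run tasks one by one; return the list of sleep durations that were requested."""
    pauses: List[float] = []
    for task in tasks:
        # Hold off for as long as the reported database lag is too high.
        while get_lag() > MAX_LAG_SECONDS:
            sleep(LAG_PAUSE_SECONDS)
            pauses.append(LAG_PAUSE_SECONDS)
        task()
        # Throttle between consecutive actions.
        sleep(THROTTLE_SECONDS)
        pauses.append(THROTTLE_SECONDS)
    return pauses
```

Passing a fake `sleep` makes the loop easy to exercise without waiting. The block and new-message checks mentioned earlier are deliberately left out of this sketch.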

There is a bit of discussion about the transwiki log at Wikisource:Proposed_deletions#Wikisource:Transwiki_log_deletions
John Vandenberg (chat) 00:12, 31 March 2009 (UTC)
I think the backlog can't be automated. For example, w:Gresham College would be imported, and we don't want that page. And the Will of Sir Thomas Gresham on that Wikipedia page is only an excerpt, and excerpts are discouraged.
An important distinction with importing to Wikisource is that we do not need to maintain the edit history in order to be legally compliant with the GFDL, as the original text should not be under copyright.
All of the media files in w:Category:Copy to Wikisource should be moved to Wikimedia Commons, and maybe added to Requested Texts for someone to work on.
I would feel more comfortable if you did a few transwikis by hand, and then manually selected which texts to import after reviewing the Wikipedia page.
John Vandenberg (chat) 00:44, 31 March 2009 (UTC)
Regardless of the technical implementations of the bot, like John, I don't think the process can be automated. Unfortunately, Transwiki namespace on here, in our local project, is mostly a dumping ground for texts with no licensing information, incomplete texts, completely random texts, etc, that have been copied from Wikipedia in the past few years, and, glancing over the category in Wikipedia that you linked to, it appears that there is much the same problem there.
w:My Inventions: The Autobiography of Nikola Tesla, for example, is an encyclopedic page with a considerable amount of excerpts from the text. As they are excerpts and not complete texts, they don't belong on Wikisource. The excerpts could be removed from the page on Wikipedia to return it to an encyclopedic page. I'll make the assumption that it was tagged by somebody who doesn't properly understand Wikisource policies on texts, and has remained tagged ever since then.
My point here being that, if we just start importing the texts one by one, yes, they may be gone from Wikipedia and that category is cleared out, but then we end up with a large amount of text here that we need to deal with and decide what we want to do with, and we already have quite a backlog in the Transwiki namespace, though myself and User:Billinghurst have been doing some work on it. Instead, I think it would be a good idea if suitable texts were tagged as appropriate for Wikisource (perhaps your bot could have a subpage where we can list suitable works?) and then these were imported and removed from Wikipedia as required. Jude (talk) 02:23, 31 March 2009 (UTC)
The w:Category:Copy to Wikisource was new to me, and not something that I had seen much discussed in this community. In the last few days, having looked at some of the pages, moved some, commented on some, shrugged at some, I would agree with John's and Jude's commentaries that some belong, some do not. It seems that there may be the need for a greater awareness, and consultation between the projects, and that is preferable to a bot. It would seem that what does need to be clarified is whether it is a push [we don't want them] or a pull [we want them here] process, or somewhere in between. We can do a little on Requested Texts that may encourage us to go and look and inhale more of these texts. WP probably needs to do more about educating what WS does and how to get the texts brought over. -- billinghurst (talk) 02:50, 31 March 2009 (UTC)
Transwiki was intended as a push process. If editors here aren't keen to process what has already been transwikied, it's unrealistic to believe that they will look at a Wikipedia category to see what needs to be done. Most of these articles will need a lot of work before they can fit in. Eclecticology - the offended (talk) 04:45, 31 March 2009 (UTC)
The impression I'm getting, then (and the impression I got from discussing this with John on IRC), is that setting up the bot for this project would be best left until...
  1. Wikipedia and Wikisource work out what is best to transfer and what is best left alone, and get more editors on both ends reviewing articles marked for transfer,
  2. The bot's code can be developed to the point it can do some cleanup itself (John mentioned the removal of most wikilinks, since you only link to document titles; it seems as though most references would need removing as well), and
  3. The import tool can be set up to only transfer a select number of revisions rather than the full history,...
...assuming it ever would be appropriate to have a Transwiki bot for this project in the first place. If this is the case, I understand; at the same time, though, Wikipedia will need a good amount of help from your end to get the general knowledge we need to make this whole process, automated or not, work. Simply saying "hey, before you mark something for Transwiki, make sure to read Wikisource guidelines X, Y, and Z" isn't really enough; we need help from active Wikisource editors like yourselves to help explain the nuances of each of the relevant guidelines and document those on our end to really make it effective. Right now all we know is "This has a large portion of a document - send it to Wikisource", which apparently isn't quite correct. Wikipedia isn't in much of a position to educate themselves about this; if your policies are anything like ours, there are a bunch of "unwritten rules" that are only known through experience on this project. Hersfold (talk) 16:17, 31 March 2009 (UTC)
My point is precisely that a Wikipedian should not need to know anything at all about Wikisource rules and guidelines before putting something into Transwiki here. The technique may very well have been employed as a compromise in a Wikipedia deletion discussion, but that discussion has no bearing on the decision about what to do with the transwikied article here. It might even be worthwhile to say that if nothing is done with a transwikied article that has been here for three months it can be speedy deleted. Eclecticology - the offended (talk) 18:30, 1 April 2009 (UTC)
Sorry - I misunderstood. As a quick update, I am now adding functionality for the bot to strip out red links and templates (except citation templates) in the imported articles. It seems to be working in testing, but I haven't added it to the bot's working code just yet. Hersfold (talk) 20:29, 1 April 2009 (UTC)
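As an illustration of the cleanup step mentioned in the last comment, the following Python sketch strips templates other than citation templates and unlinks red links. It is not the bot's code. The helper names, the rule for what counts as a citation template (a name starting with "cite " or equal to "citation"), and the `existing_titles` set standing in for a page-existence lookup are all assumptions. Nested templates are not handled.

```python
import re
from typing import Set

# Matches a non-nested template such as {{stub}} or {{cite book|title=X}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?:\|[^{}]*)?\}\}")
# Matches [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def is_citation_template(name: str) -> bool:
    """Assumed rule: 'cite ...' and 'citation' templates are kept."""
    n = name.strip().lower()
    return n.startswith("cite ") or n == "citation"


def strip_templates(text: str) -> str:
    """Remove every non-nested template except citation templates."""
    return TEMPLATE_RE.sub(
        lambda m: m.group(0) if is_citation_template(m.group(1)) else "", text
    )


def unlink_red_links(text: str, existing_titles: Set[str]) -> str:
    """Replace links to pages not in existing_titles with their plain label."""
    def repl(m: "re.Match[str]") -> str:
        target = m.group(1).strip()
        label = m.group(2) or target
        return m.group(0) if target in existing_titles else label
    return LINK_RE.sub(repl, text)
```

A real implementation would ask the destination wiki which pages exist rather than take a prepared set of titles.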

Headerless pages and Special:AbuseFilter

This is both an announcement and a proposal. I've recently finished implementing a Special:AbuseFilter filter that notices when works without headers (only in the main namespace) are edited (clarification: when either there was no header and you haven't added one, or you are removing a header). In theory, these edits should be tagged as no header in recent changes, user contributions, etc. Unfortunately, due to a bug in the software, these tags are not being added anywhere, though edits matching the pattern can be seen here.
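The condition described above can be sketched in a few lines of Python. This only mirrors the logic and is not the AbuseFilter rule itself, which is written in the extension's own rule language and covers more header templates than the single {{header}} pattern assumed here. The function names are hypothetical.

```python
import re

MAIN_NAMESPACE = 0  # MediaWiki numbers the main namespace 0

# Assumption: only {{header ...}} counts; the real filter covers other header templates too.
HEADER_RE = re.compile(r"\{\{\s*header\b", re.IGNORECASE)


def has_header(wikitext: str) -> bool:
    return HEADER_RE.search(wikitext) is not None


def should_tag_no_header(namespace: int, old_text: str, new_text: str) -> bool:
    """True when an edit leaves a main-namespace page without a header."""
    if namespace != MAIN_NAMESPACE:
        return False
    never_had_one = not has_header(old_text) and not has_header(new_text)
    removed_one = has_header(old_text) and not has_header(new_text)
    return never_had_one or removed_one
```

The two cases together reduce to "the saved text has no header", but they are spelled out separately to match the wording of the announcement.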

The functionality exists for a message to be displayed at the top of the edit window to inform users that they are editing a page which is headerless. This works similarly to the "notify me if I edit without an edit summary" and the captcha system found on other pages: upon making the edit, the edit window is reloaded with the notification at the top of it, basically asking the user to confirm that they do indeed wish to make this edit.

I'm of the opinion that having a message to the extent of "This page does not have a header. Please add one, or, if you are uncertain how to do so, continue editing as normal" would be very useful. It could even contain the blank template and further instructions (that way experienced users can just copy and paste and fill in the blanks).

I've started a basic idea of the template at MediaWiki:Headerless-edit-notice, but obviously it would need to be expanded. Further to this, I've recently run a query on the toolserver to gather information about pages which are not tagged with {{no header}} or any permutation of {{header}}. The results can be found in my userspace: User:Bookofjude/headerless. Jude (talk) 11:00, 31 March 2009 (UTC)

Treating headerless pages as abuse is excessive. An abuse filter should only apply to real abuse. Eclecticology - the offended (talk) 12:40, 31 March 2009 (UTC)
Looking at the system, it isn't treating it as abuse, just using the tool for an unenvisaged use. I think that it is quite an ingenious use of the tool. Congrats Jude. Support -- billinghurst (talk) 12:49, 31 March 2009 (UTC)
Just to clarify, nowhere in the procedure would any irregular or new editor find themselves being accused of "abuse", and I assume that the other regular members of the community realise that, just because a tool is titled "abuse filter", other uses for it which are completely unrelated to abuse can easily be found. Jude (talk) 13:01, 31 March 2009 (UTC)
I support this measure.—Zhaladshar (Talk) 23:51, 31 March 2009 (UTC)

Update: I've altered the MediaWiki:Headerless-edit-notice message to read as follows:

Hello! You're editing a work that does not contain a functioning header template. If you know how, please consider adding the following template and filling in the relevant details. If you are uncertain how this system works, or believe that this message is a mistake, please feel free to ignore this notice and continue editing. Also refer to Wikisource:Style guide.

{{header
  | title = {{subst:PAGENAME}}
  | author =
  | section =
  | previous =
  | next =
  | year =
  | notes =
}}

Note: You can set a preference in your Gadgets to have the appropriate headers to automatically self-load.

If this page already contains a header, and it is generating an error, then please check your template for open links, or each of the template components. If you need help, please leave a message at the Scriptorium or contact someone with "abuse filter editor" rights.

Jude (talk) 06:01, 1 April 2009 (UTC)

I support it if correctly implemented. We need to make sure it is smart enough to not alert on pages such as Index: and Wikisource: pages. Also, it needs to be kind and gentle, especially to newbies. But if it can help to eliminate headerless pages, I support it. --Mattwj2002 (talk) 06:38, 1 April 2009 (UTC)
support I don't think we really have the type of issue this extension was meant to address, but if we can find creative uses for it, why not. The only issue I see is that there seems to be some redundancy in the rules, which would make them slower, but it doesn't seem to be causing problems. -Steve Sanbeg (talk) 18:46, 1 April 2009 (UTC)
I only just noticed that there seem to be several regular expressions that are identical, so I've fixed this. In truth, they could probably all be combined into the one expression, but for ease of use and editing at the minute, I think it best to have them all separated. According to the actual page on the filter, its run time is 0ms, so there shouldn't be any lag or delay for editors. I've also been refining it by checking the abuse log every evening and fixing false positives by adding new rules. I believe it's relatively complete, but I think I'll give it at least another few days of testing, and see whether anyone else has any opinions on it, before implementing the notice. Jude (talk) 10:28, 2 April 2009 (UTC)
That's good, it looks much better than before. I agree; combining all the template regexes to avoid repeated searching for the beginning would be more efficient, but most articles should match the first regex in the beginning of the text, and the other articles are probably few and small, so that would be a very small improvement -Steve Sanbeg (talk) 22:44, 2 April 2009 (UTC)
Implemented. Following what appears to be a majority of support (with only minor opposition based on the delivery system), I've implemented this proposal. I believe that I have massaged all the kinks out of it, and that all of the required header templates are included in the rules. Hopefully, there should be no problems, but if anyone experiences any difficulties with this, please either leave a message here or on my talk page, and I'll try and sort it out as quickly as possible. In all likelihood, though, the worst that could happen is that you'll need to press "Save page" twice. Jude (talk) 10:57, 10 April 2009 (UTC)
  • Is there a way to opt-out of it? I don't want to see it every time I look at a page.—Markles 20:16, 13 April 2009 (UTC)
    One should only see it when they create a NEW page and attempt to save it without the header in place. It appears as a warning above the edit box. Are you getting this confused with the {{no header}} template, as that displays the box? That will be displayed until someone adds the box and completes the detail. The hope is that this alert will obviate the requirement to use the latter template. -- billinghurst (talk) 01:01, 14 April 2009 (UTC)
    You should only be seeing this when you edit a page. I believe you may be mistaking this notice with the {{no header}} notice, as, according to the Special:AbuseLog, only two of your edits have been notified, and only one tagged as not containing a header since the alert was implemented. If you are indeed seeing the message at MediaWiki:Headerless-edit-notice, can you drop a list of pages you were editing on my talk page? This might be a bug somewhere along the lines. Jude (talk) 01:07, 14 April 2009 (UTC)

Template:PD-Manifesto

What is our exact stance on the use of {{PD-Manifesto}}? I initially noticed it has been applied to tens of speeches by Author:Barack Obama, in instances where clearly {{PD-USGov}} was appropriate, but apart from this, there are instances where I'm unsure that its use is correct, or where I find it difficult to believe that the text could be considered "public domain". Some examples:

Karl Popper: Prague lecture, 1994, a speech given by a professor in 1994, with no evidence that the text is indeed in the public domain, apart from the fact that it was spoken aloud to several people. U.S. foreign policy in Central America, another speech given by a professor, this time in 1984. Winning the Cultural War, an address by the actor Charlton Heston, in 1999. A lot of works by the Author:Dalai Lama are also tagged with this.

There are many more, and though I haven't personally gone through every page where the template is used, it seems to be that a large amount of them are of doubtful public domain status. Some of them appear to be tagged as such simply because they are of speeches, and {{PD-Manifesto}} is the "Speech" template that allows us to host any recent oratory work, regardless of its actual copyright status.

Of course, there are instances where I don't dispute its use (indeed, as the template was originally intended): actual manifestos, for instance, Manifesto for an Independent Socialist Canada, or The FLQ Manifesto. Open Letter To Tarja Turunen describes itself as an "open letter", something clearly intended for public distribution. In these instances there is at least the possibility that the author intended to release the work into the public domain, and simply didn't do so explicitly, rather than the assumption.

We are "The free library", not the "We hope (or want) this is free, so we'll cross our fingers and pray that nobody sues us library".

So, finally, I have a proposal: implement a system not dissimilar to Wikipedia's policy on Fair Use Images. Yes, we'll accept some texts using the {{PD-Manifesto}} template, but only on the proviso that the uploader (or the person applying the template) documents on the talk page their reasons and evidence for believing that the template's usage is appropriate.

I know this suggestion dumps more bureaucracy on us, which some people are averse to, but I think that in instances where the template is used correctly, it will not be difficult to show the reasons why. Jude (talk) 10:30, 10 April 2009 (UTC)

User:VolkovBot

I'm not sure of the process for flagging interlanguage link bots as such, but User:VolkovBot has been jamming my watchlist with its linking activities, and by extension (to a lesser extent), recent changes. Would it be appropriate to flag the account as a bot, and if so, do we need to get some sort of assurance from the user that he won't do anything with the account other than interlanguage links? --Spangineerwp (háblame) 02:13, 14 April 2009 (UTC)

The bot is a global bot, but, according to the list, enwikisource has not opted-in for global bots. This may be something we should look into. —Anonymous DissidentTalk 10:24, 14 April 2009 (UTC)
Looks great to me—is there a downside? --Spangineerwp (háblame) 15:40, 14 April 2009 (UTC)
Yes the interlanguage link programming designed for Wikipedia creates errors on Wikisource. I will look into the contribs and block this bot if it is doing what I suspect.--BirgitteSB 19:58, 17 April 2009 (UTC)
He hasn't yet caused any of the normal problems. He is largely focused on categories and the texts seem to have been checked by a human. So I didn't block. I did ask him to respond to the concerns here. He speaks English.--BirgitteSB 20:08, 17 April 2009 (UTC)
I haven't seen any problems with the edits; it's just that I have thousands of pages on my watchlist, including many that require fairly heavy interlanguage link updating. Giving VolkovBot a bot flag would solve the problem for me, and the (smaller) problem for people keeping an eye on Recent Changes. I'm not sure of the process for doing this, however. --Spangineerwp (háblame) 20:13, 17 April 2009 (UTC)
Discussion here is the process for a bot flag. That is why I asked him to respond here and explain his method. I didn't see any problem either which makes me believe he is doing something different than past bot operators have done.--BirgitteSB 21:00, 17 April 2009 (UTC)
  • Hi! You're right, I (or, better to say, my bot) am primarily focused on linking categories. I don't use -force mode, to avoid any erroneous removals. I have also slightly amended the script to understand links to "Author" pages. Should you notice any problem in the bot's edits, please let me know on my talk page at ru.wiki. --Volkov (talk) 21:22, 17 April 2009 (UTC)
  • So long as the edits in question are not causing issues, I have no aversion to the bot being granted a flag, though I believe that perhaps implementing the global bot policy here on Wikisource would be a better solution for future, similar issues. I haven't personally noticed any issues with the bot, and I checked a large block of the edits when they first started showing up on my watchlist. I'm all for further interlanguage linking, and this doesn't seem to be causing any trouble. Jude (talk) 02:22, 18 April 2009 (UTC)
  • I will give him the bot flag tomorrow if there are still no objections.--BirgitteSB 17:10, 23 April 2009 (UTC)

I have found one problem[1]; however, I wouldn't object to the bot flag for that edit. The mainspace edits all look good.

This seems to be a formatting problem. In order not to confuse the bots, the language code should be preceded by a colon. --Volkov (talk) 18:29, 24 April 2009 (UTC)

Volkov, the big problem is that we have:

  1. more than one page for the same text
  2. translations are based on different languages

i.e. our iwlinks on all our Bible translations are not bidirectional. That is a pretty fundamental difference to the expectations of the interwiki bot. Koran will have similar problems, as the early English translations were from the French translation rather than the original language. There are likely to be some other texts which have complex translation matrices, but there are not many texts which have this problem. Any chance you can create a blacklist for the few which are incompatible with the interwiki bot? Alternatively, can the bot be told to never stray into mainspace? John Vandenberg (chat) 00:01, 24 April 2009 (UTC)
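To make the two options in this request concrete (a blacklist of incompatible works, or keeping the bot out of mainspace entirely), here is a small Python sketch of the page filter such a bot could apply. It is purely illustrative: the function `should_process`, the blacklist contents, and the choice to match subpages by their root title are assumptions, not part of any existing interwiki script.

```python
from typing import Set

MAIN_NAMESPACE = 0  # MediaWiki numbers the main namespace 0


def should_process(
    title: str,
    namespace: int,
    blacklist: Set[str],
    skip_mainspace: bool = False,
) -> bool:
    """Decide whether the interwiki bot may edit a page (hypothetical helper)."""
    if namespace == MAIN_NAMESPACE:
        if skip_mainspace:
            return False
        # Assumption: a blacklisted work also covers its subpages, e.g. "Bible/Genesis".
        root_title = title.split("/", 1)[0]
        if root_title in blacklist:
            return False
    return True
```

With `blacklist = {"Bible", "Koran"}`, category pages and ordinary texts are still processed while the listed works and their subpages are skipped. Setting `skip_mainspace=True` corresponds to the second option.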

If you have some suggestions for the blacklist, please let me know on my Russian talk page, or if you feel like the mainspace is worth excluding from the bot's scope at all, please also let me know. There is not a big problem to impose these restrictions if necessary. --Volkov (talk) 18:29, 24 April 2009 (UTC)
I granted the bot flag, but asked him to look over John's concerns.--BirgitteSB 15:07, 24 April 2009 (UTC)
Thanks. Should you notice any issue in bot's edits, please let me know on my talk page at ru.wiki. --Volkov (talk) 18:29, 24 April 2009 (UTC)

Illustrator parameter on {{header}}

As the topic says, I'd like to propose we look into adding an |illustrator= parameter in the header; we have a number of works where we have taken great pains to upload the original woodcuts, illustrations and such. I'm not sure whether we'd be better to make it link to something like Author:John Tenniel (ugh), Commons:Category:John Tenniel or simply remain plaintext - but I do think the optional parameter would be a nice touch. Sherurcij Collaboration of the Week: Author:Carl Jung. 06:04, 21 April 2009 (UTC)

While this sort of information is certainly appropriate to a header, "illustrator" is only one of the possible subsidiary contributors to a book that could or should be recognized. The Header should be reserved for the most essential data, and some of these other data would be better placed in a separate and optional sub-header. Eclecticology - the offended (talk) 06:37, 21 April 2009 (UTC)
I am not averse to having it in either, and am happy to hear other opinions on the pros and cons of placement; obviously it would be optional. Having optional sub-headers may be more confusing, and I would like the simplest proposal, so all things being equal, inclusion in the header sounds easiest. Especially in these books, the copyright of the illustrator is as important as that of the text, and hence the field is akin to the translator field.

Definitely would like | override_translator

As per the discussion that I added to the talk page of {{header}}, it would be very nice to have | override_translator = which could overlay the translator field so that we could be able to eradicate wikilinking of that parameter. -- billinghurst (talk) 13:07, 21 April 2009 (UTC)

Rule of shorter term exception on Template:Pd/1923 group

This template (group) has served well where it has replaced Template:PD-1923, with one exception: on author pages, PD-1923 mentions the rule of the shorter term, the rule whereby countries can reduce the copyright terms of individuals who produce works in foreign countries, while Pd/1923 doesn't. I would propose writing a brief clause in the members of the Pd/1923 template group to alert foreign readers to that fact, like PD-1923 does.

For example, Template:PD-old-80-1923 reads:

This work is in the public domain in the United States because it was published before January 1, 1923.


The author died in {{{1}}}, so this work is also in the public domain in countries and areas where the copyright term is the author's life plus 80 years or less.

—to which we could add: (This work may also be public domain in countries with longer native copyright terms that apply the rule of the shorter term to works written in foreign countries)

ResScholar (talk) 08:11, 25 April 2009 (UTC)

I don't really see that as helpful unless we can tell them that it actually has gone into the public domain in its home country. I'm not even sure in complex cases what the home country would be; the translation by Nabokov (a Russian or German citizen at the time) of Alice in Wonderland first published in France has where as a home country? Even in simpler cases, the question of where a work was first published and the nationality of its author can be hard research questions. And do all rules of the shorter term count the same factors in the same way for figuring the home country?--Prosfilaes (talk) 13:36, 25 April 2009 (UTC)
On a practical basis, the most common use of the rule of the shorter term would probably be for users from English-speaking non-US countries accessing American works written before 1923 and deriving a copyright term which would be almost trivial for the user to determine once the specific rules of his or her home country were known. The PD-1923 template, when it is placed on author pages, contains a link to w:Rule of the shorter term that explains the various permutations, and we could also use the link in the new clause. As for the different rules of the shorter term being different in different countries, we could make the clause even vaguer, to read: "This work may also be public domain in countries with longer native copyright terms that apply the rule of the shorter term to foreign works" (instead of written in foreign countries). Whatcha think? ResScholar (talk) 18:28, 25 April 2009 (UTC)
I have been thinking whether to mention the rule of the shorter term in these automated tags after testing them on Chinese Wikisource and bringing them here, but I just wonder how. If any expert can, how about adding an extra parameter to enter whether something is in the public domain at home or not? If something is still copyright-restricted at home, I do not expect the rule of the shorter term to apply.--Jusjih (talk) 00:56, 26 April 2009 (UTC)
Well, that extra parameter would be adding a subordinate criterion on top of a larger criterion that needs to be determined IF the work is in fact public domain in its home country. For example, take U.S. Author:e. e. cummings. In 1922, he wrote The Enormous Room, so it's public domain in its home country, and we could show that as a parameter. But Cummings died in 1962, as a second piece of data, so in "Public Domain at life+50 years" countries like Canada, his work is ordinarily still copyrighted. But then we have to further consider whether Canada is a "rule of the shorter term" country, as a third piece of data, to determine whether Canada's ordinary copyright term can be disregarded. In this case the rule of the shorter term would apply as a determining factor.
What I'm saying is that an American host PD determination, which appears on every license, a copyright-term-in-your-country link (plus the year the author died) and a home country PD determination isn't going to be enough to communicate the copyright status of the work for every country, especially if it's a U.S. work. But a copyright-term-in-your-country link (plus the year the author died) and a rule-of-the-shorter-term-in-your-country link should cover every ordinary case. ResScholar (talk) 03:18, 26 April 2009 (UTC)
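The three pieces of data in the e. e. cummings example can be combined as in the Python sketch below. It is a deliberate simplification for discussion and not a statement of any country's law. The function and parameter names are hypothetical, and it assumes a life-plus-N term that runs to the end of the calendar year.

```python
def is_public_domain(
    current_year: int,
    death_year: int,
    term_years: int,             # e.g. 50 for a "life+50" country
    pd_in_home_country: bool,    # first datum: PD in the work's home country
    applies_shorter_term: bool,  # third datum: the reader's country applies the rule
) -> bool:
    """Simplified check of a work's status in the reader's country."""
    # Ordinary term: protection lasts through the end of death_year + term_years.
    expired_locally = current_year > death_year + term_years
    if expired_locally:
        return True
    # Otherwise the work is free only where the rule of the shorter term
    # applies and the work is already public domain at home.
    return applies_shorter_term and pd_in_home_country
```

For an author who died in 1962 and a life+50 country in 2009, the ordinary term has not expired (1962 + 50 = 2012), so the result depends entirely on the last two parameters.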
If adding that extra parameter is difficult, how about adding a simple sentence to all related templates in not only Pd/1923 but also Pd/1996 series? Please see my demonstration at Template:PD-old-80-1923 and if it is fine, I will extend the note to other related tags.--Jusjih (talk) 17:04, 26 April 2009 (UTC)
I was all ready to argue that I liked my version better, because it provides context to what the rule is about, but trying to think as a casual user of Wikisource would, I think I would be less likely to want to learn the copyright status, if I saw unfamiliar phrases like "native copyright term" and "foreign work". So I don't know. ResScholar (talk) 18:03, 26 April 2009 (UTC)
How about now at Template:PD-old-80-1923?--Jusjih (talk) 19:45, 26 April 2009 (UTC)
I'm having a little trouble making myself understood. I was trying to say that I couldn't decide between the two and would welcome the opinions of others. But whichever one is picked, the phrase "foreign works" should not be italicized. I just italicized the phrase in my example to Prosfilaes to emphasize the part of the sentence that I was proposing that we change. ResScholar (talk) 02:13, 27 April 2009 (UTC)
I wouldn't mind seeing a PD-shorterterm-US for works published in the US, PD there, PD-shorterterm-70, PD-shorterterm-50, etc., where we could say that the rule of the shorter term does in fact apply, for such and such countries. (Though I would note that many US works were published in Britain or Canada simultaneously, to gain protection under the Berne convention, that would presumably stop the rule of the shorter term from coming into play.) I don't think it helpful to add a message that says it "may" be in the public domain; the only people who would know whether it is or not are those who already know all about the rule of the shorter term and hence don't need to be reminded of it.--Prosfilaes (talk) 03:34, 27 April 2009 (UTC)
You don't think the linked Wikipedia w:Rule of the shorter term article with the accompanying table is clear enough? ResScholar (talk) 04:16, 27 April 2009 (UTC)
The Wikipedia link to the rule of the shorter term is vital to help casual readers better understand what exactly it is. Mentioning it does not hurt. Given that the phrase "foreign works" should not be italicized, I would like to ask one thing. Is the grammar in the phrase "in countries and areas where the copyright term is the author's life" correct? Should the phrase read "in countries and areas where the copyright terms are the authors' lives" or something else?--Jusjih (talk) 18:21, 27 April 2009 (UTC)
But mentioning it does hurt; it makes every page longer, and makes it more likely the license terms will trigger someone's "tl;dr" filter. Repeating the same boilerplate on each page is a bad thing.--Prosfilaes (talk) 03:24, 29 April 2009 (UTC)
I suspect that Prosfilaes didn't know about the link, but he may have meant the link isn't helpful not 'because a casual reader doesn't need it', but 'because it isn't written clearly enough'.
As for choosing between "the copyright term is the author's life" and "the copyright terms are the authors' lives", the first is correct. ResScholar (talk) 20:13, 27 April 2009 (UTC)
I didn't see the link, true. But this doesn't change my fundamental point that I don't think this template should be bloated to include a statement of possibility. If we know that it will be clear in various nations, we should post that, not a generic statement on all works. I think it more helpful to include this in a page linked from the main page that tells people their copyright terms, including things like which nations are life+x. And, again, specifically templating works where we do know that they're under the rule of the shorter term.--Prosfilaes (talk) 03:22, 29 April 2009 (UTC)

Large deletion proposal of United States Code

Other discussions

Headerless pages requiring cleanup and Wanted Authors

On top of the pages contained within Category:Works with no header template, I recently ran a query on the toolserver to discover all pages in the main namespace which do not link to any of the different permutations of header templates that I could find, and which were also not tagged with {{no header}}. The result is an interesting collection of old works that are in need of cleanup. The full list can be found in my userspace: User:Bookofjude/headerless.

Most noticeable are several large works which are headerless, quite a few works of Author:John Keats which are not only in need of a header but also in need of wikifying, and quite a few disambiguation pages which require prettying up. I'll be re-running the query on a regular basis, so there's no need to remove a link once you've added a header to the page. There are also a wide variety of old comic books, though none of these appear to be complete. I'm not sure what the sources are for them for attempting to find the rest of the missing images.

On a final note, I've also run a query for links to authors in the [[Author:]] namespace that do not exist. There are quite a few of these that require the creation of author pages, though some merely require delinking (i.e., they don't have any works available, etc). A list of these pages can also be found in my userspace, though currently split into two parts due to its immensity: User:Bookofjude/wanted authors and User:Bookofjude/wanted authors 2. Hopefully, if we can all do a couple every day or so, we should get through everything pretty quickly. Jude (talk) 08:12, 3 April 2009 (UTC)

There are also many pages without a proper header here: Special:WhatLinksHere/Template:Standardise. Regards, Yann (talk) 11:22, 10 April 2009 (UTC)

Questions

Need a book imported.

From Google Books, The American Illustrated Medical Dictionary, last major medical dictionary in the public domain. Would be grateful if someone can bring this in, maybe by letter of the alphabet? Cheers! BD2412 T 08:16, 28 February 2009 (UTC)

From Oz, one cannot see a full text version. If you can see it from your side of the pond, it would be worth having a chat to Matt -- billinghurst (talk) 10:22, 28 February 2009 (UTC)
Use a proxy! [2], [3]. Yann (talk) 11:15, 28 February 2009 (UTC)
Too big for me ... PDF - 105.2M ... talk to Matt :-) -- billinghurst (talk) 11:38, 28 February 2009 (UTC)
Is there an easy way to get a text layer for something like this? I can convert this to djvu, but I don't have OCR. --Spangineerwp (háblame) 17:57, 28 February 2009 (UTC)
I do the OCR with Tesseract and a script as Help:DjVu files/OCR with Tesseract. Yann (talk) 19:01, 1 March 2009 (UTC)
Adobe allows you to Save As, and get the text. Or convert it to DjVu using tools at Help:DjVu files. -- billinghurst (talk) 04:07, 1 March 2009 (UTC)
Right; I have Cygwin and PdfToDjvu, but I don't think that automatically brings in the text layer (maybe I'm wrong?). I'm happy to convert the pdf, but without a text layer the transcription work will be rather arduous. --Spangineerwp (háblame) 18:37, 1 March 2009 (UTC)
It worked for me when I did the Copyright article, so I am presuming that Google PDFs have the text layer. Matt was able to pull the text out of the DjVu version for me with his bot. How? Dunno, he has his magic. -- billinghurst (talk) 04:38, 2 March 2009 (UTC)
I'm looking for the easy way to do it. Arduous I could do myself by copying and pasting the "View plain text" versions of the pages in Google Books. That's what I need a clever programmer to automate! BD2412 T 03:24, 2 March 2009 (UTC)
Being nice to Matt is the easiest way, though being nice to Matt is very very very arduous<g>. -- billinghurst (talk) 04:38, 2 March 2009 (UTC)
I have a hard copy of that edition (which is the 11th, not the 24th as Google claims). Unfortunately, I can't figure out how to get the proxies working. Eclecticology (talk) 01:43, 1 March 2009 (UTC)
The edition doesn't matter - what matters is that the book itself was published in the U.S. in 1922, which is as up to date a medical dictionary as is likely to fall into the public domain for some time to come. Having a scan of this will be useful for multiple wikiprojects - 'pedia, wiktionary, and here, at least. BD2412 T 02:28, 1 March 2009 (UTC)

I am working on getting the text. I'll let you guys know when it is done. --Mattwj2002 (talk) 11:36, 2 March 2009 (UTC)

Awesome - thanks! BD2412 T 02:53, 5 March 2009 (UTC)
I have been having a lot of problems converting this book to a djvu file. Can we try a different edition? Please let me know on my talk page. --Mattwj2002 (talk) 06:49, 31 March 2009 (UTC)

zoom

hello,

when editing in the page namespace, it is now possible to switch between horizontal and vertical layout (button 5 in the toolbar). The new zoom works well in the horizontal mode, but it might not be well adapted to the vertical mode. Question: should I restore the old zoom for the vertical layout? (the drawback is that users will need to learn how to use two different zooms)

ThomasV (talk) 10:53, 27 March 2009 (UTC)

My thoughts are very much yes, as when working on books in columns, the horizontal style will not be my preferred way of working. An example of such a page is Page:Dictionary of National Biography volume 13.djvu/8 and a column fitted nicely within the old zoom capability, whereas the new style crunches the working column. That they are different may be something that we have to note, or direct people to instructions from within pages in the namespace. -- billinghurst (talk) 21:45, 27 March 2009 (UTC)
Could my preferred zoom and layout be remembered when I continue with the next page? I guess this could mean submitting the settings when I press the save button, and stored as a personal preference. --LA2 (talk) 14:59, 28 March 2009 (UTC)

yes, you can add this to your .js to make it default :

// Use the horizontal layout by default when editing in the Page: namespace
var proofreadpage_default_layout = 'horizontal';

ThomasV (talk) 20:58, 27 April 2009 (UTC)

end-of-line hyphenation

When I edit a page, is it OK for me to rejoin a word broken by end-of-line hyphenation?

  • (a) Are edits to rejoin broken words -- as recommended by pgdp -- welcome? Are end-of-line hyphens unwanted leftovers from the OCR process? or
  • (b) Do we deliberately do things differently than pgdp?
Personally, I remove line-ending hyphens where the word is simply broken. The one exception is at the end of a page, where I use {{Hyphenated word start}} and {{Hyphenated word end}} as per the guidance at H:SIDE. We leave the first to user preference; we are firm on the second. -- billinghurst (talk) 14:39, 31 March 2009 (UTC)
I generally remove these hyphens too, though I tend to be conservative when it's about a word that would normally be hyphenated by itself. American publications tend to treat many of these hyphenated compounds as one word, but this practice is by no means universal or consistent. Whether we add an asterisk is a novel idea to me. Eclecticology - the offended (talk) 20:27, 1 April 2009 (UTC)

Meta-question:

I see that Help:Footnotes and endnotes specifically says one kind of printed detail (the pilcrow) is not important enough to preserve exactly. Is there a more general "guideline" page somewhere that discusses, in general, how closely our edited text should match the printed work? I think that one page could have one-line answers to many common questions about: paragraph indents and other uses of nbsps (&nbsp;), number of spaces after a period, whether to put a hard return for each line in the source text, end-of-line hyphenation, etc. I clicked the "Help" in the left sidebar and hunted around a bit, but I found nothing. Should Help:Digitising texts and images for Wikisource#Final editing and Help:Editing Wikisource link to a general guideline that discusses how closely our edited text should match the printed work? --DavidCary (talk) 14:26, 31 March 2009 (UTC)

I had hoped that some of the longer-term Sourcerers might have addressed this bit (I am still John-come-lately, though well immersed); in lieu of that ... IMNSHO the documentation for many parts of WS has not been the focus for a period of time. There is a lot of convention, levels of evolution and heaps of discussion that take place in this forum and are often not reflected in our documentation. Much of the focus is on texts, and new texts, especially around personal projects, and we try to catch each new document and each new person as they come in the door (which in a smaller active group of people is somewhat manageable), though the consequence is that personal touch and interaction reflect negatively in our documentation system and processes. Even as an admin, I am still reserved about radically playing with pages of text where a person has had an interest and I may not know the full history, but that could be my 'hasten slowly' nature kicking in. To your questions:
  • Our overarching principle is that the text should be correct.
    • If editors then wish to replicate the look, that is nice, but not necessary.
  • The Page: domain is our growing place of importance as we can have both text and original, and the Side-by-Side has been our guidance
    • working in the Page domain does put added emphasis on the presentation, and that is reflected above
  • Stylistically we are still on the journey, and keep picking up new ideas, and hopefully a few of our number will be able to take a step back and reflect on the curatorial and archival, rather than acquisitional.
Summary? Well let me try it this way ... Principles-based. Rule-guided. Community-directed. All said, flexibility exists, and a fair bit of tolerance. :-) -- billinghurst (talk) 01:21, 1 April 2009 (UTC)
I have always treated flexibility and tolerance as having premium importance. To do anything else is to ignore the immensity of our task; that has been a concern to me as far back as November 2003.
I neither add to nor use the Page: namespace for a variety of reasons. Of course, I don't object to its use by others. Your point about putting too much emphasis on presentation is well taken; a lot of time can be wasted talking about maintaining the long "s". Transcluding "Pages:" into the mainspace also makes it more difficult to develop a system for annotating pages, a proposed feature that has received very little attention.
I see no reason to deprecate the use of pilcrows, asterisks, daggers and the like. It can still be useful for situations where we would like to have dual footnoting with one series indicated by numbers, and another, sometimes with such symbols. This could distinguish between notes that come with the source, and notes that are wiki-added.
Help:Footnotes and endnotes has been here without any substantive edit for three years, and it is still tagged as a proposal. Part of the problem is that because it has been there for a long time, it has built up a presumption that there is a consensus in its favour. Someone who is bold enough to change that page (to use it as an example) may very well be ignored, and his edits will stand. It is just as probable that he will be challenged on the basis that there is no consensus for change, even if there was no real consensus for what is there in the first place. That no-one paid attention to it when it was first written is no evidence of consensus. A lot of these policies remain on the books because discussing them invites drama; it's easier to ignore these policies. Eclecticology - the offended (talk) 23:42, 1 April 2009 (UTC)

If a book is no longer copyrighted in one country, say, Canada, but is copyright in others, like the US, can it be posted on Wikisource? For example, some of H.G. Wells' later works are still copyrighted in the US, but not in many other countries, as he has been dead for 66 years or so. Those are not displayed on Wikisource, even though Wikisource is an international website.

We're familiar with these concerns; we need to balance "making all these works available", with copyright restrictions. In cases where a book is available in Canada but not the United States, we use a separate webhost at Wikilivres, and then link the works from our Author pages and other texts. For example, on Author:Albert Einstein you will notice that "Why Socialism?" exists and is linked...but if you click it...it's not hosted on our American webhost. Sherurcij Collaboration of the Week: Author:Romain Rolland. 18:52, 3 April 2009 (UTC)
Wikisource is internationally available, but, as it is hosted in America (specifically Florida), it is subject to the Copyright Laws of the United States of America. Works must be public domain or otherwise freely licensed in America to be appropriate. Jude (talk) 00:46, 4 April 2009 (UTC)
Why Socialism does not appear to have had its copyright renewed in the United States, so it may be OK to host. The problem is with how to interpret the non-acceptance by the U.S. of the rule of the shorter term. The tendency on this site has been to interpret it strictly. Canada, with a basic copyright duration of life + 50, does accept the rule of the shorter term except for works written by Americans or Mexicans. Wells was British so those later works could be hosted on wikisource.ca as well as wikilivres. Eclecticology - the offended (talk) 08:51, 4 April 2009 (UTC)
Wikilivres, which is not a Wikimedia Project, is hosted in Canada, so it accepts works that are PD there but not in the U.S. Angr 09:08, 4 April 2009 (UTC)
Where did you look to see if Why Socialism did or did not have its copyright renewed? A quick search on U.S. Copyright Renewals 1950 - 1977 at Gutenberg reveals that its renewal id is R640895. In any case, rule of the shorter term would be irrelevant on Why Socialism, as it's a work written by an American author published in an American magazine. I'm not sure why you say the "problem is with how to interpret the non-acceptance by the U.S. of the rule of the shorter term"; it's crystal-clear that the US doesn't accept it in any way, with the sole exception of unrenewed works not under copyright in the source nation in 1996. Interpreting it other than strictly is simply wishful thinking.--Prosfilaes (talk) 12:22, 4 April 2009 (UTC)
I looked up Why Socialism on the Rutgers database; if it shows up differently elsewhere I'm not committed enough to that work to argue about it. I raised the rule of the shorter term because it's so obviously debatable. Eclecticology - the offended (talk) 17:31, 4 April 2009 (UTC)
It's not any more debatable than any other part of US copyright law.--Prosfilaes (talk) 17:45, 4 April 2009 (UTC)
Rutgers only has class A (book) registrations; Why Socialism, as a magazine article, is a class B registration.--Prosfilaes (talk) 17:48, 4 April 2009 (UTC)
Anything so debatable cannot be considered crystal-clear. Also it would be good to see a more efficient treatment of Class B registrations; perhaps hosted by Wikisource. Eclecticology - the offended (talk) 04:45, 5 April 2009 (UTC)
I guess I can file it with the existence of the tax laws; if you want to debate something hard enough, you can debate it. Highlights of Copyright Amendments Contained in the URAA is clear: "A French short story that was first published [...] in 1935 will [...] expire on December 31, 2030 (95 years after the U.S. copyright would have come into existence)." If you can point to the actual law that says otherwise, there might be a point. I fail to see why moving the copyright renewals from Gutenberg to here would gain you anything; in either case, you're reduced to searching plain text.--Prosfilaes (talk) 14:12, 5 April 2009 (UTC)
The relevance of your reference to tax laws escapes me. Reference to government pamphlets represents a simplistic view of the law. Statutes and judicial precedent are far more authoritative, and Itar-Tass Russian News Agency v. Russian Kurier, Inc. raised some interesting questions in this regard. My suggestion relating to Class B renewals would have separated them from the already adequately searchable Class A renewals. Eclecticology - the offended (talk) 19:18, 5 April 2009 (UTC)
Statutes are more authoritative; do you care to point me to the statute that mentions the rule of the shorter term? Judicial precedent is tricky, as case rulings are generally limited to the case at hand. There's no direct connection between that case and the rule of the shorter term, and I think that making that extrapolation is unwarranted. The relevance of the tax laws is that there is a whole body of people out there who claim the US income tax is voluntary or illegal, despite solid evidence otherwise; the mandatory nature of said tax is crystal-clear, yet debated.--Prosfilaes (talk) 05:28, 6 April 2009 (UTC)
The rule of the shorter term for the US is based on section 104(c) of the Copyright Act. The relevance of the Itar-Tass case is in establishing that the ownership of the copyright is determined by the laws of the owner's country. The tax law parallel would only be applicable if someone here were challenging the legality of copyright law in its entirety, and no-one here is doing that. A more relevant parallel might be in determining the extent to which a resident of one state or no state would be subject to the state income tax of another state, and that's not an easy task. Eclecticology - the offended (talk) 07:51, 6 April 2009 (UTC)
Section 104(c) says that the Berne convention, including implicitly the law of the shorter term, is not law in the United States.--Prosfilaes (talk) 14:36, 6 April 2009 (UTC)
The rule of shorter term in USA is debatable precisely because it is not written in the law, but it is our interpretation of the court decisions, and because US law is quite an exception in this regard. Yann (talk) 09:19, 6 April 2009 (UTC)
Which court decisions? Since Canada and Germany, for two examples, don't uniformly apply the rule of the shorter term, I don't see it as that much of an exception.--Prosfilaes (talk) 14:36, 6 April 2009 (UTC)
Perhaps this reference which appeared on to-day's Foundation mailing list may help you to understand: [4] Eclecticology - the offended (talk) 16:56, 6 April 2009 (UTC)
A case that ruled that US copyright law should follow "traditional [US] contours of copyright" above the Berne convention means that US copyright law should follow the Berne convention above the "traditional [US] contours of copyright"? It's a wild stretch any way you interpret it, and that's extrapolating from a recent court case that may well get overturned by a higher court.--Prosfilaes (talk) 17:09, 6 April 2009 (UTC)
Introducing a direct link on the Einstein page to Wikilivres without an explanatory note that U.S. residents shouldn't be following that link seems morally questionable. Does anyone have any objections to making a boilerplate template acknowledging that fact like the one I came up with on the Author:G. K. Chesterton page? ResScholar (talk) 07:07, 5 April 2009 (UTC)
I'd support either a template explaining why they are hosted on Wikilivres, or at least putting the works hosted there in a separate section on each author page. I don't think giving a direct link to Wikilivres without an explanation is good practice; we wouldn't host Author:J. K. Rowling with a link to the Harry Potter books somewhere, so likewise it would be a good idea to at least explain. I recently used this reasoning to delete several soft redirects to 'This text now available on Wikilivres'. Jude (talk) 07:29, 5 April 2009 (UTC)
I don't think a comparison with J. K. Rowling is relevant here, because texts hosted on Wikilivres are legal where they are hosted. In addition, many texts on Wikilivres are public domain almost everywhere, but not in USA. Hosting Harry Potter books would not be legal anywhere. That said, I agree that links to Wikilivres should be put in a separate section, because 1. Wikilivres is not hosted by the Wikimedia Foundation, 2. These texts are under a different legal status (Canadian law) than those hosted on Wikisource (US law). Yann (talk) 13:15, 5 April 2009 (UTC)
Bringing J. K. Rowling into the discussion is a red herring, since no-one is seriously considering adding her works to Wikisource. If Einstein qualifies as a U.S. person, a properly renewed Why Socialism would not be legal on Wikilivres either since Canada's acceptance of the rule of the shorter term does not apply to the United States or Mexico because of specific exceptions. Linking to a legal host for a work is fine with me with a proper warning for people living in places where the access might not be legal. I think that it would be preferable for maintaining the bibliographic integrity of a page if it could be done with the links in their normal place rather than in a separate section. Eclecticology - the offended (talk) 20:50, 5 April 2009 (UTC)
Why Socialism is still in copyright in the nation of first publication and of the nation of its author. (Einstein was indeed a citizen of the United States at the time.) As such, it doesn't fall under the rule of the shorter term, because the rule of the shorter term only applies to works that are out of copyright in their home nation. However, since Einstein died more than 50 years ago, it's out of copyright in Canada for that reason.--Prosfilaes (talk) 05:28, 6 April 2009 (UTC)
No. See subsection 9(2) of Copyright Act of Canada, where the law was changed to accommodate the special privileges received by the US in NAFTA. Eclecticology - the offended (talk) 07:05, 6 April 2009 (UTC)
Did you bother reading what I was saying? 9(2) says "Authors who are nationals of any country, other than a country that is a party to the North American Free Trade Agreement, that grants a term of protection shorter than that mentioned in subsection (1) are not entitled to claim a longer term of protection in Canada." Great. The heirs of American authors can claim the full life+50 on their pre-1923 works, which is what subsection 1 gives them. However, Einstein died more than 50 years ago, so he gets zilch, NAFTA or no NAFTA.--Prosfilaes (talk) 14:20, 6 April 2009 (UTC)
As for not separating out Wikilivres links, that would seem to require asterisks or something and a message at the bottom if the links are alphabetically ordered, or a message after 1922 works if chronologically ordered and the genres are all lumped together. The first possibility...I don't know. Some people aren't going to bother to check the asterisk. Or a message like "hosted at Wikilivres--check your local copyright laws" for each entry seems cumbersome and incomplete, and suggests Wikilivres is an affiliate. I think the difference of wikilivres works is pronounced enough to be separated from the main entry kind of like a special collection at a large library. My 2¢. ResScholar (talk) 05:50, 6 April 2009 (UTC)
A red stop sign might work better than an asterisk. Another alternative might be an intermediate page that is triggered whenever one links to Wikilivres for a page that is not allowed here. It would warn the person that he is leaving this site and that viewing such a page may be against the law in the user's home country, and asking him if he wants to proceed. Project Gutenberg did something like that for links to their Australian server. Eclecticology - the offended (talk) 07:51, 6 April 2009 (UTC)
All this talk ignores the actual question of legality. It is not illegal for somebody to read something on Wikilivres by any definition. Intellectual property law makes it illegal to sell, give or offer the forbidden material, consuming, purchasing and watching it are perfectly legal. There is no part of the law that prohibits an American citizen sitting in the United States from reading what is hosted on Wikilivres. Sherurcij Collaboration of the Week: Author:Romain Rolland. 17:44, 6 April 2009 (UTC)
It's generally accepted that copyright law makes it illegal for you to download material to your computer that's under copyright in your jurisdiction--you're certainly making a copy--, and there are frequent civil cases against those who import material copyrighted in the US without permission of the copyright holders. Personally, I'm more concerned about keeping Wikilivres at arm's length by avoiding statements like "we use a separate webhost at Wikilivres" and making it clear to users and potential litigants that Wikilivres is a separate organization under a different jurisdiction that follows different policies.--Prosfilaes (talk) 17:55, 6 April 2009 (UTC)
The term "generally accepted" is a term of art used in pop law by those unable to substantiate their position. I would not be so quick as to claim that such downloading is either legal or illegal in all cases. There is enough doubt that it suffices to say that it may be illegal, and leave the downloader to accept responsibility for his own actions in his own jurisdiction. Downloading, and storing for personal use is generally legal in Canada; that's why we have a special tax on electronic storage media. Eclecticology - the offended (talk) 18:36, 6 April 2009 (UTC)

Unfortunately for me, although Wikilivres is just what I needed, it lacks almost all of Wells' works...

Wikilivres has a much smaller economic base so it prefers that works without copyright difficulties remain on Wikisource. Both sites are volunteer sites, so if you feel that there are significant holes in the Wells collection your help in adding some to the relevant project would be appreciated. Eclecticology - the offended (talk) 19:36, 14 April 2009 (UTC)

Djvu repair

I have found a djvu file at the Internet Archive (a book scanned at the University of Toronto) and have started to import it here, with great success. But I also found another djvu file (a book scanned by Google) where several pages are poorly scanned and then scanned again. The resulting djvu file has far more pages than the original book. Should I just upload that djvu as it is, and then mark some of the pages as invalid? Or should I rather repair the djvu file first, before uploading it? And what tools can I use for that? Is that a standard routine? --LA2 (talk) 14:42, 7 April 2009 (UTC)

Repair it (presuming that the original was not flawed). DjVuLibre (find via Help:DjVu files) allows you to save components of a file, though if you have lots of splits to do it may be a little tedious. -- billinghurst (talk) 14:49, 7 April 2009 (UTC)
Yes, what exactly are the commands? There is a djvused command, and that sounds promising, but it's not like I can do "14,17d" (in sed syntax) to delete pages 14-17 and keep the rest? --LA2 (talk) 16:40, 7 April 2009 (UTC)
If all you need to do is delete pages, use the djvm command. The syntax djvm -d DOCUMENT.djvu PAGENUM will delete the page number you supply (PAGENUM) from whatever DOCUMENT.djvu is. I've never needed to test whether it can delete a whole range of pages, but I do know it can do it one page at a time.—Zhaladshar (Talk) 18:45, 7 April 2009 (UTC)
Thanks, that's good enough! --LA2 (talk) 20:10, 7 April 2009 (UTC)
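For a range such as pages 14-17, the one-page-at-a-time syntax above can be wrapped in a loop. The Python below is a minimal sketch, not a tested recipe: the function names are invented, it assumes DjVuLibre's djvm is on the PATH and edits the file in place, and it deletes from the highest page number down so that any renumbering after a deletion cannot affect the pages still to be removed. Working on a copy of the file would be prudent.

```python
import subprocess


def djvm_delete_commands(document, first_page, last_page):
    """Build one `djvm -d` command per page, highest page number first."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    # Deleting from the end of the range backwards means the pages still
    # waiting to be deleted keep their original numbers.
    return [["djvm", "-d", document, str(page)]
            for page in range(last_page, first_page - 1, -1)]


def delete_page_range(document, first_page, last_page):
    """Run the commands; requires DjVuLibre's djvm to be installed."""
    for command in djvm_delete_commands(document, first_page, last_page):
        subprocess.run(command, check=True)


# Preview the commands for pages 14-17 without touching any file.
for command in djvm_delete_commands("DOCUMENT.djvu", 14, 17):
    print(" ".join(command))
```

Printing the commands first makes it easy to check the page numbers against the scan before calling delete_page_range on the real file.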

Request for Importation

Could someone import [5] for me? It's linked from w:Hurricane Janet; I'm not finding any renewals on it; as a 1955 publication, renewals would have been around 1983 and hence available from The Copyright Office online catalog, but I'm only finding new registrations of Monthly Weather Review there, and they say that it "[c]ontinues a publication by the same ti. issued by the U. S. Weather Bureau." Given that the author line says "Weather Bureau Office", which was part of the US Gov, I think I'm safe in calling this {{PD-USGov}}; if not, {{PD-US-no-renewal}}.--Prosfilaes (talk) 16:06, 8 April 2009 (UTC)

Wikimania 2009: Scholarships

Wikimania 2009, this year's global event devoted to Wikimedia projects around the globe, is now accepting applications for scholarships to the conference. This year's conference will be held from August 26-28 in Buenos Aires, Argentina. The scholarship can be used to help offset the costs of travel and registration. For more information, check the official information page. Please remember that the Call for Participation is still open, so please submit your papers! Without submissions, Wikimania would not be nearly as fun! - Rjd0060 (talk) 02:20, 9 April 2009 (UTC)

Search problem

TCALSS, why does this search fail to give Fatal fall of Wright airship in the results? Cygnis insignis (talk) 16:59, 9 April 2009 (UTC)

I believe it's because Fatal fall of Wright airship transcludes pages that have "Luke E. Wright" in them, so it itself doesn't show up. To me, that isn't ideal behavior. Psychless 21:02, 9 April 2009 (UTC)
Not ideal? It's terrible! If Google isn't scraping transcluded page content, then we have a big problem, and our whole approach to transcription may need to be revisited. Hesperian 02:19, 16 April 2009 (UTC)
Google does indeed scrape transcluded page content. The actual Wikipedia search engine, however, works based on the text-content of a page before the Parser is run over it, I believe, thus causing this issue. Jude (talk) 02:23, 16 April 2009 (UTC)
Ah, yes, sorry. I see "search" and I think Google. It is still a problem though. I wonder what can be done about it. Hesperian 02:39, 16 April 2009 (UTC)
It all comes down to finding a balance between accuracy and accessibility. Using "subst" would solve the problem. Some of us are just as happy ignoring the "page" system. Eclecticology - the offended (talk) 22:05, 16 April 2009 (UTC)
Ignoring the "page" system would require keeping two separate versions of one page up-to-date. I'd be very averse to substituting the transclusions; better to come up with a solution that allows indexes on the text of the Page: namespace, rather than creating more work for us. Jude (talk) 02:25, 18 April 2009 (UTC)
Having separate versions is indeed a possible unfortunate outcome, but I also see where the "page" system will in time paint itself into a corner. It is often too tightly bound to a specific edition or printing of a work. That will not necessarily be the best version of a work. Indexing adds more complications, and until someone can develop a sophisticated search engine it's not going to get any easier. It was interesting to incorporate a book's own index into the site at The Book of the Thousand Nights and a Night/Volume 1/Index, but manually that was a lot of work. What other ways are there to incorporate a work's own index? The trend seems to be to dutifully copy, but meaningfully ignore these. Eclecticology - the offended (talk) 20:31, 18 April 2009 (UTC)

Changing the "prose" class for printing

In my personal monobook.css, I have removed for printing the width restriction imposed on the pages by the CSS class "prose", or specifically "div.prose". I did it as follows:

@media print {
  /* Override the screen width limit so printed text fills the page */
  div.prose {
    width: 100%;
  }
}

I think that something of the sort should be in place by default. That is, the CSS class "prose" should leave the pages intact for print media, only limiting the width for the screen media, for which it is useful. --Dan Polansky (talk) 16:05, 10 April 2009 (UTC)

What makes you say that? Oftentimes, texts can be much more difficult to read if not straightened by <div class=prose>, whether on screen or in print. —Anonymous DissidentTalk 00:49, 11 April 2009 (UTC)
I tried to print a book styled with "prose" before I added to my monobook.css the styling that I posted above. The result looked poor, and it depended on the zoom factor. Specifically, when I used a zoom of 140%, my favorite one for printing, it forced some of the text beyond the right printable margin, making the text unreadable as a whole. When I changed the zoom back to 100%, unpleasantly wide left and right margins were left.
If you want to know what I mean, you can try it for yourself, for instance here: The_Analysis_of_Mind/Preface.
Anyway, I propose to remove the width constraint from "div.prose" for printing, and if some more people support the proposal, it can be implemented. --Dan Polansky (talk) 07:49, 11 April 2009 (UTC)
Fundamentally, no-one should need to understand the arcana of monobook.css to be able to get something to print right. Even if they do understand them they will not necessarily be applicable to other skins. Some may very well find the texts easier to read, but it should not be imposed on those others who don't. These styling conventions should be limited to the user's own preferences. Eclecticology - the offended (talk) 17:59, 11 April 2009 (UTC)
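
The proposal in this thread could be sketched as a change to the site stylesheet rather than to each user's monobook.css. This is only a minimal sketch, assuming the existing rule limits "div.prose" through the width property; the 36em value is a hypothetical placeholder, not the value actually used on the site.

```css
/* Sketch only: assumes div.prose is currently limited via "width".
   The 36em value is a placeholder, not the real site setting. */
@media screen {
  div.prose {
    width: 36em;
  }
}
/* No rule is given for print media, so printed pages
   would use the full printable width by default. */
```

Scoping the restriction to screen media would make the per-user print override unnecessary, and it would not depend on which skin a reader uses.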

I need help linking a wikisource file to an author page

Hello,

I just posted "The Journal of Major Andre" and now I need help linking it to the author page of John Andre. If someone can help me do this, I would really appreciate it.

thank you.

Copbuddy25 (talk) 07:08, 15 April 2009 (UTC)

In your article you will see the author's name is red. Clicking on that will open up the properly named author page. If you have previously set your user preferences in the Gadgets section to preload useful templates you will already have the heading format where you can fill in the blanks. Eclecticology - the offended (talk) 07:24, 15 April 2009 (UTC)
It automatically links to the author name in the title, which in this case was missing the accents of the author page. First step is always to see if you're using the full correct name in the header, and if there's already an author page for that name or some variant of it.--Prosfilaes (talk) 13:04, 15 April 2009 (UTC)
While it is certainly true that the correct name has an accent, it is also important to remember the maxim, "Don't bite the newbies." Copbuddy25's question was clearly at a more elementary mechanical level, and did not call for a condescending lecture. If he wrongly links to "Andre" instead of "André" it's no big deal; that can be fixed very easily. When you start by telling him what his "first step" should be according to your interpretation of the rules, you are showing that this is not a friendly place. Eclecticology - the offended (talk) 16:38, 15 April 2009 (UTC)
The response to Copbuddy25's question should not be to encourage him to create a new author page, which is not what he asked for; it should be to encourage him to find the existing author page, which is what he asked for, and is much easier to boot. I fail to see how someone with "the offended" in their sig line attacking other users is going to convince him that this is a friendly place.--Prosfilaes (talk) 16:53, 15 April 2009 (UTC)
Nobody was encouraging the creation of a new author page; if a newbie happens to do that, so what? Your statement that "the offended" is an attack is false. Eclecticology - the offended (talk) 19:57, 18 April 2009 (UTC)
Expanding on the above a little. Often with situations where an accent is involved in a name, then adding a redirect from the non-existent version to the existing page is appropriate. It makes navigation a little more helpful. I have created such for Author:John Andre. unsigned comment by Billinghurst (talk) .
Indeed, it is less harmful that way. Eclecticology - the offended (talk) 19:57, 18 April 2009 (UTC)

Portal:Marcus Witmark & Sons and others

I propose to move this to the pseudo-namespace Publisher:Marcus Witmark & Sons. This recognizes that authors and publishers are different kinds of entities. This issue is distinct from the one about whether a corporate body can be an author. I certainly support the notion that a corporation can be an author. The publisher is the person responsible for assuring that a work is made available to the public. It may sometimes hold copyrights to the work, but that is not an essential characteristic. Where the person is clearly both author and publisher its status as an author would take precedence. A header for publishers, similar to the one for authors should probably be developed. I'll wait to see if there are other good alternatives before I go ahead. Eclecticology - the offended (talk) 19:46, 18 April 2009 (UTC)

I am not sure that you have identified to me the benefit for yet another namespace. What is it that you are trying to achieve? If it is just tidiness, then, while the existing system is not perfect, I have a preference for KISS. My preference would be to delete the page, and use an | override_author = to not have a link to an author page. -- billinghurst (talk) 23:05, 18 April 2009 (UTC)
The one work listed for Witmark already shows an author, Stoddard King. Witmark was treated as an author because it owned the copyright. In the short term it is a tidiness issue, but looking beyond that it is a matter of recognizing the role of publishers in a book's history. A book's library description always includes the publisher, and sometimes it is helpful to know when a publisher was in business. Knowing the publisher also helps us to identify which version of the book we are hosting. For now I suggest only a "pseudo-namespace", but if the idea proves itself full namespace status in the future is a possibility. Eclecticology - the offended (talk) 04:12, 19 April 2009 (UTC)
While it is an interesting idea, we already are having issues with the non-addition of headers, licences, Wikisource: namespace etc. without the addition of a Publisher: namespace. We still have MANY MANY Author: pages that need to be created. We have projects, hundreds of incomplete works, and thousands of odds and ends curatorial tasks, and a very limited set of volunteers who do the less glamorous work.
Wouldn't it be more easily undertaken by the addition of a category? Though as categories are something we are not special at managing, alternatively, add a publisher field to the header that adds the publisher as a category for the moment, that may be able to be modified in the times to come (if that time ever comes.) The thought of having to go to yet another place to manually maintain more data is not stimulating me. The more that I think about it, the more it seems like lead weight.-- billinghurst (talk) 11:35, 19 April 2009 (UTC)
I know there are a lot of tasks to be done, but one needs to avoid the sense of anxiety that comes with the overwhelming feeling that all of this needs to be done. The ones who attach a lot of importance to headers and licences, or to maintaining the contents of the Wikisource: namespace will be happy to do that. They will work at their own pace, and if some pages are left without headers or licences, why worry about it? The lists of MANY, MANY wanted authors have been unduly inflated by automated processes. Is everyone in Abraham Lincoln Brigade really an author? For the authors in McClure's Magazine or Littell's Living Age can we reasonably expect all those authors to be identified before we start to include their contributions? At least with these latter we know that they have written something. Wikisource:WikiProject_CrankyLibrarian is a great idea, but the last comments in the archived discussion left me with the impression that its protagonist was not happy with the discussion and was going to play elsewhere. He's entitled to do that, of course, but that just leaves us with another long list of things for someone else to do. The manpower to do that work is woefully inadequate, and there is great resistance to reducing these lists into anything manageable. We just end up with so many unpruned and unproductive trees in the orchard. Wikiproject_CrankyLibrarian is just a list with no description describing the project, its aims or its members.
I really do sympathize with what you say about yet another namespace to maintain. I find the possibility of categories through a publisher line in the header entirely workable. So for the sample publisher having the header line produce Category:Publisher:Marcus Witmark & Sons ?? I concur that categories are seriously underdeveloped; there has been some effort to categorise authors, but that is not where the greatest need lies. Better that I save expanding on that point for another time. Eclecticology - the offended (talk) 22:57, 19 April 2009 (UTC)

Umm, I just saw that the Publisher: redlink had changed to a real link. Surely that must have been a mistake, as surely there cannot be considered to have been a consensus at this point. -- billinghurst (talk) 11:59, 23 April 2009 (UTC)

OK, there was no consensus for a move, so I've set it up as a new page without moving. That does not require a consensus. Eclecticology - the offended (talk) 17:51, 23 April 2009 (UTC)
There is no consensus indicated here to create any new namespace--whether by moving or creating a page. Jude (talk) 03:42, 24 April 2009 (UTC)

Although typically anti-category and much more in favour of indexes, I have to admit that we are not equipped to try and handle "publisher" indexes - and that it should be based on a category system at least for the foreseeable future. Sherurcij Collaboration of the Week: Author:Carl Jung. 19:21, 23 April 2009 (UTC)

I'm certainly willing to be flexible on this. This isn't so far gone as to be incapable of adapting to a constructive solution. The fact remains that Witmark is not an author. Merely saying that a proffered approach won't work is not a solution. Eclecticology - the offended (talk) 20:58, 23 April 2009 (UTC)

I like the idea of having pages for publishers, as they are/were often a vital part of the authoring process. However having a "Publisher" namespace should wait for local consensus, and then a new namespace being created by the system admins. In the meantime I think we should consider them as Author pages, and group them into Category:Publishers. I think we need to have one or two good quality Publisher pages before we decide to create a namespace. John Vandenberg (chat) 22:55, 23 April 2009 (UTC)

I never even suggested that this is the time for a new namespace. Simply beginning a name with the prefix "Publisher:" does not create a namespace. Also you won't get "one or two good quality Publisher pages" unless you allow a few of them to be started. It makes no sense to treat publishers as authors when they are not authors; they're not even corporate authors. Eclecticology - the offended (talk) 08:35, 24 April 2009 (UTC)
Indeed, creating a page with a prefix does not make a new namespace. It just creates a new work page with a "Publisher:" prefix, which shows up as a work in search results, counts as a work for statistics, et cetera. Pseudo-namespaces should be avoided for this reason. —Pathoschild 09:09:30, 24 April 2009 (UTC)

Authors with unknown years of deaths

As I replace more and more {{PD-old-50}} and {{PD-old-60}} with {{Pd/1923}} and {{Pd/1996}} to automate and enhance the copyright tagging, I would like to ask what to do with authors with unknown years of deaths, i.e. Category:? deaths, as I have not replaced presumptive {{PD-old-50}} in several author pages. Based on [6], 17 U.S.C. § 302(e) provides presumed deaths of authors for defenses in case of unintentional copyright violations. Should we presume affected works as having "unknown" authorship? Please advise before I re-tag affected pages. Thanks.--Jusjih (talk) 01:21, 21 April 2009 (UTC)

They need to be individually considered. There is no doubt that Author:Alexander of Lycopolis is long dead. We have works of others that were published before 1923. Still others can be treated as works for hire. Once you have dealt with the low-hanging fruit, we can discuss those that remain. If you think that a certificate from the Copyright Office is the right way to go in some case, feel free to apply for the certificate. Where we know the author but not his death year, it would be deceptive to say that the authorship is unknown. Eclecticology - the offended (talk) 04:33, 21 April 2009 (UTC)
In terms of life+x, if we assume a lifespan of 80 or 90 years from first publication, we're in a pretty safe position.--Prosfilaes (talk) 10:31, 21 April 2009 (UTC)
We can probably adjust that downward if we know the birth year. Eclecticology - the offended (talk) 07:20, 22 April 2009 (UTC)
My numbers came from assuming first writing was at 20 and a max lifespan of 100 to 110 years, so yes, if we know the year of birth, we can add 100 or 110 to that.--Prosfilaes (talk) 13:13, 22 April 2009 (UTC)
Having authors named in works does not always identify them well, so some may be considered "pseudonymous". If no other comments, I will tag {{PD-old-50-1923}} without the year of death as a compromise.--Jusjih (talk) 01:11, 26 April 2009 (UTC)
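
The rule of thumb discussed above can be written out as a worked bound. This is only a sketch of the arithmetic as stated in the thread; the symbols are introduced here for illustration: $p$ is the year of first publication, $b$ the birth year, $a = 20$ the assumed age at first writing, $L$ the assumed maximum lifespan of 100 to 110 years, and $d_{\max}$ the latest plausible year of death.

```latex
\begin{align*}
d_{\max} &= p + (L - a) = p + 80 \text{ to } p + 90 && \text{(birth year unknown)} \\
d_{\max} &= b + L = b + 100 \text{ to } b + 110 && \text{(birth year known)}
\end{align*}
```

Under these assumptions, a life+$x$ term would have expired once the current year exceeds $d_{\max} + x$.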

Build your own

DIY High-Speed Book Scanner from Trash and Cheap Cameras, posted by Daniel Reetz on instructables.com. Perhaps the 79 steps make for an overly complicated design. But the basic ideas are sound. More people should build these devices. --LA2 (talk) 15:06, 21 April 2009 (UTC)

The optimistic spirit kind of reminds me of Bill Sutton's song "Do it yourself" (you can build a mainframe from the things you find at home), Youtube recording, annotated lyrics, author's website. --LA2 (talk) 20:49, 21 April 2009 (UTC)
Another cute, and much simpler, design is http://bkrpr.org/ --LA2 (talk) 01:33, 25 April 2009 (UTC)

This text applied here doesn't seem to match our existing Copyright tags. Would someone with a little more background look to see which tag applies. Thanks. -- billinghurst (talk) 01:26, 26 April 2009 (UTC)

If that copyright license is compatible with GFDL, let us make a new tag as I see no existing tags fitting its license term.--Jusjih (talk) 17:11, 26 April 2009 (UTC)
According to this page on GNU's website, the Open Publication License is incompatible with the GNU Free Document License. It should probably be deleted, but best to discuss this at WS:COPYVIO. Jude (talk) 10:54, 27 April 2009 (UTC)

Nuremberg +++

These files

  1. Nuremberg Code
  2. Nuremberg Defendants and Defense Counsel
  3. Nuremberg Indictment
  4. Nuremberg Indictment Appendix B
  5. Nuremberg Judgement Sentences
  6. Nuremberg London Agreement
  7. Nuremberg Members and alternate members of the tribunal
  8. Nuremberg Officials of the General Secretariat
  9. Nuremberg Prosecution Counsel
  10. Nuremberg Rules of Procedure

all need some love and attention. They are not clearly works against specific documents/works, and it needs some people familiar with these works to get them to our current standards, especially with headers, and possibly to be subpages. Thx. -- billinghurst (talk) 09:23, 26 April 2009 (UTC)

Adding

  1. London Agreement which seems to correspond but not match
  2. London Charter of the International Military Tribunal

-- billinghurst (talk) 14:53, 26 April 2009 (UTC)

  1. Interrogation of Wolfram Sievers

-- billinghurst (talk) 05:51, 29 April 2009 (UTC)

{{anchor}} broken?

As you can see at Fountains of Papal Rome/St. Peter's, after an anchor tag the text is moved to a new line and is formatted like there is a space in front of it. Now, I haven't used the tag since June of '08, so I don't know when it started acting up or if there was some sort of change that I missed. Anyone know why this is happening? -- Editor at Largetalk 23:49, 27 April 2009 (UTC)

Whoever added the noinclude at the bottom left a trailing carriage return. Fixed now. Hesperian 01:27, 28 April 2009 (UTC)
I thought I'd fixed that... there must have been two and I missed one. Thanks :) -- Editor at Largetalk 01:40, 28 April 2009 (UTC)
There's no history of you editing that template; perhaps you did fix it, and the edit didn't stick. Hesperian 01:52, 28 April 2009 (UTC)