Wikisource:Scriptorium/Archives/2009-03/CrankyLibrarian project

CrankyLibrarian project

CrankyLibrarian has kindly decided to help us pull his collection into Wikisource. I have created a page listing all of the books, with links (which don't work yet) to the pages on the crankylibrarian website.

Wikisource:WikiProject_CrankyLibrarian

We will probably be slurping these pages in via bots, so to help the bots get it right the first time, we will need author pages created, page names disambiguated, and copyright checked.

If we already have an edition, it would be good to spot check that the two are the same and that our edition is of better quality. Any works that we don't want imported can be removed from the list. John Vandenberg ^(chat) 03:12, 6 December 2008 (UTC)

Wow, this is quite the collection we'll be getting! (Too bad there aren't page scans to go with it, but oh well. :) ). It's going to take forever to do those author pages, I must say. Zhaladshar ^(Talk) 16:52, 6 December 2008 (UTC)

Would it help if I rebuilt the page as a table? Then we could have a "notes" column, or an "action" column to record whether we want to import it or not, or whether it needs to be manually merged into our copy. John Vandenberg ^(chat) 22:37, 7 December 2008 (UTC)

I don't think it matters much, one way or the other. Simple notes after each entry should suffice. I presume that this is all happening because he wants to get out of the text hosting business. It would, in either case, help to add letter headers for ease of navigation. Are we working to a time limit? Comparing two editions can involve a whole raft of problems; if we can't be sure of the source of either, we can't know which is the better. Eclecticology (talk) 23:23, 7 December 2008 (UTC)

I corrected this wikiproject page a bit. I think most of these works were copied from Gutenberg. That way, I even found an error where Gutenberg attributes a work to the wrong author, with Cranky most probably copying the error along with the text. I have found a few works in the list which are copyrighted in the USA and were deleted from WS, notably The Great Gatsby, so copyright has to be carefully checked. Yann (talk) 20:09, 29 December 2008 (UTC)

To the extent that the Cranky list includes material copied over from Gutenberg with the usual lack of sourcing, we would do better to remove those works from the list, because we can copy them directly from Gutenberg if we want. The list would then be left with only those works that are relatively unique to the Cranky site, and these could be given greater priority in our efforts. Eclecticology (talk) 21:15, 29 December 2008 (UTC)

I'm thinking that, to help "prioritise" this project, we should remove from the list those works we already have. I'm wary, however, about removing works we don't have that are copied from Gutenberg, since Cranky seems to have an easier set-up for a bot to parse. Sherurcij ^{Collaboration of the Week: Author:Joseph McCabe.} 06:13, 16 February 2009 (UTC)

Since the blatantly obvious has been stated regarding the source of the Cranky content, I'll reiterate the implication of the Cranky interface. If Wiki's intention is simply raw content rather than usability of content, the CL will be of little value. The implied CL interface would be a subset of the book list presentation which, if a CL copy was available, would route the user to CL for portrayal. There would be a "Please Convert" link for unconverted manuscripts in Gutenberg, et al. This is the only interface that makes sense. The inherent nature of cost-efficiency is the re-use of information and its packaging into a user-friendly format. The inherent costs of scanning, vetting, legal clearing, and numerous other trivialities have prevented a usability philosophy in public domain content. Consistent formatting and portrayal enable future exploitation without re-architecting; just re-implementing. But in the conversations on this microscopic issue I detect the inherent creep of bureaucratic mentality and stagnation. This is exhibited by a mindset that focuses on established process and ignores potential interoperability. At one point I envisioned being able to help with architecture and interactivity of Wiki systems and external providers, but I detect the same "built-here" mentality and technical naivete that pervades the public domain providers. The best to you in your endeavors. Ghost Out.