Wikisource:Scriptorium/Archives/2009-03/double pages in djvu
double pages in djvu
With the help of Help:DjVu files I now managed to create a djvu file from my png scans. But one issue remains: My scans were scans of double pages, always a left side and a right side on one png. So my scan of a 180 page book results in a djvu of 90 pages. Is there any convenient way to split the original pngs or the pages in the djvu so I will get a djvu with 180 pages? Does anybody know how to solve this problem? --Slomox (talk) 15:08, 18 February 2009 (UTC)
- I do something like this all the time, but under Linux. Given files labeled 001.png through 999.png that are 3500 pixels across and 300 DPI:
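mkdir Output
# Loop over the zero-padded page numbers 001..999
for i in `seq -w 1 999`
do
    # Skip numbers that have no scan
    [ -f "$i".png ] || continue
    pngtopnm "$i".png > temp.pnm
    # Left page: columns 0 through 1750
    pnmcut -right 1750 temp.pnm > temp1.pnm
    cjb2 -dpi 300 temp1.pnm "$i"a.djvu
    # Right page: column 1750 through the right edge
    pnmcut -left 1750 temp.pnm > temp1.pnm
    cjb2 -dpi 300 temp1.pnm "$i"b.djvu
    rm temp.pnm temp1.pnm
done
# Bundle the single-page files into one document, in filename order
djvm -c book.djvu [0-9][0-9][0-9][ab].djvu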
If they aren't even pages, half the width (1750, in this case) may not work, and you may want to cut a bit off the edges, too. If the scans aren't totally even, you may need to change that value part way through the book. Probably less than helpful, but that's how I do it.--Prosfilaes (talk) 16:47, 18 February 2009 (UTC)
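The cut column does not have to be hard-coded. The following is a minimal sketch of the arithmetic, assuming 0-based column numbers (as pnmcut uses) and a gutter at the exact middle of the scan. The function name split_columns and its MARGIN argument are illustrative only and are not part of the script above.

```shell
#!/bin/sh
# split_columns WIDTH [MARGIN]
# Prints four 0-based column numbers on one line:
#   left-page first column, left-page last column,
#   right-page first column, right-page last column.
# MARGIN (hypothetical parameter) is the number of pixels trimmed
# from each outer edge of the scan; it defaults to 0.
split_columns() {
    width=$1
    margin=${2:-0}
    half=$((width / 2))
    echo "$margin $((half - 1)) $half $((width - 1 - margin))"
}

# Example: a 3500-pixel-wide scan with 50 pixels trimmed per outer edge.
split_columns 3500 50
```

The first pair of numbers could be passed to pnmcut as -left and -right for the left page, and the second pair for the right page. If the gutter drifts part way through the book, the midpoint would need an offset that this sketch does not model.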
- The unpaper utility, which I generally try to use when cleaning up scanned pages, will optionally convert a single scanned image of two side-by-side pages into two separate output files (see the --input-pages and --output-pages options in the documentation). It locates the proper content for each page semi-intelligently by searching for margins consisting of mostly white space. I have been happy with its output so far. Tarmstro99 (talk) 17:15, 18 February 2009 (UTC)
- Unpaper looks good, but I couldn't find a pre-compiled download. Although personally I like GUI programs most, I'm fine with command-line tools. But if I even have to compile the program, that's a bit too much for me ;-) Is there a pre-compiled Windows version available for unpaper? --Slomox (talk) 17:56, 18 February 2009 (UTC)
- Google is your friend! :-) See http://www.abs.net/~donovan/pgdp.html. Tarmstro99 (talk) 18:26, 18 February 2009 (UTC)
- Thank you. I still have one problem: If I provide a multi-page pbm as input, it will only handle the first page. Is there any special parameter I have to provide to handle all pages? --Slomox (talk) 20:44, 18 February 2009 (UTC)
- I don’t believe so. The solution is to split document.pbm into doc0000.pbm, doc0001.pbm, doc0002.pbm, ... doc0099.pbm with pamsplit, then feed the resulting files into unpaper (which will accept an input parameter such as doc%04d.pbm to automatically start processing multiple files starting from doc0000.pbm). If you want to start processing at, say, page doc0004.pbm instead of counting from 0, just give unpaper the parameters -si 4 doc%04d.pbm. E-mail me if you have further problems with unpaper; I’ve used it for quite a few projects now, and the time spent mastering its idiosyncrasies is well worth it given the quality of its output. Tarmstro99 (talk) 21:02, 18 February 2009 (UTC)
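To check which files a pattern such as doc%04d.pbm and a start index refer to, the expansion can be reproduced with the shell's own printf. This sketch only illustrates the numbering and does not invoke unpaper or pamsplit. The function name expand_pattern is hypothetical.

```shell
#!/bin/sh
# expand_pattern PATTERN START COUNT
# Prints COUNT file names, one per line, obtained by substituting
# consecutive integers beginning at START into a printf-style PATTERN.
expand_pattern() {
    pattern=$1
    n=$2
    end=$(( $2 + $3 ))
    while [ "$n" -lt "$end" ]; do
        # The pattern itself serves as the printf format string.
        printf "$pattern\n" "$n"
        n=$((n + 1))
    done
}

# Names of three sheets, starting at index 4 as in the -si 4 example.
expand_pattern 'doc%04d.pbm' 4 3
```

Comparing this list with the files actually present in the directory is a quick way to spot a gap in the numbering before starting a long run.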