User:Inductiveload/Scripts/PDF page converter
This script "bursts" a PDF into page images. The images will be created in the current directory.
Parameters
This bash script takes a single parameter, the PDF file to be burst into images:
./pdf_conv.sh file.pdf
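The output files are named after the zero-based page index, padded to four digits, followed by the extension. The following is a small sketch of that naming scheme, mirroring the printf call in the source code; the helper name page_filename is hypothetical and not part of the script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script) that reproduces
# the script's output naming: zero-padded page index plus extension.
EXT=".png"

page_filename() {
    # $1: zero-based page index
    printf "%04d%s" "$1" "$EXT"
}

# The first page of file.pdf is index 0, the twelfth is index 11.
echo "first page:   $(page_filename 0)"
echo "twelfth page: $(page_filename 11)"
```

So a 12-page PDF would be expected to yield 0000.png through 0011.png in the current directory.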
Internal parameters
- CORES is the number of concurrent processes to spawn. Because each conversion runs under "nice", using every core of your processor should not affect your interactive session, but there is no benefit to having more processes than cores.
- EXT is the image extension to convert to.
- DENSITY is the resolution, in dots per inch, at which ImageMagick renders the PDF. There is no benefit to exceeding the PDF resolution, but a smaller value will degrade quality.
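Rather than hard-coding CORES, one option is to ask the system how many processors it has. This is only a sketch: it assumes the nproc utility from GNU coreutils is installed, and falls back to the script's default of 4 when it is not.

```shell
#!/bin/bash
# Assumption: nproc (GNU coreutils) may or may not be present.
# If it is missing, fall back to the script's default of 4.
if command -v nproc >/dev/null 2>&1; then
    CORES=$(nproc)
else
    CORES=4
fi

echo "Using $CORES concurrent processes"
```

Setting CORES this way keeps the process count equal to the core count, which is the upper limit of usefulness described above.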
Requirements
- Pdftk, to query the number of pages
- ImageMagick, for the image conversion
Both of these are easily available from standard repos.
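Before running the script, it may be worth checking that both tools are on the PATH. The sketch below does this with the shell built-in "command -v"; the helper name missing_tools is hypothetical, and the check assumes the tools are installed under their usual command names, pdftk and convert.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script):
# print the name of every given command that is not on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: the tools use their usual command names.
MISSING=$(missing_tools pdftk convert)
if [ -n "$MISSING" ]; then
    # Unquoted on purpose so the names print on one line
    echo "Missing required tools:" $MISSING >&2
else
    echo "All required tools found"
fi
```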
Source code
#!/bin/bash
EXT=".png"
CORES=4
DENSITY=300
FILE=$1
echo "Processing $FILE"
#Get the number of pages in the file
tmp=$(pdftk "$FILE" dump_data | grep -i "NumberOfPages")
# Keep only the digits from the "NumberOfPages: N" line
PAGES=${tmp//[^0-9]/}
echo "Processing $PAGES pages"
convert_page(){
    # Takes one argument: the zero-based index of the page to convert
    local CURRENT_PAGE=$1
    local FILENAME
    FILENAME=$(printf "%04d%s" "$CURRENT_PAGE" "$EXT")
    echo " Converting page $CURRENT_PAGE to $FILENAME"
    # Run at the lowest priority so the interactive session stays responsive
    nice -n 19 convert -density "$DENSITY" "$FILE[$CURRENT_PAGE]" "$FILENAME"
}
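THREAD_COUNTER=1
for (( PAGE=0; PAGE<PAGES; PAGE++ ))
do
    convert_page "$PAGE" &
    # Once a full batch of $CORES jobs has been started, wait for all of them
    if test "$THREAD_COUNTER" -ge "$CORES"
    then
        wait
        THREAD_COUNTER=1
    else
        THREAD_COUNTER=$((THREAD_COUNTER + 1))
    fi
done
# Wait for any jobs left in the final, partial batch
wait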
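The loop above starts pages in batches of CORES background jobs and waits for each batch to finish before starting the next. To see that pattern without a PDF, here is a sketch that swaps the conversion for a stand-in job. The names fake_job, OUT and DONE are hypothetical, and the loop is written as a POSIX while loop so that it also runs under plain sh.

```shell
#!/bin/bash
# Sketch of the batch-and-wait pattern with a stand-in job.
CORES=2
PAGES=5
OUT=$(mktemp)

# Hypothetical stand-in for convert_page: record which page was handled.
fake_job() {
    echo "page $1" >> "$OUT"
}

THREAD_COUNTER=1
PAGE=0
while [ "$PAGE" -lt "$PAGES" ]; do
    fake_job "$PAGE" &
    if [ "$THREAD_COUNTER" -ge "$CORES" ]; then
        wait                  # block until the whole batch has finished
        THREAD_COUNTER=1
    else
        THREAD_COUNTER=$((THREAD_COUNTER + 1))
    fi
    PAGE=$((PAGE + 1))
done
wait                          # catch the final, partial batch

# Count how many jobs ran; each job appended exactly one line.
DONE=$(( $(wc -l < "$OUT") ))
echo "Completed $DONE of $PAGES jobs"
rm -f "$OUT"
```

One consequence of this design is that every batch takes as long as its slowest page, so some cores may sit idle while a large page finishes.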