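Pain point
On Windows, merging a couple of PDFs means being nagged to pay for a membership, which is really obnoxious. Is there a tool on Linux that can merge and split PDFs?
There is: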
PDFtk
Features
* Merge PDF Documents or Collate PDF Page Scans
* Split PDF Pages into a New Document
* Rotate PDF Documents or Pages
* Decrypt Input as Necessary (Password Required)
* Encrypt Output as Desired
* Fill PDF Forms with X/FDF Data and/or Flatten Forms
* Generate FDF Data Stencils from PDF Forms
* Apply a Background Watermark or a Foreground Stamp
* Report PDF Metrics, Bookmarks and Metadata
* Add/Update PDF Bookmarks or Metadata
* Attach Files to PDF Pages or the PDF Document
* Unpack PDF Attachments
* Burst a PDF Document into Single Pages
* Uncompress and Re-Compress Page Streams
* Repair Corrupted PDF (Where Possible)
Installation
OS Version
CentOS Linux release 7.8.2003 (Core)
Install dependencies
yum install -y gcc gcc-c++ libXrandr gtk2 libXtst libart_lgpl
Install pdftk 2.02
yum localinstall https://www.linuxglobal.com/static/blog/pdftk-2.02-1.el7.x86_64.rpm
Using pdftk
Merge PDFs
pdftk ./PDF/*.pdf cat output merge.pdf
You can also list the files to merge explicitly
pdftk ./PDF/1.pdf ./PDF/2.pdf cat output merge.pdf
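A glob such as ./PDF/*.pdf expands in lexical order, so 10.pdf lands before 2.pdf in the merged output. The sketch below is one way to merge in natural numeric order instead. It assumes GNU sort (for -V), a find that supports -maxdepth, and file names without whitespace; list_pdfs_sorted and merge_sorted are hypothetical helper names, not part of pdftk.

```shell
# Print the PDFs directly inside directory $1, one per line, in natural
# (version) order, so 2.pdf comes before 10.pdf.
# Assumption: GNU sort -V is available.
list_pdfs_sorted() {
    find "$1" -maxdepth 1 -type f -name '*.pdf' | sort -V
}

# Merge every PDF in directory $1 into output file $2, in natural order.
# Runs in a subshell so its variables do not leak into the caller.
merge_sorted() (
    dir=$1
    out=$2
    files=$(list_pdfs_sorted "$dir")
    if [ -z "$files" ]; then
        echo "no PDF files found in $dir" >&2
        exit 1
    fi
    # Unquoted on purpose: the list is split into one argument per file.
    # Assumption: no file name contains whitespace or glob characters.
    pdftk $files cat output "$out"
)
```

Usage would look like merge_sorted ./PDF /tmp/merge.pdf. Writing the output outside the input directory keeps an old merge.pdf from being picked up as an input on the next run.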
Split a PDF
Split the PDF into single pages and dump its document data to doc_data.txt
Splits a single input PDF document into individual pages
pdftk merge.pdf burst
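So far I have not found an option to split by file size. The workaround is to burst the document into single pages and then merge them back together as needed.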
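Besides burst, the cat operation accepts page ranges, so another workaround is to cut by page count rather than by file size. A rough sketch, assuming dump_data prints a NumberOfPages line; page_ranges and split_by_pages are hypothetical helpers, and the part_NNN.pdf naming is arbitrary:

```shell
# Print one "start-end" range per line for a document of $1 pages,
# cut into chunks of at most $2 pages.
page_ranges() (
    total=$1
    chunk=$2
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + chunk - 1))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "${start}-${end}"
        start=$((end + 1))
    done
)

# Split PDF $1 into part_001.pdf, part_002.pdf, ... of $2 pages each.
split_by_pages() (
    in=$1
    chunk=$2
    # Assumption: dump_data output contains a line "NumberOfPages: N".
    total=$(pdftk "$in" dump_data | awk '/^NumberOfPages:/ { print $2 }')
    n=1
    for range in $(page_ranges "$total" "$chunk"); do
        pdftk "$in" cat "$range" output "$(printf 'part_%03d.pdf' "$n")"
        n=$((n + 1))
    done
)
```

For example, split_by_pages merge.pdf 10 would write ten-page parts into the current directory. Page count is only a stand-in for size: parts with many images will still be larger than others.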
Compress a PDF
pdftk ./merge.pdf cat output merged-compress.pdf compress
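The size reduction is barely noticeable, so convert (ImageMagick) or gs (Ghostscript) is a better choice.
It is a trade-off between quality and size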
Pixelated (lossy):
convert input.pdf -compress Zip output.pdf
Unpixelated (lossless, but may display slightly differently):
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dBATCH -dQUIET -sOutputFile=output.pdf input.pdf
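To see where a given document sits on that trade-off, it helps to try more than one -dPDFSETTINGS preset and compare sizes. A small sketch, assuming gs is on PATH; compress_pdf is a hypothetical wrapper around the command above, and the default of ebook is my own pick, not a recommendation from Ghostscript:

```shell
# Compress PDF $1 into $2 with Ghostscript preset $3 (default: ebook),
# then print the size before and after.
compress_pdf() (
    in=$1
    out=$2
    preset=${3:-ebook}
    # Presets accepted by pdfwrite's -dPDFSETTINGS option.
    case "$preset" in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $preset" >&2; exit 2 ;;
    esac
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 "-dPDFSETTINGS=/$preset" \
       -dNOPAUSE -dBATCH -dQUIET "-sOutputFile=$out" "$in" || exit 1
    before=$(wc -c < "$in")
    after=$(wc -c < "$out")
    echo "$in: $before bytes -> $out: $after bytes ($preset)"
)
```

Running compress_pdf input.pdf out-screen.pdf screen and compress_pdf input.pdf out-ebook.pdf ebook side by side shows how much each preset saves. The screen preset is usually the smallest and the roughest on images.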
Issues
1. Out-of-memory warning when running locally
GC Warning: Repeated allocation of very large block (appr. size 139264): May lead to memory leak and poor performance.
How to move pdftk to another machine
Copying just the binary over does not work; it fails to run
pdftk ./PDF/*.pdf cat output merge.pdf
pdftk: error while loading shared libraries: libgcj.so.10: cannot open shared object file: No such file or directory
How do you copy the shared libraries this binary needs?
The ldd command lists them
ldd /bin/pdftk | egrep -o '/lib.*\.[0-9]'
/lib64/libgcj.so.1
/lib64/libstdc++.so.6
/lib64/libm.so.6
/lib64/libgcc_s.so.1
/lib64/libc.so.6
/lib64/libpthread.so.0
/lib64/librt.so.1
/lib64/libdl.so.2
/lib64/libz.so.1
/lib64/ld-linux-x86-64.so.2
Use the small script below to copy the .so files pdftk depends on to a more powerful machine.
list=$(ldd /bin/pdftk | egrep -o '/lib.*\.[0-9]')
for i in $list; do scp -p "$i" ip:/lib64/; done
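libgcj.so.10 is still missing
It turned out this file had not been copied over. I first suspected the regex was wrong, then decided that was not the cause, because the file lives under /usr.
Fix: locate it with find and copy it over. Working without internet access is really painful.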
find /usr -name "libgcj.so*"
/usr/lib64/libgcj.so.10
How to change the regex so that it matches libgcj.so.10
⚡ root@localhost /tmp ldd /bin/pdftk | egrep -o '*/lib.*\.[0-9]'
/lib64/libgcj.so.1
/lib64/libstdc++.so.6
/lib64/libm.so.6
/lib64/libgcc_s.so.1
/lib64/libc.so.6
/lib64/libpthread.so.0
/lib64/librt.so.1
/lib64/libdl.so.2
/lib64/libz.so.1
/lib64/ld-linux-x86-64.so.2
⚡ root@localhost /tmp ldd /bin/pdftk | egrep -o '*/lib.*\.[0-9]+'
/lib64/libgcj.so.10
/lib64/libstdc++.so.6
/lib64/libm.so.6
/lib64/libgcc_s.so.1
/lib64/libc.so.6
/lib64/libpthread.so.0
/lib64/librt.so.1
/lib64/libdl.so.2
/lib64/libz.so.1
/lib64/ld-linux-x86-64.so.2
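The single-digit [0-9] is what cut libgcj.so.10 down to libgcj.so.1. An alternative that avoids the regex altogether is to read the path field from each ldd line. A sketch, assuming the usual glibc ldd layout ("name => /path (addr)" or "/path (addr)"); lib_paths and copy_libs are hypothetical helpers, and the destination is whatever scp target you pass in:

```shell
# Read ldd output on stdin and print the absolute library paths,
# one per line. Entries without a path (e.g. linux-vdso) are skipped.
lib_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }
         $1 ~ /^\//               { print $1 }'
}

# Copy every library that binary $1 links against to scp target $2.
copy_libs() (
    bin=$1
    dest=$2
    ldd "$bin" | lib_paths | while read -r lib; do
        # </dev/null keeps scp from consuming the rest of the list on stdin.
        scp -p "$lib" "$dest" </dev/null
    done
)
```

Usage would be copy_libs /bin/pdftk ip:/lib64/, matching the loop above. Overwriting core libraries such as libc.so.6 on the target is risky if the two machines run different releases, so copying only the libraries that are actually missing there is the safer route.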
Limitations of the Windows GUI
The free version can only merge and split.
The Pro version can split, merge, rotate, watermark, stamp and secure PDF pages and documents. Price: $3.99.