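Hi everyone, I'm Xiaowu.
Word to PDF: A Fantastical Journey
A Word document is like a programmer working from home in pajamas: comfortable, but a little sloppy. And a PDF? That's the same person in a suit and tie, walking into a board meeting: professional and not budging an inch!
The transformation goes something like this:
- Word document: "Ha! I can swap fonts whenever I like, nudge the margins at will, and drag pictures all over the place~"
- PDF: "Quiet! I'm in charge now. Every pixel, stand at your post!"
Doing this conversion in Spring Boot is like hiring a document Transformer that tames a free-spirited Word file into a well-drilled PDF soldier. Let me walk you through this "format taming ceremony"!
Preparation: stocking your "transformation toolbox"
Step 1: a Maven dependency shopping spree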
    <!-- Add these treasures to pom.xml -->
    <dependencies>
        <!-- Spring Boot standard kit -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <!-- Apache POI: the "mind reader" for Word documents -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>5.2.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>5.2.3</version>
        </dependency>

        <!-- OpenPDF: the "printer" for PDFs -->
        <dependency>
            <groupId>com.github.librepdf</groupId>
            <artifactId>openpdf</artifactId>
            <version>1.3.30</version>
        </dependency>

        <!-- File type detection: keeps you from treating a picture as a Word file -->
        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-core</artifactId>
            <version>2.7.0</version>
        </dependency>
    </dependencies>
Step 2: the configuration file
    # application.yml
    word-to-pdf:
      upload-dir: "uploads/"       # temporary parking spot for Word files
      output-dir: "pdf-output/"    # warehouse for finished PDFs
      max-file-size: 10MB          # don't test me with War and Peace
    spring:
      servlet:
        multipart:
          max-file-size: 10MB
          max-request-size: 10MB
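The `upload-dir` and `output-dir` settings are not read by any of the code shown in this article, so how they get used is up to you. Here is a minimal sketch, assuming a hypothetical helper called `StoragePaths`, of one way a client-supplied file name could be resolved inside such a directory without letting it wander outside. It uses only the JDK; the class and method names are my own, not part of Spring.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (an assumption for illustration, not a Spring API).
final class StoragePaths {

    private StoragePaths() {}

    /**
     * Strips any directory part from fileName, resolves the rest inside
     * baseDir, and rejects whatever would still escape it (for example "..").
     */
    static Path resolveInside(String baseDir, String fileName) {
        Path base = Paths.get(baseDir).toAbsolutePath().normalize();
        // Keep only the last path segment so client-supplied folders are ignored.
        Path nameOnly = Paths.get(fileName).getFileName();
        if (nameOnly == null) {
            throw new IllegalArgumentException("Empty file name");
        }
        Path target = base.resolve(nameOnly).normalize();
        if (!target.startsWith(base) || target.equals(base)) {
            throw new IllegalArgumentException("File name escapes " + baseDir);
        }
        return target;
    }
}
```

For example, `StoragePaths.resolveInside("uploads/", "../../report.docx")` ends up at `uploads/report.docx` under the working directory, while a bare `".."` is rejected.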
Core code: transform, Mr. Word!
1. The file upload controller (the receptionist)
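    import org.springframework.web.bind.annotation.*;
    import org.springframework.web.multipart.MultipartFile;
    // Spring Boot 2.x; on Spring Boot 3.x import jakarta.servlet.http.HttpServletResponse instead.
    import javax.servlet.http.HttpServletResponse;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.StandardCopyOption;

    @RestController
    @RequestMapping("/api/doc-transform")
    public class WordToPdfController {

        @PostMapping("/word-to-pdf")
        public void convertWordToPdf(@RequestParam("file") MultipartFile wordFile,
                                     HttpServletResponse response) throws IOException {
            // 1. Check the file: no passing off cat pictures as Word documents!
            if (!isWordDocument(wordFile)) {
                // Set the status and content type before writing the body.
                response.setStatus(HttpServletResponse.SC_BAD_REQUEST);
                response.setContentType("text/plain;charset=UTF-8");
                response.getWriter().write("Hey! That's not a Word document. Don't try to fool me!");
                return;
            }

            // 2. Park the Word file temporarily (like the holding area before security).
            //    createTempFile gives an absolute path in the system temp directory.
            File tempWordFile = File.createTempFile("word_", ".docx");
            byte[] pdfBytes;
            try (InputStream in = wordFile.getInputStream()) {
                Files.copy(in, tempWordFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
                // 3. Transform!
                pdfBytes = WordToPdfConverter.convert(tempWordFile);
            } finally {
                // 4. Clean up the scene, even when the conversion fails.
                tempWordFile.delete();
            }

            // 5. Hand the PDF to the user.
            String pdfName = wordFile.getOriginalFilename().replaceAll("(?i)\\.docx$", ".pdf");
            response.setContentType("application/pdf");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + pdfName + "\"");
            response.getOutputStream().write(pdfBytes);
            System.out.println("Conversion succeeded! Another Word file tamed into a PDF!");
        }

        private boolean isWordDocument(MultipartFile file) {
            String fileName = file.getOriginalFilename();
            // Only .docx is accepted: XWPFDocument cannot read the legacy binary .doc format.
            return fileName != null && fileName.toLowerCase().endsWith(".docx");
        }
    }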
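The `isWordDocument` check above only looks at the file extension, so a renamed cat picture would still get through. The Tika dependency from the Maven step is meant for this kind of detection, but the article never wires it in. As a rough, JDK-only sketch of the underlying idea (the class name `WordFileSniffer` is hypothetical), you can compare the first bytes of the upload against well-known file signatures: `.docx` files are ZIP archives, and legacy `.doc` files are OLE2 compound documents.

```java
import java.util.Arrays;

// Hypothetical helper: sniffs the leading bytes instead of trusting the extension.
final class WordFileSniffer {

    // ZIP local file header signature: the bytes 50 4B 03 04 ("PK" plus two control bytes).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // OLE2 compound document signature, used by legacy .doc files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private WordFileSniffer() {}

    /** True if the data starts like a ZIP archive, which every .docx does. */
    static boolean looksLikeDocx(byte[] head) {
        return startsWith(head, ZIP_MAGIC);
    }

    /** True if the data starts like an OLE2 container, which every .doc does. */
    static boolean looksLikeDoc(byte[] head) {
        return startsWith(head, OLE2_MAGIC);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data != null
            && data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

This is only a first-pass filter: any ZIP file (an `.xlsx`, a `.jar`) passes `looksLikeDocx`, so a real project would still want a proper detector on top of it.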
2. The converter core (the real transformation engine)
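    // Explicit imports: wildcard-importing both packages makes "Document" ambiguous.
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import org.apache.poi.xwpf.usermodel.XWPFParagraph;
    import org.apache.poi.xwpf.usermodel.XWPFPictureData;
    import org.apache.poi.xwpf.usermodel.XWPFTable;
    import org.apache.poi.xwpf.usermodel.XWPFTableCell;
    import org.apache.poi.xwpf.usermodel.XWPFTableRow;
    import com.lowagie.text.Chunk;
    import com.lowagie.text.Document;
    import com.lowagie.text.Element;
    import com.lowagie.text.Font;
    import com.lowagie.text.Image;
    import com.lowagie.text.Paragraph;
    import com.lowagie.text.pdf.PdfPTable;
    import com.lowagie.text.pdf.PdfWriter;
    import org.springframework.stereotype.Component;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    @Component
    public class WordToPdfConverter {

        public static byte[] convert(File wordFile) throws IOException {
            ByteArrayOutputStream pdfOutputStream = new ByteArrayOutputStream();

            // 1. Open the Word document (like opening Pandora's box).
            try (FileInputStream fis = new FileInputStream(wordFile);
                 XWPFDocument document = new XWPFDocument(fis)) {

                // 2. Create the PDF document (getting the new home ready).
                Document pdfDocument = new Document();
                PdfWriter.getInstance(pdfDocument, pdfOutputStream);
                pdfDocument.open();

                // 3. Carry the content over paragraph by paragraph (like ants moving house).
                //    Note: paragraphs, pictures and tables are copied in three separate passes,
                //    so their original order in the document is not preserved.
                System.out.println("Moving paragraphs, " + document.getParagraphs().size() + " in total...");
                for (XWPFParagraph para : document.getParagraphs()) {
                    if (para.getText().trim().isEmpty()) continue;

                    // Map the paragraph style to a font.
                    Font font = new Font();
                    if (para.getStyle() != null) {
                        switch (para.getStyle()) {
                            case "Heading1":
                                font = new Font(Font.HELVETICA, 18, Font.BOLD);
                                break;
                            case "Heading2":
                                font = new Font(Font.HELVETICA, 16, Font.BOLD);
                                break;
                            default:
                                font = new Font(Font.HELVETICA, 12, Font.NORMAL);
                        }
                    }

                    Paragraph pdfPara = new Paragraph(para.getText(), font);
                    pdfDocument.add(pdfPara);
                    pdfDocument.add(Chunk.NEWLINE); // add a line break to catch our breath
                }

                // 4. Handle pictures (the hardest part of the move).
                System.out.println("Handling pictures, " + document.getAllPictures().size() + " in total...");
                for (XWPFPictureData picture : document.getAllPictures()) {
                    try {
                        byte[] pictureData = picture.getData();
                        Image image = Image.getInstance(pictureData);
                        image.scaleToFit(500, 500); // put a leash on the picture so it doesn't get too big
                        image.setAlignment(Element.ALIGN_CENTER);
                        pdfDocument.add(image);
                        pdfDocument.add(Chunk.NEWLINE);
                    } catch (Exception e) {
                        System.err.println("Picture " + picture.getFileName()
                                + " misbehaved and could not be converted: " + e.getMessage());
                    }
                }

                // 5. Handle tables (Excel: "I want in on this too").
                for (XWPFTable table : document.getTables()) {
                    if (table.getRows().isEmpty()) continue;
                    // The PDF table is sized by column count, not row count.
                    int columns = table.getRow(0).getTableCells().size();
                    if (columns == 0) continue;
                    PdfPTable pdfTable = new PdfPTable(columns);
                    for (XWPFTableRow row : table.getRows()) {
                        for (XWPFTableCell cell : row.getTableCells()) {
                            pdfTable.addCell(cell.getText());
                        }
                        pdfTable.completeRow(); // pad rows that have fewer cells than the first row
                    }
                    pdfDocument.add(pdfTable);
                }

                pdfDocument.close();
                System.out.println("Conversion done! PDF size: " + (pdfOutputStream.size() / 1024) + " KB");
            } catch (Exception e) {
                System.err.println("Something unexpected happened during conversion: " + e.getMessage());
                throw new IOException("Conversion failed; the Word document may be under a spell", e);
            }

            return pdfOutputStream.toByteArray();
        }
    }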
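A word on `image.scaleToFit(500, 500)`: it scales the picture proportionally so that it fits inside a 500 x 500 box. To my understanding it applies the same ratio to small pictures too, which means a tiny icon can be blown up to fill the box. If you would rather only shrink, the arithmetic is easy to do yourself. The sketch below uses a hypothetical `FitBox` helper of my own (it is not an OpenPDF class) to compute the target size.

```java
// Hypothetical helper that illustrates the arithmetic; not part of OpenPDF.
final class FitBox {

    private FitBox() {}

    /**
     * Returns {width, height} scaled proportionally to fit inside
     * maxW x maxH. Images that already fit are returned unchanged.
     */
    static float[] shrinkToFit(float width, float height, float maxW, float maxH) {
        if (width <= 0 || height <= 0 || maxW <= 0 || maxH <= 0) {
            throw new IllegalArgumentException("All dimensions must be positive");
        }
        // The tighter of the two ratios keeps both sides inside the box;
        // capping at 1 means the image is never enlarged.
        float ratio = Math.min(1f, Math.min(maxW / width, maxH / height));
        return new float[] {width * ratio, height * ratio};
    }
}
```

A 1000 x 500 picture comes out as 500 x 250, while a 100 x 50 picture stays as it is. In the converter, the result could then be passed to `image.scaleAbsolute(width, height)` instead of calling `scaleToFit`.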
3. Exception handling (the ambulance for failed transformations)
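    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.ControllerAdvice;
    import org.springframework.web.bind.annotation.ExceptionHandler;
    import java.io.IOException;

    @ControllerAdvice
    public class DocumentConversionExceptionHandler {

        @ExceptionHandler(IOException.class)
        public ResponseEntity<String> handleIOException(IOException e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body("Document conversion failed. Possible causes:\n"
                            + "1. The Word document was encrypted by aliens\n"
                            + "2. The file is too big for the server to lift\n"
                            + "3. The network connection is dozing off\n"
                            + "Error details: " + e.getMessage());
        }

        @ExceptionHandler(InvalidFormatException.class)
        public ResponseEntity<String> handleInvalidFormat(Exception e) {
            return ResponseEntity.badRequest()
                    .body("Hey! Is that really a Word document?\n"
                            + "My guess is you uploaded:\n"
                            + "□ A cat picture \n"
                            + "□ An Excel spreadsheet \n"
                            + "□ A chicken-soup-for-the-soul text file \n"
                            + "Please upload a proper .docx file!");
        }
    }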
4. Progress monitoring (a live broadcast of the transformation)
    import org.springframework.stereotype.Component;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    @Component
    public class ConversionProgressService {

        private final Map<String, Integer> progressMap = new ConcurrentHashMap<>();

        public void startConversion(String fileId) {
            progressMap.put(fileId, 0);
            System.out.println("Starting conversion of file: " + fileId);
        }

        public void updateProgress(String fileId, int percent) {
            progressMap.put(fileId, percent);

            // Print a progress bar (to look fancy): 20 cells, 5% each.
            StringBuilder progressBar = new StringBuilder("[");
            for (int i = 0; i < 20; i++) {
                progressBar.append(i * 5 < percent ? "█" : "░");
            }
            progressBar.append("] ").append(percent).append("%");
            System.out.println(fileId + " conversion progress: " + progressBar);

            // A little pep talk along the way.
            if (percent == 50) {
                System.out.println("Halfway there, hang on!");
            } else if (percent == 90) {
                System.out.println("Almost done, preparing to launch the PDF!");
            }
        }

        public void completeConversion(String fileId) {
            progressMap.remove(fileId);
            System.out.println(fileId + " converted. Job done, no applause needed~");
        }
    }
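The bar-drawing loop inside `updateProgress` is easier to test when it lives in a small function of its own. Here is one way that could look, as a sketch with a hypothetical class name `ProgressBarText`. It keeps the same 20-cell layout, and additionally clamps the input to 0..100 so that a stray value cannot produce an odd-looking bar; that clamping is my own addition.

```java
// Hypothetical helper extracted for illustration; not part of the original service.
final class ProgressBarText {

    private static final int CELLS = 20;
    private static final char FULL = '\u2588';  // full block character
    private static final char EMPTY = '\u2591'; // light shade character

    private ProgressBarText() {}

    /** Renders a 20-cell bar followed by the percentage. */
    static String render(int percent) {
        int clamped = Math.max(0, Math.min(100, percent));
        // Each cell stands for 5%. Rounding up means any progress above 0
        // lights at least one cell, matching the loop "i * 5 < percent".
        int filled = (clamped * CELLS + 99) / 100;
        StringBuilder bar = new StringBuilder("[");
        for (int i = 0; i < CELLS; i++) {
            bar.append(i < filled ? FULL : EMPTY);
        }
        return bar.append("] ").append(clamped).append('%').toString();
    }
}
```

With this in place, `updateProgress` could simply print `fileId + " conversion progress: " + ProgressBarText.render(percent)`.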
Front-end example (the user-facing page)
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
        <title>Word-to-PDF Transformation Workshop</title>
        <style>
            body { font-family: 'Comic Sans MS', cursive; padding: 20px; }
            .container { max-width: 600px; margin: 0 auto; }
            .drop-zone {
                border: 3px dashed #4CAF50;
                border-radius: 10px;
                padding: 40px;
                text-align: center;
                background: #f9f9f9;
                cursor: pointer;
            }
            .drop-zone:hover { background: #e8f5e9; }
            .convert-btn {
                background: linear-gradient(45deg, #FF6B6B, #4ECDC4);
                color: white;
                border: none;
                padding: 15px 30px;
                border-radius: 25px;
                font-size: 18px;
                cursor: pointer;
                margin-top: 20px;
            }
            .progress-bar {
                width: 100%;
                height: 20px;
                background: #ddd;
                border-radius: 10px;
                margin-top: 20px;
                overflow: hidden;
                display: none;
            }
            .progress-fill {
                height: 100%;
                background: linear-gradient(90deg, #4CAF50, #8BC34A);
                width: 0%;
                transition: width 0.3s;
            }
        </style>
    </head>
    <body>
        <div class="container">
            <h1>Word-to-PDF Transformation Workshop</h1>
            <p>Toss in your Word document and get a well-behaved PDF back!</p>

            <div class="drop-zone" id="dropZone">
                <h2>Drag a Word file here</h2>
                <p>or <label style="color: #2196F3; cursor: pointer;">click to choose a file
                    <input type="file" id="fileInput" accept=".docx" hidden>
                </label></p>
            </div>

            <button class="convert-btn" onclick="convertToPdf()">
                Start transforming!
            </button>

            <div class="progress-bar" id="progressBar">
                <div class="progress-fill" id="progressFill"></div>
            </div>

            <div id="status" style="margin-top: 20px;"></div>
        </div>

        <script>
            const dropZone = document.getElementById('dropZone');
            const fileInput = document.getElementById('fileInput');
            const statusBox = document.getElementById('status');
            let selectedFile = null;

            // Drag and drop
            dropZone.addEventListener('dragover', (e) => {
                e.preventDefault();
                dropZone.style.background = '#e8f5e9';
            });

            dropZone.addEventListener('drop', (e) => {
                e.preventDefault();
                dropZone.style.background = '#f9f9f9';
                selectedFile = e.dataTransfer.files[0];
                // textContent avoids interpreting the file name as HTML.
                statusBox.textContent = `Selected: ${selectedFile.name}`;
            });

            fileInput.addEventListener('change', (e) => {
                selectedFile = e.target.files[0];
                statusBox.textContent = `Selected: ${selectedFile.name}`;
            });

            // Conversion
            async function convertToPdf() {
                if (!selectedFile) {
                    alert('Please choose a Word file first!');
                    return;
                }

                const formData = new FormData();
                formData.append('file', selectedFile);

                // Show the progress bar
                const progressBar = document.getElementById('progressBar');
                const progressFill = document.getElementById('progressFill');
                progressBar.style.display = 'block';

                // Simulated progress (a real project could use WebSocket)
                let progress = 0;
                const interval = setInterval(() => {
                    progress += 10;
                    progressFill.style.width = `${progress}%`;
                    if (progress >= 90) clearInterval(interval);
                }, 300);

                try {
                    const response = await fetch('/api/doc-transform/word-to-pdf', {
                        method: 'POST',
                        body: formData
                    });

                    clearInterval(interval);
                    progressFill.style.width = '100%';

                    if (response.ok) {
                        // Download the PDF
                        const blob = await response.blob();
                        const url = window.URL.createObjectURL(blob);
                        const a = document.createElement('a');
                        a.href = url;
                        a.download = selectedFile.name.replace(/\.docx?$/i, '.pdf');
                        document.body.appendChild(a);
                        a.click();
                        a.remove();
                        window.URL.revokeObjectURL(url);

                        statusBox.textContent = 'Conversion succeeded! Your PDF is downloading~';

                        // Reset after 3 seconds
                        setTimeout(() => {
                            progressBar.style.display = 'none';
                            progressFill.style.width = '0%';
                            statusBox.textContent = '';
                        }, 3000);
                    } else {
                        const errorText = await response.text();
                        statusBox.textContent = `Conversion failed: ${errorText}`;
                    }
                } catch (error) {
                    clearInterval(interval);
                    statusBox.textContent = `Network error: ${error.message}`;
                }
            }
        </script>
    </body>
    </html>
Advanced extensions
Batch conversion (group transformation mode)
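    import org.springframework.scheduling.annotation.Async;
    import org.springframework.stereotype.Service;
    import org.springframework.web.multipart.MultipartFile;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.StandardCopyOption;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    @Service
    public class BatchConversionService {

        // Runs off the request thread; needs @EnableAsync on a configuration class.
        // Caution: uploads may be cleaned up once the request ends, so the caller
        // has to make sure the files are still readable while this method runs.
        @Async
        public CompletableFuture<List<File>> convertMultiple(List<MultipartFile> files) {
            System.out.println("Starting batch conversion of " + files.size() + " files. Let's go!");

            List<File> pdfFiles = new ArrayList<>();
            List<CompletableFuture<File>> futures = new ArrayList<>();

            for (int i = 0; i < files.size(); i++) {
                final int index = i;
                CompletableFuture<File> future = CompletableFuture.supplyAsync(() -> {
                    File source = null;
                    try {
                        System.out.println("Converting file " + (index + 1) + "...");
                        source = convertToFile(files.get(index));
                        byte[] pdfBytes = WordToPdfConverter.convert(source);
                        File pdfFile = new File("converted_" + index + ".pdf");
                        Files.write(pdfFile.toPath(), pdfBytes);
                        return pdfFile;
                    } catch (Exception e) {
                        System.err.println("File " + (index + 1) + " failed to convert: " + e.getMessage());
                        return null; // failures are reported as null and skipped below
                    } finally {
                        if (source != null) source.delete();
                    }
                });
                futures.add(future);
            }

            // Wait for every conversion to finish.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

            for (CompletableFuture<File> future : futures) {
                File pdf = future.join();
                if (pdf != null) pdfFiles.add(pdf);
            }

            System.out.println("Batch conversion done! Succeeded: " + pdfFiles.size() + "/" + files.size() + " files");
            return CompletableFuture.completedFuture(pdfFiles);
        }

        // Copies an upload to a temporary .docx file that the converter can read.
        private File convertToFile(MultipartFile upload) throws IOException {
            File temp = File.createTempFile("batch_", ".docx");
            try (InputStream in = upload.getInputStream()) {
                Files.copy(in, temp.toPath(), StandardCopyOption.REPLACE_EXISTING);
            }
            return temp;
        }
    }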
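One thing worth knowing about the batch service: `CompletableFuture.supplyAsync` without an executor argument runs on the JVM-wide common pool, so a large batch competes with everything else that uses it. A small variation is to hand it a bounded pool of your own. The sketch below shows that pattern with plain JDK classes; `BatchRunner` and `runAll` are hypothetical names, and the `task` function stands in for the actual Word-to-PDF call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper showing the pattern; the conversion itself is passed in as "task".
final class BatchRunner {

    private BatchRunner() {}

    /**
     * Applies task to every input on a pool of at most maxParallel threads.
     * Results keep the input order; inputs whose task throws are left out.
     */
    static <I, O> List<O> runAll(List<I> inputs, Function<I, O> task, int maxParallel) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<CompletableFuture<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                futures.add(CompletableFuture
                        .supplyAsync(() -> task.apply(input), pool)
                        .exceptionally(ex -> null)); // a failure becomes null
            }
            return futures.stream()
                    .map(CompletableFuture::join)   // wait for each result in order
                    .filter(Objects::nonNull)       // drop the failures
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }
}
```

Creating a pool per call keeps the example short; in a long-running service you would more likely share one executor (for instance the one Spring configures for `@Async`) rather than build a new one each time.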
Conversion records (the transformation archive)
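    // ConversionRecord.java
    // Needs spring-boot-starter-data-jpa and Lombok, which are not in the dependency list above.
    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;
    // Spring Boot 2.x; on Spring Boot 3.x these annotations live in jakarta.persistence.
    import javax.persistence.*;
    import java.time.LocalDateTime;

    @Entity
    @Table(name = "conversion_records")
    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    public class ConversionRecord {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        private String originalFileName;
        private String pdfFileName;
        private Long originalSize;
        private Long pdfSize;
        private LocalDateTime conversionTime;
        private String status; // SUCCESS, FAILED, PROCESSING
        private String errorMessage;

        @PrePersist
        protected void onCreate() {
            conversionTime = LocalDateTime.now();
        }
    }

    // ConversionRecordRepository.java (a separate file)
    import org.springframework.data.jpa.repository.JpaRepository;
    import org.springframework.stereotype.Repository;
    import java.util.List;

    @Repository
    public interface ConversionRecordRepository extends JpaRepository<ConversionRecord, Long> {
        List<ConversionRecord> findByStatusOrderByConversionTimeDesc(String status);
    }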
Deployment and tuning tips
1. Performance tuning
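    # Add to application.yml
    # Property names for Spring Boot 2.3 and later; older versions use
    # server.tomcat.max-threads and server.tomcat.min-spare-threads.
    server:
      tomcat:
        threads:
          max: 200          # worker threads for concurrent conversions (200 is also the default)
          min-spare: 20
    spring:
      task:
        execution:
          pool:
            core-size: 10   # thread pool for async tasks
            max-size: 50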
2. Memory management
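    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    @Component
    public class MemoryWatcher {

        // Runs once a minute; needs @EnableScheduling on a configuration class.
        @Scheduled(fixedRate = 60000)
        public void monitorMemory() {
            long usedMemory = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
            long maxMemory = Runtime.getRuntime().maxMemory();
            double usagePercentage = (double) usedMemory / maxMemory * 100;

            if (usagePercentage > 80) {
                System.out.println("Memory warning: usage at " + String.format("%.1f", usagePercentage) + "%");
                // Ask for a garbage collection. This is only a hint to the JVM
                // and does not guarantee that any memory is reclaimed.
                System.gc();
            }
        }
    }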
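The `Runtime` arithmetic above works, but the JDK also exposes heap figures directly through its management API, which reads a little more clearly. A minimal sketch, with a hypothetical class name `HeapUsage`:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reads heap usage through the standard management API.
final class HeapUsage {

    private HeapUsage() {}

    /**
     * Returns heap usage as a percentage of the maximum heap size,
     * or -1 if the JVM does not report a maximum.
     */
    static double usedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the maximum is undefined
        if (max <= 0) {
            return -1;
        }
        return heap.getUsed() * 100.0 / max;
    }
}
```

The scheduled check could then compare `HeapUsage.usedPercent()` against its 80% threshold.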
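Summary: the last stop on the Word-to-PDF journey
After all that tinkering, we've built ourselves a Spring Boot brand document Transformer! To recap the adventure:
What we built:
- Format taming: turning a free-wheeling Word file into a disciplined PDF
- Async processing: conversions run without freezing the page
- Progress monitoring: a console progress tracker on the server and a simulated progress bar in the browser
- Error handling: dealing gracefully with all sorts of surprises
- Batch operations: taming a whole family of Word documents in one go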
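Things to watch out for:
- Fonts: the PDF side may not recognize some special fonts, and the built-in Helvetica used in the converter cannot render Chinese text, so these cases need extra handling such as an embedded font
- Complex formatting: advanced layout in Word (text boxes, WordArt and the like) may come out distorted
- Memory usage: watch for out-of-memory errors when converting large documents
- Concurrency limits: converting too many documents at once can leave the server gasping for air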
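For the last two points, one simple safeguard is to cap how many conversions may run at the same time and turn the rest away quickly instead of letting them pile up in memory. The sketch below does this with a `java.util.concurrent.Semaphore`; the `ConversionLimiter` class and its method names are hypothetical, and the limit and timeout values are yours to choose.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper: lets at most maxConcurrent tasks run at once.
final class ConversionLimiter {

    private final Semaphore permits;

    ConversionLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Runs task if a permit becomes free within timeoutMillis;
     * otherwise throws IllegalStateException so the caller can report "busy".
     */
    <T> T run(Supplier<T> task, long timeoutMillis) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for a free slot", e);
        }
        if (!acquired) {
            throw new IllegalStateException("Too many conversions in progress");
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always give the slot back
        }
    }
}
```

In the controller, the conversion call could be wrapped along the lines of `limiter.run(() -> doConvert(file), 2000)`, where `doConvert` is whatever method performs the conversion, and the `IllegalStateException` mapped to an HTTP 503 response.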
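Converting Word to PDF is like fitting a document with "tamper-proof armor", and Spring Boot is the smart factory where we forge it. You'll run into plenty of "troublemaker documents" with bizarre formatting along the way, but with some patient debugging they can all be brought into line!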