Using ABAP to manipulate Office Word documents

Author: jerrywangsap

SAP provides a useful class, CL_DOCX_DOCUMENT, which supports read and write access to Word documents with the “.docx” file extension.


This document gives a brief introduction to its usage and can serve as a starting point for building your own application that needs to manipulate Word documents via ABAP.


Office OpenXML

Starting with Microsoft Office 2007, a new Word document is saved by default with the “.docx” file extension, which follows the Office Open XML format. Its detailed definition can be found on the wiki.


For example, I created a very simple Word document that contains a header area, a body paragraph with three lines, and a picture.


image.png


Because the file follows the Office Open XML format, once you change the file extension from “.docx” to “.zip”, its icon changes to that of an archive file and it can be opened with WinRAR. All information about my sample document is spread across a series of XML files inside the archive (plus media files such as pictures, music, and video, if the Word document contains any).


The most efficient way to learn the format is to create a Word document yourself, change its extension to “.zip”, and explore it.
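
The same exploration can also be done programmatically. The following is a minimal sketch in Python rather than ABAP, using only the standard library; the function name `list_docx_parts` is made up for illustration. It relies only on the fact stated above that a “.docx” file is a zip archive.

```python
import io
import zipfile


def list_docx_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of all parts stored in a .docx package, sorted."""
    # A .docx file is an ordinary zip archive, so it can be opened
    # directly; renaming it to .zip is only needed for desktop tools.
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return sorted(package.namelist())


# Hypothetical usage with a file saved from Word:
#     with open("sample.docx", "rb") as f:
#         for name in list_docx_parts(f.read()):
#             print(name)
```

For a document saved by Word, the listing typically includes entries such as `[Content_Types].xml`, `docProps/core.xml`, and `word/document.xml`, plus files under `word/media/` when pictures are embedded.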


image.png


Using CL_DOCX_DOCUMENT to read a Word document

The following sample code shows how to use this class.

To avoid unnecessary local variable declarations, I use the inline declaration feature available as of release 7.40. If your system is on an older release, simply replace the inline declarations with conventional DATA declarations for the local variables.

image.png

Comments

(1) You can get an instance of a Word document via the method cl_docx_document=>load_document. You must pass the document's binary data, typed as xstring, to this method. I don't list the source code of the subroutine get_doc_binary because it is not relevant here; you can find it in the attachment.


(2) System administrative data such as the author and the creation and last modification dates is stored in the so-called “core properties part”, which can be fetched via the document instance obtained in step 1. Once you have the core properties part instance, you can get its binary data via the method get_data().


The returned data is in XML format (as is the data of all the other kinds of parts in this document), so it can easily be parsed with a DOM or SAX parser.
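
To make the XML parsing step concrete, here is a hedged sketch in Python (not ABAP) that reads the core properties part straight from the package and parses it with a DOM-style parser. The function name `read_core_properties` and the returned dictionary keys are illustrative. The part name `docProps/core.xml` is the location Word conventionally uses and is treated here as an assumption, since the authoritative location is given by the package relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx_bytes, part_name="docProps/core.xml"):
    """Return author and date metadata from the core properties part.

    part_name is an assumption: it is where Word usually stores the part.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read(part_name))
    wanted = {
        "author": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
        "modified": "dcterms:modified",
    }
    # findtext returns None for elements that are absent.
    return {key: root.findtext(path, namespaces=NS) for key, path in wanted.items()}
```

In ABAP, the xstring returned by get_data() would take the place of `package.read(part_name)`; the element names to look for are the same.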


image.png


(3) From the document instance we can get the main part instance. Its binary data includes the text of all three body lines together with their font colors:



image.png
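
As an illustration of what the main part's XML contains, the following Python sketch (not ABAP; the function name `read_runs` is hypothetical) extracts each text run with its font color. It assumes the main part is stored as `word/document.xml`, which is where Word conventionally puts it, and that colors are set directly on the runs rather than inherited from a style.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_runs(docx_bytes, part_name="word/document.xml"):
    """Return (text, color) for every run in the main document part.

    color is the hex value of w:color (for example "FF0000"), or None
    when the run carries no direct color setting.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read(part_name))
    runs = []
    for run in root.iter(f"{{{W_NS}}}r"):
        # A run may hold several w:t elements; join them in order.
        text = "".join(t.text or "" for t in run.iter(f"{{{W_NS}}}t"))
        color = run.find(f"{{{W_NS}}}rPr/{{{W_NS}}}color")
        value = color.get(f"{{{W_NS}}}val") if color is not None else None
        runs.append((text, value))
    return runs
```

Header text is not part of this result, because headers live in their own parts rather than in the main part.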


(4) The binary data of all pictures embedded in the Word document can be retrieved in two steps. First get the image part collection from the main part instance, then loop over the image part instances in that collection. The get_part method accepts an index starting from 0. Header block information is read in exactly the same way.
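
At the package level, the image parts of the main part are identified through its relationships file. The sketch below is Python, not ABAP, and the function name `read_images` is hypothetical. It follows those relationships and returns the binary data of each embedded picture. It assumes the main part is `word/document.xml`, as is conventional for files saved by Word.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
)


def read_images(docx_bytes, main_part="word/document.xml"):
    """Return {part name: binary data} for images related to the main part."""
    folder, name = posixpath.split(main_part)
    # Relationships of a part live in "_rels/<part file name>.rels" beside it.
    rels_name = posixpath.join(folder, "_rels", name + ".rels")
    images = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        rels = ET.fromstring(package.read(rels_name))
        for rel in rels.iter(f"{{{REL_NS}}}Relationship"):
            # Skip other relationship types and externally linked pictures.
            if rel.get("Type") != IMAGE_TYPE or rel.get("TargetMode") == "External":
                continue
            target = rel.get("Target")
            if target.startswith("/"):
                part = target.lstrip("/")  # absolute within the package
            else:
                part = posixpath.normpath(posixpath.join(folder, target))
            images[part] = package.read(part)
    return images
```

Header parts can be located the same way by filtering on the relationship type `http://schemas.openxmlformats.org/officeDocument/2006/relationships/header` instead of the image type.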


Using CL_DOCX_DOCUMENT to change a Word document

See the document “How to – Add Custom XML Parts to Microsoft Word using ABAP” by Leon Limson.


You can also achieve the same result with the respective class below.


image.png


Further reading

If you would like to know how a Word template is merged with data from an XML file (for example, a response file from a web service), you can find the technical details in my blog post “Understand how the word template is merged with xml data stream”.

