Reading .doc documents in Java
Permanent link to this article: https://my.oschina.net/bysu/blog/1528130
Download link for the required JARs:
http://mirror.bit.edu.cn/apache/poi/dev/bin/poi-bin-3.17-beta1-20170701.tar.gz
import java.io.File;
import java.io.FileInputStream;

import org.apache.poi.POIXMLDocument;
import org.apache.poi.POIXMLTextExtractor;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class ReadFromDoc {

    public static void main(String[] args) {
        System.out.println(readWord("D:\\workspace\\java\\大學英語.doc"));
    }

    /**
     * Extracts the plain text of a Word document.
     * Returns an empty string if the file is neither .doc nor .docx,
     * or if reading fails.
     */
    public static String readWord(String filePath) {
        String text = "";
        File file = new File(filePath);

        if (file.getName().endsWith(".doc")) {
            // Word 97-2003 binary format
            try (FileInputStream stream = new FileInputStream(file);
                 WordExtractor word = new WordExtractor(stream)) {
                text = word.getText();
                // Collapse runs of consecutive line breaks in the document
                text = text.replaceAll("(\\r\\n){2,}", "\r\n");
                text = text.replaceAll("(\\n){2,}", "\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else if (file.getName().endsWith(".docx")) {
            // Word 2007+ OOXML format; closing the extractor also releases the package
            try (POIXMLTextExtractor ex = new XWPFWordExtractor(
                    new XWPFDocument(POIXMLDocument.openPackage(filePath)))) {
                text = ex.getText();
                // Collapse runs of consecutive line breaks in the document
                text = text.replaceAll("(\\r\\n){2,}", "\r\n");
                text = text.replaceAll("(\\n){2,}", "\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return text;
    }
}
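The line-break cleanup that readWord applies to both formats can be tried on its own, without POI on the classpath. The sketch below is only an illustration: the class name `BlankLineCollapser` and the method name `collapse` are hypothetical and not part of the original post. It applies the same two regular expressions in the same order.

```java
// Hypothetical helper (not from the original post): isolates the
// line-break cleanup step that readWord performs after text extraction.
final class BlankLineCollapser {

    private BlankLineCollapser() {
        // Utility class, not meant to be instantiated.
    }

    /** Collapses each run of two or more line breaks into a single one. */
    static String collapse(String text) {
        // Handle Windows-style CRLF runs first...
        String result = text.replaceAll("(\\r\\n){2,}", "\r\n");
        // ...then runs of bare line feeds.
        return result.replaceAll("(\\n){2,}", "\n");
    }
}
```

Calling `BlankLineCollapser.collapse(text)` on the extractor output would replace the two duplicated `replaceAll` pairs in readWord with a single call, should that refactoring be wanted.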
Reposted from: https://my.oschina.net/bysu/blog/1528130