I have been trying to use the XWPFWordExtractor text extractor to read Word 2007 files into a string. It is fast and simple to use, but there doesn't appear to be a way to read text from a text box within a Word file: the extractor only reads text contained in ordinary paragraphs. I intended to use it in an application that searches CVs, but most of the text in CVs is written in text boxes.
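One possible workaround, sketched here with the plain JDK rather than the extractor itself: a .docx file is a ZIP archive whose main body is stored in `word/document.xml`, and text-box content sits inside `w:txbxContent` elements there. The class name `TextBoxText` and its method names are made up for this illustration, and the code assumes Java 9 or later (for `readAllBytes`). It only looks at the main document part (not headers or footers) and ignores tabs and line breaks, so treat it as a starting point rather than a complete extractor.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Apache POI): collects the text of every
// paragraph that lives inside a text box in a .docx file.
public class TextBoxText {
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String MC_NS =
        "http://schemas.openxmlformats.org/markup-compatibility/2006";

    // Finds word/document.xml inside the .docx (ZIP) stream and parses it.
    public static List<String> fromDocx(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    // readAllBytes stops at the end of the current entry.
                    return fromDocumentXml(new ByteArrayInputStream(zip.readAllBytes()));
                }
            }
        }
        return new ArrayList<>();
    }

    // Returns one string per non-empty paragraph found inside w:txbxContent.
    public static List<String> fromDocumentXml(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
        // Files come from outside, so refuse DOCTYPE declarations (XXE guard).
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        List<String> paragraphs = new ArrayList<>();
        NodeList boxes = doc.getElementsByTagNameNS(W_NS, "txbxContent");
        for (int i = 0; i < boxes.getLength(); i++) {
            Element box = (Element) boxes.item(i);
            // Newer Word versions store each text box twice; skip the
            // legacy copy kept under mc:Fallback to avoid duplicates.
            if (insideFallback(box)) continue;
            NodeList paras = box.getElementsByTagNameNS(W_NS, "p");
            for (int j = 0; j < paras.getLength(); j++) {
                NodeList runs = ((Element) paras.item(j)).getElementsByTagNameNS(W_NS, "t");
                StringBuilder text = new StringBuilder();
                for (int k = 0; k < runs.getLength(); k++) {
                    text.append(runs.item(k).getTextContent());
                }
                if (text.length() > 0) paragraphs.add(text.toString());
            }
        }
        return paragraphs;
    }

    private static boolean insideFallback(Node node) {
        for (Node n = node.getParentNode(); n != null; n = n.getParentNode()) {
            if ("Fallback".equals(n.getLocalName()) && MC_NS.equals(n.getNamespaceURI())) {
                return true;
            }
        }
        return false;
    }
}
```

For a CV search, the strings returned by `TextBoxText.fromDocx(new FileInputStream(path))` could be appended to whatever text the extractor already returns for the same file, so that both the body paragraphs and the text boxes are indexed.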