@Singleton @Named public class ContentParserDoc extends AbstractContentParserPoi

Implementation of the ContentParser interface for binary MS-Word documents.

Modifier and Type | Field and Description |
---|---|
static String | KEY_EXTENSION - The default extension. |
static String | KEY_MIMETYPE - The mimetype. |
Inherited fields:
POIFS_EXCEL_DOC, POIFS_POWERPOINT_DOC, POIFS_WORD_DOC
VARIABLE_NAME_CREATOR, VARIABLE_NAME_KEYWORDS, VARIABLE_NAME_LANGUAGE, VARIABLE_NAME_TEXT, VARIABLE_NAME_TITLE
Constructor and Description |
---|
ContentParserDoc() - The constructor. |
Modifier and Type | Method and Description |
---|---|
protected String | extractText(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFs, long filesize, ContentParserOptions options) - This method extracts the text from the office document given by poiFs. |
String[] | getAlternativeKeyArray() |
String | getExtension() - This method gets the default filename extension excluding the dot. |
String | getMimetype() - This method gets the default mimetype. |
Inherited methods:
parse
doInitialize, getPrimaryKeys, getSecondaryKeyArray, getSecondaryKeys, parse, parse, setGenericContextFactory
createLogger, getLogger
doInitialized, getInitializationState, initialize
public static final String KEY_MIMETYPE
public static final String KEY_EXTENSION
public String getExtension()
Returns the default extension of this ContentParser, or null if this is the generic parser.

public String getMimetype()
Returns the default mimetype of this ContentParser, or null if this is the generic parser.

public String[] getAlternativeKeyArray()
Declared as getAlternativeKeyArray in class AbstractContentParser.
See also: AbstractContentParser.getPrimaryKeys()
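
The three key accessors above suggest that parsers are looked up by extension or mimetype, with the generic parser (which reports null) acting as a fallback. The following is only a minimal sketch of that idea, not the library's actual lookup code. `KeyedParser` and `ParserRegistrySketch` are hypothetical names, the key values `"doc"`, `"application/msword"` and `"dot"` are assumed for illustration, and treating the alternative keys as extra lookup keys is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the key-related accessors documented above. */
interface KeyedParser {
  String getExtension();             // default extension without the dot, or null
  String getMimetype();              // default mimetype, or null
  String[] getAlternativeKeyArray(); // assumed: further extensions or mimetypes

  /** Convenience factory used only by this sketch. */
  static KeyedParser of(String extension, String mimetype, String... alternativeKeys) {
    return new KeyedParser() {
      public String getExtension() { return extension; }
      public String getMimetype() { return mimetype; }
      public String[] getAlternativeKeyArray() { return alternativeKeys.clone(); }
    };
  }
}

/** Hypothetical registry; it is not part of the documented API. */
final class ParserRegistrySketch {
  private final Map<String, KeyedParser> parsersByKey = new HashMap<>();
  private KeyedParser genericParser;

  void register(KeyedParser parser) {
    String extension = parser.getExtension();
    String mimetype = parser.getMimetype();
    if (extension == null && mimetype == null) {
      // A parser without keys is treated as the generic fallback.
      genericParser = parser;
      return;
    }
    put(extension, parser);
    put(mimetype, parser);
    for (String key : parser.getAlternativeKeyArray()) {
      put(key, parser);
    }
  }

  private void put(String key, KeyedParser parser) {
    if (key != null) {
      parsersByKey.put(key.toLowerCase(Locale.ROOT), parser);
    }
  }

  /** Returns the parser registered for the key, else the generic parser (may be null). */
  KeyedParser lookup(String key) {
    if (key == null) {
      return genericParser;
    }
    KeyedParser parser = parsersByKey.get(key.toLowerCase(Locale.ROOT));
    return parser != null ? parser : genericParser;
  }

  public static void main(String[] args) {
    ParserRegistrySketch registry = new ParserRegistrySketch();
    KeyedParser doc = KeyedParser.of("doc", "application/msword", "dot"); // assumed keys
    registry.register(doc);
    registry.register(KeyedParser.of(null, null)); // generic parser
    System.out.println(registry.lookup("DOC") == doc);
    System.out.println(registry.lookup("unknown") == doc);
  }
}
```

Keys are normalized to lower case so that a lookup by `"DOC"` and `"doc"` behaves the same; whether the real implementation does this is not stated in this documentation.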
protected String extractText(org.apache.poi.poifs.filesystem.POIFSFileSystem poiFs, long filesize, ContentParserOptions options) throws Exception
This method extracts the text from the office document given by poiFs.
Declared as extractText in class AbstractContentParserPoi.
Parameters:
poiFs - is the POI filesystem of the office document.
filesize - is the size (content-length) of the content to parse in bytes or 0 if NOT available (unknown). If available, the parser may use this value for optimized allocations.
options - are the ContentParserOptions.
Throws:
Exception - if something goes wrong.

Copyright © 2001–2016 mmm-Team. All rights reserved.
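
The filesize parameter of extractText is described as a hint for optimized allocations, with 0 meaning "unknown". One way a parser might honor such a hint is to pre-size its text buffer, as in the sketch below. This is an illustration under assumptions, not the actual implementation: `FilesizeHintSketch`, `DEFAULT_CAPACITY` and `MAX_CAPACITY` are hypothetical names and values, and a plain `Reader` stands in for the POI filesystem.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper showing how a filesize hint could drive buffer allocation. */
final class FilesizeHintSketch {
  static final int DEFAULT_CAPACITY = 1024;  // assumed default when the size is unknown
  static final int MAX_CAPACITY = 1 << 20;   // assumed upper bound for the initial buffer

  /** Maps the filesize hint (0 = unknown) to an initial buffer capacity. */
  static int initialCapacity(long filesize) {
    if (filesize <= 0) {
      return DEFAULT_CAPACITY;
    }
    // Clamp so that a huge or bogus content-length cannot force a huge allocation.
    return (int) Math.min(filesize, MAX_CAPACITY);
  }

  /** Reads all characters into a buffer that is pre-sized from the hint. */
  static String readAll(Reader reader, long filesize) {
    StringBuilder buffer = new StringBuilder(initialCapacity(filesize));
    char[] chunk = new char[512];
    try {
      int count;
      while ((count = reader.read(chunk)) != -1) {
        buffer.append(chunk, 0, count);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toString();
  }

  public static void main(String[] args) {
    // The hint only affects the initial allocation, never the extracted text.
    System.out.println(readAll(new StringReader("Text of a document"), 0));
  }
}
```

The hint is clamped because a content-length is supplied by the caller and may be wrong; the buffer still grows on demand if the actual text is longer.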