public abstract class AbstractContentParserText extends AbstractContentParser

ContentParser for plain text.

Modifier and Type | Field and Description |
---|---|
private EncodingUtil | encodingUtil |
static String | KEY_EXTENSION: The default extension. |
static String | KEY_MIMETYPE: The mimetype. |
private XmlUtil | xmlUtil |

VARIABLE_NAME_CREATOR, VARIABLE_NAME_KEYWORDS, VARIABLE_NAME_LANGUAGE, VARIABLE_NAME_TEXT, VARIABLE_NAME_TITLE
Constructor and Description |
---|
AbstractContentParserText(): The constructor. |
Modifier and Type | Method and Description |
---|---|
protected void | doInitialize(): This method performs the actual initialization. |
protected EncodingUtil | getEncodingUtil(): This method gets the EncodingUtil to use. |
String | getExtension(): This method gets the default filename extension excluding the dot (e.g. ...) |
String | getMimetype(): This method gets the default mimetype (e.g. ...) |
protected XmlUtil | getXmlUtil(): This method gets the XmlUtil. |
void | parse(BufferedReader bufferedReader, ContentParserOptions options, MutableGenericContext context, StringBuilder textBuffer): This method parses the content of the given bufferedReader and appends the textual content to the textBuffer. |
void | parse(InputStream inputStream, long filesize, ContentParserOptions options, MutableGenericContext context) |
protected void | parseLine(MutableGenericContext context, String line): This method may be overridden to parse additional metadata from the content. |
protected void | parseProperty(MutableGenericContext context, String line, Pattern pattern, String propertyName, int group): This method checks if the property identified by propertyName already exists. |
protected String | parseProperty(String line, Pattern pattern, int group): This method tries to extract a property value using the given pattern that has to produce it in the given group. |
void | setEncodingUtil(EncodingUtil encodingUtil) |
void | setXmlUtil(XmlUtil xmlUtil): This method sets the XmlUtil to use. |
getAlternativeKeyArray, getPrimaryKeys, getSecondaryKeyArray, getSecondaryKeys, parse, parse, setGenericContextFactory
createLogger, getLogger
doInitialized, getInitializationState, initialize
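The two parseProperty methods listed above can be illustrated with a minimal sketch. It is not the library's implementation: the class name ParsePropertySketch is hypothetical, a plain java.util.Map stands in for MutableGenericContext, and the use of Matcher.find() (rather than a full match) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not the mmm class: a Map plays the role of MutableGenericContext.
class ParsePropertySketch {

  // Returns the text captured by the given group, or null if the pattern does NOT match.
  // Assumption: the pattern only needs to be found somewhere in the line.
  static String parseProperty(String line, Pattern pattern, int group) {
    Matcher matcher = pattern.matcher(line);
    if (matcher.find()) {
      return matcher.group(group);
    }
    return null;
  }

  // Extracts the property only if it is not already present in the context.
  static void parseProperty(Map<String, Object> context, String line, Pattern pattern,
      String propertyName, int group) {
    if (context.containsKey(propertyName)) {
      return; // an existing value is kept
    }
    String value = parseProperty(line, pattern, group);
    if (value != null) {
      context.put(propertyName, value);
    }
  }

  public static void main(String[] args) {
    // The pattern and the property name "creator" are made up for this example.
    Pattern author = Pattern.compile("^\\s*Author:\\s*(.+)$", Pattern.CASE_INSENSITIVE);
    Map<String, Object> context = new HashMap<>();
    parseProperty(context, "Author: Jane Doe", author, "creator", 1);
    parseProperty(context, "Author: Someone Else", author, "creator", 1);
    System.out.println(context.get("creator")); // prints "Jane Doe"
  }
}
```

Because the context-aware variant checks for an existing value first, the first matching line wins and later matches are ignored.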
public static final String KEY_MIMETYPE
public static final String KEY_EXTENSION
private EncodingUtil encodingUtil
See Also: getEncodingUtil()
private XmlUtil xmlUtil
See Also: getXmlUtil()

protected EncodingUtil getEncodingUtil()
This method gets the EncodingUtil to use.
Returns: the EncodingUtil to use.

@Inject public void setEncodingUtil(EncodingUtil encodingUtil)
Parameters:
encodingUtil - is the encodingUtil to set.

@Inject public void setXmlUtil(XmlUtil xmlUtil)
This method sets the XmlUtil to use.
Parameters:
xmlUtil - is the XmlUtil to use.

protected void doInitialize()
This method performs the actual initialization. It is called when AbstractComponent.initialize() is invoked for the first time. An overriding subclass should call super.doInitialize() (see AbstractComponent.doInitialize()).
Overrides: doInitialize in class AbstractContentParser
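
The initialize-once contract described for doInitialize() can be sketched as follows. This is a simplified illustration, not the AbstractComponent source: the class names InitOnceSketch and TextParserSketch and the initCount field are hypothetical, and the real component may track more states than a single flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the component lifecycle; not the mmm AbstractComponent.
abstract class InitOnceSketch {

  private final AtomicBoolean initialized = new AtomicBoolean(false);

  // Safe to call repeatedly; doInitialize() runs only on the first call.
  public final void initialize() {
    if (this.initialized.compareAndSet(false, true)) {
      doInitialize();
    }
  }

  // Subclasses override this hook and call super.doInitialize().
  protected void doInitialize() {
  }
}

class TextParserSketch extends InitOnceSketch {

  int initCount = 0; // only here to make the effect observable

  @Override
  protected void doInitialize() {
    super.doInitialize();
    // A real parser could create defaults for dependencies that were not injected.
    this.initCount++;
  }

  public static void main(String[] args) {
    TextParserSketch parser = new TextParserSketch();
    parser.initialize();
    parser.initialize(); // no effect, already initialized
    System.out.println(parser.initCount); // prints 1
  }
}
```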

protected String parseProperty(String line, Pattern pattern, int group)
This method tries to extract a property value using the given pattern that has to produce it in the given group.
Parameters:
line - is a single line read from the text.
pattern - is the regular expression pattern.
group - is the group number of the property in the pattern.
Returns: the property value, or null if the pattern did NOT match.

protected void parseProperty(MutableGenericContext context, String line, Pattern pattern, String propertyName, int group)
This method checks if the property identified by propertyName already exists. If NOT, it tries to extract it using the given pattern that produces the property in the given group.
Parameters:
context - are the properties with the collected metadata.
line - is a single line read from the text.
pattern - is the regular expression pattern.
propertyName - is the name of the property to extract.
group - is the group number of the property in the pattern.

public void parse(InputStream inputStream, long filesize, ContentParserOptions options, MutableGenericContext context) throws Exception
Overrides or implements: parse in class AbstractContentParser
Parameters:
inputStream - is the fresh input stream of the content to parse.
filesize - is the size (content-length) of the content to parse in bytes or 0 if NOT available (unknown). If available, the parser may use this value for optimized allocations.
options - are the ContentParserOptions.
context - is the MutableGenericContext where the extracted metadata from the parsed inputStream will be added to.
Throws:
Exception - if the operation fails for arbitrary reasons.
See Also: ContentParser.parse(InputStream, long)
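
The filesize hint described for parse(InputStream, ...) can be put to use roughly as in the sketch below. It is an illustration under stated assumptions: the class StreamToTextSketch, the default capacity, and the 1 MiB cap are hypothetical, and UTF-8 is hard-coded here, whereas the documented class holds an EncodingUtil whose API is not shown in this page.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the mmm API.
class StreamToTextSketch {

  private static final int DEFAULT_CAPACITY = 1024;  // used when the size is unknown
  private static final int MAX_CAPACITY = 1 << 20;   // cap to avoid huge up-front allocations

  // Picks an initial buffer capacity from the filesize in bytes (0 means "not available").
  static int initialCapacity(long filesize) {
    if (filesize <= 0) {
      return DEFAULT_CAPACITY;
    }
    return (int) Math.min(filesize, MAX_CAPACITY);
  }

  // Reads the stream line by line into a pre-sized buffer.
  // Assumption: the content is UTF-8; this sketch also closes the stream when done.
  static String readText(InputStream inputStream, long filesize) {
    StringBuilder textBuffer = new StringBuilder(initialCapacity(filesize));
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        textBuffer.append(line).append('\n');
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return textBuffer.toString();
  }

  public static void main(String[] args) {
    byte[] data = "first line\nsecond line".getBytes(StandardCharsets.UTF_8);
    System.out.print(readText(new ByteArrayInputStream(data), data.length));
  }
}
```

Sizing by bytes slightly over-allocates for multi-byte encodings, which is harmless; the point is only to avoid repeated buffer growth when the size is known.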

public void parse(BufferedReader bufferedReader, ContentParserOptions options, MutableGenericContext context, StringBuilder textBuffer) throws Exception
This method parses the content of the given bufferedReader and appends the textual content to the textBuffer. Additional metadata can directly be set in the given context.
Parameters:
bufferedReader - is where to read the content from.
options - are the ContentParserOptions.
context - is where the metadata is collected.
textBuffer - is the buffer where the textual content should be appended to.
Throws:
Exception - if something goes wrong.

protected void parseLine(MutableGenericContext context, String line)
This method may be overridden to parse additional metadata from the content.
Parameters:
context - are the properties with the collected metadata.
line - is a single line read from the text.

public String getExtension()
This method gets the default filename extension excluding the dot (e.g. ...) of this ContentParser.
Returns: the default extension, or null if this is the generic parser.

public String getMimetype()
This method gets the default mimetype (e.g. ...) of this ContentParser.
Returns: the default mimetype, or null if this is the generic parser.
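
The relationship between parse(BufferedReader, ...) and the parseLine hook documented above can be sketched as a line loop that calls the hook once per line. This is an assumption about the control flow, not the library's code: the classes LineParserSketch and TitleAwareParserSketch are hypothetical, a Map replaces MutableGenericContext, the options parameter is omitted, and the property key "title" is made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the text parser; a Map replaces MutableGenericContext.
class LineParserSketch {

  // Reads all lines, offers each one to parseLine, and appends it to the text buffer.
  void parse(BufferedReader bufferedReader, Map<String, Object> context, StringBuilder textBuffer) {
    try {
      String line;
      while ((line = bufferedReader.readLine()) != null) {
        parseLine(context, line);
        textBuffer.append(line).append('\n');
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Hook for subclasses; does nothing by default.
  protected void parseLine(Map<String, Object> context, String line) {
  }
}

// Example subclass that collects a title from lines such as "Title: Hello".
class TitleAwareParserSketch extends LineParserSketch {

  private static final Pattern TITLE = Pattern.compile("^Title:\\s*(.+)$");

  @Override
  protected void parseLine(Map<String, Object> context, String line) {
    Matcher matcher = TITLE.matcher(line);
    if (!context.containsKey("title") && matcher.find()) {
      context.put("title", matcher.group(1));
    }
  }

  public static void main(String[] args) {
    Map<String, Object> context = new HashMap<>();
    StringBuilder textBuffer = new StringBuilder();
    BufferedReader reader = new BufferedReader(new StringReader("Title: Hello\nSome body text"));
    new TitleAwareParserSketch().parse(reader, context, textBuffer);
    System.out.println(context.get("title")); // prints "Hello"
  }
}
```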

Copyright © 2001–2016 mmm-Team. All rights reserved.