Module io.github.mmm.scanner
Provides scanners that help to parse character sequences efficiently and easily.
Scanner API
For efficient parsers of complex grammars it is best practice to use a parser generator like javacc or antlr. However, in some situations it is more suitable to write a handwritten parser. The tradeoff is that this may result in ugly monolithic code that is hard to maintain.
The CharStreamScanner is an interface that covers typical tasks when parsing strings or streams and therefore makes your life a lot easier. You can concentrate on the syntax you want to parse and do NOT need to repeatedly check whether the end has already been reached. For parsing entire streams (e.g. from a Reader) there is the implementation CharReaderScanner, while for simple Strings there is the implementation CharSequenceScanner. In any case the entire data and state (parsing position) is encapsulated so you can easily delegate a step to another method or class. Otherwise you would need to pass the current position to that method and return the new one from there. This is tricky if the method should already return something else.
As a motivation and anti-pattern, here is a little example of an entirely handwritten parser:
    String input = getInputString();
    int i = 0;
    boolean colonFound = false;
    while (i < input.length()) {
      char c = input.charAt(i++);
      if (c == ':') {
        colonFound = true;
        break;
      }
    }
    if (!colonFound) {
      throw new IllegalArgumentException("Expected character ':' not found!");
    }
    String key = input.substring(0, i - 1);
    String value = null;
    if (i < input.length()) {
      while ((i < input.length()) && (input.charAt(i) == ' ')) {
        i++;
      }
      int start = i;
      while (i < input.length()) {
        char c = input.charAt(i);
        if ((c < '0') || (c > '9')) {
          break;
        }
        i++;
      }
      value = input.substring(start, i);
    }

Here is the same thing when using CharSequenceScanner:
    String input = getInputString();
    CharStreamScanner scanner = new CharSequenceScanner(input);
    String key = scanner.readUntil(':', false);
    if (key == null) {
      throw new IllegalArgumentException("Expected character ':' not found!");
    }
    scanner.skipWhile(' ');
    String value = scanner.readWhile(CharFilter.LATIN_DIGIT_FILTER);

This is just a simple example. The API covers the real-life scenarios you will need to parse your data. The implementations are highly efficient and internally operate directly on char[]. Streaming implementations use optimized lookahead buffers that can even be configured at construction time.
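To illustrate why encapsulating the data and the parsing position makes delegation easy, here is a minimal, self-contained sketch. The class MiniScanner and its methods (readUntil, skipWhile, readDigits, parseKey, parseValue) are hypothetical names invented for this illustration. It is NOT the implementation of CharSequenceScanner and it does not mirror the exact signatures of this module; it only assumes the general idea described above (a char[] plus a current position held in one object).

```java
/**
 * Hypothetical illustration only: a tiny scanner that keeps the data and the
 * parsing position together. This is not the library's CharSequenceScanner.
 */
final class MiniScanner {

  private final char[] buffer; // the data to parse
  private int position;        // the current parsing position

  MiniScanner(String input) {
    this.buffer = input.toCharArray();
  }

  /** Returns true if there are unread characters left. */
  boolean hasNext() {
    return this.position < this.buffer.length;
  }

  /**
   * Reads all characters up to (excluding) the stop character and consumes
   * the stop character itself. Returns null if the stop character is missing.
   */
  String readUntil(char stop) {
    int start = this.position;
    while (this.position < this.buffer.length) {
      if (this.buffer[this.position++] == stop) {
        return new String(this.buffer, start, this.position - start - 1);
      }
    }
    return null;
  }

  /** Skips all consecutive occurrences of c and returns how many were skipped. */
  int skipWhile(char c) {
    int start = this.position;
    while (hasNext() && (this.buffer[this.position] == c)) {
      this.position++;
    }
    return this.position - start;
  }

  /** Reads consecutive Latin digits ('0' to '9'); may return an empty string. */
  String readDigits() {
    int start = this.position;
    while (hasNext() && (this.buffer[this.position] >= '0') && (this.buffer[this.position] <= '9')) {
      this.position++;
    }
    return new String(this.buffer, start, this.position - start);
  }

  // Each step receives only the scanner. No index has to be passed in or
  // handed back, so the return value stays free for the parsed result.
  static String parseKey(MiniScanner scanner) {
    String key = scanner.readUntil(':');
    if (key == null) {
      throw new IllegalArgumentException("Expected character ':' not found!");
    }
    return key;
  }

  static String parseValue(MiniScanner scanner) {
    scanner.skipWhile(' ');
    return scanner.readDigits();
  }
}
```

With this sketch, calling MiniScanner.parseKey(scanner) and then MiniScanner.parseValue(scanner) on the same instance continues exactly where the previous step stopped, because the position lives inside the scanner rather than in a local variable of the caller.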
Packages
Exports
    Package                  Description
    io.github.mmm.scanner    Provides the API for scanners that help to parse character sequences efficiently and easily.
Modules
Requires
    Modifier      Module                Description
    transitive    io.github.mmm.base    Provides fundamental APIs and helper classes.