POE::Filter::Line - serialize and parse terminated records (lines)
#!perl
use POE qw(Wheel::FollowTail Filter::Line);
POE::Session->create(
  inline_states => {
    _start => sub {
      $_[HEAP]{tailor} = POE::Wheel::FollowTail->new(
        Filename => "/var/log/system.log",
        InputEvent => "got_log_line",
        Filter => POE::Filter::Line->new(),
      );
    },
    got_log_line => sub {
      print "Log: $_[ARG0]\n";
    }
  }
);
POE::Kernel->run();
exit;
POE::Filter::Line parses stream data into terminated records. The
default parser interprets newlines as the record terminator, and the
default serializer appends network newlines (CR/LF, or "\x0D\x0A") to
outbound records.
Record terminators are removed from the data POE::Filter::Line
returns.
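As a minimal sketch of the default behavior, the filter can be driven
directly through the get() and put() methods it inherits from
POE::Filter, without a session or wheel. The sample strings are made
up for illustration.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Default filter: common newlines on input, CR/LF on output.
my $filter = POE::Filter::Line->new();

# get() takes an array reference of raw stream chunks and returns an
# array reference of complete records with terminators removed.
my $first  = $filter->get([ "one\ntwo\nthr" ]);  # "thr" stays buffered
my $second = $filter->get([ "ee\n" ]);           # completes "three"

# put() takes an array reference of records and returns an array
# reference of serialized chunks with the output terminator appended.
my $out = $filter->put([ "hello" ]);

print "parsed: $_\n" for @$first, @$second;
```

Partial records remain in the filter's buffer until a later chunk
supplies their terminator.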
POE::Filter::Line supports a number of other ways to parse lines.
Constructor parameters may specify literal newlines, regular
expressions, or that the filter should detect newlines on its own.
POE::Filter::Line's new() method accepts a list of named parameters
that control how records are parsed and serialized.
In all cases, the data interpreted as the record terminator is
stripped from the data POE::Filter::Line returns.
InputLiteral may be used to parse records that are terminated by
some literal string. For example, POE::Filter::Line may be used to
parse and emit C-style lines, which are terminated with an ASCII NUL:
my $c_line_filter = POE::Filter::Line->new(
  InputLiteral => chr(0),
  OutputLiteral => chr(0),
);
OutputLiteral allows a filter to put() records with a different
record terminator than it parses. This can be useful in applications
that must translate record terminators.
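A small sketch of terminator translation, assuming NUL-terminated
input that should be re-emitted with "\n" endings. The variable names
and sample data are illustrative only.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Parse NUL-terminated records, but serialize them with "\n".
my $translator = POE::Filter::Line->new(
  InputLiteral  => chr(0),
  OutputLiteral => "\n",
);

# Parse two records out of one chunk of input.
my $records = $translator->get([ "alpha\0beta\0" ]);

# Feed the parsed records straight back through put() to re-terminate them.
my $translated = join '', @{ $translator->put($records) };

print $translated;
```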
Literal is a shorthand for the common case where the input and
output literals are identical. The previous example may be written
as:
my $c_line_filter = POE::Filter::Line->new(
  Literal => chr(0),
);
An application can also allow POE::Filter::Line to figure out which
newline to use. This is done by specifying InputLiteral to be
undef:
my $whichever_line_filter = POE::Filter::Line->new(
  InputLiteral => undef,
  OutputLiteral => "\n",
);
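The following sketch feeds CR/LF-terminated data to an autodetecting
filter. The input string is an invented sample; the filter is expected
to settle on whichever newline it sees first.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# InputLiteral => undef asks the filter to detect the newline itself.
my $detector = POE::Filter::Line->new(
  InputLiteral  => undef,
  OutputLiteral => "\n",
);

# The first complete newline seen (here CR/LF) becomes the terminator.
my $lines = $detector->get([ "alpha\x0D\x0Abeta\x0D\x0A" ]);

print "line: $_\n" for @$lines;
```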
InputRegexp may be used in place of InputLiteral to recognize
line terminators based on a regular expression. In this example,
input is terminated by two or more consecutive newlines. On output,
the paragraph separator is "---" on a line by itself.
my $paragraph_filter = POE::Filter::Line->new(
  InputRegexp => "([\x0D\x0A]{2,})",
  OutputLiteral => "\n---\n",
);
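To show what this paragraph filter does with real input, here is a
hedged sketch that parses two paragraphs from a single chunk and
serializes them again. The sample text is invented for illustration.

```perl
use strict;
use warnings;
use POE::Filter::Line;

my $paragraph_filter = POE::Filter::Line->new(
  InputRegexp   => "([\x0D\x0A]{2,})",
  OutputLiteral => "\n---\n",
);

# Single newlines stay inside a record; runs of two or more end it.
my $paragraphs = $paragraph_filter->get([
  "first line\nsecond line\n\nnext paragraph\n\n",
]);

# Each record is written back out followed by the "---" separator.
my $rendered = join '', @{ $paragraph_filter->put($paragraphs) };

print $rendered;
```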
MaxBuffer sets the maximum amount of data that the filter will hold onto
while trying to find a line ending. Defaults to 512 MB.
MaxLength sets the maximum length of a line. Defaults to 64 MB.
If either the MaxLength or MaxBuffer constraint is exceeded,
POE::Filter::Line will throw an exception.
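Because the filter signals a violated limit by throwing an exception,
callers that use these limits will probably want to wrap parsing in
eval. This is a sketch only: it assumes a POE release that supports
MaxLength and MaxBuffer, the tiny limits are chosen purely for
demonstration, and the exact exception text is not relied upon.

```perl
use strict;
use warnings;
use POE::Filter::Line;

# Deliberately tiny limits so the example can trip them.
my $bounded = POE::Filter::Line->new(
  MaxLength => 8,
  MaxBuffer => 16,
);

# A short, terminated record is well within both limits.
my $ok = $bounded->get([ "short\n" ]);

# 32 bytes with no terminator cannot fit in a 16-byte buffer.
my $records = eval { $bounded->get([ "x" x 32 ]) };
my $error   = $@;

print "filter refused input: $error" if $error;
```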
POE::Filter::Line has no additional public methods.
POE::Filter::Line exports the FIRST_UNUSED constant. This points to
the first unused element in the $self array reference. Subclasses
should store their own data beginning here, and they should export
their own FIRST_UNUSED constants to help future subclassers.
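The subclassing convention might look like the sketch below.
CountingLineFilter, LINE_COUNT, and line_count() are hypothetical
names invented for this example; it also assumes that the inherited
get() is implemented in terms of get_one(), as described by the
POE::Filter base interface.

```perl
use strict;
use warnings;

{
  package CountingLineFilter;    # hypothetical subclass
  use base qw(POE::Filter::Line);

  # Claim the parent's first free slot, then publish the next free one.
  sub LINE_COUNT   () { POE::Filter::Line::FIRST_UNUSED() }
  sub FIRST_UNUSED () { LINE_COUNT() + 1 }

  sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);
    $self->[LINE_COUNT] = 0;
    return $self;
  }

  # Count every record the parent parser hands back.
  sub get_one {
    my $self  = shift;
    my $lines = $self->SUPER::get_one(@_);
    $self->[LINE_COUNT] += @$lines;
    return $lines;
  }

  sub line_count { $_[0][LINE_COUNT] }
}

my $counter = CountingLineFilter->new();
my $parsed  = $counter->get([ "a\nb\n" ]);
print "lines seen: ", $counter->line_count(), "\n";
```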
Please see the POE::Filter manpage for documentation regarding the base
interface.
The SEE ALSO section in POE contains a table of contents covering
the entire POE distribution.
The default input newline parser is a regexp that has an unfortunate
race condition. The regular expression is:
/(\x0D\x0A?|\x0A\x0D?)/
While it quickly recognizes most forms of newline, it can sometimes
detect an extra blank line. This happens when a two-byte newline
sequence is split between two reads. Consider this situation:
some stream dataCR
LFother stream data
The regular expression will see the first CR without its corresponding
LF. The filter will properly return "some stream data" as a line.
When the next packet arrives, the leading LF will be treated as the
terminator for a 0-byte line. The filter will faithfully return this
empty line.
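The effect can be reproduced without POE at all. The loop below is a
simplified stand-in for the filter's buffering, not POE's actual code;
it applies the regular expression quoted above to the two reads from
the example.

```perl
use strict;
use warnings;

# The default newline pattern quoted in the text.
my $newline = qr/(\x0D\x0A?|\x0A\x0D?)/;

my $buffer = '';
my @lines;

# Two reads that split a CR/LF pair between them.
for my $chunk ("some stream data\x0D", "\x0Aother stream data\x0D\x0A") {
  $buffer .= $chunk;

  # Pull off every complete record currently in the buffer.
  while ($buffer =~ s/^(.*?)$newline//s) {
    push @lines, $1;
  }
}

printf "%d records, including %d empty\n",
  scalar(@lines), scalar(grep { $_ eq '' } @lines);
```

The lone CR ends the first record, and the LF that arrives in the next
read is taken as the terminator of a second, empty record.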
It is advised to specify literal newlines or use the autodetect
feature in applications where blank lines are significant.
Please see POE for more information about authors and contributors.