Perl Diver 2.33

Plucene::Index::Writer

Name        Plucene::Index::Writer
Version
Located at  /usr/share/perl5
File        /usr/share/perl5/Plucene/Index/Writer.pm
Is Core     No

NAME

Plucene::Index::Writer - write an index.


SYNOPSIS

        my $writer = Plucene::Index::Writer->new($path, $analyser, $create);
        $writer->add_document($doc);
        $writer->add_indexes(@dirs);
        $writer->optimize; # called before close

        my $doc_count = $writer->doc_count;
        my $mergefactor = $writer->mergefactor;
        $writer->set_mergefactor($value);


DESCRIPTION

This is the writer class. It creates a new index or opens an existing one, adds documents to it, and merges index segments.

If an index will not have more documents added for a while and optimal search performance is desired, then the optimize method should be called before the index is closed.


METHODS

new

        my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

This will create a new Plucene::Index::Writer object.


The third argument to the constructor, $create, determines whether a new
index is created (true value) or whether an existing index is opened for
the addition of new documents (false value).
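
The constructor and the methods from the SYNOPSIS can be combined as in the following minimal sketch. It assumes the Plucene distribution is installed and uses Plucene::Analysis::SimpleAnalyzer, Plucene::Document and Plucene::Document::Field as they appear in the Plucene distribution's own synopsis. The helper name build_index and the field name "content" are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Plucene: indexes a list of strings,
# one document per string, under a single "content" field.
sub build_index {
    my ($path, $create, @texts) = @_;

    # Loaded at call time so this file compiles without Plucene installed.
    require Plucene::Index::Writer;
    require Plucene::Document;
    require Plucene::Document::Field;
    require Plucene::Analysis::SimpleAnalyzer;

    my $analyser = Plucene::Analysis::SimpleAnalyzer->new;

    # A true $create starts a new index; a false one opens an existing index.
    my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

    for my $text (@texts) {
        my $doc = Plucene::Document->new;
        $doc->add(Plucene::Document::Field->Text(content => $text));
        $writer->add_document($doc);
    }

    my $count = $writer->doc_count;

    # optimize is the last call made on the writer, which is then dropped.
    $writer->optimize;
    undef $writer;

    return $count;
}
```

A call such as build_index("my_index", 1, "first text", "second text") would create the index in the directory my_index and return the document count reported by the writer.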

mergefactor / set_mergefactor

        my $mergefactor = $writer->mergefactor;
        $writer->set_mergefactor($value);

Get / set the mergefactor, the number of segments the index may hold before add_document triggers a merge. It defaults to 10.

doc_count

        my $doc_count = $writer->doc_count;

Returns the number of documents currently in the index.

add_document

        $writer->add_document($doc);

Adds a document to the index. After the document has been added, a merge takes place if there are more than $Plucene::Index::Writer::mergefactor segments in the index. This defaults to 10, but can be set to whatever value is optimal for your application.
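
The trigger described above (merge once the number of segments exceeds the mergefactor) can be illustrated with a toy model. This is not Plucene code: a segment is reduced to a bare document count, and a merge simply collapses every segment into one, which is a deliberate simplification of what a real merge does. The names make_toy_index and add_doc are hypothetical.

```perl
use strict;
use warnings;

# Toy model only: the index is a list of per-segment document counts.
sub make_toy_index {
    my ($mergefactor) = @_;
    return { mergefactor => $mergefactor, segments => [], merges => 0 };
}

# Adds one document as a new single-document segment, then merges all
# segments into one if there are more than mergefactor of them.
sub add_doc {
    my ($index) = @_;
    push @{ $index->{segments} }, 1;

    if (@{ $index->{segments} } > $index->{mergefactor}) {
        my $total = 0;
        $total += $_ for @{ $index->{segments} };
        $index->{segments} = [$total];    # collapse into one segment
        $index->{merges}++;
    }
    return scalar @{ $index->{segments} };
}

my $index = make_toy_index(10);
add_doc($index) for 1 .. 25;

my $docs = 0;
$docs += $_ for @{ $index->{segments} };
printf "segments=%d merges=%d docs=%d\n",
    scalar @{ $index->{segments} }, $index->{merges}, $docs;
```

In this model a larger mergefactor means fewer merges while documents are being added, at the cost of more segments being present in the index at any one time.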


        sub add_document {
                my ($self, $doc) = @_;

                my $dw = Plucene::Index::DocumentWriter->new($self->{tmp_directory},
                        $self->{analyzer}, MAX_FIELD_LENGTH);
                my $segname = $self->_new_segname;
                $dw->add_document($segname, $doc);
                #lock $self;
                $self->{segmentinfos}->add_element(
                        Plucene::Index::SegmentInfo->new({
                                        name      => $segname,
                                        doc_count => 1,
                                        dir       => $self->{tmp_directory} }));
                $self->_maybe_merge_segments;
        }

        sub _new_segname {
                "_" . $_[0]->{segmentinfos}->{counter}++    # Urgh
        }

        sub _flush {
                my $self        = shift;
                my @segs        = $self->{segmentinfos}->segments;
                my $min_segment = $#segs;
                my $doc_count   = 0;
                while ($min_segment >= 0
                        and $segs[$min_segment]->dir eq $self->{tmp_directory}) {
                        $doc_count += $segs[$min_segment]->doc_count;
                        $min_segment--;
                }
                if ($min_segment < 0
                        or ($doc_count + $segs[$min_segment]->doc_count > $self->mergefactor)
                        or !($segs[-1]->dir eq $self->{tmp_directory})) {
                        $min_segment++;
                }
                return if $min_segment > @segs;
                $self->_merge_segments($min_segment);
        }

optimize

        $writer->optimize;

Merges all segments together into a single segment, optimizing an index for search. This should be the last method called on an indexer, as it invalidates the writer object.

add_indexes

        $writer->add_indexes(@dirs);

Merges all segments from an array of indexes into this index.

This may be used to parallelize batch indexing. A large document collection can be broken into sub-collections. Each sub-collection can be indexed in parallel, on a different thread, process or machine. The complete index can then be created by merging sub-collection indexes with this method.

After this completes, the index is optimized.
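
The final merge step of such a parallel run might look like the sketch below. It assumes the Plucene distribution is installed, that each sub-collection has already been indexed into its own directory, and that add_indexes accepts those index directory paths as its @dirs; that last point is an assumption, as the text above does not spell out what @dirs holds. The helper name merge_sub_indexes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Plucene: creates a new index at
# $target_path and merges the already-built sub-indexes into it.
sub merge_sub_indexes {
    my ($target_path, @sub_index_paths) = @_;

    # Loaded at call time so this file compiles without Plucene installed.
    require Plucene::Index::Writer;
    require Plucene::Analysis::SimpleAnalyzer;

    my $analyser = Plucene::Analysis::SimpleAnalyzer->new;
    my $writer   = Plucene::Index::Writer->new($target_path, $analyser, 1);

    # Assumption: @dirs is a list of index directory paths.
    $writer->add_indexes(@sub_index_paths);

    # No explicit optimize: add_indexes leaves the index optimized.
    undef $writer;

    return scalar @sub_index_paths;
}
```

Each sub-index would be built beforehand by its own writer, in a separate process or on a separate machine, and then passed to this helper by path.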
