package Dict::IntegerDict;
use strict;

# Constructor. Recognized options:
#   MAXLEN     - maximum number of digits in a 'good' integer (default 6)
#   REJECTLONG - if true, longer numbers are reported as stop lexemes
#                instead of being shortened (default 0)
sub new {
    my ( $class, %opt ) = @_;
    $class = ref($class) || $class;
    my $self = {};
    $self->{MAXLEN} =
      ( exists $opt{MAXLEN} && defined $opt{MAXLEN} ) ? $opt{MAXLEN} : 6;
    $self->{REJECTLONG} =
      ( exists $opt{REJECTLONG} && defined $opt{REJECTLONG} )
      ? $opt{REJECTLONG}
      : 0;
    return bless( $self, $class );
}

# Returns the lexeme to index for an integer token,
# or an empty list if the token is not an integer.
sub lemms {
    my ( $self, $number ) = @_;
    if ( defined $number && $number =~ /^([+-]?)(\d*)$/ ) {
        # With REJECTLONG set, the number is passed through unchanged
        return $number if $self->{REJECTLONG};
        # Otherwise keep the sign and the first MAXLEN digits
        return $1 . substr( $2, 0, $self->{MAXLEN} );
    }
    else {
        return ();
    }
}

# Returns 1 if the token must not be indexed, 0 if it may be indexed,
# and undef if the token is not an integer.
sub is_stoplexem {
    my ( $self, $number ) = @_;
    if ( defined $number && $number =~ /^([+-]?)(\d*)$/ ) {
        return 0 if !$self->{REJECTLONG};    # no stop lexem for this type
        return ( ( length($2) > $self->{MAXLEN} ) ? 1 : 0 );
    }
    else {
        return undef;
    }
}

1;
__END__

=head1 NAME

Dict::IntegerDict - Dictionary for integers

=head1 SYNOPSIS

    use Dict::IntegerDict;

    my $d = Dict::IntegerDict->new(MAXLEN=>6, REJECTLONG=>1);
    my $number_to_index = $d->lemms('12345678');    # returns 12345678
    my $stop_lexeme = $d->is_stoplexem('12345678'); # returns 1, not indexed
    $stop_lexeme = $d->is_stoplexem('123456');      # returns 0, indexed
    -------------------------------------------------------------
    my $d = Dict::IntegerDict->new(MAXLEN=>6, REJECTLONG=>0);
    my $number_to_index = $d->lemms('12345678'); # returns 123456, the number is shortened, indexed
    $number_to_index = $d->lemms('123456');      # returns 123456, indexed
    my $stop_lexeme = $d->is_stoplexem('12345678'); # always returns 0

=head1 DESCRIPTION

This module is designed to work with OpenFTS.
Its purpose is to control how integers (signed and unsigned) are indexed.
The MAXLEN parameter specifies the maximum number of digits in a number
considered a 'good' integer.
The REJECTLONG parameter specifies whether the C<is_stoplexem> method decides
if a number considered 'garbage' (one with more than MAXLEN digits)
should be indexed or rejected.
If REJECTLONG=>1, the C<is_stoplexem> method returns 1 for 'garbage' numbers,
so they are not passed to the indexer and consequently cannot be found.
The original number is not changed by the C<lemms> method.
If REJECTLONG=>0, the C<is_stoplexem> method ALWAYS returns 0 and the
C<lemms> method returns the leading MAXLEN digits of the integer,
preceded by its sign if one is present.

=head1 EXAMPLE

See script init.pl

=head1 AUTHOR

Oleg Bartunov, oleg@sai.msu.su

=head1 SEE ALSO

perldoc Dict::DecimalDict - dictionary for decimal numbers

=cut