package Dict::DecimalDict;

use strict;

sub new {
    my ( $class, %opt ) = @_;
    $class = ref($class) || $class;

    my $self = {};

    # Maximum number of digits allowed in the fractional part (default: 2).
    $self->{MAXLENFRAC} =
      ( exists $opt{MAXLENFRAC} && defined $opt{MAXLENFRAC} )
      ? $opt{MAXLENFRAC}
      : 2;

    # If true, over-long numbers are rejected instead of shortened (default: 0).
    $self->{REJECTLONG} =
      ( exists $opt{REJECTLONG} && defined $opt{REJECTLONG} )
      ? $opt{REJECTLONG}
      : 0;

    return bless( $self, $class );
}

sub lemms {
    my ( $self, $number ) = @_;
    if ( defined $number && $number =~ /^([+-]?)(\d*)(\.)(\d*)$/ ) {

        # With REJECTLONG set, the number is returned unchanged;
        # is_stoplexem decides whether it is indexed.
        return $number if $self->{REJECTLONG};

        # Otherwise cut the fractional part down to MAXLENFRAC digits.
        return "$1$2$3" . substr( $4, 0, $self->{MAXLENFRAC} );
    }
    else {
        return ();    # not a number
    }
}

sub is_stoplexem {
    my ( $self, $number ) = @_;
    if ( defined $number && $number =~ /^([+-]?)(\d*)(\.)(\d*)$/ ) {
        return 0 if !$self->{REJECTLONG};    # no stop lexem for this type
        return ( ( length($4) > $self->{MAXLENFRAC} ) ? 1 : 0 );
    }
    else {
        return undef;    # not a number
    }
}

1;

__END__

=head1 NAME

Dict::DecimalDict - Dictionary for decimal numbers

=head1 SYNOPSIS

    use Dict::DecimalDict;

    my $d = Dict::DecimalDict->new(MAXLENFRAC=>2, REJECTLONG=>1);
    my $number_to_index = $d->lemms('123.5678');     # returns 123.5678
    my $stop_lexeme = $d->is_stoplexem('123.5678');  # returns 1, not indexed
    my $stop_lexeme = $d->is_stoplexem('123.56');    # returns 0, indexed

    # -------------------------------------------------------------

    my $d = Dict::DecimalDict->new(MAXLENFRAC=>2, REJECTLONG=>0);
    my $number_to_index = $d->lemms('123.5678');     # returns 123.56, fraction part is shortened, indexed
    my $number_to_index = $d->lemms('123.56');       # returns 123.56, indexed
    my $stop_lexeme = $d->is_stoplexem('123.5678');  # always returns 0

=head1 DESCRIPTION

This module is designed to work with OpenFTS. Its purpose is to control
how decimal numbers are indexed.

The MAXLENFRAC parameter (default 2) specifies the maximum length of the
fractional part of a 'good' number. The REJECTLONG parameter (default 0)
specifies whether the C<is_stoplexem> method decides if a 'garbage' number,
that is, one whose fractional part is longer than MAXLENFRAC, should be
indexed or rejected.
In case REJECTLONG=>1, the C<is_stoplexem> method returns 1 for 'garbage'
numbers, so they are not passed to the indexer and consequently cannot be
found. The original number is not changed by the C<lemms> method.

In case REJECTLONG=>0, the C<is_stoplexem> method ALWAYS returns 0 and the
C<lemms> method returns the 'garbage' number with its fractional part
shortened to MAXLENFRAC digits.

=head1 EXAMPLE

See script init.pl

=head1 AUTHOR

Oleg Bartunov, oleg@sai.msu.su

=head1 SEE ALSO

perldoc Dict::IntegerDict - dictionary for integers

=cut