    use HTML::Truncate;

    my $html = 'We have to test something.';
    my $readmore = '... [readmore]';

    my $html_truncate = HTML::Truncate->new();
    $html_truncate->chars(20);
    $html_truncate->ellipsis($readmore);
    print $html_truncate->truncate($html), $/;

    # or

    my $ht = HTML::Truncate->new(
        utf8  => 1,
        chars => 1_000,
    );
    print $ht->truncate($html), $/;

=head1 XHTML

This module is designed to work only on XHTML-style nested tags. More
below.

=head1 WHITESPACE & ENTITIES

Repeated natural whitespace (i.e., "\s+" and not "&nbsp;") in HTML --
with rare exceptions (pre tags or user defined styles) -- is not
meaningful. Therefore it is normalized when truncating. Entities are
also normalized. The following is counted as only 14 chars long.

    <p>\n\nthis     is   &#8216;text&#8217;\n\n</p>
    ^^^^^^^12345----678--9------01234------^^^^^^^^

=head1 METHODS

=head2 HTML::Truncate->new

Accepts any of the object's methods as hash-style arguments. "percent"
and "chars" are incompatible, so don't use them both; whichever is set
most recently erases the other.

    my $ht = HTML::Truncate->new(
        utf8  => 1,
        chars => 500,    # default is 100
    );

=cut

sub new {
    my $class = shift;

    # Tags that never take a closing tag.
    my %stand_alone = map { $_ => 1 }
        qw( br img hr input link base meta area param );

    # Tags whose content is skipped entirely.
    my %skip = map { $_ => 1 }
        qw( head script form iframe object embed title style base link meta );

    my $self = bless {
        _chars            => 100,
        _percent          => undef,
        _utf8             => undef,
        _style            => 'text',
        _ellipsis         => '…',
        _raw_html         => '',
        _repair           => undef,
        _skip_tags        => \%skip,
        _stand_alone_tags => \%stand_alone,
    }, $class;

    # Apply constructor arguments through the matching accessor methods;
    # unknown keys are silently ignored.
    while ( my ( $k, $v ) = splice(@_, 0, 2) ) {
        next unless exists $self->{"_$k"};
        $self->$k($v);
    }

    return $self;
}

=head2 $ht->utf8

Set/get, true/false. If utf8 is set, entities will be transformed with C