# Find the first <h1> that does not contain a link into /dyna/.
my $real_h1 = $tree->look_down(
    '_tag', 'h1',
    sub {
        my $link = $_[0]->look_down('_tag', 'a');
        return 1 unless $link;                   # no link means it's fine
        my $href = $link->attr('href') // '';    # an <a> may have no href
        return 0 if $href =~ m{/dyna/};          # a link to there is bad
        return 1;                                # otherwise okay
    }
);
A piggy bank of commands, fixes, succinct reviews, some mini articles and technical opinions from a (mostly) Perl developer.
Scanning HTML
Screen scraping with HTML::TreeBuilder:
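A minimal, self-contained sketch of the same kind of search, for trying it out without a live page. The sample HTML, the /articles/1 link and the variable names are made up for illustration; only new_from_content, look_down, attr, as_text and delete are real HTML::TreeBuilder / HTML::Element methods. It assumes Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample page: three h1 headings, one of which links into /dyna/.
my $html = <<'HTML';
<html><body>
<h1><a href="/dyna/banner">Generated banner</a></h1>
<h1>Plain heading</h1>
<h1><a href="/articles/1">Linked heading</a></h1>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context look_down returns every matching element;
# in scalar context it returns only the first match.
my @real_h1 = $tree->look_down(
    _tag => 'h1',
    sub {
        my $link = $_[0]->look_down(_tag => 'a');
        return 1 unless $link;                   # no link: keep the heading
        my $href = $link->attr('href') // '';    # href may be missing
        return $href !~ m{/dyna/};               # reject links into /dyna/
    },
);

my @titles = map { $_->as_text } @real_h1;
print "$_\n" for @titles;

$tree->delete;    # release the tree once finished with it
```

With this sample input the filter should drop the first heading and keep the other two. Calling look_down in list context is a choice made for the sketch; use scalar context when only the first matching heading is wanted.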