HTML::TableContentParser - Do interesting things with the contents of tables.
    use HTML::TableContentParser;
    $p = HTML::TableContentParser->new();
    $tables = $p->parse($html);
This package extracts the contents of tables from a string containing HTML. Each table encountered is stored as a hash in an array. The hash holds whatever was discovered about the table (id, name, border, cellspacing, etc.) and, of course, the data contained within the table.
The format of each hash will look something like:

    attributes          keys from the attributes of the <table> tag
    @{$table_headers}   array of table headers, in order found
    @{$table_rows}      rows discovered, in order

If the table has a caption, this will be provided as:

    caption             keys from the caption tag's attributes
    data                the text of the <caption>..</caption> element

Then for each table row,

    @{$table_data}      td's found, in order
    other attributes    the ... in <tr ...>

Then for each data cell,

    data                what comes between <td> and </td>
    other attributes    the ... in <td ...>
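To make the nesting described above concrete, here is a minimal sketch that builds such a structure by hand and flattens it into rows of cell text. It does not call the module. The `rows`, `cells` and `data` keys are the ones used in this document's own example, while the `headers` and `caption` key names, the sample values, and the `table_to_matrix` helper are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hand-built stand-in for what parse() might return for one small table.
# 'rows', 'cells' and 'data' come from the module's documented example;
# 'headers' and 'caption' are assumed key names, not confirmed here.
my $tables = [
    {
        id      => 'prices',    # as if from <table id="prices" border="1">
        border  => '1',
        caption => { data => 'Price list' },
        headers => [ { data => 'Item' }, { data => 'Cost' } ],
        rows    => [
            { cells => [ { data => 'Apple', align => 'left' }, { data => '0.50' } ] },
            { cells => [ { data => 'Pear' },                   { data => '0.75' } ] },
        ],
    },
];

# Hypothetical helper: turn one table hash into an array of rows,
# each row being an array of the cells' text. Missing 'rows' or
# 'cells' keys are treated as empty lists.
sub table_to_matrix {
    my ($table) = @_;
    return [
        map { [ map { $_->{data} } @{ $_->{cells} || [] } ] }
            @{ $table->{rows} || [] }
    ];
}

my $matrix = table_to_matrix( $tables->[0] );
print join( ', ', @$_ ), "\n" for @$matrix;
```

Because tag attributes sit in the same hash as the structural keys, an attribute such as `align` on a cell is read the same way as `data`, for example `$tables->[0]{rows}[0]{cells}[0]{align}`.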
    use HTML::TableContentParser;
    $p = HTML::TableContentParser->new();
    $html = read_html_from_somewhere();
    $tables = $p->parse($html);
    for $t (@$tables) {
        for $r (@{$t->{rows}}) {
            print "Row: ";
            for $c (@{$r->{cells}}) {
                print "[$c->{data}] ";
            }
            print "\n";
        }
    }
=head1 METHODS
Nothing.
Simon Drabble E<lt>sdrabble@cpan.orgE<gt>
(C) 2002 Simon Drabble
This software is released under the same terms as perl.