HTTP::Proxy::BodyFilter::htmlparser - Filter using HTML::Parser
    use HTTP::Proxy::BodyFilter::htmlparser;

    # $parser is an HTML::Parser object
    $proxy->push_filter(
        mime     => 'text/html',
        response => HTTP::Proxy::BodyFilter::htmlparser->new( $parser ),
    );
HTTP::Proxy::BodyFilter::htmlparser lets you create a filter based on the HTML::Parser object of your choice.
This filter takes an HTML::Parser object as an argument to its constructor. The filter is either read-only or read-write. A read-only filter will not allow you to change the data on the fly.
With a read-write filter, you must recreate the whole response body. This is mainly because HTML::Parser has its own buffering system, and there is no easy way to correlate the data that triggered an HTML::Parser event with its original position in the chunk sent by the origin server. See below for details.
Note that a simple filter that modifies the HTML text (not the tags) can be created more easily with HTTP::Proxy::BodyFilter::htmltext.
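As an illustration of the read-only case, here is a minimal sketch of a parser that only observes the data: it collects the href attribute of every link it sees. The @links array and the handler body are assumptions made for this example; they are not part of the module.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical collector for this example (not provided by the module).
my @links;

# A read-only parser: the handler inspects start tags and never
# produces any replacement data.
my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [
        sub {
            my ( $tagname, $attr ) = @_;
            push @links, $attr->{href}
              if $tagname eq 'a' && defined $attr->{href};
        },
        'tagname,attr',
    ],
);

# The parser would then be handed to the filter constructor:
#   HTTP::Proxy::BodyFilter::htmlparser->new( $parser );
```

Because the filter is read-only here, the response body reaches the client unchanged, whatever the handler does.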
A read-write filter is declared by passing rw => 1 to the constructor:

    HTTP::Proxy::BodyFilter::htmlparser->new( $parser, rw => 1 );
To be able to modify the body of a message, a filter created with HTTP::Proxy::BodyFilter::htmlparser must rewrite it completely. The HTML::Parser object can update a special attribute named output.
To do so, the HTML::Parser handler will have to request the self attribute (that is to say, require access to the parser itself) and update its output key.
The following attributes are added to the HTML::Parser object by this filter:
output
This string will be used as a replacement for the body data only if the filter is read-write, that is to say, if it was initialised with rw => 1.
Data should always be appended to $parser->{output}.
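The read-write mechanism described above can be sketched as follows. This is only an example under stated assumptions: the choice of stripping HTML comments is ours, and the handlers are illustrative. Every handler that should contribute to the new body requests self in its argspec and appends to the output key.

```perl
use strict;
use warnings;
use HTML::Parser;

# With api_version => 3, no handlers are installed by default.
my $parser = HTML::Parser->new( api_version => 3 );

# Copy every event's original text to the output attribute.
# "self" gives the handler access to the parser object itself.
$parser->handler(
    default => sub {
        my ( $self, $text ) = @_;
        $self->{output} .= $text;    # always append, never assign
    },
    'self,text'
);

# Comments get their own handler, which appends nothing,
# so they are dropped from the rewritten body.
$parser->handler( comment => sub { }, 'self' );

# The parser would then be handed to the filter constructor:
#   HTTP::Proxy::BodyFilter::htmlparser->new( $parser, rw => 1 );
```

Since the body is rebuilt from scratch, anything a handler does not append is lost; the default handler is what keeps the rest of the document intact.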
This filter defines three methods, called automatically:
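filter()
The filter() method handles all the interactions with the HTML::Parser object.
init()
Initialises the filter with the HTML::Parser object passed to the constructor.
will_modify()
Returns a boolean value that tells the proxy whether the filter will modify the data passing through. This is the value of the rw parameter passed to the constructor.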
See also HTTP::Proxy, HTTP::Proxy::BodyFilter, and HTTP::Proxy::BodyFilter::htmltext.
Philippe "BooK" Bruhat, <book@cpan.org>.
Copyright 2003-2006, Philippe Bruhat.
This module is free software; you can redistribute it or modify it under the same terms as Perl itself.