WWW::Mechanize::Cookbook - Recipes for using WWW::Mechanize
First, please note that many of these recipes are possible using LWP::UserAgent alone. Since WWW::Mechanize is a subclass of LWP::UserAgent, whatever works on LWP::UserAgent should work on WWW::Mechanize. See the lwpcook man page included with the libwww-perl distribution.
use WWW::Mechanize;
my $mech = WWW::Mechanize->new( autocheck => 1 );
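The autocheck => 1 tells Mechanize to die if any IO fails, so you don't have to check manually. It's easier that way. If you want to do your own error checking, leave it out. Recent releases of WWW::Mechanize turn autocheck on by default, so there you would pass autocheck => 0 instead.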
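For the do-it-yourself route, a minimal sketch might look like the following. It passes autocheck => 0 explicitly so that it behaves the same whatever the default is in your installed version. The timeout value is an assumption added for illustration, not part of the original recipe.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 0: failed requests do not die, so we must check ourselves.
# timeout => 10 is an illustrative value passed through to LWP::UserAgent.
my $mech = WWW::Mechanize->new( autocheck => 0, timeout => 10 );

$mech->get( "http://search.cpan.org" );

if ( $mech->success ) {
    print "Fetched ", $mech->uri, "\n";
}
else {
    # status_line combines the numeric code and the message, e.g. "404 Not Found"
    warn "Fetch failed: ", $mech->res->status_line, "\n";
}
```

With autocheck off, every call that hits the network needs a check like this, which is the bookkeeping autocheck => 1 saves you.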
$mech->get( "http://search.cpan.org" );
print $mech->content;
$mech->content contains the raw HTML from the web page. It is not parsed or handled in any way, at least through the content method.
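To make the raw-versus-parsed distinction concrete, here is a small sketch that prints the raw content next to the page title, which Mechanize does parse out of the HTML. The sample page is hypothetical and is written to a temporary file so the sketch runs without a network connection.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use URI::file;
use WWW::Mechanize;

# Hypothetical sample page, saved locally so no network is needed.
my $html = "<html><head><title>Example page</title></head>"
         . "<body><p>Hello</p></body></html>\n";
my ( $fh, $filename ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} $html;
close $fh or die "Cannot close $filename: $!";

my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->get( URI::file->new_abs( $filename )->as_string );

print $mech->content;        # the HTML exactly as fetched, tags and all
print $mech->title, "\n";    # the text of the <title> element
```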
Sometimes you want to dump your results directly into a file. For example, there's no reason to read a JPEG into memory if you're only going to write it out immediately. This can also help with memory issues on large files.
$mech->get( "http://www.cpan.org/src/stable.tar.gz", ":content_file" => "stable.tar.gz" );
To fetch a password-protected page, generally just call credentials before fetching it.
$mech->credentials( 'admin' => 'password' );
$mech->get( 'http://10.11.12.13/password.html' );
print $mech->content();
Find all links that point to a JPEG, GIF or PNG.
my @links = $mech->find_all_links( tag => "a", url_regex => qr/\.(jpe?g|gif|png)$/i );
Find all links that have the word "download" in their text.
my @links = $mech->find_all_links( tag => "a", text_regex => qr/\bdownload\b/i );
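Each element returned by find_all_links is a WWW::Mechanize::Link object, so a typical next step is to loop over the results. The sketch below runs both searches against a hypothetical local page (the file names and link text are made up for illustration) and prints each link's absolute URL and text.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use URI::file;
use WWW::Mechanize;

# Hypothetical sample page, saved locally so no network is needed.
my ( $fh, $filename ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<a href="photo.jpg">Photo</a>
<a href="docs/manual.html">Download the manual</a>
<a href="logo.PNG">Logo</a>
</body></html>
HTML
close $fh or die "Cannot close $filename: $!";

my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->get( URI::file->new_abs( $filename )->as_string );

# url_regex is matched against the link's URL, text_regex against its text.
my @images    = $mech->find_all_links( tag => "a", url_regex  => qr/\.(jpe?g|gif|png)$/i );
my @downloads = $mech->find_all_links( tag => "a", text_regex => qr/\bdownload\b/i );

for my $link ( @images, @downloads ) {
    my $text = defined $link->text ? $link->text : '';
    printf "%s (%s)\n", $link->url_abs, $text;
}
```

The url method returns the link as written in the page, while url_abs resolves it against the page's base URL, which is usually what you want before fetching it.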
To check all the pages on a web site, use Abe Timmerman's WWW::CheckSite: http://search.cpan.org/dist/WWW-CheckSite/
Copyright 2005 Andy Lester <andy@petdance.com>
Later contributions by Peter Scott, Mark Stosberg and others.