Revision history for WWW::Mechanize

Please note that WWW::Mechanize and Test::WWW::Mechanize are no longer
using rt.cpan.org for bug tracking. They are now being tracked via
Google Code at http://code.google.com/p/www-mechanize/issues/list

Mech now has its own mailing list at Google Groups:
http://groups.google.com/group/www-mechanize-users

1.34    Mon Dec 10 00:30:39 CST 2007
========================================
[FIXES]
Many fixes to make the test suite more portable.

1.32    Tue Oct 30 12:02:17 CDT 2007
========================================
[ENHANCEMENTS]
Added dump methods to mirror mech-dump:
* $mech->dump_images()
* $mech->dump_links()
* $mech->dump_forms()
* $mech->dump_all()

Sanity checks in the WWW::Mechanize::Image constructor. Every Image
must have a "url" and "tag" field passed in to it.

1.31_02 Thu Oct 25 11:48:29 CDT 2007
========================================
[ENHANCEMENTS]
Added class, class_regex, id and id_regex limiters to find_link()
and find_all_links(). Thanks to Adriano Ferreira.

1.31_01 Mon Sep 17 23:38:03 CDT 2007
========================================
[FIXES]
Mech tests now pass even if your DNS server gives A records for
anything (like OpenDNS). Thanks, Miyagawa!

Searching for the is now case-insensitive. A better solution would
be to actually parse the HTML.

[ENHANCEMENTS]
mech-dump now handles --user and --password arguments for sites
that require authentication.

1.30    Thu May 24 21:31:10 CDT 2007
========================================
[DOCUMENTATION]
Minor doc fixes. Thanks David Steinbrunner.

1.29_01 Tue May 22 14:02:55 CDT 2007
========================================
Kevin Falcone and I ask for your assistance in figuring out how to
handle the warnings thrown by the tests, other than hiding them.

[FIXES]
Overhauled how tainting was done. Stole code directly from
Test::Taint.

Have LWP only handle decoding of Content-Encoding, not charset.

[DOCUMENTATION]
Fixed the docs for $mech->submit_form()'s with_fields arg.
Thanks, Peteris Krumins.

1.26    Wed May 16 14:21:29 CDT 2007
========================================
[FIXES]
Re-reversed the content decoding. This is critical for reading from
sites with gzip on the fly, like Wikipedia.

Content is now properly tainted.

[ENHANCEMENTS]
mech-dump can now pass --agent and --agent-alias flags so you can
fetch from sites like Wikipedia that block LWP user agents.

[INSTALLATION]
The mech-dump program is now always installed. It no longer is
presented as an option.

1.24    Fri May 11 15:57:56 CDT 2007
========================================
NOTE: Version 1.24 will NOT automatically decode gzipped content
for you any more. Consider it a "do not use" release.

[FIXES]
* Fixed failures in "make test" with some versions of
  HTTP::Server::Simple
* RT #26593: Improved handling of charsets. Thanks Kevin Falcone.
* RT #24354: find_link now handles http-equivs with quoted URLs.
* Reverses the change in 1.21_01 where it decodes the content.

[ENHANCEMENTS]
* Added find_all_inputs() and find_all_submits() methods.
  Thanks, Mike O'Regan.
* Test::LongString is no longer needed, so has been removed as a
  requirement.

[TESTS]
* Added a test for save_content()

1.22    Fri Mar 2 00:05:57 CST 2007
========================================
[INTERNALS]
Added new tests.

Added Perl::Critic changes and a perlcriticrc file.

1.21_04 Sat Oct 7 21:35:42 CDT 2006
========================================
[FIXES]
* $mech->content( type => 'text' ) was not freeing memory.
  Thanks to Cat Okita for finding it.

[INTERNALS]
* Made the order of parms to $mech->content() not relevant.

1.21_03 Sat Oct 7 01:21:46 CDT 2006
========================================
[THINGS THAT MAY BREAK YOUR CODE]
* The methods $mech->form() and $mech->follow() have been removed.
  They've been deprecated since 1.10, which was released in
  Feb 2005.

[ENHANCEMENTS]
* I'm trying to nail down what seems to be a memory leak on
  long-running Mech programs. I'm stringifying URI::URL objects
  wherever I can.

[INTERNALS]
* No longer uses UNIVERSAL.

1.21_02 Wed Oct 4 13:14:30 CDT 2006
========================================
[ENHANCEMENTS THAT MAY BREAK YOUR CODE]
* The $mech->stack_depth() setting had no way to say "don't cache
  any pages at all". How silly! Now, if you set
  $mech->stack_depth(0), no history of pages will be kept. In the
  past, it would mean "Keep all pages." This means that if you
  want to set it to keep all pages, set it to some ridiculously
  large number.

[DOCUMENTATION]
* The docs previously referred to Compress::Gzip instead of
  Compress::Zlib.

1.21_01 Mon Sep 18 17:18:43 CDT 2006
========================================
[ENHANCEMENTS]
* If Compress::Zlib is installed, gzipped content is now accepted
  and transparently decoded. No additional syntax needed! This
  should save time and bandwidth in a number of cases.
  (Mark Stosberg)
* Added a put() method. It also calls a subfunction called
  _SUPER_put that will be removed once LWP::UserAgent supports
  put().

1.20    Sat Aug 19 09:09:08 EDT 2006
[ENHANCEMENTS]
* Added new two-argument form of credentials() method.

    $mech->credentials($username, $password);

  That provides simpler visiting of password-protected resources
  in the vast majority of cases and still allows the other cases
  to be supported. (Peter Scott)

[BUG FIXES]
* autocheck no longer is triggered when informational responses
  are returned. (Mark Stosberg)

[INTERNALS]
* test suite no longer fails when Test::Warn is missing.
  (CPAN testers, Mark Stosberg)
* Removed all the testing against live sites. The networking code
  is not actually in Mech anyway, and they were prone to breaking,
  as the live sites changed. (Mark Stosberg)

1.19_02 Mon Aug 7 23:57:56 CDT 2006
[ENHANCEMENTS]
* Add new Do-What-I-Mean submit_form() option.

    $mech->submit_form( with_fields => \%data );

  That expresses that you want to select the first form that
  contains all fields in \%data, and then submit the data to that
  form. See the docs for form_with_fields() and submit_form() for
  details.
  (Mark Stosberg, inspired by RT#6100)

[BUG FIXES]
* The behavior of clone() now copies over the cookie jar, which is
  probably what you expected it did in the first place. This fixes
  bug RT#13541 filed against Test::WWW::Mechanize, which was using
  clone() internally. (Mark Stosberg)
* The correct URL is returned after redirecting. This is a
  regression from 1.04 and was reported as RT#9059, RT#12882, and
  RT#12786. The documentation about this has also been clarified
  that we return a URI object, but that it stringifies to the URI
  itself.

[DOCUMENTATION]
* Fixed a misleading parm in the constructor.
* Document the return value of set_visible (RT#6071, MJD,
  Mark Stosberg)
* Document that form_name and form_number return an HTML::Form
  object (Mark Stosberg)

[INTERNALS]
* Made lots of little cleanups based on Perl::Critic
* Fix Taint-mode warnings with Perl 5.6.1 (RT#16945)

1.18    Thu Feb 2 00:11:26 CST 2006
[TESTS]
* Makefile.PL now takes four new parms:
    * --live/nolive turns on/off the live tests
    * --local/nolocal turns on/off the local tests
    * --mech-dump/nomech-dump installs/doesn't install the
      mech-dump program
    * --all turns on all tests and installs mech-dump
* Fixed some failures in tests. Non-existent URLs now have a "."
  postpended to them, so if someone's got a search domain with a
  wildcard (i.e. ignore.us) it'll ignore that. Also, Google's
  second link is now a https:// link, which some Mechs can't
  handle. Added a 'url_regex' which now makes it look at the
  second non-https link. Thanks to Pete Krawczyk.

1.16    Fri Oct 28 17:34:20 CDT 2005
[ENHANCEMENTS]
* Sped up Mech significantly (~20% in some cases). Images and
  links are extracted from the HTML, and objects are created, only
  when they're actually needed. This will be a speedup for pages
  where you're only following links, or vice versa.

[THINGS THAT MAY BREAK YOUR CODE]
* If you've been relying on the $mech->{images} and $mech->{links}
  fields being populated so that you can bypass the
  $mech->images() and $mech->links() accessors, your code will
  break. That's OK, because you should have been using the
  accessors all along.

1.14    Tue Aug 30 17:17:40 CDT 2005
[DOCUMENTATION]
* Added lots of new FAQs. Thanks to Peter Stevens.

[INTERNALS]
* Now requires Test::LongString. That's not too odious.

[FIXES]
* Tests now pass with the shuffling around that Google did.

1.13_01 Tue Apr 12 14:11:18 CDT 2005
[ENHANCEMENTS]
* Now dies if you call submit_form() with a non-existing
  form_number or form_name. Before, it would just warn.

[DOCUMENTATION]
* Added an example of using credentials() in the cookbook.

1.12    Thu Feb 24 23:38:44 CST 2005
[FIXES]
* Fixed RT #9026: hang in t/local/back.t under Windows XP. Thanks
  Andrew Savige. It also should no longer complain about being
  unable to clean up a temp file.

1.11_01 Mon Feb 14 00:12:48 CST 2005
[THINGS THAT MAY BREAK YOUR CODE]
* Removed deprecated _parse_html() method.

[FIXES]
* Was incorrectly looking for INPUT tags TYPE="SUBMIT" as images.
  Thanks to Abe Timmerman.

[ENHANCEMENTS]
* Calling $mech->set_fields() with no current form now dies.
  Thanks to Julien Beasley.

1.10    Tue Jan 31 11:30pm-ish
[FIXES]
* Fixed bug where images inside of links would not be found.
* Fixed test failures because of Google changes. Thanks to Offer
  Kaye and others who sent in patches.

[DOCUMENTATION]
* More samples in the FAQ. Thanks to Joshua Gatcomb.

[INTERNALS]
* Added explanation of running live tests against Google in
  Makefile.PL.

1.08    Fri Dec 24 01:01:06 CST 2004
[ENHANCEMENTS]
* Added find_image() and find_all_images().

1.06    Wed Dec 8 14:58:39 CST 2004
[INTERNALS]
* Now uses the base pragma instead of setting @ISA.

1.05_04 Fri Nov 5 23:35:38 CST 2004
[ENHANCEMENTS]
* Added WWW::Mechanize::Image object for representing images.
* Improved the regex on the URL for META tags.
* Added --images flag to mech-dump.

[FIXES]
* When parsing urls out of meta refresh tags, "url" may now be
  uppercase (RT#8230)
* Behavior of back() fixed in a number of cases (RT#8109 reported
  by Josh Purinton, patched by Dominique Quatravaux)

[INTERNALS]
* Mark figured out how to prevent his text editor from putting
  tabs into the code. Andy's blood pressure dropped slightly.

1.05_03 Sun Oct 31 20:54:33 CST 2004
[ENHANCEMENTS]
* click_button() has a new input option for
  HTML::Form::SubmitInput objects (DOMQ)
* content() has new options to return the page formatted as text,
  with a added. (RT#8087, patch by Dominique Quatravaux)
* update_html() method has been added, which can be used to modify
  the HTML that Mech parses. It should be sub-classed instead of
  _parse_html(), which has been deprecated. (RT#8087, patch by
  Dominique Quatravaux)
* select() has new option to select an option by number (RT#5789,
  Scott Lanning)
* WWW::Mechanize::Link now supports providing all the attributes
  of the link through a new attrs() method, which returns them as
  a hashref. This is a replacement for the alt() method, added in
  1.05_01. It's not backwards compatible with that, but, hey,
  that's what developer releases are for. (RT#8092, Rob Casey and
  Mark Stosberg)

[FIXES]
* Upload does not use the default value to prevent attacks, patch
  by Jan Pazdziora (RT #7843).

[INTERNALS]
* Improved tests and documentation for select() (RT#5789,
  Scott Lanning)
* Improve taint-safeness on Perl 5.6.1 (RT#8042, patch by
  Dominique Quatravaux)
* Added tests for click_button() (RT#8061, by Dominique
  Quatravaux)
* Require URI 1.25, fixing bug which exposed itself in
  WWW::Mechanize (RT#3048)
* Move select() to better location in docs. Document and test the
  return values. The return value is now "1" on success instead of
  the undocumented behavior of returning a form value.
  (RT#6138, spotted by MJD, patched by Mark Stosberg)
* Possible matching tags for the find_link() 'tag_regex' attribute
  are now documented. (RT#2989, by Mark Stosberg)
* refactored find_link() to avoid use of eval(). This should
  improve performance a bit and avoid potential security issues.
  (Mark Stosberg)

1.05_02 Sat Oct 2 16:55:59 CDT 2004
[ENHANCEMENTS]
* Added the $mech->save_content( $filename ) function, so you can
  dump stuff to files easily.

1.05_01 Thu Sep 30 21:04:44 CDT 2004
[FIXES]
* set_visible() doesn't stop setting values when it finds a zero.

[ENHANCEMENTS]
* WWW::Mechanize::Link has a new, easier to remember constructor
  interface. The old one is still supported. Support for including
  an 'alt' attribute was added, which is useful for links.
  (RT #3317). Thanks to Mark Stosberg.
* When links are extracted from tags, the ALT attribute will be
  captured and become part of the WWW::Mechanize::Link object.
  (RT #3317). Patch by Mark Stosberg.

[INTERNALS]
* t/mech-dump.t is now more portable (RT #7690)
* t/local/follow.t has new tests to confirm that 'follow*'
  functions work with characters like o-umlaut, even when the
  o-umlaut is encoded in the HTML, but not in the call to
  follow(). (RT #2416) By Mark Stosberg.

1.04    Wed Sep 15 23:27:53 CDT 2004
[ENHANCEMENTS]
* $mech->get() now accepts a WWW::Mechanize::Link object.
* $mech->stack_depth(n) lets you set the depth of the mech
  object's page stack. This way, if you have a Mech that does lots
  of stuff and never/rarely goes back(), you won't be eating up
  memory. Thanks to BooK and Chi-Fung. (RT #5362)

[FIXES]
* Fixed tests that fail under LWP >= 5.800.
* Added a workaround for LWP::UserAgent->clone() when ->{proxy}
  is undef. (RT #6443)
* The Referer was getting passed as a URI object sometimes, and
  that caused sadness. Eugene Haimov supplied a workaround.
  (RT #6372)

[DOCUMENTATION]
* Added Ian Langworth's listmod and John Beppu's photobucket
  uploader programs to WWW::Mechanize::Examples.
* Minor doc tweak for find_link()
* Finally added a value() func. Thanks to Spoon, who even now,
  months after his passing, is still contributing to Mechanize.

1.02    Tue Apr 13 22:45:10 CDT 2004
No reason to install if you have 1.00. Fixes are only in tests.

[FIXES]
* t/referer.t didn't cope with spaces in $FindBin::Bin. Plus, it
  now forces its URL to localhost.

1.00    Sat Apr 10 00:35:51 CDT 2004
I figure it's about time we hit 1.00, and this version seems like
a good place to do it, because of the potential breakage described
below...

[THINGS THAT WILL BREAK YOUR CODE]
* Header handling has changed. There is no more package variable
  %headers that holds all the headers to be added. They are now
  added on a per-object basis. If you were adding a header with
  add_header(), and the code relied on that header still being set
  later on in a later instance of the class, that code will now
  break, because the later instance won't have the header set.

[ENHANCEMENTS]
* You can now prevent a header from being sent by adding it with
  an undef value, as in:

    $mech->add_header( Referer => undef );

[FIXES]
* Now correctly adds Accept-Encoding to all requests that need it.

[INTERNALS]
* Added new $mech->_modify_request($req) method to do all the HTTP
  header modification before the actual request gets sent off.
  Subclasses are able to override it if they want.
* Removed the unused Compress::Zlib stuff.

0.76    Wed Apr 7 22:01:43 CDT 2004
[ENHANCEMENTS]
* Added update_html() to let you update the HTML for the page
  you're on.

[FIXES]
* Test files account for new Google layout.

[INTERNALS]
* Rearranged the local tests into their own t/local/ directory.
* Made the standalone tests show what server they're hitting.
* Checked that it runs under LWP 5.78.

0.74    Mon Mar 22 23:36:46 CST 2004
[ENHANCEMENTS]
* WWW::Mechanize now sends an Accept-Encoding header of "identity"
  to always enforce plaintext responses. Preliminary support for
  Compress::Zlib is also there, but is disabled by default.
* Added click_button() and select() methods. The field() method
  can now take an arrayref of values, if appropriate. Thanks,
  Linda Lee Julien.
* Added url_abs and url_abs_regex parms to find_all_links().
* URLs in META REFRESH tags are now treated as links.
* t/taint.t makes sure that things that should be tainted are.

[FIXES]
* Still more fixes if the machine you're on doesn't have DNS
  pointing to it.
* The local changes use localhost as the local host name, instead
  of whatever host name that might be on the box, but not in DNS.
  Thanks to David Wheeler for letting me play on his box.
* The http_proxy and HTTP_PROXY environment variables get deleted
  during the tests that access the dummy local server. This should
  let your tests pass, and clear up a lot of RT tickets.

0.72    Mon Jan 26 21:07:20 CST 2004
[ENHANCEMENTS]
* Added the set_visible() method, thanks to Peter Scott.

[DOCUMENTATION]
* Started the Cookbook at WWW::Mechanize::Cookbook.pod.

[INTERNALS]
* Made the globbing in Makefile.PL a little less command-line
  intensive. Also fixed the missing files in MANIFEST.
* Added t/pod-coverage.t for testing POD coverage.

0.71_02 Mon Dec 22 14:29:13 CST 2003
[THINGS THAT MAY BREAK YOUR CODE]
* Added a 5th, optional parameter to WWW::Mechanize::Link's
  constructor. In 0.71_01, it was at the beginning of the argument
  list and was required. Now it's at the end and is optional.

  If, in the 15 hours since 0.71_01 came out, you went and changed
  all your WWW::Mechanize::Link constructors, you'll have to
  change them around again. Otherwise, you can just ignore this
  change.

0.71_01 Sun Dec 21 23:48:12 CST 2003
[THINGS THAT MAY BREAK YOUR CODE]
* WWW::Mechanize::Link's constructor has a new argument that needs
  to be passed in, at the start of the argument list.

[ENHANCEMENTS]
* WWW::Mechanize::Link object now takes a $base URL, and will
  return absolute URLs with the url_abs() method. Thanks to
  Ashley Pond.
* Added another script to WWW::Mechanize::Examples.
  It's a script that didn't make it into Spidering Hacks.

[INSTALL & TESTS]
* Heavy use of the new Test::Memory::Cycle module.
* Fixed Makefile.PL so that the tests are selected under Win32.
* Changed t/mech-dump.t so that the test succeeds under Win32.
* Updated t/referer.t and t/mech-dump.t so they run under VMS.
  Thanks to Peter Prymmer.

0.70    Sun Nov 30 23:45:27 CST 2003
[THINGS THAT MAY BREAK YOUR CODE]
* Redirects are now handled better by LWP, so the code that
  changes POSTs to GETs on redirects has been removed.

[FIXES]
* Fixed redirect_ok(), which had its API changed out from under it
  in LWP 5.76.

[ENHANCEMENTS]
* New warnings in find_link() for strings that are space padded,
  and for text matches that are passed a regex. Thanks to
  Jim Cromie.

[DOCUMENTATION]
* Patches from Mark Stosberg and Jim Cromie.

[INTERNALS]
* Removed all the checking for Carp. I don't know why I was
  thinking that Carp wasn't core. RT #4523. Also, a big bump in
  requirements on LWP: We need 5.76.

0.66    Thu Nov 13 14:35:31 CST 2003
No new functionality. Fixed up some install bugs and made a few
documentation tweaks, mostly to plug Spidering Hacks.

0.65    Mon Nov 10 00:11:06 CST 2003
[ENHANCEMENTS]
* Made a _parse_html() method that you can override or call
  manually, per request from Gavin Estey.

[FIXES]
* Made some path naming use File::Spec->catfile so that they work
  correctly under Windows.
* "make clean" cleans up temp flag files.

[INTERNALS]
* Uses the new Test::Pod 1.00 for simplicity.

0.64    October 23, 2003 11:15pm
[ENHANCEMENTS]
* Many new tests, based on the excellent coverage reporting
  created by Paul Johnson's Devel::Cover module.
* The start of JavaScript support, sort of! If you have an tag
  that does an onClick that opens a window, Mech will find the URL
  from that and make that be the link for the tag. This is for
  things like Movable Type that pop little windows to rebuild
  indexes. This is subject to change in the future. I don't know
  if it will, but I'm not making promises.
  It might be so buggy I just yank the whole thing.
* Big jump in requirements, since we'll soon be using Gisle's new
  HTML::Form stuff. Also, older versions of HTML::Form don't give
  output I'm expecting.

[FIXES]
* Fixed the t/mech-dump.t failure.

0.63    October 13, 2003 2:56pm
[ENHANCEMENTS]
* mech-dump defaults to dumping forms.
* Added name, name_regex, tag and tag_regex options to find_link()
  and follow_link().
* Added tests from Jim Brandt.

0.62    October 7, 2003 8:46pm
[THINGS THAT MIGHT BREAK YOUR CODE]
* The parms for find_link()'s url_regex and text_regex must now be
  actual regex objects, as in qr// objects. They can't just be
  little text strings. If this is a big bummer, let me know.

[ENHANCEMENTS]
* Added autocheck parm, to tell your Mech object to die on any
  error. This saves you from having to check yourself. This closes
  RT #3056.
* Renamed the internal _carp() method as warn().
* Added a die() method.
* Can now override the warn() and die() handlers in the
  constructor.
* find_link() now complains if it gets a *_regex parm that isn't
  actually a regex. See RT #3032.

[FIXES]
* mech-dump.t no longer runs if you're not installing mech-dump.
  See RT #3724.

[DOCUMENTATION]
* More FAQs. Thanks to Gavin Estey.

0.61    October 6, 2003 6:30pm
No new functionality here. It's mostly to get the new tests into
the pipeline so the CPAN testers can run 'em.

[FIXES]
* Missing dependency on File::Temp. Thanks, Ask.

[ENHANCEMENTS]
* Added the test case for the form processing problem as a .t
  file, since I spent so long getting it down to a simple case.
* Internal code uses accessors instead of direct hash entries.
  Prepare for deprecation of existing hash entries!

[DOCUMENTATION]
* The FAQ is now its own document at WWW::Mechanize::FAQ.

0.60    September 22, 2003 10:00pm
[FIXES]
* Changed how t/failure.t tries to fail. It used to hit a bogus
  hostname in .com, but with Verisign doing its SiteFinder crap,
  even bogus addresses in .com succeed.

[ENHANCEMENTS]
* Added _make_request() to let WWW::Mechanize::Cached easily hook
  into the request chain.

0.59    September 3, 2003 11:56pm
[FIXES]
* Squelched a warning in follow() where it tries to do a regex
  match against an undef value.
* The page stack functionality, including the back() button, was
  entirely broken. Now it works. Thanks to the mighty Iain
  Truskett for help.

[ENHANCEMENTS]
* Added the mech-dump script, which replaces mech-forms. It will
  dump forms and lists of links. Eventually it will do lists of
  images, too, but not yet.

0.58    August 14, 2003 11:30pm
[THINGS THAT MIGHT BREAK YOUR CODE]
* $mech->uri() now returns a plain string, not a URI object. The
  automatic stringification of the URI object was causing problems
  on Win32 and/or threaded Perls, and I didn't feel like figuring
  out why. If the non-objectness of the uri() method is a problem,
  let me know.
* form(), form_name() and form_number() now return the HTML::Form
  object of the form that was chosen. They used to return a 1 or
  0. This means that if you're explicitly checking for 1 or 0,
  instead of evaluating the return code in a boolean context, your
  code will break.

[FIXES]
* The -handling in extract_links() was incorrectly building the
  text.
* uri() now returns a string, not a URI object.
* form(), form_name() and form_number() now return the HTML::Form
  object of the form that was chosen.

[INTERNALS]
* Determination of live vs. local tests is now done in
  Makefile.PL, and we don't have to set those silly semaphore
  files any more.
* Made other cleanups in Makefile.PL, like using ExtUtils::Command
  instead of rolling my own touch().
* Moved all the *-live.t tests into t/live/*.t, and renamed the
  *-local.t files to not be -local.
* Added more tests for tags.

0.57    July 31, 2003 11:21pm
[ENHANCEMENTS]
* Added tags to those that are links per find_links().

0.56    July 24, 2003 12:15pm
[THINGS THAT MIGHT BREAK YOUR CODE]
* Created agent_alias() method to do the browser string
  translation.
  Passing "Windows IE 6" to agent() will get you back exactly that
  string as the agent. You have to call
  $a->agent_alias( "Windows IE 6" ) to get the translation.
  Fortunately, unless you used the new functionality of agent() in
  the past two days since I released 0.55, it won't be a problem.

[ENHANCEMENTS]
* Removed the dependencies on Carp and Test::Builder. There still
  is a dependency on Test::Builder for Test::More, but it's no
  longer explicit in the Makefile.PL. Mech will use Carp if
  possible, but it's no longer a requirement.

[INTERNALS]
* Added _carp method for handling conditional warnings, rather
  than checking quiet() all the time.

0.55    July 22, 2003 12:10pm
[ENHANCEMENTS]
* Added WWW::Mechanize::Link object to encapsulate what used to be
  an array reference of stuff from find_link(). This replaces
  having to know that $link->[0] was URL and so on. However, since
  WWW::Mechanize::Link is a blessed arrayref, it's backwards
  compatible with existing code.
* The WWW::Mechanize::Link object now tracks what tag the link
  came from (, or