The HTML-Parser distribution is a collection of modules that parse
and extract information from HTML documents. The modules present in
this collection are:
  HTML::Parser - The parser base class. It receives arbitrarily sized
        chunks of the HTML text, recognizes markup elements, and
        separates them from the plain text. As different kinds of markup
        and text are recognized, the corresponding event handlers are
        invoked.

  HTML::Entities - Provides functions to encode and decode text with
        embedded HTML <entities>.

  HTML::HeadParser - A lightweight HTML::Parser subclass that extracts
        information from the <HEAD> section of an HTML document.

  HTML::LinkExtor - An HTML::Parser subclass that extracts links from
        an HTML document.

  HTML::PullParser - An alternative interface to the basic parser
        that does not require event-driven programming.

  HTML::TokeParser - An HTML::PullParser subclass with fixed
        token setup and methods for extracting text. Many simple
        parsing needs are probably best handled with this module.
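
As a rough illustration of how the modules above differ in use, here is a
minimal sketch. The sample document and all variable names are made up for
this example; only the module calls come from the distribution itself. It
encodes and decodes entities with HTML::Entities, pulls a title and links
with HTML::TokeParser, and collects start tags with an event handler on
HTML::Parser.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);
use HTML::Parser;
use HTML::TokeParser;

# Hypothetical sample document, invented for this illustration.
my $html = <<'HTML';
<html><head><title>Tea &amp; cake</title></head>
<body><a href="/tea">Green tea</a> and <a href="/cake">Cake &amp; cream</a></body></html>
HTML

# HTML::Entities: convert between plain text and entity-encoded text.
my $encoded = encode_entities("a < b & c");
my $decoded = decode_entities("&lt;p&gt;");

# HTML::TokeParser: ask for tokens on demand, no handlers needed.
my $p = HTML::TokeParser->new(\$html) or die "Cannot set up parser";

my $title = '';
if ($p->get_tag("title")) {
    # get_trimmed_text decodes entities and trims whitespace.
    $title = $p->get_trimmed_text("/title");
}

my @links;
while (my $tag = $p->get_tag("a")) {
    my $href = $tag->[1]{href};          # attribute hash is element 1
    next unless defined $href;
    push @links, [ $href, $p->get_trimmed_text("/a") ];
}

# HTML::Parser: event-driven style, a handler is invoked per start tag.
my @start_tags;
my $ep = HTML::Parser->new(
    api_version => 3,
    start_h     => [ sub { push @start_tags, shift }, "tagname" ],
);
$ep->parse($html);
$ep->eof;

print "Encoded: $encoded\n";
print "Decoded: $decoded\n";
print "Title:   $title\n";
print "Link:    $_->[0] => $_->[1]\n" for @links;
print "Tags:    @start_tags\n";
```

The pull style keeps the control flow in the calling code, which is why it
tends to suit small extraction tasks; the handler style is the more general
one and is what the subclasses are built on.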
In addition, take a look at the HTML-Tree package, which builds on
HTML::Parser to create and extract information from HTML syntax trees
(similar to the HTML DOM).
In order to install and use this package you will need Perl version
5.6 or better. The HTML::Tagset module should be installed.
If you intend to use the HTML::HeadParser you probably want to install
Just follow the usual procedure:

   perl Makefile.PL
   make
   make test
   make install
Bug reports and issues for discussion about these modules can be sent
to the <libwww@perl.org> mailing list.
© 1995-2007 Gisle Aas. All rights reserved.
© 1999-2000 Michael A. Chase. All rights reserved.
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.