Options for HTML scraping?

I'm thinking of trying Beautiful Soup, a Python package for HTML scraping. Are there any other HTML scraping packages I should be looking at? Python is not a requirement; I'm interested in hearing about other languages as well.

The story so far:

Python: Beautiful Soup, lxml, HTQL, Scrapy, Mechanize
Ruby: Nokogiri, Hpricot, Mechanize, scrAPI, scRUBYt!, wombat, Watir
.NET: Html Agility Pack, WatiN
Perl: WWW::Mechanize, Web-Scraper
Java: Tag Soup, HtmlUnit, Web-Harvest, jARVEST, jsoup, Jericho HTML Parser
JavaScript: request, cheerio, artoo
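
For reference, here is a minimal sketch of what the Beautiful Soup option looks like in practice, which may help when comparing it against the other libraries. It assumes the `bs4` package is installed and uses Python's built-in `html.parser` backend. The sample HTML and the `extract_links` helper are made up for illustration and do not come from any real site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, made up for illustration only.
SAMPLE_HTML = """
<html><body>
  <h1>Example page</h1>
  <ul>
    <li><a href="/first">First link</a></li>
    <li><a href="/second">Second link</a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href attribute."""
    # "html.parser" is the parser bundled with Python; lxml could be swapped in.
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

In a real scraper the HTML would come from an HTTP response instead of a string literal. The parsing step would probably look much the same.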