scRUBYt – Hot, New Ruby Web-Scraping Toolkit Released
For the past few months Peter Szinek has been giving me lots of tasty tidbits about his forthcoming scRUBYt Web-scraping toolkit, and now it has finally been fully released to the public! Peter describes scRUBYt as "WWW::Mechanize and Hpricot on Steroids" and this description is pretty bang on.
As well as providing a simple DSL for performing Web actions (clicking links, submitting forms, etc.), one of scRUBYt's most impressive features is that you can provide it with 'example' data from which it will extrapolate a search pattern and then find any other similar data within the same page. This is demonstrated perfectly by Peter's basic example:
ebay_data = Scrubyt::Extractor.define do
  fetch 'http://www.ebay.com/'
  fill_textfield 'satitle', 'ipod'
  submit
  click_link 'Apple iPod'

  record do
    item_name 'APPLE NEW IPOD MINI 6GB MP3 PLAYER SILVER'
    price '$71.99'
  end
  next_page 'Next >', :limit => 5
end
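This code goes to ebay.com, searches for "ipod", follows the 'Apple iPod' link, and then extracts every record on the page that matches the pattern learned from the single dummy record (an item name and a price). It then follows the 'Next >' link through up to 5 more pages of records, returning them all as an XML dataset.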
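To give a rough feel for what "extrapolating a pattern from an example" can mean, here is a minimal sketch using REXML (assuming it is available, as it is in a standard Ruby install). It is not scRUBYt's actual algorithm, which I haven't dug into. The helper names `generalized_path` and `extract_similar`, and the sample markup, are made up for illustration. The idea: find the element containing the example text, work out its path from the document root while ignoring positions, and then grab every element that shares that path.

```ruby
require 'rexml/document'

# Hypothetical helper: build a position-free path such as "/html/body/ul/li/span"
# by walking from the element up to (but not including) the document node.
def generalized_path(element)
  names = []
  node = element
  until node.nil? || node.is_a?(REXML::Document)
    names.unshift(node.name)
    node = node.parent
  end
  '/' + names.join('/')
end

# Hypothetical helper: locate an element whose text equals the example,
# then return the text of every element sharing its generalized path.
def extract_similar(doc, example_text)
  example = REXML::XPath.match(doc, '//*').find do |el|
    el.text.to_s.strip == example_text
  end
  return [] if example.nil?

  REXML::XPath.match(doc, generalized_path(example)).map { |el| el.text.to_s.strip }
end

# Made-up, well-formed sample page standing in for a results listing.
page = <<~'HTML'
  <html><body>
    <ul>
      <li><h3>iPod Mini 6GB</h3><span>$71.99</span></li>
      <li><h3>iPod Nano 2GB</h3><span>$129.00</span></li>
      <li><h3>iPod Shuffle 1GB</h3><span>$49.50</span></li>
    </ul>
  </body></html>
HTML

doc = REXML::Document.new(page)
prices = extract_similar(doc, '$71.99') # one example price in, all prices out
puts prices
```

A real page is messier than this (malformed HTML, sibling elements sharing a tag name, attributes that matter), so a path made of bare tag names will over- or under-match quickly. Treat this as an illustration of the concept only.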