libxml-ruby 0.8.0 Released: Ruby Gets Fast, Reliable XML Processing At Last
Ruby is not known for its deftness with XML. On RubyFlow, I considered calling the community to arms over it, and solicited twenty responses on what the problem is and what we could do about it. Robert Fischer was lamenting the state of Ruby's libxml library, and didn't seem to like REXML much either. Tim Bray has also had a few complaints about REXML. It seemed there was a problem to be fixed; a gap in the market, as it were, for a decent XML parser for Ruby. Hpricot, despite really being an HTML parser, would have to get us by in the meantime.
Today, however, libxml-ruby 0.8.0 has been released, and Charlie Savage explains why this is such a big deal. libxml-ruby now runs on Windows (thanks to Charlie), no longer segfaults all the time, and the bindings have all been fixed over the past year (thanks to Dan Janowski). You can get going with it right now with a simple gem install libxml-ruby.
libxml-ruby is known for its performance, and the latest release doesn't disappoint. For a range of simple tasks, libxml clocks in at ten times quicker than Hpricot like-for-like, and between 30 and 60 times faster than REXML. Charles adds:
In addition to performance, the libxml-ruby bindings provide impressive coverage of libxml's functionality. Goodies include:
- XMLReader (streaming interface)
- XML Schema
- XSLT (split into the libxslt-ruby bindings)
Charles is planning to write a proper tutorial in the next week, covering some of the key features, but suggests referring to the API documentation in the meantime. The test suite (located in the test directory that comes with libxml-ruby) also looks like a great resource for code examples; very clean and straightforward. If you have any libxml-ruby tutorials or resources of your own, please post them in the comments here.
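Until that tutorial appears, here is a minimal sketch of what DOM-style parsing with XPath and the streaming XMLReader interface can look like. It is not taken from the libxml-ruby test suite, and it assumes a recent libxml-ruby release, where the library is loaded with require 'libxml' and lives under the LibXML::XML namespace; the 0.8.0 release may use a different require path and namespace. The sample document and the helper names book_titles and element_names are made up for illustration.

```ruby
# Assumption: a recent libxml-ruby gem is installed (gem install libxml-ruby).
begin
  require 'libxml'
  LIBXML_AVAILABLE = true
rescue LoadError
  LIBXML_AVAILABLE = false
end

# Hypothetical sample document, invented for this sketch.
SAMPLE_XML = <<~XML
  <library>
    <book id="1"><title>Programming Ruby</title></book>
    <book id="2"><title>The Ruby Way</title></book>
  </library>
XML

# DOM-style: parse the whole document into memory, then query it with XPath.
def book_titles(xml)
  doc = LibXML::XML::Parser.string(xml).parse
  doc.find('//book/title').map(&:content)
end

# Streaming: step through the document node by node with XMLReader,
# collecting the name of every opening element tag.
def element_names(xml)
  reader = LibXML::XML::Reader.string(xml)
  names = []
  while reader.read
    names << reader.name if reader.node_type == LibXML::XML::Reader::TYPE_ELEMENT
  end
  names
end

if LIBXML_AVAILABLE
  p book_titles(SAMPLE_XML)
  p element_names(SAMPLE_XML)
else
  warn 'libxml-ruby is not installed; skipping the demo'
end
```

The DOM approach is the more convenient of the two for small documents, while the reader keeps memory use flat on large ones because it never builds the full tree.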
Congratulations to all of those involved in libxml-ruby's long history, and especially to Charlie Savage for giving it the final push to this mature state. Ruby's XML woes are tempered, for now at least.