Sunday, February 18, 2007

Groovy 101: Importing XML in your database

A friend of mine was looking for an easy way to import some XML content in his database. You have many ways to do it. But the easiest for a Java/Groovy developer is to use Groovy, and I have create this small example for him.

Groovy provides really simple solution to parse XML and manipulate your database. The following sample reads an RSS new feed and import the title and link in a table named NEWS that contains two columns TITLE and LINK.


import groovy.sql.Sql;

def rssFeed = "http://www.javablogs.com/ViewDaysBlogs.action?view=rss";
def xmlFeed      = new XmlParser().parse(rssFeed);

def sql = Sql.newInstance("jdbc:oracle:thin:@//tgrall-linux:1521/XE",
                         "GROOVY",
                         "GROOVY",
                         "oracle.jdbc.driver.OracleDriver")
def set = sql.dataSet("NEWS");

(0..< xmlFeed.channel.item.size()).each {
   def item = xmlFeed.channel.item.get(it);
   def title = item.title.value[0];
   def link = item.link.value[0];
   println("Importing $title ...");
   set.add(TITLE: title, LINK: link);
}

First I create a Groovy SQL object and a DataSet to manipulate my data. sql.dataSet("NEWS"). Do not forget, if like me you are using Oracle database, to add the Oracle JDBC driver to your classpath ;-)

Then I create a loop on each items of the RSS feed I am using: (0..> xmlFeed.channel.item.size()).each {...}. As you see, Groovy XML help me to parse, and navigate the XML document.

Like any Groovy iterator you have access to an implicit object available in the loop "it", so I can get the item using the Groovy XML : xmlFeed.channel.item.get(it)

Then you can get the different values you want of the element.Usingthe dataset.add method, you can insert them in the table.This is done using the value pairs notation column:value, this looks like: set.add(TITLE: title, LINK: link)

4 comments:

Brent said...

How's the perfomance of this approach? I'm using FeedTools in ruby right now and it's fairly slow.

Anonymous said...

The loop could be little more Groovy-idiomatic:

xmlFeed.channel.item.each { item ->
def title = item.title.value[0]
def link = item.link.value[0]
println("Importing $title ...")
}

or

for (item in xmlFeed.channel.item) {
...
}

keep groovin'
Dierk

Anonymous said...

NOTE: The value[0] is not working for me. It might be an old groovy thing. If you have issues, use the following code, which uses text(). Was pulling my hair out trying to figure out why it wasn't working.

xmlFeed.channel.item.each { item ->
def title = item.title.text()
def link = item.link.text()
}

Steve said...

Cool!
Is it (0..> xmlFeed.channel.item.size()).each
or (0..< xmlFeed.channel.item.size()).each ? You have it both ways :)