A friend of mine was looking for an easy way to import some XML content in his database. You have many ways to do it. But the easiest for a Java/Groovy developer is to use Groovy, and I have create this small example for him.
Groovy provides really simple solution to parse XML and manipulate your database. The following sample reads an RSS new feed and import the title and link in a table named NEWS that contains two columns TITLE and LINK.
import groovy.sql.Sql; def rssFeed = "http://www.javablogs.com/ViewDaysBlogs.action?view=rss"; def xmlFeed = new XmlParser().parse(rssFeed); def sql = Sql.newInstance("jdbc:oracle:thin:@//tgrall-linux:1521/XE", "GROOVY", "GROOVY", "oracle.jdbc.driver.OracleDriver") def set = sql.dataSet("NEWS"); (0..< xmlFeed.channel.item.size()).each { def item = xmlFeed.channel.item.get(it); def title = item.title.value[0]; def link = item.link.value[0]; println("Importing $title ..."); set.add(TITLE: title, LINK: link); }
First I create a Groovy SQL object and a DataSet to manipulate my data. sql.dataSet("NEWS"). Do not forget, if like me you are using Oracle database, to add the Oracle JDBC driver to your classpath ;-)
Then I create a loop on each items of the RSS feed I am using: (0..> xmlFeed.channel.item.size()).each {...}. As you see, Groovy XML help me to parse, and navigate the XML document.
Like any Groovy iterator you have access to an implicit object available in the loop "it", so I can get the item using the Groovy XML : xmlFeed.channel.item.get(it)
Then you can get the different values you want of the
How's the perfomance of this approach? I'm using FeedTools in ruby right now and it's fairly slow.
ReplyDeleteThe loop could be little more Groovy-idiomatic:
ReplyDeletexmlFeed.channel.item.each { item ->
def title = item.title.value[0]
def link = item.link.value[0]
println("Importing $title ...")
}
or
for (item in xmlFeed.channel.item) {
...
}
keep groovin'
Dierk
NOTE: The value[0] is not working for me. It might be an old groovy thing. If you have issues, use the following code, which uses text(). Was pulling my hair out trying to figure out why it wasn't working.
ReplyDeletexmlFeed.channel.item.each { item ->
def title = item.title.text()
def link = item.link.text()
}
Cool!
ReplyDeleteIs it (0..> xmlFeed.channel.item.size()).each
or (0..< xmlFeed.channel.item.size()).each ? You have it both ways :)