Monday, December 10, 2007

Working on a large XML or SOA project: think about "separation of concerns"

With XML and SOA becoming mainstream in the enterprise, XML operations such as schema validation and XSL transformation are now very common. These operations are CPU intensive and can become a performance bottleneck when performed directly in the middleware. It can be even worse with SOAP-based Web Services and their related WS-* standards. For example, with WS-Security, XML encryption and signature are used more and more in SOA-based applications.

This is why many enterprise architects are now looking for solutions to improve the performance of XML-centric applications.

One of the things we learn when developing applications, and that Aspect Oriented Programming has highlighted, is the concept of “separation of concerns”. It is key to keep it in mind in the global architecture too, in our case by separating the XML processing from the business logic. Fortunately, most of the time this is done for you by the Web Services framework you are using: you do not code the SOAP request/response yourself, it is hidden by the framework.
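At the code level, the separation can be sketched as follows. This is only a minimal illustration in Python using the standard library; the `<order>`/`<line>` message format and the function names `order_total` and `handle_order_xml` are hypothetical and not taken from any particular Web Services framework.

```python
import xml.etree.ElementTree as ET


def order_total(lines):
    """Business logic: works on plain values and knows nothing about XML."""
    return sum(qty * unit_price for qty, unit_price in lines)


def handle_order_xml(raw: str) -> str:
    """XML concern: parse the request, call the business logic, serialize the reply."""
    order = ET.fromstring(raw)
    # Hypothetical message format: <order><line qty="..." price="..."/></order>
    lines = [
        (int(line.get("qty")), float(line.get("price")))
        for line in order.findall("line")
    ]
    reply = ET.Element("total")
    reply.text = f"{order_total(lines):.2f}"
    return ET.tostring(reply, encoding="unicode")
```

Because `order_total` never touches XML, the parsing and serialization in `handle_order_xml` could be moved elsewhere without changing the business code.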

However, in current application servers, all of the XML processing is done directly in the container. For example, XML Encryption is performed in the same container where the pure business logic is executed. So let’s find a solution to extract the most intensive XML processing into another part of the system.

Vendors now have appliances in their catalogs that can do the job. In the same way that we use SSL accelerators today to deal with SSL encryption/decryption, we can use an XML appliance to handle the CPU-intensive operations: XML validation, transformation, acting as a WS-Security enforcement point, ...

Architecture Overview
The overall architecture can be represented using the following diagram:

The role of the XML/SOA appliance varies a lot depending on your project:

  • Simple XML firewall: check the validity of the XML/SOAP messages.
  • Web Services access control: look up the enterprise directory to check authentication and authorization. This can be based on the WS-Security standards and their various tokens (username, SAML, ...).
  • Content generation and transformation: the appliance can be used to serve various devices, for example a WAP cell phone or a simple HTML Web client. The XSL transformation is done very efficiently, directly in the appliance.
  • Services virtualization: messages can be routed to various endpoints depending on simple rules (business or IT system rules).
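To make the firewall and virtualization roles more concrete, here is a minimal sketch in Python (standard library only) of what such a gateway does conceptually: reject messages that are too large or not well-formed, then pick an endpoint from the first element of the SOAP body. The size limit, the routing table, the `urn:example:*` namespaces, the endpoint URLs and the function name `check_and_route` are all assumptions made for illustration. A real appliance does far more, including schema validation and protection against malicious XML, which `xml.etree.ElementTree` does not provide.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
MAX_BYTES = 64 * 1024  # hypothetical size limit enforced by the "firewall"

# Hypothetical routing table: qualified name of the operation -> back-end endpoint
ROUTES = {
    "{urn:example:orders}PlaceOrder": "http://orders.internal.example/ws",
    "{urn:example:billing}GetInvoice": "http://billing.internal.example/ws",
}


class RejectedMessage(Exception):
    """Raised when a message is refused before reaching any back end."""


def check_and_route(raw: bytes, routes=ROUTES) -> str:
    """Return the endpoint for a SOAP message, or raise RejectedMessage."""
    if len(raw) > MAX_BYTES:
        raise RejectedMessage("message too large")
    try:
        envelope = ET.fromstring(raw)
    except ET.ParseError as exc:
        raise RejectedMessage(f"not well-formed XML: {exc}") from exc
    if envelope.tag != f"{{{SOAP_NS}}}Envelope":
        raise RejectedMessage("not a SOAP 1.1 envelope")
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise RejectedMessage("missing or empty SOAP body")
    operation = body[0].tag  # e.g. "{urn:example:orders}PlaceOrder"
    if operation not in routes:
        raise RejectedMessage(f"no route for {operation}")
    return routes[operation]
```

The back ends only ever see messages that passed these checks, and the routing rules live in one place instead of being duplicated in every service.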

As you can see, from an architecture point of view XML appliances are an interesting way to offload heavy XML processing to dedicated hardware. I have noticed that developers and architects sometimes hesitate to add another piece of hardware or software to their design, but I do think that in this specific case it is probably a good move.

Separating the concerns is quite easy and very clean when dealing with XML processing, and it also makes the overall architecture easier to manage. This kind of appliance allows administrators to centralize the management of policies and transformations. A useful side effect, when dealing with Web Services, is that you can easily add WS-* support in front of the many stacks that do not support these standards themselves.

XML/SOA Appliance Offerings
I said earlier that vendors are offering such products; here are some of the products that I have come across or promoted:

What's next?
Some of you will probably point out that the application server, especially when dealing with Web Services, must still parse the XML/SOAP request even if the appliance has already done so. That is true, but I am sure that in the near future the vendors of such solutions will optimize this, for example by supporting binary XML or some other approach that further improves overall performance in complex enterprise architectures. For this to work, application servers must support binary XML first, to avoid proprietary approaches.

Another aspect that I have not talked about is the possible use of such appliances for Web 2.0/Ajax optimization. I have not yet dived into this, but I am sure we can do very interesting things there too.

Finally, if you have experience with any XML/SOA appliance, feel free to post a comment about it. It will help readers gauge the interest (or lack of it) in this topic.


Marco Gralike said...

Maybe an idea to let the database server do a part of the job as well (Oracle: NDWS / binary storage / validation)?


Tug said...


I would say it depends.

If your application is database centric and all the data that you have to manage and transform, including the XSL, is in the DB, it makes sense...

But how many applications today put everything in the DB? Fewer and fewer... (at least in the projects I work on)

As for Web Services, it is clear to me that it cannot be the DB. NDWS (Native Database Web Services), available in the new 11 release of the DB, is powerful and useful, but:
- as far as I know it does not support WS-*, and especially not WS-Security and WS-Policy (the appliance will allow you to add such standards to your NDWS).

- quite often, when dealing with large SOA/services-based applications, authentication and authorization are externalized to another system (LDAP/IdM), so the gain from putting more things in the DB is partially lost.

- what about reusability/portability? One of the key points of these appliances is that they are "implementation agnostic", so they can transform and validate XML and WS independently of the source of the XML/SOAP (NDWS, Java, .Net, PHP, Ruby, ...), and they do it in a centralized way.

Another important and increasingly common use of these appliances is as a router for XML requests, as ESBs sometimes do, and using the DB for this will probably be very expensive.

Do not get me wrong: I am not saying that putting XML/WS in the DB is a bad thing, but I cannot imagine using a DB as the main processing point for XML (for all the XML of a large application/infrastructure).

I do not want to go into performance and scalability in detail, but one of the ideas behind these appliances is that they are built to do one thing (XML) and one thing only... and they do it well and fast.

btw, do you have experience with NDWS? I have not used it yet, and I am impatient to try it and compare it with the "standard" DB Web Services provided by Oracle Application Server (performance, interoperability with other stacks and the SOA Suite, etc.).


Anonymous said...

Marco -

I think you hit the appliance positioning dead-on. Another way used to describe this general trend is "decoupling". More specifically, the decoupling of expensive processing such as XML parsing, validation, security, content routing and even non-XML processing to a purpose-built device. More and more I'm also seeing the move to include *some* ESB specific features in a device like this, such as workflow processing (WS-BPEL), some application logic, *some* conversion (non-XML to XML) as well as database access.

However, I think that maybe these XML appliances have had their day in the sun. It's a great idea, but the challenge has always been over ownership. As you said, it's simply another device in the network that needs management and provisioning. It's a curious situation because the network guys don't own the box and the application guys don't own the box - it's sort of a Frankenstein. You need some stakeholder in the enterprise who wears three hats: (a) Security Architect, (b) Network Architect, (c) Application Architect.

There is another chapter in the XML appliance story and that seems to be *software* versions of these appliances that have the same or better performance as the firmware. This definitely helps with the ownership / perception issues and I hope to see more of these actually deployed in a software form factor.

One vendor I know in particular is Intel. They have a product called SOA Expressway designed for this - the main advantage is that it is (a) cheaper, (b) extensible, (c) just as performant, and most importantly it runs on a general purpose server so you can either repurpose the hardware or upgrade the server easily. There is a video on youtube here:

There is more info from Intel here: It will be interesting to see if DataPower goes this route. I also know that Vordel has a software version of its gateway, but I don't think it has the same performance as Intel.