Tuesday, November 13, 2012

Building a chat application using Node.js and Couchbase

After some basic articles about Couchbase installation and Node.js integration, let's now dive into a more complete example: a chat application.

The first version of the chat should be compliant with the following requirements:

  • web based
  • single room
  • user just needs to enter a login to start interacting with other connected users
  • user should be able to navigate into the chat history

The Couchbase Chat application is built using the following components:
  • Node.js for the application
  • Couchbase to persist all the messages

I won't go into all the details of the design of the Node.js application; you can find many examples of Node-based chat applications. I prefer to focus on how I designed the persistence with Couchbase rather than on the application itself. If you want more detail about the complete application, feel free to drop me a message or comment and I will provide it.

What are the challenges with persisting the messages?
Storing the information is quite easy: just "dump" the message into your database. The challenge is that users want to access the message history, so the key point is how to store the information in a way that makes it easy to read back in sorted order.

You will find many different ways of achieving this, depending on the technology you are using and the query capabilities of your persistence engine. With Couchbase you have two ways to access the data:

  • Using views, which let you query a secondary index and perform advanced operations such as sorting, key range queries, ...
  • Accessing the data directly by its key

In this post I will show how you can use both options to build your application and retrieve stored information in a specific order:

  • First option: using a view to get the message history
  • Second option: using a counter as a key for the messages

The source code of the application is available on GitHub: https://github.com/tgrall/couchbase-chat


Get the Couchbase connection

The following code connects to Couchbase; once the connection is established, the web server is started:

var express = require('express');
var app = express();
var http = require('http');
var server = http.createServer(app);
var io = require('socket.io').listen(server);

var driver = require('couchbase');

driver.connect({
 "username": "",
 "password": "",
 "hostname": "localhost:8091",
 "bucket": "default"}, 
 function(err, couchbase) {
  if (err) {
   throw (err)
  }

  server.listen(8080);

  app.get('/', function(req, res) {
   res.sendfile(__dirname + '/index.html');
  });
...
// Application code
// Socket.io events
...
});

Let's now see how Couchbase is used in the chat application.

First Option : Using views to get the message history

Post a new message
In this example messages are formatted using the following information:

{
  "type": "message",
  "user": "Tug",
  "message": "Hello all !",
  "timestamp": 1349836768909
}

The key is based on the timestamp and the user name: 1349836768909-Tug. I add the user name to make the key unique, so I do not have to manage conflicts.
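
As a minimal sketch of this key scheme (the buildMessageKey helper below is hypothetical and is not part of the original application), two users posting in the same millisecond still get different keys. Note that this only holds as long as a single user does not post twice within the same millisecond.

```javascript
// Hypothetical helper: builds the document key described above
// ("<timestamp>-<user>"). Not part of the original application.
function buildMessageKey(message) {
  return message.timestamp + '-' + message.user;
}

// Two users posting in the same millisecond still get distinct keys.
var sameInstant = 1349836768909;
var keyA = buildMessageKey({ user: 'Tug', timestamp: sameInstant });
var keyB = buildMessageKey({ user: 'John', timestamp: sameInstant });

console.log(keyA);          // 1349836768909-Tug
console.log(keyB);          // 1349836768909-John
console.log(keyA === keyB); // false
```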

The message is inserted as follows:

  socket.on('postMessage', function(data) {
    // create a new message
    var message = {
      type: "message",
      user: socket.username,
      message: data,
      timestamp: Date.now()
    }
    var messageKey = message.timestamp +"-"+ message.user;
    io.sockets.emit('updateChatWindow', message);
    couchbase.set(messageKey, JSON.stringify(message),function(err) {  }); 
  });

  • The postMessage event is sent by the client when the user posts a new message.
  • A new message object is created with a type, the user, the message itself and a timestamp.
  • The message is sent to the different clients using the io.sockets.emit() function (line 10).
  • Finally the message is saved into Couchbase (line 11). As you can see, the only thing you have to do is send the JavaScript object as a simple JSON string.
At this point your application works: all the connected users see new messages, since the server sends them as soon as they are created. But a user cannot yet navigate the chat history and see older messages.


Retrieve messages from Couchbase

As explained earlier, a view can be used to retrieve the messages from the database in the proper order. The view looks like this:

function (doc, meta) { 
  if ( meta.type == "json" && doc.type == "message" ) {
   emit(doc.timestamp, null);
  }
}

Each time a new document is inserted in the database, the index is updated if the document is JSON and its type is "message". When this view is called, each row of the result looks like:

{"id":"1352733392477-JOHN","key":1352733392477,"value":null},

As you can see the id of the document (timestamp-username) is automatically inserted in the response.
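
To make the shape of these rows more concrete, here is a small sketch that runs the same map function outside Couchbase. The buildIndex function is a hypothetical in-memory stand-in for the indexer, and passing emit as a third argument is an assumption made only so that the function can run on its own; inside Couchbase, emit is provided by the server. The sketch sorts by key only, which is enough to show how the id is attached to each row.

```javascript
// The map function from the view, unchanged except that emit() is passed in
// so it can run outside Couchbase (an assumption made for this sketch only).
function mapMessage(doc, meta, emit) {
  if (meta.type == 'json' && doc.type == 'message') {
    emit(doc.timestamp, null);
  }
}

// Hypothetical in-memory stand-in for the indexer: applies the map function
// to every document and returns the rows sorted by key, like a view result.
function buildIndex(docsById) {
  var rows = [];
  Object.keys(docsById).forEach(function (id) {
    mapMessage(docsById[id], { id: id, type: 'json' }, function (key, value) {
      rows.push({ id: id, key: key, value: value });
    });
  });
  return rows.sort(function (a, b) { return a.key - b.key; });
}

var indexRows = buildIndex({
  '1352733392999-TUG': { type: 'message', user: 'TUG', message: 'Hi', timestamp: 1352733392999 },
  '1352733392477-JOHN': { type: 'message', user: 'JOHN', message: 'Hello', timestamp: 1352733392477 },
  'chat:config': { type: 'settings' } // ignored by the map function: not a message
});

console.log(JSON.stringify(indexRows[0]));
// {"id":"1352733392477-JOHN","key":1352733392477,"value":null}
```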


You can use the following command to insert the view in your Couchbase Server (configure the server address, port and bucket according to your environment):
curl -X PUT -H 'Content-Type: application/json' http://127.0.0.1:8092/default/_design/chat -d '{"views":{"message_hisory":{"map":"function (doc, meta) {\n  \n  if ( meta.type == \"json\" && doc.type == \"message\" ) {\n   emit(doc.timestamp, null);\n  }\n}"}}}'



The application now calls the view using the following code:

socket.on('showHistory', function(limit,startkey) {
  limit = (limit == undefined || limit == 0)? 5 : limit;
  var options = {"descending": "true", "limit" : limit, "stale" : "false"};
  if (startkey > 0) {
    options.startkey = startkey-1;
  }
  couchbase.view("chat","message_hisory", options , function(err, resp, view) {
    var rows = view.rows;
    var keys = new Array();
    for( var i = 0; i < rows.length ; i++  ) {
      keys.push( rows[i].id );
    }
    couchbase.get(keys,function(err, doc, meta) {
      socket.emit('updateChatWindow', doc, true);
    });
  });
});

When the client sends a showHistory event, the application captures it and calls the view with the proper parameters to send the list of messages back to the client.

The options object contains the different parameters that will be used to call the view:
  • Use descending order to return the messages from the newest to the oldest
  • The number of messages to return (limit)
  • Ask the view to update the index before returning the rows, using the stale=false parameter
  • Use the startkey parameter if the client sends a specific starting point
On line 7, the view "message_hisory" of the "chat" design document (spelled as in the curl command above) is called using the Node.js SDK, with the options object.
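
The option-building logic can be read in isolation. The sketch below extracts it into a hypothetical buildHistoryOptions helper (the name is mine, it does not exist in the application) so that the effect of each parameter is visible without a running server.

```javascript
// Hypothetical helper that isolates the option-building logic of the
// showHistory handler so it can be read (and tested) on its own.
function buildHistoryOptions(limit, startkey) {
  var options = {
    descending: 'true',                                  // newest messages first
    limit: (limit == undefined || limit == 0) ? 5 : limit, // default: 5 messages
    stale: 'false'                                       // update the index before returning rows
  };
  if (startkey > 0) {
    // Start just below the oldest timestamp the client already has.
    options.startkey = startkey - 1;
  }
  return options;
}

console.log(JSON.stringify(buildHistoryOptions()));
// {"descending":"true","limit":5,"stale":"false"}
console.log(JSON.stringify(buildHistoryOptions(10, 1352733392477)));
// {"descending":"true","limit":10,"stale":"false","startkey":1352733392476}
```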

In the callback function, the application creates an array containing the document ids (the keys of the documents themselves); then, on line 13, the messages are retrieved from Couchbase using the get() function. (Note: this function may have a small issue when multiple messages are sent in the same millisecond and sit right on the edge of the offset.)
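
The edge case mentioned in the note can be reproduced with a small in-memory simulation. The queryDescending function below is a hypothetical stand-in for a descending view query with startkey and limit; it is not the Couchbase API, and the timestamps are made up for illustration. Two messages share the same timestamp, and the page boundary falls between them.

```javascript
// Hypothetical in-memory stand-in for a descending view query with
// "startkey" and "limit"; rows are { id, key } objects as a view returns them.
function queryDescending(rows, limit, startkey) {
  return rows
    .slice()
    .sort(function (a, b) {
      // Descending by key, then descending by id for identical keys.
      if (b.key !== a.key) { return b.key - a.key; }
      return a.id < b.id ? 1 : (a.id > b.id ? -1 : 0);
    })
    .filter(function (row) { return startkey === undefined || row.key <= startkey; })
    .slice(0, limit);
}

var viewRows = [
  { id: '1000-TUG', key: 1000 },
  { id: '2000-JOHN', key: 2000 },
  { id: '2000-TUG', key: 2000 },  // same millisecond as the row above
  { id: '3000-JOHN', key: 3000 }
];

// First page: the two newest messages.
var page1 = queryDescending(viewRows, 2);
// Next page starts at (oldest key of page 1) - 1, as in the handler above.
var oldestKey = page1[page1.length - 1].key;
var page2 = queryDescending(viewRows, 2, oldestKey - 1);

var page1Ids = page1.map(function (r) { return r.id; });
var page2Ids = page2.map(function (r) { return r.id; });
console.log(page1Ids); // [ '3000-JOHN', '2000-TUG' ]
console.log(page2Ids); // [ '1000-TUG' ]
```

In this simulation the message 2000-JOHN never reaches the client: it did not fit in the first page, and the second page starts below its timestamp.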

An interesting point to note: the view is used only to return the list of keys, and the documents are then fetched with a single multi-get call on those keys. Most of the time this is better than returning too much data from the view.

In this first option, the application uses a view to get the message history. This works well, but keep in mind that this approach relies on an index, and indexes are stored on disk. You need to be sure that the message is saved and the index updated before the message is displayed in the history, which is why stale=false is required in this specific scenario.


Second Option : Using a counter as document Key

Let's now see how, with a few changes in the application, you can do the same without a view. With this approach the application uses only the keys, which are all held in the memory of the server (memcached).

The application logic stays the same:
  1. When user connects to the server the system returns the last 5 messages from the database
  2. Each time the user posts a message it should be persisted
  3. The user can manually load older messages from the database to view the complete chat history

Post a new message
The key associated with the message is now a counter, and the application uses the increment feature of Couchbase:
socket.on('postMessage', function(data) {
  // create a new message
  var message = {
    type: "message",
    user: socket.username,
    message: data,
    timestamp: Date.now()
  }
  couchbase.incr("chat:msg_count", function (data, error, key, cas, value ) { 
    var messageKey = "chat:"+ value;
    message.id = value;
    io.sockets.emit('updateChatWindow', message);
    couchbase.set(messageKey, JSON.stringify(message),function(err) {  }); 
    });
});

Once the message object is created (line 3), the application increments a value chat:msg_count that will be used as message counter (line 9). Note that the Node Couchbase SDK will automatically create the key if it is not present when the incr() method is called.

When the server has returned the new value (incremented by 1, with a default value of 0), the callback function is called:

  • The value is used to create a new key for the message (line 10)
  • The message is pushed to the different users (line 12)
  • Then the message is saved into Couchbase (line 13)


So here is what we have:
  • a new item that contains the counter, associated to the key : chat:msg_count
  • each message will have a key that looks like chat:0, chat:1, chat:2, ... 
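
The resulting key sequence can be sketched without a server. FakeStore below is a hypothetical in-memory stand-in for the two operations used by the handler; it is not the Couchbase SDK, and its callbacks follow the plain (err, value) Node.js convention, which differs from the SDK callback shown above. The default value of 0 for a missing counter follows the description in this post.

```javascript
// Hypothetical in-memory stand-in for the two Couchbase operations used
// above. It is NOT the SDK: callbacks here follow the plain (err, value)
// Node.js convention to keep the sketch short.
function FakeStore() {
  this.items = {};
}
// Increment a counter; a missing key is created with the default value 0.
FakeStore.prototype.incr = function (key, callback) {
  this.items[key] = (key in this.items) ? this.items[key] + 1 : 0;
  callback(null, this.items[key]);
};
// Store a value under a key.
FakeStore.prototype.set = function (key, value, callback) {
  this.items[key] = value;
  callback(null);
};

// Same flow as the postMessage handler: get the next counter value first,
// then store the message under "chat:<value>".
function postMessage(store, user, text, done) {
  var message = { type: 'message', user: user, message: text, timestamp: Date.now() };
  store.incr('chat:msg_count', function (err, value) {
    if (err) { return done(err); }
    message.id = value;
    store.set('chat:' + value, JSON.stringify(message), function (err) {
      done(err, 'chat:' + value);
    });
  });
}

var store = new FakeStore();
var createdKeys = [];
['Hello all !', 'Hi Tug', 'Anyone here?'].forEach(function (text) {
  postMessage(store, 'Tug', text, function (err, key) { createdKeys.push(key); });
});
console.log(createdKeys); // [ 'chat:0', 'chat:1', 'chat:2' ]
```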

Retrieve messages from Couchbase
Retrieving the older messages from Couchbase is very easy, since every message has a unique and sequential id. The showHistory handler just needs to build a list of keys from the right range of numbers and get them from Couchbase.

socket.on('showHistory', function(limit,startkey) {
  var keys = new Array();
  for (var i = startkey; i > (startkey-limit) && i >= 0 ; i--) {
    keys.push("chat:"+i);
  }
  couchbase.get(keys,function(err, doc, meta) {
    socket.emit('updateChatWindow', doc, true);
  });
});

Lines 3-5 create an array of keys; on line 6 this array is used to do a multiple get and send the messages to the client using socket.emit.

The logic is almost the same as in the previous example. The only difference is that we do not call the Couchbase server to build the list of keys used to print the message history.
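
The key computation is a pure function of startkey and limit, so it can be checked on its own. The historyKeys helper below is hypothetical; it simply wraps the loop from the handler above.

```javascript
// Hypothetical helper wrapping the loop of the showHistory handler:
// returns up to "limit" keys, counting down from "startkey" and never
// going below chat:0.
function historyKeys(startkey, limit) {
  var keys = [];
  for (var i = startkey; i > (startkey - limit) && i >= 0; i--) {
    keys.push('chat:' + i);
  }
  return keys;
}

console.log(historyKeys(7, 5)); // [ 'chat:7', 'chat:6', 'chat:5', 'chat:4', 'chat:3' ]
console.log(historyKeys(2, 5)); // [ 'chat:2', 'chat:1', 'chat:0' ]
```

To load the next page, a client would presumably send the lowest id it has already displayed minus one as the new startkey (3 - 1 = 2 in the example above); this is an assumption about the client side, which is not shown in this post.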

Conclusion

As you can see, when working with a NoSQL database, as with any other persistence store, you often have different ways of achieving the same thing. In this example I used two approaches: one using a view, the other using the key directly.

The important thing here is to take some time when designing your application to see which approach is best for it. For this chat application I would probably stay with the "Key/Counter" approach, which should be the most efficient in terms of performance and scalability since it does not use a secondary index.







7 comments:

Estelle et David said...

Nice post. Thanks for sharing.

Moon said...

which protocol we have to use for couchbase chat application....

Tug Grall said...

Vikas,

This "chat" application is a simple Web application, the protocol is HTTP, a Websocket using SocketIO to be exact

Regards
Tug

Unknown said...

any idea how to develop chat application using XMPP SERVER and couchbase as a database

Gagan Naidu .C said...

i need to develop chat application for my social networking site , like facebook chat...and i need to provide online friends,contact list features and also chat history should store in couchbase database ..please give me any idea how to do this ?

Anonymous said...

Is the code on GitHub complete?
I just get a static HTML page and the Node console says "serving static content".

Tug Grall said...

Yes it is complete. That said I need to update it to the latest Couchbase Node.js SDK (1.0.0)