Writing a fast NIO webserver

I wrote a fast and simple web server using New I/O and Kilim microthreads. No locks, no blocking, no channel selector. With a few optimization tricks, it is probably the fastest webserver written in Java.

Channel selector problems

NIO uses a channel selector as its mechanism for dispatching events. It is promoted by Sun, and nearly all NIO-based programs use it. A good tutorial is here.

Well known problems:

A lesser-known problem is that the selector simply dies under heavy load. Consider this example: fetching a 3 KB file for 5 seconds. The webserver is Grizzly, which uses a selector internally. The total score is 12622 fetches in 5 seconds, i.e. 2524.39 fetches/sec.

jan@artemis:~$ http_load  -parallel 200 -second 5 urlg.txt 
12622 fetches, 166 max parallel, 4.5111e+07 bytes, in 5.00003 seconds
3574 mean bytes/connection
2524.39 fetches/sec, 9.02215e+06 bytes/sec
msecs/connect: 0.862625 mean, 146.231 max, 0.043 min
msecs/first-response: 4.82321 mean, 306.221 max, 2.067 min
HTTP response codes:
  code 200 -- 12622

And now the same test, but longer: 50 seconds.

28233 fetches, 179 max parallel, 1.00905e+08 bytes, in 50.0007 seconds
3574 mean bytes/connection
564.652 fetches/sec, 2.01807e+06 bytes/sec
msecs/connect: 9.74712 mean, 3228.84 max, 0.04 min
msecs/first-response: 5.32603 mean, 3000.25 max, 2.137 min
HTTP response codes:
  code 200 -- 28233

The resulting score is 564.652 fetches/sec. The webserver simply gave up and started serving connections much more slowly after a few seconds under heavy load.
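
The size of the slowdown can be worked out from the two http_load reports. This is only a small sketch of the arithmetic; the class and method names are made up for illustration, and the figures are copied from the reports above.

```java
final class ThroughputCheck {
    /** Fetches per second, given the total fetch count and elapsed seconds. */
    static double fetchesPerSecond(long fetches, double seconds) {
        return fetches / seconds;
    }

    public static void main(String[] args) {
        // Numbers taken from the 5 second and 50 second http_load runs.
        double shortRun = fetchesPerSecond(12622, 5.00003);
        double longRun = fetchesPerSecond(28233, 50.0007);
        System.out.printf("short run: %.2f fetches/sec%n", shortRun);
        System.out.printf("long run:  %.2f fetches/sec%n", longRun);
        System.out.printf("slowdown:  %.1fx%n", shortRun / longRun);
    }
}
```

The long run lasted ten times longer but completed only a little more than twice as many fetches, so the average rate dropped to roughly a quarter of the short-run rate.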

I did some testing and I am nearly sure this problem is caused by the Selector.

And the last problem with the Selector? It is hard to use. There are some NIO libraries that should help, but it is still too complicated.

Kilim microthreads

The classical approach is to have one thread per connection. It makes programming much easier and simpler. But threads are very resource hungry. Running thousands of threads at the same time is nearly impossible.

But there is another solution: Kilim microthreads. These are very lightweight threads inspired by Erlang. It is not a problem to run millions of microthreads at the same time. Kilim uses bytecode manipulation and stack restoration. It is definitely not a problem to have one microthread per connection.

Kilim microthreads use cooperative multitasking. Unlike preemptive multitasking, it is not automatic; you must switch manually from your code. Maybe it is not so cool, but it is much faster and simpler. It is also easier to analyse and predict latency. Cooperative multitasking is, for example, used inside the Linux kernel.
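
To make the idea of cooperative multitasking concrete, here is a toy round-robin scheduler in plain Java. It is not how Kilim works internally (Kilim rewrites bytecode so a method can pause in the middle); here each task is simply a callback that does a little work and returns, which plays the role of a manual yield. All names (CooperativeDemo, Step, runAll, counter) are invented for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

final class CooperativeDemo {
    /** One slice of work; returning false means the task is finished. */
    interface Step {
        boolean runUntilYield();
    }

    /** Runs all tasks round-robin on the calling thread until every task finishes. */
    static void runAll(List<Step> tasks) {
        Queue<Step> ready = new ArrayDeque<>(tasks);
        while (!ready.isEmpty()) {
            Step task = ready.poll();
            if (task.runUntilYield()) {
                ready.add(task); // the task yielded, so schedule it again later
            }
        }
    }

    /** A task that logs its name 'times' times, yielding after each entry. */
    static Step counter(String name, int times, List<String> log) {
        int[] done = {0};
        return () -> {
            log.add(name + done[0]);
            done[0]++;
            return done[0] < times;
        };
    }

    /** Runs two tasks and returns the order in which they logged. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        runAll(Arrays.asList(counter("a", 2, log), counter("b", 3, log)));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Switches happen only at the points where a task returns, so no locks are needed around the shared log, and a task that never returns would starve all the others.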

Channels without selector

With one microthread for each connection, it is quite easy to use NIO without a selector.

First we need to listen for and receive new connections:

import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
// Kilim imports (Task, @pausable) omitted

// A Kilim microthread is actually called a 'Task'
class ListenTask extends Task {

// @pausable is for bytecode manipulation,
// the execute method is like run() on a Java thread
// (exception handling omitted for brevity)
@pausable public void execute() {

  //open port 8080
  ServerSocketChannel ssc = ServerSocketChannel.open();
  ssc.socket().bind(new InetSocketAddress(8080));
  //configure it as non-blocking, VERY IMPORTANT!
  ssc.configureBlocking(false);
  //now listen for new connections in an infinite loop
  while (true){
    //try to get a new connection, accept() can also return null
    SocketChannel sc = ssc.accept();
    if(sc == null){
      yield(); //no new connection, switch to the next microthread and wait a little
    } else {
      //create a new task (microthread) which handles it
      SessionTask ct = new SessionTask(sc);
      //and start it
      ct.start();
    }
  }
}
}
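
The key behaviour the listener relies on is that accept() on a non-blocking ServerSocketChannel returns null immediately when no connection is pending. The following self-contained sketch shows this with plain JDK classes only, without Kilim. Thread.yield() is just a stand-in for the microthread yield(), and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class NonBlockingAcceptDemo {
    /** Polls accept() until a connection shows up, yielding between attempts. */
    static SocketChannel acceptPolling(ServerSocketChannel ssc) throws IOException {
        while (true) {
            SocketChannel sc = ssc.accept(); // null if nothing is pending
            if (sc != null) {
                return sc;
            }
            Thread.yield(); // stand-in for Kilim's yield()
        }
    }

    /** Returns true if accept() gave null before a client connected and a channel afterwards. */
    static boolean demo() {
        try (ServerSocketChannel ssc = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            ssc.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            ssc.configureBlocking(false);
            boolean nullBeforeClient = (ssc.accept() == null);
            try (SocketChannel client = SocketChannel.open(ssc.getLocalAddress());
                 SocketChannel accepted = acceptPolling(ssc)) {
                return nullBeforeClient && accepted.isConnected();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("non-blocking accept works: " + demo());
    }
}
```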

If you need to send some data in a non-blocking way:

ByteBuffer buf = ...; //data to send
//loop until all data has been sent
while(buf.hasRemaining()){
  int wr = sc.write(buf);
  if(wr == 0){
    //no data sent this time,
    //give other microthreads a chance
    yield();
  }
}
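
The same write loop can be tried out without Kilim or a network. In the sketch below a java.nio.channels.Pipe plays the role of the socket, and the "yield" is a Runnable that drains the other end of the pipe, as if another microthread were running. The names (NonBlockingWriteDemo, writeFully, drain) are assumptions made for this example only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

final class NonBlockingWriteDemo {
    /** Writes the whole buffer, calling yield whenever the channel accepts no bytes. */
    static void writeFully(WritableByteChannel ch, ByteBuffer buf, Runnable yield)
            throws IOException {
        while (buf.hasRemaining()) {
            int wr = ch.write(buf);
            if (wr == 0) {
                yield.run(); // channel is full, let someone else run
            }
        }
    }

    /** Reads whatever is available right now and returns the number of bytes read. */
    static long drain(ReadableByteChannel ch, ByteBuffer scratch) {
        try {
            long total = 0;
            int n;
            scratch.clear();
            while ((n = ch.read(scratch)) > 0) {
                total += n;
                scratch.clear();
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Pushes payloadBytes through a non-blocking pipe and returns the bytes received. */
    static long demo(int payloadBytes) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                sink.configureBlocking(false);
                source.configureBlocking(false);
                ByteBuffer scratch = ByteBuffer.allocate(8192);
                long[] received = {0};
                writeFully(sink, ByteBuffer.allocate(payloadBytes),
                        () -> received[0] += drain(source, scratch));
                while (received[0] < payloadBytes) {
                    received[0] += drain(source, scratch); // collect the tail
                }
                return received[0];
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("received bytes: " + demo(1 << 20));
    }
}
```

With a payload larger than the pipe's internal buffer, write() returns 0 once the buffer fills, which is exactly the case where the loop has to hand control to another task instead of spinning.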


You can receive data in the same way. Parsing headers and other details are not shown here; you can check the code directly.
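
For completeness, a receiving loop might look like the following. It is a guess at the general shape, not the webserver's actual code: read() returning 0 means "nothing yet, let others run", and -1 means the peer closed the connection. A Pipe again stands in for the socket, and all names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

final class NonBlockingReadDemo {
    /** Reads until the peer closes, calling yield whenever no data is available. */
    static byte[] readUntilClosed(ReadableByteChannel ch, Runnable yield) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buf = ByteBuffer.allocate(4096);
        while (true) {
            int rd = ch.read(buf);
            if (rd == -1) {
                break; // peer closed the channel
            }
            if (rd == 0) {
                yield.run(); // nothing to read yet, let others run
                continue;
            }
            buf.flip();
            out.write(buf.array(), 0, buf.limit());
            buf.clear();
        }
        return out.toByteArray();
    }

    /** Sends a message through a pipe, closes the writer, and reads it back. */
    static String demo(String message) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                try (Pipe.SinkChannel sink = pipe.sink()) {
                    sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                } // closing the sink signals end of stream to the reader
                byte[] data = readUntilClosed(source, Thread::yield);
                return new String(data, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("GET / HTTP/1.0"));
    }
}
```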

Final tuning

I found two problems:

Current status

The webserver is in alpha stage; the core is relatively stable. It can serve files, and more extensions will come. More here.

Last modification: May 31 2012
