Joe Williams home
I have been playing around with merle and have switched it from the standard gen_server to LShift's modified gen_server2. gen_server2 makes a couple of changes to speed things up; the key one is:
From a comment in their source file: "More efficient handling of selective receives in callbacks. gen_server2 processes drain their message queue into an internal buffer before invoking any callback module functions. Messages are dequeued from the buffer for processing. Thus the effective message queue of a gen_server2 process is the concatenation of the internal buffer and the real message queue. As a result of the draining, any selective receive invoked inside a callback is less likely to have to scan a large message queue."
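To make the draining idea concrete, here is a minimal sketch of the technique, not the actual gen_server2 code. The module name drain_sketch and the function names are made up for illustration. It empties the process mailbox into a queue with a zero-timeout receive, which is the general pattern the comment describes.

```erlang
-module(drain_sketch).
-export([drain/1, demo/0]).

%% Move every message currently in the mailbox into Buffer, keeping
%% arrival order. `after 0` makes the receive return as soon as the
%% mailbox is empty instead of blocking for a new message.
drain(Buffer) ->
    receive
        Msg -> drain(queue:in(Msg, Buffer))
    after 0 ->
        Buffer
    end.

%% Run the drain in a fresh process (so the mailbox starts empty),
%% send it three messages, and return the drained buffer as a list.
demo() ->
    Parent = self(),
    Ref = make_ref(),
    spawn(fun() ->
                  self() ! first,
                  self() ! second,
                  self() ! third,
                  Drained = drain(queue:new()),
                  Parent ! {Ref, queue:to_list(Drained)}
          end),
    receive
        {Ref, Msgs} -> Msgs
    end.
```

Calling drain_sketch:demo() returns [first, second, third]. Once the messages sit in the buffer, a selective receive inside a callback only has to scan whatever arrived after the drain, which is normally a much shorter mailbox.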
Because of this draining, a process that receives a ton of messages at once can handle them more effectively. In the case of merle this means more gets/puts/deletes/etc. in a shorter amount of time. Some of the downsides are stated on the mailing list. I believe that for merle's workload (lots of small messages in short time spans) this is a great addition. For other use cases it may not be, so you should test with your own workload.

I have run some tests using gen_server and gen_server2, doing a large number of 'set' operations to memcached. The test consisted of running merle:set(a, "1") a specific number of times (25k, 50k and 100k) with both gen_server and gen_server2. Since the mailbox gets backed up, the Erlang processes are started before the operations complete on the memcached side. I didn't have a good way to watch the memcached logs for when the operations completed and log timestamps, so I used a simple stopwatch app to do the timing by hand. Obviously this isn't scientific, but as you will see the differences are large enough that it's not a big deal.

[Figure: gen_server test results]

As you can see, gen_server2 performs much better (almost linearly?), shaving large amounts of time off. Also note that on the gen_server 100k tests I stopped the testing once it reached 5 minutes, so I am unsure how much longer those would have gone on. Below is the raw data. I also performed subsequent tests and found that my initial findings seemed to be accurate.
Number of sets   gen_server test 1   gen_server2 test 1   gen_server test 2   gen_server2 test 2
(times in seconds)
25000            24                  4                    25                  4
50000            134                 8                    115                 8
100000           300                 18                   300                 16
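For anyone who would rather not use a stopwatch, the test described above could be approximated with a small harness like the one below. This is a rough sketch, not the code used for the numbers in this post. The module name bench_sketch and its functions are made up for illustration, and it only measures the time until every call returns on the Erlang side, which may differ from when memcached finishes its work.

```erlang
-module(bench_sketch).
-export([run/2]).

%% Spawn N processes that each call Op once, wait for all of them to
%% report back, and return the elapsed wall-clock time in milliseconds.
run(N, Op) when is_integer(N), N >= 0, is_function(Op, 0) ->
    Parent = self(),
    Ref = make_ref(),
    Start = erlang:monotonic_time(millisecond),
    _ = [spawn(fun() ->
                       %% Always report back, even if Op crashes,
                       %% so the caller never waits forever.
                       try Op() after Parent ! {Ref, done} end
               end)
         || _ <- lists:seq(1, N)],
    wait(Ref, N),
    erlang:monotonic_time(millisecond) - Start.

%% Block until N completion messages tagged with Ref have arrived.
wait(_Ref, 0) ->
    ok;
wait(Ref, N) ->
    receive
        {Ref, done} -> wait(Ref, N - 1)
    end.
```

Assuming merle is already started and connected to memcached, a 25k run would look something like bench_sketch:run(25000, fun() -> merle:set(a, "1") end).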
The latest source for merle using gen_server2 has been committed to GitHub. Give it a shot and let me know if you find any bugs.