Stowe Boyd

a postfuturist at large in the present

popular now: The Social Operating System: A Reader

Stowe Boyd

Scroll to Top

Getting Down To The Heart Of The Matter: Twitter Performance

Al3x says:

[from Twitter Technology Blog]

Twitter is, fundamentally, a messaging system. Twitter was not architected as a messaging system, however. For expediency’s sake, Twitter was built with technologies and practices that are more appropriate to a content management system. Over the last year and a half we’ve tried to make our system behave like a messaging system as much as possible, but that’s introduced a great deal of complexity and unpredictability.

[…]

We love kicking around ideas, but code speaks louder than words.

I’ve had this discussion one hundred times in recent months.

It’s astonishing how apparently small design choices can hose scalability. I remember when I was working at a startup in VA called Ikimbo, which was an instant messaging company (at least at the start). We were having performance problems when testing for scale, and I couldn’t understand why, nor could I understand the explanations.

So I went wandering around in the code, and discovered the fact that the buddylists were ‘backwards’. By backwards I mean the actual data structure implemented in the code to represent the buddylists were inverted relative to performance. To be technical, the implementation was ‘pessimal’ — meaning that performance was provably as bad as it could be.

The situation: I found that buddylists were implemented as a tree of linked lists (best would have been some sort of hash table). So each user had a node on a list, and hanging off of each person’s node was a list of all that person’s buddies. Simple.

However, each time a user X changed their online presence, the updating procedure would — yes, this is the sublimely dumb part — walk down the list of users, one by one, and then walk through each buddylist to see if X was a buddy, and if so, change the status.

This is known as the British Museum search. It is as slow as can be.

The fix wasn’t in improving the code doing the search, but in throwing out the data structure. We turned the things around: every user X had a status in a hash table, and hanging from each was a link to the buddylist it appeared in. So when X changed status we just had to hash the key to find the user’s presence, and change it (and flip a bit for each buddylist in which X appeared, so that later on, it would trigger an update). The beauty is that you can increase the number of users linearly, but the time to process the hash is non-linear, so we could increase the number of users online by a factor of 1000 without changing the hash computation time measurably. Then, the next time a user with X on his buddylist asked for updated information, we would send a whole updated buddylist to the client — note that we could have done better here, perhaps, but with variability of updates we would have been trading processing for messaging, and it seemed workable.

We improved performance by several orders of magnitude, since it had been directly (stupidly) linked to the number of users. Note that system startup was slightly slower, since the original data structure loaded fast, but as with many higher performance data architectures doing more work at the start — the ‘start’ meaning here both in design and the runtime processing as the system comes online — to make the information more manipulable later pays off.

Twitter needs a similar fix, where the snags in performance are finessed in a similar way.

I am not saying that Twitter’s needs are tied to presence updates, but they can’t pretend that there are no clients in the architecture, even when they themselves aren’t implementing them. Twitterific, Friendfeed, and Twirl are all part of the picture, since they are making client requests via API for updates. But Twitter is not implementing the code for the clients, but instead simply providing the APIs. This may be a serious problem, since getting complex algorithms right may require specific distribution of functionality to the client, not just a denial of service defense. Note that AIM and the other IM services blocked external clients for years, until their solutions had scaled to the point where Trillian and the like could really make much difference.

So the folks at Twitter have a similar problem to the status update problem I faced at Ikimbo. They need on one hand to compartmentalize, and work on making components work better. But at the same time, they need to step back far enough to see that it’s not just minimal code improvements that are needed, but rethinking the representation of the core elements of the messaging/ping architecture.

Which is why I don’t understand why lessons learned by AOL, MSN, Yahoo, Google, Jabber Inc. and other instant messaging vendors won’t play here. These lessons have been learned over and over again by all major instant messaging vendors trying to grapple with presence and messaging throughput. Can’t they find someone who knows how these systems scale?

Posted by Stowe Boyd
May 29, 2008
Comments

Share
http://tmblr.co/ZHrZFytQykd
blog comments powered by Disqus

< Previous post Next post >

 

Theme by Pixel Union

  • Profile
  • Pages
  • Likes

About me

Social anthropologist, clairvoyant, postfuturist.

My work is social tools and their impact on media, business, and society.

I am made greater by the sum of my connections, and so are my connections.


Connect with me

  • Twitter
  • RSS
  • Archive
  • Ask me anything

Pages:

  • About Stowe Boyd
  • Underpaid Genius
  • Popular Posts
  • Work Talk Research
  • Work Talk Reports
  • Speaking

Stuff I Like

  • Photo via everythingisacasestudy
    Photo via everythingisacasestudy
  • Photoset via considertheaesthetic

    Only in my wildest dreams would I actually own one of these beauties. At a astonishing $3650, this...

    Photoset via considertheaesthetic
  • Photo via andrewgreene

    LOL

    Photo via andrewgreene
  • Photo via creativemornings

    Prototyping is like thinking with your hands.

    Manuel Großmann and Martin Jordan,...

    Photo via creativemornings
  • Post via newschallenge
    Expand the Unconsumption Project

    1. What do you propose to do? [20 words]

    Expand Unconsumption’s capacity to serve as a resource for sharing stories and ideas about creative reuse and mindful consumption.

    Post via newschallenge