Determining Twitter's Growth

Opinion on Twitter is divided. What seems to be undisputed is that right now it's growing very quickly. I was curious just how "quickly" quickly was, preferably going beyond just anecdotal "my network doubled in size in the last 5 seconds" kind of observations. It seems like Twitter assigns globally unique, incrementing IDs to all messages it receives. By looking at the values of these IDs over time, it's possible to see how many status messages Twitter is keeping track of. I've generated a logarithmic graph of this.
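As a rough sketch of the arithmetic behind such a graph: given a few (day, message ID) observations, the difference between IDs approximates the number of messages posted in between, assuming IDs really are handed out sequentially and without gaps. The sample values and function names below are hypothetical and made up for illustration; they are not real Twitter data.

```python
import math

# Hypothetical (day number, message ID) observations. These values are
# invented for illustration only and are not measurements of Twitter.
samples = [
    (0, 1_000_000),
    (10, 2_000_000),
    (20, 4_000_000),
]

def daily_rates(samples):
    """Average number of new IDs per day between consecutive observations."""
    rates = []
    for (day0, id0), (day1, id1) in zip(samples, samples[1:]):
        rates.append((id1 - id0) / (day1 - day0))
    return rates

def doubling_time_days(samples):
    """Days for the ID counter to double, assuming steady exponential
    growth between the first and last observation."""
    (day0, id0), (day1, id1) = samples[0], samples[-1]
    return (day1 - day0) * math.log(2) / math.log(id1 / id0)

if __name__ == "__main__":
    print(daily_rates(samples))
    print(doubling_time_days(samples))
```

On a logarithmic plot, a constant doubling time shows up as a straight line, which is why a change in slope stands out.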

I'm not sure why there was an inflection point in early November. It's also possible that this is affected by technical changes on Twitter's side. Still possibly interesting. Also, Joshua's post on autoincrement considered harmful is related and an interesting read.

Update: As was pointed out to me in the comments, Andy Baio had the same idea, except he executed it more thoroughly.

Twitter Message IDs

4 Comments

Andy Baio has had the same idea
It's interesting to see Joshua's post laid out like that 'cause they're the same lessons we learned with LJ. The final reason single global identifiers can be a bad idea is it makes it difficult to shard -- you can't have each shard adding values independently without some global lock on the main ID.

LJ's posts are identified with primary keys like (userid, (user-specific-postid << 8) + post-specific-random-number). The random number helps prevent people from enumerating posts (but perhaps not enough -- you can still tell that you've "missed" someone's post by looking at the visible number >> 8).
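The scheme described in that comment can be sketched roughly as follows. This is only an illustration of the idea as described, not LiveJournal's actual code; the function names and the choice of random source are assumptions.

```python
import secrets

RANDOM_BITS = 8  # low bits hold the random part, matching the "<< 8" above

def make_post_id(per_user_counter):
    """Combine a per-user sequential counter with 8 random low bits."""
    return (per_user_counter << RANDOM_BITS) | secrets.randbelow(1 << RANDOM_BITS)

def counter_from_post_id(post_id):
    """Recover the per-user counter by dropping the random low bits."""
    return post_id >> RANDOM_BITS

def missing_counters(visible_ids):
    """Counters that fall between the visible posts but were not seen,
    i.e. posts an observer can tell exist without being able to read them."""
    seen = {counter_from_post_id(i) for i in visible_ids}
    if not seen:
        return []
    return [c for c in range(min(seen), max(seen) + 1) if c not in seen]

if __name__ == "__main__":
    ids = [make_post_id(1), make_post_id(3)]
    print(missing_counters(ids))  # the post with counter 2 is detectable
```

Because the counter is per user, each shard can hand out IDs for its own users without a global lock, while the random low bits only make guessing a full ID harder; they do not hide the gap.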
But how many people are still using it? I'm sure there is a huge share of users who forget the service after one or two messages... But I can't prove it.
I don't use Twitter. Never have even looked at it. Too busy reading about it.

John A. Davis
Lazy Dancer Blog
