---
title: Pushing the boundary of Real Time Web with Twitter and XFactor
teaser: 'We use WebSockets and NodeJS to monitor Twitter in real time during the XFactor
  final.

  '
tags: fun,websockets,new bamboo,web
author: Makoto Inoue
published_on: 2009-12-14
---

_This post was originally published on the New Bamboo blog, before [New Bamboo
joined thoughtbot in London][new-bamboo-thoughtbot]._

---

I got lots of positive comments/retweets about my last article [Real time online
activity monitor example with node.js and WebSocket][real-time-article].

(If you haven't read that very long article and aren't interested in the
technical details, don't worry. I'll try not to get too technical this time.)

I am glad that I was able to show how exciting node.js and WebSockets are for
building real time web applications.

However, I don't think my "activity monitor" example showed the full potential
of the real time web. Why? Because it only showed resource usage stats
"EVERY ONE SECOND". A one second latency is not quite real time yet.

This is what I did next (**Make sure your sound is turned ON**):

<iframe title="Twitter world when XFactor winner was announced - YouTube" width="420" height="315" src="https://www.youtube.com/embed/VNS66kuXRus" frameborder="0"></iframe>

## How many people watched the show?

"[XFactor]" is a popular UK talent show. It's like American Idol: contestants
sing live and the public votes each week. (Please note that Susan Boyle is
not from XFactor, but from [Britain's got talent].)

**20 million people** watched the final show, and over **10 million people**
voted for the 2 finalists, which is more than the number of votes for the
government at the last general election (according to [Metro]).

## What the video shows

I captured the Twitter feed in real time during the show (19:54 ~ 21:49 GMT) on
Sunday 13th Dec. I captured any tweet matching the keywords "Joe & win",
"Olly & win", or both.

Here is an explanation of the 3 key figures.

- Total = The number of mentions of each contestant during the show
- Speed = **Tweets Per Second** (I'll call it TPS going forward)
- Ratio = "Total" of each contestant / sum of both totals
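
The three figures can be sketched in a few lines of JavaScript. This is only an illustration and not the code behind the video: `classify`, `createCounter` and `ratio` are hypothetical names, and it assumes that "Joe & win" means both words appear in the tweet (case-insensitively) and that TPS is the number of mentions within the last second.

```javascript
// Hypothetical helper: decide which contestants a tweet counts towards.
// Assumption: "Joe & win" means both words appear, ignoring case.
function classify(text) {
  const words = text.toLowerCase().split(/[^a-z0-9]+/);
  const hasWin = words.includes('win');
  return {
    joe: hasWin && words.includes('joe'),
    olly: hasWin && words.includes('olly'),
  };
}

// Tracks Total and Speed (TPS) for one contestant.
function createCounter() {
  let total = 0;
  let recent = []; // timestamps (ms) of mentions not yet older than 1 second
  return {
    record(timestampMs) {
      total += 1;
      recent.push(timestampMs);
    },
    // Tweets Per Second: mentions in the one second window ending at nowMs.
    tps(nowMs) {
      recent = recent.filter((t) => nowMs - t < 1000);
      return recent.length;
    },
    total() {
      return total;
    },
  };
}

// Ratio = own total / sum of both totals (0 while nothing has been counted).
function ratio(own, other) {
  const sum = own + other;
  return sum === 0 ? 0 : own / sum;
}

// Made-up sample tweets, with "at" in milliseconds since the start.
const joe = createCounter();
const olly = createCounter();
const sample = [
  { at: 0, text: 'Joe to win!' },
  { at: 200, text: 'Olly will WIN this' },
  { at: 900, text: 'joe or olly, who will win?' },
  { at: 950, text: 'JOE FOR THE WIN' },
  { at: 1500, text: 'Come on Joe' }, // no "win", so it is ignored
];
for (const tweet of sample) {
  const hit = classify(tweet.text);
  if (hit.joe) joe.record(tweet.at);
  if (hit.olly) olly.record(tweet.at);
}

console.log(joe.total(), olly.total());         // 3 2
console.log(ratio(joe.total(), olly.total()));  // 0.6
console.log(joe.tps(1000), olly.tps(1000));     // 2 2
```

A tweet mentioning both contestants counts towards both totals here, which is one possible reading of "or both".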

![](https://images.thoughtbot.com/new-bamboo/blog/pushing-the-boundary-of-real-time-web-with-twitter-and-xfactor/CQFBfxZySGSEd6RkxciW_total-speed-ratio.png)

Of the 3 figures, TPS (Speed) was the most interesting one to watch in real
time. Did you notice that it showed 4 ~ 5 TPS overall, went down to 0 ~ 1 TPS
during the pause before the announcement, then jumped to **10 TPS** immediately
after the announcement?

## What happened in the Twittersphere during the show (what the video does not show)

The night before the final, there were 3 contestants (Olly Murs, Stacey Solomon,
and Joe McElderry). Olly's and Stacey's names were on [Twitter's "Trending
Topics"], even though many newspapers were writing that Joe was the bookies'
favourite (2 - 9, again according to Metro).

My initial guess was this.

- Stacey and Olly are in their 20s while Joe is still a teenager. Twitter's
  audience skews towards adults (20s ~ 30s), so people tweet more about Stacey
  and Olly, while Joe's teenage fans do not tweet about him much.

The moment I started capturing the Twitter feed, it was clear that Joe was going
to win. The ratio between Olly and Joe was consistently 41% : 59%. Olly climbed
as high as 45% during the show (for about 5 min), but it did not last long.
Olly hit his peak of **14 TPS (Tweets per second)** when Robbie Williams
appeared on video to support him. Joe hit his peak of **22 TPS** when Cheryl
Cole, half sobbing, made a supportive comment after Joe sang his last song.

So I tried to find out for myself, and the result shows that Joe got more
tweets. So why did Twitter fail to list him in its Trends? Here is my current
guess.

- Joe is a very common name (Average Joe, GI Joe, [Joe Jonas of Jonas
  brothers][joe-jonas], who turned up in my search a lot in an earlier version
  of the trial), so Twitter filtered it out as noise.

![](https://images.thoughtbot.com/new-bamboo/blog/pushing-the-boundary-of-real-time-web-with-twitter-and-xfactor/VCrW0e7Sby22E1JrVltq_twitter-trends.png)

The above is the screen capture after Joe won the XFactor. His name is still not
on the trends.

## Under the hood (a bit more technical detail)

I was looking for interesting things I could do in real time. Unfortunately,
most Web APIs are not real time yet. They are still stuck in the old paradigm of
the request/response cycle, and many of them have usage limits, so I cannot
keep hitting an external web server. There is one exception: Twitter.

When it comes to "real time", nobody puts Twitter in a corner.

I knew about Twitter's [Streaming API]. When Twitter announced it back in
September, I totally dismissed it, thinking that it was only useful if you
either wrote a desktop app or stored the streamed data somewhere for later
analysis.

Here is a brief summary from the website.

> To connect to the Streaming API, form a HTTP request and consume the resulting
> stream. Our servers will hold the connection open indefinitely, barring
> server-side error, excessive client-side lag, network hiccups or duplicate
> logins.
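
On the client side, consuming a connection that never closes mostly means buffering chunks until a complete message has arrived. Here is a minimal sketch in JavaScript, assuming each message is one JSON object terminated by a newline (that framing is my assumption, the summary above does not spell it out). `createLineParser` is a hypothetical helper, not part of any library.

```javascript
// Hypothetical helper: turns arbitrary chunks of a long-lived response
// into complete JSON messages. Assumes newline-delimited JSON.
function createLineParser(onMessage) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop(); // the last piece may be an incomplete line; keep it
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === '') continue; // skip blank lines between messages
      try {
        onMessage(JSON.parse(trimmed));
      } catch (err) {
        // Ignore a line that is not valid JSON rather than stop the stream.
      }
    }
  };
}

// Chunks rarely line up with message boundaries, so feed it awkward pieces.
const received = [];
const feed = createLineParser((message) => received.push(message));
feed('{"text":"Joe to w');
feed('in"}\r\n\r\n{"text":"Olly"}\n{"te');

console.log(received.length); // 2, the third message is still incomplete
```

With node.js' `http` module, the same `feed` function could be attached to a response with `response.setEncoding('utf8')` followed by `response.on('data', feed)`.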

To capture the stream, I could have used node.js, but I was in such a hurry
that I used my most trusted tool, Ruby.

I used a Ruby gem called [TweetStream]. [RubyInside has a nice
article][ri-article] about it, so I won't go into detail. If you are really
interested in how I got the stream, here is [the code].

How did I get the result into my browser? It's easy. I appended the result to a
log file, and let node.js tail the log file.

### Redirecting output to a log file

`my_program.rb >> xfactor-final.json`

### Starting node.js with the log file as an argument

`node server.js ./xfactor-final.json`

In fact, I used node.js, but did not write a single line of node.js code, as I
already had the "tail.js" code which reads the log file.

### Tail node.js example

The source here is bundled in [my previous example code].

```javascript
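var sys = require('sys');

// The log file to follow is passed as the first command-line argument.
var filename = process.ARGV[2];
if (!filename)
  throw new Error("Usage: node server.js filename");

// Run `tail -f` so that new lines are reported as they are appended.
var child_process = process.createChildProcess("tail", ["-f", filename]);

// Forward the tail output to a WebSocket connection.
exports.handleData = function(connection, data) {
  // Wrap each chunk of output in the frame delimiters and send it.
  var output = function (output_data) {
    connection.send('\u0000' + output_data + '\uffff');
  };

  // Stop forwarding once the connection has ended.
  connection.addListener('eof', function(data) {
    child_process.removeListener("output", output);
  });

  child_process.addListener("output", output);
};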
```
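
The snippet above is written against the node.js API as it was in late 2009 (`process.ARGV`, `process.createChildProcess`). On a current node.js release, the same idea could look roughly like the sketch below. It is not a drop-in replacement: `tailFile` and `handleConnection` are names I made up, and `connection` is assumed to be any object with a `send()` method that emits a `'close'` event.

```javascript
const { spawn } = require('child_process');
const readline = require('readline');
const { EventEmitter } = require('events');

// Start `tail -f` on the file and re-emit every new line as a 'line' event.
function tailFile(filename) {
  const lines = new EventEmitter();
  const tail = spawn('tail', ['-f', filename]);
  readline
    .createInterface({ input: tail.stdout })
    .on('line', (line) => lines.emit('line', line));
  lines.stop = () => tail.kill(); // lets the caller end the tail process
  return lines;
}

// Forward each line to one connection until that connection closes.
// Assumption: `connection` has send() and emits 'close' (hypothetical shape).
function handleConnection(connection, lines) {
  const forward = (line) => connection.send(line);
  lines.on('line', forward);
  connection.on('close', () => lines.off('line', forward));
}
```

To use it, you would call `tailFile(process.argv[2])` once at start-up and `handleConnection(connection, lines)` for each incoming connection. The manual `\u0000` and `\uffff` wrapping is left out on the assumption that a current WebSocket library takes care of framing itself.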

I may publish the code once I replace my Ruby logic with node.js. However, it's
worth saying that **you do not need to do everything using one framework**. You
can just pick the best tool for each job, and glue them together. That's the
good old tradition of Unix programming.

Another interesting thing worth noting is that Twitter's real time feed did NOT
disconnect during the entire show (over 1 hr, generating more than 5MB of text).
That's pretty impressive.

[Britain's got talent]: http://talent.itv.com/
[joe-jonas]: http://en.wikipedia.org/wiki/Joe_Jonas
[Metro]: http://www.metro.co.uk/showbiz/805981-joe-mcelderrys-x-factor-win-brings-in-20m
[my previous example code]: http://github.com/makoto/node-websocket-activity-monitor/blob/master/server/websocket-server-node.js/resources/tail.js
[new-bamboo-thoughtbot]: https://thoughtbot.com/blog/new-bamboo-joins-thoughtbot-in-london
[real-time-article]: https://thoughtbot.com/blog/real-time-online-activity-monitor-example-with-node-js-and-websocket
[ri-article]: http://www.rubyinside.com/tweetstream-use-the-twitter-streaming-api-from-ruby-2541.html
[Streaming API]: http://apiwiki.twitter.com/Streaming-API-Documentation
[the code]: http://gist.github.com/255664
[TweetStream]: http://github.com/intridea/tweetstream
[Twitter's "Trending Topics"]: http://search.twitter.com/
[XFactor]: http://xfactor.itv.com/2009/
