Push notifications in the age of firewalls

In the world of client-server networking, web browsers are the ultimate clients. They are always sending SYN packets, but never receiving any. They are always establishing outgoing connections, but never get to listen on any ports.

Due to the ubiquity of the web, we have built our networks around the need to browse. Most NATs will not let any incoming connection near the private subnet. Similarly, our corporate firewalls cull incoming connections with extreme prejudice. As a result, an outgoing SYN packet on port 80 is the x86 of network connectivity. It’s as portable a networking tool as it gets, but it’s getting kludgier by the minute.

Polling is a particularly unfortunate result of this outgoing-connection-based architecture. Take RSS readers: imagine a website that posts 10 new items a day. If you poll this site every 5 minutes, you make 288 requests a day, of which at most 10 can return anything new, so you are wasting your time roughly 97% of the time. Granted, you are getting 304 Not Modified responses without any HTTP response body, but these requests still consume threads, CPU and network resources on the server. At the end of the day, we are also only getting 2.5 minutes of average latency, which, in computer terms, is not very fast. In short, polling sucks.
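If you want to play with those numbers, here is a rough back-of-the-envelope sketch. The function name polling_stats is made up for this post, and it assumes the best case for polling: every new item lands in its own polling interval, and items show up at uniformly random times.

```python
def polling_stats(items_per_day: int, poll_interval_min: float):
    """Estimate how wasteful fixed-interval polling is (best-case assumptions)."""
    polls_per_day = 24 * 60 / poll_interval_min
    # Best case: every new item is picked up by a different poll.
    useful_polls = min(items_per_day, polls_per_day)
    wasted_fraction = 1 - useful_polls / polls_per_day
    # An item posted at a random moment waits half an interval on average.
    avg_latency_min = poll_interval_min / 2
    return polls_per_day, wasted_fraction, avg_latency_min


polls, wasted, latency = polling_stats(items_per_day=10, poll_interval_min=5)
print(f"{polls:.0f} polls/day, {wasted:.1%} wasted, {latency} min average latency")
# prints: 288 polls/day, 96.5% wasted, 2.5 min average latency
```

Shrinking the interval buys you lower latency, but only by pushing the wasted fraction even closer to 100%.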

The PubSubHubbub protocol aims to solve this problem by allowing the server to push content to the readers directly. Unfortunately, as explained above, the server cannot establish a connection to a lot of clients out there. It would be able to connect to web-based feed readers like Google Reader, because Google understands how to configure a network to allow incoming connections. But the average user either is not permitted to modify their network configuration (office) or does not know how to do it (home). As cool as PubSubHubbub is, it just does not solve the problem at hand.

Which brings me to the actual solution to the HTTP push problem: long polling. With long polling, instead of promptly responding with a 304, the server lets the connection sit idle as long as there is no content to send. The client, on the other hand, issues a new request as soon as it gets a response. The server also does not need to allocate a thread to each incoming request: a single thread can wait on multiple incoming requests and respond to them as new data becomes available. The client gets an almost immediate notification, and the server does not have to waste time handing out 304s. Everybody is happy.
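To make the "single thread, many parked requests" idea concrete, here is a minimal in-process sketch using Python's asyncio. It is not a real HTTP server: the Feed class, wait_for_items and client are hypothetical names invented for this illustration, and each client coroutine stands in for a browser that re-requests as soon as it gets a response. A real handler would wrap wait_for_items in whatever HTTP framework you use.

```python
import asyncio


class Feed:
    """Hypothetical feed: parks long-poll requests until new items arrive."""

    def __init__(self):
        self.items = []
        self._changed = asyncio.Condition()

    async def publish(self, item):
        async with self._changed:
            self.items.append(item)
            self._changed.notify_all()  # wake every parked request at once

    async def wait_for_items(self, seen, timeout):
        """Return items after index `seen`, or [] if nothing arrives in time."""
        async def _wait():
            async with self._changed:
                await self._changed.wait_for(lambda: len(self.items) > seen)
                return self.items[seen:]
        try:
            return await asyncio.wait_for(_wait(), timeout)
        except asyncio.TimeoutError:
            return []  # stands in for an empty response; the client just asks again


async def client(feed, inbox, expected):
    # Long-poll loop: issue the next "request" as soon as the last one returns.
    while len(inbox) < expected:
        inbox.extend(await feed.wait_for_items(len(inbox), timeout=1.0))


async def main():
    feed = Feed()
    inboxes = [[] for _ in range(3)]
    readers = [asyncio.create_task(client(feed, box, 2)) for box in inboxes]
    await asyncio.sleep(0.05)  # all three readers are now parked on one thread
    await feed.publish("post-1")
    await asyncio.sleep(0.05)
    await feed.publish("post-2")
    await asyncio.gather(*readers)
    return inboxes


inboxes = asyncio.run(main())
print(inboxes)
```

The timeout is there because real connections cannot idle forever: intermediaries tend to drop them, so the server answers with nothing after a while and the client simply reconnects.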

With that, we have implemented yet another version of reverse tunneling to deal with NAT. Honestly, for the near future, we are going to have to keep fitting the square peg of push notifications into the round hole of the browser acting as a client.