Determining how fast the Internet is growing is almost a parlor game among pundits these days. Part of the reason is simple practicality: Companies that depend for their livelihoods on supplying or using Internet infrastructure want to better understand growth trends so they can plan investments and growth curves accurately.
But there’s a broader scientific issue as well. The Internet has become a critical part of our economic and political landscape, yet we don’t really understand how it works. Sure, we know how the protocols themselves operate — but only recently, for example, did folks determine that Internet traffic follows long-tailed distributions rather than (as previously assumed) Poisson distributions.
For those who’ve been out of school for a while, the difference is this: Poisson distributions describe counts of random events, that is, events that occur independently of each other, and at high volumes they look like the familiar “bell-shaped curve.” Telecom engineers model traditional telephony traffic (aka voice calls) as Poisson distributions, which makes sense: my decision to call my mother in Corpus Christi is independent of your decision to call your stockbroker in New York. Critically, Poisson traffic gets smoother in relative terms as the volumes get bigger, so engineering a network to handle billions of calls is fairly straightforward.
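To make the smoothing concrete, here is a minimal Python sketch. It assumes call counts per interval are Poisson, and the rates (4, 100, and 400 calls per interval), the sample count, and the function names are illustrative choices, not figures from any carrier. It measures how large the typical fluctuation is relative to the average load.

```python
import math
import random
from statistics import mean, pstdev


def poisson_sample(rate, rng):
    """Draw one Poisson count using Knuth's multiplication method.

    Adequate for the modest rates used here; exp(-rate) underflows
    for very large rates.
    """
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def relative_variation(rate, samples=2000, seed=0):
    """Standard deviation of the call count divided by its mean."""
    rng = random.Random(seed)
    counts = [poisson_sample(rate, rng) for _ in range(samples)]
    return pstdev(counts) / mean(counts)


if __name__ == "__main__":
    for rate in (4, 100, 400):
        print(f"average {rate:>3} calls/interval: "
              f"relative variation {relative_variation(rate):.3f}")
```

For a Poisson count the ratio is 1/sqrt(rate) in theory, so a hundredfold increase in volume cuts the relative swings by a factor of ten. That is why provisioning for very large call volumes needs proportionally little headroom.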
Long-tailed distributions, in contrast, are characterized by a considerable degree of dependence, which also makes sense: If your machine has requested data from my server, there’s a strong dependence between the first packet I send you and subsequent ones, since they’re all part of the same data file. Because of this dependence, traffic tends toward clustering, and the bursts don’t smooth out as volumes get larger the way Poisson traffic does.
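The contrast can be sketched with a toy simulation. Everything here is an assumption made for illustration: transfers start at random, each one occupies the link for a number of consecutive time slots, and the long-tailed case draws those durations from a Pareto distribution with shape 1.2, while the comparison case uses exponential durations with the same nominal mean of 10 slots. The sketch averages the load over 1,000-slot windows and measures how much those averages still vary.

```python
import math
import random
from statistics import pvariance

MEAN_DURATION = 10.0                               # nominal slots per transfer (assumed)
ALPHA = 1.2                                        # Pareto shape; below 2 means a very long tail
SCALE = MEAN_DURATION * (ALPHA - 1.0) / ALPHA      # gives the Pareto the same nominal mean


def light_tailed(rng):
    """Exponential transfer duration: long transfers are vanishingly rare."""
    return rng.expovariate(1.0 / MEAN_DURATION)


def heavy_tailed(rng):
    """Pareto transfer duration: occasional transfers last a very long time."""
    return SCALE * rng.paretovariate(ALPHA)


def simulate_load(duration_sampler, slots, arrival_prob, rng):
    """Return the number of active transfers in each time slot."""
    delta = [0] * (slots + 1)
    for start in range(slots):
        if rng.random() < arrival_prob:
            length = max(1, math.ceil(duration_sampler(rng)))
            delta[start] += 1
            delta[min(slots, start + length)] -= 1
    load, active = [], 0
    for t in range(slots):
        active += delta[t]
        load.append(active)
    return load


def block_mean_variance(load, block):
    """Variance of the load after averaging over windows of `block` slots."""
    means = [sum(load[i:i + block]) / block
             for i in range(0, len(load) - block + 1, block)]
    return pvariance(means)


def compare_smoothing(slots=200_000, block=1000, arrival_prob=0.5, seed=1):
    """Return (light, heavy) variances of the window-averaged load."""
    light = simulate_load(light_tailed, slots, arrival_prob, random.Random(seed))
    heavy = simulate_load(heavy_tailed, slots, arrival_prob, random.Random(seed))
    return block_mean_variance(light, block), block_mean_variance(heavy, block)


if __name__ == "__main__":
    light, heavy = compare_smoothing()
    print(f"variance of 1,000-slot averages, exponential durations: {light:.3f}")
    print(f"variance of 1,000-slot averages, Pareto durations:      {heavy:.3f}")
```

With exponential durations the window averages settle close to a steady value. With Pareto durations, a handful of very long transfers keep the averages swinging even over long windows. The exact numbers depend on the seed and the assumed parameters, so this is a qualitative picture and not a model of real Internet traffic.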
And yes, there’s a practical point to this little detour into probability distributions: network engineering is fundamentally different if you assume long-tailed rather than Poisson distributions. Buffers need to be bigger, servers need to be more powerful, and planning needs to include more extreme traffic-burst scenarios.
Why didn’t we know this until recently? Because there’s no overall system for monitoring and measuring traffic across the 'Net. Individual carriers look inside their own networks, and peering providers (such as Equinix) examine characteristics of traffic that crosses peering points. But although many individual researchers and research institutions are attempting to monitor Internet traffic, none has a panoramic view.