[Technology Review] Akamai’s Algorithms

From: Linda (joelinda1@home.com)
Date: Mon Sep 04 2000 - 22:14:32 PDT


[Adam asked me to FoRK this...]

http://www.techreview.com/articles/oct00/qa.htm

September/October 2000
Akamai’s Algorithms

Tom Leighton has the formula for going from MIT math professor to
Internet gazillionaire. You do the math.

Tom Leighton, a professor at MIT’s Laboratory for Computer Science,
or LCS, holds nearly 10 million shares in Akamai Technologies, a
company he co-founded in August 1998.
Last October, Akamai went public, with prices at the initial public
offering (IPO) starting off at $26 a share; by the end of the day,
investors had bid the price up to $145 a share. A month later the
stock was selling at $327 a share. No matter how much math anxiety
you might have, you get the point—Tom Leighton had become a very
rich man.

An academic whose expertise is in parallel algorithms and applied
mathematics, Leighton is at first glance an unlikely candidate for
an Internet tweeds-to-riches success story. But on closer
examination, it makes perfect sense. For years, Leighton has
been scrutinizing how complex networks operate—and how they can
be optimized. So, five years ago, when Tim Berners-Lee (the
inventor of the World Wide Web) came down the hall at LCS looking
for ways to better manage the escalating traffic flow on the
Internet, Leighton and his crew of graduate students were an
obvious place to drop in.

During the next several years, Leighton and a mix of MIT graduate
students and undergrads tried to figure out a better way to
manage and distribute content over the Web. In early 1998, the
group, which included grad student Daniel Lewin (who along with
Leighton and Jonathan Seelig, a student at MIT’s Sloan School,
went on to found Akamai), entered the MIT $50K Entrepreneurship
Competition. The team was a finalist but didn’t win.

Still, the venture capitalists came knocking. And the rest is
Internet history. Today the company runs a worldwide network of
more than 4,000 servers that distributes Web content for such
customers as Yahoo!, CNN and C-SPAN; if a PC user requests, for
example, videostreaming from C-SPAN’s Web site, the Akamai system
of servers helps to deliver that content, thereby avoiding
bottlenecks at C-SPAN’s centralized site. The distributed network
makes content delivery over the Web quicker and more reliable.

Despite hitting the IPO jackpot, the soft-spoken MIT professor
(currently on a leave of absence from LCS) displays few overt
signs of material success. At Akamai’s new headquarters
adjacent to the MIT campus, Leighton, the company’s chief
scientist, occupies a modest corner office overseeing a maze
of cubicles. It’s very much the office of a professor, and
Leighton speaks in the patient and precise words of someone
used to explaining how things work. TR Senior Editor David
Rotman recently went over for a lesson on managing traffic on
today’s Internet.

TR: When did it occur to you that you could use algorithms to
optimize content delivery on the Web?

LEIGHTON: The first time I ever thought about the Internet
was in 1995. My office [at MIT’s LCS] is down the hall
from Tim Berners-Lee and the Web Consortium. Over time we
talked about some of the issues facing the Internet. These are
the kinds of large-scale networking problems that our group
was working on and that I have a long-term interest in. So
we took on some of them as research projects.

TR: In a sense, the Internet is really the ultimate
networking challenge, isn’t it?

LEIGHTON: Yes. That’s right.

TR: What was the problem that you started with in ’95?

LEIGHTON: We were looking at ways to deal with flash
crowding and hot-spotting. That’s where a lot of people go
to one site at one time and swamp the site and bring down
the network around it—and make everyone unhappy.

TR: Can you explain the technologies you’ve developed?

LEIGHTON: Today we’re probably one of the world’s largest
distributed networks. At a high level, we’re serving content
or handling applications for end users, and we’re doing that from
servers that are close to the end users. “Close” is something
that changes dynamically, based on network conditions, server
performance and load. Because we’re close, we can avoid a lot of
the hangups, delays and packet loss that you might experience if
you’re far away. Before, you typically got your interaction with
a central Web site. And typically that was far away. Now you
typically have a lot of your interactions—not all, but a lot—with
an Akamai server that is near you and is selected in real time.

TR: What are the tricks and challenges to making this distributed
system work?

LEIGHTON: It’s an extremely hard area; you can’t go and just throw
a bunch of servers out there and have them all work with each other.
The servers themselves are going to fail. Processors are going to fail.
The Internet has all sorts of its own issues and failure modes. So
all these kinds of things have to be built into the algorithmic
approach. How do you develop a decentralized algorithm with imperfect
information that is still going to work? That’s a huge challenge. But
it’s clearly what you have to do. You can’t have any central point of
failure or the system will come down. I can’t think of a component
or a piece of hardware that hasn’t failed at some point or some place.
So, it’s a given [that you need a distributed system].

When a client comes to one of our customers looking for content, we
have to figure out where that client is, which of our locations at
that moment is the best to serve the client from, and what load
conditions are, so we don’t overload anything. We have got to handle
flash crowds that are both geographic and content specific. We have
got to replicate the content immediately to handle any of those kinds
of issues, but you can’t afford to have copies of everything
everywhere. You’ve got to make these decisions and respond back to the
clients in milliseconds. We’ve got to be automatic. And when pieces
fail, you’ve got to compensate automatically for that.
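
[A minimal sketch of the selection step described above, not Akamai's
actual algorithm. It assumes each candidate server reports a measured
latency, a load fraction and a health flag; the names Server and
pick_server, the sample data and the 0.8 load ceiling are all
hypothetical.]

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    latency_ms: float   # measured round-trip time to the client's region
    load: float         # fraction of capacity in use, 0.0 to 1.0
    healthy: bool = True


def pick_server(servers, max_load=0.8):
    """Return the lowest-latency healthy server below the load ceiling."""
    live = [s for s in servers if s.healthy]
    if not live:
        raise RuntimeError("no healthy servers available")
    candidates = [s for s in live if s.load < max_load]
    if not candidates:
        # Every live server is saturated: fall back to the least loaded one.
        return min(live, key=lambda s: s.load)
    return min(candidates, key=lambda s: s.latency_ms)


# Hypothetical snapshot of network conditions at one moment.
servers = [
    Server("boston", latency_ms=12, load=0.95),             # close but overloaded
    Server("new-york", latency_ms=20, load=0.40),
    Server("london", latency_ms=85, load=0.10),
    Server("chicago", latency_ms=30, load=0.20, healthy=False),  # failed
]
print(pick_server(servers).name)  # new-york
```

[Because the inputs are re-measured continuously, the same client can
be sent to a different server a moment later, which is one way to read
the remark that "close" changes dynamically.]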

TR: That’s what you call fault tolerant?

LEIGHTON: Yes, and you have to be fault tolerant across all aspects.
Then there are also the non-obvious things. Like billing. We’re
serving billions of hits a day, and we’re billing for every single
hit. We’ve got to figure out whose content it was and how many bytes
it has, and bill them for it. On top of that, we have a service that
we offer our customers, where they can see within 60 seconds how many
hits we served for them in the last 60 seconds. In addition, we can
break down for our customers where the hits are coming from by country
or state. It’s a challenging algorithmic problem. How do you actually
do that? And make it work with a finite amount of hardware and
resources?
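
[A rough sketch of the 60-second reporting problem on a single
machine, assuming hits arrive in time order. The class HitWindow and
its fields are hypothetical; a real system would have to merge such
counts from thousands of servers, which is the hard part the answer
above points to.]

```python
from collections import Counter, deque


class HitWindow:
    """Sliding-window count of hits and bytes per customer."""

    def __init__(self, window_s=60):
        self.window_s = window_s
        self.events = deque()        # (timestamp, customer, country, nbytes)
        self.hits = Counter()        # customer -> hits in window
        self.bytes = Counter()       # customer -> bytes in window
        self.by_country = Counter()  # (customer, country) -> hits in window

    def record(self, ts, customer, country, nbytes):
        # Timestamps are assumed to be non-decreasing.
        self.events.append((ts, customer, country, nbytes))
        self.hits[customer] += 1
        self.bytes[customer] += nbytes
        self.by_country[(customer, country)] += 1
        self._expire(ts)

    def _expire(self, now):
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window_s:
            _, customer, country, nbytes = self.events.popleft()
            self.hits[customer] -= 1
            self.bytes[customer] -= nbytes
            self.by_country[(customer, country)] -= 1

    def report(self, now, customer):
        """Return (hits, bytes) served for a customer in the last window."""
        self._expire(now)
        return self.hits[customer], self.bytes[customer]


w = HitWindow()
w.record(0, "cnn", "US", 1000)
w.record(30, "cnn", "UK", 500)
w.record(70, "cnn", "US", 200)
print(w.report(70, "cnn"))  # (2, 700): the hit at t=0 has aged out
```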

TR: Hardware isn’t really the key to this, is it?

LEIGHTON: It’s not even a major component. I don’t want to belittle
our hardware partners, but the key here is the algorithmic and software
infrastructure. It’s critical.

TR: What is your competition in offering a distributed network for
content delivery?

LEIGHTON: There’s not really much out there. We’re at a time when
there’s a lot of business plans and there’s a lot of stories. There’s
not much in the way of real services available today. Pretty much the
only competitor in our space is Digital Island, which recently
acquired Sandpiper [Networks]. There are others that have announced
[business plans] but are not actively carrying traffic yet. One of
the things that distinguishes Akamai is the amount of research and
engineering and R&D effort that went into designing the system. It’s
not just throwing a bunch of boxes out there. There are companies that
have tried to do that with no distributed system. The companies that
announced services based on that approach two or three years ago
aren’t still in business. Doing that didn’t work.

TR: What are the upcoming challenges for the technology? Is it to
deliver content faster?

LEIGHTON: That’s a component. We’re trying to deliver on the promise
of the Internet. There is the idea that there is a tremendous
revolution happening with regard to the Internet. At the same time,
there’s frustration because of the limitations. What we’re trying to
do is to make the Internet more useful. And a component of
that is making it faster and more reliable. Another component, somewhat
related, is enabling the delivery of more enriching, more enabling
content. If we can make streaming better, and in this case speed is
not so much the issue, it’s bandwidth and not having packet loss,
you’re going to get a much better image on your screen; you’ll do more
with it, and more people are going to use it to convey content and
information. And that’s invaluable in enriching the power of the
Internet. But not everything is pushing bits. Akamai offers services
for capabilities such as Internet conferencing that enable, for
example, distance learning. With these services, content providers
or enterprise customers can effectively deliver content and interact
with small or large audiences on the Web through live audio and video;
there are features for sharing presentations, audience polling and
moderating messaging.

TR: When you introduce a new function like conferencing, for example,
what demands does it place on the network?

LEIGHTON: How are you going to implement it? How are you going to
integrate it into this massive distributed platform? How are you
going to maintain it for thousands of customers? You have thousands of
customers and hundreds of millions of people accessing those customers,
and we’re sitting in between. And it all has to work by itself. You
can’t be monkeying around. Delivering conferencing sounds simple. But
it’s not so simple when you’re talking this kind of scale. When people
think about streaming they think of a single source where the content
comes from, and then it branches out in a tree through the Internet.
Those places can break down and then all those people downstream are
out of luck. We’ve developed an entirely new way of going about it so
that there’s no critical point of failure. If the source dies, then
you’re stuck. But once [the content] is out of the source, we
replicate it and spread it throughout the system. So, it’s not a tree.

TR: What does it look like?

LEIGHTON: It’s hard to describe. The way to think about it is that
between the source and destination, you have multiple transmissions
going on such that you can lose content on those paths; you can have
packet loss on any or all of them, but at the endpoint you have enough
information coming in from those locations so you can reconstruct the
signal. So, if something gets killed along the way, such as a path
gets killed, nobody’s affected.
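
[One simple way to illustrate the idea of reconstructing a signal
from several lossy paths is XOR parity. This is a stand-in technique
chosen for illustration, not the scheme Akamai uses, which the
interview does not describe. The sketch assumes equal-sized chunks,
one chunk per path, and tolerates the loss of any single path; the
function names are hypothetical.]

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(chunks):
    """Return the data chunks plus one XOR parity chunk, one per path."""
    parity = reduce(xor_bytes, chunks)
    return list(chunks) + [parity]


def decode(received):
    """Rebuild the data chunks; at most one entry may be None (lost)."""
    missing = [i for i, c in enumerate(received) if c is None]
    if len(missing) > 1:
        raise ValueError("too many paths lost to reconstruct")
    received = list(received)
    if missing:
        # XOR of all surviving chunks equals the one that was lost.
        present = [c for c in received if c is not None]
        received[missing[0]] = reduce(xor_bytes, present)
    return received[:-1]  # drop the parity chunk


chunks = [b"AAAA", b"BBBB", b"CCCC"]
sent = encode(chunks)      # four transmissions over four paths
sent[1] = None             # one path gets killed along the way
print(decode(sent) == chunks)  # True
```

[Surviving the loss of several paths at once takes a stronger erasure
code, but the principle is the same: the endpoint needs enough
chunks, not any particular one.]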

TR: We’ve all experienced frustrations with videostreaming. In terms
of the technology, what will it take to make it more reliable? When
will we be able to watch webcasts as easily as TV on a full screen?

LEIGHTON: In order that videostreaming be more reliable, you need a
content distribution service to deliver the bits reliably to the edge
of the network, and then you have to have a reliable last-mile
connection to the Internet. If you want high-quality video, then you
better have a high-bandwidth connection to the Internet. It will still
be some time before you can get TV-quality videostreams on a
widespread basis. We’ve demonstrated a megabit-per-second live
stream. In fact, just recently we carried thousands of
one-megabit-per-second streams to live customers accessing a
conference keynote address by Steve Jobs [CEO of Apple Computer].
This is a major milestone for the Internet. With that technology you
get a very high quality videostream. If the last mile is broadband,
then you’re all set to go. One thing we’re working on is bandwidth
profiling. The idea is to automatically detect the bandwidth of the
last mile. Does the client have a broadband connection, a 28K modem,
or is it narrow band—a cell phone or something? Then we deliver the
content as a function of that. So if you detect that the client has
high bandwidth, they get the high-bandwidth version—the streamed
version as opposed to the static version. Or in the case of narrow
bandwidth, you get a printed version as opposed to the graphics.
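
[A minimal sketch of bandwidth profiling as described above: time a
small test transfer, estimate the last-mile throughput, and pick a
version of the content. The function names and the 300 and 20 kbps
thresholds are assumptions made for illustration, not figures from
the interview.]

```python
def estimate_kbps(nbytes, seconds):
    """Estimate throughput in kilobits per second from a timed transfer."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return nbytes * 8 / 1000 / seconds


def choose_variant(kbps):
    """Map an estimated bandwidth to a content version."""
    if kbps >= 300:      # assumed broadband threshold
        return "stream"  # high-bandwidth streamed version
    if kbps >= 20:       # assumed dial-up modem range
        return "static"  # static version with graphics
    return "text"        # narrow band: text-only version


# Hypothetical measurements: bytes received during a one-second probe.
for label, nbytes in [("broadband", 125000), ("28K modem", 3500), ("cell phone", 1200)]:
    print(label, choose_variant(estimate_kbps(nbytes, 1.0)))
```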

TR: The very nature of the Web seems to be changing with such
functions as videostreaming and conferencing. What will Akamai
be working on in five years? What do you think the Internet will
be like then?

LEIGHTON: Things move so fast, it’s really hard to predict. People
who try to predict end up eating their words. I think we’re just
at the beginning of the Internet revolution. I don’t think we’ve
even begun to think of all the things that we can be doing on
the Internet. I can’t tell you what will be the hot service
five years from now. I don’t know. I would hope by then that,
for example, the quality of streaming is much better. That it’s
part of daily life. At the least, I would expect the typical
Web experience to become richer, more efficient and more
reliable than it is today.

TR: You are seen by many as a model of an academic making it big
as an entrepreneur in the new economy. What do you tell those
looking to emulate your success?

LEIGHTON: I never had an aspiration to be an entrepreneur. I love
academics and co-founded Akamai because we felt it was the best
way to transfer our technology from a research environment into
practice. It felt really nice to be taking technology, especially
technology out of a university, and making a difference with it.
That’s probably the biggest reward. It often takes 10 to 20 years
for a technology in a university to really manifest itself in
practice. And this time we’re able to decrease that time
dramatically. I’m perfectly happy writing a paper that only
five people read. Pretty smart people will read it, and I get a
kick out of that. It’s what I spent all my life doing.
But this is something with a chance to make a difference.

TR: Do you ever miss the days when, as you put it, you spent your
time writing papers that maybe five people were able to read and
understand?

LEIGHTON: Yes, although I don’t have much time to think about it.


This archive was generated by hypermail 2b29 : Mon Sep 04 2000 - 22:16:49 PDT