| Abstract: |
Recent studies show that a significant part of Internet traffic is delivered through Web-based applications. To cope with the increasing demand for Web content, large-scale content hosting
and delivery infrastructures, such as data centers and content distribution networks, are continuously being deployed. Being able to identify and classify such infrastructures is helpful not only
to content producers, content providers, and ISPs, but also to the research community at large, for example to quantify the degree of infrastructure deployment in the Internet or the replication
of Web content. In this paper, we introduce Web Content Cartography, i.e., the identification and classification of content hosting and delivery infrastructures. We propose a lightweight and fully
automated approach to discover infrastructures based only on DNS measurements and BGP routing table snapshots. Our experimental results show that our approach is feasible even with a limited
number of well-distributed vantage points. We find that some popular content is served exclusively from specific regions and ASes. Furthermore, our classification enables us to derive
content-centric AS rankings that complement existing AS rankings and shed light on recent observations about shifts in inter-domain traffic and the AS topology. |