Analysis code

Preliminary

  1. To analyze TTL-violating resolvers, you first need the three datasets (bind, luminati, and apache) from the Archive.

  2. We use the CAIDA as-organization dataset to map each ASN to its ISP and country, and the Python pyasn module to find the ASN that owns each IP prefix.

  3. The analysis scripts are written in Python 3.
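
The lookup described in item 2 can be sketched as follows. This is a minimal illustration, not the code used in the scripts: the function names load_as_org and resolver_info are hypothetical, and the field order is assumed from the "# format:" header lines that CAIDA as-organization files carry. The asndb argument is expected to behave like a pyasn database, whose lookup(ip) call returns an (asn, prefix) pair.

```python
from collections import namedtuple

OrgInfo = namedtuple("OrgInfo", ["asn", "org_name", "country"])


def load_as_org(lines):
    """Parse CAIDA as-organization lines into {asn: OrgInfo}.

    Assumed layout (taken from the file's own "# format:" headers):
      org_id|changed|org_name|country|source
      aut|changed|aut_name|org_id|opaque_id|source
    """
    orgs = {}      # org_id -> (org_name, country)
    as_rows = []   # (asn, org_id), resolved once all orgs are known
    section = None
    for line in lines:
        line = line.strip()
        if line.startswith("# format:"):
            # The header names the record type of the rows that follow.
            section = "org" if line.startswith("# format:org_id") else "aut"
            continue
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        if len(fields) < 4:
            continue  # skip malformed rows
        if section == "org":
            orgs[fields[0]] = (fields[2], fields[3])
        elif section == "aut":
            as_rows.append((int(fields[0]), fields[3]))
    result = {}
    for asn, org_id in as_rows:
        name, country = orgs.get(org_id, ("unknown", "unknown"))
        result[asn] = OrgInfo(asn, name, country)
    return result


def resolver_info(ip, asndb, asn_to_org):
    """Return OrgInfo for an IP, or None if no prefix covers it."""
    asn, _prefix = asndb.lookup(ip)  # pyasn-style: (asn, prefix)
    if asn is None:
        return None
    return asn_to_org.get(asn, OrgInfo(asn, "unknown", "unknown"))


# Usage with real data (file names are placeholders):
#   import pyasn
#   asndb = pyasn.pyasn("ipasn.dat")
#   with open("as-org2info.txt", encoding="utf-8") as fh:
#       asn_to_org = load_as_org(fh)
#   print(resolver_info("192.0.2.1", asndb, asn_to_org))
```

Parsing the organization rows first and resolving ASNs afterwards keeps the function independent of the order in which the two sections appear in the file.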

Source code

We provide the following source code. Instructions and usage are explained below.

filename Download
log_analyzer.py link
meta_analyzer.py link

log_analyzer.py

First, set the following paths: bind_file_directory, pointing to the bind dataset; main_file_base_directory, pointing to the Luminati dataset; and apache_file_directory, pointing to the Apache dataset. This script analyzes the raw logs from our BIND name server and Apache web server, together with the responses we received from our Luminati exit nodes, to map resolver IPs to exit nodes and to identify which resolvers violate the authoritative TTL.
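
As a rough illustration of the path setup, the three variables might be set and checked as below. Only the variable names come from log_analyzer.py; the path values are placeholders, and the missing_directories helper is a hypothetical addition for catching a wrong path before a long run.

```python
import os

# Variable names as used in log_analyzer.py; the values are placeholders
# for wherever the archive datasets were unpacked.
bind_file_directory = "/path/to/bind_dataset/"
main_file_base_directory = "/path/to/luminati_dataset/"
apache_file_directory = "/path/to/apache_dataset/"


def missing_directories(paths):
    """Return the names whose configured directory does not exist."""
    return [name for name, path in paths.items() if not os.path.isdir(path)]


if __name__ == "__main__":
    configured = {
        "bind_file_directory": bind_file_directory,
        "main_file_base_directory": main_file_base_directory,
        "apache_file_directory": apache_file_directory,
    }
    for name in missing_directories(configured):
        print("Not found, please fix before running:", name)
```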

meta_analyzer.py

This script further analyzes the output files of log_analyzer.py to produce the numbers and results reported in the results section of the paper, as well as the data behind its figures.

Function description

Brief descriptions of the functions used to generate the results:
File              Function                                Description
meta_analyzer.py  preprocess_all_resolvers                Get resolver to ASN, organization, and country mapping
meta_analyzer.py  table_maker_global, table_maker_local   Group each organization/ISP's resolver IPs, exit nodes, and TTL-dishonoring fraction of exit nodes. Used to generate data for Table 3
meta_analyzer.py  get_client_to_country_distro            Group each country by percentage of exit nodes
meta_analyzer.py  geographic_dishonoring_resolver_distro  Group each country by percentage of TTL-dishonoring resolvers
meta_analyzer.py  cdf_data_maker                          For each resolver, find the ratio of TTL-dishonoring exit nodes for each TTL. Used for generating the CDF in Figure 3
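
To make the cdf_data_maker description concrete, here is a minimal sketch of computing a per-resolver ratio of TTL-dishonoring exit nodes and turning the ratios into CDF points. It is not the code in meta_analyzer.py: the function names, the (resolver, exit node, dishonored) input tuples, and the rule that an exit node counts as dishonoring if any of its observations says so are all assumptions made for illustration. For per-TTL curves, the same functions would be applied to the observations of one TTL value at a time.

```python
from collections import defaultdict


def dishonoring_ratios(observations):
    """Map each resolver to its fraction of TTL-dishonoring exit nodes.

    observations: iterable of (resolver_ip, exit_node_id, dishonored).
    """
    nodes = defaultdict(dict)  # resolver -> {exit_node: dishonored?}
    for resolver, exit_node, dishonored in observations:
        # Assumption: one dishonoring observation marks the exit node.
        previous = nodes[resolver].get(exit_node, False)
        nodes[resolver][exit_node] = previous or bool(dishonored)
    return {r: sum(seen.values()) / len(seen) for r, seen in nodes.items()}


def empirical_cdf(values):
    """Return sorted (value, fraction of values <= value) points."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for rank, value in enumerate(ordered, start=1):
        if points and points[-1][0] == value:
            points[-1] = (value, rank / n)  # ties keep the highest rank
        else:
            points.append((value, rank / n))
    return points


if __name__ == "__main__":
    # Made-up observations, for illustration only.
    demo = [
        ("r1", "e1", True),
        ("r1", "e2", False),
        ("r2", "e3", False),
    ]
    ratios = dishonoring_ratios(demo)
    print(empirical_cdf(ratios.values()))
```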