Nonintrusive and Accurate Measurement of Unidirectional Delay and Delay Variation on the Internet
Ian D. GRAHAM <firstname.lastname@example.org>
This paper describes a novel and low-cost technique for accurately measuring delay, delay variation and packet loss on the inter-continental Internet. The technique enables us to measure delay between two Internet sites anywhere in the world, with an absolute accuracy of better than 10 microseconds. The system is being developed as part of a project to study patterns of Internet delay within New Zealand and around the Pacific Rim.
At present most published statistics on packet delay over the Internet are based on round-trip measurements, such as those made with ping. Such measurements are flawed: congestion on the outward and return paths may be completely different, and indeed the two paths may not even follow the same route. Active methods, such as ping, also require the injection of test traffic into the network, which can disturb the conditions being measured.
This paper describes a new system which allows the accurate measurement of unidirectional delays on real traffic; it does not require the injection of any test traffic into the network. This is in contrast to the Surveyor project, which also uses GPS time receivers but relies on the exchange of special measurement frames between sites. Our new system is based on our work on a very similar ATM measurement technique, which is described in [2]. Data from these projects are regularly published on our Web site.
The technique uses PC-based monitoring stations, sending traffic information back to a single analysis site. Each monitoring station includes a GPS time receiver, which enables the correction of the local clock to UTC with an accuracy of better than one microsecond. Our original system was ATM based, but the new method described in this paper works directly on IP packets. Delays on individual IP packets are measured, the packets at each end of the network being recognized by a 32-bit CRC calculated over the packet payload. The use of PC-based monitoring, rather than specialized test equipment, results in measurement stations that cost in the region of $US 5,000 per site.
We describe the results of delay measurements between England and New Zealand. A program of measurement between sites on the Pacific Rim will begin in early 1998. The results of this project will have an influence on the developing ideas of quality of service over the Internet, and will also provide input into techniques for efficient routing on both ATM networks and the Internet.
The measurement system consists of a number of PC-based monitoring sites, each equipped with a Trimble Palisade GPS time receiver and running LINUX. The monitoring site machines act as servers, and return information on packet arrival times and GPS time corrections to an analysis client over TCP/IP. The analysis client software has been run on a number of UNIX-based operating systems; however, most of our development work has been done under LINUX.
The measurement servers each have two Ethernet cards. One of these is used to communicate with the client; the second interface is used solely for recording packet arrival times. In our present implementation we use the facilities of libpcap, and the developmental LINUX 2.1.75 kernel which provides microsecond-resolution timers.
Arriving packets are filtered on the source and destination addresses, so that only relevant subsets of the data are recorded. The source and destination addresses are then combined with the IP protocol type (TCP, UDP, ICMP etc.) into a 32-bit descriptor. A 32-bit CRC is calculated over the payload part of the IP packet; this acts as a packet signature, as its value should not change as the packet passes through the Internet. The IP header is not included in this CRC as certain fields -- TTL and header checksum -- will change between source and destination. The time stamp for the packet, provided by libpcap, is scaled to 0.25 microsecond resolution, and truncated to 32 bits. These three 32-bit words -- descriptor, time stamp and payload CRC -- derived from each packet are then stored in a buffer.
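The per-packet record described above can be sketched as follows. This is only an illustrative sketch, not the code used in the monitors: the function name packet_record, the way the addresses are folded into the descriptor, and the use of the zlib CRC-32 polynomial are all assumptions made for the example, since the text specifies only that each of the three words is 32 bits wide.

```python
import struct
import zlib

TICKS_PER_SECOND = 4_000_000  # 0.25 microsecond resolution


def packet_record(ts_sec: int, ts_usec: int, ip_packet: bytes) -> bytes:
    """Build the 12-byte (descriptor, time stamp, payload CRC) record.

    ts_sec / ts_usec are the capture time stamp fields; ip_packet starts
    at the first byte of the IPv4 header.
    """
    header_len = (ip_packet[0] & 0x0F) * 4            # IHL field, in bytes
    total_len = int.from_bytes(ip_packet[2:4], "big")  # excludes link padding
    protocol = ip_packet[9]
    src, dst = ip_packet[12:16], ip_packet[16:20]

    # Hypothetical descriptor layout: 24 bits folded from the two
    # addresses, plus the 8-bit protocol number.
    descriptor = ((zlib.crc32(src + dst) & 0xFFFFFF) << 8) | protocol

    # Scale to 0.25 microsecond ticks and truncate to 32 bits.
    ticks = (ts_sec * TICKS_PER_SECOND + ts_usec * 4) & 0xFFFFFFFF

    # Signature over the payload only, so that TTL and header checksum
    # changes along the path do not alter it.
    payload_crc = zlib.crc32(ip_packet[header_len:total_len]) & 0xFFFFFFFF

    return struct.pack("!III", descriptor, ticks, payload_crc)
```

Note that a 32-bit count of 0.25 microsecond ticks wraps roughly every 1074 seconds, so the consumer of these records needs the accompanying GPS calibration data to place each time stamp on an absolute time scale.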
The GPS time receiver has two outputs occurring at one-second intervals: a time pulse approximately one microsecond wide, with its leading edge synchronized to UTC, and an ASCII time packet which identifies the UTC second to which the time pulse refers. The time pulse is connected to the ring indicator input of a serial port on the PC, and produces an interrupt at one-second intervals. The PC's internal clock is read when this interrupt is received; this clock value can then be used to calibrate the drift and offset of the internal clock relative to UTC.
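One way to turn the one-second clock readings into a drift and offset correction is a straight-line fit of UTC against local clock time. The sketch below assumes a least-squares fit over a short window of pulses; the text does not say which estimator is actually used, and the names fit_clock and to_utc are hypothetical.

```python
def fit_clock(samples):
    """Fit utc = rate * local + offset by least squares.

    samples is a list of (local_clock_seconds, utc_seconds) pairs, one
    pair per GPS time pulse. Returns (rate, offset).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two time pulses")
    mean_local = sum(local for local, _ in samples) / n
    mean_utc = sum(utc for _, utc in samples) / n
    # Centre the data before summing to limit rounding error.
    sxx = sum((local - mean_local) ** 2 for local, _ in samples)
    sxy = sum((local - mean_local) * (utc - mean_utc) for local, utc in samples)
    if sxx == 0:
        raise ValueError("local clock readings are all identical")
    rate = sxy / sxx                       # captures the drift
    offset = mean_utc - rate * mean_local  # captures the fixed offset
    return rate, offset


def to_utc(local_time, rate, offset):
    """Correct a local packet time stamp to UTC."""
    return rate * local_time + offset
```

With a fit of this kind, a packet time stamp taken between two pulses is corrected by the same line, so both the constant offset and the slow drift of the PC clock are removed.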
The one-second interrupts provide the basic pulse of the system. During each second the system collects measurements and stores these in a buffer. At the same time, as much data as possible from the buffer accumulated in the previous second is passed on to the client system by TCP/IP. The transmitted data is formatted as records containing data on up to 100 IP packets; each record also contains the time calibration data derived from the GPS receiver. Data records that cannot be transmitted are discarded. Thus the system is able to adapt to the bandwidth of the link back to the analysis site.
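The record batching and discard behaviour can be illustrated with a short sketch. The dictionary layout and the send and deadline_reached callbacks are hypothetical stand-ins for the TCP/IP transmission code and the arrival of the next one-second interrupt; only the limit of 100 packets per record and the discarding of unsent records come from the description above.

```python
RECORD_CAPACITY = 100  # packets per transmitted record


def build_records(packets, calibration):
    """Split one second of packet data into records of up to 100 packets,
    each carrying the GPS-derived time calibration data."""
    return [
        {"calibration": calibration, "packets": packets[i:i + RECORD_CAPACITY]}
        for i in range(0, len(packets), RECORD_CAPACITY)
    ]


def flush_previous_second(records, send, deadline_reached):
    """Send as many records as possible before the next time pulse.

    Returns (sent, discarded). Records left over when the deadline is
    reached are dropped rather than queued.
    """
    sent = 0
    for record in records:
        if deadline_reached():
            break
        send(record)
        sent += 1
    return sent, len(records) - sent
```

Dropping whole records rather than queueing them keeps the monitor's memory use bounded and lets the data rate follow whatever the link back to the analysis site can carry.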
The client software is responsible for setting up the TCP/IP connection to each monitoring system, for defining the filtering to be used and starting the measurement. Usually a client will be in simultaneous contact with several monitors, receiving streams of measurement data from each.
As the data are received, the client software applies the time correction and writes to disk the triplet of corrected timestamp, packet descriptor and payload CRC from each IP packet. These data can then be analyzed offline by correlating the arrival times of packets with the same CRC, the difference in arrival times giving the time delay between sites. We have also implemented a real-time CRC matching system; this can give a very rapid display of a histogram of delays between two sites. However, the present implementation of the real-time system is usually not capable of matching as many packets as our post-processor.
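The offline correlation step might look like the following sketch. It assumes each site's log is a list of (corrected timestamp in seconds, descriptor, payload CRC) triplets, and it simply skips any packet with more than one candidate match; the function name match_delays and the one-second window parameter are choices made for this example rather than details of our post-processor.

```python
from collections import defaultdict


def match_delays(source, dest, max_delay=1.0):
    """Return one-way delays for packets seen at both sites.

    source and dest are lists of (timestamp, descriptor, payload_crc).
    A destination packet is matched only if exactly one source packet
    with the same descriptor and CRC was seen in the preceding
    max_delay seconds.
    """
    seen_at_source = defaultdict(list)
    for ts, descriptor, crc in source:
        seen_at_source[(descriptor, crc)].append(ts)

    delays = []
    for ts, descriptor, crc in dest:
        candidates = [
            t for t in seen_at_source.get((descriptor, crc), ())
            if 0.0 <= ts - t <= max_delay
        ]
        if len(candidates) == 1:   # ambiguous or unmatched packets are skipped
            delays.append(ts - candidates[0])
    return delays
```

A histogram of the returned values gives the delay distribution for one direction; swapping the two arguments gives the other direction.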
Using packet payload CRCs as a signature for matching has the advantage that no special traffic needs to be generated to measure delay -- any existing traffic will do as long as there are not a significant number of packets with the same payload CRC within the typical delay times of our networks, from zero to one second. Where there is no existing traffic to measure we have used both ping and a simple process generating small UDP packets with random contents at a steady rate. In neither case have we had problems with duplicate packet CRCs.
Packet loss can also be detected by the payload CRC. If a particular value of CRC is detected at its source, but not at its destination, then that packet is missing somewhere in the network.
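A minimal sketch of this loss check is given below, assuming the same (timestamp, descriptor, payload CRC) triplets from the two sites and a one-second window for the expected arrival. The function name lost_packets is hypothetical. A packet reported here may also be one whose record was discarded by a monitor before transmission to the client, so the result is best read as an upper bound on network loss.

```python
def lost_packets(source, dest, max_delay=1.0):
    """Return source records with no matching arrival at the destination.

    source and dest are lists of (timestamp, descriptor, payload_crc).
    """
    arrivals = {}
    for ts, descriptor, crc in dest:
        arrivals.setdefault((descriptor, crc), []).append(ts)

    lost = []
    for ts, descriptor, crc in source:
        arrival_times = arrivals.get((descriptor, crc), ())
        # Lost if nothing with this signature arrived within the window.
        if not any(0.0 <= t - ts <= max_delay for t in arrival_times):
            lost.append((ts, descriptor, crc))
    return lost
```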
In order to estimate the error involved in our system we have made a series of measurements at two adjacent points on the same Ethernet segment, and calculated the difference in packet arrival times. If there were no error, we would expect a distribution corresponding to the 0.25 microsecond measurement resolution of the system. From figure 1 it is clear that we suffer from some jitter, probably due to the uncertainty in the interrupt response time. However, almost all the errors are less than 10 microseconds, and the great majority of measurements fall within 5 microseconds of 0. When measuring the Internet we are usually concerned with delays of the order of tens or hundreds of milliseconds; a ten microsecond accuracy is quite sufficient for the purpose.
In February 1998 measurement systems were set up in Cambridge, England and Hamilton, New Zealand. Measurements were made over extended periods using ICMP traffic generated by ping, and ping delays were also recorded. Some early experiments were used to investigate whether there was any systematic difference between the delays for UDP and ICMP frames -- no significant difference was found.
Figure 2 shows the raw results for a nineteen hour measurement starting at 0700 GMT on February 18, 1998. In this diagram, for clarity, the uni-directional time differences are shown as positive for Hamilton->Cambridge traffic, and negative for Cambridge->Hamilton. Ping measurements are shown as positive. The delay distribution for this same set of measurements is shown in Figure 3. There is a clear difference between the delay distributions for the two directions. This is probably due to an automatic load sharing system in use at Waikato, which distributes traffic between three different trans-Pacific routes depending on the load on each route. Each of these routes has a slightly different delay.
Figures 4, 5 and 6 show the variation of delay with time of day, including the ping delay as well as uni-directional delays, averaged over periods of one hour, fifteen minutes and one minute. The ping delay is the sum of a forward and backward path delay, plus some processing time; for the purposes of comparison with the uni-directional delays, ping delays have been normalized by dividing by two in these diagrams.
It is clear that there is a strong diurnal variation in both of the uni-directional delays, but that these changes are out of phase, and so tend to cancel out in the ping delay. The measurements using ping tend to underestimate the actual delay variations between the measurement points.
Without more measurement sites along the path it is difficult to explain the delay variation conclusively. However, it appears to be consistent from day to day, with the increase in delay in the NZ->UK direction corresponding to the working day in the UK, from 9 in the morning to 5 in the afternoon. Therefore we surmise that the delay variation in this direction is due to congestion on the transatlantic link.
Figures 7, 8, 9 and 10 show network delay medians, 10th and 90th percentiles plotted against time for both directions and at time resolutions of 15 minutes and 1 minute. There is a strong anomaly in the UK->NZ delays, a period of increased delay lasting from about 0700 to 0900; this is not evident on the NZ->UK delay plot.
Figure 11 shows an estimate of the extra delay of a ping operation, obtained by plotting the results of subtracting the sum of the hourly mean NZ->UK and UK->NZ delays from the hourly mean ping delay. This results in values of between two and eight milliseconds for the extra delay of a ping operation. A more sensitive determination of this delay is being planned, where the delay of each ping is recorded together with the individual unidirectional path delays.
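The subtraction used for this estimate is simple enough to state as code. The sketch below assumes three equal-length lists of hourly means in milliseconds; the function name ping_overhead is hypothetical, and the numbers in the usage note are illustrative only, not measured values.

```python
def ping_overhead(ping_ms, nz_to_uk_ms, uk_to_nz_ms):
    """Extra delay of a ping operation for each hour, in milliseconds.

    Each argument is a list of hourly mean delays. The overhead is the
    ping round-trip time minus the sum of the two one-way delays.
    """
    if not (len(ping_ms) == len(nz_to_uk_ms) == len(uk_to_nz_ms)):
        raise ValueError("all three series must cover the same hours")
    return [
        ping - (forward + backward)
        for ping, forward, backward in zip(ping_ms, nz_to_uk_ms, uk_to_nz_ms)
    ]
```

For example, an hourly mean ping delay of 305.0 ms with one-way means of 150.0 ms in each direction would give an overhead of 5.0 ms for that hour.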
We have demonstrated in this paper that it is possible to make high-accuracy measurements of delay on the intercontinental Internet with a simple PC and a GPS time receiver, for a total cost of less than $US 5,000 per measurement site. More measurements are being made, and will continue to be published on our Web site. It is our intention to create a measurement infrastructure around the Asia/Pacific region to monitor trends in Internet service over the next few years; we hope that such measurements will become commonplace and aid in the planning of the worldwide Internet service.
[2] A remote ATM network monitoring system, Ian Graham, Murray Pearson, Jed Martens, Stephen Donnelly and John G Cleary, submitted to SCICON'98, The IEEE Singapore International Conference on Networks, July 1998.