Monday, November 24, 2008

A Policy-aware Switching Layer for Data Centers

The main idea of this paper is to provide a policy layer that does not mandate placing middleboxes in sequence on the physical network paths. PLayer propagates policies to the switches, which then route packets through the required middleboxes before delivering them to their destination.
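To make the idea concrete for myself, here is a minimal sketch of how such a policy might look. The policy table, the traffic classes, and the next_hop helper below are my own illustrative assumptions, not PLayer's actual interface, and the way a switch knows which middleboxes a frame has already visited is simplified here.

    # Hypothetical sketch of a PLayer-style policy: for each traffic class,
    # switches forward frames through the listed middleboxes in order
    # before delivering them to the final destination.
    POLICY = {
        # (protocol, dst_port) -> ordered middlebox sequence
        ("tcp", 80): ["firewall", "load_balancer"],
        ("tcp", 25): ["firewall"],
    }

    def next_hop(frame_proto, frame_port, hops_done):
        """Return the next middlebox to visit, or None to deliver to the server."""
        sequence = POLICY.get((frame_proto, frame_port), [])
        remaining = [m for m in sequence if m not in hops_done]
        return remaining[0] if remaining else None

    # Example: an HTTP frame that has already passed through the firewall
    print(next_hop("tcp", 80, {"firewall"}))  # -> "load_balancer"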

This is a very nice idea that tries to alleviate the complexity of configuring middleboxes while ensuring correctness, network flexibility, and efficient resource usage. The authors also discuss why minimizing the changes required for adopting PLayer is crucial. It is true that the success of any new technique is tightly coupled to the amount of change it requires for deployment.

It was nice that the authors also thought about security issues related to PLayer. They discuss how policy violations can take place during policy churn and how these can be solved.

I really liked reading the paper because it gave a detailed explanation of the limitations of current techniques and explained PLayer very clearly, with all the relevant details listed. I would recommend keeping this paper in the class.

Improving MapReduce Performance in Heterogeneous Environments

The authors in this paper propose a new scheduling algorithm, LATE, that improves MapReduce performance in heterogeneous environments. The paper is very interesting to read and is well written. The Hadoop scheduler makes assumptions such as a homogeneous environment, tasks progressing at a constant rate, no cost in launching speculative tasks, and equal weight for all phases of a reduce task. These assumptions degrade performance when they do not hold.

Pros:
  • Thinks about heterogeneous environments.
  • Speculative tasks are launched based on an estimate of the time left to finish a task.
  • I liked that the authors discuss how their time-left estimation technique may fail in certain scenarios.
  • I really enjoyed reading the paper. It was very nicely written.
  • This solves a problem in the data-parallel applications that are used in industry and are being adopted by more and more companies.

Thoughts:
  • The authors suggest that different methods can be used to estimate the time left for a task. Currently, they estimate it as (1 - ProgressScore)/ProgressRate (a small sketch of this heuristic follows this list). Is there a way to store a history of tasks to gather statistics and then use that for estimating the time taken by a task?
  • Has there been follow-up work on more sophisticated methods for estimating finish times?
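As a sketch of the heuristic mentioned above: estimate the time left as (1 - ProgressScore)/ProgressRate and speculate the slow task with the largest estimate. The slow-task cutoff and the task tuple format below are my own simplifications, not the exact LATE/Hadoop code.

    def time_left(progress_score, elapsed):
        # ProgressRate = ProgressScore / elapsed; estimate = (1 - ProgressScore) / ProgressRate
        if elapsed <= 0 or progress_score <= 0:
            return float("inf")
        progress_rate = progress_score / elapsed
        return (1.0 - progress_score) / progress_rate

    def pick_speculative_task(tasks, slow_task_fraction=0.25):
        # tasks: list of (task_id, progress_score, elapsed_seconds); speculate the
        # slowest-rate task with the largest estimated time left
        rates = sorted(score / elapsed for _, score, elapsed in tasks if elapsed > 0)
        if not rates:
            return None
        cutoff = rates[int(len(rates) * slow_task_fraction)]
        slow = [(tid, time_left(score, elapsed))
                for tid, score, elapsed in tasks
                if elapsed > 0 and score / elapsed <= cutoff]
        return max(slow, key=lambda x: x[1])[0] if slow else None

    # t2 has made the least progress at the same elapsed time, so it is speculated
    print(pick_speculative_task([("t1", 0.9, 100), ("t2", 0.3, 100), ("t3", 0.8, 100)]))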

Tuesday, November 18, 2008

A Delay-Tolerant Network Architecture for Challenged Internets

This paper proposes an architecture that uses overlays for achieving interoperability between diverse networks. The motivation behind building this architecture is to address problems with the TCP/IP model with respect to challenged networks. Some of these problems are network partitioning, high latencies and unexpected data transfer delays.

The architecture includes regions and gateways. Gateways are responsible for storing messages to provide reliable delivery. For routing, messages use a naming scheme based on name tuples of the form {Region, entity}, where region names are globally unique. Convergence layers add features like reliability and message boundaries.
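To get a feel for the naming and store-and-forward idea, here is a toy sketch. The gateway API, routing table, and region names are my own illustrative assumptions, not the paper's concrete interface.

    from collections import defaultdict

    class Gateway:
        def __init__(self, region, routes):
            self.region = region             # region this gateway serves
            self.routes = routes             # destination region -> next-hop gateway name
            self.stored = defaultdict(list)  # messages held until the next hop is reachable

        def receive(self, message):
            dest_region, entity = message["name"]      # name tuple: {Region, entity}
            if dest_region == self.region:
                print("deliver to local entity", entity)  # late binding of the entity name
            else:
                # store the message; forward it when a contact to the next hop exists
                self.stored[self.routes[dest_region]].append(message)

    gw = Gateway("campus", {"village": "satellite-gw"})
    gw.receive({"name": ("village", "clinic-server"), "data": b"bundle payload"})
    print(gw.stored)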

Pros:
  • Provides in-network storage and retransmissions in challenged networks
  • The prototype implementation demonstrates the practicality of the architecture to a certain extent.
Cons:
  • Existing applications may need to be changed to adopt the architecture
  • The overhead of using messages instead of packets needs to be evaluated
  • Does it break the end-to-end model?


Monday, November 17, 2008

An Architecture for Internet Data Transfer

The authors in this paper propose the DOT architecture, which separates content negotiation from data transfer. The main idea is to provide a transfer service that applications can use for bulk point-to-point transfers. The advantages of providing a transfer service include:
  • reuse of an existing transfer service
  • ease of adopting new transfer techniques
Instead of reiterating how the authors design DOT, I will list my views about the paper. I do not have any specific negative points about the proposed scheme. The main motivation is to provide a general data transfer service that anybody can use without spending time re-implementing the same functionality. It is good to provide general implementations that anybody can use, but such implementations come with a drawback: it becomes harder to add optimizations specific to the application that uses the transfer service.
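Here is a rough sketch of the kind of interface I imagine such a transfer service exposing. The class and method names are my own illustrative assumptions, not DOT's actual API: the application negotiates what to send over its own protocol and passes only an object identifier to the peer, while the transfer service moves the bytes.

    import hashlib

    class TransferService:
        def __init__(self):
            self.store = {}

        def put(self, data: bytes) -> str:
            """Register data for transfer and return a content-derived object ID."""
            oid = hashlib.sha1(data).hexdigest()
            self.store[oid] = data
            return oid

        def get(self, oid: str) -> bytes:
            """Fetch the object, in reality via whatever transfer plugin suits best."""
            return self.store[oid]

    svc = TransferService()
    oid = svc.put(b"bulk payload")          # sender side: the application sends only `oid` to the peer
    print(svc.get(oid) == b"bulk payload")  # receiver side: the bytes are retrieved by the service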

Sunday, November 16, 2008

X-Trace: A Pervasive Network Tracing Framework

X-Trace is a tool built to ease the process of diagnosing complex tasks like web requests and DNS resolution. The main concept involves the client inserting X-Trace metadata into a network task. The metadata includes a task identifier that uniquely identifies the task. When a node sees X-Trace metadata, it generates a report, which is later used for reconstructing the datapath. Task tree construction is performed offline by the user, who collects the reports and uses the information to build a request tree and diagnose the path traced by the request.
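To illustrate the offline reconstruction step, here is a toy sketch. The report fields and the single parent/child link below are my simplified stand-in for X-Trace's actual metadata and edge types.

    from collections import defaultdict

    reports = [
        {"task": "T1", "op": "http-req",   "parent": None,       "node": "client"},
        {"task": "T1", "op": "dns-lookup", "parent": "http-req", "node": "resolver"},
        {"task": "T1", "op": "tcp-conn",   "parent": "http-req", "node": "web-server"},
    ]

    def build_task_tree(reports, task_id):
        # group reports for one task by their parent operation, then walk the tree
        children = defaultdict(list)
        for r in reports:
            if r["task"] == task_id:
                children[r["parent"]].append(r)
        def walk(parent, depth=0):
            for r in children[parent]:
                print("  " * depth + r["op"] + " @ " + r["node"])
                walk(r["op"], depth + 1)
        walk(None)

    build_task_tree(reports, "T1")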

The main motivation for this work is to ease the diagnosis of complex systems that have an interplay of multiple applications and protocols. I feel that with the rapid evolution of the Internet and the many applications running over it, better debugging mechanisms are really needed. X-Trace helps automate the process of debugging rather than collecting reports and then manually trying to assemble them.

Even though X-Trace appears to be really helpful, the first thing that concerned me was its adoption. The success of X-Trace relies on the ease of modifying current applications and protocols. Partial deployment of X-Trace can limit the view a user gets while diagnosing a problem. I was happy that the authors were aware of these issues and discussed them in the paper. I would recommend keeping this paper in the syllabus.

Wednesday, November 5, 2008

Internet Indirection Infrastructure

The authors in this paper propose an overlay-based scheme to provide services like multicast, anycast, and mobility. The main idea is that the sender sends a packet with an identifier while receivers set triggers to receive packets associated with that identifier.
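A toy version of the trigger mechanism helps show why the decoupling works. This sketch ignores Chord entirely and keeps all triggers in one in-memory table, so it only illustrates the rendezvous idea, not the real infrastructure.

    from collections import defaultdict

    triggers = defaultdict(set)            # identifier -> set of receiver addresses

    def insert_trigger(identifier, receiver_addr):
        triggers[identifier].add(receiver_addr)

    def send(identifier, data):
        # the sender only names the identifier; the infrastructure finds the receivers
        for addr in triggers[identifier]:
            print("forwarding", data, "to", addr)   # stand-in for an actual IP send

    insert_trigger("video-stream-42", "10.0.0.5")
    insert_trigger("video-stream-42", "10.0.0.9")   # a second receiver gives multicast
    send("video-stream-42", b"frame-1")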

I liked this scheme because it is neat and simple. The sender has to communicate with the intermediary and not with all the interested receivers of the packet. This aids in decoupling the act of sending from receiving.

This scheme relies heavily on the performance of the overlay, which should be robust, scalable, efficient, and stable. The authors choose Chord and argue that it satisfies all these requirements.

I really like the concept for its simplicity, but a few things concern me:
1. What happens when a Chord node goes down? The triggers registered with the failed node must be copied to another node, which requires robust replication and makes the system more complicated.
2. The system is less efficient because of the added indirection.
3. How is revocation of triggers handled cleanly?

Middleboxes No Longer Considered Harmful

This paper discusses how middleboxes such as NATs and firewalls violate two principles of the Internet architecture:
  1. Every Internet entity has a unique identifier.
  2. Packets are processed only by their respective owners.
The authors argue that even though middleboxes break these rules, they are needed for important reasons, a few of them being security, performance improvement through caching, and load balancing. The paper then proposes an architecture that provides the functionality of middleboxes without breaking these principles.

The architecture, called the Delegation-Oriented Architecture (DOA), proposes the following (a rough sketch follows the list):
  1. A globally unique identifier (EID) in a flat namespace, carried by every packet.
  2. The sender and receiver can specify the intermediaries that should process the packet.
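Here is how I picture the identifiers and delegation working. The 160-bit size comes from the paper, but the resolution table and delegation chain below are my own illustrative simplification of the DHT-based EID resolution, not DOA's actual mechanism.

    import os
    from dataclasses import dataclass

    def new_eid() -> bytes:
        return os.urandom(20)                    # 160-bit flat identifier

    @dataclass
    class DOAPacket:
        src_eid: bytes                           # one 160-bit EID for the source
        dst_eid: bytes                           # and one for the destination
        payload: bytes

    resolution = {}                              # stand-in for the DHT: EID -> IP or another EID

    receiver, firewall = new_eid(), new_eid()
    resolution[receiver] = firewall              # the receiver delegates to its firewall
    resolution[firewall] = "192.0.2.7"           # the firewall's actual locator

    pkt = DOAPacket(src_eid=new_eid(), dst_eid=receiver, payload=b"hello")
    hop = resolution[pkt.dst_eid]
    while isinstance(hop, bytes):                # keep resolving until we reach an IP address
        hop = resolution[hop]
    print("deliver first to", hop)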
Without going into the details of the architecture and how it works, I want to list some of the things that concerned me:
  1. Even after adding the intermediary information to the packet, we are still defying the end-to-end principle. What happens if the intermediary crashes?
  2. The unique identifiers are 160 bits long, and a packet carries two of them. Isn't this an overhead for small packets?
  3. The idea seems interesting, but I am concerned about performance. Even though the architecture provides flexibility by allowing the intermediaries to be anywhere rather than on the path to the destination, the packet now has to travel first to the intermediary and then to the destination. In addition, a lookup is required to find the intermediary.
  4. Another question such systems raise is scalability. With so many machines on the Internet, if every machine sends its messages through an intermediary and a DHT is used for EID resolution, we end up relying on the performance of the DHT for lookup and information retrieval.

Monday, November 3, 2008

DNS Performance and the Effectiveness of Caching

This paper provides an analysis of:
  1. DNS performance (latency and failures)
  2. The effect of varying TTL values and the degree of cache sharing
The analysis was performed on three traces that included DNS packets along with TCP SYN, FIN, and RST packet information. The authors found that over a third of all DNS lookups were not answered successfully: 23% of the client lookups in the MIT trace failed to arrive at an answer, while 13% of lookups resulted in an error answer.

The authors show that the name popularity distribution is Zipf-like. The effectiveness of caching is determined by measuring how useful it is to share DNS caches and what impact the choice of TTL value has. Intuitively, I was under the assumption that reducing the TTL would severely affect DNS performance, but the authors show that reducing the TTLs of address (A) records to a few hundred seconds has little effect on the hit rate. Also, sharing a forwarding DNS cache does not improve performance.
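As a sanity check on my intuition, here is a back-of-the-envelope sketch of the kind of trace-driven cache simulation such an analysis relies on. The trace and TTL values below are made up, not the paper's data.

    def hit_rate(trace, ttl):
        cache = {}                     # name -> expiry time of the cached record
        hits = 0
        for t, name in trace:
            if name in cache and cache[name] > t:
                hits += 1              # answered from the cache
            else:
                cache[name] = t + ttl  # miss: fetch from the name server and cache it
        return hits / len(trace)

    # a popular name queried often stays cached even with a short TTL
    trace = [(0, "a.com"), (10, "a.com"), (400, "a.com"), (401, "b.com"), (402, "a.com")]
    for ttl in (60, 300, 3600):
        print(ttl, hit_rate(trace, ttl))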

I found the paper very interesting to read. The finding that DNS performance does not depend on aggressive caching runs counter to the general notion that DNS performance is tightly tied to caching. This observation is directly related to the relationship between TTL values and name popularity: popular names will be cached effectively even with short TTLs, while unpopular names may not benefit even from long TTL values.

It would be nice to discuss how the results would vary if the analysis were repeated on current traces.

Development of the Domain Name System

This paper discusses the motivation behind the design of DNS and the need for it. Prior to DNS, each machine downloaded a file called HOSTS.TXT from a central server. This file contained the mapping between host names and addresses. This scheme had inherent problems with distribution and updates. With the rise in the number of machines connected to the Internet came the need to build a distributed system providing the functionality of HOSTS.TXT.

DNS was designed for this need. It is a hierarchical naming system that had the following design goals:
  1. Provide all the information in HOSTS.TXT
  2. Allow a distributed implementation of the database
  3. Have no size limits
  4. Be interoperable
  5. Provide tolerable performance
DNS contains two main components: name servers and resolvers. Name servers store information and answer queries from the information they possess. Resolvers are the interface to client programs and contain the algorithms for querying name servers. The DNS name space is organized as a variable-depth tree. Each node in the tree has a label, and the domain name of a node is the concatenation of all the labels on the path from the node to the root of the tree.
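A tiny sketch of the name tree helped me picture this; the class and the example labels below are purely illustrative.

    class Node:
        def __init__(self, label, parent=None):
            self.label, self.parent = label, parent

        def domain_name(self):
            # concatenate the labels on the path from this node up to the root
            parts, node = [], self
            while node is not None and node.label:   # the root has the empty label
                parts.append(node.label)
                node = node.parent
            return ".".join(parts) + "."             # trailing dot stands for the root

    root = Node("")
    edu = Node("edu", root)
    berkeley = Node("berkeley", edu)
    www = Node("www", berkeley)
    print(www.domain_name())                         # -> www.berkeley.edu.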

One of the main things that makes DNS fast is caching the results of queries. This removes the need to issue a query every time a name lookup has to be performed. However, this optimization comes at the cost of a security concern: people have exploited this feature to perform DNS cache poisoning attacks.

I really enjoyed reading the paper because it gave a good explanation of the ideas behind the design of a system that plays an important role in today's Internet. I would recommend keeping this paper in the syllabus.