Web100: Concept Paper
Draft 0.4
9/29/1999
I. Introduction
When a typical academic researcher attempts to transfer large datasets using conventional vendor-supplied FTP-based programs on high-performance LAN or WAN networks that are purported to provide 100-Mbps (megabits/second) performance, the researcher is often lucky to see 10-Mbps actual transfer rates. This result is not unique to FTP; similarly poor results are seen with almost all other out-of-the-box (and much custom) networking software.
Such network performance problems are mostly caused by poorly implemented and poorly designed commercial host-software in Layers 3 through 7 of the OSI Reference Model. The solutions to many of the technical problems are well understood, and prototype implementations exist in many cases. The chief difficulty has been the failure of the market to address performance issues in a significant way. Part of the problem has been a lack of agreement among network researchers regarding comprehensive solutions, and part of the problem has been the traditional reluctance of operating system and other commercial software vendors to make network performance improvements in a timely fashion. Furthermore, different vendors install different sets of improvements, so there is often little uniformity in network performance capabilities across vendor software.
This concept paper outlines a project to solve the bulk of these problems in a comprehensive fashion. The proposed approach is to produce a complete host-software environment that will run common Web applications at 100% of the available bandwidth, regardless of the magnitude of a network’s capability. In particular, applications would be able to automatically consume 100% of the available bandwidth in very high-performance networks, something that has been problematic to date in today’s host operating environments. This host-software environment will include all components necessary to achieve such dramatic performance improvements. These components would be included in a turnkey software package that includes an operating system, a Web server, a Web browser, and common Web plug-in modules, and would be packaged in a “shrink-wrapped” fashion for free distribution, with provision for some level of maintenance support. The proposed effort is tentatively referred to as the Web-100 Project, reflecting the goal of demonstrating individual Web transactions that are 100 times faster than what is typically seen from current Web behavior. The Web-100 concept serves as a focal point for the project, provides a benchmark by which success can be tested, and offers a common vision for developers and high-level policy makers alike.
Three steps are needed to rectify the vendor host-software problem:
- Agree upon what comprises a comprehensive host-software solution to the high-performance networking problem.
- Produce working solutions and distribute them in the form of a free turnkey operating system package.
- Work with vendors that wish to incorporate these solutions in their software.
Working with vendors is the most difficult part, though it is believed that producing a working solution may help induce important commercial vendors to adopt the agreed-upon solutions.
Ideally, even without a working solution, simply offering to work with vendors to develop such software would be a sufficient inducement because enlightened corporations would realize that high-performance-enabled host-software would enable supercharged Web applications that would ultimately create an expanded market for everyone’s products. Supercharging such Web applications as http, ftp, RealAudio, RealVideo, and the like would result in a qualitative improvement of business-support and commercial network-based ventures. Such improvements in turn would require major upgrades of server and user computers because of increased need for CPU power and disk storage by supercharged Web applications.
Initial deployment of a supercharged Web in the LAN infrastructure of academic and corporate institutions could in turn whet the appetite of American employees for similar home services and thereby create a demand and a potential revenue stream that could induce national and/or regional communication companies to finally invest in fiber-based high-performance networking delivered to every American home, having reasonable confidence that their tremendous investments could be recovered because of bona fide service demands.
The current demand for incremental home bandwidth improvements didn’t happen just because people thought it would be a good idea. The current demand exists because people want better access to a nascent national utility, the Web. Just as pumps and lights were the reasons for making the incredible investment to connect our homes to the national electrical grid, the Web-100 could be a future reason for making (and recouping) the equally incredible investment necessary to bring megabit bandwidth to our homes. In the case of a new utility, there is no chicken-and-egg question: the technology must exist for the utility to be technically feasible, but the application that uses the utility must also exist to justify the investment to construct it. Today, the technology exists to bring megabit bandwidth to our homes, but the current Web is not a sufficient application to generate the demand or produce the revenues necessary to induce investment in home megabit bandwidth.
The Web-100 is envisioned to be an application that could help break this impasse. As envisioned, the distribution and deployment of the Web-100 would first occur in academic and research institutions, both in the LAN and WAN. This deployment would be followed by utilization within business-office environments. As the potential of the Web-100 was utilized and became understood by individual researchers and workers within academic and corporate institutions, a powerful desire could then grow to obtain bandwidth to the home sufficient to support similar Web-100 capability. A new explosion of server, client, storage, and other Web infrastructure improvements would then ensue to serve this vast new market of consumer end-points.
Of course, in the real world, such flights of fancy are not always sufficient to induce commercial software vendors to do what is in their own best self-interest, as evidenced by the current state of affairs in host-software. Therefore, some other inducement is necessary. This paper proposes to emulate the successful approach DARPA used in developing and propagating the operating system network socket functions that were fundamental to the development of all host-based TCP/IP data communications networking.
DARPA funded this fundamental development work in an operating system whose source code was more or less in the public domain, namely BSD UNIX. Initially thereafter, the ONLY turnkey way to obtain TCP/IP networking was to use BSD UNIX or a derivative. Because TCP/IP became fundamental to ubiquitous networking and because ubiquitous networking became fundamental to all of computing, all other (surviving) operating system vendors discovered that insertion of socket-like code in their operating systems was necessary to stay in business. This desire to survive (not necessarily the desire to thrive) is what led to the universal deployment of socket code, ultimately allowing the propagation of those universally consistent Layer 3/Layer 4 mechanisms, IP & TCP, that serve as the foundation for today’s world-wide data communications.
However, the goals of the Web-100 project are more ambitious than just adding a bit of code to an operating system. The penultimate goal is to produce a complete shrink-wrapped Web client, Web server, middleware, and OS suite that, when installed, will produce Web responses that can automatically consume 100% of a network’s available bandwidth, both on LANs and on WANs such as the vBNS and Abilene. This production-quality high-performance-enabled operating system would then be freely distributed. The ultimate goal is that the widespread deployment of this distribution would serve as an impetus for commercial OS vendors to quickly adopt its improvements. Finally, the resulting codes could serve as a continuing basis for future research.
II. Implementation
This section discusses choice of development components, details of the development tasks, and the award of development tasks to qualified researchers.
II.A. Component Choices
One of the first decisions is to choose the fundamental components of the development system. These include:
- An operating system
- A hardware platform
- A Web server
- A Web client
The primary criteria for such component selection are that:
- Source code is freely available.
- The enhanced source code can reasonably be expected to be freely redistributable.
- The source code is already in popular use and support mechanisms already exist.
- The source code is suitable for performing the desired functions.
Operating System and Hardware Platform
Without debating the merits of a very few alternatives, the Linux operating system seems to come closest to playing the role that BSD UNIX played 20 years ago. Linux appears to meet the four selection criteria quite well. Furthermore, the scientific community is rapidly converging on Linux as the low- and medium-end platform of choice.
The Intel-compatible platform is also the obvious choice for development since Linux is principally an Intel-oriented operating system; in addition, Intel platforms are inexpensive and ubiquitous. Furthermore, Linux combined with Intel-compatible platforms is a popular server combination and is also popular among the scientific research community. Therefore, the Linux/Intel combination offers a large community that could immediately benefit from the proposed improvements.
Web Server
The Apache Web Server meets the four criteria quite well. Development and support activities occur through the Apache Group. Apache is a free software project and runs on nearly 60% of the Web servers on the Internet today.
Web Browser
The Web browser market is effectively divided between AOL Communicator (formerly Netscape Navigator) and Microsoft Internet Explorer, with Communicator enjoying a substantial lead among Linux users. Source code is unavailable for the current versions of both of these browsers. However, the Mozilla project (mozilla.org), a funded project of Netscape/AOL, was formed to develop and maintain the next-generation Netscape/AOL browser, which will be an open-source version of Netscape Communicator. Mozilla is expected to make its browser available as open source software before the end of 1999.
II.B. Development Tasks
This section lists the development tasks required to produce a complete high-performance-network-enabled system.
TCP-Stack Improvements
A great deal of fine research on network performance tuning and TCP protocol-stack improvements has been under way at the Pittsburgh Supercomputing Center’s Networking Research Group, the University of Washington’s Department of Computer Science & Engineering, and several other groups. This research needs to be intensified and applied to the TCP protocol stack of the chosen development system. The individual research groups might also be more effective if their various efforts could be coordinated in a cohesive fashion; for instance, no standing TCP-stack improvement forum exists to provide a focal point for the exchange of ideas. Finally, it should be noted that the TCP protocol stack improvement task would be the most complex and most difficult task of all of those listed.
Needed TCP-stack improvements are listed below.
Include Well-Known Mechanisms
Standard mechanisms like per-destination MTU-discovery (RFC 1191) and “Large Windows” extensions to TCP (RFC 1323) would certainly be included in the development system.
Include Advanced Mechanisms
In addition to such standard mechanisms as listed above, more advanced improvements are needed. For instance, TCP Selective Acknowledgment (SACK), defined by RFC 2018, should also be included in the development system.
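By way of illustration, the following minimal sketch (written in Python and assuming a Linux development host that exposes the usual /proc/sys entries) shows how a developer or tester might verify that these well-known and advanced mechanisms are turned on; it is offered as an aid to discussion, not as part of the proposed deliverables:

    # Check whether the TCP extensions discussed above are enabled on a Linux
    # host by inspecting /proc/sys.  Entry names assume a reasonably recent
    # Linux kernel and may differ on other systems.
    PROC_ENTRIES = {
        "window scaling (RFC 1323)":            "/proc/sys/net/ipv4/tcp_window_scaling",
        "timestamps (RFC 1323)":                "/proc/sys/net/ipv4/tcp_timestamps",
        "selective acknowledgments (RFC 2018)": "/proc/sys/net/ipv4/tcp_sack",
    }

    def check(name, path):
        try:
            with open(path) as f:
                enabled = f.read().strip() == "1"
        except IOError:
            print("%-40s entry not present on this kernel" % name)
            return
        print("%-40s %s" % (name, "enabled" if enabled else "DISABLED"))

    for name, path in PROC_ENTRIES.items():
        check(name, path)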
Furthermore, work needs to be done not just to improve high-performance networking, but to improve short-duration network-flows as well, particularly when congestion is relatively high, as such short-duration high-loss transfers are typical of most current Web transfers. Current end-to-end congestion avoidance and congestion control mechanisms can greatly impede performance in such circumstances.
Include TCP API Performance Knobs
Despite the best efforts to improve TCP protocol stacks, it’s unlikely that such protocol code can automatically adapt itself to all possible situations in an optimal manner because of the diversity of TCP applications. For instance, optimal short-duration transfers on lossy links are likely to require different protocol mechanisms than optimal long-duration transfers on uncongested high-bandwidth-delay links. It is likely that TCP implementations should accept a few simple parameters from the TCP API that give hints as to the requirements of the data transmission being undertaken.
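As a concrete (if limited) illustration, the closest thing to such a knob in today’s socket API is the pair of send/receive buffer-size options, which bound the TCP window. The Python sketch below shows how an application might pass a per-connection sizing hint; the function name, and the notion of richer hints, are hypothetical:

    import socket

    def open_tuned_connection(host, port, window_bytes):
        # window_bytes would ideally be the path's bandwidth-delay product;
        # richer hints (e.g. "short lossy transfer" vs. "long bulk transfer")
        # are the kind of parameters the TCP API could eventually accept.
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
        s.connect((host, port))
        return s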
Provision of Automatic Bandwidth-Delay-Product (BDP) Discovery
Background
Probably the most important unsolved technical problem is automatically determining the bandwidth-delay-product (BDP), which in its simplified form is used to set the maximum TCP window size for each TCP session. The correct BDP is extremely important for maximum utilization of high-performance networking without undue memory consumption, particularly over very long-distance high-performance networks.
The simplified BDP is calculated by multiplying the round-trip-time (RTT) by the maximum bandwidth of the least-capable hop among all of the router hops between two hosts using the TCP protocol. The RTT is easily obtained with the “ping” program, which sends an ICMP echo request, a special IP message that is echoed back to the sending host by the receiving host. Simply measuring the time it takes to get the echoed message back provides the RTT.
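A small worked example of this simplified calculation (the path numbers are purely illustrative):

    # Suppose ping reports a 70 ms round-trip time and the slowest hop on the
    # path is an OC-3 (155 Mbps) link.
    rtt_seconds    = 0.070
    bottleneck_bps = 155000000            # bits per second

    bdp_bits  = bottleneck_bps * rtt_seconds
    bdp_bytes = bdp_bits / 8
    print("BDP = %.0f bits, about %.0f KB" % (bdp_bits, bdp_bytes / 1024))
    # -> roughly 10,850,000 bits, about 1324 KB: the TCP window (and the
    #    socket buffers) must be at least this large to keep the path full.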
Right now, however, there is no quick and easy method to automatically obtain the least-bandwidth number between an arbitrary pair of TCP hosts, so all users who expect to obtain high-performance networking results are required to compute this value manually, which essentially implies that such users must have an intimate knowledge of the complete network topology between their own host and any host with which they wish to communicate. Furthermore, every application that is to make use of this information requires special coding and special user-interface parameters. This is a ludicrous situation akin to having to be an expert automobile mechanic to be able to drive an automobile (which was actually the case at the dawn of the automobile age). As long as network-engineer training is required of any user who must deal with BDP-discovery, we too will remain only at the dawn of the high-performance networking era.
Proposed BDP-Discovery Methods
One of the great implementation difficulties is that the chosen BDP-discovery method must automatically work well with highly varied circumstances that include: large file transfers, small file transfers, LANs, MANs, WANs, high-performance links, and low-performance links.
The most promising work to date on optimal TCP window sizing has been performed at the Pittsburgh Supercomputing Center. This method is called autotuning and utilizes TCP congestion feedback to dynamically tune the maximum TCP window size throughout the entire duration of each TCP session. An implementation of this method has been accomplished and a paper has been published regarding the results. More information about autotuning can be found at https://www.psc.edu/networking/auto.html.
Another method that has been proposed is the direct discovery of the BDP. A brief description of this method is given here.
Diagnostic and Performance Monitoring Tools
The state-of-the-art is pathetic regarding diagnostic and performance monitoring tools designed for easy use by the ordinary end-user. Existing tools are basically balky prototypes with pitiful user-interfaces, non-existent documentation, and obscure distribution methods. Essentially, no tools are in wide use except for ping and traceroute, which have been around since the dawn of network computing.
However, the situation isn’t quite as grim as just described. A large number of good prototype tools exist, and aside from filling in a few gaps, the main effort that is required is to develop these prototypes into production quality tools that are portable, installable, and that have good documentation and good (i.e., GUI) user interfaces.
The following is a list of needed improvements.
Kernel Hooks
Currently, operating system kernels generally provide statistics regarding network traffic only in the aggregate. Kernel hooks to monitor individual TCP sessions in real-time need to be added as a foundation for developing a large class of highly needed network diagnostic and performance monitoring tools. Such hooks should maintain dynamic counts of important TCP-session parameters, as well as be able to supply TCP-session packet streams upon demand.
GUI-based TCP-Session Monitoring Tools
Based upon the aforementioned kernel hooks, one or more TCP-monitoring tools need to be developed that are capable of concurrent, dynamic, real-time graphing of sets of user-selected real-time TCP-session statistics. Among these statistics are: data rate, window size, round-trip-time, number of packets unacknowledged, number of retransmitted packets, number of out-of-order packets, number of duplicate packets, etc. A variety of display options should be available such as totals, deltas, running-averages, etc.
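Purely as an illustration of the data such hooks and tools would traffic in, the Python sketch below models one per-session record; the class and field names are hypothetical and do not correspond to any existing kernel interface:

    class TCPSessionStats:
        # One monitored TCP session, as a GUI monitoring tool might see it.
        def __init__(self, local_endpoint, remote_endpoint):
            self.local_endpoint  = local_endpoint    # (address, port)
            self.remote_endpoint = remote_endpoint   # (address, port)
            self.bytes_sent            = 0
            self.bytes_acknowledged    = 0
            self.retransmitted_packets = 0
            self.out_of_order_packets  = 0
            self.duplicate_packets     = 0
            self.window_size_bytes     = 0
            self.round_trip_time_ms    = 0.0

        def unacknowledged_bytes(self):
            return self.bytes_sent - self.bytes_acknowledged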
Enhanced Traceroute
The MTR real-time concurrent traceroute/ping program should be enhanced with good portability, installability, and usability (i.e., GUI user-interface and documentation).
[Continue the list of diagnostic and performance measuring tools to be included in the distribution.]
Data-Transfer APIs
It’s ridiculous that everyone who writes a data-transfer program using sockets has to write the same 300 lines of code to transform sockets into something usable. Furthermore, almost all of this code deals with obscure issues that are very poorly documented and have awful data structures to boot. A small set of simplified APIs built on top of sockets should be developed for such common tasks as file transfer, data streaming, etc. Such APIs should define a uniform and easy-to-use parameter-specification syntax, such as separate “KEY=VALUE” strings for all parameters except for buffer addresses and transfer counts.
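A minimal sketch (in Python, with hypothetical function and keyword names) of what such a simplified file-transfer call might look like:

    import socket

    def _parse(options):
        # "KEY=VALUE" strings -> dictionary, e.g. ["WINDOW=1300000"] -> {"WINDOW": "1300000"}
        return dict(opt.split("=", 1) for opt in options)

    def send_file(path, host, port, *options):
        # All tuning is expressed as simple KEY=VALUE strings; everything else
        # (socket setup, buffering, the send loop) is hidden from the caller.
        opts   = _parse(options)
        window = int(opts.get("WINDOW", "65536"))     # bytes; ideally the path BDP

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window)
        s.connect((host, port))
        with open(path, "rb") as f:
            while True:
                chunk = f.read(65536)
                if not chunk:
                    break
                s.sendall(chunk)
        s.close()

    # Example use:
    #     send_file("dataset.dat", "receiver.example.org", 9000, "WINDOW=1300000")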
FTP Improvements
Simply developing portable, turnkey, high-performance-enabled FTP codes would itself be a major boon for many researchers. Most FTP-based codes today have execrable user-interfaces and are not high-bandwidth-delay-enabled. Until automatic bandwidth-delay-product (BDP) discovery is implemented, TCP-based applications will require enhancements to enable users to input manually calculated BDP values if high-performance networking is to be effectively used. Thus, FTP codes need both user-interface enhancements and functional improvements.
Aside from simply developing improved GUIs for FTP codes, some thought ought to also be given to developing simplified stand-alone FTP-based codes optimized for common tasks. For instance, an rcp-like FTP-based command-line program could be developed, and a simple FTP-based programmatic API could be developed to perform file transfer functions within user-developed codes.
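As a sketch of the rcp-like command-line program suggested above, the following uses Python’s standard ftplib module; the program name and argument syntax are illustrative only:

    # ftcp.py -- copy a local file to an FTP server, rcp-style:
    #     python ftcp.py localfile remotehost:/path/remotefile
    import sys
    from ftplib import FTP

    def main():
        local, remote = sys.argv[1], sys.argv[2]
        host, remote_path = remote.split(":", 1)
        ftp = FTP(host)
        ftp.login()        # anonymous; a real tool would accept credentials
        with open(local, "rb") as f:
            # A high-performance version would also size the data-connection
            # buffers to the path's bandwidth-delay product.
            ftp.storbinary("STOR " + remote_path, f, 65536)
        ftp.quit()

    if __name__ == "__main__":
        main()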
Some work has been done on the BDP aspect of ftp/ftpd in terms of the NCSA WU-FTP and NCFTP from NCAR; however, these codes are far from being portable turnkey products.
Web Server/Client Improvements
The chosen Web server and client codes will need to be analyzed for needed network performance improvements. One possible major improvement would be to make changes that allow TCP sessions to be kept open on a timed basis and reused for multiple file transfers from the same location, as is the case with HTTP 1.1.
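To make the benefit concrete, the short Python sketch below fetches two objects over a single persistent HTTP/1.1 connection, paying the TCP connection-setup cost only once; the host and object names are illustrative:

    from http.client import HTTPConnection

    conn = HTTPConnection("www.example.org")      # one TCP session...
    for path in ("/index.html", "/logo.png"):     # ...reused for several objects
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()                        # drain before reusing the connection
        print(path, resp.status, len(body), "bytes")
    conn.close()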
A set of popular plug-in/helper applications that perform their own data transfers will need to undergo similar analysis.
It should also be noted here that many performance improvements to the Web server/client would automatically accrue from the fundamental improvements to the TCP protocol stack previously discussed.
Web Caching
Web caching software should be enhanced to take advantage of any changes to the Web Browser/Server that would be beneficial to the operation of Web caching. The Squid caching software would be the best development platform for any such enhancements as Squid is a popular caching product whose source code is freely available. Furthermore, active Squid development and support is already currently being funded by the NSF.
Other Tasks
Shrink-Wrap Group
A group will be required to burn CDs and distribute them. The ideal solution for this activity would probably be to partner with one or more of the established Linux distributors.
Compiling/Testing Group
A group will be required to assemble the various system parts, and compile and test them. It’s likely that such a group would work with existing groups such as the various Linux projects, the Apache Group, one or more Linux distributors, and/or mozilla.org.
Measurement Group
A measurement group would be of great utility in assisting the various other development groups, and in developing measurement benchmarks used to measure the various performance improvements. For example, the measurement tools and expertise developed by the MOAT group at SDSC/CAIDA would be immensely valuable to the measurement needs of the Web-100 Project.
User Community
It is critical that a core user community be formed to use the Web-100 products during the early phases of the project and to continue to act as a beta user group during subsequent product development. This core group will be the nexus from which increased product usage grows and will also contribute to a growing core of expertise for the Web-100 products.
This core group should be selected from scientific disciplines and departments that are early adopters of technology and which are already heavily invested in Linux. Such a selection maximizes the immediate benefit of Web-100 improvements to the research community and provides an academic environment that is amenable to the maturation of alternative tools.
Help-Support Function
Funded help-support of the Web-100 products will be essential to their adoption and effective use. There is nothing more frustrating than having problems with a critical tool and having no one to call for help. A lack of such help can quickly sour people on the value of a tool, and the subsequent negative word-of-mouth can lead to its quick demise. It’s proposed that both central and distributed help-support be an integral part of the Web-100 Project. The help-support groups will collect bug reports and assist with installation and operation of the Web-100 products. Some of the help-support people will be associated with the departments that are early adopters, and some will be a part of a central support organization. Having help-support people located in early-adopter institutions is particularly important in the early phases of usage.
Be Prepared to Assist Vendors
It is vital that efforts be made to help commercial vendors adopt the performance enhancements in the Web-100 development system. Such efforts could occur as collaborative efforts with commercial OS vendors as a part of the development phases of the Web-100 Project and/or they could occur by working with vendors after the Web-100 improvements are demonstrated to be successful.
Good Housekeeping Seal of Approval
Finally, some thought might be given to the notion that a standard for a “high-performance-network-enabled system” could be defined and that a body could exist that evaluates systems and designates them as compliant with such standards. A simple example of such standards would be a required set of RFC implementations.
II.C. Phasing the Development Tasks
[This section is under development.]
II.D. Awarding the Development Tasks
It is proposed that a project be defined for each listed development activity and that those projects be awarded to experienced and talented groups of individuals located at a variety of public research institutions in collaboration with existing volunteer groups whenever such groups exist. Examples of such groups include the various Linux projects, the Apache Group, and mozilla.org. In general, it is expected that projects would be conducted in a distributed-collaborative fashion, as it is unlikely that all people working on a given project will be co-located.
It may also be convenient for the NSF to include in the structure of these projects an umbrella group to coordinate overall project goals, provide comprehensive reporting and milestone results, and administer sub-awardee grants. However, this oversight group would just be another project sub-awardee itself, and would not be responsible for making awards, structuring agreements, or assembling awardee teams. [Note, the thoughts expressed in this paragraph are open issues at this point.]
III. Justification of NSF Involvement
Past Government Efforts
Government funding of academic and research institutions has been instrumental to the major developments that have made the Web possible. In the 1970’s, DARPA provided the funding that led to the development of TCP/IP and the socket code in BSD UNIX. In the late 1980’s and early 1990’s, CERN, funded by European governments, developed the WWW protocols, while NSF funded the NSFnet and MOSAIC, which demonstrated that the Internet had commercial potential and provided the springboard for the massive subsequent commercial development of the Internet.
NGI
The NGI has as two of its primary goals the provisioning of networks with 100 times and 1000 times the bandwidth of recent networks. With the advent of the vBNS, Abilene, and other Federal mission-agency networks, the first goal has essentially been met for the U.S. research community in terms of the link and hardware bandwidth infrastructure. However, provisioning bandwidth turns out to be the easy problem in some sense, and merely provisioning exceptional networking bandwidth has not translated into similar end-to-end performance improvements for the end-users and their applications. As an NGI institution, it is entirely appropriate for the NSF to address the significant host-software problems that inhibit effective use of the massive bandwidth available in the various NGI-partner networks, including of course the vBNS.
Past Web Development
The Web protocols were invented at CERN, a European academic research institution, while the first widely used browser, MOSAIC, was developed at NCSA, another academic institution. The Web and the first Web browsers were first widely deployed throughout academic institutions before being adopted by the population at large. The NSF has already been instrumental in Web development since it funded MOSAIC development at NCSA. Thus, it is entirely appropriate for the NSF to facilitate the next necessary step in Web development, namely solving the host-software bottleneck.
The host-software problems impeding high-performance Web access are also the same problems impeding high-performance applications in general, and solving these problems for the Web solves them for many other research-oriented applications.
Finally, the existing Web is a vital tool for researchers and (along with email) is perhaps the most important networking tool used by researchers today. Thus, it is entirely appropriate for NSF funding to be used for the Web-100 Project, and for the Web-100 to be initially deployed in academic circles.
IV. Conclusion
It is important that policy-makers and investors understand the implications and potential of applied research investments. The impact of many of today’s high-performance academic network demonstration projects is not readily apparent to most people, but everyone, including high-level policy-makers and investors, can readily perceive the implications of a Web service that is 100 times faster than the service they’re used to. Furthermore, it is easy to understand how such improved service could benefit the economic status of the nation, and it is easy to understand that the Web-100 is a non-elitist and popular innovation that high-level policy-makers could justify with pride and excitement.
A recent (4/27/99) article in “USA Today” advised that U.S. CEOs risk becoming obsolete if they don’t become involved with the Web. To some degree, the NSF faces similar risks regarding the importance of the Web to the future of advanced networking development. Many of the problems impeding improved Web performance are also the same problems impeding high-performance applications in general, and solving these problems for the Web solves them for many other research-oriented applications. The Web-100 is the application to demonstrate the solution of high-performance networking issues in general by providing a well-understood goal that everyone can focus on and which offers a subjective measure of success that everyone can easily perceive.
The WWW today is a 20th century technology; it is now time to prepare the Web for the demands of the 21st century. Like a string of pearls, evolution of technology occurs at a steady pace with critical leaps of innovation strung between periods of steady development. The invention of the WWW protocols, the development of MOSAIC, and the NSFnet are three such pearls on the string of the Internet. However, since then, very few pearls have come forth. It’s time for another pearl, and perhaps the Web-100 is a tiny one that the NSF could culture.
The National Science Foundation is in a unique position to direct the necessary efforts to effect the next step in making the Web-100 a reality. In fact, the NSF may be the ONLY institution in a position to meld the numerous and diverse pieces necessary to solve the host-software network-performance bottleneck via a comprehensive approach such as that expressed by the Web-100 Project.
This material is based in whole or in part on work supported by the National Science Foundation under Grant No. 0083285. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).