A Model for Cost Allocation and Pricing in the Internet
Presented at MIT Workshop on Internet Economics March 1995
This paper explores issues in the pricing of the Internet, in particular the relationship between the range of service actually offered to users and the cost of providing these services. Based on this analysis, it identifies a new scheme for resource allocation and pricing, expected capacity allocation. This scheme is contrasted with a number of resource allocation schemes under consideration, and the more traditional pricing approaches of fixed or subscription pricing and pricing based on actual usage. The paper presents a simple mechanism that implements expected capacity allocation, and discusses its advantages and limitations. 
As the Internet makes its transition from a partially subsidized service to a commercial service with all costs recovered by direct charging, there is a pressing need for a model of how pricing and cost recovery should be structured.
As every economist understands, pricing can be an effective means not only to recover costs, but to allow the user to select among different options for service in a controlled manner. For example, users might like the option of purchasing either more or less capacity on the network during periods of congestion. However, historically the Internet has not relied on pricing to allow the user to select one or another service. Instead, the Internet has implemented one service class, and used a technical means rather than a pricing means to allocate resources when the network is fully loaded and congestion occurs.
There is debate within the community as to how Internet service should be allocated in the future. One opinion is that we have survived so far: there is rather substantial experience with the current model, and it seems to meet the needs of many users. However, others believe that once commercial Internet service becomes mature, customers will start to have more sophisticated expectations, and will be willing to pay (and will indeed demand the ability to pay) differential rates for different services. Indeed, there is already evidence in the marketplace that there is a real need for service discrimination. The most significant complaint of real users today is that large data transfers take too long, and that there is no way to adjust or correct for this situation. People who would pay more for a better service cannot do so, because the Internet contains no mechanism to enhance their service.
This paper is written under the assumption that there will be a range of services available, and that the user will be able to select among these classes, with pricing set accordingly. Thus, the discussion of pricing is placed in the context of what the offered service actually is. This point may seem obvious, but in fact the service provided by the Internet is rather complex. This paper will offer some speculation on how service discrimination should be implemented, but there is yet no agreement as to which aspects of the service are seen as valuable to the user. Failure to understand what service features are valued by the user can lead to the implementation of potentially complex control mechanisms that do not meet real user needs. This is a serious error, because while pricing can be changed quickly, the deployment of mechanisms inside the network (involving changes to the packet switches, or routers) can take years to accomplish.
Pricing also serves to recover costs, and this paper looks at pricing structures that attempt to follow the intrinsic cost structures of the providers. In practice, pricing need not follow cost. However, it is reasonable to look at cost structure in a first attempt to understand pricing, even if providers then deviate from that basis for business and marketing reasons.
So this paper addresses the issue of pricing in a somewhat indirect way. First, it discusses the behavior of the Internet as seen by a user, and offers a hypothesis as to what aspects of the service the users of the Internet actually value. Second, in this context, it considers (and rejects) several possible resource allocation schemes (guaranteed minimum capacity, fair allocation, and priority scheduling), and several pricing schemes (access subscription and simple usage based), and proposes a new pricing and resource allocation scheme, expected capacity pricing. The expected capacity scheme is used to address both the issues of cost and price to the individual end user, as well as inter-provider connection, the pricing situation usually called settlement. A rational discussion of pricing must consider both perspectives, for several reasons including the central one that the Internet currently has no on-line way to distinguish the two sorts of attachments, which makes it difficult to bill them differently.
2. The service of the Internet
The Internet today uses a service model called "best effort." In this service, the network allocates bandwidth among all the instantaneous users as best it can, and attempts to serve all of them without making any explicit commitment as to rate or any other service quality. Indeed, some of the traffic may be discarded, although this is an undesirable consequence. When congestion occurs, the users are expected to detect this event and slow down, so that they achieve a collective sending rate equal to the capacity of the congested point.
In the current Internet, rate adjustment is implemented in the transport protocol, which is called TCP. The general approach is specified as follows. A congestion episode causes a queue of packets to build up. When the queue overflows and one or more packets are lost, this event is taken by the sending TCPs as an indication of congestion, and the senders slow down. Each TCP then gradually increases its sending rate until it again receives an indication of congestion. This cycle of increase and decrease, which serves to discover and utilize whatever bandwidth is available, continues so long as there is data to send. TCP, as currently specified and implemented, uses a set of algorithms named "slow start" and "fast retransmit" [Jacobson], which together realize the rate adaptation aspect of the Internet. These algorithms, which have been developed over the last several years, represent a rather sophisticated attempt to implement adaptive rate sources. They seem to work fairly well in practice. 
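The cycle of increase and decrease described above can be illustrated with a toy simulation. This is a minimal sketch and not TCP itself: the additive increase step, the halving of the rate on loss, the synchronized reaction of all senders, and every name and constant below are assumptions made for illustration, not details taken from [Jacobson].

```python
def simulate_rate_adaptation(capacity, n_senders, rounds,
                             increase=1.0, backoff=0.5):
    """Toy model of senders probing for bandwidth on one shared link.

    Each round, if the total offered rate exceeds `capacity` (standing in
    for queue overflow and packet loss), every sender multiplies its rate
    by `backoff`; otherwise every sender raises its rate by `increase`.
    Returns the total offered rate after each round.
    """
    rates = [1.0] * n_senders
    history = []
    for _ in range(rounds):
        if sum(rates) > capacity:
            rates = [r * backoff for r in rates]   # congestion: slow down
        else:
            rates = [r + increase for r in rates]  # no loss: probe upward
        history.append(sum(rates))
    return history


history = simulate_rate_adaptation(capacity=100.0, n_senders=4, rounds=200)
print("peak offered rate:", max(history))
print("lowest rate after first backoff:", min(history[25:]))
```

Under these assumptions the aggregate rate oscillates just around the link capacity rather than settling at it, which is the "discover and utilize" behavior the text describes.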
It is sometimes assumed that the consequence of congestion is increased delays. People have modeled the marginal cost of sending packets into a congested Internet as the increased delays that those packets encounter. However, this perception is not precisely correct. Because of the rate adaptation, the queue length will increase momentarily, and then drop back as the sources reduce their rates. Thus, the impact on the user of using a congested network is not constant increased delays to individual packets, but a reduction in throughput for data transfers. Further, observing the delays of individual packets does not give an indication of the throughput being achieved, because that depends not on individual packet delays, but on the current sending rate of the TCP in question.
Observation of real delays across the Internet suggests that wide variation in delay is not, in fact, common. The minimum round trip delay across the country, due to speed of light and other factors not related to load, is about 0.1 seconds. Isolated measurements of delay across the Internet usually yield values in this range, whether the measurements are taken in periods of presumed high or low load. MacKie-Mason and Varian [MacKie] have measured variation of delay on a number of Internet links, and observed that in some cases maximum delay does increase during periods of higher load, but that the average does not usually deviate markedly.
2.1 Assessing the impact of congestion
If the delays of individual packets are not much increased by congestion, how then does a user perceive congestion and its impact on performance? For any application, there is a data element of some typical size that characterizes it. For remote login, the element is the single character generated by each keystroke. For a Web browser, the element is a web page, of perhaps 2K bytes average. And for a scientist with large data sets to transfer, the element may be many megabytes. The hypothesis of this paper is that, in each case, the criterion that the user has for evaluating network performance is the total elapsed time to transfer the typical element of the current application, rather than the delay for each packet.
For an application with a limited need for bandwidth and a small transfer element size, such as a remote login application, the impact of congestion is minimal. Isolated packets sent through the system will see an erratic increase in delay, but will otherwise not be harmed.  For a user moving a large data file, the rate adaptation translates into an overall delay for the transfer, which is proportional to the size of the file and the degree of congestion.
There is another dimension to the service quality, beyond the desire for a particular target elapsed time for delivery, which is the degree to which the user is dissatisfied if the target delay is not met. For most services, as the delivery time increases, the user has some corresponding decrease in satisfaction. In some cases, however, the utility of late data drops sharply, so that it is essentially useless if the delivery target is missed. The most common case where this arises is in the delivery of audio and video data streams that are being played back to a person as they are received over the net. If elements of such a data stream do not arrive by the time they must be replayed, they cannot be utilized. Applications with this very sharp loss of utility with excess delay are usually called real time applications, and the applications in which the user is more tolerant of late data are sometimes called elastic [Shenker].
Some simple numerical examples will help to get a sense of the impact of congestion on users of the Internet today. Consider the case of a trans-continental 45 mb/s link, and a user trying to transfer an object of three possible sizes. The typical small data transfer across the Internet today, a piece of mail or a World Wide Web page, is perhaps two thousand bytes average. An image file will be much larger; 500 Kbytes is a reasonable estimate. And to represent users with very large datasets, presume a typical value of 10 Mbytes. Table 1 shows the required transfer time assuming two cases; first that the user gets the entire capacity of the link, and second, that 100 users share the link. The time computed is the transfer time, looking only at the link rate and ignoring all other sources of delay.
Table 1: Transfer times in seconds for files over a 45 Mbps link
The smaller numbers, of course, are not what would be seen in practice. There are a number of effects that impose a lower bound on any transfer. There are perhaps two round trips required to get the transfer started (more if the protocol is poorly designed), and one round trip for the transfer itself, which for a cross country transfer with 0.1 second round trip delay implies a lower bound of 0.3 seconds. Added to these delays are system software startup costs at each end. If we make the (somewhat conservative) estimate that all these delays impose an overall lower bound of 0.5 seconds for a transfer, then the corrected delays for a cross country transfer are as shown in Table 2.
Table 2: Transfer times taking into account .5 seconds of fixed delay
This table shows why the users of the Internet today are tolerant of the existing approach to congestion. For small transfers—mail and text Web pages—congestion swings of 100 to 1 are totally hidden to the user by the round trip delays and system startup latencies. For image transfers, transfer times of 10 seconds are mildly bothersome but not intolerable.  Only for the case of large dataset transfers is the impact clear. Congestion during the transfer of a 10MB dataset stretches the overall time from a few seconds to 3 minutes, which essentially eliminates the interactive nature of the transfer. It is for this reason that parts of the scientific research community have been most vocal in favor of more bandwidth.
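The transfer times discussed above follow directly from the stated assumptions, and can be recomputed with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (2,000, 500,000 and 10,000,000 bytes), an equal split of the link among the 100 sharers, and the 0.5 second fixed delay from the text; the names are hypothetical.

```python
LINK_BPS = 45_000_000   # 45 Mb/s link from the example
FIXED_DELAY_S = 0.5     # lower bound on any transfer assumed in the text

# Object sizes in bytes (decimal units are an assumption of this sketch).
OBJECT_BYTES = {
    "mail or Web page (2 KB)": 2_000,
    "image (500 KB)": 500_000,
    "large dataset (10 MB)": 10_000_000,
}


def transfer_time(size_bytes, sharers=1, fixed_delay=0.0):
    """Seconds to move size_bytes when `sharers` users split the link equally."""
    return fixed_delay + size_bytes * 8 / (LINK_BPS / sharers)


for label, size in OBJECT_BYTES.items():
    alone = transfer_time(size)
    shared = transfer_time(size, sharers=100)
    print(f"{label:26s} link only: {alone:9.4f} s / {shared:9.4f} s   "
          f"with fixed delay: {alone + FIXED_DELAY_S:9.4f} s / "
          f"{shared + FIXED_DELAY_S:9.4f} s")
```

With these inputs the shared image transfer comes out at roughly 9 seconds and the shared 10 MB transfer at roughly 3 minutes, consistent with the figures quoted in the discussion.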
One particular point in the above example deserves elaboration: why pick a sharing factor of 100 to represent the congested case? Why not 1000, or a million? Again, actual numbers help. One can make a rough guess at the sharing factor necessary to yield an inexpensive cross country service, and assert that we need not look at sharing factors higher than that. Let us make some simple assumptions about a user reading mail or browsing simple Web pages, and estimate that the average reading time is 10 seconds. Some pages are discarded after a glance, and some deserve study, but 10 seconds seems reasonable for an order of magnitude estimate. So each user goes through an alternating cycle of reading and fetching. If we assume that the bottleneck is the 45 mb/s link, and the user is willing to tolerate no more than the delay computed above (about 0.5 seconds) for the fetch time, how many total users is the link supporting? A rough answer is 28,000. If we further assume that during a busy period, 10% of the total subscribing users are active, then the user pool would be about 280,000 people. One can do a simple estimate of per-user cost for the Internet backbone shared at this level, and conclude that it is less than six dollars per year. This is roughly consistent with other estimates of actual wide area costs for the existing Internet [MacKie]. It is plausible to claim that the costs associated with this level of sharing are sufficiently low that there is no reason to strive for a higher sharing level.
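One way to arrive at a figure close to 28,000 is to divide the link rate by the average rate of a single browsing user. The sketch below assumes each active user fetches one 2,000 byte page per 10 second cycle and ignores the fetch time itself; both simplifications, and the names used, are assumptions of this illustration rather than the paper's stated derivation.

```python
LINK_BPS = 45_000_000    # bottleneck link rate
PAGE_BYTES = 2_000       # typical small transfer
CYCLE_S = 10.0           # assumed read-then-fetch cycle length
ACTIVE_FRACTION = 0.10   # share of subscribers active in a busy period


def supported_active_users(link_bps, page_bytes, cycle_s):
    """Active users the link can carry if each fetches one page per cycle."""
    per_user_bps = page_bytes * 8 / cycle_s   # average demand per user
    return link_bps / per_user_bps


active = supported_active_users(LINK_BPS, PAGE_BYTES, CYCLE_S)
subscribers = active / ACTIVE_FRACTION

print(f"average demand per active user: {PAGE_BYTES * 8 / CYCLE_S:.0f} b/s")
print(f"active users supported: {active:,.0f}")
print(f"subscriber pool at 10% active: {subscribers:,.0f}")
```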
3. Enhancing the Internet service
While the single level of best effort service offered by the Internet today seems to meet the needs of many users, there are real concerns being expressed. One concern is the impact of mixing users attempting to transfer objects of very different sizes.
If, on a single network, we were to mix together without any controls 28,000 users browsing the Web (as in the example above) and 280 users (1%) transferring 10 Mbyte files, it is not clear that any user would be happy. If the network "just happened" to give all 28,280 users equal shares of the net, the 1% moving bulk data would die of frustration. (In the unrealistic case of static allocation of the bandwidth equally among the pool of users, the 10 MByte transfer would take about 14 hours.) If the 1% "just happened" to preempt all the bandwidth, the 28,000 web browsers would be equally frustrated.
The other concern that seems of real significance to users is whether the service provided by the network is in fact stable and predictable. The whole point of the Internet's approach to traffic aggregation is to play the odds statistically, and both the providers and users have a good operational sense of what current performance is. However, if we build a network in which both the customer's expectations for service, and the provider's decisions about provisioning, are based not on explicit bounds on user behavior, but only on observed behavior, and the two are very different, we run the risk that some change in user behavior can change the loads on the network, and render it very quickly overloaded. There is a lot of fear about this, both inside and outside the Internet community.
Other network services, such as the phone system, base provisioning decisions on observed usage. The phone companies have good models of "intrinsic human phone behavior"; even when phone calls are fixed fee, people do not talk on the phone all day. This sort of observation allows telephone providers some confidence in provisioning based on observed behavior, and offering pricing structures that are not cost related. However, there is a fear that these sorts of assumptions about human behavior will not translate into the computer environment. A computer could easily be programmed to talk all day on the network. Indeed, a similar event has happened to the phone system, when modems became popular. The phone system survived, and still offers fixed price service options to its users, even though some people now make phone calls that last for days. But there is real concern that the Internet, which is subject to applications with much wider intrinsic variation in normal usage, will not remain stable in the future.
This paper takes the position that while the Internet does work today, and satisfies most if not all of its customers, there is value in adding mechanisms to provide more control over allocation of service to different users. The two most important features to add are first, a mechanism to control the worst case user behavior, so that the resulting network service is more predictable, and second, a mechanism to permit controlled allocation of different sorts of service in times of congestion. 
The goal of this paper is to propose some matching resource allocation and pricing scheme that will accomplish these objectives. A number of approaches have been proposed for control of usage and explicit allocation of resources among users in time of overload. As a starting point, it is useful to look at these, and see how well they meet the criterion for service variation proposed above.
a. Guaranteed minimum capacity service
This service would provide an assured worst case rate along any path from source to destination. For example, the Frame Relay service is defined in this way. The subscriber to a Frame Relay network must purchase a Permanent Virtual Circuit (PVC) between each source and destination for which a direct connection is desired. For each PVC, it is possible to specify a Committed Information Rate (CIR), which is the worst case rate for that PVC. Presumably, the provider must provision the network so that there are sufficient resources to support all the CIRs of all the clients. But capacity not being used can be shifted at each instant to other users, so that the best case peak rate can exceed the CIR. This makes the service more attractive.
This idea is appealing at first glance, but in fact is not very effective in practice. One issue, of course, is that the user must specify separately the desired CIR along each separate path to any potential recipient. Thus, this approach does not scale well to networks the size of the Internet. This problem might be mitigated by moving from permanent virtual circuits to temporary or switched virtual circuits that are established as needed. However, a more basic problem with guaranteed minimum capacity is that it does not actually match the needs of the users, based on the usage criterion of this paper.
The issue is that a simple guaranteed minimum capacity presumes that the traffic offered by the user is a steady flow, while in practice the traffic is extremely variable or bursty. Each object transferred represents a separate short term load on the network, which the user wants serviced as quickly as possible, not at a steady rate. To provide continuous capacity at the peak rate desired by the user is not feasible. Either the rate would be so low as to be irrelevant, or the necessary provisioning to provide a reasonable worst case rate would result in a network with vastly increased capacity and, presumably, cost.
Consider again the example network consisting of a single 45 mb/s link, supporting 280,000 subscribers. If that network offered a guaranteed minimum capacity, the guarantee must represent the worst case service, which implies dividing the 45 mb/s among all the subscribers; this leads to a guaranteed capacity of 160 b/s. The users of the hypothetical network, who are benefiting from the cost/benefit of the shared service, might not be much comforted by the guarantee of 160 b/s. This corresponds to delivery of a 2 KByte file in 100 seconds, whereas in practice they can usually receive it essentially instantaneously. In fact, from a marketing perspective, they might well be less satisfied to know how much sharing the provider was exploiting.
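The guarantee in this example is simple division, and can be checked directly. The sketch assumes a 2,000 byte file and an equal split of the link among all subscribers; the variable names are illustrative.

```python
LINK_BPS = 45_000_000    # single 45 Mb/s link
SUBSCRIBERS = 280_000    # subscriber pool from the earlier example
PAGE_BYTES = 2_000       # 2 KByte file

# Worst case: every subscriber claims an equal share at the same time.
guaranteed_bps = LINK_BPS / SUBSCRIBERS
page_time_s = PAGE_BYTES * 8 / guaranteed_bps

print(f"guaranteed rate: {guaranteed_bps:.1f} b/s")
print(f"2 KByte file at the guaranteed rate: {page_time_s:.1f} s")
```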
Since Frame Relay is being widely sold in the market, we are getting real experience with customer reactions. Many customers, to the surprise of the providers, set the CIR to zero. Informal interviews with providers suggest that what customers use in judging the service is the actual rates achieved in practice, not the guaranteed minimum rate.
b. Fair allocation service
Even if the user is willing to accept a service with no assured minimum rate, he may expect a fair allocation among users. If a provider is selling the same service to two users, and giving one a smaller share when they offer equal load, then that user presumably has a complaint. The point of a fair allocation service is to assure the various users that they are being treated in a fair way relative to each other.
At first glance, it would seem, lacking a guarantee of minimum rate, that an assurance of fair access to the capacity would be important in providing a viable commercial service. However, before adding costly mechanisms to the network to ensure fairness, we should first consider whether fairness actually helps in meeting subscriber service requirements. In many cases, it does not.
As discussed above, what the user considers in evaluating the service being provided is the total elapsed time to complete the transfer of an object of some typical size, which may be very small or very large. In this context, what does fairness have to do with minimizing total elapsed transfer time? It is obvious that uncontrolled unfairness, in which one user might not get any access to the facility, is not acceptable. But what about "controlled" unfairness? Consider a number of users that happen to arrive at once with equal size objects to transfer, and two possible service techniques. In one, we send one packet of each user in turn; in the other we serve one user to completion, then the next, and so on until all the users are served. In each case, the same number of bytes are sent, so the last byte will be sent at the same instant in both schemes. But in the first case, no user is satisfied until the last round of packets is sent, while in the second case, all the users except the unlucky last one are completed much sooner.
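The comparison between the two service techniques can be made concrete by computing when each user's transfer finishes. This is a minimal sketch under simplifying assumptions: all users arrive together, all objects are the same number of packets, and time is counted in packet transmission slots. The function names are hypothetical.

```python
def round_robin_completion(n_users, packets_each):
    """Finish slot per user when one packet of each user is sent in turn."""
    # Every user's last packet goes out in the final round.
    return [(packets_each - 1) * n_users + i + 1 for i in range(n_users)]


def run_to_completion(n_users, packets_each):
    """Finish slot per user when each user is served fully before the next."""
    return [(i + 1) * packets_each for i in range(n_users)]


def mean(values):
    return sum(values) / len(values)


rr = round_robin_completion(n_users=10, packets_each=100)
serial = run_to_completion(n_users=10, packets_each=100)

print("last finish, round robin / serial:", max(rr), max(serial))
print("mean finish, round robin:", mean(rr))
print("mean finish, serial:     ", mean(serial))
```

For ten users with 100 packets each, both schemes finish the last transfer at slot 1000, but the mean completion time falls from 995.5 slots under packet-level round robin to 550 slots when users are served to completion.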
The issue here is whether micro-fairness (at the packet level) leads to best service at the macro level (total elapsed data element transfer time). In the simple (and perhaps atypical) situation described above, total user satisfaction was enhanced by servicing the users to completion in random order, which is certainly not fair at the micro level. In practical situations, micro-fairness has not been shown to be useful in enhancing perceived user service times, and thus has not been proven an effective mechanism to add to the network.
A more fundamental question is whether fairness is actually the desired service. Consider two users, each transferring a file, with one file being 10 times larger than the other. Is giving equal access to the bandwidth most fair? Shifting bandwidth to the user with the larger file may benefit him, but not cause a perceptual degradation to the small user. In this case, what is the proper allocation of resources, either in a theoretical or practical sense? In the practical case of the Internet, one of its "features" may well be that a user transferring a large file can obtain more than his "fair share" of bandwidth during this transfer. The fairness manifested by this system is not that each user is given an instantaneous equal share, but that each user is equally permitted to send a large file as needed. While this "fairness" may be subject to abuse, in the real world it meets the needs of the users.
A final and critical point about fairness is the question of how we should measure usage to apply a fairness criterion. Consider a specific flow of packets, a sequence of packets that represent one transfer from a source. Each flow, along its path in the network, may encounter congestion, which will trigger a rate adjustment at the source of a flow. In concrete terms, each TCP connection would thus represent a flow.
It would be possible to build a packet switch which assured that each flow passing through it received an equal share. Methods to implement this, such as Weighted Fair Queuing [Demers, Clark], are well known. But this sort of switch would achieve local equality only inside one switch. It would not really ensure overall fairness because it does not address how many flows each user has and how they interact. What if one user has one flow, and another ten? What if those ten flows follow an identical path through the net, or go to ten totally disjoint destinations? If they go to different destinations, what does congestion along one path have to do with congestion along another? If one path is uncongested, should a flow along that path penalize the user in sending along a congested flow? And finally, what about multicast flows that radiate out from a source to multiple destinations? Once questions such as these are posed, it becomes clear that until answers are offered, any simple imposition of fairness among competing flows at a single point of congestion has little to do with whether two users are actually receiving balanced service from the network. And, at present, there are not any accepted answers to these questions.
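The first of these questions, one user with one flow and another with ten, can be illustrated numerically. The sketch below does not implement Weighted Fair Queuing; it only shows the outcome of an equal per-flow split on a single saturated link, with hypothetical user labels and an assumed 45 Mb/s capacity.

```python
from collections import defaultdict


def per_user_bandwidth(link_bps, flows):
    """Bandwidth each user receives when every flow gets an equal share.

    `flows` is a list of (user, flow_id) pairs crossing one congested link.
    """
    share = link_bps / len(flows)
    totals = defaultdict(float)
    for user, _flow_id in flows:
        totals[user] += share
    return dict(totals)


# User A opens a single flow; user B opens ten flows over the same link.
flows = [("A", 0)] + [("B", i) for i in range(10)]
shares = per_user_bandwidth(45_000_000, flows)

for user, bps in sorted(shares.items()):
    print(f"user {user}: {bps / 1e6:.2f} Mb/s")
```

Equality among flows here leaves user B with ten times the bandwidth of user A, which is the sense in which fairness at a single switch need not translate into balanced service between users.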
c. Priority scheduling
A final scheme that has been proposed for allocation of bandwidth among users is to create service classes of different priorities to serve users with different needs. Such a scheme is proposed in [Gupta]. The definition of priority is that if packets of different priority arrive at a switch at the same time, the higher priority packets always depart first. This has the effect of shifting delay from the higher priority packets to the lower priority packets under congestion. 
What does this mechanism have to do with service differentiation? Slowing down an individual packet does not much change the observed behavior. But the probable effect of priority queuing is to build up a queue of lower priority packets, which will cause packets in this class to be preferentially dropped due to queue overflow. The rate adaptation of TCP translates these losses into a reduction in sending rate for these flows of packets. Thus, depending on how queues are maintained, a priority scheme can translate into lower achieved throughput for lower priority classes.
This might, in fact, be a useful building block for explicit service discrimination except that priority has no means to balance the demands of the various classes. The highest priority can preempt all the available capacity and drive all lower priorities to no usage. In fact, this can easily happen in practice. A well tuned TCP on a high performance workstation today can send at a rate exceeding a 45 mb/s DS-3 link. Giving such a workstation access to a high priority service class could consume the entire capacity of the current Internet backbone for the duration of its transfer. It is not likely that either the service provider or the user (if he is billed for this usage) would desire this behavior.
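The preemption effect can be seen in a simple fluid model of strict priority. This is a sketch under stated assumptions: demands are treated as steady rates, classes are served strictly in priority order, and the demand figures are invented for illustration.

```python
def strict_priority_allocation(capacity_bps, demands_bps):
    """Allocate capacity to classes listed from highest to lowest priority.

    Each class takes as much as it demands from whatever capacity the
    higher priority classes have left over.
    """
    remaining = capacity_bps
    allocation = []
    for demand in demands_bps:
        granted = min(demand, remaining)
        allocation.append(granted)
        remaining -= granted
    return allocation


# One fast workstation in the top class offers more than the link can carry.
allocation = strict_priority_allocation(
    capacity_bps=45_000_000,
    demands_bps=[50_000_000, 10_000_000, 5_000_000],
)
print(allocation)   # [45000000, 0, 0]
```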
The other drawback to a priority scheduler for allocating resources is that it does not give the user a direct way to express a desired network behavior. There is no obvious way to relate a particular priority with a particular achieved service. Most proposals suggest that the user will adjust the requested priority until the desired service is obtained. Thus, the priority is a form of price bid, not a specification of service. This is a rather indirect way of obtaining a particular service; by the time the correct priority setting has been determined, the object in question may have been completely sent. It is much more direct to let the user directly specify the service he desires, and let the network respond.
3.1 An alternative—expected capacity allocation
What is needed is a mechanism that directly reflects the user's desire to specify total elapsed transfer time, and at the same time takes into account such issues as the vastly different transfer sizes of different applications, 1 byte or 10 million bytes, and the different target transfer times, which may range from tenths of seconds to minutes or hours.
The Internet does not offer any guarantees of service, but the users do have expectations. Experience using the network provides a pragmatic sense of what the response will be to service requests of various sizes at various times of day. This idea of expectation, as opposed to guarantee, is an important distinction. One of the successes of the Internet is its ability to exploit the mixing of traffic from a large number of very bursty sources to make highly efficient use of the long distance trunks. To offer hard guarantees is inconsistent with the statistical nature of the arriving traffic, as the discussion of minimum guaranteed capacity illustrated. However, it is reasonable for the user of the network (and the provider as well) to have expectations as to the service that will obtain.
Thus, this paper develops an approach to allocating service among users that attempts to offer a range of expectations, not a range of guarantees. This approach builds on the observed success of the Internet, while extending it to distinguish among users with different needs. We will call such a scheme an expected capacity scheme.
Expected capacity should not be defined in a single rigid way, such as a minimum steady rate. In general, it can be defined in any way that the provider and consumer agree, but the model that will be used in this paper, based on the discussion above, is to specify a maximum data object size that a user would like to transfer within some specified delay, together with an assumed duty cycle for these transfers. Thus, a useful expected capacity for a web browser might be: "a 2KB object with minimal delay every 10 seconds." For bulk data transfer, a possible profile would be: "a 1 MB object at 1 mb/s every 5 minutes."
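A profile of this kind can be written down as a small data structure. The sketch below is only one possible encoding, not the mechanism the paper goes on to describe: the class and field names are hypothetical, "minimal delay" is represented by leaving the target rate unset, and decimal units are assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpectedCapacityProfile:
    """Object size, duty cycle, and optional target rate for one user."""
    object_bytes: int                        # largest object to transfer
    period_s: float                          # assumed time between transfers
    target_rate_bps: Optional[float] = None  # None stands for "minimal delay"

    def average_rate_bps(self) -> float:
        """Long-term average load implied by the duty cycle."""
        return self.object_bytes * 8 / self.period_s

    def target_transfer_s(self) -> Optional[float]:
        """Time to move one object at the target rate, if one is given."""
        if self.target_rate_bps is None:
            return None
        return self.object_bytes * 8 / self.target_rate_bps


# "a 2KB object with minimal delay every 10 seconds"
web = ExpectedCapacityProfile(object_bytes=2_000, period_s=10.0)
# "a 1 MB object at 1 mb/s every 5 minutes"
bulk = ExpectedCapacityProfile(object_bytes=1_000_000, period_s=300.0,
                               target_rate_bps=1_000_000.0)

print(f"web:  {web.average_rate_bps():.0f} b/s average")
print(f"bulk: {bulk.average_rate_bps():.0f} b/s average, "
      f"{bulk.target_transfer_s():.0f} s per object at the target rate")
```

Under these assumptions the long-term average load of both profiles is far below the burst rate each user wants, which is why a steady minimum rate is a poor description of what is being requested.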
In section 5, we describe a specific mechanism that can be used to implement expected capacity allocation. But first this approach to allocation is related to the problem of pricing for services.
4. Pricing the Internet service
Pricing serves several purposes. It should recover costs. It should be attractive to users. And, as observed above, it can serve to allow users to select among a choice of services, so that users who wish to use more resources can pay accordingly. This section of the paper looks at common pricing schemes from these various perspectives, in order to lay the groundwork for a new proposal.
To date, two ways to charge for Internet access have received the most attention. One is fixed fees or subscription. The other, which is not yet common but is being widely considered, is per-packet fees.
As noted above, the most common mode of payment for Internet service today is a fixed or subscription fee, usually based on the capacity of the access link. The point has been made a number of times [CSTB, Srinagesh, MacKie] that most of the costs for network providers are sunk costs. Unless there is congestion, the marginal cost of forwarding a packet is essentially zero. So, in the absence of congestion, a fixed charging scheme may well match the intrinsic cost structure. In practice, there are a number of motivations for fixed pricing:
- Predictable fees reduce risk for users, and to some extent for providers as well. Both the provider and the subscriber have a known expectation for payments, which permits planning and budgeting. This makes the price structure more attractive to the users.
- Fixed fees encourage usage, which (if marginal costs are zero) increases customer satisfaction and does not hurt the provider.
- Fixed fees avoid the (possibly considerable) administrative costs of tracking, allocating and billing for usage.
However, there are problems with simple subscription pricing. First, as discussed above, there are times when the system will become congested and the network will enter a region where the marginal cost is not zero. So the provider needs some technical and/or pricing mechanism that encourages use when the net is not congested, and pushes back when it is. As discussed above, the TCP rate adjustment algorithms are exactly this sort of technical mechanism, but this paper is presuming that pricing should also play a role here.
Second, there are situations where it is clear that access subscription does not capture important aspects of the cost structure. One obvious example is the difference between a single user and a subsidiary provider that aggregates many users and feeds them into the provider in question. Both might ask for the same peak access capacity. In this case, it seems that the source representing the aggregated traffic will present more of a load on the provider, and thus should be expected to pay more.
To deal with these issues, one proposal has been to regulate usage by the imposition of fees based on the amount of data actually sent. However, there are concerns with pricing based on the total number of packets sent:
- There is a large worry that usage-based pricing will lead to a collapse of the whole revenue model. Usage based pricing will tend to drive away the large users (who may, for example, build their own network out of fixed price trunks), leaving only the smaller users, who will (in a usage based scheme) contribute only small fees. This will require the provider to raise his usage-based fees in an attempt to recover his fixed costs, and this will start a downward spiral of fleeing users and rising prices.
- There is now real evidence in the marketplace that some customers, given a choice of fixed or usage based pricing, will prefer the fixed fee structure. Thus, whatever the providers may prefer to do, competition may force some forms of fixed fee pricing in the marketplace.
The fundamental problem with usage fees is that they impose usage costs on the user whether the network is congested or not. When the network is not congested and the marginal cost of sending is zero, this charging represents a substantial distortion of the situation.
The challenge for a pricing structure, then, is to avoid the problems of usage based fees, while addressing some of the concerns that are not captured in simple access based subscription fees.
4.1 Pricing expected capacity
To deal with the sort of issues raised above, another pricing model is needed, which captures directly the issue that marginal costs are non-zero only during congestion. Tying pricing to the expected capacity of the user achieves this goal. Expected capacity allocation does not restrict the ability of the user to send when the net is underloaded. Expected capacity only defines the behavior that will result when the system is congested. So charging the user for expected capacity is exactly the same as charging the user for the privilege of sending packets when the marginal cost of sending is non-zero.
Expected capacity has a direct relationship to the facility costs of the provider. The provider must provision to carry the expected capacity of his subscribers during normal busy periods, and thus his provisioning costs relate directly to the total of the expected capacity he has sold. Of course, the provider can set prices in any way that he sees fit, but the benefit of using expected capacity as a way of allocating prices is that it relates to actual costs.
Expected capacity is not a measure of actual use, but rather the expectation that the user has of potential usage. This leads to very simple approaches to implementation, since it is not necessary to account for packets and bill users based on the number of packets sent during congestion. Instead, the provider can just provision the network with enough capacity to meet the expectations for which the users have contracted. This has the practical advantage that one need not implement, at each potential point of congestion within the system, accounting tools to track actual usage by each source. Much simpler schemes will suffice, as the example in the next section illustrates.
From the perspective of the user, the resulting price structure can also be very simple. While users with different capacity profiles will see different costs, the costs do not vary with actual usage, and thus meet the needs of many users for fixed or predictable costs. In essence, prices derived from expected capacity are like access subscription prices, except that they relate not to the peak capacity of the access link, but to a more complex expected usage profile for that link. Whatever sort of profile is agreeable to the user and provider can be used. The model here of sending objects of a certain size into the network with some duty cycle is only an example of an expected capacity profile.
5. A specific proposal for an expected capacity service
How might we go about implementing an expected capacity scheme? As an example, here is a specific mechanism. In fact, this proposal is overly simple, and does not address certain issues that may be very important. These are identified later. But this discussion will provide a starting point.
This mechanism has two parts: traffic flagging, which occurs in a traffic meter at access points, and congestion management, which occurs at switches and routers where packet queues may form due to congestion.
At the access point for the user, there will be some contract as to the expected capacity that will be provided. Based on this contract, the traffic meter examines the incoming stream of packets and tags each packet as being either inside (in) or outside (out) of the envelope of the expected capacity. There is no traffic shaping—no queuing or dropping. The only effect is a tag in each packet. To implement a pure expected capacity scheme, this tagging always occurs, whether or not there is congestion anywhere in the network.
At any point of congestion, the packets that are tagged as being out are selected to receive a congestion pushback notification. (In today's routers, this is accomplished by dropping the packet, but an explicit notification might be preferred, as discussed below.) If there is no congestion, of course, there is no discrimination between in and out packets; all are forwarded uniformly.
The router is not expected to take any other action to separate the in and out packets. In particular, there are no separate queues, or any packet reordering. Packets, those both in and out, are forwarded with the same service (perhaps FIFO) unless they are dropped due to congestion.
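As an illustration only, the congestion management behavior described above can be sketched in Python. The sketch assumes a single FIFO queue with a hypothetical fixed limit, and models the pushback notification as a drop. The names Packet, Router, and limit are assumptions introduced here, and the rule for choosing which out packet receives the pushback is one possible choice, not something this proposal specifies.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow: str
    size: int
    in_profile: bool  # True = tagged "in", False = tagged "out"


class Router:
    """Single FIFO queue; a drop stands in for the congestion pushback."""

    def __init__(self, limit):
        self.limit = limit      # hypothetical queue limit, in packets
        self.queue = deque()
        self.dropped = []       # packets that received pushback

    def enqueue(self, pkt):
        if len(self.queue) < self.limit:
            self.queue.append(pkt)  # no congestion: in and out treated alike
            return
        # Congestion: prefer a packet tagged out as the one to push back.
        if not pkt.in_profile:
            self.dropped.append(pkt)
            return
        for i, queued in enumerate(self.queue):
            if not queued.in_profile:
                self.dropped.append(queued)
                del self.queue[i]       # remaining packets keep their order
                self.queue.append(pkt)
                return
        self.dropped.append(pkt)  # only in packets are present

    def dequeue(self):
        return self.queue.popleft() if self.queue else None
```

Note that the queue is never reordered: an out packet is removed when the queue is full, but everything that remains is forwarded in arrival order.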
The consequence of this combination of actions is that a flow of packets that stays within its expected capacity is much less likely to receive a congestion indication than one that tries to go faster. Some flows may be going much faster than others under congestion, however, since they may have negotiated different expected capacities.
As discussed above, it is the nature of adaptive flow control algorithms such as TCP to repeatedly send at an increasing rate until notified to slow down. This is not an error or an unacceptable condition; it is normal. These rate increases serve to discover and use new capacity that may have become available. Thus, the normal behavior for all users will be to exceed their expected capacity from time to time, unless they have so little data to send that they cannot do so. So, under periods of congestion, all senders with significant quantities of data will execute a control algorithm such as the TCP slow-start/fast recovery, which will cycle the sending rate. Each TCP will receive a congestion indication when it exceeds its expected capacity and starts to send packets that are flagged as out.
The specific expected capacity profile that has been proposed is the sending of an object of some size into the net at high speed at some minimum interval. This sort of profile can be implemented using a scheme called token bucket metering. Imagine that the traffic meter has a bucket of in tokens for each user. The tokens are created at a constant rate (the average sending rate of the user), and accumulated, if unused, up to the bucket size, beyond which they are discarded. As each packet arrives from the user, if there are enough tokens in the bucket for all the bytes in the packet, the packet is marked as in, and the tokens are taken from the bucket. If there are insufficient tokens, the packet is marked as out.
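A minimal Python sketch of such a token bucket meter follows. It is an illustration under stated assumptions rather than a definitive implementation: the class name TokenBucketMeter, the byte-based units, the explicit `now` timestamp argument, and the choice to start with a full bucket are all introduced here for the example.

```python
class TokenBucketMeter:
    """Tags packets in/out against a token bucket; never queues or drops."""

    def __init__(self, rate_bytes_per_s, bucket_bytes):
        self.rate = rate_bytes_per_s      # token creation rate
        self.capacity = bucket_bytes      # bucket size
        self.tokens = bucket_bytes        # assumption: bucket starts full
        self.last = 0.0                   # time of the previous packet

    def tag(self, now, size_bytes):
        # Accrue tokens for the elapsed time; tokens beyond the bucket
        # size are discarded.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return "in"
        return "out"                      # insufficient tokens


meter = TokenBucketMeter(rate_bytes_per_s=200, bucket_bytes=2000)
tags = [meter.tag(0.0, 500) for _ in range(5)]
```

With these illustrative parameters, the first four 500-byte packets sent back to back exhaust the bucket and are tagged in, and the fifth is tagged out.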
Thus, the desired service for a web browser could be translated into a token rate (a 2KB object every 10 seconds is 1600 b/s), and a bucket size of 2KB that the user is permitted to inject at a high rate. (The user might like a burst size somewhat larger than his average object size—some small multiple of the 2K average, but this is a detail ignored here.) If the user exhausts the bucket of tokens by sending an object of this size, the bucket will be replenished at the token rate, which will fill the bucket in 10 seconds.
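The arithmetic in this translation can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes that 1 KB means 1000 bytes, which is the convention under which a 2KB object every 10 seconds works out to 1600 b/s.

```python
def profile_to_bucket(object_bytes, interval_s):
    """Map 'one object of object_bytes every interval_s' to bucket parameters."""
    return {
        "token_rate_bps": object_bytes * 8 / interval_s,  # bits per second
        "bucket_bytes": object_bytes,
    }


def refill_time_s(bucket_bytes, token_rate_bps):
    """Seconds needed to refill an empty bucket at the token rate."""
    return bucket_bytes * 8 / token_rate_bps


web = profile_to_bucket(object_bytes=2000, interval_s=10)
bulk = profile_to_bucket(object_bytes=1_000_000, interval_s=300)
```

Under the same assumption, the bulk transfer profile of a 1 MB object every 5 minutes corresponds to a token rate of roughly 26.7 kb/s.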
5.1 Provisioning for expected capacity
The idea of tagging packets as in or out is not a new one. For example, researchers at IBM [Bala] proposed such "in/out" tagging as part of a flow control scheme. Frame Relay has the concept of in/out packets (the DE bit), as does ATM (the cell loss priority bit). However, those schemes were proposed in the context of a reserved flow or virtual circuit from a source to a destination. In this scheme, there is no implication that the expected capacity for any user is reserved along a particular path. What is contemplated is a much looser coupling between expected capacity profiles and internal provisioning. The traffic meter applies the expected capacity profile to the total collection of packets from the user, independent of where in the network they are going. The resulting packets might then go along one path, might spread out along multiple flows, or might be multicast along a number of links at once. It might be the case that only a small part of the user's packets are going along a congested path, but even so, if the user is in total exceeding his expected capacity, one of his packets, tagged as out, might trigger a congestion pushback signal.
For the provider, meeting the customer's expectation is a matter of provisioning. One goal of this scheme, which tags the packets from each user as to whether they are within the expected capacity, is to provide a clear indication to the provider of whether the net has sufficient overall capacity. If the provider notices that there are significant periods where a switch is so congested that it is necessary to push back flows that are not tagged as being out, then there is not sufficient total capacity. In contrast, if the switch is congested, but some of the packets are flagged as out, then the situation is simply that some users are exceeding their expected capacity (which is what a TCP will always attempt to do), and so pushing back on those users is reasonable, and not an indication of insufficient capacity.
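The provisioning test described here can be sketched in Python as a simple classification of the pushbacks observed at one switch over some period. The function name assess_switch and the in_threshold parameter (standing in for what counts as "significant") are assumptions made for this example.

```python
from collections import Counter


def assess_switch(pushback_tags, in_threshold=0):
    """Classify a switch from the tags of packets that received pushback.

    pushback_tags: list of "in" / "out" tags, one per pushed-back packet.
    in_threshold: hypothetical tolerance for pushbacks on in packets.
    """
    counts = Counter(pushback_tags)
    if counts["in"] > in_threshold:
        # In-profile traffic is being pushed back: not enough total capacity.
        return "insufficient capacity"
    if counts["out"] > 0:
        # Only out traffic is affected: users are exceeding their profiles.
        return "users exceeding profiles"
    return "no congestion"
```

A provider would presumably choose the threshold and the observation period to match its own notion of a normal busy period.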
As a digression, note that as access links get faster, it will be critical to distinguish between opportunistic utilization of capacity and real user demand. Today, many users attach to the Internet over relatively slow access links (e.g. a dialup modem or a 128 kb/s ISDN link); thus the maximum best case rate from a user is rather insubstantial. But if that same user were attached to a campus net with high speed internal links and a 45 Mb/s access link, one user sending one flow could, if not slowed down, use the entire capacity of a 45 Mb/s backbone link. Since that user might be satisfied with an actual transfer rate one or two orders of magnitude slower than that, it is critical that the provider not mistake this opportunistic transfer for an indication of insufficient link capacity.
5.2 Further options for expected capacity profiles
What sorts of expected capacity policies could be implemented using this in/out scheme? In essence, the traffic meter can tag packets according to any sort of expected capacity profile at all, subject only to the requirement that there be a reasonable expectation of enough capacity inside the network to carry the resulting flow. For small users, the details of the profile probably do not matter much to the provider; what does matter is the overall capacity required to carry the aggregated traffic, which can be monitored in the aggregate. This leaves the provider great latitude to negotiate profiles (and charges) that meet the needs of the users. For example, a user who wants to send large files, but only occasionally, could purchase a small average rate, but a large token bucket.
The user can send low priority traffic intermixed with normal priority by pre-setting the capacity flag to out for the former. The traffic meter should reset the flag from in to out for traffic that exceeds the profile, but should not promote traffic from out to in. The consequence of setting all the packets of a flow to out would be that under congestion, the sending rate of that flow would essentially drop to zero, while other flows continued at the expected capacity. And a source, by presetting only some of the packets in a flow as in, could limit the busy period rate of that flow to some constrained value.
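The one-way nature of this re-tagging can be sketched in Python. This is a hedged illustration: the class name DemotingMeter and the `preset` argument are hypothetical, and the sketch assumes that traffic pre-set to out does not consume tokens, a detail the text leaves open.

```python
class DemotingMeter:
    """Token bucket meter that honors a tag pre-set by the sender."""

    def __init__(self, rate_bytes_per_s, bucket_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = bucket_bytes
        self.tokens = bucket_bytes  # assumption: bucket starts full
        self.last = 0.0

    def tag(self, now, size_bytes, preset="in"):
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if preset == "out":
            return "out"            # never promoted; assumed to spend no tokens
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return "in"
        return "out"                # demoted: traffic exceeds the profile


m = DemotingMeter(rate_bytes_per_s=200, bucket_bytes=1000)
low_priority = m.tag(0.0, 500, preset="out")
```

Because pre-set out packets leave the bucket untouched in this sketch, low priority traffic does not reduce the capacity available to the same user's normal traffic.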
This mechanism is actually sufficiently general that it can implement allocation schemes other than expected capacity. For example, it can be used to offer a service based on actual usage if desired. If the traffic meter does not always tag traffic, but tags it only on user request, then the resulting service is more a demand service, rather than a long term contract based on expectation. The provider will still need to provision for this pattern of usage, and would thus need to reflect in the price to the user the burden of allowing for these demands. But inside the network the operation is unchanged: with adequate provisioning, packets that have been accepted by the provider as in are forwarded without triggering congestion notifications.
The other goal for enhancing the current Internet service was to provide a better limit on the impact of worst-case user behavior, so that users are not disrupted by other users sending in an unexpected pattern. In the current Internet, the limit on any one user is the peak speed of the access link for that user. As noted above, the difference between the usage of a normal user and the load generated by a constant transmission at the peak rate of the access link may be very considerable. In contrast, with this expected capacity scheme, the "worst" behavior that a user can launch into the network is to utilize fully the expected capacity profile. Usage beyond that point will be tagged as out. It still may be the case that for many users, their actual usage is less demanding than their profile, and since the providers may provision their networks based on observed loads of in packets, there could be excursions beyond average usage that exceed provisioning. But the range between average and worst case is much reduced by this scheme. As a practical matter, since the Internet seems to work fairly well at present without these additional assurances, there is some reason to believe that improving the situation, even if there is still an opportunity for statistical variation, will be sufficient for providers to make reasonable capacity plans. Note that if a user persists in underusing his profile, it is to his advantage to purchase a smaller profile, since this would presumably reduce his fee. Thus there is a natural force that will tend to minimize the difference between the normal and worst-case user behavior.
5.3 Multi-provider Internets
The scheme described so far assumed that the source was connected to a homogeneous network: traffic was injected at the source, and the provider tracked the usage along all the links to ensure that the network was properly provisioned. But this is too simple a picture. The Internet is composed of regions separately operated by different providers, with packets in general crossing several such regions on the way from source to destination. How does the expected capacity scheme work in this case?
In fact, it generalizes in a very straightforward way. Consider a link that connects two providers. Across that link in either direction is a flow of packets, some marked in and some marked out. Each of the providers has presumably made a contract with the other to carry the traffic of the other. The traffic that matters is the traffic marked in, since that traffic represents the traffic that each provider agreed to carry under conditions of congestion. So each provider, in determining what his commitment is in carrying the traffic of the other, needs to examine the in packets.
This relationship could be implemented in a number of ways. Consider packets flowing from one provider, A, to another provider, B. First, A could purchase from B a certain expected capacity. B would then install at the interface with A a traffic meter. If A offers too many packets marked in, B's traffic meter will mark some as out. If this happens often, A's customers will complain to A that even though they are sending within their expected capacity profile, they are receiving congestion notifications, and A should respond by purchasing more capacity from B. (A does not require complaints from its customers to detect that there is insufficient expected capacity on the link; the traffic meter can report this directly, of course.) In this way, the commitment between A and B again resembles a provisioning decision.
Another business relationship would be for B to accept all of A's in packets, without using a meter to limit the rate, but to charge A according to some formula that relates the fee paid by A to the bandwidth consumption by in packets. As noted above, customers may wish to deal with congestion by paying either in delay or in money. This scheme permits both.
Expected capacity is thus proposed both as a means for a user to obtain a defined service from a network, and for two networks to attach to each other. That is, expected capacity can form a rational basis for establishing settlement payments. Each purchases expected capacity from the other; if there is balance there is no fee, but if there is imbalance, one should pay the other accordingly.
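As a purely illustrative sketch of such a settlement, the following Python function nets the expected capacity that each of two providers purchases from the other. The function name, the capacity units, and the single linear price_per_unit are all assumptions; the text does not prescribe any particular pricing formula.

```python
def settlement(a_buys_from_b, b_buys_from_a, price_per_unit):
    """Return (payer, payee, amount), or None when purchases balance.

    a_buys_from_b: expected capacity A purchases from B (arbitrary units).
    b_buys_from_a: expected capacity B purchases from A (same units).
    price_per_unit: hypothetical symmetric price per unit of capacity.
    """
    net = (a_buys_from_b - b_buys_from_a) * price_per_unit
    if net > 0:
        return ("A", "B", net)
    if net < 0:
        return ("B", "A", -net)
    return None  # balanced: no fee in either direction
```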
Especially when a user sends packets that cross many provider networks, the question of how the providers are to compensate each other is greatly clarified by this concept. Instead of tracking actual usage of each flow, the providers just compensate each other based on expected capacity, and then carry the packets without usage accounting. The result is easy to implement, relates to real costs, and scales to cover the case of many interconnected providers.
6. Limits of this scheme
The scheme described above is practical and could be deployed to good use. However, there are two important limits to this scheme. One, the transitive capacity problem, is intrinsic. The other is the need for payment both from sender and from receiver. This latter problem represents a significant issue, because it is very important that both forms of payment be considered, while at the same time it is very difficult to implement a scheme that reflects payment from the receiver.
6.1 Transitive capacity
Imagine that there is a packet path that crosses a number of providers from the source to the destination. In principle, each provider from the source to the destination will have purchased enough expected capacity from the next so that the packets flow unimpeded along the path. But what if this is not true? Perhaps the only path to that destination is through a provider network with insufficient expected capacity, and the sender receives congestion notification while sending within the contracted capacity.
Whose fault is this? Does the source have a complaint that they are not being given the service they paid for? How might this be addressed?
These issues cannot be resolved by demanding that each provider must purchase enough expected capacity from the others. Some providers may simply not have enough real capacity, and will thus represent a necessary point of congestion. And even if all the providers in a path have enough capacity to support the expectation of the sender, the receiver may have a final access link that represents a bottleneck. The users of the Internet must thus understand that even with the additional enhancement of expected capacity service, there is no guarantee that this rate can be achieved to any particular end point. Only if both end points are under the control of cooperating providers can this assurance be expected.
This observation leads to a number of consequences. Most obviously, one or more providers may join together to ensure that users attached to their combined facilities can count on receiving the contracted service. This service assurance will represent an important form of added value in the network, and will allow providers to differentiate themselves from each other. Second, providers may choose to offer and price different sorts of expected capacity service, depending on where the destination of the packet is. A user may be charged more or less depending on which sets of destination addresses can qualify to be tagged as in. These sorts of enhancements are possible, require only changes to the traffic meter at the source, and again give the provider more control over the quality and pricing of the service. The only constraint is the overall limit that along some paths the capacity simply may not be available.
6.2 Sender pays; receiver pays
To this point, the description of expected capacity has been in terms of the sender of the data. The sender purchases capacity from his immediate provider, which purchases it in turn from next attached providers, and so on all the way to the receiver. In return for this arrangement, the sender is permitted to send packets marked as in to the receiver. Capacity purchased and used in this manner is called sender expected capacity.
In practice, we cannot expect all capacity to be purchased in this way. There are many circumstances in which the receiver of data, rather than the sender, will be the natural party to pay for service. In fact, for much of the current Internet, data is transferred because the receiver values it, and thus a "receiver pays" model might seem more suitable. This assumption may be less universal today; if the World Wide Web is more and more used for commercial marketing, it may be that the sender of the information (the commercial Web server) is prepared to subsidize the transfer. But in other cases, where information has been provided on the Internet free as a public service, it seems as if the natural pattern would be a "receiver pays" pattern. In general, both of the conditions will prevail at different times for the same subscriber.
How can this idea of "receiver pays" be expressed in terms of expected capacity? With receiver expected capacity, the receiver would contract with his provider for some capacity to deliver data to the receiver, and that provider would in turn contract with other providers, and so on.
This pair of patterns somewhat resembles the options in telephony, with normal billing to the caller, but collect and 800 billing to the recipient. But technically, the situation is very different. First, of course, the Internet has no concept of a call. Second, the data flows in each direction are conceptually distinct, and can receive different quality of service. Third, which way the majority of the data flows has nothing to do with which end initiated the communication. In a typical Web interaction, the client site initiates the connection, and most of the data flows toward the client. When transferring mail, the sender of the mail initiates the connection. In a teleconference, whoever speaks originates data.
Abstractly, a mix of sender and receiver capacity makes good sense. The money flows from the sender and from the receiver in some proportion, and the various providers in the network are compensated with these payments for providing the necessary expected capacity. The money will naturally flow to the correct degree; if there is a provider in the middle of the network who is not receiving money, he will demand payment, either from the sender or the receiver, before allocating any expected capacity. And the resulting costs will be reflected back across the chain of payments to the subscribers on the edge of the Internet.
As a practical matter, payments in the Internet today resemble this pattern. Today, each subscriber pays for the part of the Internet that is "nearby." The payments flow from the "edges" into the "center," and it is normally at the wide area providers where the payments meet. That is, the wide area providers receive payments from each of the attached regional providers, and agree to send and receive packets without discrimination among any of the paying attached providers.
What is needed is a way to meld this simple payment pattern with the more sophisticated idea of expected capacity. And that brings us to a critical technical issue, which is that it is much harder to implement receiver expected capacity than sender capacity, because of the structure of the Internet.
6.3 The difficulty of implementing receiver expected capacity
The scheme described above for implementing expected capacity was based on the setting of a flag in each packet as it entered the network. This flag marked each packet as being in or out, depending on whether the packet was inside the contracted capacity profile. One could attempt to extend this scheme by putting two flags in each packet, one reflecting the sender's capacity, and one the receiver's. But this approach has several problems.
The first problem is that the tagging is performed by the traffic meter at the sender's access point. That traffic meter obviously can know about the expected capacity of the attached sender, but cannot possibly know what each of the many millions of possible receivers have purchased. So there is no realistic way for the sender to set the receiver's expected capacity flag without the design of some rather complex scheme by which the receiver informs the sender of what its expected capacity is. This problem is fundamental; it derives from an intrinsic asymmetry, which is that packets flow from the sender. This asymmetry makes the scheme described here applicable to sender expected capacity, but not directly applicable to receiver expected capacity.
The second problem, assuming that the sender could correctly set both the sender and receiver in/out flag in the packet, is to determine, at each point of congestion in the network, which of the flags applies. For part of the network, the receiver paid, and thus the receiver flag should apply. For other parts of the net, the sender flag should apply. But which applies in each case?
The third problem with putting a receiver expected capacity flag in the packet is that a multicast packet will go to many receivers, each of which may have purchased a different expected capacity profile.
These issues suggest that receiver expected capacity must be implemented in a way that does not depend on a control flag in the packet. There are two obvious implementation approaches. One involves the addition of mechanisms to the Internet to maintain enough state in routers to map packets to a capacity profile class based on their destination address. This approach would appear to be rather complex. The other approach involves utilizing control packets flowing in the reverse direction from the receiver of the data to the source. For example, TCP acknowledgment packets might be used to implement receiver expected capacity.
Since it is critical to support payment patterns in which both the sender and the receiver pay, these limitations must be resolved to make this approach general. However, there are significant technical issues there, which must be addressed in a way that does not add excessive complexity or overhead. This is the next step in a cost allocation scheme for the Internet.
6.4 Other uses for the tagging mechanism
The tagging scheme that permits congested routers to identify the packets that should receive a congestion indication can implement a broader range of services than just expected capacity. It can in fact implement a number of schemes, subject to the rule that the network supports the resulting traffic by provisioning for the aggregated load, rather than by setting up an explicit reservation for each flow.
An example of an alternative service and pricing scheme that could be implemented using the tagging mechanism is payment for actual use under congestion. Consider a user who anticipates a very small total usage per month. Independent of the short term expected capacity needed by that user, (e.g. web browsing or bulk data transfer), such a user might want a charging scheme that somehow reflected his very low long term usage. One way to accomplish this is for the traffic meter to impose two token buckets on the user at once. One is a short term bucket expressing the duty cycle of the data transfer, and the other a long term bucket reflecting the expected usage over some period like a month.
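One way the two buckets might be combined is sketched below in Python. The rule used here, that a packet is tagged in only when both buckets hold enough tokens and that both are then debited, is an assumption of this sketch, as are the class names and the illustrative rates and sizes.

```python
class Bucket:
    """A token bucket measured in bytes; starts full (an assumption)."""

    def __init__(self, rate_bytes_per_s, size_bytes):
        self.rate = rate_bytes_per_s
        self.size = size_bytes
        self.tokens = size_bytes
        self.last = 0.0

    def refill(self, now):
        self.tokens = min(self.size,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now


class DualBucketMeter:
    """Tags a packet in only if both short and long term buckets allow it."""

    def __init__(self, short_term, long_term):
        self.short_term = short_term
        self.long_term = long_term

    def tag(self, now, size_bytes):
        self.short_term.refill(now)
        self.long_term.refill(now)
        if (self.short_term.tokens >= size_bytes
                and self.long_term.tokens >= size_bytes):
            self.short_term.tokens -= size_bytes
            self.long_term.tokens -= size_bytes
            return "in"
        return "out"


# Illustrative numbers: a fast-refilling short term bucket and a
# slow-refilling long term bucket.
meter = DualBucketMeter(Bucket(200, 2000), Bucket(1, 3000))
first = meter.tag(0.0, 2000)
second = meter.tag(10.0, 2000)
```

In this example the second object is tagged out even though the short term bucket has refilled, because the long term bucket no longer holds enough tokens.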
In the case of a long term bucket, there is more motivation for the user to hoard his tokens and to use them only when actual congestion is detected. That is, for a long term bucket, the user may prefer a contract that relates to "busy period actual usage" as opposed to "busy period expected capacity." The tagging scheme can also accommodate this pattern of use in a number of ways. One way would be to modify the user software so that it first attempts to send with the packets flagged as out, and if the resulting throughput is unacceptable then tries the transfer using the tokens. An alternative would be for the network to directly notify the traffic meter if congestion were detected, so that the meter could apply the tokens automatically.
This example is one reason why it is useful for the traffic meter to tell whether the network is congested. Today, the indication of congestion is a discarded packet. The TCP protocol at the sender detects lost packets using timers to trigger the retransmission of packets and the adjustment of sending rates. But the fact that a packet was discarded somewhere in the network is not visible to the traffic meter. If there were an explicit notification of congestion, such as the ICMP Source Quench (a congestion notification message defined but not currently in use), both the source and the traffic meter could observe this event.
The service provided by the Internet mixes a large number of instantaneous transfers of objects of highly variable size, without any firm controls on what traffic demands each user may make, and with user satisfaction presumptively based on elapsed time for the object transfer. We also conclude that while the mechanisms in the Internet seem to work today, the most significant service enhancement would be some means to limit the worst case behavior of each user, so that the resulting overall service was more stable, and some means to distinguish and separately serve users with very different transfer objectives, so that each could better be satisfied.
There are two general ways to regulate usage of the network during congestion. One is to use technical mechanisms (such as the existing TCP congestion controls) to limit behavior. The other is to use pricing controls to charge the user for variation in behavior. This paper concludes that it is desirable in the future to provide additional explicit mechanisms to allow users to specify different service needs, with the presumption that they will be differentially priced. This paper attempts to define a rational cost allocation and pricing model for the Internet by constructing it in the context of a careful assessment of the service that the Internet actually provides to its users.
Key to the success of the Internet is its high degree of traffic aggregation among a large number of users, each of whom has a very low duty cycle. Because of the high degree of statistical sharing, the Internet makes no commitment about the capacity that any individual user will actually receive.
Based on this model of the underlying service, the paper proposes a service enhancement called expected capacity, which allows a service provider to identify the amount of capacity that any particular subscriber is to receive under congested conditions. Expected capacity is not constrained to a single service model, such as a minimum fixed bandwidth, but can be any usage profile the provider and subscriber agree on. The example in this paper was a profile that let the user send bursts of a given size, and otherwise limited the user to a lower continuous rate.
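The example profile mentioned above (bursts of a given size, otherwise a lower continuous rate) behaves like a token bucket. The following is a minimal sketch under that assumption; the class name `ProfileMeter`, its parameters, and the use of byte counts and explicit timestamps are illustrative choices, not part of the paper's specification.

```python
class ProfileMeter:
    """Token bucket: bursts up to `burst_bytes`, refilled at `rate_bytes_per_s`."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0             # time of the previous packet, in seconds

    def tag(self, now: float, packet_len: int) -> str:
        # Refill for the elapsed time, never exceeding the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= packet_len:
            self.tokens -= packet_len
            return "in"
        # Beyond the profile: the packet is still sent, but flagged "out".
        return "out"


meter = ProfileMeter(rate_bytes_per_s=1000, burst_bytes=3000)
burst = [meter.tag(0.0, 1500) for _ in range(3)]   # third packet exceeds the burst
later = meter.tag(2.0, 1500)                       # two seconds of refill
print(burst, later)
```

Other profiles agreed between provider and subscriber would replace only the logic inside `tag`; nothing else in the network needs to know which profile is in force.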
This scheme differs from the more common approaches to resource reservation in that it does not make any explicit reservations along a path, but instead takes the much simpler step of aggregating all the traffic that is within the expected capacity profile of all the users, and then viewing the successful transport of this aggregated traffic as a provisioning problem. This raises the risk that on occasion, a user will not actually be able to receive exactly the expected capacity, but a failure of this sort, on a probabilistic basis, is the sort of service assurance that the Internet has always given, and that most users find tolerable.
Finally, this paper claims that expected capacity, because it represents how resources are allocated when they are in demand, represents a rational basis for cost allocation. Cost is allocated, not on the basis of actual use, but on the basis of the expectation of use, which is much easier to administer, and again reflects the statistical nature of the sharing that the Internet already provides.
The paper describes a concrete scheme for implementing expected capacity, based on a traffic meter at the source of the packets which flags packets as being in or out. It also notes a key limit of this particular scheme, which is that it easily implements sender expected capacity, but not receiver expected capacity. The paper argues that the latter as well as the former is necessary, and raises as an issue for further study how receiver expected capacity can be implemented.
A principle widely understood and used in computer system design is that a given mechanism may be used at different times to implement a wide range of policies. Thus, to build a general system, one should separate mechanism from policy in the design, and avoid, to the extent possible, building knowledge of policy into the core of the system. In particular, a goal for any mechanism that is implemented inside the network is that it be as general as possible, since it will take a substantial amount of time to get community consensus on the enhancement, as well as time to implement and deploy it.
The mechanism proposed here, which is the discrimination between packets marked as in and out for congestion pushback at times of overload, has the virtue of being simple to implement and capable of supporting a wide range of policies for allocation of capacity among users. It allows providers to design widely differing service models and pricing models without having to build these models into all the packet switches and routers of the network. Since experience suggests that we will see very creative pricing strategies to attract users, limiting the knowledge of these to a single point, the traffic meter for the user, is key to allowing providers to differentiate their services with only local impact. What must be implemented globally, by common agreement, is the format of the in/out tag in packets, and the semantics that out packets receive congestion indications first. Providers use the level of in packets to assess their provisioning needs and otherwise are not concerned with how, for any particular customer, the expected capacity profile is defined. This design thus pushes most of the complexity to the edge of the network, and builds a very simple control inside the switches.
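The simplicity of the control inside the switches can be illustrated with a sketch of a queue that gives out packets the first congestion indication, taken here to be a discard. This is one possible drop policy under stated assumptions, not the scheme's definitive form; the class name `InOutQueue` and the choice to evict the most recently queued out packet are hypothetical.

```python
from collections import deque


class InOutQueue:
    """Hypothetical FIFO that, when full, discards an "out" packet before any "in" packet."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue = deque()   # items are (tag, payload) pairs
        self.dropped = []      # record of discarded packets

    def enqueue(self, tag: str, payload: str) -> None:
        if len(self.queue) < self.capacity:
            self.queue.append((tag, payload))
            return
        if tag == "out":
            # Queue is full and the arrival is out: discard the arrival.
            self.dropped.append((tag, payload))
            return
        # Arrival is in: evict the most recently queued out packet, if any.
        for i in range(len(self.queue) - 1, -1, -1):
            if self.queue[i][0] == "out":
                self.dropped.append(self.queue[i])
                del self.queue[i]
                self.queue.append((tag, payload))
                return
        # Only in packets are queued: fall back to ordinary tail drop.
        self.dropped.append((tag, payload))


q = InOutQueue(capacity=3)
for tag, payload in [("in", "a"), ("out", "b"), ("out", "c"), ("out", "d"), ("in", "e")]:
    q.enqueue(tag, payload)
print(list(q.queue), q.dropped)
```

The switch consults only the tag. How each packet came to be marked in or out remains the business of the traffic meter at the edge.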
In fact, the scheme can implement a wider range of services than just the expected capacity scheme. It can be used to implement essentially any service that the provider and the subscriber agree to, subject to the assumption that the service is implemented by aggregating the traffic from all the users and treating the service assurance as a provisioning problem. We argue that since the Internet works reasonably well today on this basis, an improvement based on the same general approach will be sufficient to meet the next generation of needs.
This paper has benefited from criticism and comment from a number of people. A first version of the material was presented at the M.I.T. Workshop on the Economics of the Internet. Many of the participants there provided valuable feedback on the material. I would like to thank Scott Shenker for his comments on an earlier version of this paper, and particularly to thank Marjory Blumenthal, who managed to read two earlier versions of this paper and teach me a considerable amount of remedial economics.
David Clark (firstname.lastname@example.org) is affiliated with the Laboratory for Computer Science at MIT.
[Jacobson] V. Jacobson, Congestion Avoidance and Control, Proceedings of ACM Sigcomm '88, Stanford, Calif.
[MacKie] J. MacKie-Mason and H. Varian, Pricing the Internet, in Kahin and Keller, eds. Public Access to the Internet, MIT Press, 1995, pp. 269-314.
[Demers] A. Demers, S. Keshav and S. Shenker, Analysis and Simulation of a Fair Queueing Algorithm, in Journal of Internetworking: Research and Experience, 1, pp. 3-26. Also in Proc. ACM SigComm '89.
[Clark] D. Clark, S. Shenker and L. Zhang, Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism, in Proc. ACM SigComm '92, Baltimore.
[Gupta] A. Gupta, D. Stahl and A. Whinston, Managing The Internet as an Economic System.
[CSTB] "Realizing the Information Future: The Internet and Beyond", Computer Science and Telecommunications Board, National Research Council, National Academy Press, 1994.
[Srinagesh] P. Srinagesh, Internet Cost Structure and Interconnection Agreements, in Gerald Brock, ed., "Toward a Competitive Telecommunication Industry: Selected Papers from the 1994 Telecommunications Policy Research Conference," (Hillsdale, N.J.: Lawrence Erlbaum, in press).
[Bala] K. Bala, I. Cidon and K. Sohraby, Congestion Control for High-Speed Packet Switched Networks, in Proc. Infocom '90, San Francisco.
[Shenker] Shenker, S. "Service Models and Pricing Policies for an Integrated Services Network," in Kahin and Keller, eds. Public Access to the Internet, MIT Press, 1995, pp. 315-337.
1. This research was supported by the Advanced Research Projects Agency of the Department of Defense under contract DABT63-94-C-0072, administered by Ft. Huachuca. This material does not reflect the position or policy of the U.S. government, and no official endorsement should be inferred.
2. There has been considerable discussion as to whether it would be worth adding an explicit message to indicate to the source that congestion is occurring, rather than waiting for a packet loss. This point will be addressed briefly later.
3. This statement is not quite true. For a specific reason having to do with the dynamic behavior of the rate adaptation and the tail-drop FIFO queue discipline, isolated packets that are not part of an ongoing sequence are slightly more likely to be discarded during congestion.
4. For a system with poor software, there may be several seconds of startup time, so that even for the case of image transfer, the impact of congestion may be hidden by fixed overheads. Additionally, most personal computers today will take several seconds to render a 500KB image, so the transfer time is masked by the rendering time.
5. A very simple network, composed of a single 45 Mb/s 3000 mile transcontinental link, would cost about $135K per month at the tariffed rate of $45/mile. Dividing this cost by 280,000 users would yield a cost of $0.48 per month, or about $5.79 a year.
6. Other feature enhancements such as security would also be valued. But this discussion focuses on enhancements that have an impact on bandwidth allocation and the quality of the packet delivery service.
7. If there is no congestion, then there is presumably no queue of packets, which means that there is not a set of packets of different priority in the queue to reorder. Thus, priority scheduling normally has an effect only during congestion.