Queue-Management Architecture for Delay Tolerant Networking

Abstract. During the last years, interest in Delay/Disruption Tolerant Networks (DTN) has increased significantly, mainly because DTN covers a vast spectrum of applications, such as deep-space, satellite, sensor and vehicular networks. Even though the Bundle Protocol seems to be the prevalent candidate architecture for delay-tolerant applications, some practical issues hinder its wide deployment. One of the functionalities that require further research and implementation is DTN queue management. Indeed, queue management in DTN networks is a complex issue: loss of connectivity or extended delays occasionally render any pre-scheduled priority for packet forwarding meaningless. Our queue-management approach integrates connectivity status into the buffering and forwarding policy, eliminating the possibility of stored data expiring and promoting applications that show potential to run smoothly. Therefore, our approach does not rely solely on marked priorities but rather on active networking conditions. We present our model analytically and compare it with standard solutions. We then develop an evaluation tool by extending ns-2 modules and, based on selected scenarios primarily from Space Communications, we demonstrate the suitability of our model for use in low-connectivity/high-delay environments.


Introduction
Queue management in traditional networks is used mainly to regulate traffic fluctuations, as well as to assign priorities to specific traffic classes. It often utilizes dropping mechanisms that signal end-users, implicitly or explicitly, of impending congestion events. Nevertheless, queue management in Delay/Disruption Tolerant Networks (DTN) [1] [2], with long delays and disruptions, has to address additional issues. While fair resource allocation and traffic classification are still important, queue management also needs to exploit every available contact opportunity and reschedule traffic prioritization and data storage to handle communication disruptions and delays.
We further extend the preliminary architecture proposed in [3] with mathematical analysis, architectural enhancements and systematic evaluation. Our architecture is now composed of three main components: i) an Admission Control unit, which determines the criteria that DTN nodes use to accept or reject incoming bundles, ii) a Buffer and Storage Management unit, which determines how accepted bundles should be stored, and iii) a Scheduling unit, which determines bundle service priorities. Here, we refer to data handled by DTN nodes as bundles, even though the proposed architecture is not confined to the standardized Bundle Protocol [4]. The characteristics of each type of DTN network may vary, and the objective may differ each time. Here, we focus on space applications: space satisfies both dimensions of DTN, that is, disruptions and long delays. We have primarily two goals: i) to increase the DTN device throughput via efficient link exploitation and ii) to increase application satisfaction.
From an engineer's viewpoint, DTN requires additional supportive functionality, primarily for resource management. In DTN, resource management incorporates storage capacity as well, which in turn is associated with long delays for data forwarding and increased complexity for scheduling. For example, unlike typical IP network packets, bundles do not face the danger of being dropped; however, they do face the danger of expiring. Also, priority-marked packets do not need prioritized service if connectivity disruptions have already undermined their purpose.
A traditional FIFO-Droptail queue policy may have been an initial candidate for such a system. Apart from its simplicity and the fact that it may perform decently in low-traffic networks, this approach is severely flawed: we cannot assign different priorities to different traffic classes and, on top of that, queuing delays punish all users uniformly and cumulatively, even those that may have a chance to survive a potential short disruption. Alternatively, we could integrate a prioritization algorithm in our scheme, such as Priority Queuing (PQ), Fair Queuing (FQ), Class-based Weighted Fair Queuing (CBWFQ) [5] and Low Latency Queuing (LLQ) [6]. A common, undesirable characteristic, however, is that priorities are typically predetermined and/or static, in the sense that they do not incorporate connectivity feedback and hence cannot translate a scheduling policy into a corresponding forwarding implementation. This non-typical requirement renders them only blind tools for DTN management and hence unsuitable, in their present form, for DTN networks. Additional approaches presented in [7] take into account the time left until expiration and the number of times a packet has been forwarded, in order to drop it in cases of congestion. The most notable approach is SHLI (drop shortest life time first), which drops the packet with the shortest remaining lifetime. However, this might have some undesirable side-effects when some packets have been delayed significantly, yet we can still manage to forward them before they expire.
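The side-effect of SHLI mentioned above can be illustrated with a minimal sketch. The `Packet` class, its `expires_at` field and the contact time of 2 s are hypothetical names and values chosen for illustration; they are not taken from [7].

```python
from dataclasses import dataclass


@dataclass
class Packet:
    pid: str
    expires_at: float  # absolute expiry time in seconds (hypothetical field)


def shli_drop(buffer, now):
    """Drop and return the packet with the shortest remaining lifetime."""
    victim = min(buffer, key=lambda p: p.expires_at - now)
    buffer.remove(victim)
    return victim


now = 10.0
buffer = [Packet("a", 120.0), Packet("b", 15.0), Packet("c", 300.0)]
dropped = shli_drop(buffer, now)

# Packet "b" had 5 s of lifetime left. If a contact were expected in 2 s
# (hypothetical value), it could still have been forwarded before expiring.
next_contact_in = 2.0
was_still_deliverable = (dropped.expires_at - now) > next_contact_in
print(dropped.pid, was_still_deliverable)
```

In this toy case the policy discards exactly the packet that a connectivity-aware scheduler would have tried to forward first.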
Relevant work on DTN queue management is limited and usually focuses on specific problems, such as policies or scheduling, providing a narrow approach to queue management and occasionally isolating joint problems. However, some interesting work already exists. In [8], Amir Krifa et al. focus on queue management policies and reach similar conclusions; they show that traditional buffer management policies such as drop-tail or drop-front are sub-optimal for use in DTN networks. Current implementations of DTN, such as ION [9] and DTN2 [10], adopt at this stage simple approaches to queue management, given the lack of corresponding standards. For example, ION deploys the Bundle Protocol approach via a PQ scheme with three queues of outbound bundles, one queue for each of the defined levels of priority ("class of service") supported by BP. Our approach employs an enhanced classification scheme that integrates both network (i.e., connectivity) dynamics and traffic requirements. Therefore, scheduling is not a product of packet marks, and hence of the application alone, but rather a joint decision based on data priority, the application's potential to survive disruptions and the network disruptions per se.
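A three-queue strict-priority scheme of the kind described above can be sketched as follows. This is an illustrative sketch, not ION's actual code; the class `StrictPriorityQueues` and its methods are hypothetical, while the level names bulk, normal and expedited follow the class-of-service values of the Bundle Protocol specification (RFC 5050).

```python
from collections import deque

# Class-of-service levels named in the Bundle Protocol specification (RFC 5050).
BULK, NORMAL, EXPEDITED = 0, 1, 2


class StrictPriorityQueues:
    """One FIFO queue per priority level; higher levels are always served first."""

    def __init__(self):
        self.queues = {level: deque() for level in (BULK, NORMAL, EXPEDITED)}

    def enqueue(self, bundle_id, level):
        self.queues[level].append(bundle_id)

    def dequeue(self):
        """Return the oldest bundle of the highest non-empty class, or None."""
        for level in (EXPEDITED, NORMAL, BULK):
            if self.queues[level]:
                return self.queues[level].popleft()
        return None


pq = StrictPriorityQueues()
pq.enqueue("b1", BULK)
pq.enqueue("n1", NORMAL)
pq.enqueue("e1", EXPEDITED)
pq.enqueue("n2", NORMAL)
order = [pq.dequeue() for _ in range(4)]
print(order)
```

The service order depends only on the static mark carried by each bundle; nothing in the dequeue decision reflects link state or remaining lifetime, which is the limitation the proposed classification scheme targets.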
The rest of the paper is organized as follows. In Section 2, we define our queue management model, including only the details necessary for the stochastic analysis, whereas in Section 3 we present our stochastic analysis and the corresponding results. In Section 4 we apply, validate and compare a part of our model through simulations in ns-2. Finally, in Section 5 we conclude and set the framework for future work.

Delay Tolerant Queue Management Model
In order to define our queue-management model, hereafter referred to as DTQM (Delay-Tolerant Queue Management), we consider the DTN network as a network with low connectivity, high and variable delays and no end-to-end path. We define DTQM in a way that allows us to include all the functionality necessary to satisfy a generic set of requirements. Thus, we divide DTQM into three units: i) Admission Control, ii) Buffer and Storage Management, and iii) Scheduling. We discuss the functionality of all units; however, due to lack of space, we focus on the most novel and sophisticated units, namely the Buffer and Storage Management and Scheduling units.
Admission Control: Admission Control determines how and which data may be accepted by a DTN node and is mainly related to data-custody requests. When a DTN node accepts custody requests, it is obliged to maintain the bundles in its memory until it is able to forward them or until they expire.
Buffer and Storage Management: Contrary to traditional networks, where the routing nodes require buffers to implement a store-and-forward strategy, DTN nodes need additional persistent storage to maintain those packets that cannot be forwarded immediately due to limited connectivity. In IP-based routers, the main focus of researchers is to increase channel utilization and decrease delay through scheduling and dropping. This approach inherently assumes that end nodes respond to losses and therefore recover in a short time. However, when connectivity is scarce, the requirement for short-time recovery is already violated. Furthermore, DTN networks introduce an additional level of complexity as a result of combining both volatile and persistent storage. Clearly, the trivial approach of storing every incoming bundle in persistent storage and moving it to a buffer upon request increases the processing delay of all bundles and fails for applications engaged in low-delay transfers.
In Fig. 1 we depict the Buffer and Storage Management unit graphically. Generally, this model is composed of two units, the Policy unit and the actual Storage unit. The purpose of the Policy unit is to accept all the bundles that enter the node and, depending on the conditions, move them to buffers or storage. Buffer and Storage management is initially differentiated based on whether there is connectivity between the DTN node and the next hop. During periods of connectivity, packets that enter the node may be routed immediately to the output without being stored first. The total sending rate is calculated as the sum of the sending rates of the Connectivity and Non-Connectivity buffers. In more detail, the purpose of each storage unit can be described as follows:
• Connectivity buffer. The Policy unit moves bundles to the Connectivity buffer only when there is connectivity and therefore the corresponding bundles can be forwarded to the next node. After a time period determined by some threshold, when no connectivity exists, bundles that are stored temporarily in the Connectivity buffer move to Persistent storage.
• Persistent storage. The Policy unit moves bundles to Persistent storage in three cases: i) when there is no connectivity, ii) when there is connectivity but no Connectivity buffer space available, and iii) when there is both connectivity and Connectivity buffer space available, but the contact graph, which is known a priori, indicates that time does not suffice to forward bundles to the next hop.
• Non-Connectivity buffer. Bundles are moved from storage to the Non-Connectivity buffer in the following two cases: i) when bundles are of high priority (they are either urgent or a scheduled contact is expected) and there is no connectivity and ii) when there is connectivity but other bundles are selected to be forwarded (opportunistic contact). The algorithm that determines which bundles should be forwarded first at a given communication opportunity is described briefly in the Scheduling section.
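The placement rules listed above can be sketched as two small decision functions. This is one possible reading of the prose, not the authors' implementation; the function names, the boolean parameters and the `Store` enumeration are hypothetical.

```python
from enum import Enum


class Store(Enum):
    CONNECTIVITY_BUFFER = "connectivity buffer"
    PERSISTENT_STORAGE = "persistent storage"


def place_incoming(connected, buffer_has_space, contact_time_suffices):
    """Decide where the Policy unit puts a newly accepted bundle."""
    if not connected:                 # case i: no connectivity
        return Store.PERSISTENT_STORAGE
    if not buffer_has_space:          # case ii: Connectivity buffer is full
        return Store.PERSISTENT_STORAGE
    if not contact_time_suffices:     # case iii: contact graph says too little time
        return Store.PERSISTENT_STORAGE
    return Store.CONNECTIVITY_BUFFER


def stage_in_non_connectivity_buffer(connected, high_priority, selected_for_forwarding):
    """Decide whether a stored bundle is moved to the Non-Connectivity buffer."""
    if not connected:
        # case i: urgent bundles, or bundles awaiting a scheduled contact
        return high_priority
    # case ii: link is up, but other bundles were chosen for forwarding
    return not selected_for_forwarding


print(place_incoming(True, True, True))
print(place_incoming(True, True, False))
print(stage_in_non_connectivity_buffer(False, True, False))
```

Keeping the two decisions separate mirrors the text: placement of incoming bundles depends on link state and buffer space, whereas staging from storage depends on priority and on the scheduler's current selection.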
Our proposed model additionally deals with the problem of increased processing delay. In the event of multiple nodes in a row that are actively connected, transmission is rather straightforward, since packets are transferred from buffer to buffer without involving storage. In the event of short connectivity, no further delay due to storage retrieval is imposed; bundles have already moved to the Non-Connectivity buffer, and hence the bandwidth available during short connectivity can be fully exploited.
Scheduling: The Scheduling unit reassigns the priority of each bundle and determines which bundles should leave the DTN node when a communication opportunity occurs. A priority-oriented model must inevitably be considered; this model should incorporate application requirements, data requirements, Time-to-Live (TtL) for bundles, etc.
Although we will not delve into more details, our scheduling depends heavily on the arrival timestamp and on TtL. In order to enhance application service, we promote both packets that have recently arrived at the node and packets that are near their expiration. This approach significantly decreases waiting delays and promotes application satisfaction. In Fig. 2 we depict the priority function used, where ToD (Type of Data) is a specific identifier that denotes the packet traffic class and TtL denotes the expiration time.
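Since the exact priority function of Fig. 2 is not reproduced in the text, the following is only a hypothetical scoring rule with the two stated properties: it favors recently arrived bundles and bundles close to expiration, scaled by ToD. The `Bundle` fields, the `priority` formula and the assumption that a larger ToD means higher importance are all illustrative choices, not the authors' function.

```python
from dataclasses import dataclass


@dataclass
class Bundle:
    name: str
    tod: int            # Type of Data; larger is assumed to mean more important
    arrived_at: float   # arrival timestamp at the node, in seconds
    expires_at: float   # expiration time derived from TtL, in seconds


def priority(bundle, now):
    """Hypothetical score: high for fresh bundles and for bundles near expiry."""
    remaining = bundle.expires_at - now
    if remaining <= 0:
        return float("-inf")  # expired bundles are never scheduled
    age = now - bundle.arrived_at
    recency = 1.0 / (1.0 + age)
    urgency = 1.0 / (1.0 + remaining)
    return bundle.tod * (recency + urgency)


def schedule(bundles, now):
    """Order bundles for transmission, highest score first."""
    return sorted(bundles, key=lambda b: priority(b, now), reverse=True)


now = 100.0
bundles = [
    Bundle("stale", tod=1, arrived_at=0.0, expires_at=1000.0),
    Bundle("urgent", tod=1, arrived_at=0.0, expires_at=102.0),
    Bundle("fresh", tod=1, arrived_at=99.0, expires_at=1000.0),
]
order = [b.name for b in schedule(bundles, now)]
print(order)
```

With equal ToD, the fresh and the nearly expired bundle both overtake the bundle that is neither, which is the qualitative behavior the paragraph describes.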

Stochastic Analysis
The purpose of this analysis is to highlight the advantages of our proposed model over traditional scheduling approaches. Our model consists of a primary FIFO queue (Connectivity buffer) and a secondary supportive queue (Non-Connectivity buffer), which serves high-priority bundles. We note that, in the context of Space, the queuing delay involved is not expected to contribute significantly to the total application delay, given the high propagation delay and, furthermore, the potentially very high storage delay involved in typical Space applications. Therefore, a priority queuing (PQ) or PQ-derived scheduling model for incoming packets is not conceptually a tempting approach, since it may fail with long-stored packets. Clearly, our approach departs from a FIFO scheme and therefore calls for a straightforward comparison with a typical FIFO scheme. However, for completeness, we also extend our stochastic analysis to PQ, which we consider as a theoretical upper bound (only) when connectivity is always present.
It is apparent, from an engineer's perspective, that our model is designed to handle disconnectivity/disruption issues. As such, it is reasonably expected to perform better in environments with limited connectivity, especially against traditional queuing schemes, which do not include a native mechanism to handle intermittent connectivity. Nevertheless, to achieve fairness towards FIFO and PQ, we use a worst-case scenario for DTQM, where connectivity is always available in the system. The results shall indicate the extent to which our model is able to perform satisfactorily. In such environments, where connectivity is always available and we do not exploit DTQM's full potential, even a performance comparable to PQ and better than FIFO is acceptable.
We initiate our analysis by modeling network traffic. Packet arrival is modeled as a Poisson process (exponentially distributed inter-arrival times) and packet departure as a general-distribution process. This is not, however, a globally valid assumption, since different types of traffic can result in different distributions for packet arrival and departure. Nonetheless, as our knowledge of the possible DTN applications is limited, we assume that the environment and the applications under investigation manifest all the necessary characteristics of an M/G/1 system-based analysis. The potential existence of self-similar characteristics is not considered here. Furthermore, we assume that all packets have the same size, non-preemptive priority is enforced and flows correspond to different traffic classes. One might argue, from an analytical perspective, that precision is rather dubious when we attempt to compare analytically different queuing policies with potentially distinct goals. However, there are several occasions when these mechanisms are indeed equivalent and directly comparable. For example, as throughput decreases, the behavior of these three systems converges.
We begin our queuing analysis by estimating the average system delay for each flow in a PQ scheme. We consider a single-server PQ system fed by three Poisson streams with arrival rates $\lambda_1$, $\lambda_2$ and $\lambda_3$. Each stream can be considered as a flow of data generated by various applications, in our case Real-time (RT), Telemetry (TM) and Telecommand (TC) applications. The buffers corresponding to different flows are infinite and packets in each buffer are served in the order they arrive. Thus, we use three queues, one per flow, and three priority classes. We limit the overall system utilization by setting $\rho_i < 1$ and $\rho_1 + \rho_2 + \rho_3 < 1$. This keeps the system from being overloaded and rules out flow starvation. Table 1 presents the notation used throughout the present mathematical analysis; in particular, $W_i$ denotes the average queuing delay of a class-i packet, $T_i$ the average system delay of a class-i packet and $\overline{X^2}$ the second service moment.
That said, we consider the i-th data packet arrival at the first queue of the PQ system. Since class-1 packets have the highest priority, the i-th packet that has just arrived must wait in queue for a mean residual time R until the end of the current packet transmission, plus the transmission time required for the mean number of packets $N_1$ currently in the first queue, preceding the i-th packet.
We calculate the mean residual time, for M/G/1 systems, by the formula ([11]):

Now, according to Little's law [12], the average number of packets waiting in the system is equal to the average delay multiplied by the average arrival rate of the system. We apply Little's law to the class-1 queue. As the average queuing delay for class-1 packets is $W_1$ and the average queue occupancy is $N_1$ with arrival rate $\lambda_1$, we have

Similarly, by enforcing non-preemptive priority queuing and according to [11], the average queuing delay for class-2 and class-3 packets is, respectively:

Finally, by adding the service time $1/\mu$ of the i-th packet to equations (4), (5) and (6), we can calculate the average system delay for class-k packets:

Next, we estimate the average delay for each flow in a FIFO scheme. In this scheduling scheme, each flow has the same average queuing delay, which depends on the average number of packets in queue, N, and the mean residual time, R. According to [11], the average system delay in a FIFO scheme is:

However, in order to guarantee that the service distribution of the FIFO queue corresponds to an appropriately weighted sum of the service distributions for the different classes in the priority queue scheme, we set:

We continue our analysis by estimating the average delay for each flow in a DTQM scheme. We divide the analysis into two parts. Since DTQM uses two outgoing queues (plus the permanent storage; see Fig. 1), one for connectivity and the other for non-connectivity data, we can safely assume that the first one emulates a FIFO queue while the second one approaches the behavior of a PQ scheme. The latter assumption holds since the prioritization function that we apply (see Fig. 2) requires packet sorting. That said, the first-queue analysis in terms of average system delay is directly comparable with a FIFO scheme, while the second queue calls for a PQ-based analysis. In order to evaluate our proposed scheme fairly, we omit permanent storage from the analysis. This allows the three systems to present similar properties and hence be comparable. We consider an average system arrival rate $\lambda$ equal to that of the other systems and set the arrival rates for the two queues, without loss of generality, to $\lambda_a = 0.4\lambda$ and $\lambda_b = 0.6\lambda$. The coefficients 0.4 and 0.6 were selected for our stochastic analysis based on some initial empirical calculations and will be calibrated further according to emulation results. Finally, the total average system service rate will be $\mu$, where $\mu_a = 0.4\mu$ and $\mu_b = 0.6\mu$ for the first and second queues, respectively.
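For orientation, the standard M/G/1 results from the queueing literature on which the PQ and FIFO steps above rely can be written as follows. This is a sketch of the textbook forms, not a reproduction of the paper's numbered equations; the per-class second moment $\overline{X_i^2}$, the number of classes $n$ and the shorthand $\rho = \lambda/\mu$ are notational assumptions.

```latex
% Mean residual service time seen by an arriving packet (n priority classes)
R = \frac{1}{2}\sum_{i=1}^{n} \lambda_i\,\overline{X_i^2}

% Class 1: residual service plus the N_1 class-1 packets already queued;
% Little's law N_1 = \lambda_1 W_1 closes the recursion
W_1 = R + \frac{N_1}{\mu_1}, \qquad N_1 = \lambda_1 W_1
\;\Longrightarrow\; W_1 = \frac{R}{1-\rho_1}

% Non-preemptive priority, class k, and the corresponding system delay
W_k = \frac{R}{(1-\rho_1-\dots-\rho_{k-1})(1-\rho_1-\dots-\rho_k)},
\qquad T_k = \frac{1}{\mu_k} + W_k

% FIFO (Pollaczek-Khinchine form)
W = \frac{R}{1-\rho} = \frac{\lambda\,\overline{X^2}}{2(1-\rho)},
\qquad T = \frac{1}{\mu} + W

% FIFO service distribution as a weighted sum of the class distributions
\frac{1}{\mu} = \sum_{i=1}^{n}\frac{\lambda_i}{\lambda}\,\frac{1}{\mu_i},
\qquad \overline{X^2} = \sum_{i=1}^{n}\frac{\lambda_i}{\lambda}\,\overline{X_i^2}
```

When all packets have the same size, as assumed in the text, $\mu_k = \mu$ for every class and the service term reduces to $1/\mu$.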
To start off, the average queuing delay for the first queue does not differ per flow and can be calculated from the average number of packets in queue-1, $N_a$, and the mean residual time, R, using equation (2) and replacing $\lambda_1$, $\lambda_2$, $\mu_1$ and $\mu_2$ with $\lambda_a$, $\lambda_b$, $\mu_a$ and $\mu_b$, respectively:

$$W_a = R + \frac{N_a}{\mu_a} \qquad (10)$$
By applying Little's law for queue-1 we get:

From equations (10) and (11) we get:

The average system delay for all the flows in queue-1 is:

We now approximate the behavior of the second queue as follows. The queue can be split into three sub-queues or subclasses. Each class will be served using a PQ-based scheme. Furthermore, the proportion of packets for each class in the total available capacity of the queue-2 buffer is $\lambda_1/\lambda$ for class-1 packets, $\lambda_2/\lambda$ for class-2 packets and $\lambda_3/\lambda$ for class-3 packets. Therefore, the average queuing delay for class-1 packets is:

By applying Little's law [12], we obtain:

From equations (14) and (15) we get:

The average queuing delay for the second subclass depends on the $N_1$ packets buffered in the first subclass, plus the $N_2$ packets buffered in the second subclass, plus the residual time R. In our case, unlike the ordinary properties of PQ, our design assumptions do not allow higher-priority packets to rearrange the queue at any given stage. Therefore, we have:

From equations (17) and (18) we obtain:

By the same token, the average queuing delay for the third subclass is:

Finally, the system's average delay for each subclass is:

Having calculated the average delays for the two queues separately, we combine these values to obtain the total delay. A statistically acceptable method for doing this is by using weights. Considering the way that we have defined the problem, it is logical to expect that each queue will contribute a different percentage to the overall system delay. Hence, we consider that queue-1 and queue-2 contribute 40% and 60%, respectively, to the total average system delay. Considering the values of $\lambda$ and $\mu$ in each queue, we have:

As for the numerical results presented below, the system service rate is constant at 10 packets/sec, whereas the values of $\lambda$ vary in order to cover the possible range of system utilization, 10%-90%. The value of the service rate, considering packet sizes in the order of KB, provides an acceptable rate of data transmission, especially in space environments and among low-energy sensors.
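The queue-1 step and the final weighting can be spelled out as follows. This is a sketch derived from the prose, assuming that queue-1 behaves as an M/G/1 FIFO queue with arrival rate $\lambda_a$ and service rate $\mu_a$; the symbols $N_a$, $W_a$, $T_a$, $T_{b,i}$ and $T_i^{\mathrm{DTQM}}$ are introduced here for illustration and are not necessarily the paper's notation.

```latex
% Queue-1: waiting time = residual service + packets ahead;
% Little's law N_a = \lambda_a W_a closes the recursion
W_a = R + \frac{N_a}{\mu_a}, \qquad N_a = \lambda_a W_a
\;\Longrightarrow\; W_a = \frac{R}{1-\rho_a},
\qquad T_a = \frac{1}{\mu_a} + W_a

% With the chosen 0.4/0.6 split, each queue sees the overall utilization
\rho_a = \frac{\lambda_a}{\mu_a} = \frac{0.4\,\lambda}{0.4\,\mu} = \rho,
\qquad
\rho_b = \frac{\lambda_b}{\mu_b} = \frac{0.6\,\lambda}{0.6\,\mu} = \rho

% 40%/60% weighting of the per-queue delays for a class-i packet
T_i^{\mathrm{DTQM}} = 0.4\,T_a + 0.6\,T_{b,i}
```

Because the same coefficients scale both the arrival and the service rates, sweeping the system utilization from 10% to 90% sweeps the utilization of each queue over the same range.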

Fig 3. Numerical results
The numerical results of our analysis are presented in Fig. 3. We compare the aforementioned queuing schemes based on the average delay for each queue in each system and the average queue occupancy. The results show that our approach achieves a performance clearly better than FIFO and in some cases better than PQ, especially when the system utilization factor is high.
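The FIFO and PQ baselines of such a comparison can be reproduced numerically with a short script. This is a sketch under stated assumptions, not the computation behind Fig. 3: it uses the textbook M/G/1 formulas, assumes exponentially distributed service times (so that the second service moment is $2/\mu^2$) and an equal split of the arrival rate among the three classes. Only the service rate of 10 packets/sec and the 10%-90% utilization range come from the text; the DTQM curves are not reproduced.

```python
MU = 10.0  # system service rate in packets/sec (value stated in the text)


def pq_fifo_delays(rho, mu=MU, shares=(1 / 3, 1 / 3, 1 / 3)):
    """Return (T_fifo, [T_1, T_2, T_3]) for total utilization rho.

    Assumptions: exponential service times and an equal class split by default.
    """
    lam = rho * mu
    second_moment = 2.0 / mu**2          # exponential service (assumption)
    residual = 0.5 * lam * second_moment  # mean residual service time R
    t_fifo = 1.0 / mu + residual / (1.0 - rho)

    t_pq = []
    higher = 0.0  # utilization of all strictly higher-priority classes
    for share in shares:
        rho_k = share * rho
        w_k = residual / ((1.0 - higher) * (1.0 - higher - rho_k))
        t_pq.append(1.0 / mu + w_k)
        higher += rho_k
    return t_fifo, t_pq


for pct in range(10, 100, 10):
    t_fifo, t_pq = pq_fifo_delays(pct / 100.0)
    classes = ", ".join(f"{t:.3f}" for t in t_pq)
    print(f"rho={pct}%  FIFO={t_fifo:.3f}s  PQ classes=[{classes}]s")
```

At low utilization the per-class PQ delays and the FIFO delay nearly coincide, consistent with the earlier remark that the behavior of the systems converges as the load decreases.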

Experimental Evaluation
Evaluating such a queue management policy requires extensive experiments using an actual space network, since DTQM affects both lower-level (battery lifespan) and higher-level (throughput, latency) performance metrics. We implement and evaluate our proposed solution using the ns-2 software simulator. This implementation was not trivial and required significant time, considering that ns-2 does not support DTN. In this work, we focus on the second and third components of DTQM, namely the Buffer and Storage Management and Scheduling parts, leaving the evaluation of the rest of the architecture as future work.
We apply DTQM to the network topology of Fig. 4. We assume three sending nodes, $S_1$, $S_2$ and $S_3$, which are located in space and send traffic generated by various applications (Real-time, Telemetry and Telecommand; all constant bit rate with different sending intervals), and a receiving node R, which is located on Earth. All traffic generated by the sending nodes is routed through the routing node Q, in which we deploy DTQM. Since we assumed that the sending nodes are located in space, the connectivity of the wireless links should be intermittent. In order to emulate a DTN environment where connectivity is not predetermined, thanks to alternative routes and connections, we select a random disconnectivity pattern with uniformly distributed connectivity disruptions spread across the entire duration of the experiment, as the most appropriate for evaluating the proposed architecture. In this context, link $S_1$-Q is unavailable, with uniformly distributed disruptions, for a total of 1% of the experiment time. Similarly, link $S_2$-Q is unavailable 5% of the time, link $S_3$-Q 10% of the time and, finally, link Q-R 0.5% of the time. We set the total time of the experiments to one hour, and measure the total number of received packets for all three sending nodes, as well as the Application Satisfaction Index (ASI) [14] of the network, a metric that highlights the contribution of the queuing delay to the total delay. In order to add reliability to our results and allow randomness to take effect, we repeat the experiment several times. In particular, the performance of DTQM was evaluated in five connectivity scenarios, each one utilizing a different (randomly generated) connectivity schedule. We compare the obtained results with the corresponding results of a FIFO policy. In line with the priority function used (see Fig. 2), we assign ToD for each application as follows: 1 for the RT application, 2 for the TM application and 3 for the TC application. Furthermore, we set the packet intervals for each application as we would expect in real life, that is, Real-Time packets are generated with the shortest interval and Telecommand packets are generated with the longest interval; hence the difference in packet numbers.
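A random disconnectivity pattern of the kind described above could be generated along the following lines. This is a sketch, not the ns-2 setup used in the experiments: the function names, the choice of ten equal-length outages per link and the seeding scheme are hypothetical, while the one-hour duration, the per-link unavailability fractions and the five scenarios come from the text.

```python
import random


def outage_schedule(total_s, unavailable_fraction, n_outages, seed):
    """Return sorted, non-overlapping (start, end) outages of equal length.

    Start times are spread uniformly by drawing the free time before each
    outage, so intervals never overlap and always fit in [0, total_s].
    """
    rng = random.Random(seed)
    outage_len = total_s * unavailable_fraction / n_outages
    free_time = total_s - n_outages * outage_len
    offsets = sorted(rng.uniform(0.0, free_time) for _ in range(n_outages))
    return [(off + i * outage_len, off + (i + 1) * outage_len)
            for i, off in enumerate(offsets)]


def downtime(schedule):
    """Total unavailable time of a schedule, in seconds."""
    return sum(end - start for start, end in schedule)


DURATION_S = 3600.0                      # one-hour experiment (from the text)
UNAVAILABLE = {"S1-Q": 0.01, "S2-Q": 0.05, "S3-Q": 0.10, "Q-R": 0.005}
N_OUTAGES = 10                           # hypothetical number of disruptions per link

# Five connectivity scenarios, each with its own random schedule per link.
scenarios = [
    {link: outage_schedule(DURATION_S, frac, N_OUTAGES, seed=f"{run}:{link}")
     for link, frac in UNAVAILABLE.items()}
    for run in range(5)
]

for link, schedule in scenarios[0].items():
    print(f"{link}: {downtime(schedule):.1f} s down in {len(schedule)} outages")
```

Fixing the seed per scenario and link keeps each run reproducible while still giving every scenario a different schedule.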
In Fig. 5 below, we present the results from the comparison of DTQM against Droptail, using ASI and average delay as our performance metrics. The first observation is that DTQM outperforms Droptail in every case. In particular, we observe a 60% delay decrease on average, and in some cases the reduction reaches up to 90% of the corresponding bundle delay under a typical FIFO scheme. This delay decrease is also reflected in the system ASI, which increases by 20% on average. By viewing Table 2, we notice that the received packets are almost the same in every case (with the exception of the TM packets of the worst-case experiment), regardless of queuing policy. Nevertheless, we may experience up to a 40% increase in received Telemetry and Telecommand packets when DTQM is deployed, since we assign them a higher ToD. Moreover, the most interesting observation is the clear improvement of the average delay regardless of the application. However, since we transfer the same number of packets in both cases, how can we justify the delay decrease? DTQM uses a sophisticated scheduling algorithm that assigns higher priority to the packets that most recently arrived at the node and promotes them in the queue (see Fig. 2). Thus, packets are reordered in the buffer based on the time they entered the routing node. Classic scheduling uses a rigid FIFO approach which, although it seems to promote fairness, in fact increases the communication time, with the risk of dropping a packet due to TtL expiration.

Conclusions
In this paper we proposed a novel architecture for queue management in DTN nodes. Although the available space prevents us from presenting a more detailed version and evaluation of our model, we sketched several of its characteristics.
One of the most interesting results was initially introduced by the stochastic analysis, which yielded positive results for the operation of our model: it exhibits a behavior far superior to FIFO and comparable to PQ, even in network conditions that are unfavorable for our scheme. We also obtained supportive results from the conducted experiments, which alleviate concerns about the numerous assumptions adopted in the stochastic analysis section. Therefore, we can safely claim that DTQM has the potential to achieve smaller queuing delays and higher application satisfaction when connectivity is scarce.
Our next step is to enhance our evaluation in two directions: i) to present a more detailed analysis, which incorporates total capacity and storage capacity as well, in order to highlight one major property of DTN and ii) to extend the experiments by using the space-oriented testbed [15] that we have developed in our lab.

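TABLE I. NOTATION
Symbol                      Description
$N_i$                       Average number of packets in each queue
$\lambda_i$, $\lambda$      Packet arrival rate at class-i queue / Total packet arrival rate
$\mu_i$, $\mu$              Packet service rate at class-i queue / Total packet service rate
$\rho_i$, $\rho$            System utilization factor per class-i / System utilization factor
R                           Mean residual service time
$W_i$                       Average queuing delay of a class-i packet
$T_i$                       Average system delay of a class-i packet
$\overline{X^2}$            Second service moment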

TABLE II. EXPERIMENTAL RESULTS

Table 2 demonstrates in detail the best and worst case results for DTQM.