The H.264 video encoding technology, which has emerged as one of the most promising compression standards, offers many new delivery-aware features such as data partitioning. Efficient transmission of H.264 video over any communication medium requires a great deal of coordination between different communication network layers. This paper considers the increasingly popular and widespread 802.11 Wireless Local Area Networks (WLANs) and studies different schemes for the delivery of the baseline and extended profiles of H.264 video over such networks. While the baseline profile produces data similar to conventional video technologies, the extended profile offers a partitioning feature that divides video data into three sets with different levels of importance. This allows for the use of service differentiation provided in the WLAN. This paper examines the video transmission performance of the existing contention-based solutions for 802.11e, and compares it to our proposed scheduled access mechanism. It is demonstrated that the scheduled access scheme outperforms contention-based prioritized services of the 802.11e standard. For partitioned video, it is shown that the overhead of partitioning is too high, and better results are achieved if some partitions are aggregated. The effect of link adaptation and multirate operation of the physical layer (PHY) is also investigated in this paper.
Multimedia applications, such as video telephony and streaming, are becoming an important part of the network user experience. This trend is in part due to the advent of efficient video compression technologies such as the H.264 standard , and the increase in the speed of access networks. It is, thus, necessary to support such multimedia applications in widespread broadband access networks such as IEEE 802.11 Wireless Local Area Networks (WLANs) [2, 3]. Achieving this goal is a challenging task since wireless networks are inherently less reliable in the physical layer, and the operation of the medium access control (MAC) layer of the 802.11 WLAN is greatly dependent on the pattern of the traffic offered by the application layer. Therefore, it is necessary to control the parameters and operation of each layer in conjunction with the others to provide the necessary quality of service for multimedia traffic. This paper focuses on the delivery of H.264 video over 802.11e WLANs and studies the features and operations in each layer that can be controlled through cross-layer mechanisms. In particular, we consider MAC and PHY layer mechanisms for the delivery of different profiles and types of H.264 video, including partitioned H.264 video as well as basic profiles. We consider controlled and contention-based access in the MAC layer, and investigate possible performance improvements through customized link adaptation in the PHY.
The H.264 standard provides a network abstraction layer (NAL) for adapting the output of the video encoder to the requirements of the underlying delivery technology [4–6]. The underlying delivery technology discussed in this paper (i.e., an 802.11e WLAN) uses a carrier sense multiple access (CSMA) MAC layer with controlled and contention-based access methods. Utilizing user defined mechanisms, it is possible to achieve prioritized or guaranteed services in the 802.11e MAC layer. Although such services can be simply used to serve the video traffic as a regular stream, higher performance, efficiency and video quality can be achieved if the MAC and NAL services and parameters are optimized using the available information from each layer.
Most previous research on H.264 video transmission over wireless environments focused on general or cellular wireless networks and did not address the specific issues of WLANs [7–9]. The few notable solutions that do address WLANs ignore the complexities of the MAC layer operation [10, 11]. One noteworthy solution is described in . This work presents a way of mapping 802.11e MAC priorities to different types (partitions) of H.264 frames to improve the quality of video transmitted over a WLAN. This method, however, ignores the large overhead of the PHY and MAC layers and the sensitivity of the MAC layer operation to traffic pattern characteristics such as packet sizes. Moreover, the use of contention-based mechanisms and simple priorities results in very inefficient operation, as is shown later in this paper.
This paper proposes a cross-layer design that comprises mechanisms in the application, transport, and MAC layers. The design is based on mapping the MAC scheduling services to the different partitions and priorities provided by the H.264 encoding scheme. The scheduling services are provided using our previously published scheduling algorithm, controlled access phase scheduling (CAPS) , and its modified version for partitioned H.264 video. An enhancement based on aggregation of some H.264 partitions is also proposed. A summary of some of the mechanisms to deliver partitioned H.264 video over WLANs is presented in our work in . This paper elaborates on solutions presented in and considers other issues such as multirate operation. It also presents modifications to CAPS for partitioned H.264 video. In addition to these MAC layer mechanisms, the effect of PHY link adaptation and its possible customization for partitioned H.264 video are investigated in this paper.
This paper is organized as follows. Section 2 briefly reviews features of the H.264 video encoding standard and 802.11e WLAN standard. It highlights the error resiliency and network delivery related features of the H.264 encoding technology. It also emphasizes the specific MAC layer services of the 802.11e standard that are designed to support multimedia applications. Section 3 describes CAPS and its specific features that are crucial for providing guaranteed services for multimedia applications. In Section 4, our solutions for transmission of H.264 video over WLANs are presented. The proposed solutions are then compared with the existing schemes. Conclusions are presented in Section 5.
2. Overview of H.264 and IEEE 802.11e Standards
2.1. The H.264 Video Compression Technology
The H.264 standard consists of two conceptually different layers: the video coding layer (VCL) and the network abstraction layer (NAL). The VCL is designed to be transport unaware and only contains the core video compression engines that perform tasks such as motion compensation, transform coding of coefficients, and entropy coding. The VCL generates the encoded video slices, each of which is a collection of coded macroblocks (MBs) [1, 4]. These coded slices are passed to the NAL, where they are encapsulated into transport entities of the network. The NAL adapts the output of the VCL to the requirements of the underlying delivery technology.
The H.264 NAL defines an interface between the video codec and the delivery or transport mechanism. The data structure output by the NAL is called a NAL unit (NALU) and consists of a one-byte header and a bit string containing the bits of a coded slice (a collection of coded macroblocks). One of the fields of the NALU header is the NALU type, which can be used to signal to the delivery layer the class or type of service required by this NALU.
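The one-byte NALU header described above can be unpacked as in the following minimal sketch. The field layout (1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type) and the listed type codes follow the H.264 specification; the names `NaluHeader` and `parse_nalu_header` are illustrative and not taken from this paper.

```python
from typing import NamedTuple

# Subset of nal_unit_type codes from the H.264 specification.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    2: "data partition A",
    3: "data partition B",
    4: "data partition C",
    5: "IDR slice",
    7: "sequence parameter set",
    8: "picture parameter set",
}


class NaluHeader(NamedTuple):
    forbidden_zero_bit: int
    nal_ref_idc: int
    nal_unit_type: int


def parse_nalu_header(first_byte: int) -> NaluHeader:
    """Split the first byte of a NALU into its three header fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("the NALU header is a single byte")
    return NaluHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x1,  # top bit
        nal_ref_idc=(first_byte >> 5) & 0x3,         # next two bits
        nal_unit_type=first_byte & 0x1F,             # low five bits
    )
```

A delivery layer could look up `parse_nalu_header(b).nal_unit_type` in `NAL_TYPE_NAMES` to decide which class of service a packet should receive.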
The H.264 standard introduces a new design concept that enables it to generate self-contained packets without requiring large header fields. To do so, the encoder separates the higher-layer metainformation relevant to more than one slice from the media stream or video slices. The higher layer information is then delivered to the decoder using a reliable communications mechanism (inband or out of band) before transmitting the stream of video slices. This way it is possible to reduce the header information in each video packet to a codeword that identifies the set of parameters required for decoding the packet. The combination of higher-level parameters is called the parameter set concept (PSC) and usually includes information such as picture size, optional coding modes employed, and MB allocation map.
It is necessary that the information contained in the PSC arrives reliably at the decoder; otherwise, the H.264 codec will not be able to decode the video. However, the loss of coded slices is tolerable at the decoder. In fact, the H.264 standard specifies a number of error resilience techniques . One of these techniques, which is of particular interest to network applications, is data partitioning (DP). With DP, the data of each video slice is partitioned into three groups of different importance, and each group is delivered in a separate packet. Using this technique, higher-priority data can receive better service from the delivery layer.
The extended and baseline profiles of H.264 are designed for video communications applications. The data partitioning mechanism is not available in the "baseline" profile; however, it is supported in the "extended" profile. Therefore, the solutions based on DP are only applicable to the extended profile. Data partitioning is an important feature that allows a network-aware video encoder to achieve higher performance levels in a network that provides unequal error protection or quality of service. We examine the video communication techniques based on data partitioning in this work.
When data partitioning is used, the compressed data is divided into the following three units of different importance.
(i) Partition A contains the most important information, such as MB types, quantization parameters, and motion vectors. Without partition A information, symbols of the other partitions cannot be decoded.
(ii) Partition B (intrapartition) contains intra coded block patterns (CBPs) and intracoefficients. Since the intrainformation can stop further drift, it is more important than the interpartition (type C). The information in partition B packets can only be decoded if the corresponding partition A is available at the decoder.
(iii) Partition C (interpartition) contains only inter-CBPs and intercoefficients. This partition is the least important because its information does not resynchronize the encoder and decoder. The information in partition C packets can only be decoded if the corresponding partition A is available. However, the availability of partition B is not required.
If partition B or C is missing, the decoder can still use the header information, delivered by partition A packets, to improve the efficiency of error concealment. In fact, a comparatively high reproduction quality can be achieved if only texture information is missing and the MB types and motion vectors are available (from partition A).
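The dependency rules among the three partitions can be summarized in a few lines. The sketch below is illustrative only; the function name `decodable_partitions` is an assumption, and the logic encodes just the rules stated above (B and C each need A, and C does not need B).

```python
def decodable_partitions(received):
    """Return the partitions of one slice that the decoder can use.

    `received` is a set containing any of "A", "B", "C".
    """
    if "A" not in received:
        # Without partition A, neither B nor C can be decoded.
        return set()
    # With A present, B and C are each decodable independently of the other.
    return {"A"} | (set(received) & {"B", "C"})
```

For example, a slice for which only partitions B and C arrive yields an empty set, whereas a slice with A and C still yields both.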
2.2. The 802.11e WLAN Standard
The MAC layer of the 802.11 standard is based on the CSMA/CA mechanism [2, 3]. Collision avoidance is achieved through a distributed coordination function (DCF) that specifies the timing rules of accessing the wireless medium. Stations running DCF have to wait for an interframe space (IFS) time before they can access the wireless medium. The IFS is frame-type dependent. An arbitration IFS (AIFS) is used for data frames. The access point (AP) uses a PIFS (point coordination function IFS), which is shorter than AIFS, for management and polling messages; therefore, it can interrupt normal contention and take over the channel to create periods of contention free access called controlled access phase (CAP). As a result, the timeline of a WLAN can be viewed as being always in contention mode, interrupted occasionally by AP controlled CAPs.
MAC layer rules for controlling and coordinating access to the wireless medium in the 802.11e standard are specified under the hybrid coordination function (HCF) protocol that works on top of DCF . Using the services of DCF, HCF offers two access mechanisms: EDCA (enhanced distributed channel access) which is an enhanced version of the DCF of the original standard and is used for contention-based access, and HCCA (HCF controlled channel access) that specifies the polling or controlled access schemes. The 802.11e standard defines 8 different traffic priorities in 4 access categories (AC0–AC3) and also enables the use of traffic flow IDs, which allow per flow resource reservation. Access to the medium is normally done through EDCA; however, the AP can interrupt the contention period (CP) at almost any time by waiting a PIFS time, and initiate a CAP to allow HCCA access (Figure 1). This feature allows scheduled HCCA access to the channel; however, the standard does not mandate any specific scheduling algorithm for HCCA. An early solution (CAPS) proposed by the authors of this paper fills this gap and specifies such a scheduler .
Figure 1. 802.11e operation: CAP generation.
The 802.11e standard also introduces the concept of transmission opportunity (TXOP). TXOP specifies the duration of time in which a station can hold the medium uninterrupted and perform multiple frame exchange sequences consecutively with SIFS spacing.
Under the EDCA access mechanism, different AIFS values are used for different classes of traffic. The contention windows, from which random backoff durations are selected, are also different for each priority. Shorter AIFS times and smaller contention windows give higher access priority. This prioritization enables a relative and per class (or aggregate) QoS in the MAC. The 802.11e standard suggests a specific access category (AC) for video traffic and recommends priorities 4 and 5 for video. However, this assignment is not mandatory and user-defined mechanisms can use different configurations.
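To make the timing relationships concrete, the sketch below computes PIFS and the per-category AIFS for an 802.11b PHY. It assumes the usual relations PIFS = SIFS + one slot and AIFS = SIFS + AIFSN slots, 802.11b timing (SIFS of 10 µs, slot of 20 µs), and the commonly cited default EDCA parameters for aCWmin = 31. The category labels AC_BK, AC_BE, AC_VI, and AC_VO are used here instead of the AC0 to AC3 numbering of this paper, and all values should be checked against the standard before being relied upon.

```python
# 802.11b PHY timing in microseconds (assumed values).
SIFS_US = 10
SLOT_US = 20

# Assumed default EDCA parameters per access category for aCWmin = 31.
EDCA_DEFAULTS = {
    "AC_BK": {"aifsn": 7, "cw_min": 31},  # background
    "AC_BE": {"aifsn": 3, "cw_min": 31},  # best effort
    "AC_VI": {"aifsn": 2, "cw_min": 15},  # video
    "AC_VO": {"aifsn": 2, "cw_min": 7},   # voice
}


def pifs_us():
    """PIFS, used by the AP to seize the channel and start a CAP."""
    return SIFS_US + SLOT_US


def aifs_us(ac):
    """Arbitration IFS for the given access category."""
    return SIFS_US + EDCA_DEFAULTS[ac]["aifsn"] * SLOT_US


def mean_initial_backoff_us(ac):
    """Mean of the first backoff, drawn uniformly from 0..cw_min slots."""
    return EDCA_DEFAULTS[ac]["cw_min"] / 2 * SLOT_US
```

Under these assumptions PIFS is shorter than every AIFS, which is what lets the AP interrupt contention, and the video category waits less on average than best-effort traffic.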
The physical (PHY) layer of the 802.11 standard allows each packet to be transmitted at a different rate. The multirate operation is achieved using adaptive modulation and coding in the PHY. The mechanism that controls the transmission rate is called link adaptation (LA). The standard does not mandate any specific link adaptation algorithm. Conventional link adaptation schemes attempt to maintain a target bit error rate (BER) or packet error rate (PER) by adjusting modulation and coding parameters. Lower transmission rates usually yield lower BER. This article considers the multirate operation and LA scheme in resource reservation and assignment for video flows.
3. Guaranteed Service Provisioning in WLANs
Providing guaranteed services in WLANs is a challenging but feasible task. The 802.11e standard offers features for generating contention-free durations (known as CAPs) that, if scheduled properly, can provide guaranteed channel access to stations . The standard, however, does not specify any scheduler for this purpose and leaves it to developers to devise such schedulers. We propose the use of CAPS for this purpose. Several other scheduling mechanisms have been proposed in the literature [15, 16]. These mechanisms cannot be directly used with partitioned H.264 video and do not provide partial service guarantee or fairness in multirate networks. It has already been shown in  that only CAPS is able to provide fair guaranteed services in 802.11e WLANs. To provide such services, a QoS scheme must possess the following three features, each addressing an aspect of scheduling in a multirate 802.11e WLAN: (1) the ability to schedule uplink and downlink traffic at the same time, (2) the ability to schedule and switch between HCCA and EDCA access, and (3) the ability to maintain fairness and isolate flows from each other. These abilities must be maintained under multirate operation of a WLAN. The above features are all supported by CAPS .
Most of the CAPS functionality is implemented in the access point. CAPS uses the concept of virtual packets and combines the task of scheduling uplink and downlink flows of a naturally distributed CSMA/CA environment into a central scheduler that resides in the AP. The central scheduler uses a generalized processor sharing (GPS)-based algorithm, accompanied by an integrated traffic shaper, to provide guaranteed fair channel access to HCCA flows with reservation. The traffic shaping and scheduling mechanisms limit the HCCA service to the reserved amount and share the remaining capacity using EDCA. Through a modified central scheduler (e.g., temporal or throughput fair SFQ) that is based on start time fair queuing, multirate operation and packet loss issues are handled and fairness of the scheduling algorithm is maintained. The architecture of a station and an AP that implement CAPS is depicted in Figure 2. A complete description of the CAPS framework is found in . Here, only some features of CAPS are highlighted that are directly related to the proposed cross-layer mechanism.
Figure 2. Architecture of a station and access point which implement the CAPS-based mechanisms.
3.1. Combining Downlink/Uplink Scheduling
In a CSMA/CA WLAN, the medium is shared between downlink and uplink traffic at all times. Therefore, the scheduling discipline must consider both uplink and downlink traffic for scheduling at all times. Downlink packets are available in the AP buffers and can be directly scheduled, whereas uplink packets reside in the stations generating these packets and cannot be scheduled directly. However, the AP can use uplink traffic specifications, available through signaling (e.g., MAC signaling messages such as ADDTS) or feedback, and schedule poll messages that allow for uplink packet transmission.
The key to realizing the above scheduling concept is to represent packets from remote stations (i.e., uplink packets) by "virtual packets" in the AP, then use a single unified scheduler to schedule virtual packets along with real packets (downlink packets). When scheduling virtual packets, the AP issues poll messages in an appropriate sequence to generate transmission opportunities for uplink packets. This hybrid scheduling scheme combines uplink and downlink scheduling in one discipline and allows the use of a centralized single server scheduler design as shown in Figure 2.
As seen in Figure 2, there are two sets of queues, one serving packets without any HCCA reservation (EDCA queues), the other serving packets that belong to sessions with HCCA reservation (including virtual packet queues). This queuing architecture allows prioritized and guaranteed access traffic to coexist. The scheduler/shaper serves the HCCA queues for the amount of their reservation and then allows prioritized contention access to take place between all downlink queues (EDCA and HCCA queues).
Since enough information is usually available about the multimedia source, we assume that guaranteed service at a reserved rate is possible for multimedia streams. For example, for a video source it is assumed that information such as frame rate, average bitrate, average and maximum packet sizes, and maximum burst size is available. This information is sent to the AP by the station in ADDTS messages (which include an extensive set of QoS parameters) at session setup time. The virtual packet generator (VPG) at the AP uses this information to generate virtual packets at a rate equal to the average rate and at intervals equal to the inverse of the frame rate. The virtual packet sizes are calculated using the bitrate and frequency of packets. If further information about the composition of the video traffic is available, for example, how often I-frames are transmitted and their average size, the virtual packet generator can generate similar periodic sequences.
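The virtual packet generation described above can be sketched as follows. This is a simplified illustration, not the CAPS implementation: it emits one virtual packet per frame interval, sized so that the average bitrate is met. The class `VideoTspec`, the function `virtual_packet_schedule`, and the 25 frames/s value in the usage note are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class VideoTspec:
    """Subset of the traffic description a station might send in ADDTS."""
    frame_rate_fps: float
    mean_bitrate_bps: float


def virtual_packet_schedule(tspec, duration_s):
    """Return (time_s, size_bytes) pairs for virtual packets.

    One virtual packet is generated per frame interval, with a size
    chosen so that the long-term rate equals the mean bitrate.
    """
    interval_s = 1.0 / tspec.frame_rate_fps
    size_bytes = tspec.mean_bitrate_bps / tspec.frame_rate_fps / 8
    count = int(duration_s * tspec.frame_rate_fps)
    return [(k * interval_s, size_bytes) for k in range(count)]
```

For a 500 Kbps stream at a hypothetical 25 frames/s, `virtual_packet_schedule(VideoTspec(25, 500_000), 1.0)` yields 25 virtual packets of 2500 bytes spaced 40 ms apart.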
Assuming that the virtual packet generator and traffic shaper are properly configured and resources are reserved, we can rely on CAPS providing guaranteed access with bounded delay. As a result, we can focus on utilizing and adjusting the reservation for each flow in order to improve the overall system performance and efficiency. When multirate operation is concerned, it is assumed that admission control and scheduling are performed with the aim of providing service time fairness, and isolation of the flows in terms of the bandwidth (BW) assignment (not throughput assignment). This means that a flow is guaranteed a service time share of $\tau$, regardless of the PHY transmission rate it uses. If the flow uses a PHY transmission rate of $C$, it is guaranteed a bitrate of $\tau C$. When the transmission rate for this flow changes to $C'$ (with $C' < C$), the scheduler reduces the flow's guaranteed throughput to $\tau C'$ in order to ensure that the time share of the flow is maintained at the same level and other flows are not affected. In fact, with this change, the flow is restricted to its BW assignment and the reduction in its transmission rate is confined and isolated. Throughput fairness or guarantee can also be provided by CAPS, but is undesirable for heavily loaded networks. This paper only considers service time fairness and guarantee (temporal fairness). Section 4 examines H.264 video transport over WLANs that use EDCA or CAPS (with a temporal fair scheduler).
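A small numeric sketch of temporal fairness follows. It assumes that the guaranteed bitrate is simply the service time share multiplied by the PHY rate, ignoring MAC and PHY overhead; the function name and the example share of 0.1 are illustrative assumptions.

```python
def guaranteed_throughput_bps(time_share, phy_rate_bps):
    """Guaranteed bitrate for a flow holding a fixed share of channel time.

    Assumes bitrate = time share * PHY rate, with per-packet overhead ignored.
    """
    if not 0.0 <= time_share <= 1.0:
        raise ValueError("time_share must lie in [0, 1]")
    return time_share * phy_rate_bps


# Hypothetical flow holding 10% of the channel time.
share = 0.1
at_11_mbps = guaranteed_throughput_bps(share, 11e6)    # about 1.1 Mbps
at_5_5_mbps = guaranteed_throughput_bps(share, 5.5e6)  # about 0.55 Mbps
```

When the PHY rate halves, the guaranteed bitrate halves as well, while the time share, and therefore the channel time left for other flows, stays the same.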
4. Delivering H.264 Video Using EDCA & CAPS
As described in previous sections, the type of QoS provided for multimedia content in a WLAN is either prioritized services using EDCA, or guaranteed access services provided by methods such as CAPS under HCCA. The performance of these QoS measures, seen in terms of the packet loss ratio, directly affects the quality of the video playback. There are several methods for quantifying the video distortion (or quality degradation) based on the packet loss ratio . For example, we can estimate the expected distortion for a partitioned video as
$$E[D] = \sum_{i \in \{0,\, A,\, AB,\, AC,\, ABC\}} p_i D_i, \qquad (1)$$
where the index $i$ represents the set of partitions (A, B, C) that is received, $p_i$ denotes the probability of only the partitions in $i$ being received and decoded, and $D_i$ denotes the total distortion due to decoding the received partitions in the absence of the others. For simplicity, we did not show the dependence on rate and quantization parameters in (1). $D_0$ represents the case where the entire frame or partition A is lost, and $D_{ABC}$ represents the natural coding distortion, when no packets are lost. $D_i$ can be estimated depending on the error concealment mechanism used. Nevertheless, the well-known fact is that $D_0 \geq D_A \geq D_{AB}, D_{AC} \geq D_{ABC}$. It is known that in general (but not always) partition B is more important than partition C . The delivery mechanism can be adjusted to distribute the loss probability in such a way that the most important parts of the partitioned video incur lower loss ratios. This is indeed the basis for assigning priorities to different partitions (where partitioning is available). This paper does not assume any specific error concealment mechanism and provides general solutions for assigning partitions to different services of the WLAN.
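The expected distortion can be evaluated numerically as a probability-weighted sum over the possible reception outcomes. The sketch below additionally assumes that the three partitions of a slice are lost independently, which is a simplification made here and not a claim of this paper; the function names are illustrative and the distortion values in the usage note are made up.

```python
def reception_probabilities(loss_a, loss_b, loss_c):
    """Probability of each decodable outcome, assuming independent losses.

    Keys name the partitions that end up usable; "0" means nothing is
    usable because partition A was lost.
    """
    ok_a, ok_b, ok_c = 1 - loss_a, 1 - loss_b, 1 - loss_c
    return {
        "0": loss_a,
        "A": ok_a * loss_b * loss_c,
        "AB": ok_a * ok_b * loss_c,
        "AC": ok_a * loss_b * ok_c,
        "ABC": ok_a * ok_b * ok_c,
    }


def expected_distortion(probabilities, distortions):
    """Sum of p_i * D_i over all reception outcomes i."""
    return sum(p * distortions[k] for k, p in probabilities.items())
```

With hypothetical distortions such as `{"0": 100, "A": 40, "AC": 25, "AB": 20, "ABC": 5}`, lowering the loss probability of partition A reduces the expected distortion far more than the same improvement applied to partition C, which is the motivation for unequal protection.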
Given the availability of the prioritized and guaranteed services in a WLAN, and the ability of an H.264 encoder to produce different traffic patterns for the same video sequence, there are several different options for delivering H.264 video over a WLAN. Since the partitioning feature is only available in the extended profile, the methods based on this feature are only applicable to the video sequences encoded using the extended profile. The other methods are applicable to all profiles of H.264 video. Given these facts, the following methods are the feasible solutions for delivery of H.264 video over 802.11e WLANs (methods 2, 4, 5, and 6 are the proposed mechanisms of this paper).
(1) Transmission of the entire video traffic using one access category (priority level) of EDCA. This method is the most commonly used method; the interaction between the multimedia source and the delivery layer is limited to a type of service field in each video packet that informs the delivery layer of its priority class.
(2) Transmission of the entire video traffic in one stream over CAPS. This method relies on informing the CAPS-enabled WLAN of the traffic pattern of the multimedia flows in order to guarantee the required resources for them. Using this information, CAPS enables the MAC layer to better serve the video streams; however, the application layer (video) actions are limited to tagging each video packet with a stream ID or a type of service tag, and further information about the delivery layer services is not used by the multimedia source. Since video traffic is variable bitrate (VBR), the video rate sometimes exceeds the reserved throughput. In this case, CAPS provides a partial guarantee and the extra video bits are sent using EDCA. The same scenario occurs when multirate operation forces a lower guaranteed throughput for the video.
(3) Using the H.264 partitioning feature and transmitting each partition using a different priority level or access category (partition A and IDR frames use AC2 or priority 5, partitions B and C use AC1 or priorities 3 and 2). This method, presented in , uses information about available delivery layer services in the application layer (video source) to produce network aware content, but limits the available services to prioritized contention access services.
(4) Using the H.264 partitioning feature and transmitting partitions A (and IDR frame), B, and C using separate flows (sessions) over CAPS. The bandwidth (BW) reservation for partition A flows is at least at their required rate, while partitions B and C may receive lower reservations than they require. If only a partial throughput guarantee is available for the entire video, partition A flows are given priority in using the guaranteed throughput, and partitions B and C have to use a lower guaranteed bitrate and rely on EDCA if enough guaranteed resources are not available. While this prioritized use of the resources is enforced at stream setup time (by assigning different weights to flows), a different enhancement is possible in the scheduling mechanism itself. We propose to modify CAPS and give absolute priority to partition A packets over other packets of the same video stream. This is achieved by serving eligible partition A packets of a stream instead of the other partitions of the same stream, which may have lower time stamps. When control in CAPS is given to EDCA, the modified CAPS algorithm gives absolute priority to partition A packets in the internal collision resolution of EDCA .
(5) Using the H.264 partitioning feature as in the previous method, but aggregating partitions B and C in one Real-time Transport Protocol (RTP) packet (using the payload formats as described in ), then transmitting partition A (and IDR frame) and aggregated B and C in two separate CAPS flows. As in the previous case, the partition A flow is given priority in using the guaranteed services and in modified CAPS, while the aggregate B and C partitions may receive guaranteed services at levels lower than their bitrate. When multirate operation forces a lower transmission rate for the video flow, and only a partial bitrate guarantee is available, the reduced guaranteed throughput is first deducted from the partition C and B shares. For this mode, it is also possible to aggregate partition A and B packets in one RTP packet and serve partition C separately. Aggregating smaller packets increases the efficiency of the WLAN operation.
(6) Using the H.264 partitioning feature as in the previous method, aggregating partitions B and C in one RTP packet, then transmitting partition A (and IDR frame) and aggregated B and C in two separate CAPS flows, and serving each flow at a different PHY rate. The partition A flow is given priority in using the guaranteed services, and is assigned a PHY rate with an acceptably low PER, while the aggregate B and C partitions may receive guaranteed services at levels lower than their bitrate. The PHY rate assigned to partitions B and C is chosen according to the remaining service time share of the stream and the wireless link conditions. When the same rate is assigned to both flows, this solution is identical to solution 5.
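As an illustration of how the video source and the MAC can cooperate, the mapping of method 3 can be written as a lookup from NAL unit type to access category and priority. The access categories and priorities are those quoted for method 3 above; pairing partition B with priority 3 and partition C with priority 2 is an assumption about the intended order, the NAL unit type codes come from the H.264 specification, and the names `METHOD3_MAP` and `classify_method3` are hypothetical.

```python
# nal_unit_type -> (access category, priority), following method 3.
METHOD3_MAP = {
    5: ("AC2", 5),  # IDR slice
    2: ("AC2", 5),  # data partition A
    3: ("AC1", 3),  # data partition B (assumed pairing with priority 3)
    4: ("AC1", 2),  # data partition C (assumed pairing with priority 2)
}


def classify_method3(nal_unit_type):
    """Return the (access category, priority) for a partitioned-video NALU."""
    try:
        return METHOD3_MAP[nal_unit_type]
    except KeyError:
        raise ValueError(f"no method 3 mapping for NAL type {nal_unit_type}")
```

Methods 4 to 6 would replace the access category in this table with a CAPS flow identifier and a reservation weight, which this sketch does not attempt to model.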
The first of the above methods is in fact the simplest and most readily available mechanism for video communications in 802.11e WLANs. The second mechanism (and mechanisms 4, 5, and 6) can be used when CAPS is implemented in a WLAN. Mechanisms 3 to 6 depend on the partitioning feature of the H.264 video which is available in the extended profile. A summary of the requirements of each technique is given in Table 1.
Table 1. Requirements and Features of H.264 video communications techniques.
In methods 4, 5, and 6, the higher priority flows containing partition A and IDR frames receive their required bandwidth through the CAPS mechanism. However, the level of guaranteed service provided through CAPS for lower priority flows (containing partition B and C packets) may be lower than the bitrate of these flows. If extra bandwidth is available, partition B and C packets use the EDCA mechanism to access the channel and transmit the rest of their traffic. In fact, if only a partial guarantee is available due to multirate operation or VBR characteristics of the video, partition A and IDR frames have absolute priority over partition B and C packets in using CAPS-enabled guaranteed services. This priority is achieved through weight assignment and by modifying the scheduling decision algorithm of CAPS. In the modified CAPS, partition A packets of a stream are served ahead of partition B and C packets of the same stream, even if the time stamp of the partition A packet is larger.
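The modified selection rule can be sketched as follows. This is a deliberately simplified illustration, not the CAPS algorithm itself: streams are chosen by the smallest queued time stamp, "eligible" is reduced to "currently queued", and the traffic shaper, flow weights, and the switch to EDCA are omitted. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Packet:
    timestamp: float
    partition: str  # "A", "B", or "C"


@dataclass
class Stream:
    name: str
    queue: List[Packet] = field(default_factory=list)


def pick_next(streams: List[Stream]) -> Optional[Tuple[str, Packet]]:
    """Select the next packet to serve, or None if all queues are empty."""
    backlogged = [s for s in streams if s.queue]
    if not backlogged:
        return None
    # Baseline rule: serve the stream holding the smallest time stamp.
    stream = min(backlogged, key=lambda s: min(p.timestamp for p in s.queue))
    # Modification: within that stream, partition A packets go first,
    # even when a B or C packet carries a smaller time stamp.
    a_packets = [p for p in stream.queue if p.partition == "A"]
    packet = min(a_packets or stream.queue, key=lambda p: p.timestamp)
    stream.queue.remove(packet)
    return stream.name, packet
```

Note that the priority applies only among packets of the same video stream; the choice between streams is unchanged, so other flows keep their share.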
The aggregation of partitions B and C (or A and B) in method 5 increases the efficiency and capacity of the system. The aggregation task can be done in the application or MAC layer; however, it is better to use the aggregation feature of H.264 RTP payload format. This aggregation task can be combined with a cross-layer optimization mechanism for optimizing the size of video packets delivered to the MAC layer. This mechanism ensures that packets are small enough to maintain acceptable PHY layer packet error rate, while not reducing the MAC capacity significantly. Adjusting the packet length is an enhancement applicable to all the methods above, and is described in more detail in .
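The efficiency gain from aggregation comes from paying the fixed per-packet cost (interframe spaces, PHY preamble and header, MAC header, acknowledgment) once instead of twice. The sketch below models that cost as a single constant; the 500 µs figure and the partition sizes in the usage note are illustrative assumptions, not measurements from this paper.

```python
def airtime_us(payload_bytes, phy_rate_mbps, overhead_us):
    """Channel time for one packet: fixed overhead plus payload time."""
    return overhead_us + payload_bytes * 8 / phy_rate_mbps


def channel_efficiency(payload_sizes, phy_rate_mbps, overhead_us):
    """Fraction of channel time spent on payload bits."""
    payload_time = sum(n * 8 / phy_rate_mbps for n in payload_sizes)
    total_time = sum(
        airtime_us(n, phy_rate_mbps, overhead_us) for n in payload_sizes
    )
    return payload_time / total_time


# Hypothetical 300-byte partition B and 400-byte partition C at 11 Mbps.
separate = channel_efficiency([300, 400], 11, overhead_us=500)
aggregated = channel_efficiency([700], 11, overhead_us=500)
```

Under these assumed numbers the efficiency rises from roughly one third to roughly one half when the two partitions share a packet. The trade-off is that a longer packet is more likely to be hit by a bit error, which is why the packet size optimization mentioned above is applied together with aggregation.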
Method 6 describes another possible enhancement when the link adaptation scheme can be customized according to the video partition information. When link adaptation does not differentiate between partitions, this solution is reduced to solution 5, otherwise it may provide enhancements over method 5 or other methods, based on data partitioning. This method is treated as an extension of 5, and is only described at the concept level; a detailed analysis of methods based on PHY link adaptation is out of the scope of this paper, which focuses on MAC solutions.
Figure 2 shows the architecture of a station and an access point that implements the proposed CAPS-based mechanisms. The following subsections analyze and examine each of the above methods. These methods are compared and several simulation experiments are presented to evaluate the performance of these methods and identify the best solutions.
4.1. Single Stream H.264 Video Transmission Using CAPS and EDCA
If data partitioning for a video sequence is not used or is not available (as is the case for the H.264 baseline profile), the encoded video produced by an H.264 encoder is delivered as a single flow over the network. The produced traffic is a stream of packets that carry data belonging to I, B, or P frames. Since decoding B frames may require excessive buffering at the receiver, real time applications usually use only I and P frame types (B frames are not allowed in the baseline profile). The most widely used QoS solution in this case is to provide either prioritized (differentiated) or guaranteed services for the entire video stream, not differentiating between packets belonging to the same stream.
In WLANs, the prioritized services are inherently supported through the use of EDCA. For guaranteed services, a user defined QoS framework, such as CAPS, is needed. Using the EDCA mechanism, the video traffic is usually given a priority level of 4 or 5 (video access category). This priority level uses smaller contention window and shorter AIFS, resulting in higher access probability, but lower network capacity. Although the higher access probability yields favorably lower average delay for the video traffic, the jitter is still high for video. To examine this fact, we simulated a typical video communication scenario in home WLAN environments using OPNET and observed the delay performance of EDCA and CAPS mechanisms to determine the packet loss ratios. The WLAN used for these simulations was an 802.11e network with an 802.11b PHY layer (maximum PHY rate of 11 Mbps).
In this experiment, an uplink video session coexisted with heavy downlink traffic of 5 Mbps. We also considered 2 (and 6) stations sending uplink background traffic of 200 Kbps. The video was the CIF size H.264 foreman sequence with a bitrate of around 500 Kbps, using slice coding with a slice size of 700 Bytes. For the CAPS scenario, a 500 Kbps virtual flow was generated to reserve resources equal to the average bitrate of the video. For short durations when the video bitrate was higher than 500 Kbps, EDCA was used by CAPS (i.e., partial guarantee was provided for high bitrate periods). The cumulative distribution function of the measured delay for the video session is depicted in Figure 3. This figure shows that CAPS has a significantly better delay pattern than EDCA. For example, if the deadline is set to 100 microseconds, more than 10 to 20% of the packets in EDCA will miss their deadline, although the average delay of EDCA is far below this deadline. This experiment, which is based on real life scenarios, confirms that EDCA is not suitable for real time multimedia applications. It also demonstrates that the knowledge of the video traffic pattern, applied through CAPS, results in significantly better services for the video traffic. It must be noted that the better services provided through CAPS do not usually mean worse EDCA services for other traffic types, since most of the lost service in EDCA is due to collision.
Figure 3. CDF of delay for an uplink video flow (CAPS versus EDCA).
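The observation that a flow can have a low average delay and still miss many deadlines can be illustrated with a minimal sketch. The delay samples below are hypothetical values in arbitrary time units, not measurements from the experiment; the function simply evaluates one minus the empirical CDF at the deadline.

```python
from statistics import mean


def deadline_miss_ratio(delays, deadline):
    """Fraction of packets whose delay exceeds the deadline.

    This equals 1 - F(deadline), where F is the empirical CDF of the delays.
    """
    if not delays:
        raise ValueError("delays must not be empty")
    late = sum(1 for d in delays if d > deadline)
    return late / len(delays)


# Hypothetical samples (arbitrary time units), NOT data from the paper:
# a contention-like flow with a low typical delay but a long tail,
# and a scheduled-like flow with a higher but tightly bounded delay.
edca_like = [5] * 85 + [150] * 15
caps_like = [20] * 100
deadline = 100

for name, delays in (("contention-like", edca_like), ("scheduled-like", caps_like)):
    print(name, "mean delay:", mean(delays),
          "miss ratio:", deadline_miss_ratio(delays, deadline))
```

In this toy data the contention-like flow has a mean delay well below the deadline, yet 15% of its packets are late, which is the kind of gap between average delay and deadline misses that a delay CDF makes visible.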
In addition to high jitter levels, the ability of EDCA to maintain service levels decreases quickly as the background traffic increases in the WLAN. In contrast, CAPS is able to maintain the service level requested by the multimedia session. To see this, we observed the average and maximum delay of a 256 Kbps H.264 video flow as the background traffic of all classes (including voice) increased in an 11 Mbps 802.11e WLAN. The results shown in Figure 4 indicate that CAPS protects the flow from background traffic, whereas EDCA fails to protect the flow. The same result is also seen when traffic of the same class increases in the network. With EDCA, unlike with CAPS, a misbehaving high bitrate flow can take over the channel, and low bitrate flows of the same class suffer from excessive delay.
Figure 4. Delay of a single video session as background traffic increases.
The above experiments assumed negligible error rates at the PHY layer (bit error rate: $10^{-6}$), and only considered the MAC layer issues. To study the effects of PHY errors, we set up a new simulation scenario. Interestingly, it was observed that the capacity of the network (MAC layer) decreases at a faster pace than expected as PHY error rates increase. This is mainly due to retransmission attempts and increased collision that further reduce the MAC capacity. The WLAN used for this experiment consisted of one uplink video source (CIF size H.264 encoded foreman video with 500 Kbps bitrate and 700 Bytes slice sizes) and a number of stations generating background traffic (30 stations, 200 Kbps bitrate, with 1000 Bytes packets and exponential interarrival times). Two PHY conditions were considered: a no-error condition and a bit error rate of $10^{-5}$ (typical to moderate). We also simulated a lightly loaded (6 background stations) network with typical error levels. The cumulative distribution function of the measured delay in each scenario is depicted in Figure 5. Link adaptation was disabled in this experiment, in order to see the effect of PHY error on MAC operation.
Figure 5. CDF of delay in a WLAN with and without PHY errors.
We observe that introducing errors in the PHY layer has a significant effect on EDCA operation because it incurs retransmission, effectively increasing the load of the network and the probability of collision. The PHY error effects are very limited in CAPS.
The above experiments demonstrate the effectiveness of CAPS in providing higher-quality services for video traffic. The better delay performance directly affects the quality of the real time video delivered to and played back at the receiver. To better understand this effect, we implemented an offline network simulator framework. This framework, depicted in Figure 6, is used to apply the effect of packet loss due to physical layer errors and MAC delay issues to a real time video whose packet traces were used in previous experiments.
Figure 6. Offline video communication simulator.
Using this offline simulator, the effects of PHY errors and MAC delay were applied to the 500 Kbps foreman video (from the previous experiment) and the output video was observed. Some snapshots of the played back video are depicted in Figure 7. We considered several different delay deadlines for the received packets. As expected, CAPS performance was clearly superior to that of EDCA, and the video quality was considerably better. Having studied the characteristics of CAPS, Section 4.2 focuses on the main subject of this article, which is the delivery of partitioned H.264 video over WLANs.
Figure 7. Snapshots of foreman video, transmitted over a WLAN with delay deadlines of 100 and 250 microseconds.
4.2. Transmission of Partitioned H.264 Video
The data partitioning feature is available in the extended profile of the H.264 standard. Using this feature, three different data sets (A, B, and C) with different importance are generated for each video frame or slice. If the underlying delivery network is able to provide unequal error protection (UEP) or any kind of QoS, each data partition can be served differently, potentially achieving better services than the single streaming case. In effect, the availability of the data partitioning feature allows a network-aware video source to adapt its output to the requirements and services of the underlying delivery mechanism, that is, the 802.11e WLAN. In this case, the interaction between the network-aware multimedia source and the QoS-enabled delivery layer results in a cross-layer solution with many configurations.
One such cross-layer design has been proposed in earlier work, which serves each partition using a different EDCA access category. Using this method, IDR and partition A data are served in the WLAN using AC2 (priorities 4 and 5). Partitions B and C are transmitted using AC1. The highest priorities, 6 and 7 or AC3, are reserved for the initial parameter sets.
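The partition-to-access-category mapping described above can be sketched as a small lookup keyed on the H.264 `nal_unit_type` field (the low five bits of the first NAL unit byte). The NAL type values are those defined by H.264; the function names and the default category for other NAL types are illustrative assumptions, not part of the cited scheme.

```python
# H.264 nal_unit_type values relevant to data partitioning.
NAL_PARTITION_A = 2
NAL_PARTITION_B = 3
NAL_PARTITION_C = 4
NAL_IDR = 5
NAL_SPS = 7  # sequence parameter set
NAL_PPS = 8  # picture parameter set

# Mapping as described in the text: parameter sets -> AC3,
# IDR and partition A -> AC2, partitions B and C -> AC1.
NAL_TYPE_TO_AC = {
    NAL_SPS: 3,
    NAL_PPS: 3,
    NAL_IDR: 2,
    NAL_PARTITION_A: 2,
    NAL_PARTITION_B: 1,
    NAL_PARTITION_C: 1,
}


def nal_unit_type(nal: bytes) -> int:
    """Extract nal_unit_type from the first byte of a NAL unit."""
    if not nal:
        raise ValueError("empty NAL unit")
    return nal[0] & 0x1F


def access_category(nal: bytes, default_ac: int = 1) -> int:
    """Return the EDCA access category index for a NAL unit.

    default_ac is a hypothetical fallback for NAL types the text does not
    mention (for example, non-partitioned slices).
    """
    return NAL_TYPE_TO_AC.get(nal_unit_type(nal), default_ac)


print(access_category(bytes([0x65])))  # first byte 0x65 carries nal_unit_type 5 (IDR)
```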
Although this EDCA-based method may provide better services than just using a single stream and EDCA access, it does not consider the significantly large PHY and MAC overhead of transmitting 3 packets (one for each partition type) instead of 1 (with no partitioning). The PHY and MAC overheads in an 802.11 WLAN are significantly larger than the RTP/UDP/IP overheads. Besides adding MAC and PHY headers to each packet, the increase in the number of packets results in more contention attempts and higher collision probabilities in the MAC. These issues are not considered in that work. In this section, we demonstrate the inefficiency of using 3 partitions and EDCA.
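A rough airtime estimate illustrates why sending three partition packets costs more than sending one packet with the same payload. The sketch below is a simplified model under stated assumptions: the timing and header constants are typical 802.11b long-preamble values chosen for illustration, backoff and collisions are ignored, and the partition sizes are hypothetical.

```python
# Assumed constants (typical 802.11b values, chosen for illustration only).
PLCP_US = 192.0          # long PLCP preamble + header duration
SIFS_US = 10.0
DIFS_US = 50.0
MAC_OVERHEAD_BYTES = 34  # assumed MAC header + FCS
LLC_SNAP_BYTES = 8
RTP_UDP_IP_BYTES = 40    # 12 (RTP) + 8 (UDP) + 20 (IPv4)
ACK_BYTES = 14


def airtime_us(payload_bytes, phy_rate_mbps=11.0, ack_rate_mbps=2.0):
    """Channel time for one data frame plus its ACK, without backoff or collisions."""
    frame_bytes = (MAC_OVERHEAD_BYTES + LLC_SNAP_BYTES
                   + RTP_UDP_IP_BYTES + payload_bytes)
    data_us = PLCP_US + frame_bytes * 8 / phy_rate_mbps  # bits / Mbps = microseconds
    ack_us = PLCP_US + ACK_BYTES * 8 / ack_rate_mbps
    return DIFS_US + data_us + SIFS_US + ack_us


# One 700-byte slice sent whole, versus split into three partition packets.
# The 350/100/250 split is a hypothetical example.
single = airtime_us(700)
split = sum(airtime_us(size) for size in (350, 100, 250))

print(f"single packet: {single:.0f} us, three partitions: {split:.0f} us")
print(f"extra airtime from partitioning: {100 * (split / single - 1):.0f}%")
```

In this model the extra cost of partitioning is exactly two additional fixed per-packet overheads, independent of how the payload is split; the additional contention attempts noted in the text would add to it.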
To reduce the effect of increased contention and collision, we propose to use the CAPS mechanism to deliver the partition data in separate flows. This mechanism is directly comparable to the EDCA-based method described above. As an enhancement, we also consider using NAL aggregation to combine partitions B and C in one RTP packet. This enhancement should significantly boost the system efficiency. The reason is that partition B usually has a smaller size than partitions A and C; thus, aggregating it with either type A or C results in considerable capacity savings without considerably jeopardizing the unequal error protection. The performance of these mechanisms is examined through simulation experiments. Since the delay performance of EDCA and CAPS streams was studied in Section 4.1, we examine a more visible performance measure in this case, the loss ratio for each data partition.
To take into account the multirate operation of the PHY, partial guarantee for flows using CAPS is assumed in our experiments. For partitioned video, the available guaranteed throughput can be assigned to the more important partitions, letting the less important data be delivered through EDCA. Assuming that the data rates of the partitions are $R_A$, $R_B$, and $R_C$, and that all partitions are delivered at a PHY rate of $C$, the service time share of each partition and the total share are the following: $S_A = R_A/C$, $S_B = R_B/C$, $S_C = R_C/C$, and $S = S_A + S_B + S_C = R/C$, where $R = R_A + R_B + R_C$.
When the PHY rate drops to $C' < C$, the required service time share increases to $R/C'$. This excess time share is not granted by the service time fair scheduler; thus, only a partial guarantee with a guaranteed throughput of $R' = S\,C' = R\,C'/C$ is provided. This guaranteed throughput is what is used in our experiment, instead of $R$. The guaranteed throughput is first provided to the partition A flow, and the remainder is provided to partitions B and C.
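The partial guarantee can be sketched as follows, assuming that the service time share granted at setup is kept fixed, so that the guaranteed throughput scales with the ratio of the current PHY rate to the initial one. The function name and the example numbers are illustrative assumptions.

```python
def partial_guarantee(rates_kbps, initial_phy_rate, current_phy_rate):
    """Split the guaranteed throughput among flows in priority order.

    rates_kbps: dict of flow name -> data rate, listed from most to least
        important (insertion order is the priority order).
    initial_phy_rate, current_phy_rate: PHY rates in the same unit.

    Assumes the service time share granted at setup stays fixed, so the
    guaranteed throughput shrinks proportionally when the PHY rate drops.
    """
    total = sum(rates_kbps.values())
    guaranteed = total * min(1.0, current_phy_rate / initial_phy_rate)

    allocation = {}
    remaining = guaranteed
    for name, rate in rates_kbps.items():
        granted = min(rate, remaining)   # serve the more important flow first
        allocation[name] = granted
        remaining -= granted
    return guaranteed, allocation


# Hypothetical example: PHY rate drops from 11 Mbps to 5.5 Mbps.
guaranteed, allocation = partial_guarantee({"A": 300, "B+C": 200}, 11.0, 5.5)
print(guaranteed, allocation)
```

Whatever part of a flow's rate is not covered by its allocation would have to be delivered through contention access.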
As a first step in examining our proposed method, an experiment was set up to observe the loss ratio for each partition type of a single CIF size foreman video delivered using EDCA and CAPS mechanisms in a WLAN with different levels of background traffic. To have a fair comparison, it was assumed that partitions B and C are delivered using the same flow in the CAPS scenario (since they both use AC1 in method 3 of [10, Table 1]). The background data sources in the experiment had a rate of 500 Kbps and generated packets with uniformly distributed size between 50 and 1950 Bytes. The interarrival of these packets was exponential. The delay limit for the real time (conversational class) video application was set at 100 microseconds, and late packets were dropped at the receiver. For the case where CAPS was used, we reserved 300 Kbps of the WLAN capacity, using CAPS, for partition A, and 50 Kbps for partitions B and C combined. This is 150 Kbps less than the total bitrate of the video, in order to simulate a case with partial resource guarantee. Since CAPS allows partial reservation for a flow, any amount of reservation for partitions will result in performance better than EDCA.
The results of the experiment are depicted in Figure 8 and show the increase in packet loss ratio when the background traffic increases. From this figure it is clearly seen that the EDCA-based scheme fails much sooner than CAPS, which manages to deliver partition A packets with a negligible loss ratio. For partitions B and C, the loss ratio when CAPS is used increases as the background traffic increases. The reason is that the guaranteed resources are first used to serve partition A, and partitions B and C only receive the leftover service. Nevertheless, since they still receive partial guaranteed access, the performance of CAPS is considerably better than that of the EDCA-based solution. This performance improvement can be further enhanced by aggregating some of the small partition packets in the H.264 NAL. This scheme is examined in Section 4.3.
Figure 8. Delivery of partitioned video using CAPS and EDCA.
An interesting observation from Figure 8 is that although partitions B and C start losing packets sooner in the EDCA-based scheme, the loss ratio of partition A packets rises more quickly than that of partitions B and C. This is in fact the result of smaller contention window sizes for higher priorities in EDCA. To overcome this problem, one can increase the contention window size of AC2 to the same level as AC1. However, the effect in this case would be a reduction of priority for partition A packets. As a result, packets of this type would start missing their deadline at a lower number of background stations. These problems, which are inherent in EDCA due to its contention access mechanism, add to the existing issues of high jitter and wide delay pattern for EDCA delivered flows (refer to the delay patterns shown in Figures 3 and 5).
4.3. Partitioned H.264 Video Communications with Aggregation
Knowing that the overhead of transmitting small partition packets is significant, the capacity of the system is boosted by aggregating the packets of partitions B and C. The performance enhancement in this case is shown to be significant, and easily compensates for the effect of the lost differentiation between partitions B and C (due to aggregation). Since partition B packets are usually the smallest packets in P frames, one could also aggregate partitions A and B instead. Our tests have shown that the two different mechanisms yield similar results. For this reason, this paper only shows results obtained using aggregation of partitions B and C.
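One way to realize this aggregation at the RTP payload level is a single-time aggregation packet (STAP-A) as defined in the RTP payload format for H.264 (RFC 6184). The text does not state which aggregation packet type was used, so the choice of STAP-A here is an assumption; the sketch builds only the RTP payload, not the RTP header.

```python
import struct

STAP_A_TYPE = 24  # NAL unit type value of a STAP-A aggregation packet


def build_stap_a(nal_units):
    """Aggregate NAL units (e.g., partitions B and C of a slice) into one payload.

    Layout: one STAP-A header byte, then for each NAL unit a 16-bit size in
    network byte order followed by the NAL unit itself.
    """
    if not nal_units:
        raise ValueError("at least one NAL unit is required")
    f_bit = 0
    nri = 0
    for nal in nal_units:
        if not 0 < len(nal) <= 0xFFFF:
            raise ValueError("NAL unit size must fit in 16 bits")
        f_bit |= nal[0] & 0x80          # set if any aggregated unit has F set
        nri = max(nri, nal[0] & 0x60)   # highest NRI among aggregated units
    payload = bytearray([f_bit | nri | STAP_A_TYPE])
    for nal in nal_units:
        payload += struct.pack("!H", len(nal)) + nal
    return bytes(payload)


# Hypothetical partition B (type 3) and partition C (type 4) NAL units.
partition_b = bytes([0x43]) + bytes(10)
partition_c = bytes([0x24]) + bytes(20)
stap = build_stap_a([partition_b, partition_c])
print(len(stap), hex(stap[0]))
```

The aggregated payload adds three bytes of framing per packet pair compared with the raw NAL units, while saving one full set of RTP/UDP/IP, MAC, and PHY overheads.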
To examine the gain achieved by the aggregation mechanism, we set up an experiment to measure the capacity of a WLAN supporting H.264 video sources. We again used the 500 Kbps encoded foreman sequence, with and without partitioning. We also included 6 stations, each generating 500 Kbps background traffic as in the previous experiment. For this experiment, we increased the number of video sources and observed the loss ratio (with 100 microseconds delay limit). Resource reservation for single stream video was 350 Kbps. CAPS reservation was set to 300 Kbps for the partition A flow and 50 Kbps for the aggregated B and C partitions. Figure 9 shows the loss ratio for the video without partitions that is served by CAPS and EDCA, as well as the total loss ratio for the aggregated partitioned video scenario served by CAPS. As expected, the loss ratio for EDCA is much higher than for CAPS. It is also seen that with partitioning, the increased overhead reduces capacity (compare "CAPS partitioning no aggregation" with "CAPS single stream"). With aggregation, we can compensate for the reduced capacity and achieve lower loss ratios, only slightly higher than in the single stream case.
Figure 9. Total loss ratio for EDCA, CAPS, and partitioned video with CAPS (aggregation and no aggregation).
From the same set of experiments, the loss ratio for each partition type is also observed and depicted in Figure 10. As seen from this figure, the aggregation method results in the lowest loss ratio for the important partitions of a video sequence and provides higher capacity than all other methods. The important point to observe in Figure 10 is that the loss ratio of partitions A and B type in the NAL aggregation case is lower than the combined loss ratio of the video in the same scenario, and the single stream case. For example, when 14 video sources are used, although the aggregation method and the single stream case both lose almost 7% of the packets (Figure 9), in the aggregation-partitioning method most of the loss occurs for the partition B or C packets, and important data in partition A has a loss ratio of only 4%. It is also shown that the modification to CAPS that allows partition A packets to be served ahead of partitions B and C of the same stream results in some enhancements, especially when the network load is high. One interesting observation in Figure 10 is that despite the use of modified CAPS, some partition A packets are still lost while partitions B and C are served. This is because the modified CAPS cannot prioritize partition A packets of one flow over partitions B and C of other flows. The modified CAPS only prioritizes partitions within a video stream, and not between different video streams.
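The intra-stream prioritization described above can be pictured with a minimal per-stream queue in which partition A packets are always dequeued before partition B and C packets of the same stream. This is only an illustration of the ordering rule, not the CAPS scheduler itself; the class and its interface are hypothetical.

```python
import heapq
from itertools import count

# Lower value is served first. B and C share a priority level here,
# since they are aggregated in the scheme discussed in the text.
PARTITION_PRIORITY = {"A": 0, "B": 1, "C": 1}


class StreamQueue:
    """Queue for one video stream: partition A first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def push(self, partition, packet):
        entry = (PARTITION_PRIORITY[partition], next(self._seq), partition, packet)
        heapq.heappush(self._heap, entry)

    def pop(self):
        _, _, partition, packet = heapq.heappop(self._heap)
        return partition, packet

    def __len__(self):
        return len(self._heap)


queue = StreamQueue()
queue.push("B", "b1")
queue.push("C", "c1")
queue.push("A", "a1")
served = [queue.pop() for _ in range(len(queue))]
print(served)
```

Because each stream has its own queue, a scheduler that alternates between streams can still serve partition B or C packets of one stream while partition A packets of another stream are waiting, which matches the limitation noted in the text.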
Figure 10. Packet loss ratio for different partitions, using different schemes.
Figure 10 shows the increase in system capacity and emphasizes that employing CAPS and partial aggregation of partitioned video in the NAL can ensure a higher multiplexing gain. For example, if we have video streams with very high variation in bitrate, to ensure a certain PSNR is always met for these videos, we either have to limit the number of videos or use a combination of CAPS, partitioning, and aggregation to ensure minimum required frames are delivered. Figure 11 depicts the PSNR quality measure for the received video streams, whose loss ratios are shown in Figure 10. It is seen that protecting partition A at the expense of partitions B and C does indeed result in higher quality of the received video. Figure 11 also shows that the slight increase in the loss ratio due to the use of partitioned video stream and higher overhead (as was shown in Figure 9) is more than compensated by the quality gain due to protecting important parts of the video stream (i.e., partition A). It is observed in Figure 11 that at very high loss ratios, when EDCA solutions are used with more than 10 video sources, the video PSNR becomes similar for the DP-based scheme and the single-stream scheme. For such situations, where the loss ratio is very high, the received video quality is so low that a comparison of PSNRs is not very informative.
Figure 11. PSNR of the received H.264 video stream.
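For reference, the PSNR measure used in this comparison follows the standard definition based on the mean squared error between the original and the received frame. The sketch below assumes 8-bit samples given as flat sequences; it is the textbook formula, not the authors' evaluation tool.

```python
import math


def psnr(reference, received, peak=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(reference) != len(received) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical luma samples of a reference and a degraded frame.
print(round(psnr([100, 120, 140, 160], [101, 118, 143, 160]), 2))
```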
It must be noted that the performance gain in the case of NAL aggregation is due to the reduction of the MAC and PHY overhead. This is based on the fact that the MAC and PHY layers add considerable overhead to each packet. This might suggest that the largest packet sizes must be used to achieve the highest efficiency; however, the PHY packet error rate is directly related to the size of packets, and smaller packets have a lower loss probability.
Resolving the above tradeoff is a challenging task, as it is usually not possible to find the optimum packet size. Consequently, we propose to use the maximum packet size that yields an acceptable PHY loss rate (usually a PER of less than 5% or 10%) and is less than the MAC fragmentation level. This way we ensure that the PHY loss rate is restricted while the least MAC inefficiency is incurred. This size, L, can be calculated from an inequality in which e is the assumed physical layer BER, P is the acceptable PER, and transmission attempts are limited to X times for each lost packet.
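A minimal sketch of such a size limit is given below under explicit assumptions, since the exact inequality is not reproduced here: bit errors are taken as independent, so a packet of L bits has PER $1-(1-e)^{L}$, and a packet is considered lost only if all X attempts fail, giving the condition $(1-(1-e)^{L})^{X} \le P$. Both the form of the condition and the function names are assumptions made for illustration.

```python
import math


def packet_error_rate(length_bits, ber):
    """PER of a packet of length_bits, assuming independent bit errors."""
    return 1.0 - (1.0 - ber) ** length_bits


def max_packet_bits(ber, acceptable_per, attempts=1,
                    fragmentation_threshold_bits=None):
    """Largest L (in bits) with (1 - (1 - ber)**L) ** attempts <= acceptable_per.

    The result is optionally capped by a MAC fragmentation threshold.
    """
    if not (0.0 < ber < 1.0 and 0.0 < acceptable_per < 1.0 and attempts >= 1):
        raise ValueError("ber and acceptable_per must be in (0, 1), attempts >= 1")
    per_attempt = acceptable_per ** (1.0 / attempts)
    limit = math.floor(math.log1p(-per_attempt) / math.log1p(-ber))
    if fragmentation_threshold_bits is not None:
        limit = min(limit, fragmentation_threshold_bits)
    return limit


size_bits = max_packet_bits(ber=1e-5, acceptable_per=0.10)
print(size_bits, "bits, about", size_bits // 8, "bytes")
```

With a single attempt, the limit is the largest packet whose PHY PER stays within the acceptable value; allowing more attempts relaxes the limit, at the cost of extra retransmission airtime that this sketch does not account for.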
4.4. Customized Link Adaptation for Transporting Partitioned H.264 Video
In previous sections, we assumed that a conventional LA scheme controls the multirate operation of the PHY. Conventional LA schemes attempt to maintain a certain BER. In this case, the multirate operation of the PHY is independent of the MAC operation and scheduling. The scheduler and admission control modules are responsible for handling the change in transmission rate and maintaining service time fairness. As seen in the previous section, the reduced transmission rate results in a loss of guaranteed service for streams, and flows have to partly use EDCA. Knowing that video is error resilient, we may accept higher PHY error rates and use a higher transmission rate in order to achieve higher guaranteed throughput. Using this fact, it is possible to customize the link adaptation scheme for a partitioned video stream and potentially achieve better performance. For this purpose, we consider an LA scheme in this subsection that attempts to achieve better quality of delivered video rather than maintaining a certain BER. This LA scheme operates under the service time constraints and attempts to distribute the service time share of a video stream between different partitions in a way that achieves the best video quality. This scheme can be used with all methods described in the previous sections, but we use the last method (CAPS with aggregation) to simplify the description.
To see how service time constraints are met, assume that the data rates of the partitions are $R_A$, $R_B$, and $R_C$, and for aggregated partitions B and C $R_{BC} = R_B + R_C$. Assuming all partitions are initially (at stream setup time) delivered at a PHY rate of $C$, the share of each partition and the total share are the following: $S_A = R_A/C$, $S_{BC} = R_{BC}/C$, and $S = S_A + S_{BC}$.
Given that partitions B and C data are not decodable if partition A is not available, it is necessary that partition A data is protected at the expense of partitions B and C. This means that the link adaptation algorithm should first find a PHY rate with acceptable PER for partition A data, within the service time limits, and then assign a rate for partitions B and C that results in a loss ratio lower than or equal to the loss ratio in case of using the same transmission rate as partition A. The typical PHY PER of 10% is an acceptable value for partition A since the MAC uses at least three retransmissions, achieving a PER of less than 0.1%. Denoting the PHY rate achieving this PER as $C_A$, the service time share of partition A becomes $S'_A = R_A/C_A$. The remaining service time is used for partitions B and C, which yields $S'_{BC} = S - S'_A$. This means that a PHY rate of $C_{BC} = R_{BC}/S'_{BC}$ can be used to ensure that $R_{BC}$ is guaranteed for partitions B and C, providing full-guaranteed throughput using CAPS.
The full throughput guarantee eliminates the packet loss due to EDCA access (in case of partial guarantee); however, the PHY layer loss may become significant in this case because $C_{BC}$ may not provide a low PER in the PHY. Considering the retransmission in the MAC, we should choose $C_{BC}$ so that the minimum loss for B and C partitions occurs. Determining this rate is not a straightforward task. If $C_{BC}$ is selected so that PHY PER is 10% (0.1% in MAC), the solution becomes what we already discussed in method 5 (Section 4.3), and the loss due to partial guarantee occurs. If higher PHY rate and PER values are acceptable, the loss due to partial-guaranteed access decreases, but the total loss may be more or less than the previous case (PER of 10%). In fact, the best solution depends on the network load as well as the wireless channel condition. Therefore, PHY PER as well as the network or MAC layer loss (due to EDCA or CAPS access) ought to be known before a solution can be found. Given that this subject requires extensive study of the PHY PER patterns and interaction of PHY and MAC, it is left for future analysis.
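The rate assignment discussed in this subsection can be sketched as follows, assuming the service time share of the stream is fixed at setup and partition A is served first at its own PHY rate. The function names, the example rates, and the use of the 802.11b rate set are illustrative assumptions; the sketch computes only the rate needed for a full throughput guarantee and does not address the resulting PHY loss.

```python
def required_bc_rate(rate_a, rate_bc, initial_phy_rate, phy_rate_a):
    """PHY rate partitions B and C need for a full throughput guarantee.

    All rates use the same unit (e.g., Mbps). The stream's total service time
    share is fixed by the setup-time PHY rate; partition A is served at
    phy_rate_a and B/C must fit into the share that is left.
    Returns None when partition A alone uses up the whole share.
    """
    total_share = (rate_a + rate_bc) / initial_phy_rate
    share_a = rate_a / phy_rate_a
    share_bc = total_share - share_a
    if share_bc <= 0:
        return None
    return rate_bc / share_bc


def pick_phy_rate(required, available=(1.0, 2.0, 5.5, 11.0)):
    """Lowest available PHY rate that meets the requirement, or None."""
    for rate in sorted(available):
        if rate >= required:
            return rate
    return None


# Hypothetical example (Mbps): stream admitted at 5.5, partition A sent at 2.
needed = required_bc_rate(rate_a=0.1, rate_bc=0.4,
                          initial_phy_rate=5.5, phy_rate_a=2.0)
print(needed, pick_phy_rate(needed))
```

When the function returns None, no PHY rate for partitions B and C can restore the full guarantee, and part of the stream would have to rely on contention access.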
However, as an example, we consider the experiment of Section 4.3, whose results are depicted in Figures 9 and 10. In this experiment, the link adaptation was assumed to provide a PHY PER of 10%. Given the retransmission in the MAC, the effect of PHY loss is limited. Thus, the loss reported in Figure 10 is solely due to congestion and collision in the MAC. When 14 stations are considered in this network, the loss ratio for partitions B and C is around 20%. Any PHY rate that results in a combined PHY and MAC loss ratio of less than 20% will be a better solution. In conclusion, solution 5, serving partitioned data with modified CAPS and aggregating partitions B and C, provides the best results when link adaptation is not involved and only MAC layer solutions are considered. If further network and channel information is available, a customized link adaptation scheme can be used, which may result in better performance.
Video communications over WLANs require certain QoS measures that are not readily available in regular 802.11 or 802.11e-based networks. This paper has shown that using cross-layer mechanisms can improve the quality of the delivered video. This is primarily achieved through providing knowledge of the video traffic pattern to the 802.11e MAC layer, and informing the VCL or NAL layers of an H.264 video source of the availability of guaranteed services in the WLAN. We have proposed three methods based on CAPS and a modified version of CAPS for supporting video communications over 802.11e WLANs. These methods were tested and compared with existing EDCA-based mechanisms. This paper also discussed how link adaptation may be customized for partitioned H.264 video.
Through experiments, it was shown that for the baseline profile of H.264 the best performance is achieved using CAPS, which is a guaranteed access HCCA scheme that is able to accommodate VBR traffic. The ability to provide partial guarantee is a key factor in preferring CAPS to other HCCA schemes. For the extended profile, the best performance is gained when the data partitioning feature of H.264 is used along with NAL/RTP aggregation of some partitions, and the resulting streams are transported using modified-CAPS services in the 802.11e MAC layer.
This paper presents a preliminary framework for customizing the link adaptation schemes for the delivery of partitioned H.264 streams. Extending this framework and analyzing the optimization problems arising from its use are interesting open research subjects.
S Wenger, H.264/AVC over IP. IEEE Transactions on Circuits and Systems for Video Technology 13(7), 645–656 (2003).
T Stockhammer, MM Hannuksela, S Wenger, H.26L/JVT coding network abstraction layer and IP-based transport. Proceedings of IEEE International Conference on Image Processing (ICIP '02), September 2002, Rochester, NY, USA 2, 485–488
T Stockhammer, MM Hannuksela, T Wiegand, H.264/AVC in wireless environments. IEEE Transactions on Circuits and Systems for Video Technology 13(7), 657–673 (2003).
Y Shan, A Zakhor, Cross layer techniques for adaptive video streaming over wireless networks. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '02), August 2002, Lausanne, Switzerland 1, 277–280
P Bucciol, G Davini, E Masala, E Filippi, JC De Martin, Cross-layer perceptual ARQ for H.264 video streaming over 802.11 wireless networks. Proceedings of the 47th Annual IEEE Global Telecommunications Conference (GLOBECOM '04), November-December 2004, Dallas, Tex, USA 5, 3027–3031
YP Fallah, H Alnuweiri, Hybrid polling and contention access scheduling in IEEE 802.11e WLANs. Journal of Parallel and Distributed Computing 67(2), 242–256 (2007).
YP Fallah, P Nasiopoulos, H Alnuweiri, Scheduled and contention access transmission of partitioned H.264 video over WLANs. Proceedings of the 50th Annual IEEE Global Telecommunications Conference (GLOBECOM '07), November 2007, Washington, DC, USA, 2134–2139
S Kumar, L Xu, MK Mandal, S Panchanathan, Error resiliency schemes in H.264/AVC standard. Journal of Visual Communication and Image Representation 17(2), 425–450 (2006).
P Ansel, Q Ni, T Turletti, An efficient scheduling scheme for IEEE 802.11e. Proceedings of IEEE Workshop on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt '04), March 2004, Cambridge, UK
A Grilo, M Macedo, M Nunes, A scheduling algorithm for QoS support in IEEE802.11E networks. IEEE Wireless Communications 10(3), 36–43 (2003).
YP Fallah, D Koskinen, A Shahabi, F Karim, P Nasiopoulos, A cross layer optimization mechanism to improve H.264 video transmission over WLANs. Proceedings of the 4th IEEE Consumer Communications and Networking Conference (CCNC '07), January 2007, Las Vegas, Nev, USA, 875–879