Codec Negotiation Technologies for 3G

Release Date: 2007-03-26 Author: Chen Yi, Gao Jie, Yu Li

    The codec negotiation process aims to avoid the traditional double encoding/decoding in a Mobile Station (MS)-to-MS call in a Global System for Mobile Communications (GSM) network, an MS-to-User Equipment (UE) call in a GSM/3G system, and a UE-to-UE call in a 3G system. In a conventional UE-to-UE call configuration, the speech signal is first encoded in the originating UE, sent over the air interface, converted to 64 kb/s A-law or μ-law G.711 Pulse Code Modulation (PCM) in the local transcoder, and carried over the fixed network. The distant transcoder then encodes the PCM signal again and sends the encoded signal to the terminating UE over the air interface. Finally, the terminating UE decodes it to recover the speech signal. The whole process is shown in Figure 1. This process results in the Tandem Operation (TO) of two pairs of codecs. With TO, the speech signal is encoded and decoded twice, which degrades speech quality, especially when the speech codecs operate at low rates.


    As shown in Figure 2, if the originating and terminating UEs use the same codec, the coded speech signal can be carried transparently from the originating UE to the terminating UE, without activating the transcoders in either the local or the distant network.


    Codec negotiation has the following strengths:

  • TO is avoided, which improves the speech quality.
  • Compressed speech can be carried after codec negotiation, which saves link resources.
  • The transcoder no longer has to perform code conversion, which saves processing capacity.
  • End-to-end transmission delay can be reduced.

    There are three leading codec negotiation technologies. The first is Transcoder Free Operation (TrFO), which performs codec negotiation through Out-of-band Transcoder Control (OoBTC) signalling. The second is Tandem Free Operation (TFO), an in-band codec negotiation technology. The third applies when a 3G core network interworks with an NGN core network: the gateway office of the NGN network uses the network-quality deciding technology to select the preferred codec mode.

1 The Transcoder Free Operation Technology
TrFO uses OoBTC signalling to negotiate the speech codec for a call. By avoiding transcoding, TrFO improves speech quality and saves codec resources and network bandwidth in the packet core network (because speech is carried across the core network at the adaptive multi-rate encoded rate instead of 64 kb/s). Moreover, codec negotiation is completed before bearer establishment, which ensures that the call uses suitable bearer resources.

1.1 Description of TrFO
Reference [1] specifies that TrFO is selected preferentially if two or more call control nodes have negotiated a common codec for transport. The detailed process is as follows.

  • The originating call control node sends the list of codecs supported by its gateway. All codec types in the list are sorted by priority.
  • The transit call control node analyzes the codec list, deletes the codec types it does not support, and forwards the list. The priority order of the remaining codecs is unchanged.
  • The terminating call control node analyzes the codec list, deletes the codec types it does not support, and selects the remaining codec with the highest priority.
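
    The three steps above can be sketched in a few lines of Python. This is only an illustration of the list filtering described in the text, not the OoBTC procedure of [1]; the codec names and the capabilities of each node are hypothetical.

```python
from typing import List, Optional, Set


def filter_codecs(offered: List[str], supported: Set[str]) -> List[str]:
    """Drop unsupported codecs while keeping the offered priority order."""
    return [codec for codec in offered if codec in supported]


def negotiate(offered: List[str],
              transit_nodes: List[Set[str]],
              terminating: Set[str]) -> Optional[str]:
    """Return the codec chosen by the terminating node, or None if none survives."""
    codec_list = list(offered)
    for supported in transit_nodes:
        # Each transit node only deletes entries; it never reorders the list.
        codec_list = filter_codecs(codec_list, supported)
    codec_list = filter_codecs(codec_list, terminating)
    # The list is still in the originating node's priority order, so the
    # first surviving entry is the highest-priority common codec.
    return codec_list[0] if codec_list else None


# Hypothetical codec names and node capabilities, for illustration only.
offered = ["UMTS_AMR2", "UMTS_AMR", "G711"]
transit = [{"UMTS_AMR", "G711"}]
terminating = {"G711", "UMTS_AMR"}
selected = negotiate(offered, transit, terminating)
print(selected)
```

    If no codec survives the filtering, the sketch returns None, which corresponds to a call that cannot be set up as a TrFO call.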

    Figure 3 shows a TrFO model based on a Universal Mobile Telecommunications System (UMTS)-to-UMTS call in an R4 system.

1.2 Codec Negotiation in BICC Call
Figure 4 describes a simple signaling process for a Bearer Independent Call Control (BICC) call. The figure shows that codec negotiation is done before bearer establishment, so the most suitable bearer resource can be selected for the call. According to [2], the Originating Mobile Switching Center (O-MSC) starts codec negotiation when it sends the Initial Address Message (IAM), which carries the list of codecs it supports to the transit node. The transit node drops the codec types it does not support and forwards the list. The Terminating Mobile Switching Center (T-MSC) uses the Application Transport Message (APM) to return the selected codec type and the final codec list to the O-MSC.

1.3 Controls on Media Gateway
TrFO uses a compressed speech stream over the entire end-to-end path, for example from Radio Network Controller (RNC) to RNC, or from an RNC to other compressed-speech terminals. References [3] and [4] describe how compressed speech frames are transmitted over the Nb and Iu interfaces in the core network. The user plane has to work in support mode in order to support codec negotiation.

    For a TrFO call, the RNC and the Media Gateway (MGW) must support at least one user-plane version with TrFO capability; that is, both the Iu and Nb interfaces must support Version 2 of the user plane. If the RNC supports only Version 1 of the user plane, which has no TrFO capability, the mobile switching center server must insert a Transcoder (TC) between the RNC and the MGW. It is not enough, however, for the RNC and the MGW to be capable of Version 2. The mobile switching center server must also explicitly indicate the use of Version 2 when it requests a radio access bearer from the RNC (RAB Assignment) and a termination from the MGW (ADD Request). This is because, during negotiation, the user-plane initialization frame carries the version information that the mobile switching center server indicated in the RAB Assignment and ADD Request, and uses it to negotiate with the other MGW/RNC to select a commonly supported version.

    The user plane is always initialized in the forward direction, regardless of the direction of bearer establishment. The mobile switching center server is informed by a Notify message that the user-plane bearer is ready only after the bearer has been set up and the user plane has completed initialization. Only after the mobile switching center server has received both the Notify message and the COT message sent in the forward direction is its own COT message sent backward.
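
    The version rule in this subsection can be expressed as a small check. The sketch below is a simplification under stated assumptions: the function names are hypothetical, versions are modelled as plain integers, and choosing the highest commonly supported version is an assumption rather than something the text specifies.

```python
from typing import Iterable, Optional

TRFO_CAPABLE_VERSION = 2  # per the text, Version 2 of the user plane supports TrFO


def common_up_version(rnc_versions: Iterable[int],
                      mgw_versions: Iterable[int],
                      indicated_versions: Iterable[int]) -> Optional[int]:
    """Pick a user-plane version supported by the RNC and the MGW and also
    indicated by the MSC server in RAB Assignment / ADD Request."""
    candidates = set(rnc_versions) & set(mgw_versions) & set(indicated_versions)
    # Assumption: prefer the highest common version.
    return max(candidates) if candidates else None


def needs_transcoder(rnc_versions: Iterable[int],
                     mgw_versions: Iterable[int],
                     indicated_versions: Iterable[int]) -> bool:
    """True when no TrFO-capable version is common, so a TC must be inserted."""
    version = common_up_version(rnc_versions, mgw_versions, indicated_versions)
    return version is None or version < TRFO_CAPABLE_VERSION


# The RNC supports only Version 1: a TC is required between the RNC and the MGW.
print(needs_transcoder([1], [1, 2], [1, 2]))
# Both sides support Version 2, but the MSC server did not indicate it.
print(needs_transcoder([1, 2], [1, 2], [1]))
# Version 2 is supported on both sides and indicated by the MSC server.
print(needs_transcoder([1, 2], [1, 2], [1, 2]))
```

    The second call illustrates why physical support alone is not sufficient: without the explicit indication from the mobile switching center server, Version 2 never becomes a candidate.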

1.4 User Data Flow After Implementation of TrFO
In Figure 5, the blue line shows the user data flow of a TrFO call within one MGW. User data arriving on the Iu interface through the interface board are adapted by ATM Adaptation Layer 2 (AAL2) for an ATM bearer, or processed by Real-time Transport Protocol/Real-time Transport Control Protocol (RTP/RTCP) for an IP bearer. They are then sent, according to the transit list, to a particular Iu Interface User Plane (IuUP) instance for uplink processing, and then to the IuUP instance corresponding to the distant user for downlink processing. Finally, the processed user data are handled by the Iu interface board and sent to the Iu interface. The entire flow avoids Adaptive Multi-Rate (AMR) encoding/decoding and Time Division Multiplexing (TDM) switching.


    The yellow line in Figure 5 shows the user data flow of a TrFO call between MGWs. On one MGW, the user data arriving on the Iu interface through the interface board are adapted by AAL2 (for an ATM bearer) or processed by RTP/RTCP (for an IP bearer) and sent, according to the transit list, to a particular IuUP instance for uplink processing. The processed user data are then sent to the Nb Interface User Plane (NbUP) instance corresponding to the Nb interface, and finally handled by the Nb interface board and sent to the Nb interface. On the other MGW, user data arriving on the Nb interface are processed by AAL2 or RTP/RTCP and sent to the corresponding NbUP instance for uplink processing. The processed user data are then sent to the corresponding IuUP instance for downlink processing, and finally handled by the Iu interface board and sent to the Iu interface. The entire flow avoids AMR encoding/decoding and TDM switching.

    Figure 5 clearly shows the strength of the TrFO call. Because the RNCs at both ends use the same codec type and rate, the user data packets only need to be transported transparently across the core network, which avoids encoding/decoding.

2 Tandem Free Operation Technology
TrFO is the mechanism that is selected preferentially in a call. It tries to set up a UE-to-UE connection without a TC. If a TrFO call is set up successfully, it not only avoids using a TC, but also makes the best use of bandwidth. However, TrFO cannot be applied in every case. When a TDM bearer is involved or the call is with a 2G user, a TC is required. In this case, TFO serves as the fallback technology for TrFO.

    TFO is an in-band codec negotiation protocol. It negotiates the codec between the two speech transcoders after a call has been set up. If negotiation succeeds, the decoder on the sending side and the encoder on the receiving side are bypassed, and the speech frames used on the air interface are carried directly within the G.711 stream to the receiver. The user data flow no longer needs to be decompressed and recompressed by the speech codecs, so the speech quality is improved. TFO takes a number of bits from the standard 64 kb/s link to form a sub-channel for the transmission of TFO signalling and speech frames.

2.1 Basic Principle of TFO
Before a TFO call is set up, 64 kb/s PCM speech is transported between the TCs. The least significant bit of every 16th speech sample (equivalent to a 0.5 kb/s channel) is used to carry the control messages exchanged by the TCs. The TCs exchange TFO messages to carry out TFO negotiation, and automatically activate TFO once they find matching codec types and configurations at the two ends. With TFO active, the TC uses the least significant bit of every speech sample (equivalent to an 8 kb/s channel) or the two least significant bits (equivalent to a 16 kb/s channel) to transport the TFO frames that carry compressed speech. The upper 7 or 6 bits of each (uncompressed) PCM speech sample are kept unchanged and sent to the distant end, in order to avoid the impact on speech quality and delay that conversion between TFO and PCM frames would cause. After call setup, the TC unit establishes TFO by processing the TFO protocol. Therefore, the TC unit cannot be bypassed, which is the main difference between TFO and TrFO.
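
    The sub-channel rates quoted above follow directly from the 8000 samples per second of G.711, and the bit stealing itself amounts to overwriting the lowest bits of each PCM octet. The following Python sketch checks the arithmetic and shows the principle. It is a simplified illustration: the function names are hypothetical, and the real TFO frame structure and synchronization patterns defined in [5] are not modelled.

```python
from typing import List

SAMPLE_RATE_HZ = 8000  # G.711 PCM: 8000 samples per second, 8 bits each


def subchannel_rate_bps(bits_per_sample: int, every_n_samples: int = 1) -> float:
    """Rate of a sub-channel that takes `bits_per_sample` bits from every n-th sample."""
    return SAMPLE_RATE_HZ * bits_per_sample / every_n_samples


def embed_lsbs(pcm: bytes, payload_bits: List[int], bits_per_sample: int = 1) -> bytes:
    """Overwrite the lowest bits of each PCM octet with payload bits.

    The upper (8 - bits_per_sample) bits of every sample are left untouched.
    """
    mask = (1 << bits_per_sample) - 1
    out = bytearray(pcm)
    pos = 0
    for i in range(len(out)):
        if pos + bits_per_sample > len(payload_bits):
            break  # payload exhausted; remaining samples stay pure PCM
        value = 0
        for bit in payload_bits[pos:pos + bits_per_sample]:
            value = (value << 1) | (bit & 1)
        out[i] = (out[i] & ~mask & 0xFF) | value
        pos += bits_per_sample
    return bytes(out)


print(subchannel_rate_bps(1, 16))  # signalling channel before TFO is established
print(subchannel_rate_bps(1))      # one bit taken from every sample
print(subchannel_rate_bps(2))      # two bits taken from every sample

pcm = bytes([0b10101010, 0b11111111, 0b00000000])
stuffed = embed_lsbs(pcm, [1, 0, 1], bits_per_sample=1)
print([format(octet, "08b") for octet in stuffed])
```

    The three printed rates correspond to the 0.5 kb/s, 8 kb/s and 16 kb/s channels mentioned in the text, and the last line shows that only the lowest bit of each sample changes.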

2.2 Implementation Steps of TFO
Reference [5] describes the implementation steps of TFO.

    (1) Pre-synchronization of In-path Equipment
    When the local TC has received or sent speech frames and TFO has been activated, the TC sends the message TFO_FILL to pre-synchronize the In-path Equipment (IPE), so that the IPE passes TFO in-band signalling transparently instead of treating it as a speech signal and amplifying it. The distant end pre-synchronizes its IPE at the same time.

    (2) TFO Negotiation
    TFO negotiation starts when the distant end supports TFO and the IPE has been pre-synchronized so that the path is transparent. The TCs at the two ends simultaneously send the message TFO_REQ, which carries their Active Codec Lists (ACLs) and identifiers to each other. If the ACL of one end intersects with that of the other end, the TC sends back the message TFO_ACK (carrying the common codec list and the selected codec, and notifying the local TC and the mobile switching center); otherwise, the codec mismatch procedure is initiated.

    (3) Scheme for Codec Mismatching
    This scheme is initiated when the ACLs at the two ends do not intersect but the Supported Codec Lists (SCLs) do. The common codec list and the selected codec are determined by exchanging the messages TFO_REQ_L and TFO_ACK_L, and the mobile switching center server is then notified. The servers at the two ends modify their codecs according to the determined codec list and selected codec, so that a single codec is used along the entire path. This scheme requires many signaling message exchanges, so it is not supported in most cases, and TFO is usually abandoned when the ACLs have no codec in common.

    (4) Establishment of TFO
    After negotiation, the TCs send the message TFO_TRANS to each other, and the compressed speech stream is then transported between them. The bandwidth of the link between the TCs remains unchanged.

    (5) Codec Optimization
    After a TFO call is set up, a handover or a supplementary service may trigger codec optimization. If they support codec modification, the TCs exchange the messages TFO_REQ_L and TFO_ACK_L to select the common codec list and the codec. The whole process is very similar to that for codec mismatching. Codec optimization does not bring an obvious improvement in speech quality, but it requires many signalling message exchanges, which increases the signalling processing load. Therefore, this function is not widely used.

    (6) Termination of TFO
    TFO terminates whenever one of the TCs loses its TFO capability, the call is released, the service changes (from speech to data), a handover occurs and the new TC does not support TFO, or, after a handover, the codecs supported by the new office have no intersection with those of the distant office. Once TFO terminates, the TC stops sending TFO frames, returns to normal mode, and notifies the IPE with the message TFO_NORMAL.
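
    The decision made in steps (2) and (3) can be summarized as a small function over the two kinds of codec list. The sketch below is a deliberately simplified model: the names Outcome, common and tfo_decide are hypothetical, the codec names are examples, and picking the first common entry in the local list's order is an assumption that stands in for the selection rules of [5].

```python
from enum import Enum
from typing import List, Optional, Tuple


class Outcome(Enum):
    TFO_ESTABLISHED = "established"    # the ACLs share at least one codec
    MISMATCH_RESOLUTION = "mismatch"   # only the SCLs intersect
    TFO_ABANDONED = "abandoned"        # no common codec at all


def common(local: List[str], distant: List[str]) -> List[str]:
    """Codecs present in both lists, in the local list's order."""
    return [codec for codec in local if codec in distant]


def tfo_decide(local_acl: List[str], distant_acl: List[str],
               local_scl: List[str], distant_scl: List[str]
               ) -> Tuple[Outcome, Optional[str]]:
    """Return the negotiation outcome and the candidate codec, if any."""
    shared_acl = common(local_acl, distant_acl)
    if shared_acl:
        # Step (2): TFO_REQ / TFO_ACK exchange succeeds.
        return Outcome.TFO_ESTABLISHED, shared_acl[0]
    shared_scl = common(local_scl, distant_scl)
    if shared_scl:
        # Step (3): TFO_REQ_L / TFO_ACK_L exchange, then codec modification.
        return Outcome.MISMATCH_RESOLUTION, shared_scl[0]
    return Outcome.TFO_ABANDONED, None


# Example codec names, for illustration only.
print(tfo_decide(["AMR", "EFR"], ["EFR"], ["AMR", "EFR", "FR"], ["EFR", "FR"]))
print(tfo_decide(["AMR"], ["EFR"], ["AMR", "FR"], ["EFR", "FR"]))
print(tfo_decide(["AMR"], ["EFR"], ["AMR"], ["EFR"]))
```

    As the text notes, many deployments do not support the mismatch procedure; in such a network the second outcome would in practice be treated like the third.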

3 Network-quality Deciding Technology
As network topologies evolve, the 3G core network has to interwork with the NGN core network so that mobile networks and fixed NGN networks can cooperate. However, the 3G core network generally supports the codec types specified by 3GPP (such as AMR coding at its various rates), while NGN supports only the codec types defined by ITU (such as G.711, G.729 and G.723). Therefore, the two networks have few codecs in common, and TrFO and TFO calls cannot be set up between them. Since the delay and the loss of speech quality caused by the TC cannot be avoided in this case, the aim is instead to exploit the strengths of codec negotiation flexibly.

    G.711, the traditional 64 kb/s codec, samples densely and introduces little quantization error. Therefore, when G.711 is used, the speech quality is good but the bandwidth consumption is high. In contrast, G.729 compression introduces noise, so the speech quality is poorer; however, G.729 uses only 16 kb/s of bandwidth, which makes it economical. The codec should therefore be selected flexibly according to the number of calls in progress. G.711 is used to maintain good speech quality when there are few users and the network quality is good. G.729 is used when many users are connected and the network quality declines, so that the network is not excessively loaded and new calls can still be admitted.

    In practice, the gateway office of the NGN measures the network quality and compares the measurement with a threshold configured by the staff at the office. If the measurement does not reach the threshold, the network quality is considered good and G.711 is used; otherwise, the network is considered congested and G.729 is used to save bandwidth and relieve the congestion.
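
    The threshold comparison can be sketched as follows. The text does not say which quantity the gateway measures, so this sketch assumes an impairment indicator that grows as the network degrades (for example, a packet loss ratio); the function name and the threshold value are likewise hypothetical.

```python
def select_codec(impairment: float, threshold: float) -> str:
    """Choose G.711 while the measured impairment stays below the threshold.

    `impairment` is assumed to increase as the network gets worse, so a value
    that does not reach the threshold means the network is in good condition.
    """
    return "G.711" if impairment < threshold else "G.729"


THRESHOLD = 0.02  # hypothetical value set by the operating staff

for measured in (0.005, 0.02, 0.08):
    print(measured, select_codec(measured, THRESHOLD))
```

    A practical implementation would probably smooth the measurement or add hysteresis so that calls do not alternate between codecs when the indicator hovers around the threshold; that refinement is outside what the text describes.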

4 Test Results
Tests on TrFO and VoIP were conducted for verification. Since the strengths of TFO are included in those of TrFO, TFO was not tested. The testing environment is shown in Figure 6.


    The Voice Quality Tester (GL-VQT) is an all-in-one voice tester developed by GL. A single speech source in the tester was selected for the test. The network scrambler IP-WAVE was used to simulate IP transmission characteristics such as packet loss, jitter and delay. Each test item was run 30 to 35 times, and the average score was obtained using Perceptual Evaluation of Speech Quality (PESQ), which is based on a model of human hearing.

    For the TrFO test, the voice tester was connected in series between the UEs at the two ends. For the VoIP test, the tester was connected to the PSTN. The speech source was sent from one end of the GL-VQT, processed by two MGWs (with the IP scrambler placed between them), and sent back to the other end of the GL-VQT for recording. Finally, the GL-VQT compared the received speech with the speech source to obtain the PESQ value.

    The TrFO test was conducted under excellent network conditions. Both speech quality and delay were tested. The results showed that the speech quality of a TrFO call was better than that of a non-TrFO call, and that the delay of a TrFO call was shorter than that of a non-TrFO call. The strength of TrFO is therefore obvious.
    In the VoIP test, G.711 and G.729 were tested separately under three network conditions. Table 1 shows the testing results.


    According to the testing results, the speech quality of G.711 is clearly better than that of G.729, which is the reason for selecting G.711. On the other hand, when the network condition worsens, using G.711 with its high bandwidth accelerates the decline of the network quality. Operators, however, do not want to refuse new calls because of poor network quality. In this situation, G.729 with its low bandwidth has the advantage.

5 Conclusions
This article introduces three codec negotiation technologies used in the 3G core network. The testing results show that codec negotiation improves the speech quality, saves link resources, frees processing capacity in the exchange, and reduces the end-to-end transmission delay. In particular, the network-quality deciding technology can adjust the codec type according to the real-time network quality by assessing the channel quality. However, codec negotiation requires a large amount of signalling. Moreover, codec negotiation and codec modification often take place both during call set-up and during the call, and the codec list is generally long. Therefore, under heavy traffic, codec negotiation occupies considerable bandwidth on the control channels, and this needs further improvement. At present, the demand for a unified codec throughout the communication and the demand for flexible codec negotiation cannot be met simultaneously. How to achieve codec selection that is both unified and flexible is therefore a focus for future research.

References
[1] 3GPP TS 23.153 3rd Generation Partnership Project: Technical Specification Group Core Network and Terminals: Out of band transcoder control: Stage 2 (Release 4)[S]. 2001.
[2] ITU-T Recommendation Q.765.5 Signaling system No.7: Application transport mechanism: Bearer Independent Call Control (BICC)[S]. 2000.
[3] 3GPP TS 29.232 Media Gateway Controller (MGC): Media Gateway (MGW) interface: Stage 3[S]. 2001.
[4] 3GPP TS 25.415 3rd Generation Partnership Project: Technical Specification Group Radio Access Network: UTRAN Iu interface user plane protocols[S]. 2002.
[5] 3GPP TS 28.062 Inband Tandem Free Operation (TFO) of speech codecs: Service description: Stage 3[S]. 2004.

Manuscript received: 2006-03-12