
features. This provides a number of advantages over existing systems, including optimal
use of bandwidth, integration of voice with data services and tandem free coding of
calls made from mobile to mobile user. However, unlike its ATM counterpart the IP
protocol suite was not originally designed to provide QoS. To ensure QoS a number of
QoS enhancements for use with IP have already been developed such as differentiated
services (DiffServ), resource reservation protocol (RSVP), RTP and multi-protocol label
switching (MPLS).



8.3.1 VoIP call control
Within a conventional telephony network, SS7 protocols are used to handle call control
by establishing a circuit for the data to follow. However, within an IP domain there is
no such equivalent, since the IP address contained within the data packet is all that is
required to route it to the correct destination. Two different general approaches have been
proposed for handling call control within IP:

• Use the SS7 application protocols (and any other circuit switched call control
  protocols such as Q.931) unmodified, but transport them over the IP network transparently.
  For this approach a new transport protocol called stream control transmission
  protocol (SCTP), instead of transmission control protocol (TCP) or user datagram protocol
  (UDP), has been developed. This type of configuration is generally termed sigtran and
  has the advantage of being able to use the SS7 software already developed for the
  mobile network components with only slight modification.
514 IP TELEPHONY FOR UMTS RELEASE 4


• Replace the call control protocol completely with a new protocol optimized for the IP
environment. There have been a number of different proposals for call control within
IP networks, including H.323, session initiation protocol (SIP) and BICC. These new
protocols have been optimized to work within the IP network.

Since some networks may be using SS7, some sigtran and some H.323, SIP or BICC,
there is a need to be able to interwork between these protocols. Messages must be
translated at the interfaces between the networks. For the R4 network shown in Figure 8.2,
the interworking function is performed by the MSC servers.
Also, as was discussed in Chapter 2, before voice data can be transmitted on a network
it must be placed in packets. The original analogue signal must be sampled (or measured),
converted to a digital form (quantized), coded, optionally compressed and encrypted and
then put into packets.




8.4 REAL-TIME TRANSPORT PROTOCOL (RTP)

As mentioned, the data transport UP at the Nb interface is carried over either AAL2 (for
ATM networks) or RTP/UDP (for IP networks). RTP (RFC 1889) provides end-to-end
delivery of real-time audio and video over an IP network, using UDP as its transport. RTP
provides payload-type identification, sequence numbering and delivery monitoring. Each
payload that is sent using RTP is timestamped to ensure that it can be delivered to the
CODEC at the correct rate. RTP is specified for use in R4 if IP transport is being used
(TS 29.414), to carry packets across the CS domain between the MGWs. It is also defined
as the transport option for all real-time media for R5 in the IP multimedia subsystem
(IMS) domain.



8.4.1 RTP at the Nb interface
Figure 8.3 shows the UP protocol stack for a voice call between two UEs. The UP protocol
operating in support mode provides framing and timing functionality over both the Iu and
Nb interfaces. Because UP already supplies this timing within the R4 network, the
delivery timing functionality of RTP at the Nb interface is redundant. For the Nb interface,
3GPP states that the RTP timestamp can be ignored (TS 29.414) and support for the real-time
control protocol (RTCP) (see Section 8.4.5) is optional at the MGW. RTP in this case is
only being used to encapsulate UP messages for IP transport.
For R5 networks, RTP is used to control the timing of media end-to-end, for example
multimedia being transmitted between a pair of UEs. In this case the UP protocol will
be operating in transparent mode, passing the data straight through, without the timing
functionality.
The RTP header (Figure 8.4) starts with a version number, which is currently 2. The
PTYPE defines the payload (for example G.711, G.723 audio or H.263 video).


[Figure 8.3 User plane for UMTS Release 4. Protocol stacks along the path UE, RNC, MGW,
MGW, RNC, UE: AMR runs end to end; RLC/MAC/L1 is used between UE and RNC (radio access
network); UP is carried over AAL2/ATM between RNC and MGW and over RTP/UDP/IP between
the two MGWs (core network). The UP protocol provides the timing control.]

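 Bit 0                                                        31
+-----+---+---+----+---+-------+-------------------------------+
| VER | P | X | CC | M | PTYPE |        Sequence Number        |
+-----+---+---+----+---+-------+-------------------------------+
|                          Timestamp                           |
+--------------------------------------------------------------+
|              Synchronisation source identifier               |
+--------------------------------------------------------------+
|                   Contributing source ID 1                   |
+--------------------------------------------------------------+
|                   Contributing source ID 2                   |
+--------------------------------------------------------------+
|                             ...                              |
+--------------------------------------------------------------+
|                   Contributing source ID N                   |
+--------------------------------------------------------------+

Field widths in bits: VER 2, P 1, X 1, CC 4, M 1, PTYPE 7, Sequence Number 16;
every other field is 32 bits wide.

Figure 8.4 RTP header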
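
To make the layout of Figure 8.4 concrete, the following is a minimal sketch of how a receiver might unpack the header fields. The names RtpHeader and parse_rtp_header are illustrative only and do not come from any standard library or from the 3GPP specifications; the sketch ignores header extensions and payload handling.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Illustrative container for the fields of Figure 8.4 (hypothetical name)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Unpack the 12-byte fixed header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least the 12-byte fixed header")
    # Network byte order: two single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronisation source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CC: number of contributing sources
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet too short for the advertised CSRC list")
    csrc = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,                # VER: top two bits
        padding=bool(b0 & 0x20),        # P bit
        extension=bool(b0 & 0x10),      # X bit
        csrc_count=cc,
        marker=bool(b1 & 0x80),         # M bit
        payload_type=b1 & 0x7F,         # PTYPE: low seven bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
    )
```

A caller would normally check that the version field equals 2 before trusting the remaining fields.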

Each payload type defines a profile, which not only defines the media type and CODEC
but also specifies the granularity of the RTP timestamp and possible use of the M (marker)
bit as described below. Table 8.1 shows how the payload types have been assigned.
Payload types fall into two classes: static and dynamic. The static assignments are defined
in a number of RFCs, in particular RFC 1890. Table 8.2 describes some of the more common
static payload types. The RTP clock frequency given in the table defines how the RTP
timestamp is to be incremented. So for an 8 kHz clock the timestamp will be incremented
every 125 microseconds. Dynamic payloads are assigned when the session is created using
some mechanism not covered by RTP (for example session description protocol (SDP)
and SIP).
The comfort noise payload is for CODECs which do not provide a special format for
comfort noise. RTP comfort noise packets contain a 7-bit value, specifying the required
level of noise from 0 to −127 dBov, where $\mathrm{dBov} = 10\log_{10}(S/S_{ov})$, $S$ is the
signal strength and $S_{ov}$ is the signal strength that will overload the system.
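
As a small worked illustration of the dBov definition, the sketch below converts the 7-bit comfort noise value into a level in dBov and then into the ratio S/Sov. The function names are hypothetical, and the sketch assumes that the 7-bit value occupies the low seven bits of the byte that carries it.

```python
def comfort_noise_level_dbov(noise_byte: int) -> int:
    """Map the 7-bit value N (0..127) to a level of -N dBov."""
    return -(noise_byte & 0x7F)   # assumption: value sits in the low 7 bits


def dbov_to_power_ratio(level_dbov: float) -> float:
    """Invert dBov = 10*log10(S/Sov) to recover the ratio S/Sov."""
    return 10 ** (level_dbov / 10)
```

For example, a value of 10 corresponds to −10 dBov, which is a signal at one tenth of the overload level.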
The RTP profile for adaptive multirate (AMR) is defined in RFC 3267. The RTP clock
frequencies are assigned as 8 kHz and 16 kHz for AMR and AMR wideband, respectively.
The payload type for AMR is assigned dynamically.

8.4.1.1 RTP payload type in release 4
For R4 networks the RTP payload is UP support mode packets; this UP/RTP configuration
is referred to as the Iu framing protocol. In this case the payload type is dynamic and its
value is negotiated when the bearer is established. The RTP clock frequency is 16 kHz, and
the MIME type is set to VND.3GPP.IUFP.


Table 8.1 RTP payload type assignment
Values      Assignment
0–34        Static
35–71       Unassigned
72–76       Reserved
77–95       Unassigned
96–127      Dynamic


Table 8.2 RTP static payloads
Payload type   Name            Type    RTP clock frequency (Hz)   Tick time (s)
0              PCMU (G.711)    Voice    8 000                     125 × 10^-6
3              GSM             Voice    8 000                     125 × 10^-6
4              G.723.1         Voice    8 000                     125 × 10^-6
10             L16             Audio   44 100                     22.7 × 10^-6
13             Comfort noise   Audio    8 000                     125 × 10^-6
31             H261            Video   90 000                     11.11 × 10^-6

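
The ranges in Table 8.1 and the tick times in Table 8.2 can be checked with a few lines of code. This is only a sketch: classify_payload_type and tick_time_seconds are illustrative names, and the range table is copied directly from Table 8.1.

```python
# Payload type ranges as listed in Table 8.1: (first, last, assignment).
PAYLOAD_TYPE_RANGES = [
    (0, 34, "static"),
    (35, 71, "unassigned"),
    (72, 76, "reserved"),
    (77, 95, "unassigned"),
    (96, 127, "dynamic"),
]


def classify_payload_type(pt: int) -> str:
    """Return the Table 8.1 assignment for a 7-bit payload type."""
    for first, last, assignment in PAYLOAD_TYPE_RANGES:
        if first <= pt <= last:
            return assignment
    raise ValueError("payload type must be in the range 0..127")


def tick_time_seconds(clock_hz: float) -> float:
    """One RTP timestamp tick is the reciprocal of the RTP clock frequency."""
    return 1.0 / clock_hz
```

With an 8000 Hz clock the tick time is 125 µs, and with the 90 000 Hz video clock it is roughly 11.11 µs, in agreement with Table 8.2.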
The sequence number, chosen at random at the start of each session, is used to keep
the packets in order (remember RTP packets are carried using UDP with no guarantees
that packets are delivered in the correct order).
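
Because the sequence number is a 16-bit field starting from a random value, it wraps from 65535 back to 0 during a session, so a receiver cannot simply sort on the raw value. The sketch below shows one possible way to handle this with modulo-65536 arithmetic; the function names and the (sequence number, payload) tuple representation are assumptions made for illustration.

```python
from typing import List, Tuple


def seq_newer(a: int, b: int) -> bool:
    """True if 16-bit sequence number a comes after b, allowing for wraparound."""
    return a != b and ((a - b) & 0xFFFF) < 0x8000


def reorder(packets: List[Tuple[int, bytes]], base_seq: int) -> List[Tuple[int, bytes]]:
    """Sort (sequence number, payload) pairs by their distance from base_seq.

    base_seq is the first sequence number expected in this batch; the batch
    is assumed to span far fewer than 65536 packets.
    """
    return sorted(packets, key=lambda p: (p[0] - base_seq) & 0xFFFF)
```

A real receiver would combine this with a jitter buffer and a policy for packets that never arrive.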
The padding (P) bit indicates that the data contains padding. Padding is sometimes needed
when the payload is encrypted, for example by a block cipher that requires a fixed block
size. For padded RTP packets the last byte in the packet gives the number of padding
bytes added, counting itself.
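
Removing the padding at the receiver is a matter of reading the last byte and trimming that many bytes. The following sketch assumes the pad count includes the count byte itself and that the packet has no CSRC entries or header extension when checking the minimum length; strip_padding is an illustrative name.

```python
def strip_padding(packet: bytes) -> bytes:
    """Return the packet without trailing padding (the P bit is left as is)."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least the 12-byte fixed header")
    if not packet[0] & 0x20:          # P bit clear: nothing to remove
        return packet
    pad = packet[-1]                  # last byte holds the padding count
    if pad == 0 or pad > len(packet) - 12:
        raise ValueError("invalid padding count")
    return packet[:-pad]
```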
The X bit is set to 1 to indicate that the RTP header is followed by an extension header.
Header extensions are application specific. The extension begins with a 16-bit field whose
meaning depends on the application (the extension type field), followed by a 16-bit length
field giving the length of the extension in 32-bit words. RTP does not define any extensions
itself.
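
A receiver that does not understand a particular extension can still skip over it using the length field. The sketch below locates the extension after the fixed header and any contributing source entries; it assumes the length field counts only the words that follow the 4-byte extension header, and parse_extension is a hypothetical name.

```python
import struct
from typing import Optional, Tuple


def parse_extension(packet: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (extension type, extension data), or None if the X bit is clear."""
    b0 = packet[0]
    if not b0 & 0x10:                       # X bit not set
        return None
    offset = 12 + 4 * (b0 & 0x0F)           # skip fixed header and CSRC list
    ext_type, words = struct.unpack("!HH", packet[offset:offset + 4])
    data = packet[offset + 4:offset + 4 + 4 * words]
    if len(data) != 4 * words:
        raise ValueError("packet too short for the advertised extension length")
    return ext_type, data
```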
The use of the marker (M) bit is dependent on the payload type. It can be used, for
example, by a video stream to mark the beginning of each frame, when each frame is
carried in multiple RTP packets.
The TIMESTAMP field tells the receiver when the media sample was taken. Two packets
can contain the same timestamp if the data they contain was sampled at the same time.
The meaning of the TIMESTAMP field is media dependent (i.e. dependent on the PTYPE
field); for example, for G.711 and G.723.1 the RTP clock frequency is 8 kHz, therefore
the timestamp advances by one tick for every 125 µs of media. For video traffic, on the
other hand, the sample time may require finer granularity (see Table 8.2). The initial
timestamp value is chosen to be a random number at the start of the session and each
subsequent value is calculated relative to this initial value. To translate from packet
timestamp to relative time from the beginning of the session the following formula is used:

$T_r = (\text{packet timestamp} - \text{initial timestamp}) \times \text{tick time}$
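
The relative-time formula can be written as a short function. This sketch adds one detail that the formula leaves implicit: because the timestamp is a 32-bit field with a random starting value, the subtraction is taken modulo 2^32 so that a wrap during the session does not produce a negative time. The name relative_time is illustrative.

```python
def relative_time(packet_ts: int, initial_ts: int, clock_hz: int) -> float:
    """Seconds elapsed since the start of the session for a given RTP timestamp.

    Implements T_r = (packet timestamp - initial timestamp) x tick time,
    where tick time = 1 / clock_hz.
    """
    ticks = (packet_ts - initial_ts) & 0xFFFFFFFF   # tolerate 32-bit wraparound
    return ticks / clock_hz
```

For an 8 kHz clock, a packet whose timestamp is 160 ticks after the initial value was sampled 20 ms into the session.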


8.4.2 Source identifiers
Each source of media in a session is defined by a unique 32-bit number called a source
identifier. The identifier should be chosen randomly to reduce the chance of two stations
picking the same identifier. Even though the chance of a collision (i.e. two stations using
the same source ID) is small, each station is expected to provide some mechanism to
detect its occurrence and remedy the situation. In particular, a station that detects another
station transmitting with the same source identifier is expected to pick a new identifier
randomly. It will then send an RTCP BYE message indicating that it is terminating the
first media stream, and then use the new identifier in future transmissions. In the RTP
header, two types of source identifier can be present, the synchronization source identifier
and the contributing source identifier.
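
The collision procedure described above can be sketched as follows. The function names are hypothetical, and sending the RTCP BYE message is left to the caller: the sketch only reports which old identifier the BYE should refer to.

```python
import secrets
from typing import Optional, Set, Tuple


def new_source_id(in_use: Set[int]) -> int:
    """Pick a random 32-bit source identifier not already seen in the session."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in in_use:
            return candidate


def resolve_collision(own_id: int, seen_id: int,
                      in_use: Set[int]) -> Tuple[int, Optional[int]]:
    """Return (identifier to use from now on, old identifier needing a BYE).

    The second element is None when the identifier seen on the network does
    not collide with our own.
    """
    if seen_id != own_id:
        return own_id, None
    return new_source_id(in_use | {own_id}), own_id
```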


8.4.2.1 Synchronization source identifier
This is used to indicate the source host which sent the media packet. This could be the
machine with the microphone which is sampling and coding the audio. Alternatively, if
the audio is being mixed subsequently by a conferencing bridge, the identi¬er will be
used to identify the bridge as the synchronization source.


8.4.2.2 Contributing source identifier and media mixing
RTP supports the mixing together of media streams, which is of particular use to
applications such as voice and video conferencing. The CONTRIBUTING SOURCE ID fields
define the identities of the sources before mixing takes place. This list is variable
length and can hold from 0 to 15 entries. The CC field contains a count of the
number of contributing sources. An example of this is shown in Figure 8.5: four voice
sources that have been mixed together are multicast to all the participants. Multicasting
is desirable since it reduces the amount of traffic to be sent from the mixer by a factor
of three. Use of the mixer also allows the participants to deal only with the well-known
mixer address and not have to worry about directly communicating with other individual
hosts.

[Figure 8.5 Conferencing with RTP. Four G.711 sources with IDs 204, 205, 209 and 211
unicast their packets to an RTP mixer (ID = 500). The mixer multicasts the mixed G.711
data back to the participants in packets with CC = 4, synchronisation source identifier
500 and contributing source IDs 204, 205, 209 and 211.]
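
The header that the mixer in Figure 8.5 places on each mixed packet can be sketched as below. The function name build_mixer_header is hypothetical, and the sketch sets the P, X and M bits to zero; it only shows how the mixer's own identifier becomes the synchronization source while the original senders are listed as contributing sources.

```python
import struct
from typing import Sequence


def build_mixer_header(mixer_id: int, contributors: Sequence[int],
                       payload_type: int, seq: int, timestamp: int) -> bytes:
    """Build an RTP header whose CSRC list names the sources that were mixed."""
    if len(contributors) > 15:
        raise ValueError("the 4-bit CC field allows at most 15 contributing sources")
    b0 = (2 << 6) | len(contributors)       # version 2, P = 0, X = 0, CC = count
    b1 = payload_type & 0x7F                # M = 0
    fixed = struct.pack("!BBHII", b0, b1, seq & 0xFFFF,
                        timestamp & 0xFFFFFFFF, mixer_id)
    csrc_list = struct.pack("!%dI" % len(contributors), *contributors)
    return fixed + csrc_list
```

Using the values from Figure 8.5, the call build_mixer_header(500, [204, 205, 209, 211], 0, 1, 160) yields a 28-byte header: 12 bytes of fixed header followed by four 4-byte contributing source entries.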


8.4.3 Encryption with RTP
RFC 1889 states that RTP packets can be encrypted to provide security. The
recommended technique is cipher block chaining using data encryption standard (DES). The
correct decryption of the packet can be determined by certain header validity checks, for
example checking that the version number is 2 and the payload type is known. There is
no mechanism within RTP itself to allow for the exchange of encryption keys. This will
usually be carried out at the call/session control stage of the call using protocols such as
SIP and SDP.
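
The header validity checks mentioned above might look like the following sketch, applied to a packet after decryption. The decryption step itself is not shown, the function name is hypothetical, and the set of known payload types is whatever was agreed for the session.

```python
from typing import Set


def looks_like_valid_rtp(decrypted: bytes, known_payload_types: Set[int]) -> bool:
    """Plausibility check: version must be 2 and the payload type must be known."""
    if len(decrypted) < 12:                 # shorter than the fixed header
        return False
    version = decrypted[0] >> 6
    payload_type = decrypted[1] & 0x7F
    return version == 2 and payload_type in known_payload_types
```

A packet decrypted with the wrong key will usually fail these checks, although a pass does not prove that the key was correct.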


8.4.4 Redundancy with RTP
The addition of redundancy with RTP allows some of the RTP packets to be lost in transit
without loss of data. RTP redundancy format consists of a primary payload plus multiple
secondary payloads. It is particularly useful for transmissions such as wireless where error
rates may be high.



8.4.5 Real-time control protocol (RTCP)
