

RTCP (also covered in RFC 1889) provides control services for an RTP session. These
include QoS feedback (e.g. round trip time and jitter), session identification and
synchronization information. RTCP supports five different packet types, as shown in Table 8.3.
The RTCP packet types can be (and are expected to be) aggregated into compound
RTCP packets to save on packet overhead. Since each RTCP packet type contains a fixed

Table 8.3 RTCP packet types
Type                       Description
RR (receiver report)       Conveys reception statistics, sent by stations that only receive
SR (sender report)         Conveys transmission and reception statistics, sent by active
                           participants
SDES (source description)  Sent by a station that is sending media, provides text description
                           of the sender
BYE                        Sent by a station to indicate it is ending a session
APP                        Used for application-specific extensions

header including a length field, demultiplexing the parts at the far end is not a problem.
For compound RTCP packets, each packet should contain at least an SR or RR as well
as a source description. Each of the five types is now described.

8.4.6 RTCP receiver report
Each RR packet contains zero or more report blocks. Each of these corresponds to a
particular source of media and is prefixed with the 32-bit synchronization source identifier.
The format of the RR is shown in Figure 8.6. Each RTCP packet starts with the RTP
version number (equal to 2) and the padding bit P as described previously. The next field,
RC (report count), indicates how many receiver report blocks are contained in this report.
The packet type is 201 to indicate a receiver report. The length field gives the total length
of the packet in 32-bit units, stored as the length minus 1.
The final field in the common header is a 32-bit synchronization identifier indicating the
sender of the report.
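
The common header layout described above is enough to split a compound RTCP packet into its parts. The following is a minimal Python sketch, not a complete RTCP parser. The function name split_compound is hypothetical, and the packet type numbers other than 201 are taken from the RTP specification rather than from the text above.

```python
import struct

# Packet type numbers as assigned in the RTP specification; the surrounding
# text only states 201 (RR) explicitly, the others are an added assumption.
RTCP_TYPE_NAMES = {200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP"}


def split_compound(datagram: bytes):
    """Return a list of (type_name, count, packet_bytes) for a compound RTCP datagram."""
    packets = []
    offset = 0
    while offset < len(datagram):
        if len(datagram) - offset < 4:
            raise ValueError("truncated RTCP header")
        first, packet_type, length_field = struct.unpack_from("!BBH", datagram, offset)
        version = first >> 6           # top two bits of the first byte
        count = first & 0x1F           # low five bits: RC (report count)
        if version != 2:
            raise ValueError(f"unexpected RTP version {version}")
        size = (length_field + 1) * 4  # length is stored in 32-bit words, minus one
        if offset + size > len(datagram):
            raise ValueError("RTCP packet runs past the end of the datagram")
        name = RTCP_TYPE_NAMES.get(packet_type, str(packet_type))
        packets.append((name, count, datagram[offset:offset + size]))
        offset += size
    return packets
```

Because every packet carries its own length, the loop never needs to understand the body of a packet type in order to skip over it.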
The receiver report also carries a number of report blocks. Each of these gives some
indication of the quality of the transmission link, including jitter, packet loss statistics
and timing delay between reports. The jitter is measured in the same units as the RTP
timestamp. It is calculated as a smoothed average of the difference between the spacing
of the packets when they were sent and their spacing when they are received.
The deviation for a packet is defined as:

D_i = (AT_i - AT_{i-1}) - (TS_i - TS_{i-1})

Bits 0 to 31 (one row per 32-bit word):
  V=2 | P | RC=2 | Packet type=201 | Length
  Synchronisation identifier of packet sender
Report block 1:
  Synchronisation identifier for receiver report block 1
  Fraction lost | Cumulative number of packets lost
  Extended highest sequence number received
  Interarrival jitter
  Last sender report
  Delay since last sender report
Report block 2:
  Synchronisation identifier for receiver report block 2
  Fraction lost | Cumulative number of packets lost
  Extended highest sequence number received
  Interarrival jitter
  Last sender report
  Delay since last sender report

Figure 8.6 RTCP receiver report

where AT_i = arrival time of packet i and TS_i = timestamp of packet i. The jitter is
smoothed with a gain of 1/16, which approximates an average over the last 16 values,
using the following formula:

J_i = J_{i-1} + (|D_i| - J_{i-1})/16
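
The two formulas can be applied packet by packet as a running estimate. The sketch below is illustrative only. It assumes that arrival times have already been converted into RTP timestamp units, and the function name interarrival_jitter is hypothetical.

```python
def interarrival_jitter(packets):
    """Return the running jitter estimate after processing all packets.

    packets is an iterable of (arrival_time, rtp_timestamp) pairs, both
    expressed in RTP timestamp units (an assumption of this sketch).
    """
    jitter = 0.0
    previous = None
    for arrival, timestamp in packets:
        if previous is not None:
            prev_arrival, prev_timestamp = previous
            # D_i: change in arrival spacing relative to the sending spacing
            deviation = (arrival - prev_arrival) - (timestamp - prev_timestamp)
            # J_i = J_{i-1} + (|D_i| - J_{i-1}) / 16
            jitter += (abs(deviation) - jitter) / 16.0
        previous = (arrival, timestamp)
    return jitter
```

For example, if three packets are sent 160 ticks apart and the third arrives 10 ticks late, the first deviation is 0 and the second is 10, so the estimate moves from 0 to 10/16 = 0.625 ticks.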

The last sender report field holds the middle 32 bits of the 64-bit NTP timestamp associated
with the last sender report packet received. The delay since last sender report field indicates
the delay between receiving this sender report and sending the receiver report block. The
round trip time can be calculated as follows:

RTT = Arrival time of receiver report block - Last sender report
      - Delay since last sender report
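
As a worked illustration of this subtraction, the sketch below assumes that all three quantities are held in the compact 32-bit form used by the RTP specification (16 bits of seconds and 16 bits of fraction, i.e. units of 1/65536 s). The helper names are hypothetical.

```python
def ntp_middle32(ntp_seconds: int, ntp_fraction: int) -> int:
    """Reduce a 64-bit NTP timestamp (32-bit seconds, 32-bit fraction) to its
    middle 32 bits: the low 16 bits of seconds and the high 16 bits of fraction."""
    return ((ntp_seconds & 0xFFFF) << 16) | ((ntp_fraction >> 16) & 0xFFFF)


def round_trip_time(arrival: int, last_sr: int, delay_since_last_sr: int) -> float:
    """Return the round trip time in seconds.

    All three arguments are assumed to be in units of 1/65536 s. The
    subtraction is done modulo 2**32 so that wraparound of the 16-bit
    seconds part does not produce a negative result.
    """
    rtt_units = (arrival - last_sr - delay_since_last_sr) & 0xFFFFFFFF
    return rtt_units / 65536.0
```

With an arrival time of 0xB7108000 (46864.500 s), a last sender report value of 0xB7052000 (46853.125 s) and a delay of 0x00054000 (5.250 s), the function returns 6.125 s.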

8.4.7 RTCP sender report
An SR is transmitted by each active participant in a session. An SR also doubles up as
an RR, as it can also contain RR blocks. For senders with no statistics to report, the RC
field is set to 0. The SR has a similar format to the RR but with some additional fields,
namely the NTP timestamp, RTP timestamp, and packet and byte counts for the sender.

Network time protocol (NTP) timestamp
NTP is a protocol developed to allow a number of hosts to reach some agreement on
absolute time (i.e. to synchronize internal clocks with each other). NTP time is defined as
the number of seconds elapsed since midnight, 1 Jan 1900. It is represented using 64-bit
unsigned fixed-point storage composed of 32-bit integer and 32-bit fractional parts.

Sender report RTP timestamp
The RTP timestamp in the SR corresponds to the same instant as the NTP absolute time.
It is measured in the same units, and with the same random offset, as the RTP timestamps
contained within the RTP media stream. The function of the RTP timestamp is to allow
synchronization between streams. For example, to calculate the absolute NTP time for an
RTP packet, the following formula can be used:

(RTP timestamp packet - RTP timestamp SR) × RTP tick time + NTP sender report

Suppose that the last SR contained an RTP timestamp of 47 200 and an NTP time of
3 248 208 000 seconds. If an RTP packet is then received with a timestamp of 47 900 using a
G.711 payload (frequency of 8 kHz and tick time of 125 µs), its absolute time is:

Absolute time = (47 900 - 47 200) × 125 × 10^-6 + 3 248 208 000
              = 3 248 208 000.0875 s
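
The same calculation can be written as a small function. This is a sketch under stated assumptions: the function name rtp_to_ntp is hypothetical, exact fractions are used so that the small offset is not lost to floating-point rounding against a large NTP value, and the timestamp difference is treated as a signed 32-bit quantity to tolerate wraparound.

```python
from fractions import Fraction


def rtp_to_ntp(rtp_timestamp: int, sr_rtp_timestamp: int,
               sr_ntp_seconds, clock_rate_hz: int) -> Fraction:
    """Map an RTP timestamp to absolute NTP time in seconds, using the
    (RTP timestamp, NTP time) pair carried by the last sender report."""
    # RTP timestamps are 32 bits wide; take the difference modulo 2**32 and
    # interpret it as signed so packets just before the SR also map correctly.
    ticks = (rtp_timestamp - sr_rtp_timestamp) & 0xFFFFFFFF
    if ticks >= 0x80000000:
        ticks -= 0x100000000
    # One tick lasts 1 / clock_rate_hz seconds (125 microseconds at 8 kHz).
    return Fraction(sr_ntp_seconds) + Fraction(ticks, clock_rate_hz)
```

Calling rtp_to_ntp(47900, 47200, 3248208000, 8000) gives an offset of 700/8000 = 0.0875 s after the SR, matching the worked figure above.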

Once the absolute time for the various media streams has been established, the streams can
be synchronized by delaying the media packets which have the highest absolute time
value. This is useful, for example, where video is sent in a separate stream from audio
(lip sync), or where there are multiple streams, as in a conferencing application. The use of
the timestamp to synchronize media from different hosts (each with a different NTP clock)
is only possible if the hosts' NTP clocks are synchronized.
Possible solutions to this synchronization problem include the use of NTP itself
or a standard external clock source such as one available from the global positioning
system (GPS).

8.4.8 SDES source description
The SDES packet contains text descriptions about the sender of the media. Since the
synchronization identifiers change, the SDES is needed to provide a binding between the
synchronization identifiers and static descriptions of the participants in a session. Table 8.4
describes the different item types that can be used to identify the source.

8.4.9 BYE goodbye
This packet is used to tell the other end that this stream of media is being terminated. Apart
from the actual termination of a media stream, its use is also defined when handling
synchronization number collisions (see Section 8.4.2). As well as a list of identifiers
indicating which streams are being terminated, an optional string can be added giving the
reason for the BYE message.

8.4.10 APP application defined
This message is defined for experimental use, allowing application developers
to extend functionality without the RTP protocol needing modification.

Table 8.4 SDES item types
Name   Type  Description                                  Example

CNAME  1     Canonical end point identifier. Provides a   user@pc1.orbitage.com
             consistent name that the source can be
             identified from; usual format is user@host.
             Expected to be unique for each source in
             a session
NAME   2     Real name for media source                   Sebastian Coope
EMAIL  3     Email address of participant                 sebcoope@orbitage.com
PHONE  4     Telephone number of participant
LOC    5     Location of participant                      #14-15-06, Cambridge
                                                          Business Park
TOOL   6     Name of application producing media stream   MS Netmeeting
NOTE   7     Transient messages                           'On-phone, can't talk'
PRIV   8     Private extensions

8.4.11 RTP limitations
Even though RTP does have some loose session control via the use of SDES and BYE
messages, it does not have the complex functionality required of a true call control
protocol. For this reason RTP is invariably used with other session/call control protocols
such as SIP, BICC or H.323. RTP also does not allow complex media control, such as the
functionality provided by the real-time streaming protocol (RTSP), which allows media
to be rewound, cued, forwarded etc.


SDP (RFC 2327) is not really a protocol as such but a standard format used to represent
session information. SDP descriptions include the type of media (including the RTP payload
type), the timing of the session and the port numbers on which to send RTP data. SDP is used
by a wide range of other protocols to represent media options, including BICC, SIP, media
gateway control protocol (MGCP) and MEGACO.
Table 8.5 shows some SDP fields and their meanings. Each field consists of a one-character
token followed by the = sign and then a value.
The version field carries the major version number and is usually used to delimit
multiple session descriptions within a block. The origin field defines who is announcing
the session. This field starts with a username (sebcoope) followed by a session
ID and a version ID. The session ID allows clients to distinguish between different
sessions announced by the same media server. The version ID is used to distinguish
between multiple announcements of the same session, which may have different media
options. All this is followed by the origin address of the announcer of the session. The
session name and description provide a short and long identifier for the session, the
latter being used by clients to determine if they are interested in participating. The
connection information defines the IP address of the source or destination of the RTP
packets. This address can be set by a receiver to tell a sender where to send RTP media,
or by a multicast sender to indicate the address on which receivers should
be listening.

Table 8.5 SDP field names
Token  Name                 Example
v      Version
o      Origin               o=sebcoope 345667 345668 IN IP4
c      Connection           c=IN IP4
s      Session name         s=SWDevel6
i      Session description  i=Development meeting for Orbitage msecure project
m      Media description    m=video 51372 RTP/AVP 31
a      Media attribute      a=rtpmap:96 AMR
                            a=ptime:30
b      Bandwidth            b=64
e      Email                e=sebcoope@orbitage.com
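
Because every field is a one-character token, an = sign and a value, a session description can be split into fields with very little code. The sketch below is a minimal illustration rather than a conforming SDP parser. The function names are hypothetical, the description is written without spaces around the = sign (as SDP requires), and the 192.0.2.10 address is a placeholder chosen for the example, not a value from Table 8.5.

```python
EXAMPLE_SDP = """\
v=0
o=sebcoope 345667 345668 IN IP4 192.0.2.10
s=SWDevel6
i=Development meeting for Orbitage msecure project
c=IN IP4 192.0.2.10
m=video 51372 RTP/AVP 31
"""


def parse_sdp(text: str):
    """Return the description as an ordered list of (token, value) pairs."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        token, separator, value = line.partition("=")
        if separator != "=" or len(token) != 1:
            raise ValueError(f"not an SDP field: {line!r}")
        fields.append((token, value))
    return fields


def parse_media(value: str):
    """Split the value of an m= field into media type, port, transport and payload types."""
    media, port, transport, *formats = value.split()
    return {
        "media": media,
        "port": int(port),
        "transport": transport,
        "payload_types": [int(f) for f in formats],
    }
```

A list of pairs is used instead of a dictionary because tokens such as a and m can appear more than once and their order matters.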

The media description field describes the type of media (video/audio), CODEC used
and RTP port numbers. In the example given the media is video. This will be transported
on port 51372 and is using CODEC type 31 (H.261). The attribute field allows the media
description to be followed by a number of optional media attributes. One use of the
attribute field is to allow the definition of RTP dynamic payload types. For example,
a=rtpmap:96 AMR defines dynamic payload type 96 to be AMR. The ptime attribute
describes the packetization interval for the media CODEC in milliseconds.
The bandwidth field gives the bandwidth requirement for the session in kilobits per
second, allowing the network to calculate QoS or admission requirements. In theory it is
possible to work out the bandwidth requirement from the media description. This can be
difficult, however, since it also depends on the overheads incurred when putting the data
into packets. Finally, the email address of the originator of the session descriptor is given.
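
To illustrate how packetization overhead affects the bandwidth figure, the sketch below estimates the on-the-wire rate of a single stream. Its inputs are assumptions, not values from the text: a G.711 stream at 64 kbit/s, a 20 ms packetization interval, and 40 bytes of headers per packet (20 bytes IPv4, 8 bytes UDP and 12 bytes RTP, with no link-layer overhead). The function name is hypothetical.

```python
def stream_bandwidth_kbps(codec_bitrate_bps: float, ptime_ms: float,
                          header_bytes: int = 40) -> float:
    """Estimate the bandwidth of one RTP stream in kilobits per second.

    header_bytes defaults to IPv4 (20) + UDP (8) + RTP (12); link-layer
    framing is deliberately ignored in this sketch.
    """
    packets_per_second = 1000.0 / ptime_ms
    payload_bytes = codec_bitrate_bps / 8.0 * ptime_ms / 1000.0
    return (payload_bytes + header_bytes) * 8.0 * packets_per_second / 1000.0
```

With these assumptions a 64 kbit/s CODEC sent every 20 ms carries 160 payload bytes per packet at 50 packets per second, which comes to 80 kbit/s on the wire. Lengthening the interval to 30 ms reduces the overhead and the total falls to roughly 74.7 kbit/s.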
SDP is commonly used in an offer/response mode. In this case the first station may
offer to communicate with various media options; the second station will respond with its
preferred choice. The other possibility is to use SDP to announce sessions (such as a
multicasting transmission) to all possible participants, in which case no negotiation is possible.
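
The selection step in the offer/response exchange can be sketched as follows. This is a simplified illustration that compares RTP payload type numbers only; the function name is hypothetical, and a real negotiation would also have to compare the CODEC mappings behind dynamic payload types.

```python
def choose_payload_type(offered, preferred):
    """Return the responder's most preferred payload type that was also offered.

    offered   -- payload types listed by the first station
    preferred -- payload types the second station supports, best first
    Returns None when the two stations have nothing in common.
    """
    offered_set = set(offered)
    for payload_type in preferred:
        if payload_type in offered_set:
            return payload_type
    return None
```

For example, if the first station offers payload types 0, 8 and 96 and the second station prefers 96 over 0, the response selects 96.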

Media gateway control protocols were designed to allow the decomposition of gateway
functions into separate control signalling and media translation functions. This
arrangement can be seen in Figure 8.7.
The call signalling is separated from the media data and sent directly to the MSC
server. The MSC server uses H.248 messages to instruct the media gateway (MGW). The
MGW provides a connection between the IP network and the external networks.

8.6.1 Evolution of media control protocols
The development of MGW control protocols commenced with simple gateway control
protocol (SGCP). This was developed at Bell Research Centers (currently known as

[Figure: circuit core with two MSC Server/Call Agent nodes]


