Understanding the relationship between SIP and RTP

Getting your head around SIP and RTP traffic flows is a little daunting at first, but it's actually not all that complicated once you understand the purpose of each protocol.
As its name implies, the Session Initiation Protocol is used to initiate a session between two endpoints. SIP does not carry any voice or video data itself - it merely allows two endpoints to set up a connection to transfer that traffic between each other via the Real-time Transport Protocol (RTP).
SIP messages can be, and usually are, routed through one or more SIP proxy servers before reaching their destination. It is very similar to how email is transmitted, in that multiple email servers are usually involved in the delivery process, each forwarding the message in its original form. Each email server adds a Received header to the message, to track the route the message has taken. SIP uses a Via header to track the SIP proxies that the message has passed through to get to its destination.
SIP uses a very similar message format to HTTP. They are both human-readable, and use similar (if not the same) status codes. For example, both HTTP and SIP use 408 as the code to signal a timeout, 404 for 'not found', etc. Using Wireshark, you can capture SIP packets and read their content.
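To show just how HTTP-like a SIP message is, here is a minimal Python sketch that splits one into its start line, headers and body. The INVITE below is hypothetical: only the request URI comes from the packet discussed in this post, while the IP address, the proxy.example.com host, the tags and the Call-ID are made up for illustration. It also ignores finer points such as folded or compact-form headers.

```python
# Hypothetical INVITE. Everything except the request URI is invented.
RAW_INVITE = (
    "INVITE sip:401@asterisk.lithnet.local SIP/2.0\r\n"
    "Via: SIP/2.0/UDP proxy.example.com:5060;branch=z9hG4bK4b43c2ff8.1\r\n"
    "Via: SIP/2.0/UDP 192.168.1.50:5060;branch=z9hG4bK776asdhds\r\n"
    "To: <sip:401@asterisk.lithnet.local>\r\n"
    "From: <sip:400@asterisk.lithnet.local>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_message(raw):
    """Split a SIP message into (start_line, headers, body)."""
    # Headers and body are separated by a blank line, just like HTTP.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # A header such as Via can appear several times, so keep a list.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


start_line, headers, body = parse_sip_message(RAW_INVITE)
method, request_uri, version = start_line.split(" ")
print(method, request_uri, version)
for hop in headers["via"]:
    # The most recent hop is listed first, the initiating client last.
    print("Via:", hop)
```

Each proxy that forwards the request adds its own Via entry at the top, which is why reading the list from bottom to top gives you the route the message has taken so far.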
Here is a breakdown of the structure of a SIP packet. The numbers below refer to the numbered callouts in the annotated Wireshark capture.


  1. This shows the source and destination IP addresses of the SIP packet. Note this information will change as the packet passes between SIP proxy servers.
  2. Transport protocol and port. In this case, this is a SIP/UDP packet being sent to port 5060 (the standard SIP port).
  3. This is the SIP request line, which tells us what type of SIP message this is. This particular packet is a SIP INVITE request for extension 401 at asterisk.lithnet.local
  4. The Via header contains a list of all SIP proxy servers that this packet has passed through, including the initiating client
  5. The To header specifies the SIP packet's destination
  6. The From header specifies who sent the SIP packet
  7. This particular packet is a SIP/SDP packet, meaning it contains a Session Description Protocol message that contains information the remote client needs to open an RTP session for this call
  8. The IP address of the SIP client that created this packet
  9. The IP address the destination SIP client should contact to open an RTP session. It also specifies the IP Address version (IPv4 or IPv6)
  10. The key pieces of information in this header are audio, 33438, and RTP/AVP. The audio component signifies that this is an audio stream, 33438 specifies the port that the remote client should send its RTP traffic to at the IP address specified in (9), and RTP/AVP specifies that the Real-time Transport Protocol will be used for the session. The numbers at the end of this header represent the different codecs that this client supports. The SIP client at the other end must support at least one of the same codecs in order to make a successful connection.
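
Items 8 to 10 above can be pulled out of the Session Description with a few lines of Python. This is only a sketch under some assumptions: the SDP body is hypothetical (port 33438 and RTP/AVP come from the packet above, but the IP address and the codec list are invented), and the function simply keeps the last c= line it sees rather than handling per-media connection lines.

```python
# Hypothetical SDP body. The address and codec numbers are made up.
SDP_BODY = (
    "v=0\r\n"
    "o=- 3747 3747 IN IP4 192.168.1.50\r\n"   # (8) who created the session
    "s=call\r\n"
    "c=IN IP4 192.168.1.50\r\n"               # (9) where to send RTP
    "t=0 0\r\n"
    "m=audio 33438 RTP/AVP 0 8 101\r\n"       # (10) media, port, transport, codecs
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
    "a=rtpmap:101 telephone-event/8000\r\n"
)


def parse_sdp(sdp):
    """Extract the connection address and media descriptions from an SDP body."""
    info = {"address": None, "ip_version": None, "media": []}
    for line in sdp.splitlines():
        key, _, value = line.partition("=")
        if key == "c":
            # Format: <network type> <address type> <address>
            _nettype, addrtype, address = value.split()
            info["ip_version"] = addrtype
            info["address"] = address
        elif key == "m":
            # Format: <media> <port> <transport> <format> [<format> ...]
            media, port, transport, *formats = value.split()
            info["media"].append(
                {"media": media, "port": int(port),
                 "transport": transport, "formats": formats}
            )
    return info


def shared_formats(ours, theirs):
    """Return the payload formats both sides list, in our order of preference."""
    return [fmt for fmt in ours if fmt in theirs]


info = parse_sdp(SDP_BODY)
audio = info["media"][0]
print(f"Send {audio['media']} to {info['address']}:{audio['port']} "
      f"using {audio['transport']}, formats {audio['formats']}")
```

If shared_formats returns an empty list for the two ends of a call, there is no codec in common and the call cannot be set up.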

Unlike SIP, which listens on port 5060 (usually UDP, but can be TCP), RTP uses a dynamic port range, commonly somewhere between 10000 and 20000, and almost always runs over UDP. This range can usually be customized on the client to suit differing firewall configurations.
Now while SIP traffic passes from one server to the next to get to its destination, RTP sessions are set up directly between SIP clients (there is an exception to this rule, which I will explain shortly).
Here is an easy way to think of this. I want to call Bob on the phone, but I don't know Bob's number. I do however have his email address. So I send Bob an email telling him to call me on my phone number. The email passes through several servers and eventually arrives at Bob's inbox. Bob reads the email containing my phone number, picks up the phone, and calls me. We can then begin our audio conversation with each other. The email was used to help us set up a phone conversation, and after that it was no longer needed. Our phone call did not have to pass through the servers my email passed through to get to him, because they are two separate systems. The email in this example is analogous to a SIP packet, the phone call is our RTP session.

Now SIP is a good protocol, but things kind of break down when NAT gets involved. SIP packets themselves tend to move about without too much trouble (generally), as they 'hop' from one server to another. RTP sessions are somewhat more troublesome. Either both clients need to be aware they are behind a NAT, and substitute their local IP addresses for their public IPs in their Session Description messages and open the appropriate firewall ports, or something has to modify the SIP packets en route.
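A quick way to spot this problem in a capture is to check whether the Session Description advertises a private (RFC 1918) address. Here is a minimal Python sketch of that check. The function names and sample addresses are my own inventions for illustration, and it only looks at the first c= line.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def sdp_connection_address(sdp):
    """Return the address from the first c= line, or None if there is none."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            # "c=IN IP4 192.168.1.50" -> the address is the last field
            return line.split()[-1]
    return None


def advertises_private_address(sdp):
    """True if the SDP asks the far end to send RTP to an RFC 1918 address."""
    address = sdp_connection_address(sdp)
    if address is None:
        return False
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Host names and other non-literal addresses are not checked here.
        return False
    return any(ip in network for network in RFC1918_NETWORKS)


behind_nat = "v=0\r\nc=IN IP4 192.168.1.50\r\nm=audio 33438 RTP/AVP 0\r\n"
public = "v=0\r\nc=IN IP4 203.0.113.10\r\nm=audio 33438 RTP/AVP 0\r\n"
print(advertises_private_address(behind_nat))
print(advertises_private_address(public))
```

If a packet that has already crossed the internet still carries a private address here, the remote party has no way of reaching it, and one-way or no audio is the usual result.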
This is where the exception to the rule that I mentioned comes into play. Products known as Back-to-Back User Agents, one of the best known being Asterisk, can actually proxy RTP traffic.

Asterisk can modify SIP packets to direct the caller and destination to establish an RTP session with itself, rather than with each other. This is useful in situations where two SIP clients may not have direct access to each other, most commonly, when one or both of the SIP clients are behind a NAT.
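To illustrate the idea, here is a rough Python sketch of rewriting a Session Description so that media is sent to a relay instead of the original client. It is not how Asterisk is implemented internally. The function name, the relay address 203.0.113.10 and the port 10000 are all hypothetical, and a real implementation would also have to update the Content-Length header of the SIP message and actually forward the RTP packets it receives.

```python
def point_media_at_relay(sdp, relay_ip, relay_port):
    """Return a copy of the SDP with the RTP address and audio port replaced."""
    rewritten = []
    for line in sdp.split("\r\n"):
        if line.startswith("c=IN IP4 "):
            # Tell the far end to send RTP to the relay's address.
            line = f"c=IN IP4 {relay_ip}"
        elif line.startswith("m=audio "):
            # Keep the transport and codec list, swap only the port.
            fields = line.split(" ")
            fields[1] = str(relay_port)
            line = " ".join(fields)
        rewritten.append(line)
    return "\r\n".join(rewritten)


original = (
    "v=0\r\n"
    "c=IN IP4 192.168.1.50\r\n"
    "m=audio 33438 RTP/AVP 0 8\r\n"
)
relayed = point_media_at_relay(original, "203.0.113.10", 10000)
print(relayed)
```

The relay does the same thing to the Session Description travelling in the other direction, so both clients end up sending their audio to the relay, which passes it on.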
It is important to note that Asterisk only proxies RTP traffic when it has to, and when configured to do so. If both clients are on the same local network segment, Asterisk doesn't need to play a part in the RTP session, and it will proxy only the SIP traffic.
In summary, when troubleshooting packet captures, pay close attention to:
1. The ports and IP addresses specified in the SIP message headers (To, From, Via). Determine where the packet came from, where it thinks it needs to go, and the route it has taken to get to where you found it.
2. The ports and IP addresses specified in the Session Description Protocol (SDP) portion of the SIP message. Ensure that the remote party will be able to connect to both the IP address and the port specified.

Comments

Anonymous said…
Excellent,
I found what I needed,
Thanks
Anonymous said…
Very well explained, like it, just what i needed, thanks from Santo Domingo, DR.
Anonymous said…
Excellent :-)
Gyass said…
>>I started writing BASIC programs on the Commodore 64 at the age of about 8.<<

Lol, i started from my Craddle :-D
Anonymous said…
Hi,

very good article, would you explain little bit more if we want to reconstruct both SIP and RTP packets to log the full call details what are the basic information which both packets have that this sip session belongs to this rtp stream

Regards,
Nisha
Asim said…
Each and every thing cleared
Anonymous said…
You are awesome....
Safi said…
Thanks! It a simple, straight forward guide to the world of SIP and RTP. i would recommend to anyone!
Raf said…
I really enjoyed reading the article however we should mention that RTP works perfectly well with TCP protocol as its carrier. And the good example are MS and IBM products which are actually based on RTP/TCP protocol.
Karthik Prabhu said…
Thanks for your clear explaination
Anonymous said…
Great documentation, thanks
Anonymous said…
perfect explanation
Anonymous said…
Thanks for the nice explanation
Anonymous said…
Thanks
Adi said…
The RTP packets start to pass only after the SIP protocol finished his initiation?
Anonymous said…
Thanks for effort, we are awaiting for more :)
William said…
First decent explanation of SIP and RTP I found!

Thanks for this post!
shweta said…
This post is helpful . simple and concise. i just want to know that apart from SIP Message request is it mandatory to provide SDP part in a SIP request. ?
very well simplified and explanatory
Anonymous said…
This article is very helpful. Thank you :)
Josh said…
The other day I came across a page I thought I must share: ozekiphone.com/what-is-rtp-real-time-transport-protocol-321.html
It is the page of Ozeki Phone System XE telephone system and you can get to know RTP better here.
Anonymous said…
Great explanation, well done!
Unknown said…
incredible explanation
Anonymous said…
SIP Interview Questions

Adding one more appreciation to the list. Well explained buddy :-)
Just adding few cents of mine. The RTP/AVP is the Real time Transport protocol for "Audio Video Profile" and the fact why UDP is used is pretty straight forward - UDP has a very fast re-transmission rate even if a packet is lost(ex YouTube buffering). Whereas, for a TCP a 3-way handshaking will be required for every RTP packet to be re-transmitted.
Keep posting. Cheers !!!
Anonymous said…
Great Article mate! well done
Anonymous said…
Very good overview, Thanks
Anonymous said…
Thank you so much, this is what I was looking for.....thanks a lot buddy
Unknown said…
thank you very much ... well explained :)
Anonymous said…
excellent...exactly what i was looking for....
@n@Nt said…
superb....i like it....:)
Unknown said…
many Thanks for your clear explaination
Anonymous said…
nice description.explained clearly
Anonymous said…
A great post, everything else I've found on SIP/RTP has been hideously complicated -- thanks for explaining it so clearly and simply !
aslam said…
well done!
Bilal Abbasi said…
Great
Anonymous said…
Hello,
Thank you for such simple and precise article.
Please continue the good work.
Anonymous said…
Is it possible for Asterisk to send/receive SIP on interface eth1, and RTP on interface eth2. If so, how can it do this?
Unknown said…
Great !!! , explained in simplest form, very helpful.
Anonymous said…
Excellent. Such a simple explanation of a complex subject.
Tony from Hayward CA said…
Easy to understand. Thanks very much for sharing.
Wissam said…
Hi,
Will the Destination IP be visible to the client as the RTP is established between them!
Thx
Jeeves said…
Coming from SS7 background where signaling & trunkId are clearly related in SS7 message,
I was itching to find out how is the RTP related to SIP. your blog tells me how.
One request, I am unable to see the images in the post, as I am in Asia. e.g.
http://members.iinet.net.au/~blade9/UnderstandingSIPandRTP_DB23/image0_thumb27.png

If it possible to place the images so it can viewed by outsiders.
Mariana said…
Thanks Ryan, great explanation. :)
Manish Dwivedi said…
Excellent. Thanks a lot.
Bodner & Chas said…
This is dank. Thanks
Unknown said…
Kudos.

One of the best SIP RTP explanations! Keep up the good work....from Oxford
584165187488 said…
I need to set up a STUN server because I am having audio problems on calls to Chile