Understanding Media Flows in Microsoft Teams and Skype for Business BRK4004 Summary

Following Ignite there’s a ton of awesome content and session recordings to watch, so today I watched Thomas Binder’s session on “Understanding Media Flows in Microsoft Teams and Skype for Business” and thought this should be a goodie.

Great session by Thomas Binder, with a ton of awesome information and tips on media flows, media and transport relays, and the differences between Skype for Business and Teams. It’s amazing just how much happens under the hood that users never see: how SfB and Teams find the best media path and codecs to set up the best quality call possible, with clients connected from everywhere. Towards the end there are great tips on tools for reading logs and traffic, and on troubleshooting.

Hot tip for Teams logs (towards the bottom, highlighted in yellow): how to format Teams logs with line breaks. The literal text “\r\n” in the log marks a line break, so replace it with a real line break.

Thank you Thomas for this great session! There was a lot of applause at the end, and well deserved!

Let’s go!

Reference URL – https://www.youtube.com/watch?v=aD5mUg2ZzLQ

Thomas has done this session a couple of times for SfB before, and opens with questions for the audience.

Key Learnings!

Understand peer-to-peer traffic,
it’s great to have local internet breakout rather than sending all traffic to central locations,
stress on UDP ports 3478 and 3479: these are critical

Not talking about signalling, it’s all about media

A candidate is a combination of IP address and port that allows the other peer to connect

ICE uses two techniques: STUN, which helps to traverse a NAT device, and TURN, a relay technique. There are two types of relays: media relay and transport relay.

Two endpoints that need to communicate

First they need Signalling to say “Hey I’m here”

Here we have signalling via Office365

Call could be audio, video or desktop sharing

If they want a call, we want to send media as directly as possible. They could be on the same site, in the same office or across the floor, as long as the network is directly routable.

There may be devices in between that don’t allow direct calls. This is a problem.

Then there’s Charlie, who is outside the network as well

Firewalls also may not allow direct communication from external clients on the internet to internal clients, e.g. Charlie to Alice

Now we need some logic that helps to establish all the different call flows

Let’s break it down

NAT

NAT – Network Address translation

Example: at home you can have lots of different devices (Xboxes, PlayStations, PCs) with internal IP addresses, all sharing a single public IP address. Your router does the NAT. This is great as it also provides security: unknown traffic to your IP gets dropped if it was not requested.

Control traffic that’s coming in
Additional features: deep packet inspection and proxies
Sharing of IP addresses

HTTP proxy servers

Now HTTP proxies

Bad for Teams and SfB as a proxy doesn’t allow UDP: HTTP will always use TCP
UDP is preferred for real time
May corrupt packets
May block traffic or slow it down
Real time may not be real time if any latency is added

The solution is ICE, STUN and TURN!

First there’s signalling that goes via the Cloud

For SfB, signalling is done via SIP
For Teams it is not SIP: it’s a REST API via HTTPS, plus web sockets for more persistent comms. No more SIP.

BUT

In terms of ICE it’s very similar

Now we have STUN and TURN servers. They function as a relay: if a client wants to talk to someone but can’t do so directly, it can use the STUN and TURN servers as relays
At the same time they help us find our public IP address and get the NAT device to allow incoming traffic
The client sends a packet to the relay server, which allocates candidates and sends a packet back saying which public IP it saw. The client then knows its public IP and that it may be able to accept traffic there
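
The “what is my public IP” step can be illustrated with plain STUN as defined in RFC 5389. This is only a sketch under that assumption: SfB and Teams use Microsoft’s own TURN/STUN extensions, which were not shown in this detail in the session, and the function names here are made up for illustration. The relay’s answer carries the address XOR-ed with a fixed magic cookie, which the client undoes to learn its reflexive address.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id=None):
    """Build a 20-byte STUN Binding Request header with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    # Message type 0x0001 = Binding Request, length 0 = no attributes follow.
    return struct.pack("!HHI", 0x0001, 0, MAGIC_COOKIE) + transaction_id


def decode_xor_mapped_address(value):
    """Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute."""
    _reserved, family, xport = struct.unpack("!BBH", value[:4])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    # The port is XOR-ed with the top 16 bits of the cookie,
    # the address with the whole cookie.
    port = xport ^ (MAGIC_COOKIE >> 16)
    (xaddr,) = struct.unpack("!I", value[4:8])
    ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return ip, port
```

The decoded pair is what ends up as the server reflexive candidate that the client later offers to its peer.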

and ICE

Calls to PSTN via Office 365 use ICE
ICE is used for all real-time modalities
In Teams, files are uploaded to OneDrive for Business

Relay – very important for ICE negotiations

Two types of Relay

Media relays
Transport relays

The media relay component was built for Skype for Business Server, where it was the Edge server. It was moved to the cloud but wasn’t built for the cloud, so a cloud solution was born

The transport relay is built for scale and more flexibility

The media relay is static in one DC: if you’re in Orlando and your media relay is in Europe, traffic travels back to Europe to use the relay.

Transport relays are much smarter and use dynamic discovery via anycast

If I travel to Orlando I can use a transport relay in the US, not Europe.

This is why local internet breakout is important: without it you may not be able to take advantage of the transport relay and keep traffic local.

View the other two Ignite sessions as well

Media relay: same UDP ports
Transport relays use a different UDP port per workload

Skype for Business uses the media relay

Transport relay support is in progress for SfB, but is already in use with Teams

Teams always uses the transport relay!

One IP for all anycast servers
The closest server, with the fewest hops, is always used
Based on endpoint location and privacy boundaries
US government cloud uses only US
Tenant in EMEA
All traffic encrypted with a key

Based on ECMP, so load can be distributed easily

Super easy to manage

5 phases of ICE

1. Request credentials

2. Candidate discovery – once I know where I can be reached, I send that to the other client

3. Candidate exchange, then try to establish a connection

4. Connectivity checks

5. Candidate promotion – selects the best media path

Sign in to the service; from signalling, learn which relay is configured for me

Option 1: SfB Online using the media relay, or Lync 2010. Lync 2010 always uses the media relay

Option 2: SfB Online, Lync 2013 or newer

Teams always uses TRAP!

First Demo!

Snooper

Shows the different SIP dialogs, with the SIP header on the left and the details on the right

Look for MRAS

First incoming 200 OK – in-band provisioning

Learn the audio port range

Interested in MRAS: here we have a relay configured. Office 365 should always have this!

Next is the service request, and there is a relay configured with credentials

Valid for 480 minutes – 8 hours (SfB)

Teams: valid for 24 hours

Next, the credential response

Here are the credentials. The service used its own certificate to create them, and if the relay is used the client will present them

Media relay list

Learn what the media relay is, plus the username, password and ports to use

Only one relay is listed; Office 365 will only show the external media relay

That was for SfB, but for Teams it’s more tricky!

For Teams there are no nice tools to read logs. All traffic is HTTPS and sometimes web sockets, so you need a proxy that does a man-in-the-middle interception, and you need to trust its certificate.

Charles web proxy: Charles has a sequence view and a structure view

The relay address is not an FQDN, it’s an IP address, which is different from the media relay

The service just gives the IP directly, so it’s faster

Now I need to discover my IP addresses
The first candidate is always the local interface address
Then I ask the relay, and it allocates a candidate for me
And then the relay sends its candidates

Then the same for TCP

Always prefer UDP, but TCP can be used as it’s better than no call at all!

3478 no matter the workload in Teams at the moment! The 4478 listed above should be 3478; it’s a mistake on the slides

Candidates

Some SfB workloads always use TCP: 1:1 file transfer and desktop sharing via RDP

Send a message to the peer I want to talk to
Then the other endpoint does the same with where it can be reached
Then the person picks up, and this is the endpoint we’re talking to.

Let’s look at these logs

Back to Snooper

We can see here from the INVITE that Martin calls Thomas

We can see this was an audio call, and the candidates

Scroll down and there’s more information

We can see the codecs Martin’s client supports

Let’s look at the candidates again

The first fields are 1 and 1. Candidates come in pairs, one for RTP and one for RTCP

Then UDP

Then priority – the higher the number, the more I want to use this candidate

Then IP address

This is the IP of this actual candidate

Then ports

Then type

Here we have host, and we know this is the local IP address of the endpoint!

There are other interesting types

There’s srflx with raddr: this is where I send a packet to the relay and the relay tells me which address it saw the packet come from

The raddr matches the host address: the relay says that when you send messages from 192.168.1.110, they arrive from 91.205.175.103

Then the relay address

If I can’t establish a direct connection or use the srflx address, the other side may still be able to talk to my relay address

Also IPv6 candidates

TCP passive and active candidates

A TCP passive candidate is able to receive traffic as well; active and passive candidates match each other
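
The field order walked through above can be sketched as a small parser. It assumes the usual SDP layout of an `a=candidate:` line (foundation, component, transport, priority, address, port, then key/value pairs such as `typ` and `raddr`); the function name and the sample line, including its priority and port numbers, are made up for illustration and are not taken from the session’s logs.

```python
def parse_candidate(line):
    """Parse one 'a=candidate:' SDP line into a dict."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        raise ValueError("not a candidate line")
    fields = line[len(prefix):].split()
    candidate = {
        "foundation": fields[0],
        "component": int(fields[1]),  # 1 = RTP, 2 = RTCP
        "transport": fields[2],       # e.g. UDP, TCP-PASS, TCP-ACT
        "priority": int(fields[3]),   # higher means more preferred
        "address": fields[4],
        "port": int(fields[5]),
    }
    # The rest are key/value pairs: "typ host", "raddr 192.168.1.110", ...
    extras = fields[6:]
    for key, value in zip(extras[0::2], extras[1::2]):
        candidate[key] = value
    return candidate


# Illustrative line only, built from the two addresses mentioned in the notes.
sample = ("a=candidate:3 1 UDP 1694234111 91.205.175.103 50010 "
          "typ srflx raddr 192.168.1.110 rport 50010")
parsed = parse_candidate(sample)
```

Here `parsed["typ"]` is `srflx`, `parsed["address"]` is the public address the relay saw, and `parsed["raddr"]` is the local host address it maps back to.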

overall looks

Now there’s the 183 Session Progress, back from the called party, and here is its candidate information

There are two here, but why?

We see one from Skype for Business

and the other also coming from SfB, but on an Android phone

If the user has more than one device, we establish a media session with all of them

Now in the incoming packets there are no more pairs

Here we have rtcp-mux (multiplexing), so I send the old version and say: hey, I know the new version as well.

Another interesting thing is the encryption: we can see here the crypto line with suite and key. This is how the two endpoints encrypt the traffic. They use the secure signalling channel to let each other know which cipher to use, so only the two endpoints know how to encrypt the traffic. The relay never sees this and just passes the packets on.

MRAS allows endpoints to allocate candidates

No encryption of traffic by the relay

Connectivity Checks

Now each one knows where the other can be reached, and they will determine all possible UDP and TCP port pairings

IPv4 and IPv6
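
A minimal sketch of forming the pairings, assuming the simplification that a local and a remote candidate are only paired when they share the same IP family and the same transport. The dict layout and function name are hypothetical, and relay bridging between TCP and UDP is deliberately left out.

```python
import ipaddress
from itertools import product


def candidate_pairs(local, remote):
    """Pair every local candidate with every compatible remote candidate."""
    pairs = []
    for loc, rem in product(local, remote):
        same_family = (ipaddress.ip_address(loc["address"]).version
                       == ipaddress.ip_address(rem["address"]).version)
        same_transport = loc["transport"] == rem["transport"]
        if same_family and same_transport:
            pairs.append((loc, rem))
    return pairs


local = [
    {"address": "192.168.1.110", "port": 50000, "transport": "UDP"},
    {"address": "2001:db8::10", "port": 50000, "transport": "UDP"},
]
remote = [
    {"address": "10.0.0.5", "port": 50020, "transport": "UDP"},
    {"address": "10.0.0.5", "port": 50021, "transport": "TCP"},
]
pairs = candidate_pairs(local, remote)
```

Each resulting pair is then something a connectivity check can be sent over.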

For SfB the relay can’t bridge TCP and UDP: if one SfB client can only talk TCP and the other can talk UDP and TCP, the whole call needs to be TCP.

In Teams one side can talk UDP and the other TCP, and the relay will translate

We find out which candidate pairs work, prioritise them, and the most optimal one is the one we use for the call

We cannot see this in Snooper or Charles

After the other person picks up, the best candidate is identified and then we can see which one

IPv4 over IPv6

UDP over TCP

Prefer more direct path
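
The three preferences can be sketched as a sort key over the pairs that passed their connectivity checks. The order in which the criteria are applied (directness first, then transport, then IP version) is an assumption made for this example, as are the field names; the session did not spell out how the real priority numbers are calculated.

```python
def preference_key(pair):
    """Smaller tuples sort first, i.e. are preferred."""
    return (
        pair["relays_in_path"],                  # 0 = direct, 1 or 2 = relayed
        0 if pair["transport"] == "UDP" else 1,  # UDP over TCP
        0 if pair["ip_version"] == 4 else 1,     # IPv4 over IPv6
    )


def best_pair(working_pairs):
    """Pick the most preferred pair among those that passed the checks."""
    return min(working_pairs, key=preference_key)


working = [
    {"name": "relayed-udp", "relays_in_path": 1, "transport": "UDP", "ip_version": 4},
    {"name": "direct-tcp-v6", "relays_in_path": 0, "transport": "TCP", "ip_version": 6},
    {"name": "direct-udp-v4", "relays_in_path": 0, "transport": "UDP", "ip_version": 4},
]
chosen = best_pair(working)
```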

See the re-INVITE in the logs: there’s only one candidate, the one that will be used for this call

TCP is a very good protocol as it protects against lost packets and lost information: if I send a packet I get an acknowledgement, and if I don’t get it I wait and then resend the packet. But this takes time, and in real-time comms we want traffic to get there as fast as possible. We don’t like lost packets, but a packet may contain only 20 ms of voice; you may not hear that, and codecs are smart and can recover

With TCP, lost packets add delays

UDP: fire-and-forget approach, ideal for real-time communications

Let’s look at the final candidates

Before that, let’s look at Teams candidates

In Charles, search for a=candidate

Select the conversation

It’s one super long line!

The literal \r\n marks a line break

Copy and paste into a text editor and replace \r\n with line breaks, and this gets you the below
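
The same replace can be scripted. This is a small sketch that assumes the text copied out of Charles contains the literal four characters `\r\n` rather than real line breaks; the function names and the sample string are invented for illustration.

```python
def expand_line_breaks(raw):
    """Turn literal backslash-r backslash-n sequences into real newlines."""
    return raw.replace("\\r\\n", "\n")


def candidate_lines(raw):
    """Return only the a=candidate lines from the expanded text."""
    return [line for line in expand_line_breaks(raw).splitlines()
            if line.startswith("a=candidate")]


# Made-up sample of the kind of single long line found in the capture.
raw = r"v=0\r\na=candidate:1 1 UDP 2130706431 192.168.1.110 50000 typ host\r\na=rtcp-mux"
readable = expand_line_breaks(raw)
```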

Not super nice to read, but

scroll down and we can see info on codecs

Looks similar BUT

for relay candidates it will use ports based on workloads

Here we have 3480, not the high ports

Another interesting thing: all relay candidates come with an MTURN ID. This is for security and controls who can access my service. In SfB we used the huge port range, and when someone wanted to allocate we randomly picked a port; that gave some additional security as it was only open for a short time. If we use the same port for all connections anyone can go there, but they can’t connect, as they need an MTURN ID to connect to that port.

Back to Snooper for the final candidate for SfB

Search for a=remote-candidate

Contains 1 candidate

It’s the prflx candidate, meaning peer reflexive: whoever I’m talking to is talking to my NAT device. The IP is the same as the server reflexive candidate but the port is different.

If we look at the 200 OK

we can see here the remote candidate is the relay: this client is talking to the relay.

Media flows from the calling person to the relay of the called person, so there’s one relay in the media path. Now we can understand how traffic is flowing.

Call Flows

As mentioned before, for a 1:1 call we want to send media as directly as possible. It’s different for a meeting, as the cloud needs to mix

We have two SfB clients and their own relays, with 443 and 3478-3481

Both connect to their relay to allocate candidates on port 443 TCP or 3478 UDP; for UDP they are then redirected per workload, e.g. 3479 for audio

Next they try to establish a direct call, as that is the best option

At the same time they try to talk via the relay

Now the calling client tries to connect to the called client’s relay on the 50k port range, as that was the candidate allocated there

Then we do the same for the other relay

If all work then fantastic, and we can pick direct

If direct doesn’t work we pick the relay of the called client, or if that doesn’t work we use the calling client’s relay

If both don’t work then the relays need to talk to each other! This is why it’s still useful for SfB if the 50k port range is open: with it open, calls can be established through one relay. If you close the 50k port range, as Microsoft recently said it’s not required anymore, then you have two media relays in the media path
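
The fallback order for an SfB 1:1 call can be written down as a tiny selection function. It is a sketch of the order described above only; the path labels and the idea of a results dict are assumptions for illustration and not how the client stores its check results.

```python
# Preference order as described in the session notes.
PATH_ORDER = ["direct", "called_relay", "calling_relay", "relay_to_relay"]


def select_path(check_results):
    """Return the first path, in preference order, whose check passed."""
    for path in PATH_ORDER:
        if check_results.get(path):
            return path
    return None


# With the 50k range closed, neither single-relay path works.
closed_50k = {
    "direct": False,
    "called_relay": False,
    "calling_relay": False,
    "relay_to_relay": True,
}
selected = select_path(closed_50k)
```

With the 50k range open, one of the single-relay entries would pass and be picked first, which is the one-relay media path described above.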

They looked at the difference in quality if you close the 50k range and it’s not that big a difference; call setup may be quicker. If you don’t have the ports open it seems they are not essential, BUT if they are already open then there’s no reason to close them.

TEAMS

Similar concept

They connect to their own relay on 443 TCP or 3478 UDP; to their own relay they always talk 3478

Test direct

Then the other one via relay on 3479–3481, depending on workload

The other relay will be tested

And if all of that doesn’t work, the relays could still talk to each other

SfB and Teams side by side for 1:1 (peer to peer)

SfB – Client to Service

Mediation server or conferencing server

The mediation server is on the right side as it’s internal to the network

The client talks to its own relay on 443 TCP, 3478-3481 UDP

The server does the same

Now the client will try to talk directly to the server; if not firewalled this may be possible, but it can’t be guaranteed

If that doesn’t work then we use the relay of the called endpoint, which is the server’s

If that doesn’t work we can talk to the relay of the end user

You should not see two relays, as the 50k port range is open on the cloud service side

Teams: Client to Service

The Teams client allocates candidates

The service will never allocate candidates: we know the service can talk to its relays, so it doesn’t need its own relay

Again we try a direct connection first

Otherwise the Teams client talks to its assigned transport relay and the service component talks to the same relay

Bring that all together in a single table!

On the left we have the workloads: allocate candidate, audio, video, desktop sharing

Columns: Teams, SfB, service port for media relay and transport relay

The SfB client source port honours the client port ranges configured per workload, including while allocating candidates. Allocation goes to 3478 UDP or 443 TCP on the media relay or transport relay, and the client is then redirected: once SfB establishes audio it sends to 443 TCP / 3478 UDP on the media relay, or 3479 UDP on the transport relay for audio.

The Teams client source port will always be 1024 and up. The plan is to change this to be similar to SfB, so you can look at traffic and see which workload it is

From the Teams client to the transport relay it will always be UDP 3478. The plan is to change this so you can map source ports to destination ports. They are still working on this.

Direct connectivity is required: every client needs to connect directly to Office 365 so it can establish a media path, talking directly to the transport or media relay

No proxy
No shaping
No deep packet inspection
If possible use local internet breakout, take the shortest route to the transport relay, and route over the Microsoft network.
Prefer UDP over TCP: better for real time
TCP can be used as a backup, and in SfB it is used for some scenarios
Important to look at the documented list of IPs and FQDNs to open your environment to
aka.ms/o365endpoints
It’s quite a list and is updated a lot: subscribe to the RSS feed!
Open UDP ports

If people deployed SfB a year ago, for media they opened 443 (not changed) and 3478, but in the past 3479-3481 UDP were not needed, so these may not be open

Problems seen with transport relays: the client tries 3478 and it works, then allocates candidates and talks to the same IP BUT on port 3479, 3480 or 3481, which could be blocked. If the firewall blocks this, UDP will FAIL and media will go over TCP! No one will call to say calls don’t work, but quality may be worse!

Be sure all UDP ports ARE OPENED!
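
A quick way to sanity-check a firewall rule set against this list is a set difference. The sketch below assumes you already have the allowed outbound UDP ports as plain numbers; the function name is hypothetical. The notes only spell out 3479 for audio, so the 3480 (video) and 3481 (sharing) labels follow Microsoft’s published port guidance as I understand it and are worth confirming against aka.ms/o365endpoints.

```python
# UDP ports discussed in the session, with their workloads.
MEDIA_UDP_PORTS = {
    3478: "allocate candidates (STUN/TURN)",
    3479: "audio",
    3480: "video",
    3481: "desktop/screen sharing",
}


def missing_udp_ports(allowed_ports):
    """Return the media UDP ports that the allowed list does not cover."""
    return sorted(set(MEDIA_UDP_PORTS) - set(allowed_ports))


# A rule set from the days when only 3478 was needed.
legacy_rules = [3478]
for port in missing_udp_ports(legacy_rules):
    print(f"UDP {port} is not open: {MEDIA_UDP_PORTS[port]} may fall back to TCP")
```

This only checks a rule list on paper; it does not send any traffic, so it can’t replace an actual connectivity test.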

For Skype for Business hybrid, your on-premises servers need to talk to Office 365. They don’t need the new ports 3479-3481, which are just for client to service.

The Edge server will still talk 3478

Tools and Troubleshooting

SfB: super easy! Uccapilog.log and Snooper

Teams – not so easy!

You need to do a trick with a local proxy that does a man-in-the-middle interception to collect traffic; examples are Fiddler and Charles proxy.

SfB: turn on logs

You may need to delete logs, sign out and sign back in, to start with clean logs

Search tips

a=candidate

a=remote-candidate

When you reproduce a problem, the final candidates can take 7-10 seconds to appear after someone answers, so it’s recommended to leave the call running for 20 seconds before disconnecting, to make sure the final candidates are there.

The reason is that when the other person picks up, the call may not yet run over the optimal candidate. In the background the clients may still be negotiating a better connection and then switch to it. Once the final candidate pair is listed it won’t change.

Tip for capturing: web sockets can be very persistent, and in testing it was hard to capture them each time. After closing and restarting Teams, sometimes you would see them and sometimes not.

This is how Teams does it today, but it may CHANGE!

Also CQD, the Call Quality Dashboard: after every call, the call quality experience, IPs and ports are logged over signalling

You can look at the data, create filters, and compare UDP calls and TCP calls; you shouldn’t see a lot of TCP calls

Practical guidance on CQD.

Filters created for this example are as below

Then the report is created

Lots of TCP, but that’s app sharing, so that’s expected in SfB

Very few sessions use VBSS and it seems there is a lot of RDP going on; this could be from giving control or from old clients.

You can investigate client types and check whether a client supports only RDP

Other report with filters applied on the left

Subnets replaced to hide customer data

You can compare subnets by number of TCP and UDP calls

Find the top offending subnets and find out why there is so much TCP traffic
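
If the stream data is exported from CQD, the “top offending subnets” comparison can be reproduced with a few lines. The record layout below (a `subnet` and a `transport` field per stream) is a made-up stand-in, not the real CQD export schema, and the subnets are example values.

```python
from collections import Counter


def tcp_share_by_subnet(streams):
    """Return (subnet, TCP share) tuples, highest TCP share first."""
    totals = Counter()
    tcp = Counter()
    for stream in streams:
        totals[stream["subnet"]] += 1
        if stream["transport"] == "TCP":
            tcp[stream["subnet"]] += 1
    shares = {subnet: tcp[subnet] / totals[subnet] for subnet in totals}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


streams = [
    {"subnet": "10.0.1.0", "transport": "TCP"},
    {"subnet": "10.0.1.0", "transport": "TCP"},
    {"subnet": "10.0.1.0", "transport": "UDP"},
    {"subnet": "10.0.2.0", "transport": "UDP"},
    {"subnet": "10.0.2.0", "transport": "UDP"},
]
ranking = tcp_share_by_subnet(streams)
```

Subnets at the top of `ranking` are the ones to check first for blocked UDP ports, keeping in mind that some SfB app sharing over TCP is expected.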

Test that ports are open

The SfB Network Assessment Tool sends real media to the transport relay and collects information on jitter, delay and packet loss.

However, SOON a new version will be available to test connectivity for TCP and UDP ports! Run it from a PC to find out whether it can connect to the required ports

It tests all the ports against a set of IPs that is downloaded at run time, so the IPs are always up to date. For any connectivity issue this tool is great to run on a PC to test connectivity

There might be situations where connectivity is working but something in the way corrupts packets

If the tool worked, then perhaps trace a call

Resources and summary