Understanding Media Flows in Microsoft Teams and Skype for Business BRK4004 Summary

Following Ignite there’s a ton of awesome content and session recordings to watch, so today I watched Thomas Binder’s session on “Understanding Media Flows in Microsoft Teams and Skype for Business” and thought this should be a goodie.

Great session by Thomas Binder, with a ton of awesome information and tips on media flows, media and transport relays, and the differences between Skype for Business and Teams. It’s amazing just how much happens under the hood that users never see: how SfB and Teams find the best media path and codecs to set up the best quality call possible, with clients connected from everywhere. Towards the end there are great tips on tools for reading logs and traffic, and on troubleshooting.

Hot TIP: towards the bottom of this post, highlighted in yellow, is how to format Teams logs so they are readable. In the captured text, “\r\n” marks a line break, so replace each one with a real line break.

Thank you Thomas for this great session! There was a lot of applause at the end, and well deserved!

Let’s go!

Reference URL – https://www.youtube.com/watch?v=aD5mUg2ZzLQ

image

Thomas has done this session a couple of times for SfB before, and opens with questions for the audience.

image

image

Key Learnings!

  • understand peer-to-peer traffic
  • it’s great to have local internet breakout rather than sending all traffic to central locations
  • UDP ports 3478 and 3479 were stressed: these are critical

image

image

Not talking about signalling here, it’s all about media

image

A candidate is a combination of IP address and port that allows the other peer to connect.

ICE uses two techniques: STUN, which helps to traverse a NAT device, and TURN, which is a relay technique. There are two types of relays: media relay and transport relay.

image

Two endpoints that need to communicate

First they need Signalling to say “Hey I’m here”

image

Here we have signalling via Office365

Call could be audio, video or desktop sharing

If they want a call, we want to send media as directly as possible. They could be in the same site, the same office or across the floor, as long as the network is directly routable.

image

They have devices that don’t allow direct calls. This is a problem.

image

Then there’s Charlie, outside the network as well.

image

Firewalls also may not allow direct communication from external clients on the internet to internal clients, for example Charlie to Alice.

image

image

Now we need some logic that helps to establish all the different call flows

Let’s break it down.

NAT

image

NAT – Network Address translation

For example, at home you can have lots of different devices (Xboxes, PlayStations, PCs) with internal IP addresses, all sharing a single public IP address. Your router does the NAT. This also provides some security, as unknown traffic to your IP gets dropped if it was not requested.
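
To make the NAT idea a bit more concrete, here is a minimal Python sketch of a toy NAT table. The class name, addresses and port numbers are made up for illustration and are not from the session, and a real router does far more (timeouts, per-destination filtering and so on).

```python
class ToyNat:
    """Very small model of a home router doing port-based NAT (illustrative only)."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000   # arbitrary start of the public port pool
        self.outbound = {}       # (private_ip, private_port) -> public_port
        self.inbound = {}        # public_port -> (private_ip, private_port)

    def send_out(self, private_ip, private_port):
        """An inside device sends a packet: create or reuse a mapping."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outbound[key])

    def receive(self, public_port):
        """A packet arrives from the internet: forward only if a mapping exists."""
        return self.inbound.get(public_port)  # None means the packet is dropped


nat = ToyNat("203.0.113.10")
xbox = nat.send_out("192.168.1.20", 3074)
pc = nat.send_out("192.168.1.110", 50000)
requested = nat.receive(pc[1])    # reply to traffic the PC started
unrequested = nat.receive(12345)  # nobody inside asked for this
```

Both devices share the one public IP but get different public ports, and the unrequested packet has nowhere to go, which is the "dropped if not requested" behaviour described above.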

image

  • Control traffic that’s coming in
  • Additional features, deep packet inspections and proxies
  • Sharing of IP Addresses

HTTP proxy servers

image

Now HTTP proxies

  • Bad for Teams and SfB, as a proxy doesn’t allow UDP: HTTP always uses TCP
  • UDP is preferred for real time
  • may corrupt packets
  • may block traffic or slow it down
  • real time may not be real time if any latency is added

image

The solution is ICE, STUN and TURN!

image

image

First there’s signalling that goes via the Cloud

  • For SfB signalling is done via SIP
  • For Teams it’s not SIP: it’s a REST API over HTTPS, plus web sockets for more persistent comms. No more SIP

BUT

In terms of ICE they are very similar

image

  • Now we have the STUN and TURN servers. These function as a relay: if a client wants to talk to someone but can’t reach them directly, it can use the STUN and TURN server as a relay
  • at the same time they help us find our public IP address, and cause the NAT to allow incoming traffic
  • the client sends a packet to the relay server to allocate candidates; the relay sends a packet back saying which public IP it saw, so the client knows “this is my public IP, and maybe I can accept traffic there”
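
As a rough illustration of the “what is my public IP” step, here is a small Python sketch of the standard STUN message layout from RFC 5389. This is my own assumption for illustration: SfB and Teams use Microsoft’s own variants of these protocols, so the real packets will differ. Nothing is sent on the network; the sketch only builds a Binding Request and shows how the XOR-MAPPED-ADDRESS attribute in a reply carries the public address. The port number is made up.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def encode_xor_mapped_address(ip, port):
    """Encode an IPv4 XOR-MAPPED-ADDRESS attribute, as a server would."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, xport, xaddr)  # 0x01 = IPv4 family
    return struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value


def decode_xor_mapped_address(attribute):
    """Recover (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute."""
    attr_type, length = struct.unpack("!HH", attribute[:4])
    if attr_type != XOR_MAPPED_ADDRESS or length != 8:
        raise ValueError("not an IPv4 XOR-MAPPED-ADDRESS attribute")
    _reserved, family, xport, xaddr = struct.unpack("!BBHI", attribute[4:12])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    port = xport ^ (MAGIC_COOKIE >> 16)
    ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return ip, port


request = build_binding_request()
reply_attribute = encode_xor_mapped_address("91.205.175.103", 50010)
public_ip, public_port = decode_xor_mapped_address(reply_attribute)
```

The decoded address is what the client would then advertise as a reflexive candidate.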

image

image

and ICE

image

  • Calls to PSTN via Office 365 use ICE
  • ICE is used for all real-time modalities
  • In Teams, files are uploaded to OneDrive for Business

Relay – very important for ICE negotiations

image

Two types of Relay

  • Media relays
  • transport relays

The media relay component was built for Skype for Business Server, where it was the Edge server. It was moved to the cloud but wasn’t built for the cloud, so a cloud solution was born.

The transport relay is built for scale and more flexibility.

image

The media relay is static in one datacenter: if you’re in Orlando and your media relay is in Europe, traffic travels back to Europe to use the relay.

Transport relays are much smarter and use dynamic discovery via anycast.

If I travel to Orlando, I can use a transport relay in the US, not Europe.

image

image

This is why local internet connections are important: without them you may not be able to take advantage of the nearest transport relay and keep traffic local.

image

View the other two Ignite sessions as well

image

  • Media relay uses the same UDP ports for all workloads
  • Transport relay uses a different UDP port per workload

image

Skype for Business uses Media Relay

Transport relay is still in progress for SfB, but it is already in use with Teams

Teams always transport relay!

image

  • One IP for all Anycast servers
  • the closest server, with the fewest hops, is always used
  • based on endpoint location and privacy boundaries
  • US government cloud uses only US
  • Tenant in EMEA
  • all traffic encrypted with Key

image

Anycast is based on ECMP, which makes it easy to distribute load

super easy to manage

image

image

5 phases of ICE

1. Request credentials

2. Candidate discovery: once I know where I can be reached, I send that to the other client

3. Candidate exchange, then try to establish a connection

4. Connectivity checks

5. Candidate promotion: select the best media path

image

Sign in to the service and, from signalling, learn which relay is configured for me.

image

Option 1: SfB Online using the media relay, or Lync 2010. Lync 2010 always uses the media relay.

image

Option 2: SfB Online with Lync 2013 or newer

image

Teams always uses TRAP!

First Demo!

Snooper

image

Snooper shows the different SIP dialogs, with the SIP headers on the left and the details on the right.

Look for MRAS

image

First incoming 200 OK – in band provisioning

image

Learn Audio ports range

We’re interested in MRAS: here we have a relay configured. Office 365 should always have this!

image

Next is the service request: there is a relay configured, with credentials.

image

Valid for 480 minutes – 8 hours (SfB)

Teams valid 24 hours

Next Credential Response

image

Here are the credentials. The service used its own certificate to create them, and if the relay is used the client will present them.

Media relay list

image

We learn which media relay to use, plus the username, password and ports.

image

Only one relay is listed: Office 365 will only show the external media relay.

That was for SfB, but for Teams it’s more tricky!

image

For Teams there are no nice tools to read logs. All traffic is HTTPS and sometimes web sockets, so you need a local proxy that performs a man-in-the-middle interception, and you need to trust its certificate.

The demo uses the Charles web proxy, which has a sequence view and a structure view.

image

image

image

image

The relay address is not an FQDN, it’s an IP address, which is different to the media relay.

The service just tells the client the IP directly, so it’s faster.

image

image

  • Now I need to discover my IP addresses
  • the first candidate is always the local interface address
  • then I ask the relay, which allocates a candidate for me
  • and then the relay sends its candidates

image

then the same for TCP

image

Always prefer UDP, but TCP can be used as it’s better than no call at all!

image

It’s 3478 no matter the workload in Teams at the moment! The 4478 on the slide above should be 3478; that’s a mistake on the slide.

Candidates

Some SfB workloads always use TCP: 1:1 file transfer and desktop sharing via RDP.

image

image

image

  • send a message to the peer I want to talk to
  • then the other endpoint does the same, with where it can be reached
  • then the person picks up, and this is the endpoint we’re talking to

Let’s look at these logs

Back to Snooper

image

We can see from the INVITE that Martin calls Thomas.

image

we can see this was an audio call and the candidates

image

scroll down and there’s more information

we can see the codecs Martin supports

image

Let’s look at the candidates again.

The first values are 1 and 1. Candidates come in pairs, one for RTP and one for RTCP.

image

then UDP

image

Then priority: the higher the number, the more I want to use this candidate

image

Then IP Address

image

This is the IP of this actual candidate

then ports

image

then Type

image

here we have host and we know this is the local ip address of the endpoint!

there are other interesting types

image

There’s srflx with raddr: this is where I send a packet to the relay and the relay tells me the address it saw.

image

The raddr matches the host address: the relay is saying that when you send messages from 192.168.1.110, they appear to come from 91.205.175.103.

image

then relay address

image

If I can’t establish a direct connection or use the srflx address, the other side may still be able to talk to my relay address.
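
The fields walked through above can be pulled apart with a few lines of Python. This is a minimal sketch, assuming the standard ICE field order for `a=candidate` lines. The two sample lines are illustrative: they reuse the IP addresses mentioned above, but the ports and priorities are made up, and real log lines may carry extra fields that this sketch simply ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    foundation: str
    component: int                       # 1 = RTP, 2 = RTCP
    transport: str                       # e.g. UDP
    priority: int                        # higher means more preferred
    address: str
    port: int
    type: str                            # host, srflx, prflx or relay
    related_address: Optional[str] = None
    related_port: Optional[int] = None


def parse_candidate(line):
    """Split one a=candidate line into its named fields."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        raise ValueError("not a candidate line: " + line)
    fields = line[len(prefix):].split()
    if len(fields) < 8:
        raise ValueError("too few fields: " + line)
    foundation, component, transport, priority, address, port = fields[:6]
    # Everything after the port comes as "name value" pairs (typ, raddr, rport).
    extras = dict(zip(fields[6::2], fields[7::2]))
    if "typ" not in extras:
        raise ValueError("candidate type missing: " + line)
    rport = extras.get("rport")
    return Candidate(
        foundation=foundation,
        component=int(component),
        transport=transport,
        priority=int(priority),
        address=address,
        port=int(port),
        type=extras["typ"],
        related_address=extras.get("raddr"),
        related_port=int(rport) if rport is not None else None,
    )


# Illustrative lines only: ports and priorities are invented.
HOST_LINE = "a=candidate:1 1 UDP 2130706431 192.168.1.110 50000 typ host"
SRFLX_LINE = ("a=candidate:2 1 UDP 1694498815 91.205.175.103 50010 "
              "typ srflx raddr 192.168.1.110 rport 50000")

host = parse_candidate(HOST_LINE)
srflx = parse_candidate(SRFLX_LINE)
```

For the reflexive candidate, `related_address` is the local address the packet was sent from and `address` is what the relay saw from the outside.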

also IPv6 candidates

image

TCP passive and active candidates

image

A TCP passive candidate is able to receive connections, and active and passive candidates are matched with each other.

overall looks

image

Now there’s the 183 Session Progress, back from the called party: “here is my information”.

image

There are two here, but why?

We see one from Skype for Business

image

and the other is also coming from SfB, but on an Android phone

image

If the user has more than one device, we establish a media session with all of them.

Now in the incoming message there are no more pairs.

image

Here we have rtcp-mux (multiplexing): I send the old version, and say “hey, I know the new version as well”.

image

Another interesting thing is the encryption. We can see here the crypto line with the suite and key: this is how the two endpoints encrypt the traffic. They let each other know the cipher and key via the secure signalling channel, so only the two endpoints know how to decrypt the traffic. The relay never sees this and just passes the packets on.

image

image

MRAS allows endpoints to allocate candidates

No encryption of traffic

image

Connectivity Checks

Now each one knows where the other can be reached, and will work through all possible UDP and TCP port pairings.

IPv4 and IPv6

For SfB the relay can’t bridge TCP and UDP: if one SfB client can only talk TCP and the other can talk UDP and TCP, the whole call needs to be TCP.

In Teams one can talk UDP and the other TCP and the relay will translate

We find out which candidate pairs work, prioritise them, and the most optimal one is the one we use for the call.

We cannot see these checks in Snooper or Charles.

image

After the other person picks up, the best candidate is identified, and then we can see which one it is.

IPv4 over IPv6

UDP over TCP

Prefer more direct path
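
Here is a hedged Python sketch of how those three preferences could be turned into a ranking of the candidate pairs that passed connectivity checks. It is only an illustration: the dictionary keys, the numeric ranks and the order in which the three rules are applied are my assumptions, not something shown in the session, and real ICE implementations work from the numeric candidate priorities instead.

```python
# Lower rank means a more direct path. These values are assumptions for the sketch.
TYPE_RANK = {"host": 0, "srflx": 1, "prflx": 1, "relay": 2}


def preference_key(pair):
    """Sort key for a working candidate pair: smaller tuples are preferred."""
    directness = max(TYPE_RANK[pair["local_type"]], TYPE_RANK[pair["remote_type"]])
    transport = 0 if pair["transport"] == "UDP" else 1   # UDP over TCP
    ip_version = 0 if pair["ip_version"] == 4 else 1     # IPv4 over IPv6
    return (directness, transport, ip_version)


def pick_best(working_pairs):
    """Return the most preferred pair among those that passed connectivity checks."""
    return min(working_pairs, key=preference_key)


# Made-up example: three pairs that all passed their checks.
working = [
    {"local_type": "relay", "remote_type": "host", "transport": "UDP", "ip_version": 4},
    {"local_type": "srflx", "remote_type": "srflx", "transport": "UDP", "ip_version": 6},
    {"local_type": "srflx", "remote_type": "srflx", "transport": "UDP", "ip_version": 4},
]
best = pick_best(working)
```

In this made-up example the IPv4, UDP, reflexive pair wins on every rule, so the result does not depend on the order in which the rules are applied.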

Look for the re-INVITE in the logs: it contains only the one candidate that will be used for this call.

image

TCP is a very good protocol as it protects against lost packets and lost information: if I send a packet I get an acknowledgement, and if I don’t get it I wait and then resend the packet. But this takes time, and in real-time comms we want to make sure traffic gets there as fast as possible. We don’t like lost packets, but a packet may contain 20 ms of voice; you may not hear that it is missing, and codecs are smart and can recover.

TCP resends lost packets, which adds delay.

UDP’s fire-and-forget approach is ideal for real-time communications.

image

Let’s look at the final candidates.

Before that, let’s look at the Teams candidates.

In Charles, search for a=candidate

image

image

select conversation

image

It’s one super long line!

image

Each \r\n in the text is a line break.

Copy and paste it into a text editor and replace each \r\n with a real line break, and this gets you the below.
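
If you would rather script the same find-and-replace, here is a minimal Python sketch. The function names are my own and the sample text is a made-up fragment, not a real Teams capture; it only assumes that the captured text contains the four literal characters \r\n wherever a line break belongs.

```python
def unescape_sdp(blob):
    """Turn each literal backslash-r backslash-n sequence into a real line break."""
    return blob.replace("\\r\\n", "\n")


def candidate_lines(blob):
    """Return only the a=candidate lines from the one-line capture."""
    return [line for line in unescape_sdp(blob).splitlines()
            if line.startswith("a=candidate")]


# Made-up fragment standing in for the one super long line copied from the proxy.
sample = (r"v=0\r\nm=audio 50000 RTP/SAVP 104\r\n"
          r"a=candidate:1 1 UDP 2130706431 192.168.1.110 50000 typ host\r\n"
          r"a=candidate:2 1 UDP 1694498815 91.205.175.103 50010 typ srflx "
          r"raddr 192.168.1.110 rport 50000\r\n")

readable = unescape_sdp(sample)
found = candidate_lines(sample)
```

Printing `readable` gives one SDP line per row, and `found` holds just the candidates.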

image

not super nice to read but

scroll down and we can see info on codecs

image

It looks similar, BUT

the relay candidates use ports based on workload

image

here we have 3480 not the high ports

image

The other interesting thing is that all relay candidates come with an MTURN ID. This is for security: it controls who can access my allocation. In SfB we used the huge port range, and when someone wanted to allocate a candidate we randomly picked a port that was only open for a short time, which gave some additional security. Now that the same port is used for all connections anyone could try to go there, but they can’t get through, as they need the MTURN ID to connect to that port.

image

Back to Snooper for the final candidate for SfB.

Search for a=remote-candidate

image

contains 1 candidate

image

It’s the prflx candidate, meaning peer reflexive: whoever I’m talking to is talking to my NAT device. The IP is the same as the server reflexive candidate but the port is different.

image

if we look at 200 OK

image

we can see here remote candidate is the relay, this client is talking to the relay.

image

So we have media going from the calling person to the relay of the called person, with one relay in the media path. Now we can understand how traffic is flowing.

Call Flows

image

image

As mentioned before, for a 1:1 call we want to send media as directly as possible. It’s different for a meeting, as the cloud needs to mix the media.

We have two SfB clients and their own relays, with 443 and 3478-3481.

Both connect to their relay to allocate candidates on port 443 TCP or 3478 UDP. For UDP the client is then redirected to the workload port, 3479 for audio.

image

Next they try to establish a direct call, as that’s the best option.

image

At the same time they try to talk via the relay.

image

Now the calling client tries to connect to the called client’s relay on the 50K port range, as that was the candidate allocated there.

then we do the same for the other relay

image

If they all work then fantastic, we can pick direct.

If direct doesn’t work we pick the relay of the called client, or if that doesn’t work we use the calling client’s relay.

And if both don’t work then the relays need to talk to each other! This is why it’s still useful for SfB if the 50K port range is still open. With the 50K port range open, calls can establish over one relay; if you close it, as Microsoft recently said it’s not required anymore, then you have two media relays in the media path.
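
The fallback order just described can be written down as a tiny decision function. This is only a sketch of the order as I noted it from the session: the function name and the three boolean inputs are hypothetical, standing in for the results of the connectivity checks.

```python
def choose_media_path(direct_ok, called_relay_ok, calling_relay_ok):
    """Pick a media path for a 1:1 call from the connectivity check results."""
    if direct_ok:
        return "direct"
    if called_relay_ok:
        return "relay of called client"
    if calling_relay_ok:
        return "relay of calling client"
    # Last resort: both relays end up in the media path.
    return "relay to relay"


everything_works = choose_media_path(True, True, True)
fifty_k_closed = choose_media_path(False, False, False)
```

The last case is the one you get in SfB when the 50K port range is closed and nothing more direct works: two relays in the media path.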

image

They looked at the difference in quality if you close 50K, and it’s not that big a difference, although call setup may be quicker. If you don’t have the ports open it seems not essential, BUT if they are already open then there’s no reason to close them.

TEAMS

Similar concept

image

They connect to the relay on 443 TCP or 3478 UDP. To their own relay they always talk on 3478.

test direct

image

Then the other one via the relay on 3479-3481, depending on workload

image

other relay will be tested

image

and if all of that doesn’t work they could still talk to each other

image

SfB and Teams side by Side for 1:1 (Peer to Peer)

image

SfB – Client to Service

image

The server here is a Mediation Server or conferencing server.

The server is on the right side as it’s internal to the network.

The client talks to its own relay on 443 TCP or 3478-3481 UDP.

image

server does the same

image

Now the client will try to talk directly to the server. If it’s not firewalled this may be possible, but it can’t be guaranteed.

image

If it doesn’t work then we use the relay of the called endpoint, which is the server’s.

image

If that doesn’t work we can talk to the relay of the end user.

image

You should not see two relays here, as the 50K port range is open on the cloud service side.

Teams: Client to Service

image

The Teams client allocates candidates

image

The service never allocates candidates: we know the service can talk to the relays, so it doesn’t need its own relay.

Again we try a direct connection first, in case direct works.

image

Otherwise the Teams client talks to its assigned transport relay, and the service component talks to the same relay.

Bringing that all together in a single table!

image

On the left we have the workloads: allocate candidate, audio, video, desktop sharing.

Across the top: Teams and SfB client ports, and the service ports for media relay and transport relay.

The SfB client honours the configured client ports per workload, even while allocating candidates. Allocation goes to 443 TCP or 3478 UDP on either the media relay or the transport relay. Once SfB establishes audio, it is sent to 443 TCP or 3478 UDP on the media relay, or to 3479 UDP on the transport relay.

The Teams client source port is always 1024 and up. The plan is to change this to be similar to SfB, so you can look at traffic and see which workload it is.

From Teams client to transport relay it is always UDP 3478 for now. The plan is to change this too, so you can tell the workload from the source and destination ports. They are still working on this.

image

Direct connectivity is required: every client needs to connect directly to Office 365, to the transport or media relay, so it can establish a media path.

  • no proxy
  • no shaping
  • no deep packet inspection
  • If possible use local internet breakout, take the shortest route to the transport relay and let traffic route over the Microsoft network
  • Prefer UDP over TCP – better for real time
  • TCP can be used as backup, and in SfB it is used for some scenarios
  • Look at the documented list of IPs and FQDNs that your environment needs to be open to
  • aka.ms/o365endpoints
  • it’s quite a list and is updated a lot, so subscribe to the RSS feed!
  • Open UDP ports

If people set up SfB a year or more ago, they will have opened 443 (not changed) and 3478 for media, but in the past 3479-3481 UDP weren’t needed, so these may not be open.

Problems have been seen with transport relays where the client tries 3478 and it works, then allocates candidates and is told to talk to this IP BUT on port 3479, 3480 or 3481, which could be blocked. The firewall may block this and UDP will FAIL! Now media goes over TCP. No one will call and say calls don’t work, but quality may be worse!

Be sure all the UDP ports ARE OPEN!
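
As a quick sanity check of a firewall rule set, here is a small Python sketch that reports which of the four UDP ports are not covered by a list of allowed port ranges. The function name and the input format are hypothetical. The audio label for 3479 comes from the session; the video and sharing labels for 3480 and 3481 follow Microsoft’s published port guidance rather than my notes. This only checks a list you type in, it does not test the network.

```python
REQUIRED_UDP_PORTS = {
    3478: "candidate allocation",
    3479: "audio",
    3480: "video",
    3481: "sharing",
}


def missing_udp_ports(allowed_ranges):
    """Return the required UDP ports not covered by any (low, high) range."""
    missing = []
    for port in sorted(REQUIRED_UDP_PORTS):
        if not any(low <= port <= high for low, high in allowed_ranges):
            missing.append(port)
    return missing


# An older SfB-era rule set that only allowed 3478 outbound.
old_rules = [(3478, 3478)]
still_blocked = missing_udp_ports(old_rules)

# After opening the full range.
new_rules = [(3478, 3481)]
nothing_blocked = missing_udp_ports(new_rules)
```

With the older rule set the allocation on 3478 succeeds but the workload ports are blocked, which is exactly the silent fallback to TCP described above.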

image

For Skype for Business hybrid, your on-premises servers need to talk to Office 365, but they don’t need the new ports 3479-3481; those are just for client to service.

The Edge server will still talk on 3478.

Tools and Troubleshooting

image

image

SfB is super easy: you have Uccapilog.log and Snooper.

Teams – not so easy!

You need to do a trick with a local proxy that performs a man-in-the-middle interception to collect the traffic. Examples are Fiddler and Charles proxy.

For SfB, turn on logging.

image

You may need to delete the logs, then sign out and sign back in, to start with clean logs.

image

image

Search tips

a=candidate

a=remote-candidate

When you reproduce a problem and want to see the final candidates, it may take 7-10 seconds after someone answers for them to appear. So leave the call running for 20 seconds before disconnecting, to make sure the final candidates are there.

The reason is that when the other person picks up, the call may not start over the optimal candidate. In the background the clients may still be testing for a better connection and then switch to it. Once the final candidate pair is listed, it won’t change.

image

Tips for the capture: web sockets can be very persistent, and in testing it was hard to capture them every time. Even after closing Teams and starting it again, sometimes you would see them and sometimes not.

This is how Teams does it today, but it may CHANGE!

image

There’s also CQD, the Call Quality Dashboard. After every call, the client reports the call quality experience, IPs and ports over signalling.

image

You can look at the data, create filters, and compare UDP calls and TCP calls. You shouldn’t see a lot of TCP calls.

Practical guidance on CQD.

image

image

Filters created on this example as below

image

then report created

image

lots of TCP but that’s on App sharing so that’s expected in SfB

Very few sessions are using VBSS and there seems to be a lot of RDP going on, which could be people giving control, or old clients.

image

You can investigate client types and check whether a client supports only RDP.

image

Other report with filters applied on the left

image

Subnets have been replaced to hide customer data.

You can compare subnets by number of TCP and UDP calls.

Find the top offending subnets and find out why there is so much TCP traffic.
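
If you export the data, the same “top offending subnets” view can be reproduced in a few lines of Python. This is only a sketch: the input is a hypothetical list of (subnet, transport) pairs with made-up subnets, not the actual CQD export format, which has many more columns.

```python
from collections import Counter


def tcp_share_by_subnet(streams):
    """Return (subnet, share of TCP streams) pairs, worst offenders first."""
    totals = Counter()
    tcp = Counter()
    for subnet, transport in streams:
        totals[subnet] += 1
        if transport.upper() == "TCP":
            tcp[subnet] += 1
    shares = {subnet: tcp[subnet] / totals[subnet] for subnet in totals}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Made-up records: one row per media stream.
streams = [
    ("10.0.1.0", "UDP"),
    ("10.0.1.0", "TCP"),
    ("10.0.2.0", "UDP"),
    ("10.0.2.0", "UDP"),
    ("10.0.3.0", "TCP"),
]
ranking = tcp_share_by_subnet(streams)
```

The subnet at the top of `ranking` is the first place to go looking for a blocked UDP port or a proxy in the path.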

image

Test that ports are open

The SfB Network Assessment Tool sends real media to the transport relay and collects information on jitter, delay and packet loss.

However, SOON a new version will be available that tests connectivity for TCP and UDP ports! Run it from a PC to find out whether it can connect to the required ports.

image

image

image

It tests all the ports against a set of IPs that is downloaded at run time, so the IPs are always up to date. For any connectivity issue, this tool is great to run on a PC to test connectivity.

There might be situations where connectivity is working but something in the way corrupts packets.

If the tool’s tests pass, then perhaps trace a call.

Resources and summary

image

image

image

  • Now we understand the challenges
  • find most optimum media path
  • use tools
  • Traffic peer to peer
  • client to server
  • Leverage local internet if possible
  • Open 3478-3481 UDP on firewall !

image
