Tuesday, 8 October 2019

Microsoft Teams Bandwidth Usage Deep Dive


I was recently reading this Microsoft Docs article called “Prepare your organization's network for Microsoft Teams” which says the following:

“The actual bandwidth consumption in each audio/video call or meeting will vary based on several factors, such as video layout, video resolution, and video frames per second. When more bandwidth is available, quality and usage will increase to deliver the best experience.”

I found the comment about bandwidth availability to be interesting and wondered how this actually functioned in practice. So I did (what turned out to be) a lot of testing to see exactly how the client behaves under various network and policy constraints and I thought the outcomes were interesting enough to write an article about. So steel yourself and lie back and think of Redmond as they say in the classics…

Note: The testing in this article is a point-in-time sample of how the product worked when the testing was done (Late September 2019). Office 365 is constantly being updated and things may change over time. If you have seen things operate differently or Microsoft provide information contrary to that documented here let me know.

Settings


The Teams service, at the time of writing this article, does not support individual Call Admission Control features. However, in the Teams Portal Meeting Policy there is a setting called “Media bit rate (KBs)”. This setting is shown below:



A bugbear I have with the naming of this setting is that it includes “KBs“, which to my eyes means Kilo Bytes per Second (ie. a capital ”B” means Bytes and a lower case “b” means bits). However, this is not the case - this setting is actually configured in Kilo bits per second (ie. Kbps). I have lobbied Microsoft to change this. We’ll see how that works out.

As you will see the default value for this setting is 50000 which seems kind of crazy because this is 50Mbps. In my opinion, this probably would have been better being represented as “Max” or “Unlimited” or anything else rather than this very high number. I can imagine the poor consultant that goes into the meeting with the networking team and says that the Teams client is currently configured to use 50MBs worth of bandwidth per user… Oh, to be a fly on the wall in that meeting… The truth is that this setting is basically telling the system to place no restraint on the maximum bit rate used by the Teams client or service (as 50Mbps will never be reached). So in practice, a Media Bit Rate setting of 50000 means that the Teams client and Teams service will function without any networking constraints. This means that Teams will try and use the maximum audio codec and video codec resolution and frame rate that it can. Read on to see how the client behaves under network constraints.

The example below shows what a setting of 800Kbps looks like:


Testing for this Article


During the writing of this article I did a great deal of testing using the Media Bit Rate settings within the Meeting Policy and by using networking tools to artificially limit network traffic to test the effects on the Teams client. During the testing the bit rate in and out of the PC running teams was monitored using Wireshark’s IO Graph feature which essentially counts the number of bits in each packet captured by a specific Wireshark filter. In this case the filter was capturing traffic based on Microsoft’s Teams media IP Address range 52.114.x.x to see exactly how much traffic was being transferred up and down from the cloud in each scenario.

During the tests the Teams Meeting Policy Media Bit Rate setting was used in conjunction with limiting bandwidth using a network emulator tool to see how the client behaved under different circumstances. In most cases a scheduled meeting was used with two parties joining from two clients running on PCs with typical i7 Intel CPUs you would expect to see in a business laptop. For consistency, each test meeting had the same 1080p video being sent via VBSS screen sharing from a PC that did not have any bandwidth constraints applied to it. All bandwidth limiting was only done on the PCs that was receiving the screen sharing traffic in order to avoid any compounding quality issues caused by limiting both client streams. The bandwidth limited client was viewing the screen shared video over the constrained link so the changes in quality could be monitored. Both of the clients also had their video cameras enabled which was displayed as thumbnails down the bottom right of the screen, however, this was covered up by the bandwidth graph. Be aware that this bandwidth was also there in the test calls.

During the testing and in the test videos Wireshark’s IO Graph is displayed in the bottom right hand side of the screen. For most of the testing this graph displayed with an SMA (Simple Moving Average) of 10 in order to smooth the graphs to see the average bit rate making it easier to compare with the Media Bit Rate value set in the Meeting Policy.

The videos supplied in this article are directly screen captured using OBS software running at 1080p and with 44khz audio. It offers a reasonable representation of the quality actually seen on the laptop screen, with perhaps only a small bit of additional compression distortion. When watching these videos on YouTube ensure that you have the quality set to 1080p in the video settings so you’re seeing an accurate representation of the video and are not adding additional artifacts when watching the YouTube stream.

For future reference the software versions used during testing for this article were:
  • Microsoft Teams Version 1.2.00.24753 (64-bit). It was last updated on 9/12/19
  • Microsoft Edge version 41.16299.1004.0
  • Google Chrome version 77.0.3865.90
  • iOS Mobile Client version 1.0.85

Network Based Bandwidth Limiting


The cloud has now ushered in a new world where applications can connect to O365 from basically anywhere that has Internet access. This means that administrators don’t have the luxury of knowing that their perfectly manicured and QoS enriched networks are always being used by applications anymore, so it is important to understand how applications will behave under various non-perfect network conditions. With that framing in mind, I wanted to see exactly what the Teams client does when you limit its bandwidth to O365.

When left at default settings (Media Bit Rate of 50,000 Kbps) the Teams client, practically speaking, does not have a fixed ceiling on the bandwidth that it can use for a call/meeting. The limits are simply based around whatever the maximum data rate of the audio and video codecs running at their maximum resolution and frame rate will allow. In reality, the client will only reach the maximum data rate of the codec whilst keeping under a certain packet loss rate (and potentially other media quality thresholds that I can’t easily test). In order to show the way this works in practice the following test case shows a meeting where the bandwidth available to the Teams client starts off at an unlimited amount followed by the network being limited all of a sudden to 500Kbps:

Bandwidth conditions change (22 secs into the video) during a meeting change from unlimited bandwidth to 500Kbps:


Note: Set the YouTube video quality to 1080p when watching the videos.

The bandwidth limit is applied at 22 seconds into the video right when the man waves after joining the meeting. The video freezes at this point for about 8 seconds as the Teams client recognises that the loss rate has increased to a point where it ramps down the video resolution and frame rate to try and stem the losses. When a suitable new bit rate has been found the Teams client will then use this new bit rate until more bandwidth becomes available. In a scenario where the client software doesn’t have this capability to ramp down the data rate, and the the original stream resolution/data rate was maintained, the video playback would not have recovered. In addition to this, the already saturated network would have continued to deliver a high data rate stream that would further choke the existing bottleneck in the network. If you consider the scenario when you might have tens or hundreds of Teams clients in a meeting that crosses the same network bottlenecks to get to O365, this ability to ramp down the bandwidth and find a safe balancing point for all users allows for meetings to continue where they might otherwise have failed due to overwhelmed network infrastructure.

What you might see in the field is that users are complaining that the video or screen sharing quality is not very good (ie. low resolution and low frame rate video). In this case it is highly likely that the client has been forced to decrease the stream quality to deal with limited data network bandwidth. So you should confirm that the network links that the Teams clients are using to get to O365 are not saturating and reporting data loss on them. You could potentially get a user (or someone capable of testing the network) to use a tool like the Skype for Business Network Assessment Tool or my Network Assessor Tool to check the link quality to O365. In the case where links are maxing out you might start to see results going over the recommended thresholds, like in this example:



What happens when bandwidth becomes available again?

Now that we know that the client can reduce its bandwidth usage when the network is congested, what happens when the network bandwidth becomes available again? To test this case I started the meeting with a 200Kbps bandwidth limit and then released the limit back to unlimited bandwidth to see how long it took to ramp the data rate back up again:

Bandwidth conditions change during a call from constrained 200Kbps to unlimited bandwidth (at 10 seconds into the video):


Note: Set the YouTube video quality to 1080p when watching the videos.

This case operates slightly differently to the previous test. In the previous test it was obvious to the client that the bandwidth was dramatically reduced because it would have almost instantly seen the packet loss rate go through the roof. As a result, the client could swiftly ramp down the data rate until the packet loss stopped. However, in the case of bandwidth becoming available again, the client needs to slowly ramp up the data rate and monitor the packet loss rate to see that it hasn’t put too much traffic on the link. If this is done too fast then the flooding of the network with more traffic could be very detrimental and lead to data link queuing and dropping traffic again. In the video above you will see after removing the bandwidth constraints at 10 seconds it takes the client about one minute to ramp the data rate back up. In practice this could be seen as being a bit problematic if there are data links that frequently get congested as the client will take much longer to increase the bandwidth than reduce it.    

How does the client determine when to ramp down the bandwidth?

In the testing above it appears that the client uses packet loss to determine when the client will reduce its data rate. In the following test a random packet loss algorithm was put on the network link to test this theory. There may actually be more thresholds like jitter that are also taken into account here,
however, packet loss is easiest to use for testing purposes. The following cases show how adding 40% of random packet loss will cause the Teams client to ramp down its data rate:   


Note: Set the YouTube video quality to 1080p when watching the videos.

It’s difficult to know exactly what the thresholds are that the client uses here, however, you can see in the video if you increase the packet loss enough the Teams client will start lowering the data rate to try and reduce load on the network and hopefully stop the packet drops from happening. Note: Please ignore the quality of the video in this case as there is a constant random drop of 40% that continues for the whole video. The point here is just to see what the Teams client and service will try and do in this circumstance. 


Policy Base Bandwidth Limiting


By default the Teams client will use as much bandwidth as the Meeting Policy allows for its meetings and peer to peer calling. The setting in the Meeting Policy that controls this limit is called “Media bit rate (KBs)”. As mentioned earlier, I take a bit of issue with the way this setting is labelled currently with KBs. This nomenclature indicates to me that it’s a value in Kilo Bytes per second, however, this setting is actually interpreted by the system as kilo-bits per second (Kbps). I have raised this typo with Microsoft so it may get fixed at some point but for now don’t misinterpret this to mean Bytes as it is in fact bits. By default this setting is set to 50,000 which means that 50Mbps is allowed for media streams. This value is the equivalent of saying that unlimited bandwidth is available because 50Mbps is not going to be reached in any practical scenario.

When the Media Bit Rate setting is used then the client and O365 meeting service are told to enforce some average bit rate values when generating video and audio streams. This value is enforced in both directions, so for example if you set a value of 800Kbps then it would allow a stream of 800kbps from the client up to O365 and an 800Kbps stream down from O365 to the client. This is a combined total of 1600Kbps for both up and down streams. When you’re talking to networking folks they will usually talk about uni-directional values on full duplex links in describing bandwidth used, so from that perspective it’s described as an 800Kbps stream. The Media Bit Rate basically becomes the new average bit rate ceiling for all calls made from a client with the Teams Meeting Policy assigned to them. If the client experiences packet loss it will still ramp down to a lower resolution and frame rate in the same way as it does when there is no Media Bit Rate policy assigned.

The video below shows a call with an 800Kbps Media Bit Rate set. You will see in this video that the average bit rate (graph is displaying an SMA averaged value) will never go over 800Kbps as a result of the 800Kbps setting being made:

Note: Set the YouTube video quality to 1080p when watching the videos.

You will have noticed I used the term average bit rate here rather than peak bit rate. These two concepts are quite different: average means that over some period of time the bit rate when averaged will turn out to be under a specific value, however peak bit rate means that at any point in time the rate must always be under the bit rate. To show the difference the video below has both an averaged SMA graph and a non-SMA (ie peak style graph) displayed. In the meeting the client had an 800Kbps Media Bit Rate set and at 50 seconds in the network is limited with no queuing (ie. anything over 800Kbps is dropped). For this video the graph is showing an SMA graph of the averaged data rate (Blue) and a non-averaged data rate (Red) to show the true peaks of the data usage:

Note: Set the YouTube video quality to 1080p when watching the videos.

What this shows is that the data rate described by the Data Bit Rate value is actually an average bit rate instead of a peak bit rate. This can be seen by the blue average line never crossing the 800,000 bit (800Kbps) line but the red peak bit rate line spiked significantly above this value. At the 50 second mark the network was strictly limited to 800Kbps and as a result packets from the media stream that went over 800Kbps were dropped. You will see that the video freezes for a couple of seconds at the 50 second point due to this change and the quality of the video dropped slightly. This bandwidth constraint caused the media stream to suffer losses and have to decrease the video quality (resolution/frame rate) to keep the peak rate under 800Kbps which made the average (blue SMA line) drop to about 500Kbps. In practical circumstances the network will likely queue the media stream packets and still deliver them to the other end avoiding the packet loss which led to the resolution/data rate decrease. However, if there are any network devices that are not queuing traffic but instead dropping it then your 800Kbps stream might suffer and be forced to ramp down the bit rate itself to keep the stream at an acceptable level.

What about peer-to-peer calls that go directly between clients - do they also follow this Meeting Policy?

The answer here is - yes they do. The video below shows a peer to peer call between two Teams clients with the Media Bit Rate set at 800Kbps. You will see that the average bit rate never goes over 800Kbps for this call:

Note: Set the YouTube video quality to 1080p when watching the videos.

Do the Browser based or Mobile clients follow the Media Bit Rate setting?

From my testing of the Google Chrome and Edge (classic) browser, they do not follow this setting. This can have a profound effect on your network if you think that you’ve limited all clients to a specific bit rate only to find out that there are some clients on the network still running with unlimited bandwidth ceiling. Below is the Chrome and Edge browsers tests I ran to prove this:

Meeting using Chrome Browser with 800Kbps policy limit set:

Note: Set the YouTube video quality to 1080p when watching the videos.

Meeting using Edge (Classic) Browser with 800Kbps policy limit set:
Note: Set the YouTube video quality to 1080p when watching the videos.

In addition to this, I tried the iOS Teams client to see if it would have its Bit Rate limited by the Media Bit Rate policy setting. It turns out that it also doesn't appear to follow this setting. Note: In this video i'm doing a screen mirror from the iPhone to my PC and getting a Wireshark capture (of the traffic to O365 from the phone) of the traffic off a monitor port of a network switch the traffic was travelling through. The quality of the video being displayed is not as good as was actually displayed on the phone due to it being streamed over the screen mirroring protocol, but the bit rate usage displayed should be accurate.

Meeting using iOS (iPhone 7) mobile client with 800Kbps policy limit set:


Note: Set the YouTube video quality to 1080p when watching the videos.

The Wrap Up


If you have taken the time to read this entire 3000 words article and watch the videos, congratulations! I hope that you’ve learnt a thing or two about how the Teams client and Teams service will behave under different network conditions. Doing all this testing has definitely given me some incites on how the client and service actually use bandwidth as apposed to how the maximum and average codec bandwidth tables in the Microsoft documentation describe the usage. As far as the Media Bit Rate setting goes, there's not a huge number of cases where I can see this being useful over the automatic bandwidth management provided by Teams. If there was a case where you had constrained links and you really didn't want Teams taking bandwidth over a specific ceiling bit rate then you could use this setting. However, it's a pretty restrictive and will set a quality ceiling for all calls for users with the Policy applied to them. Not to mention, if the user moves to another location they will have this limit following them everywhere, which is not ideal. With the setting available, my recommendation is to let Teams do it's thing. If you're getting complaints about video quality on calls then you need to look into where there are bandwidth bottlenecks on your network and see if bandwidth can be increased to meet the needs of users. Optionally if the bottleneck is on an internal network then Diffserv DSCP values could potentially be used to tune the queues for the Teams media traffic (however, this is perhaps a completely different topic for another time). Till next time, au revoir!



0 comments to “Microsoft Teams Bandwidth Usage Deep Dive”

Post a Comment

Popular Posts