Tuesday, 8 October 2019

Microsoft Teams Bandwidth Usage Deep Dive


I was recently reading this Microsoft Docs article called “Prepare your organization's network for Microsoft Teams” which says the following:

“The actual bandwidth consumption in each audio/video call or meeting will vary based on several factors, such as video layout, video resolution, and video frames per second. When more bandwidth is available, quality and usage will increase to deliver the best experience.”

I found the comment about bandwidth availability to be interesting and wondered how this actually functioned in practice. So I did (what turned out to be) a lot of testing to see exactly how the client behaves under various network and policy constraints and I thought the outcomes were interesting enough to write an article about. So steel yourself and lie back and think of Redmond as they say in the classics…

Note: The testing in this article is a point-in-time sample of how the product worked when the testing was done (Late September 2019). Office 365 is constantly being updated and things may change over time. If you have seen things operate differently or Microsoft provide information contrary to that documented here let me know.

Settings


The Teams service, at the time of writing this article, does not support individual Call Admission Control features. However, in the Teams Portal Meeting Policy there is a setting called “Media bit rate (KBs)”. This setting is shown below:



A bugbear I have with the naming of this setting is that it includes “KBs”, which to my eyes means kilobytes per second (i.e. a capital “B” means bytes and a lower case “b” means bits). However, this is not the case - this setting is actually configured in kilobits per second (i.e. Kbps). I have lobbied Microsoft to change this. We’ll see how that works out.

As you will see the default value for this setting is 50000 which seems kind of crazy because this is 50Mbps. In my opinion, this probably would have been better being represented as “Max” or “Unlimited” or anything else rather than this very high number. I can imagine the poor consultant that goes into the meeting with the networking team and says that the Teams client is currently configured to use 50MBs worth of bandwidth per user… Oh, to be a fly on the wall in that meeting… The truth is that this setting is basically telling the system to place no restraint on the maximum bit rate used by the Teams client or service (as 50Mbps will never be reached). So in practice, a Media Bit Rate setting of 50000 means that the Teams client and Teams service will function without any networking constraints. This means that Teams will try and use the maximum audio codec and video codec resolution and frame rate that it can. Read on to see how the client behaves under network constraints.
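To make the units concrete, here is a small sketch of the conversion. The function name is my own invention for illustration; the only fact taken from the setting itself is that the value is in kilobits per second.

```python
def kbps_to_units(kbps: int) -> dict:
    """Convert a rate in kilobits per second into other common units."""
    bits_per_second = kbps * 1000  # 1 Kbps = 1000 bits per second
    return {
        "Mbps": bits_per_second / 1_000_000,      # megabits per second
        "KB/s": bits_per_second / 8 / 1000,       # kilobytes per second
        "MB/s": bits_per_second / 8 / 1_000_000,  # megabytes per second
    }


# The default Media Bit Rate value of 50000 and the 800 example value.
print(kbps_to_units(50000))
print(kbps_to_units(800))
```

So the default of 50000 is 50 megabits per second, which is 6.25 megabytes per second, not 50 megabytes per second as the "KBs" label might suggest.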

The example below shows what a setting of 800Kbps looks like:


Testing for this Article


During the writing of this article I did a great deal of testing using the Media Bit Rate settings within the Meeting Policy and by using networking tools to artificially limit network traffic to test the effects on the Teams client. During the testing the bit rate in and out of the PC running Teams was monitored using Wireshark’s IO Graph feature, which essentially counts the number of bits in each packet captured by a specific Wireshark filter. In this case the filter was capturing traffic based on Microsoft’s Teams media IP Address range 52.114.x.x to see exactly how much traffic was being transferred up and down from the cloud in each scenario.

During the tests the Teams Meeting Policy Media Bit Rate setting was used in conjunction with limiting bandwidth using a network emulator tool to see how the client behaved under different circumstances. In most cases a scheduled meeting was used with two parties joining from two clients running on PCs with the typical Intel i7 CPUs you would expect to see in a business laptop. For consistency, each test meeting had the same 1080p video being sent via VBSS screen sharing from a PC that did not have any bandwidth constraints applied to it. All bandwidth limiting was only done on the PC that was receiving the screen sharing traffic in order to avoid any compounding quality issues caused by limiting both client streams. The bandwidth limited client was viewing the screen shared video over the constrained link so the changes in quality could be monitored. Both of the clients also had their video cameras enabled, which were displayed as thumbnails down the bottom right of the screen; however, these were covered up by the bandwidth graph. Be aware that this camera video bandwidth was also present in the test calls.

During the testing and in the test videos Wireshark’s IO Graph is displayed in the bottom right hand side of the screen. For most of the testing this graph was displayed with an SMA (Simple Moving Average) of 10 in order to smooth the graph and show the average bit rate, making it easier to compare with the Media Bit Rate value set in the Meeting Policy.
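For anyone wanting to reproduce this kind of graph from exported capture data, the idea can be sketched as follows. This is only an approximation of what an IO graph does: the function names are hypothetical, the packet list is made-up sample data, and Wireshark's exact SMA windowing may differ from the trailing window assumed here.

```python
from collections import defaultdict


def bits_per_interval(packets, interval=1.0):
    """packets: iterable of (timestamp_seconds, size_bytes) tuples.

    Returns a list with the number of bits seen in each interval,
    starting from time zero. With a 1 second interval this is bits/s.
    """
    buckets = defaultdict(int)
    for timestamp, size_bytes in packets:
        buckets[int(timestamp // interval)] += size_bytes * 8
    if not buckets:
        return []
    return [buckets[i] for i in range(max(buckets) + 1)]


def simple_moving_average(values, period=10):
    """Trailing average over the last `period` samples (fewer at the start)."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - period + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up sample: four packets over four seconds.
sample = [(0.1, 1000), (0.5, 500), (1.2, 1500), (3.0, 250)]
raw = bits_per_interval(sample)
print(raw)
print(simple_moving_average(raw, period=2))
```

The raw list corresponds to the spiky "peak style" graph and the smoothed list corresponds to the SMA graph used in most of the videos.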

The videos supplied in this article are directly screen captured using OBS software running at 1080p and with 44 kHz audio. They offer a reasonable representation of the quality actually seen on the laptop screen, with perhaps only a small bit of additional compression distortion. When watching these videos on YouTube ensure that you have the quality set to 1080p in the video settings so you’re seeing an accurate representation of the video and are not adding additional artifacts when watching the YouTube stream.

For future reference the software versions used during testing for this article were:
  • Microsoft Teams Version 1.2.00.24753 (64-bit). It was last updated on 9/12/19
  • Microsoft Edge version 41.16299.1004.0
  • Google Chrome version 77.0.3865.90
  • iOS Mobile Client version 1.0.85

Network Based Bandwidth Limiting


The cloud has now ushered in a new world where applications can connect to O365 from basically anywhere that has Internet access. This means that administrators don’t have the luxury of knowing that their perfectly manicured and QoS enriched networks are always being used by applications anymore, so it is important to understand how applications will behave under various non-perfect network conditions. With that framing in mind, I wanted to see exactly what the Teams client does when you limit its bandwidth to O365.

When left at default settings (Media Bit Rate of 50,000 Kbps) the Teams client, practically speaking, does not have a fixed ceiling on the bandwidth that it can use for a call/meeting. The limits are simply based around whatever the maximum data rate of the audio and video codecs running at their maximum resolution and frame rate will allow. In reality, the client will only reach the maximum data rate of the codec whilst keeping under a certain packet loss rate (and potentially other media quality thresholds that I can’t easily test). In order to show the way this works in practice the following test case shows a meeting where the bandwidth available to the Teams client starts off at an unlimited amount followed by the network being limited all of a sudden to 500Kbps:

Bandwidth conditions change during a meeting from unlimited bandwidth to 500Kbps (at 22 seconds into the video):


Note: Set the YouTube video quality to 1080p when watching the videos.

The bandwidth limit is applied at 22 seconds into the video, right when the man waves after joining the meeting. The video freezes at this point for about 8 seconds as the Teams client recognises that the loss rate has increased and ramps down the video resolution and frame rate to try and stem the losses. When a suitable new bit rate has been found the Teams client will then use this new bit rate until more bandwidth becomes available. In a scenario where the client software doesn’t have this capability to ramp down the data rate, and the original stream resolution/data rate was maintained, the video playback would not have recovered. In addition to this, the already saturated network would have continued to deliver a high data rate stream that would further choke the existing bottleneck in the network. If you consider the scenario where you might have tens or hundreds of Teams clients in a meeting that crosses the same network bottlenecks to get to O365, this ability to ramp down the bandwidth and find a safe balancing point for all users allows meetings to continue where they might otherwise have failed due to overwhelmed network infrastructure.

What you might see in the field is that users are complaining that the video or screen sharing quality is not very good (ie. low resolution and low frame rate video). In this case it is highly likely that the client has been forced to decrease the stream quality to deal with limited data network bandwidth. So you should confirm that the network links that the Teams clients are using to get to O365 are not saturating and reporting data loss on them. You could potentially get a user (or someone capable of testing the network) to use a tool like the Skype for Business Network Assessment Tool or my Network Assessor Tool to check the link quality to O365. In the case where links are maxing out you might start to see results going over the recommended thresholds, like in this example:



What happens when bandwidth becomes available again?

Now that we know that the client can reduce its bandwidth usage when the network is congested, what happens when the network bandwidth becomes available again? To test this case I started the meeting with a 200Kbps bandwidth limit and then released the limit back to unlimited bandwidth to see how long it took to ramp the data rate back up again:

Bandwidth conditions change during a call from constrained 200Kbps to unlimited bandwidth (at 10 seconds into the video):



This case operates slightly differently to the previous test. In the previous test it was obvious to the client that the bandwidth was dramatically reduced because it would have almost instantly seen the packet loss rate go through the roof. As a result, the client could swiftly ramp down the data rate until the packet loss stopped. However, in the case of bandwidth becoming available again, the client needs to slowly ramp up the data rate and monitor the packet loss rate to see that it hasn’t put too much traffic on the link. If this is done too fast then the flooding of the network with more traffic could be very detrimental and lead to data link queuing and dropping traffic again. In the video above you will see after removing the bandwidth constraints at 10 seconds it takes the client about one minute to ramp the data rate back up. In practice this could be seen as being a bit problematic if there are data links that frequently get congested as the client will take much longer to increase the bandwidth than reduce it.    
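The asymmetry between ramping down and ramping up can be illustrated with a toy model. To be clear, this is not the Teams algorithm (which isn't documented anywhere I've seen); it is a generic "cut quickly on loss, climb slowly otherwise" sketch, and every threshold, step size and function name below is an assumption chosen purely for illustration.

```python
def next_rate(current_kbps, loss_fraction, ceiling_kbps,
              loss_threshold=0.05, cut_factor=0.5,
              step_kbps=50, floor_kbps=100):
    """Return the send rate to use for the next interval.

    Heavy loss: cut the rate sharply (multiplicative decrease).
    No heavy loss: probe upwards in small steps (additive increase).
    """
    if loss_fraction > loss_threshold:
        return max(floor_kbps, current_kbps * cut_factor)
    return min(ceiling_kbps, current_kbps + step_kbps)


def simulate(link_kbps_per_tick, start_kbps, ceiling_kbps):
    """Return the rate used in each tick for a changing link capacity."""
    rate = start_kbps
    history = []
    for link_kbps in link_kbps_per_tick:
        # Anything sent above the link capacity is treated as lost.
        loss = max(0.0, (rate - link_kbps) / rate)
        history.append(rate)
        rate = next_rate(rate, loss, ceiling_kbps)
    return history


# Link drops from 2000Kbps to 500Kbps after two ticks.
print(simulate([2000, 2000, 500, 500, 500], start_kbps=1600, ceiling_kbps=1600))
```

In this model two halvings are enough to get under a suddenly constrained link, whereas climbing back from 400Kbps to 1600Kbps in 50Kbps steps takes 24 ticks, which is the same shape of behaviour seen in the two videos.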

How does the client determine when to ramp down the bandwidth?

In the testing above it appears that the client uses packet loss to determine when to reduce its data rate. In the following test a random packet loss algorithm was put on the network link to test this theory. There may actually be more thresholds, like jitter, that are also taken into account here; however, packet loss is the easiest to use for testing purposes. The following test shows how adding 40% random packet loss causes the Teams client to ramp down its data rate:



It’s difficult to know exactly what the thresholds are that the client uses here, however, you can see in the video if you increase the packet loss enough the Teams client will start lowering the data rate to try and reduce load on the network and hopefully stop the packet drops from happening. Note: Please ignore the quality of the video in this case as there is a constant random drop of 40% that continues for the whole video. The point here is just to see what the Teams client and service will try and do in this circumstance. 


Policy Based Bandwidth Limiting


By default the Teams client will use as much bandwidth as the Meeting Policy allows for its meetings and peer to peer calling. The setting in the Meeting Policy that controls this limit is called “Media bit rate (KBs)”. As mentioned earlier, I take a bit of issue with the way this setting is labelled currently with KBs. This nomenclature indicates to me that it’s a value in Kilo Bytes per second, however, this setting is actually interpreted by the system as kilo-bits per second (Kbps). I have raised this typo with Microsoft so it may get fixed at some point but for now don’t misinterpret this to mean Bytes as it is in fact bits. By default this setting is set to 50,000 which means that 50Mbps is allowed for media streams. This value is the equivalent of saying that unlimited bandwidth is available because 50Mbps is not going to be reached in any practical scenario.

When the Media Bit Rate setting is used, the client and the O365 meeting service are told to enforce an average bit rate ceiling when generating video and audio streams. This value is enforced in both directions, so for example if you set a value of 800Kbps then it would allow a stream of 800Kbps from the client up to O365 and an 800Kbps stream down from O365 to the client. This is a combined total of 1600Kbps for both up and down streams. When you’re talking to networking folks they will usually talk about uni-directional values on full duplex links when describing bandwidth used, so from that perspective it’s described as an 800Kbps stream. The Media Bit Rate basically becomes the new average bit rate ceiling for all calls made by a user with the Teams Meeting Policy assigned to them. If the client experiences packet loss it will still ramp down to a lower resolution and frame rate in the same way as it does when there is no Media Bit Rate policy assigned.
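The per-direction versus combined arithmetic can be sketched as below. The function name and the idea of multiplying by a number of concurrent users are my own additions for planning purposes, and the result is only a worst-case average ceiling: it assumes every user is in a call and sitting at the policy limit.

```python
def link_load_kbps(media_bit_rate_kbps, concurrent_users=1):
    """Worst-case average load when every user sits at the policy ceiling."""
    per_direction = media_bit_rate_kbps * concurrent_users
    return {
        "up": per_direction,            # client(s) to O365
        "down": per_direction,          # O365 to client(s)
        "combined": per_direction * 2,  # both directions added together
    }


print(link_load_kbps(800))                       # single user
print(link_load_kbps(800, concurrent_users=25))  # 25 users behind one link
```

For a single user this gives 800Kbps each way and 1600Kbps combined, matching the figures above.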

The video below shows a call with an 800Kbps Media Bit Rate set. You will see in this video that the average bit rate (graph is displaying an SMA averaged value) will never go over 800Kbps as a result of the 800Kbps setting being made:


You will have noticed I used the term average bit rate here rather than peak bit rate. These two concepts are quite different: an average bit rate means that over some period of time the bit rate, when averaged, will be under a specific value, whereas a peak bit rate means that at every point in time the rate must be under that value. To show the difference, the video below has both an averaged SMA graph and a non-SMA (i.e. peak style) graph displayed. In the meeting the client had an 800Kbps Media Bit Rate set and at 50 seconds in the network is limited with no queuing (i.e. anything over 800Kbps is dropped). For this video the graph is showing the averaged SMA data rate (Blue) and the non-averaged data rate (Red) to show the true peaks of the data usage:


What this shows is that the data rate described by the Media Bit Rate value is actually an average bit rate rather than a peak bit rate. This can be seen by the blue average line never crossing the 800,000 bit (800Kbps) line while the red peak bit rate line spikes significantly above this value. At the 50 second mark the network was strictly limited to 800Kbps and as a result packets from the media stream that went over 800Kbps were dropped. You will see that the video freezes for a couple of seconds at the 50 second point due to this change and the quality of the video drops slightly. This bandwidth constraint caused the media stream to suffer losses and forced it to decrease the video quality (resolution/frame rate) to keep the peak rate under 800Kbps, which made the average (blue SMA line) drop to about 500Kbps. In practical circumstances the network will likely queue the media stream packets and still deliver them to the other end, avoiding the packet loss that led to the resolution/data rate decrease. However, if there are any network devices that are not queuing traffic but instead dropping it, then your 800Kbps stream might suffer and be forced to ramp down its bit rate to keep the stream at an acceptable level.
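A quick numeric sketch of the average versus peak distinction is below. The per-second samples are made up, the function name is hypothetical, and the limiter is simplified to "drop whatever exceeds the limit in each one second interval", which is cruder than a real policer.

```python
def summarise(samples_kbps, limit_kbps):
    """Compare a stream's average and peak rate against a strict limit.

    samples_kbps: the amount of data sent in each one second interval,
    expressed in kilobits.
    """
    average = sum(samples_kbps) / len(samples_kbps)
    peak = max(samples_kbps)
    # With no queuing, the excess in each interval is simply dropped.
    dropped = sum(max(0, sample - limit_kbps) for sample in samples_kbps)
    return {"average": average, "peak": peak, "dropped_kbit": dropped}


# Made-up stream: bursty, but averaging under 800Kbps.
print(summarise([500, 1100, 400, 1000, 700], limit_kbps=800))
```

Even though this stream averages 740Kbps, a strict no-queue 800Kbps limit still drops 500 kilobits of it, which is why the client in the video had to reduce its average to well under 800Kbps once the hard limit was applied.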

What about peer-to-peer calls that go directly between clients - do they also follow this Meeting Policy?

The answer here is - yes they do. The video below shows a peer to peer call between two Teams clients with the Media Bit Rate set at 800Kbps. You will see that the average bit rate never goes over 800Kbps for this call:


Do the Browser based or Mobile clients follow the Media Bit Rate setting?

From my testing of the Google Chrome and Edge (classic) browsers, they do not follow this setting. This can have a profound effect on your network if you think that you’ve limited all clients to a specific bit rate only to find out that there are some clients on the network still running with an unlimited bandwidth ceiling. Below are the Chrome and Edge browser tests I ran to prove this:

Meeting using Chrome Browser with 800Kbps policy limit set:


Meeting using Edge (Classic) Browser with 800Kbps policy limit set:

In addition to this, I tried the iOS Teams client to see if it would have its bit rate limited by the Media Bit Rate policy setting. It turns out that it also doesn't appear to follow this setting. Note: In this video I'm doing a screen mirror from the iPhone to my PC and taking a Wireshark capture of the phone's traffic to O365 from a monitor port on the network switch that the traffic was travelling through. The quality of the video being displayed is not as good as was actually displayed on the phone due to it being streamed over the screen mirroring protocol, but the bit rate usage displayed should be accurate.

Meeting using iOS (iPhone 7) mobile client with 800Kbps policy limit set:




The Wrap Up


If you have taken the time to read this entire 3000-word article and watch the videos, congratulations! I hope that you’ve learnt a thing or two about how the Teams client and Teams service will behave under different network conditions. Doing all this testing has definitely given me some insights into how the client and service actually use bandwidth, as opposed to how the maximum and average codec bandwidth tables in the Microsoft documentation describe the usage. As far as the Media Bit Rate setting goes, there aren't a huge number of cases where I can see this being useful over the automatic bandwidth management provided by Teams. If there was a case where you had constrained links and you really didn't want Teams taking bandwidth over a specific ceiling bit rate then you could use this setting. However, it's pretty restrictive and will set a quality ceiling for all calls for users with the Policy applied to them. Not to mention, if the user moves to another location they will have this limit following them everywhere, which is not ideal. Even with the setting available, my recommendation is to let Teams do its thing. If you're getting complaints about video quality on calls then you need to look into where there are bandwidth bottlenecks on your network and see if bandwidth can be increased to meet the needs of users. Optionally, if the bottleneck is on an internal network then Diffserv DSCP values could potentially be used to tune the queues for the Teams media traffic (however, this is perhaps a completely different topic for another time). Till next time, au revoir!




Thursday, 5 September 2019

Teams Tenant Dial Plan Tool


I recently released a tool for configuring Direct Routing within an Office 365 tenant. Its name is the imaginative Microsoft Teams Direct Routing Tool. This tool, whilst allowing you to configure all your PSTN Gateways and Routing, did not allow you to configure the normalisation of numbers that users dial prior to being routed. This step in the process is very important because nearly all users are not going to dial phone numbers in E.164 format. As a result, prior to getting to the E.164 based routing rules we need to do some work to ensure that the numbers dialed have been converted into the right format. The number normalisation in O365 is done by the Tenant Dial Plan policies, which contain normalisation rules. The configuration of these is done using the Skype for Business Online module and a bunch of pretty complicated PowerShell that really shouldn’t be inflicted on a regular human being. So to try and avoid this pain, I decided to make a sister tool to my Direct Routing Tool that will allow for simple configuration and editing of these Tenant Dial Plans. This time, in order to ensure that I came up with the most imaginative name possible for the tool, I trekked into the deepest jungles of Peru on a vision quest where, after drinking several litres of Ayahuasca, I came up with the name “Teams Tenant Dial Plan Tool”. Enjoy…

Teams Tenant Dial Plan Tool


The Tenant Dial Plan Tool is a PowerShell based tool that allows you to configure and edit Tenant Dial Plans within Office 365 for use with Microsoft Teams Direct Routing and Calling Plans. This tool is a sister tool to my Microsoft Teams Direct Routing Tool that allows you to configure all the routing for Direct Routing within Office 365. To use the tool, simply open it with PowerShell (with the Skype for Business Online Module installed) and you will be presented with the following GUI and features:



Tool Features
  • Log into O365 using the Connect SfBO button in the top left of the tool. Note: the Skype for Business Online PowerShell module needs to be installed on the PC that you are connecting from. You can get the module from here: https://www.microsoft.com/en-us/download/details.aspx?id=39366
  • Create/Edit and Remove Tenant Dial Plan policies using the New.., Edit.. and Remove buttons.
  • Copy existing Tenant Dial Plans and all their Normalisation rules to a new Tenant Dial Plan.
  • Add/Edit Tenant Dial Plan normalisation rules. If the rule you are setting has a name that matches an existing rule, then the existing rule will be edited. If the rule’s name does not match an existing rule then it will be added as a new rule to the list.
  • Delete one or all normalisation rules from a Tenant Dial Plan policy.
  • Easily change the priority of normalisation rules with the UP and DOWN buttons.
  • Test the normalisation rules! Teams currently (at the time of writing this) doesn’t have any normalisation rule testing capabilities. So I wrote a custom testing engine into the tool providing this feature. By entering a number into the Test textbox and pressing the Test Number button, the tool will highlight all of the rules in the Dial Plan that match in blue. The rule that has the highest priority and matches the tested number will be highlighted in green. The pattern and translation of the highest priority match (the one highlighted in green) will be used to do the translation on the Test Number and the resultant translated number will be displayed in the Test Result.


Updates:
  • Initial Release 1.00


Download from TechNet Gallery:



Frequently Asked Questions


1. What is the deal with the OptimizeDeviceDialing setting - I can't edit it? 

In order to use the Access Prefix value that you can enter when creating a policy in the tool, a setting in the background called OptimizeDeviceDialing must also be turned on (for more details about what an Access Prefix is, refer to Ken Lasko's post about how they work in Lync). In addition to this, there is some weirdness in the PowerShell commands, which means that after you have set an Access Prefix for a policy you cannot then delete this value. You can only overwrite an existing Access Prefix with another number. When you delete the Access Prefix in the Edit dialog of the tool it will set the OptimizeDeviceDialing setting to FALSE (and leave the existing Access Prefix because it can't delete it). For example, if you already have an Access Prefix configured (as say "0") on a policy and then open the Edit dialog and remove the Access Prefix value like shown in the image below:


... then the result will show the Access Prefix still being "0" in the main window (due to it not being able to be deleted by PowerShell) but it will update the OptimizeDeviceDialing setting to FALSE so the Access Prefix is not used:



The Wrap Up


Well that was one hell of a ride. I think the Ayahuasca has nearly worn off and it's time for me to lie down. Enjoy the tool and remember, kids, don’t drink weird potions in the jungle…




Sunday, 18 August 2019

Skype for Business 2019 Call Forward Tool

The July 2019 update of Skype for Business Server 2019 (a release that might also be called CU1 or CU2, depending on whether you count a Hotfix release that came about a month earlier) now includes some new PowerShell commands that allow you to centrally control users' call forwarding settings. This functionality used to be available via a tool called SEFAUTIL.exe in the Skype for Business resource kits for previous versions of the server. This is obviously great news because this functionality was used all the time in practice by most organisations that I have seen.

The first question is what do we actually get with these new commands? We basically get the ability to do all the things a user can do from the Call Forward settings dialog within the Skype for Business client. This includes adding users to their Team Call Group and Delegate lists as well as setting call forward immediate, unanswered and simultaneous ring.

For more details on using the commands directly, Greig Sheridan has done a nice write up here: https://greiginsydney.com/sfbs-2019-sefautil-in-powershell/

Whilst it’s great to have these commands at our disposal, I still find that there is a learning curve to figuring out which settings and flags to use to achieve the call forward type that you might want in practice. So I thought it would be good to build a GUI for the PowerShell commands that looked exactly the same as the call forward settings screen from the Skype for Business client that we have all been using for many years and understand already. So that’s what I did… Introducing the Skype for Business 2019 Call Forwarding Tool:

Skype for Business 2019 Call Forwarding Tool




Tool Features:
  • No learning curve - it works the same as the call forward configuration on the Skype for Business client!
  • Get the call forwarding settings for any user on the system.
  • Edit team-call group members.
  • Edit delegate members.
  • Forward calls immediately to another number, delegates or contact.
  • Set simultaneous ring to team-call members, delegates, number or contact.
  • Control when the settings will apply ("all of the time" or "during work hours in Outlook")
  • Set the call forward settings on one or a number of users by selecting them from the user list on the right hand side of the tool.


Updates:

1.00 Initial Release


Download from TechNet Gallery:



Limitations


The PowerShell commands that have been supplied by Microsoft have the following limitations when compared to what can be set in the Skype for Business Client:

  • In Delegate and Team-call settings the "ring after" timer can only be set to 0, 5, 10 or 15 seconds, whereas in the Client you can set it from zero to 55 seconds (the maximum value is actually whatever the unanswered call timer is, minus 5 seconds, which is a maximum of 55 seconds).
  • There is no ability to select which delegates will be able to receive calls. This is represented in the client as a checkbox next to the delegate in the "Call Forwarding - Delegates" dialog. This capability is not available in the PowerShell commands at the moment.

Known Issues


Known Issue 1: Call Forward Unanswered to a Phone Number Issue

There is a bug in Skype for Business Server 2019 July 2019 update when Call Forward Immediate is disabled but Call Forward Unanswered is set to point to a number. This scenario looks like this in the Client:



From PowerShell it looks like this:

Set-CsUserCallForwardingSettings -Identity "sip:john.woods@domain.com" -DisableForwarding -UnansweredToOther "+61395554444" -UnansweredWaitTime 10

Whilst this command will be accepted by the system, and it will look like the data has been set correctly within the client, the actual Call Forward will not work when you call the Client (i.e. instead of the call going to the number, it forwards to the user's Voicemail). This is due to a bug in the Set-CsUserCallForwardingSettings command which will hopefully be fixed in the next CU.


Known Issue 2: Call Forward Immediate to Voice Mail Issue

In Skype for Business Server 2019 July 2019 the PowerShell commands do not tell you if the user has Call Forward Immediate to Voice Mail configured. If you run the Get command it will show:

User                             : sip:john.woods@sfb2019lab.com
CallForwardingEnabled            : False
ForwardDestination               :
ForwardImmediateEnabled          : False
SimultaneousRingEnabled          : False
SimultaneousRingDestination      :
ForwardToDelegates               : False
SimultaneousRingDelegates        : False
TeamRingEnabled                  : False
Team                             : {}
Delegates                        : {}
DelegateRingWaitTime             : 0
TeamDelegateRingWaitTime         : 0
SettingsActiveWorkhours          : False
UnansweredToVoicemail            : True
UnansweredCallForwardDestination :
UnansweredWaitTime               : 30

... which looks exactly the same as if there is no Call Forward set at all.

The commands also do not have a flag to allow you to set Call Forward Immediate to "Voice Mail". As a work-around for this, I have implemented the setting of Call Forward Immediate using a special SIP Address format. The SIP Address of a user's Voice Mail can be represented as their SIP Address with the following parameters after it ";opaque=app:voicemail". So, in order to forward to Voice Mail, the tool currently uses this method which the Client also respects and displays correctly as a Forward to Voice Mail. When this issue is fixed I will update the tool accordingly.
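As a small illustration of the address format described above, the voice mail target can be built like this. The helper name is hypothetical and this is only a string-building sketch in Python, not the tool's PowerShell code.

```python
def voicemail_forward_target(sip_address: str) -> str:
    """Build the special voice mail address for a user's SIP address."""
    if not sip_address.lower().startswith("sip:"):
        sip_address = "sip:" + sip_address
    # The parameters appended here are the ones quoted in the text above.
    return sip_address + ";opaque=app:voicemail"


print(voicemail_forward_target("sip:john.woods@domain.com"))
```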


The Wrap Up


Forward away my friends! Forward away!




Saturday, 23 February 2019

Microsoft Teams Direct Routing Tool

If you want to bring your own PSTN carriage via an SBC to Microsoft Teams, then you have to do quite a bit of configuration within Office 365. This configuration is done using PowerShell and can be complex to understand for someone who hasn’t worked a lot with Skype for Business Enterprise Voice deployments in the past (or even if you have!). This is especially the case if there are multiple gateways deployed around the country or world and complex failover routing is required. In order to help to make this easier, I have created a new tool that gives you a full GUI for creating, troubleshooting and testing your Direct Routing configuration.

Teams Direct Routing Overview


Microsoft has done a pretty good job of documenting the configuration of Direct Routing for people that are familiar with the concepts of Voice Routing Policies, Voice Routes, PSTN Usages, and PSTN Gateways from the days of Skype for Business Enterprise Voice. The documentation is available at Microsoft Docs here: https://docs.microsoft.com/en-us/microsoftteams/direct-routing-configure
The most helpful explanatory diagram from Microsoft’s documentation is the one below:


This diagram shows the components of Direct Routing and their relationship with each other. From a higher level it’s easiest to think of a Voice Routing Policy as being the container that has the routing elements inside of it. The Voice Routing Policy is assigned to a user and describes how calls from that user will be routed. Inside the Voice Routing Policy are PSTN Usages, which are containers that hold multiple Voice Routes. The ordering of both PSTN Usages and Voice Routes is important to the order in which calls will be sent to specific PSTN Gateways. The PSTN Gateway configuration contains all of the protocol related settings that describe information that will be sent to the physical SBCs you deploy.

The part that is most confusing about this is that Voice Routes have specific Priority settings that are assigned to them in PowerShell, which are used within a PSTN Usage to determine precedence and order of evaluation - however, this doesn’t tell the full story. The order of the PSTN Usage then functions as an overarching ordering for the Voice Routes. This relationship in diagrammatic form is relatively easy to see; however, when presented in PowerShell format it can be very difficult to understand. When designing this tool I decided to make it have the capability of helping the user easily understand the order in which routing will occur for any number dialled, by ordering everything in highest to lowest priority order.
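To make the two-level ordering a little more concrete, here is a simplified Python model of it. This is only a sketch of the relationship described above, not how the tool or the Teams service is implemented. The usage names, route names, patterns and gateway names are all hypothetical, and it assumes that a lower Priority number means a route is evaluated earlier within its PSTN Usage.

```python
import re
from dataclasses import dataclass


@dataclass
class VoiceRoute:
    name: str
    priority: int      # assumption: lower number = evaluated first within its usage
    pattern: str       # regex matched against the normalized dialled number
    gateways: list


def routing_choices(usages, number):
    """Return matching routes in the order they would be tried.

    usages is an ordered list of (usage_name, [VoiceRoute, ...]) pairs.
    The usage order is the outer ordering; route priority only orders
    routes inside a single usage.
    """
    choices = []
    for usage_name, routes in usages:
        for route in sorted(routes, key=lambda r: r.priority):
            if re.search(route.pattern, number):
                choices.append((usage_name, route.name, route.gateways))
    return choices


# Hypothetical policy: two usages, in the order they appear in the policy
policy = [
    ("Extensions", [
        VoiceRoute("PBX-Extensions", 0, r"^1\d{3}$", ["sbc01.example.com"]),
    ]),
    ("Service", [
        VoiceRoute("Service-Numbers", 1, r"^1\d{2,5}$", ["sbc02.example.com"]),
        VoiceRoute("Mobile", 0, r"^\+614\d{8}$", ["sbc02.example.com"]),
    ]),
]

for choice_number, choice in enumerate(routing_choices(policy, "1234"), start=1):
    print(choice_number, choice)
```

In this model, a number that matches routes in more than one usage ends up with a first choice and one or more fallback choices, which is the kind of result the Test Number button is designed to surface.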


The Microsoft Teams Direct Routing Tool




Tool Features:
  • The “Connect to O365” button allows for regular and MFA based authentication with O365. Note: the Skype for Business Online PowerShell module needs to be installed on the PC that you are connecting from. You can get the module from here: https://www.microsoft.com/en-us/download/details.aspx?id=39366
  • Select a User from the User drop down box to see their current Voice Policy assignment.
  • Create and Remove Voice Routing Policies.
  • Create, Remove, Edit and Order PSTN Usages.
  • Add PSTN Usages to Voice Routing Policies.
  • Create, Remove, Edit and Order Voice Routes.
  • Add Voice Routes to PSTN Usages.
  • Add Gateways and Regex Patterns to Voice Routes.
  • Add, Remove and Edit PSTN Gateway settings.
  • Enter a normalized number (ie. E164, +61400555111 style format) and click the Test Number button to see the PSTN Gateways, routing order and failover choices for that specific number.
  • What doesn’t it do? The tool currently doesn’t do Tenant Dial Plan configuration. This could be a future development item for a later version.
UPDATES

1.00 Initial Release


Download from TechNet Gallery:




Example of Tool Capabilities


As a basic example to show how the tool works, I will demonstrate making changes to the International Routing plan for Australia as created by www.ucdialplans.com (MVP Ken Lasko’s creation). The changes will be to add additional rules to allow calls to be sent via Direct Routing to On Premises PBX extensions (extension range 1000-1999). In order to do this, I will create a new Online PSTN Usage and add a Voice Route to it, then change the priority of usages and finally test that the new rule works as expected.

Step 1:  Connect to O365 and select Voice Routing Policy

After importing the basic templates in from the ucdialplans.com site (which basically involved running a PowerShell script that I’m not going to document here in detail) I then opened the Direct Routing Tool from a PowerShell window. After the GUI loaded I then clicked the “Connect to O365” button and entered my O365 administrator credentials (Note: both regular auth and MFA based auth are supported). After doing this, the tool discovered all of the existing Voice Routing Policies and displayed them in the policies drop down box:



Note: In this example I am only making changes to the International policy for brevity’s sake. 

I then selected the International policy from the Policies drop down box. The tool then loaded all of the PSTN Usages and Voice Route data associated with this Voice Routing Policy in the main window:



Step 2: Create a New PSTN Usage

I then created a new PSTN Usage that will be used to allow calls to be sent directly to an On Premises PBX that has extensions in the number range 1000-1999. To do this I clicked on the “Add Usage…” button which then displayed the Add PSTN Usage dialog. In the dialog I selected the “New” check box to indicate that I’m creating a new PSTN Usage and gave it a name that aligns with the convention used for the other PSTN Usages:



After clicking OK the PSTN Usage was added to the Voice Routing Policy. However, at this point it didn’t have any Voice Route associated with it so it wasn’t capable of routing calls. You will see in the main window screenshot below that the Voice Route, Number Pattern and Gateway List columns are empty:



Step 3: Add a Voice Route to the PSTN Usage

To add the Voice Route information to the PSTN Usage I double clicked the new PSTN Usage row (this can be done by either Double Clicking on the Usage or highlighting the Usage and clicking the Edit Usage button). Once this was done the Edit PSTN Usage dialog was displayed:



Step 4: Add a Voice Route

From within the Edit Usage dialog a Voice Route can be added to the usage. This can be done by either Double Clicking the PSTN Usage row or Highlighting the Usage and clicking the “Edit Voice Route” or “Add Voice Route” Buttons. When creating a new Voice Route I recommend using the Double Click or "Edit Voice Route" button because this puts you directly into the "Edit Voice Route" dialog in a single step. Once the Edit Voice Route dialog was open I assigned the Voice Route a Name, Number Pattern (in this case the pattern was “^1\d{3}$” to capture the 1000-1999 extension range) and PSTN Gateway.
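If you want to sanity check a Number Pattern before entering it, a quick test outside of the tool can help. The Python sketch below checks the “^1\d{3}$” pattern from this example against a few sample numbers. The helper name is hypothetical, and Python's regex engine is only standing in for the one used by the service, so treat it as a rough check rather than a guarantee.

```python
import re

# The Number Pattern used for the PBX extension range 1000-1999
EXTENSION_PATTERN = re.compile(r"^1\d{3}$")


def is_pbx_extension(dialled_number: str) -> bool:
    """Return True if the dialled number matches the extension pattern."""
    return EXTENSION_PATTERN.search(dialled_number) is not None


for sample in ["1000", "1999", "2000", "999", "10000", "+61400555111"]:
    print(sample, is_pbx_extension(sample))
```

The pattern anchors at both ends, so it only matches four digit numbers starting with 1 and will not match longer numbers that merely begin with an extension.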



After filling in the dialog I clicked OK and was returned to the Edit Usage dialog where I could see that the new Voice Route info was added to the PSTN Usage:



Having completed the configuration, I clicked the OK button on the Edit Usage dialog which took me back to the main window. You will now see that the Extensions PSTN Usage has the Voice Route information in the row at the end of the Usage list:



In this case I wanted the Extensions PSTN Usage to be at the top of the list because it is more specific than the other PSTN Usages. I clicked the Usage Order button to open the Usage Order dialog which allowed me to move the priority of the Extensions PSTN Usage to the top of the list and then clicked OK:



The Extensions Usage was then moved to the top of the list in the main window:



This completed all the configuration that I needed to get calls routing to the On-Premises SBC and PBX. However, it’s important to check that the Voice Routing policy is behaving the way you want it to before moving on. In order to do this, I entered an extension number within the PBX’s extension range in the Normalized Dialled Number box and clicked the Test Number button. The results are shown below:


Second Choice:



The tool now highlights all of the PSTN Usages and Voice Routes that will get used when this number is dialled. The information in the area below the "Choice Number" drop down box shows that the first choice for routing calls to this number will be the new Voice Route that I just added, which is great. However, it appears that there is a second PSTN Usage that will also be used as a second choice if the first choice is not available. In this case the second choice is matching against the “AU-SouthEast-Service” usage, which was not intended as part of this configuration. This second choice route may result in calls being sent to sbc02, or even in other Voice Routing Policies (that also have the “AU-SouthEast-Service” PSTN Usage) surreptitiously having the ability to dial the PBX extensions. The testing in this case has been very useful in uncovering an issue that may need to be corrected before running Direct Routing in production.

Gateway Configuration for Bonus Points: 

You may also need to create or make changes to PSTN Gateways within your O365 tenant. The good news is that the tool can also do this. Simply click on the “Gateways…” button to edit gateway settings or add and remove gateways from the tenant:



The Wrap Up


Thanks for reading the post and checking out this tool I created. You now have the power of Teams Direct Routing in your grasp: use this power wisely for good instead of evil. Best of luck with your Direct Routing configurations. Enjoy!





Sunday, 12 August 2018

Polycom VVX Not Displaying PIN Authentication Option

I had an interesting issue with a Polycom VVX deployment recently that I thought I would share in case others run into the same issue.


The Issue

The symptom of the problem was that after the Polycom VVX had completed booting, including getting an IP Address and downloading software/config files, the PIN Authentication option did not appear on the sign-in options screen. This meant that I was unable to use PIN Authentication at all to sign in the devices, which was a problem because we planned on using it for all the phones. Below is an example of what the screen looked like:

Sign-in Screen without PIN Auth option

Troubleshooting

There was a series of steps that I went through in troubleshooting this issue. I will take you through all of them so you too can check whether your issue might be solved with some of the earlier steps that I tried before reaching a resolution.

STEP 1
I first confirmed that PIN Authentication was in fact turned on in the configuration file(s) of the phone. To do this I checked that the following setting was not in the configuration files:

<!-- Disable PIN Auth by setting "0" -->
<reg reg.1.auth.usePinCredentials="0" />
Note: The phone can have multiple configuration files that are both manually added by administrators and automatically created by the phone (ie. <MAC>-phone.cfg, <MAC>-web.cfg, etc). You need to check all of the files associated with the phone's MAC address to ensure it’s not being overridden by another file.

I also checked the setting directly in the phone using my VVX Phone Manager Tool to get the active setting out of the phone using the REST interface. In my case this setting was not configured in the config file and it defaults to being on (ie. set to "1"). So this wasn't the problem.
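Because the setting could be hiding in any one of several files, it can be quicker to search them all in one go. The Python sketch below is one way this might be done, assuming the phone's config files sit in a single provisioning folder and are named with the MAC address as a prefix (as in the <MAC>-phone.cfg and <MAC>-web.cfg examples above). The function name, the folder layout and the example MAC address are hypothetical.

```python
import re
from pathlib import Path

# Matches the attribute that disables PIN Auth, allowing spaces around "="
PIN_AUTH_DISABLED = re.compile(r'reg\.1\.auth\.usePinCredentials\s*=\s*"0"')


def files_disabling_pin_auth(config_dir, mac):
    """Return the names of <MAC>*.cfg files that set usePinCredentials to "0".

    Hypothetical helper: assumes all of the phone's config files live in
    config_dir and start with the phone's MAC address.
    """
    hits = []
    for path in sorted(Path(config_dir).glob(f"{mac}*.cfg")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if PIN_AUTH_DISABLED.search(text):
            hits.append(path.name)
    return hits
```

For example, calling files_disabling_pin_auth on the provisioning folder with a MAC address such as "0004f2aabbcc" returns an empty list when none of that phone's files disable PIN Authentication. It is a plain text search, so it will also flag a setting that has been commented out.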

STEP 2
I checked that PIN Authentication was actually enabled on the Skype for Business server. This can be done in the Control Panel > Security > Web Services > Pin Authentication Enabled:


This was also enabled - so in this case it wasn't the problem.


STEP 3
I tested the PIN Authentication process on the server by running the Test-CsPhoneBootstrap PowerShell command on the system. This worked just fine:

PS C:\ > Test-CsPhoneBootstrap -PhoneOrExtension 4500 -PIN 12345 -TargetFqdn 2015ENTFE004.myskypelab.com -TargetUri https://2015ENTFE004.myskypelab.com:443/CertProv/CertProvisioningService.svc

Target Fqdn   : 2015ENTFE004.myskypelab.com
Target Uri    : https://2015ENTFE004.myskypelab.com:443/CertProv/CertProvisioningService.svc
Result        : Success
Latency       : 00:00:01.2333041
Error Message :
Diagnosis     :

STEP 4
In this deployment there was a centralised Windows Server that was serving DHCP to all the client subnets. On the central DHCP server I confirmed that all of the DHCP options were correct using my Skype4B/Lync DHCP Config Tool. This tool parses the byte format Vendor Options and displays them as readable text, and if it is unable to parse the byte format it will display an error:

This is an example image from my lab
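For anyone curious about what "parsing the byte format" involves, the Python sketch below decodes a block of vendor-specific data as a series of code, length, value sub-options, which is the general encapsulation used for DHCP vendor-specific information. This is not the code from my DHCP Config Tool, just a minimal illustration. The sample bytes and the sub-option numbering are assumptions made for the example.

```python
def parse_vendor_options(data: bytes) -> dict:
    """Decode code/length/value sub-options into {code: text}.

    Raises ValueError if the data ends part way through a sub-option,
    which is the kind of encoding problem a parser should report.
    """
    options = {}
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError(f"truncated sub-option header at byte {i}")
        code, length = data[i], data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError(f"sub-option {code} is shorter than its length byte")
        options[code] = value.decode("ascii", errors="replace")
        i += 2 + length
    return options


def encode_sub_option(code: int, text: str) -> bytes:
    """Build one sub-option (used here only to create sample data)."""
    raw = text.encode("ascii")
    return bytes([code, len(raw)]) + raw


# Hypothetical sample data for illustration
sample = encode_sub_option(1, "MS-UC-Client") + encode_sub_option(2, "https")
print(parse_vendor_options(sample))
```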

In this case, all settings were displayed and no encoding issues were detected by the tool, which means this wasn’t the issue. So I checked that there was no DHCP server on a closer subnet (ie. a switch or router) that was responding to DHCP before the central Windows DHCP server. This also wasn’t the case as I could see that the central DHCP server had logged the address lease for the Polycom VVX with the particular MAC Address of the test device.

STEP 5
At this point this was starting to look like a more complex problem so I took to the lab to see if I could reproduce the behaviour. I noticed that after a factory default the phone initially didn’t display the PIN Authentication option for a couple of seconds - it appeared belatedly. This indicated to me that there was some additional check that was being done by the VVX before it would display this option. So this raised the question: what is required for PIN Authentication to function on the VVX? The most important thing that is required is that the phone gets the DHCP Options which tell it where the Cert Provisioning service resides, so it can communicate with the web services required for PIN Authentication.

Given that the VVX phones were being issued IP Addresses via DHCP, it didn’t seem likely to be a connectivity issue between the VVX and the DHCP server. However, I looked into the traffic flows to confirm this and found something interesting. In this Wireshark capture, you can see that the DHCP Options get sent out in response to an INFORM message that the VVX sends. The INFORM message is a special DHCP message that is outside of the initial DHCP IP Address discovery process (DISCOVER > OFFER > REQUEST > ACK). The interesting thing about the INFORM message is that the ACK for this message from the DHCP server gets sent as a Unicast response directly back to the VVX itself rather than to the DHCP Relay IP Address, unlike all the other messages. The screen shot below also shows this from the DHCP server perspective - you can see the final ACK message has a Destination IP Address of the VVX instead of the DHCP relay IP Address:


The highlighted INFORM ACK message in Wireshark shows that it contains all the additional Microsoft specific Vendor Class Options (Certificate Provisioning Service details). It’s the packet that has the information that the VVX needs to get PIN Authentication working.

In this case, because a centralised DHCP server was being used, the broadcast DHCP messages on the local subnet were being changed into unicast messages by the local router and sent over to the central DHCP server. There was also a firewall in between this local router and the centralised DHCP server. This meant that because the returning INFORM ACK message was sent directly back to the phone (which is part of the DHCP specification and is correct operation), the firewall had not created a UDP flow for it and the packet was blocked. The diagram below shows how the DHCP traffic flow works with a DHCP Relay in place and where the issue resides:


As you can see from the diagram above, the firewall appears to be allowing traffic from the DHCP relay through to the DHCP server. After transiting the DHCP relay, the DHCP traffic flows from source port 67 to destination port 67. The INFORM ACK message then gets sent back from source port 67 on the DHCP server to destination port 68 on the VVX - a path which the firewall did not have an existing flow for, so it dropped the packet. As a result, the VVX didn’t receive its required Cert provisioning service URL and because of this didn't display the PIN Authentication sign-in button.
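The behaviour described above can be modelled in a few lines of Python. This is a deliberately simplified sketch of a stateful firewall and not a representation of any particular product. The class, the IP Addresses and the rule format are all hypothetical, and the allow rule ignores subnets for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulFirewall:
    """Toy model: return traffic is only allowed if it mirrors a known flow."""
    flows: set = field(default_factory=set)        # (src_ip, src_port, dst_ip, dst_port)
    allow_rules: set = field(default_factory=set)  # (src_ip, dst_port) explicitly allowed

    def record_flow(self, src_ip, src_port, dst_ip, dst_port):
        # Traffic permitted towards the DHCP server creates a flow entry
        self.flows.add((src_ip, src_port, dst_ip, dst_port))

    def permits(self, src_ip, src_port, dst_ip, dst_port):
        # A packet is allowed if it is the exact reverse of a recorded flow...
        if (dst_ip, dst_port, src_ip, src_port) in self.flows:
            return True
        # ...or if an explicit rule allows this source to reach this port
        return (src_ip, dst_port) in self.allow_rules


# Hypothetical addresses for the DHCP relay, central DHCP server and a VVX
RELAY, SERVER, PHONE = "10.1.1.1", "10.9.9.9", "10.1.1.50"

fw = StatefulFirewall()
fw.record_flow(RELAY, 67, SERVER, 67)                   # relay forwards the INFORM to the server

relay_reply_ok = fw.permits(SERVER, 67, RELAY, 67)      # reply back to the relay
inform_ack_before = fw.permits(SERVER, 67, PHONE, 68)   # INFORM ACK sent straight to the phone

fw.allow_rules.add((SERVER, 68))                        # explicit rule for destination port 68
inform_ack_after = fw.permits(SERVER, 67, PHONE, 68)

print(relay_reply_ok, inform_ack_before, inform_ack_after)
```

In the model, the reply to the relay is allowed because it mirrors an existing flow, while the INFORM ACK to the phone has no matching flow and is only allowed once an explicit rule exists for it.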

The Solution

So the solution here, as it often is, is firewall related. In this case we had to allow traffic from the DHCP server IP Address to destination port 68 on all the VVX phone subnets. After this was done the INFORM ACK messages could flow as required for the VVX to get its Vendor Class options.


If you don’t have access to the firewall or you need a quick solution to the problem, you can hard code the data contained in the Vendor Class options into the phone. This was added as a config option in software version 5.3. The configuration item is shown below:

<dhcp dhcp.option43.override.stsUri="https://s4bwebint.domain.com:443/CertProv/CertProvisioningService.svc" />

This can also be set in the web interface of the phone in the Settings > Provisioning Server > DHCP Menu > DHCP Option 43 Override STS-URI:



The Wrap Up

For all the old-school UC people out there, let's finish with a Haiku in the style of the old Lync 2010 PowerShell blog:

Firewalls drop packets,
This causes many issues,
Switch off all firewalls.

Till next time, see ya! 

