Understanding the Lync Video Interoperability Server (VIS)

March 3rd, 2014

Last year at Lync Conference 2013, Microsoft announced a new video interoperability component for Lync vNext: the “Video Interoperability Server” (VIS). During the conference Mike Stacy, Jeff Schertz and I presented a session on Lync 2013 video interoperability, in which we covered these three pillars:

  1. Lync 2013 new video functionality
  2. Microsoft’s SVC implementation
  3. Modes leveraged for interop, specifically Gateways vs. MCUs vs. native integration

At this year’s Lync Conference the keynote again visited the VIS component and Microsoft provided a public demo.


In the demo we see a traditional room system (in this case a Tandberg/Cisco EX60) being added to a Lync multi-party conference, with its presence reflected within the Lync client.

So architecturally, how does this work, and how does it fit into the overall Lync video interop landscape? The biggest question I hear is “does this replace the need for Gateways, MCUs and native Lync support?”

These are good questions, and as much as I’d love to give you a one-size-fits-all answer, I can’t. Ultimately it depends on how you use Lync, your investment in video endpoints, your dependencies on H.323 and standard SIP calling, the architecture of your environment, and the size and geographic location of your end-users. Dustin Hannifin and I revisited Lync 2013 video interoperability at this year’s Lync Conference, covering the pillars above again and including VIS as one of the options. Here I’m going to go over what was discussed at that session in finer detail, cognizant that VIS is still under wraps and some lower-level detail is yet to be disclosed by Microsoft.

First, let’s start with the problem. When Lync 2013 was announced, so was support for H.264 SVC, which led to some confusion, specifically “so now all my traditional video just works with Lync?” Of course this isn’t the case. Whilst the H.264 SVC codec aligns far more closely with traditional video systems from a media perspective, there’s still outstanding work required to make things work on the signaling side.

Now we know that there’s not really such a thing as “standard SIP”: all UC call control platforms have their own flavors and SIP extensions, and Microsoft’s SVC implementation follows suit.
Jeff Schertz wrote an article on Lync 2013 video interoperability that explains this in greater detail, but for those who have not read it (or don’t have a few hours set aside :-)), the conclusion is that Microsoft’s implementation of H.264 SVC isn’t interoperable with AVC VTCs without the following:

a) Re-packetization of media – Microsoft’s SVC is Mode 1 (Temporal Scalability with Hierarchical P), i.e. a single video stream is sent for each requested resolution, and that stream can comprise different frame rates. Typically a stream with two layers is sent (one layer @ 15fps and another @ 15fps), which can deliver either 15fps or a cumulative 30fps.

However, traditional video systems utilize H.264 AVC, which is Mode 0. Folks will argue that Mode 1 contains streams that are AVC compliant, but a non-SVC (AVC-only) room system isn’t going to be able to handle a Mode 1 stream without some modifications.

b) Signaling translation – as stated above, “standard signaling” is often a misnomer. Lync 2013 and the new world of SVC compound this further, given that even more work is offloaded to the endpoints, i.e. dynamic layout composition and multi-stream video intelligence.
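
To make the two-layer arithmetic in (a) concrete, here is a minimal Python sketch of temporal scalability. It is a toy model only: the `Frame` class and the helper names are hypothetical, it assumes a 30fps capture split into alternating base and enhancement frames, and it says nothing about how Lync or VIS actually packetize media.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    index: int        # position in the captured sequence
    temporal_id: int  # 0 = base layer, 1 = enhancement layer (hypothetical labels)


def encode_two_layers(num_frames: int) -> List[Frame]:
    """Tag even frames as base layer and odd frames as enhancement layer."""
    return [Frame(i, i % 2) for i in range(num_frames)]


def select_layers(frames: List[Frame], max_temporal_id: int) -> List[Frame]:
    """Keep only frames whose layer is at or below the requested layer."""
    return [f for f in frames if f.temporal_id <= max_temporal_id]


def effective_fps(selected: List[Frame], total_frames: int, capture_fps: float) -> float:
    """Frame rate seen by a receiver that decodes only the selected frames."""
    return capture_fps * len(selected) / total_frames


one_second = encode_two_layers(30)        # one second of 30fps capture
base_only = select_layers(one_second, 0)  # receiver takes the base layer only
both = select_layers(one_second, 1)       # receiver takes base + enhancement

print(effective_fps(base_only, 30, 30.0))  # 15.0
print(effective_fps(both, 30, 30.0))       # 30.0
```

A receiver that keeps only the base layer gets 15fps, and one that keeps both layers gets the cumulative 30fps, without the sender encoding the video twice.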

Another interim puzzle that some early Lync 2013 adopters experienced arises where H.263 was being leveraged for point-to-point Lync calls, since Lync 2013 dropped support for this dated, CIF-resolution-capable codec. So compatibility across the items above becomes a high priority on the UC agenda.

The answer to this problem hasn’t changed, largely speaking: we’re still in the world of Gateways, MCUs and endpoints that natively support Lync. VIS fits neatly into the Gateway category. It should be regarded as a Back-to-Back User Agent (B2BUA), either co-located on your Lync Front End or deployed standalone in environments where additional scale is required.

B2BUAs aren’t new to Lync; in fact Session Border Controllers have effectively been acting in this capacity with Lync for voice, specifically performing signaling translation. In this case, however, more than just signaling is modified: bitstream information also needs to be modified to set up calls effectively. It’s still lightweight work (i.e. there’s no transcoding), but because of this requirement both signaling and media need to flow via the B2BUA in all scenarios.



A B2BUA-registered VTC has no Lync intelligence; it’s completely unaware of Lync, and the B2BUA component acts as a SIP proxy, tunneling all payloads. A Lync client behaves differently: where possible in point-to-point scenarios it will always seek out the best media path, which offers lower latency and in many cases gives more back to the network folks by avoiding WAN traversal.
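
The difference in media path can be sketched as a toy Python model. Everything here is an assumption for illustration: the `Endpoint` class, the `media_path` function and the hop names are hypothetical, and the "direct" path for two Lync clients stands in for whatever best path the clients actually negotiate.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    name: str
    native_lync: bool  # False = a VTC registered through the B2BUA


def media_path(a: Endpoint, b: Endpoint, b2bua: str = "VIS") -> List[str]:
    """Return the hops media takes between two endpoints in this toy model.

    Two native Lync clients are modeled as talking directly. As soon as one
    side is a B2BUA-registered VTC, media is anchored on the B2BUA.
    """
    if a.native_lync and b.native_lync:
        return [a.name, b.name]
    return [a.name, b2bua, b.name]


lync_a = Endpoint("Lync client A", native_lync=True)
lync_b = Endpoint("Lync client B", native_lync=True)
vtc = Endpoint("EX60", native_lync=False)

print(media_path(lync_a, lync_b))  # ['Lync client A', 'Lync client B']
print(media_path(lync_a, vtc))     # ['Lync client A', 'VIS', 'EX60']
```

The practical consequence is the same as in the text: if the VTC and the Lync client sit in one site and the B2BUA in another, the VTC call still crosses the WAN in both directions.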

As per the demo at Lync Conference, other Lync clients will now see presence and be able to add the VTC to a Lync multi-party call, but single click-to-join from the VTC isn’t possible, as Lync Online Meeting translation/presentation is a whole other story. Without performing significant heavy lifting a B2BUA isn’t going to add Gallery View (we’re entering MCU territory here), so the VTC will be offered a Lync 2010 style experience (single speaker, voice switched). In this brave new SVC world DSP-driven MCUs are becoming less relevant, but a non-SVC VTC isn’t going to be able to handle multiple simulcast video streams, so no snazzy layouts here.

Another consideration is content. VTCs have standardized on H.239/BFCP; it’s not clever stuff and is nowhere near as sophisticated or collaborative as what Lync has on offer (RDP, PSOM & WAC), but unlike the video payload there is no way today to make these interoperable without transcoding.

In conclusion, seeing Microsoft’s investment in VIS is a really good thing, but as always there are other mechanisms for interoperability which either negate the need for this work to be performed by the backend or facilitate easier ways of joining Lync conference calls whilst also preserving the Gallery View experience.

For more information on this I’d encourage you to watch our presentation (for now it’s only open to Lync Conference attendees, but this will go big bang in the coming weeks…)

  1. March 3rd, 2014 at 22:07

    Confused... If the VIS only has B2BUA capability for call set up and tear down, what video codec was used to present video to the Tandberg VTC in the demo? H.264 SVC?

  2. Adam [I’m a UC Blog]
    March 4th, 2014 at 00:03

    Hi Shawn,

    Once the B2BUA has performed header modification for the H.264 SVC bitstream then effectively H.264 AVC is presented to the endpoint.

    – Adam

  3. March 4th, 2014 at 18:21

    Ok so H.264 AVC presented to VTC and H.264 SVC presented to Lync then?

  4. Adam [I’m a UC Blog]
    March 4th, 2014 at 22:29


    – Adam

  5. March 5th, 2014 at 19:03

    Righteo Adam! Thanks for the explanation. Keep the blog posts coming!

  6. March 14th, 2014 at 18:55

    Of course it will lack some (or even many) features a full-blown transcoding MCU/Gateway offers,
    but considering that VIS will most likely be free as it’s part of the FrontEnd, whereas Gateways cost A LOT of money, the choice is easy for most users.
    VTC companies currently make serious money with those gateways, but the prices of VTC endpoints keep on falling. Thanks to guys like Vidyo and others, I don’t see shiny times in the future for traditional VTC guys. It will be a tough ride …

  7. Matt H
    March 16th, 2014 at 23:33

    Hi Adam, are you able to clarify your comments around interop and content? Are you saying that in the new version to be released, content will be available to just view (as opposed to collaboration) in conjunction with the video stream? Or is it as it is now, where content feeds between Lync and open standards (BFCP) based systems are still not possible without transcoding through a 3rd party?
    Thanks 🙂

  8. Adam [I’m a UC Blog]
    March 17th, 2014 at 01:50

    Based upon feedback from Microsoft at Lync Conference, content is not in scope at this time. Polycom (my employer) does have a solution in market today that extends the Lync client, making the client BFCP capable.

    Watch this space for more options in the near future…

    – Adam

  9. Matt H
    March 17th, 2014 at 01:54

    Thanks for the quick response (I also hit you up on Twitter), I’m currently working with Polycom in Brisbane, Australia on our options and content is a fairly important factor for us. Thanks!

  10. Adam [I’m a UC Blog]
    March 17th, 2014 at 02:10

    Pleasure Matt, reach out if you need anything…

    – Adam

  11. Adam [I’m a UC Blog]
    March 19th, 2014 at 03:41

    Hi Harald,

    I agree VIS is a great solution for many Lync customers that already have a significant investment in Tandberg endpoints. The purpose of this article wasn’t to downplay this solution, more to set expectations. In certain cases VIS will be enough to facilitate a stop-gap approach until such time as existing assets can be replaced with something else, perhaps Lync Room Systems or endpoints that natively support Lync. The latter offers the best user experience without compromise on the network or additional infrastructure complexity.

    Regarding your point around “VTC companies”, I disagree (well, I would, right! ;)), but seriously, modern-day VTCs are significantly lower cost (lower than Vidyo) and come without the need to buy into VidyoWay infrastructure.