It seems like CPaaS vendors have grown complacent compared to the rapid innovation coming from UCaaS vendors. This makes no sense.
CPaaS has been leading the innovation in how developers build communication products. This has been the case ever since the term CPaaS was coined. But now the trend is changing, and doubly so for WebRTC and video communication services. UCaaS vendors have taken the lead in innovating and setting the pace of the market, leaving CPaaS vendors behind.
Can this trend be reversed? Is this a bad omen for CPaaS vendors competing in video use cases?
Predicting future communication trends
I used to work at RADVISION. The company specialized in video conferencing equipment but was split into two business units. The one I was a part of licensed VoIP software stacks to developers. You could say that what we did predates CPaaS. We didn’t have the cloud or server APIs but we sure did have SDKs.
In each and every townhall the company had, the CEO used to mention that our business unit was a precursor of the industry. Whatever requirements we saw, whatever trend we experienced in sales (increase or decrease), was just an indicator of what was to come in the market in 3 years or so. The reasoning was simple – we licensed to developers, who then built their products and put them on the market. Development cycles being what they were, 3 years was a good estimate.
Fast forward to today, and you have CPaaS vendors (the technology licensors of communication development tools) and the rest of the industry. And a large part of the rest of the industry is UCaaS.
The thing is, UCaaS vendors are no longer waiting for CPaaS vendors to innovate – they are just doing it on their own.
The promise of CPaaS
Communication Platform as a Service. What is it for anyways?
The whole purpose of CPaaS is to reduce the time to market for developers. Make it easier to get things done with communications by developing all the nasty little details for you.
Call it low code. Call it SDK or API or whatever.
I did an interview with Jeff Lawson, CEO of Twilio, years ago. In it, Jeff explains the essence of Twilio and why he started the company: to solve the communication problem for companies so they can focus on building great customer experiences.
Remember this one. We will be back to this interview a wee bit later.
Pandemic requirement shifts
Then the pandemic hit. And with it, a change in what communication requirements looked like around the world for all use cases.
4 distinct changes took place:
#1 – meetings became larger
We had large meetings before. The difference was that we connected rooms with groups of people in each room. Now? Everyone’s joining from their own place.
A meeting with 20 people in 3 rooms became a meeting with 20 people from 20 rooms. We will be back in the office, but the requirement for bigger meetings, with more people joining remotely, will still be there with us.
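To put rough numbers on that shift, here is a minimal back-of-the-envelope sketch. It assumes a routed (SFU-style) call in which every endpoint sends one video stream and receives one from every other endpoint. That topology and the function name `sfu_stream_counts` are assumptions made purely for illustration, not a description of any specific vendor's service.

```python
def sfu_stream_counts(endpoints: int) -> dict:
    """Count media streams for a call routed through a single media server.

    Assumption (illustrative only): each endpoint uploads one video stream
    and downloads one stream from every other endpoint in the call.
    """
    if endpoints < 1:
        raise ValueError("a call needs at least one endpoint")
    downloads_per_endpoint = endpoints - 1
    return {
        "uploads_to_server": endpoints,
        "downloads_per_endpoint": downloads_per_endpoint,
        "streams_forwarded_by_server": endpoints * downloads_per_endpoint,
    }


# 20 people sitting in 3 conference rooms: only 3 endpoints on the call
rooms = sfu_stream_counts(3)
# The same 20 people, each joining from home: 20 endpoints on the call
homes = sfu_stream_counts(20)

print(rooms["streams_forwarded_by_server"])  # 6
print(homes["streams_forwarded_by_server"])  # 380
```

Under these assumptions, the same 20 people generate roughly 60 times more forwarded streams once everyone joins from a separate location, before any optimization such as showing only the active speakers.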
Look at the start of this session from last year’s Kranky Geek virtual event.
Here Li-Tal Mashiach, Senior Engineering Manager at Facebook in the Messenger team explains what they’ve seen as changes in the usage of video calls in Messenger. Look at around the 2:40 mark in that video.
#2 – more meetings for longer periods of time
This one is obvious. Or is it?
Almost all vendors have seen significant growth in both the number of video sessions conducted on their platforms and the length of these sessions.
Scale had to be dealt with across these two axes.
You need to make sure you can carry conversations that now take hours on end instead of minutes:
- My daughter had 4+ hour long sessions with her friends during lockdowns going well into the night. They talked, ate, cooked and did whatever the hell teenage girls do together – just remotely
- My son is still video calling with his cousin while playing Fortnite on his Xbox. And that usually lasts… well… until we stop them forcefully
In both cases, much of the interaction is just ambient video. They do things together or apart and just have these social interactions take place because they can’t meet. Funny enough, my son and his cousin aren’t stopping it now even though everything is open – that’s because meeting physically requires a 20 minute car ride…
How does that change the focus? How do you maintain servers, upgrade and update them when sessions can take hours on end on a machine? Does it mean the media servers also need to be stabler in how they operate?
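One common way to approach the upgrade question is to drain a media server: stop placing new sessions on it, and restart it only once the sessions already running there have ended. The sketch below models that idea in memory. The `MediaServer` class, `place_session` function and the server and call names are all hypothetical and exist only for illustration; they are not the API of any real media server or CPaaS platform.

```python
from dataclasses import dataclass, field


@dataclass
class MediaServer:
    """Hypothetical in-memory model of a single media server."""
    name: str
    draining: bool = False
    sessions: set = field(default_factory=set)

    def can_upgrade(self) -> bool:
        # Safe to restart only when draining and no session is left on it.
        return self.draining and not self.sessions


def place_session(servers, session_id):
    """Put a new session on the least-loaded server that is not draining."""
    candidates = [s for s in servers if not s.draining]
    if not candidates:
        raise RuntimeError("no server is accepting new sessions")
    target = min(candidates, key=lambda s: len(s.sessions))
    target.sessions.add(session_id)
    return target


servers = [MediaServer("media-1"), MediaServer("media-2")]
place_session(servers, "call-a")  # lands on media-1
place_session(servers, "call-b")  # lands on media-2

servers[0].draining = True  # begin draining media-1 ahead of an upgrade
upgrade_ready_before = servers[0].can_upgrade()  # call-a is still running
placed = place_session(servers, "call-c")  # new sessions avoid media-1

servers[0].sessions.discard("call-a")  # the long-running call finally ends
upgrade_ready_after = servers[0].can_upgrade()
```

With sessions that last hours, the drain itself can take hours, and the remaining servers have to absorb all new traffic in the meantime. That is the operational cost these questions point at.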
And what about the number of sessions? Is it that easy to scale your current traffic 10x or more? This isn’t a simple question to contend with. Google shared their own challenges with scaling Meet, which makes for a fascinating read. I have also helped my share of vendors with best practices for scaling their WebRTC infrastructure during the last 15 months.
#3 – more networks
Back to that Kranky Geek video by Facebook. They saw an increase in desktop access, more than they had expected given that they are mobile first.
I’d argue that we’ve all seen more variety in devices and networks. My apartment went from 1 video calling user to 4 video calling users in a matter of a day. A billion people or more who had never been on a video call have now done so and will continue to do so at least some of the time.
What devices do these billion people have? What does their home network look like?
If you look at the technology adoption curve, these aren’t the innovators or early adopters. They aren’t even the early majority. They include both the late majority and the laggards.
This means we’re facing a lot more variance in devices and networks. We need to deal with lower-end capabilities and fewer available resources, and to make large group calls work across a much wider variety of devices.
#4 – more places
The best part of video calling during the lockdowns and up until today is taking a peek at other people’s home offices. You get to see a piece of who they really are outside “work”.
These places are almost always less than ideal.
- Dogs and cats being part of the background
- Kids. Lots of kids. Popping into the screen. Making noise
- People walking in the back doing laundry, cooking, running after kids. The works
- Construction noises from outside
- Poor lighting conditions
Everything you can think of that affects the audio and video quality due to external sources will be there. And you can’t always ask users to go purchase a better camera, change where they are sitting or replace their device.
It becomes a technical problem to solve many of these issues, especially when the service offers ad-hoc connectivity for its users.
CPaaS during the pandemic
CPaaS vendors were supposed to help other vendors build their products. Look at future needs and cater for them. And for the most part they do. But somehow during this pandemic, it seems that many of them have failed to do so.
I’ll look at Twilio here – and not because they are the only vendor with these issues – but because they are the biggest CPaaS vendor and the precursor of the industry.
Last year after Twilio’s Signal 2020 event I wrote that I expected more of them:
For me this says that Twilio hasn’t invested in video as much in the last year or two. If they had, they would have announced something more thrilling and interesting. Maybe larger meetings, above 50 participants? Broadcasting capabilities? Noise suppression? Something…
Since I wrote that, 8 months have passed. Meeting sizes for Twilio Programmable Video are still limited to 50 participants. There are no broadcasting capabilities. No noise suppression. No background blurring. Nothing.
I can’t even recall any real additional feature that Twilio introduced for Twilio Programmable Video since that Signal event. Maybe updates and improvements to their React reference app, but nothing more.
Most other vendors were similarly slow to introduce new features throughout the pandemic. It seems like the trend now for video APIs is to focus on embedded iframes for faster development. These were discussed and experimented with years ago, and now seem to be finding new traction and interest.
It takes more time to develop features in CPaaS than it does on other platforms. The reason is that CPaaS vendors need to do 2 things others don’t have to deal with:
- Make the feature generic, solving a problem for more than a single use case or customer
- Document the feature properly, so that developers will be able to figure out how to use it
But let’s face it. These new requirements have been around for 15 months now…
There are obviously a few caveats here:
- I am griping here about video
- CPaaS has grown during the pandemic, so this hasn’t hurt them. Yet
- Video is usually a small percentage of traffic and income for a CPaaS vendor
UCaaS during the pandemic
UCaaS shows a stark contrast to how CPaaS responded.
Many of the leading vendors have added background blurring and replacement, noise suppression and other features and capabilities. They have done so at breakneck speed and they seem to be spewing out new features every week or so.
This isn’t limited to a single vendor. Off the top of my head: Zoom, Microsoft Teams, WebEx, Google Meet and RingCentral all introduced these features in the past year. And all of them seem to be investing further into these areas while pushing forward other initiatives they have, each with its own focus.
Remember Jeff’s interview? I asked him if he believed UC vendors should develop their services on top of CPaaS. This is what he answered:
Yeah. I believe that companies whose primary business is communications can and definitely should and would get competitive advantage by using a platform like Twilio to build upon. The reason why is this. It used to be when those UC companies started, their core competency was making the phone ring. Then they’d add some software functionality on top of it, sure, but the vast majority of what they worried about was how do I make the phone ring? The problem is Twilio has democratized that ability.
The existing UCaaS vendors, they would be wise to build on top of the same platform that any developer in the world can come and start to compete with them on. If they don’t, those independent software developers, they can actually start and build companies that are really compelling competitors, because they don’t have to focus on the low level bits. They’re focused on the things customers really care about, which is features, functionality, and the user experience that matters.
While mostly true, this doesn’t hold water these days for video communications. Relying on CPaaS vendors means you need to figure out the feature set that is necessary to be a compelling competitor yourself – larger groups, background replacement, noise suppression, …
CPaaS vendors need to get their act together in the video domain, or start losing customers that will just go build this on their own. Especially when we see Zoom coming up with their Video SDK and becoming a direct competitor to CPaaS vendors.
UCaaS vendors are having their own headaches in the market due to the dramatic changes that Microsoft and Google are bringing into this domain. I’ll leave that for a future article.
The pandemic also changed the dynamics in communication vendor valuations, shifting the focus to slightly different domains.
- Hopin and Clubhouse, which I already touched on in my previous article about the new era in WebRTC
- Agora (video CPaaS vendor) had a hugely successful IPO, followed by another spike due to the popularity of Clubhouse (which uses them). They are now back to roughly their initial IPO price point
- Twilio (CPaaS) increased in valuation throughout the pandemic. My guess is that this is mostly due to the increased use of voice and SMS. Less so video, where they invest a lot less
- Zoom. Need I say more?
The differentiation dilemma & Build vs Buy
How does one differentiate then?
- CPaaS vendors haven’t done enough during the pandemic to enable differentiation for the video use cases
- The same CPaaS vendors also haven’t differentiated enough from one another – at least not on the surface level
That leaves two ways forward:
- Build the missing features on top of your CPaaS vendor (if possible)
- Build your infrastructure in-house
I am seeing the following trends in CPaaS adoption and use. They used to be related to pricing, but now they are becoming more and more related to feature sets and differentiation needs:
- Most enterprises stick with the use of CPaaS vendors. They rely on them for their communication needs. They will switch from one CPaaS vendor to another if they can get better pricing or if their current vendor is lacking features (or provides poor support).
- Technology vendors and startups will either pick a CPaaS vendor as their starting point or prefer going it alone from the get go. Those that become hugely successful will end up actively working on replacing the CPaaS vendor with their own infrastructure. They will see that as an imperative a lot more than their enterprise brethren.
- Unified communication vendors will continue as they are, assuming that communication infrastructure is core to their business. They will work towards maintaining their own knowledge and experience in the area – doubly so after the pandemic.
Wake up and smell the coffee
CPaaS vendors should wake up and smell the coffee.
The world has changed. Drastically.
There’s no going back to the old ways – even without quarantines.
I believe that there’s a competitive advantage waiting here. CPaaS vendors have been shying away from these requirements. The first ones to come out with actual solutions and features that ease development for their customers will win due to this differentiation.
The reason this hasn’t happened so far is that traditionally, such things weren’t catered for directly by CPaaS vendors – it is out of their comfort zone. This leads to an opportunity that is up for the taking.
On a similar note, after successfully running the Future of Communications workshop with Dean Bubley, we found it to be both information packed and fun to do. If you are interested in a private session for your company – let us know.