5 Predictions for Synthetic Media in 2020

Victor Riparbelli
7 min read · Jan 16, 2020

1. Synthetic Media will expand from fun consumer applications into B2B media production pipelines

My overarching prediction for 2020 is that we will see Synthetic Media move from being primarily used in fun, consumer applications to making its way into traditional media production pipelines.

Today, the gap between how much media content companies want to produce and what they can afford in cost and time is very large.

Synthetic Media will bridge that gap, making it easier and more cost-effective to produce large amounts of audio, video or image content without the need for costly, physical processes.

Three years into building Synthesia, I am convinced that the impact of these technologies will be similar to that of synthetic instruments (making music on a computer) and Photoshop (creating images digitally).

In 2020, I predict we will see the first three use cases for synthetic video pick up steam. Within the next 3 years I am convinced that AI video editing of some form will be a line item on most video production projects, just like Photoshopping or color grading are today.

Synthetic images will enter the stock-photo market
You’ve probably seen the website that generates fake pictures of humans. If not, check it out here.

In 2020 these GAN-driven technologies will be used commercially to synthesise images of not only humans but also landscapes, objects, homes and so on.

None of these people are real (Credit: Rosebud.ai)

Due to the near-zero marginal cost of creating new images, we will see companies building marketplaces that host a vast amount of synthetic images that are much cheaper than traditional stock photography (which is already really cheap).

Rosebud.ai is already offering images of humans.

AI-video editing: translation / personalization
2020 will see a significant uptake in the use of AI to edit videos — taking existing video and enhancing it through personalization or seamless translation.

Scaling content across geographies is primarily important for advertisements and corporate communications. AI dubbing will take a single video and natively translate it into any language, without degrading the quality of the video. We’re already working with many Fortune 1000 companies to translate their videos and expect to significantly scale up operations in the coming year.

We documented the effects on viewer engagement using traditional overdubbing vs. AI dubbing and the results were significant.

Synthesia Translate

Personalization will also be massive. We live in a world of increasingly personalized experiences, which makes it difficult for brands to stand out in the deluge of online content. Personalized video will be an important tool to capture the viewer’s attention, and it is the natural step up from personalized emails and images.

AI-generated text-to-video (end of year)
By the end of the year, we’ll see the first commercially available products that generate entirely new video (as opposed to enhancing or editing existing video via translation or personalization) by simply typing in text.

A real actor will be used as the avatar but can be made to say different things by using text-to-speech technology as the input to the system.

One of the first use cases for this technology will likely be virtual assistants and chatbots, where synthetic technology can provide a more lifelike experience than the very CGI-looking products that exist today.

While AI-driven video synthesis has advanced far, there are still fundamental problems to solve in body language, audio/voice and scalability before you’ll be able to generate an advertisement purely through software without the need for cameras.

But in the future, video creation will be as easy as writing an email, and this will be the first step to fully automating that process.

2. Synthetic voice will cross the uncanny valley

Synthetic voice has improved a lot in the last couple of years. Yet, in my opinion, we have still to see production-ready voice synthesis that truly escapes the uncanny valley and is usable as a replacement for traditional actor-driven voice-over.

Today voice synthesis technology is primarily used for virtual assistants such as Siri or Alexa.

In 2020 I think we’ll see rapid adoption of voice technology in traditional media production pipelines, particularly in video games and audiobooks, which are markets that today face significant challenges scaling human voice-over.

Think about gaming: vast amounts of dialogue need to be recorded in a studio using actors. Voice synthesis is a cost-saving mechanism, but it also opens up new possibilities for much more open-ended dialogue from in-game characters. Speech snippets can be rendered out on demand as opposed to being pre-recorded in a studio.

I believe adoption will be driven by significant technical leaps in two major areas:

The ability to easily generate many different voices
It’s still difficult to create many different-sounding voices at scale. Alexa, Siri and similar technologies are the products of lots of effort and resources to produce a single voice, often requiring very large datasets for training.

For personal-assistant use cases a single voice works well. But as a tool in traditional media production, the ability to easily create new, unique voices is important: each brand or company will want its own distinct voice(s) that are in tune with its brand and tone of voice.

Controlling emotion, speed and pronunciation
Would you like to listen to Siri read out an entire eBook? Likely not. For short voice snippets the voice quality is digestible, but it lacks the emotion, pronunciation and variations in speed that make a voice interesting to listen to.

DeepZen editing interface

Several companies are building creator tools that enable a human editor to change and modulate the voice to fit the content. In the near term this human-assisted approach will be key to producing high-quality results.

Companies worth watching: Zenith, DeepZen, Modulate.ai, Replica Studios, Respeecher, SpeakAI

3. Ethical Standards will Emerge

As more and more companies look to synthetic media technologies to accelerate their content creation, we will increasingly see these companies take a formal stance on how to use these technologies and which providers to work with.

Tech companies have been under lots of scrutiny in the past couple of years, and as new technologies rapidly develop, more and more companies are starting to think deeper about how the products they build impact not just the bottom line but society as a whole.

We’re seeing this front and center with regard to environmental issues but also in areas such as facial recognition or privacy (e.g. Apple limiting use of social media). Synthetic media brings similar questions forward: Are we OK with creating non-consensual content if it is clearly marked? Should viewers know if they’re watching a synthetic video? How do we pay actors for their likeness?

We published our own internal framework for ethical use of synthetic media which can be found here.

4. Consumer applications of Synthetic Media will be on the rise

In the past year we’ve seen many examples of consumer-driven synthetic media technologies evolve.

Snapchat released their gender-swap filter almost a year ago. FaceApp made us look older and spurred a (really) weird conspiracy theory that it was all a ploy from the Russian government to gather images of our faces. ZAO released a ‘deepfake’ app that can swap the user’s face into pre-selected clips from well-known film and TV (and also spurred a weird conspiracy theory).

ZAO enables you to put yourself into well-known movie scenes.

Facetune’s parent company, Lightricks, raised $135m to build what I think of as the content creation suite for Gen Z: accessible photo/video editing apps that are born on mobile and are able to produce professional-looking results without the effort or skills traditionally required. In the not-so-distant future I am convinced that this will be the de facto way to create fast content for brands. Canva is a great indicator of the easy-to-use content creation tools that we’re headed towards.

I predict we’ll see more and more of these kinds of applications and that we will increasingly see apps and/or social networks revolve around a particular synthetic media technology. For example, Instagram (in part) took off because of the ability to use filters that made any photo look great.

It’s not hard to imagine the next iteration of TikTok being one where users can sing beautifully, cosplay as their favourite celebrities or create their own digital avatar to express themselves.

5. Deepfakes won’t impact the 2020 Election

Will we see Dr. Fakenstein-style faceswapping and other satirical videos/PSAs emerge? Definitely. But I highly doubt we will see any genuine attempts to use deepfakes for disinformation purposes.

Why?

First, reading the news makes it look like it’s easy to create synthetic videos. In reality it’s really, really hard, particularly if the goal is to edit what someone says in a video as opposed to merely face swapping, which only changes the appearance of the face in a video.

Secondly, given the speed and scale at which such a video would get debunked (see the Nancy Pelosi video, which wasn’t even a deepfake), the production costs would simply not be worth it today.

Politicians and public figures have significant means of providing provenance of video. And unlike text-based disinformation, which can be more nuanced in the way the lie is presented, a video is demonstrably either real or not.

For now it’s simply much easier, more effective and more scalable to write fake articles or do basic image/video editing to change the context of existing media. Most experts agree on this: even basic image/video editing is only prominent in around 10% of all disinformation today. The rest is purely written disinformation.

That said, I’m sure the media will go bonkers either way.

Written by Victor Riparbelli

CEO & Co-founder @ Synthesia.io — we’re building the future of synthetic media. Interested in technology, culture, music & art.
