Hyperreality and the Urgent Need to Scale Content
The demand for virtual, augmented, and mixed reality content is set to skyrocket to such an extent that current content creation and distribution technologies cannot cope. At the same time, content will evolve to a point where physical and virtual realities appear to merge, giving rise to a new "hyperreality": the inability to distinguish what's real from a simulation of it.
"If the forecasts are right, not only will a whole new set of tools and workflows be needed but these tools need to be in the hands of a lot more people," says Jon Wadelton, CTO at visual effects (vfx) software developer Foundry.
The biggest market for virtual reality (VR) content won't be entertainment but training and enterprise scenarios (such as medical surgery and CAD). In its Virtual Reality Industry Report: Spring 2017, Greenlight forecasts a $26.7 billion global VR content marketplace by 2021. The bulk of this is in design and industrial visualisation ($18.2bn), with games creation at around $4 billion.
That sum is dwarfed by figures from the International Data Corporation, which estimates worldwide revenues for VR and augmented reality (AR) will grow to more than $162 billion in 2020 (including hardware sales). That's from a base of just $5.2 billion in 2016.
"That is a massive amount of AR/VR revenue in a short time period and means massive amounts more content which begs the question who is going to make it all," contends Wadelton. "The market is going to need a lot of content, and they are going to want it to look as real as possible—to see things digitally that are even better than the real world. To do that, the entire production workflow needs to be revised and it needs to be made scalable."
As a vfx production and postproduction software house with credits including Gravity and Star Wars, Foundry has potential answers to this in R&D.
Wadelton outlines the numerous challenges to creating hyper-real content in such volumes.
This starts with capturing 360° video with multiple cameras. "Three to four years ago you had to build your own camera," says Wadelton. "These days you can buy one off the shelf. There are tools for stitching the images together such as Kolor's Autopano, Foundry's Cara, and camera systems which include realtime stitching (such as Nokia Ozo). We're getting there with camera capture, but there are still issues. These techniques can record from only one point of view. You can't walk behind walls, for example, move into a room, or interact with objects. It doesn't in principle lend itself to a greater sense of immersion and therefore doesn't deliver that hyper-real feeling."
Part of the solution is to run the content through a games engine. This brings its own set of issues, not least that games engines render at 90fps per eye (a budget of roughly 11 milliseconds per frame), a rate at which mobile devices are not powerful enough to render in realtime.
There are other issues too. "For mixed reality, you need to perform accurate real-time lighting capture. Getting game assets into the engine is quite hard unless you reduce the complexity of the asset so it can run in realtime. It remains very hard to capture human expression to lay on top of digital humans."
Another big challenge is reusing assets. "We want to get to a place where if you create high-resolution master assets for a commercial or feature film, then you can also publish that to 3D renderers, a game engine, or some unknown format in the future," says Wadelton. "There's a big challenge around that. We don't want to be manually re-creating assets."
Foundry's R&D Project Bunsen is exploring this. Its framework can be used for both high-quality image and video rendering and for preparing data for VR and real-time use.
Arguably the biggest issue is simply that of creating sufficient 3D content to satisfy the predicted surge in demand. "We need to make tools so intuitive that making 3D assets is simple for anyone who is currently not involved in visual effects," he insists.
An alternative to using multiple cameras is to capture volumetric data using light fields or point clouds. A light field captures all the light rays at every point in space, travelling in every direction. A point cloud is a large collection of points acquired by 3D laser scanners to create 3D representations of existing structures.
Both allow the capture of depth information and, in theory, real-time capture from multiple points of view.
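To illustrate why that depth information matters, the sketch below (illustrative Python only, not any vendor's actual pipeline; the point values and camera parameters are invented) re-projects a toy point cloud into virtual cameras at two different positions—something a single-viewpoint 360° recording cannot do.

```python
# A minimal sketch of free-viewpoint rendering from volumetric data.
# Each point carries an XYZ position, so it can be re-projected into a
# virtual camera placed anywhere in the scene. A real capture would also
# store a colour per point; a simple pinhole camera is assumed here.
import numpy as np

# Toy point cloud: 1,000 points scattered in front of the origin (metres).
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(1000, 3)) + np.array([0.0, 0.0, 4.0])

def project(points, cam_pos, focal=800.0, width=1920, height=1080):
    """Project world-space points into a virtual camera at cam_pos
    (looking down +Z), returning pixel coordinates and depths."""
    rel = points - cam_pos              # positions relative to the camera
    in_front = rel[:, 2] > 0.1          # discard points behind the camera
    rel = rel[in_front]
    u = focal * rel[:, 0] / rel[:, 2] + width / 2.0
    v = focal * rel[:, 1] / rel[:, 2] + height / 2.0
    return np.stack([u, v], axis=1), rel[:, 2]

# The same capture rendered from two different viewpoints.
pixels_a, depth_a = project(points, cam_pos=np.array([0.0, 0.0, 0.0]))
pixels_b, depth_b = project(points, cam_pos=np.array([0.5, 0.0, 1.0]))
print(pixels_a.shape, pixels_b.shape)
```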
Development in this space is being led by companies including UK-based Figment Productions, which has its own motion synchronisation and control platform for VR; 8i, the New Zealand-headquartered company which has a studio in Culver City fitted with multiple cameras to develop holographic video for display on mobile phones; and Lytro, the light field camera developer behind the Immerge VR and Cinema cameras, which is currently targeting short vfx sequences. Foundry has a development partnership with Lytro in which it is integrating its Nuke compositor alongside Lytro plug-ins for post work. It is also extending this into the cloud with its forthcoming virtual pipeline system, Elara.
MPEG-I on the Drawing Board
Nonetheless, the biggest headache faced by all companies working in immersive formats is handling the vast amounts of data. That's where the work of international standards body MPEG comes in. It has begun conducting experiments into compression technologies "for use in applications that provide an increased sense of immersion beyond that which existing video coding standards can provide," according to an ISO paper on the subject.
The proposed ISO/IEC 23090 (or MPEG-I) standard targets future immersive applications. It's a five-stage plan which includes an application format for omnidirectional media (OMAF) "to address the urgent need of the industry for a standard in this area"; and a common media application format (CMAF), the goal of which is to define a single format for the transport and storage of segmented media, including audio/video formats, subtitles, and encryption. This is derived from the ISO Base Media File Format (ISOBMFF).
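For context, ISOBMFF (and therefore CMAF) organises a file as a sequence of "boxes," each prefixed by a 32-bit size and a four-character type. The minimal walker below is a hedged sketch rather than a production parser (the file name is illustrative); it lists the top-level boxes of an MP4/CMAF segment.

```python
# Minimal ISOBMFF box walker: each box starts with a 4-byte big-endian size
# and a 4-character type; payloads (and child boxes) follow within that size.
import struct

def list_top_level_boxes(path):
    boxes = []
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            boxes.append(box_type.decode("latin-1"))
            if size == 1:                      # 64-bit "largesize" follows
                size = struct.unpack(">Q", f.read(8))[0]
                f.seek(size - 16, 1)
            elif size == 0:                    # box extends to end of file
                break
            else:
                f.seek(size - 8, 1)            # skip the box payload
    return boxes

# Typical output for a fragmented segment: ['ftyp', 'moov', 'moof', 'mdat']
print(list_top_level_boxes("example_segment.mp4"))
```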
While a draft OMAF specification is expected by the end of 2017 and will build on HEVC and DASH, the aim by 2022 is to build a successor codec to HEVC, one capable of lossy compression of volumetric data.
"Light Field scene representation is the ultimate target," according to Gilles Teniou, Senior Standardisation Manager - Content & TV services at mobile operator Orange. "If data from a Light Field is known, then views from all possible positions can be reconstructed, even with the same depth of focus by combining individual light rays. Multiview, freeview point, 360° are subsampled versions of the Light Field representation. Due to the amount of data, a technological breakthrough – a new codec - is expected."
This breakthrough assumes that capture devices will have advanced by 2022 – the date by which MPEG aims to enable lateral and frontal translations with its new codec. MPEG has called for video test material, including footage from plenoptic cameras and camera arrays, in order to build a database for the work.
"If we can capture a volumetric representation of the world then we can produce true 6Dof since you can't do this with 360-video," says Wadelton. "With 360 you can't see behind things or around things which breaks the experience. For VR, a new compression system is a must to be able to deliver hyper-real versions of the world."
Filmmaker Keiichi Matsuda's video Hyper-Reality shows an environment where real and CG worlds are merged into one.
Hyperreality Explored
The concept of hyperreality itself demands closer investigation. Officially coined in 1981 by French philosopher Jean Baudrillard, the idea can be traced back at least as far as naturalism, the nineteenth-century movement towards representing things closer to the way we see them. The invention of still photography, then motion pictures, and the addition of sound, colour, and stereo 3D are all stages on the same trajectory toward greater immersion and an ever-closer simulation of reality.
Arguably, scripted (or structured) documentary ("reality") shows like Keeping Up with the Kardashians are one form of hyperreality. Facebook is another, where what we choose to post online may be what we want the world to see of our life.
The emergence of AR/VR/MR (mixed reality) takes this to another level. Here, hyperreality is about whether you can tell the difference between real and unreal when you're inside that environment.
Literature and films have explored the concept, including Ready Player One, the highly anticipated Steven Spielberg feature adaptation of Ernest Cline's novel.
VR is already evolving from being able to look into a scene and rotate around a fixed spot (so-called three degrees of freedom, or 3DoF) to six degrees of freedom (6DoF), which also tracks position so the viewer can move forwards and backwards through the scene.
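To make the distinction concrete, here is a minimal sketch (not tied to any particular headset runtime; the class and function names are invented) of the pose data each mode tracks: 3DoF stores orientation only, while 6DoF adds a position, which is what lets a viewer physically step through the scene.

```python
# Illustrative 3DoF vs 6DoF pose data. In 3DoF only head orientation is
# tracked, so the viewpoint stays locked to one spot; 6DoF adds a position,
# so stepping forward actually moves the virtual camera.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Pose3DoF:
    # Orientation only: yaw, pitch, roll in degrees.
    orientation: Tuple[float, float, float] = (0.0, 0.0, 0.0)

@dataclass
class Pose6DoF(Pose3DoF):
    # Orientation plus translation: x, y, z in metres.
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)

def step_forward(pose, metres=0.5):
    """Move the viewer forward; only a 6DoF pose can express this."""
    if isinstance(pose, Pose6DoF):
        x, y, z = pose.position
        return Pose6DoF(pose.orientation, (x, y, z + metres))
    return pose  # a 3DoF pose has nowhere to record the movement

print(step_forward(Pose3DoF()))   # orientation only, movement is lost
print(step_forward(Pose6DoF()))   # position updates to (0.0, 0.0, 0.5)
```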
A stage along this evolution is location-based VR, where the virtual world is matched with the physical world. In some VR theme parks, users walk untethered in spaces which are mapped to the VR simulation and are able to sense touch through haptics. Location-based VR is a rapidly growing business: Greenlight forecasts that installations in malls and movie theatres will bring in $222 million worldwide this year; by 2021, that amount will have grown to almost $1.2 billion.
Another related development is MR, in which virtual content is integrated with the real world.
An example of this mixed reality is envisioned by designer and filmmaker Keiichi Matsuda, whose art video Hyper-Reality has gone viral. It shows the real and CG worlds merged into one. "In this scenario life itself has become gamified," observes Wadelton, echoing the central idea behind Ready Player One.
Google, Apple, and Facebook are already capturing data, such as the geometry of real-world objects scanned via mobile phones (cameras, GPS, depth sensors), into databases. Data from Google's Tango or Apple's ARKit, for example, can be used to build a geometric representation of the world into which content creators might be able to license and insert CG, rather than having to map the world afresh each time.
Wadelton reveals that in a poll conducted by Foundry, nearly a third of those surveyed believed that real-world experiences will eventually be replaced by virtual reality. "Already with vfx we have got to the point where the simulation on screen is better or more interesting than the real thing," he says. "People expect to experience hyperreality."