While investors might fret about when or whether forthcoming virtual reality headsets will be ready to support a new technology ecosystem, Google is building out a substantial one of its own around its low-end Cardboard viewer. Much of the work behind the scenes is led by a team in Seattle.
There are now more than 5 million Cardboard devices in circulation from Google and from third-party manufacturers, who have taken the open-source plans for the viewer and produced versions in plastic, wood, and even the form of the old View-Master stereoscopic slide viewer.
Now, Google may be building a new, more substantial virtual reality headset, according to a report Sunday in the Financial Times. This one would also use smartphones, says the report, citing unnamed sources, but would apparently be at least a step or two up from the basic Cardboard experience. The new design would include a head-mounted device with built-in motion sensors, as with the Samsung Gear VR, rather than relying only on the capabilities of the smartphone slotted into the device, which is how Cardboard works.
Cardboard is merely a box with a pair of lenses inside it to help you focus on a smartphone running the Cardboard app, which Google says has been downloaded 25 million times.
“The box doesn’t do anything magical other than focus your eyes on the phone,” said Google’s Steve Seitz at a Seattle virtual reality conference late last month. “It’s your phone that is the virtual reality device. It has high-res displays and sensors like the gyros to track your head as you move back and forth.”
Regardless of the viewing device, virtual reality needs content. And much of Google’s work to enable creation of virtual reality content—from high-end video capture rigs for professionals to a more accessible way of taking virtual reality panorama photographs with your smartphone—has been carried out by a Google team in Seattle, led by Seitz, whose Google title is “teleportation lead,” which hints at the promise of this emerging platform. (I’m not sure if that’s his real title, but it’s how he was introduced during the Technology Alliance virtual reality event.)
“One of the applications that I am most excited about in VR is teleportation,” said Seitz, who is also a computer science professor at the University of Washington, and whose credits include a precursor to Microsoft’s awe-inspiring Photosynth. “So basically, trying to take you somewhere that you’re not. Really that’s the goal of VR.”
But before you can teleport with virtual reality to Iceland or the surface of Mars, someone or something has to go there first with a camera to capture the sights and sounds.
Google developed a camera rig made up of 16 GoPro cameras arrayed in a circle for capturing 3D, 360-degree virtual reality video, along with a system to process the footage, called Google Jump, that takes advantage of the Internet giant’s powerful data centers. San Mateo, CA-based GoPro announced in September that the first Jump-compatible camera rig, the GoPro Odyssey, would be available on a limited basis to professionals beginning in November for $15,000.
The technology underlying Google Jump is pretty impressive. Seitz and his team began working on the system in late 2014.
They started with panoramic photography, a well-known technology applied in Google products like Street View.
“What’s neat is that when you look at this kind of imagery in a head-mounted display, it really comes alive, because it’s now all around you, and as you move your head you see different parts of the scene, just like you would if you were there,” Seitz said. “However, it doesn’t look quite real. The experience of showing this image in Cardboard, for example, it’s like being in a room with the image projected on the walls, kind of like a planetarium.”
Adding depth would help make a more immersive experience. That requires capturing a scene in stereo so that a different image can be projected to each eye, creating the sensation of depth.
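The geometry behind that sensation of depth can be sketched in a few lines. This is an illustrative example, not Google's code: in the standard pinhole stereo model, a point's apparent horizontal shift between the two eyes' images, called its disparity, is inversely proportional to its distance.

```python
# Depth from stereo disparity: an illustrative sketch of the standard
# pinhole-stereo relation, not any particular product's implementation.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Classic stereo relation: depth = focal_length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Example: a 1000-pixel focal length, cameras 6.5 cm apart (roughly the
# spacing of human eyes), and a 20-pixel disparity put the point 3.25 m away.
print(depth_from_disparity(1000.0, 0.065, 20.0))  # 3.25
```

Nearby objects produce large disparities and distant ones produce small disparities, which is why a scene captured from two offset viewpoints reads as three-dimensional when each image is shown to the corresponding eye.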
While there were existing techniques for capturing stereoscopic images and for capturing 360-degree panoramic images, “capturing stereo and 360 [degrees] turns out to be very hard, to the extent that no one has really done this with video,” Seitz said. “Very few people have even tried to do this with video before.”
One early solution, later abandoned, involved a pair of cameras mounted together on an axis that rotates 360 degrees to capture the surrounding scene in 3D. But that approach fails for moving scenes, because the spinning cameras see different parts of the scene at different moments. You could perhaps spin the cameras very fast, but Seitz and team ruled that out because it might be dangerous.
The team then turned to GoPro video cameras. Two cameras could again be paired to produce a video image with depth. After lots of experimentation, analysis, and optimization, they settled on a configuration of 16 GoPros arrayed in a circle, capturing 16 stereo viewpoints.
“But 16 is not enough,” Seitz said. “What we’d like to be able to do is interpolate hundreds or thousands of viewpoints in between, basically everywhere along the perimeter of the circle.”
That’s where advanced computer vision algorithms come in: they interpolate viewpoints wherever along the circle you turn your head, eliminate vertical seams where the different camera images meet, and reduce the ghosting and alignment artifacts that arise because different points in the scene sit at different distances from the camera, which is the very depth information the team set out to capture in the first place.
“We’re really doing, if you will, a stitching algorithm that compensates for depth differences, and this produces a much more seamless composite that looks pretty great,” Seitz said.
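One way to picture the interpolation step, with the caveat that this is a toy sketch and not Jump's actual algorithm (which, as Seitz describes, also compensates per-pixel for depth): for a virtual viewpoint at some angle on the ring, find the two nearest physical cameras and weight their images by angular proximity.

```python
import math

# Toy sketch of choosing blend weights for view interpolation on a 16-camera
# ring. Illustrative only; the real Jump pipeline does depth-aware stitching.

NUM_CAMERAS = 16

def ring_blend(angle_deg: float):
    """Return the two neighboring camera indices and their blend weights
    for a virtual viewpoint at angle_deg on the ring (camera 0 at 0 degrees)."""
    spacing = 360.0 / NUM_CAMERAS            # 22.5 degrees between cameras
    pos = (angle_deg % 360.0) / spacing      # position measured in camera units
    left = int(math.floor(pos)) % NUM_CAMERAS
    right = (left + 1) % NUM_CAMERAS
    w_right = pos - math.floor(pos)          # fraction of the way to the next camera
    return left, right, 1.0 - w_right, w_right

# A viewpoint at 30 degrees falls between camera 1 (22.5 deg) and camera 2
# (45 deg), with camera 1 weighted about 2/3 and camera 2 about 1/3.
print(ring_blend(30.0))
```

Sweeping the virtual angle continuously around the circle is what yields the "hundreds or thousands of viewpoints" between the 16 physical ones.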
If that all sounds like a heavy computing lift, that’s because it is. It would take months to process an hour of video shot with the Odyssey camera on a single computer. Luckily, Google happens to have a lot of computers.
“Using data centers and cloud computing, parallelism, all that kind of stuff, we can get this down from months to hours, which is really what makes this whole project feasible,” Seitz said. “This is a big deal.”
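The speedup is possible because frames (or chunks of frames) can be stitched independently of one another. A toy sketch of that division of labor, using a stand-in workload rather than real stitching:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy illustration of frame-parallel processing. The real Jump pipeline fans
# out across many machines in Google's data centers; here a stand-in per-frame
# function (squaring the frame index) runs across a local worker pool.

def process_frame(frame_id: int) -> int:
    return frame_id * frame_id  # placeholder for expensive per-frame stitching

def process_video(frame_ids, workers: int = 8):
    # Frames are independent, so workers can take them in any order;
    # Executor.map still returns results in the original frame order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_frame, frame_ids))

print(process_video(range(5)))  # [0, 1, 4, 9, 16]
```

With no dependencies between frames, throughput scales roughly with the number of workers, which is how months of single-machine work collapses into hours on a data center.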
Since most people don’t have their own data centers, Google is making its computing power available to those who buy a Jump-compatible camera, such as the Odyssey, more of which will be rolled out shortly, Seitz said. “It will be a full system from the camera to delivering pixels to you,” he said.
Google is using YouTube to distribute virtual reality video, of course. An icon shaped like a Google Cardboard viewer, at the bottom right of the video player, launches it.
There are some limitations. You can’t move the position of your head through the scene; you’re limited to a stationary, 360-degree rotation. Nor can you see up and down beyond the GoPro’s 120-degree vertical field of view. Asked about this, Seitz cited limitations of Cardboard. “We wouldn’t have a [head mounted display] that could support it,” he said during the Jan. 28 conference. He added later: “We’re certainly interested in display devices beyond this. We haven’t announced any product plans.”
In any case, adding the ability to move your head will likely create more work for Seitz and his team. “I think moving your head, the quality bar is a lot higher and so the vision algorithms have to be more sophisticated to really work reliably all the time,” he said.
As with any other visual medium, virtual reality will feature a broad range of quality. Google Jump and the GoPro Odyssey represent Google’s current entry in high-end, professional virtual reality video. Seitz’s team is also responsible for a newly launched product called Cardboard Camera, which puts limited virtual reality image capture capabilities in the hands of anyone with a modern Android smartphone.
“Wouldn’t it be great to have a VR camera in your pocket?” Seitz said.
In this case, you can only capture still images; video would require a phone with two cameras. The process is the same as taking a 360-degree panorama: you rotate your smartphone in a circle to capture the scene around you. Cardboard Camera takes one image for the right eye and one for the left, capturing the scene in stereo to create the depth needed for a more immersive experience.
The software, running on the smartphone itself, detects and tracks features in the scene as you move the camera in a circle. It again employs advanced computer vision algorithms to calculate the position and orientation of your phone relative to those features.
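As a rough illustration of the kind of quantity such tracking recovers, consider a deliberately simplified sketch (not the app's actual math, which solves for full 3D orientation from many features): if the camera rotates purely horizontally, the amount features shift across the image tells you how far the phone has turned.

```python
# Simplified sketch: estimating yaw rotation from tracked-feature shift,
# assuming pure horizontal rotation and a known horizontal field of view.
# Real panorama software tracks many features and solves for 3D pose.

def rotation_from_shift(shift_px: float, image_width_px: int, fov_deg: float) -> float:
    """A feature sweeping across the whole image corresponds to rotating
    by one full field of view, so rotation scales linearly with the shift."""
    return shift_px / image_width_px * fov_deg

# Features shifting 270 px in a 1080-px-wide frame with a 60-degree FOV
# imply the phone rotated 15 degrees between frames.
print(rotation_from_shift(270, 1080, 60.0))  # 15.0
```

Accumulating such estimates as the phone sweeps the circle is what lets the software know where each captured image belongs in the final panorama.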
“Using this information, we can then re-project the images to produce a stereoscopic panorama in the same kind of format as [the Odyssey] produces, but just using your cell phone,” Seitz said, adding: “We just launched it a month ago and there’s already about 750,000 photos taken.”