Researchers invent real-time 3D video conversion
Watching football could have an extra kick if technology to convert 2D footage into 3D takes off (Image: Christine Daniloff/MIT)

Computer scientists have developed a system that can automatically convert 2D video footage of live sporting events, like football matches, into 3D video, so that it can be viewed in virtual reality.

At the moment, we are only able to watch movies in 3D, and that is only possible because most major Hollywood box office releases employ multiple visual effects studios to manually paint and animate every single frame in the scenes that require computer graphics during post-production. This can involve hundreds of artists at a time.

Being able to make live televised events like sports matches in 3D would mean that the content could be viewed on virtual reality headsets, which could help to push consumer interest in VR technology.

So researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Qatar Computing Research Institute (QCRI) set out to convert the video footage of live events, such as football matches, from 2D on a TV into 3D, and they have come up with a system that does the trick using video game technology.

Studying Fifa to create 3D video conversion

The researchers noted that sports video games like EA's Fifa football simulator game look so realistic because they make use of incredibly detailed 3D maps of the virtual football pitch that the user has to navigate while playing the game.

For example, if the player initiates a move, the game is programmed to adjust the 3D map accordingly and generate a 2D projection of the scene that unfolds after the ball is kicked. No matter how the user changes the viewing angle of the match, the 2D scene unfolding on the screen still corresponds to the 3D map.
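The relationship between a 3D map and its 2D projection can be illustrated with a simple pinhole camera model. This is only a sketch of the general idea, not the game's actual rendering code; the function name `project_point` and the `focal_length` parameter are hypothetical and chosen for illustration.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D point (x, y, z) in camera space onto a 2D image plane.

    Assumes a simple pinhole camera looking along the z axis. The depth z
    is used to scale x and y, and is then discarded in the 2D result.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / z, focal_length * y / z)


# Two points at different depths can land on the same 2D position,
# which is why depth has to be recovered from somewhere else.
near = project_point((1.0, 2.0, 1.0))
far = project_point((2.0, 4.0, 2.0))
```

Here `near` and `far` are identical 2D coordinates even though the original points sit at different depths. A game holds on to the 3D map, so it always knows the depth behind every pixel it draws; ordinary broadcast footage does not.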

So the researchers decided to replicate this effect in order to achieve a 3D projection, by replaying Fifa 13 over and over again so that they could capture screenshots using Microsoft's video-game analysis tool PIX, and then extracting the 3D map from each screenshot.

Using a standard algorithm that is able to detect the difference between two images, the researchers narrowed down the images to just the screenshots that best captured the range of possible viewing angles and player configurations that the game presented and put them in a database.
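The article does not name the image-difference algorithm the researchers used. As a rough sketch of how such a filtering step might work, the example below assumes a mean absolute pixel difference and a greedy selection rule; the names `mean_abs_diff` and `select_distinct` and the `threshold` value are illustrative assumptions, not details from the research.

```python
def mean_abs_diff(a, b):
    """Average absolute difference between two images.

    Each image is a flat list of grayscale pixel values of equal length
    (an assumed, simplified representation).
    """
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def select_distinct(screenshots, threshold):
    """Keep a screenshot only if it differs enough from every one already kept."""
    kept = []
    for shot in screenshots:
        if all(mean_abs_diff(shot, k) > threshold for k in kept):
            kept.append(shot)
    return kept


shots = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],          # nearly identical to the first screenshot
    [100, 100, 100, 100],
    [101, 100, 100, 100],  # nearly identical to the third screenshot
]
database = select_distinct(shots, threshold=10)
```

With these toy inputs, only the first and third screenshots are kept, because the other two are near-duplicates of images already in the database.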

They then programmed a system to look at the tens of thousands of screenshots in the database and locate matches between the screenshots and every frame of 2D video footage from an actual televised football match.

When the system finds a match between a Fifa screenshot and the video frame of the actual football match, the computer automatically adds depth information to the corresponding sections of the video feed and stitches them together, resulting in footage that looks 3D.
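The matching and depth-transfer steps might look something like the sketch below. It assumes that each database entry pairs a screenshot section with its depth values, that similarity is measured by a sum of absolute pixel differences, and that "stitching" is simple concatenation; all of these, along with the names `best_match` and `transfer_depth`, are simplifying assumptions rather than the researchers' actual method.

```python
def sum_abs_diff(a, b):
    """Total absolute difference between two equal-length pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def best_match(section, database):
    """Return the (pixels, depth) entry whose pixels are closest to the section."""
    return min(database, key=lambda entry: sum_abs_diff(section, entry[0]))


def transfer_depth(frame_sections, database):
    """Build a depth list for a video frame by borrowing depth from matches.

    Each section of the frame takes the depth values of its best-matching
    database entry, and the results are joined in order.
    """
    depth = []
    for section in frame_sections:
        _, matched_depth = best_match(section, database)
        depth.extend(matched_depth)
    return depth


# Hypothetical database: (screenshot section pixels, depth values) pairs.
game_database = [
    ([0, 0], [5.0, 5.0]),
    ([200, 200], [1.0, 2.0]),
]
frame_depth = transfer_depth([[10, 5], [190, 210]], game_database)
```

In this toy case the dark section of the frame borrows depth from the first entry and the bright section from the second, giving one depth value per pixel of the frame.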

3D conversion takes only a third of a second per frame

For now, the researchers say that the system takes about a third of a second to process each frame of video, but successive frames could be processed in parallel, so the delay would only happen once. This would mean that a broadcast delay of one to two seconds would be enough to enable the 3D conversion to keep up with the video stream of the live match, but the researchers hope to reduce the delay even further.
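To see why parallel processing means the delay is only paid once, consider the small simulation below. It assumes a frame rate of 30 frames per second and a simple round-robin assignment of frames to workers; neither figure comes from the researchers, and the function names are hypothetical.

```python
from fractions import Fraction
import math


def workers_needed(seconds_per_frame, fps):
    """Smallest number of parallel workers that can keep up with the stream."""
    return math.ceil(seconds_per_frame * fps)


def frame_delays(num_frames, seconds_per_frame, fps, workers):
    """Delay between each frame's arrival and the end of its processing.

    Frames are handed to workers in round-robin order; a worker starts a
    frame as soon as the frame has arrived and the worker is free.
    """
    free_at = [Fraction(0)] * workers
    delays = []
    for i in range(num_frames):
        arrival = Fraction(i, fps)
        w = i % workers
        start = max(arrival, free_at[w])
        finish = start + seconds_per_frame
        free_at[w] = finish
        delays.append(finish - arrival)
    return delays


per_frame = Fraction(1, 3)   # a third of a second per frame, as reported
fps = 30                     # assumed broadcast frame rate
enough = workers_needed(per_frame, fps)
steady = frame_delays(90, per_frame, fps, enough)
too_few = frame_delays(90, per_frame, fps, enough // 2)
```

Under these assumptions, ten workers are enough, and every frame comes out exactly a third of a second after it arrives. With only five workers the backlog grows and the delay keeps increasing, which is why the degree of parallelism matters for a live broadcast.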

"Any TV these days is capable of 3D. There's just no content. So we see that the production of high-quality content is the main thing that should happen. But sports is very hard," said Wojciech Matusik, an associate professor of electrical engineering and computer science at MIT and one of the system's co-developers.

"Our advantage is that we can develop it for a very specific problem domain. We are developing a conversion pipeline for a specific sport. We would like to do it at broadcast quality, and we would like to do it in real-time. What we have noticed is that we can leverage video games."