People have long been fascinated with re-creating 3D images on a 2D plane; moviegoers were wearing the iconic red-and-blue glasses to experience 3D effects as early as the 1910s. Within the past few decades, 3D has made big leaps from red-and-blue glasses, through polarized glasses and active shutter glasses, to the latest generation of naked-eye 3D displays, available not only on TVs but also on tablets and smartphones. However, most 3D videos constructed from 2D frames fail to impress with their simulated results. For this reason, Prof. Wing-bun Lee of the Department of Industrial and Systems Engineering led a research team to develop the Intelligent 3D Stereoscopic Imaging System on Plenoptic Camera, which captures depth information in a scene and displays the captured images with compelling detail and accuracy on VR goggles and 3D TVs. We were glad to have a member of the research team, Dr Lihua Li, explain the ground-breaking technology to us.
Depth information from several angles
We perceive depth and 3D because each eye sees an object from a slightly different angle. Stereo cameras with two lenses that imitate human binocular vision have also been on the market. So why do we need more than two images for a stereoscopic image? “Theoretically speaking, you only need to take a picture of the same scene with two cameras placed slightly apart from each other to capture a 3D image. But that would only provide the depth relationships between the objects in a scene from one viewpoint, and it only works if you can make sure the viewer stays at the sweet spot without moving his head. For this project, we aim to capture high-quality 3D images and display them on VR goggles and a naked-eye 3D TV designed to be watched from 28 different angles. Thus, more images captured from more perspectives create a more comprehensive depth map, and hence more accurate and convincing 3D images,” explained Dr Li.
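For readers curious about the underlying geometry, the short Python sketch below shows how depth is usually recovered from two such views: the standard pinhole stereo relation Z = f·B/d turns the pixel disparity d between matching points into a distance, given the focal length f and the camera baseline B. The function name and the sample focal length and baseline are illustrative assumptions, not values taken from the team's system.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (metres)
    using the pinhole stereo relation Z = f * B / d.
    Pixels with zero disparity are left undefined (infinity)."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# Illustrative numbers only: a 1000-pixel focal length and a 5 cm baseline.
disparity = np.array([[20.0, 10.0],
                      [ 5.0,  0.0]])
print(depth_from_disparity(disparity, focal_px=1000.0, baseline_m=0.05))
# Larger disparity (nearer objects) maps to smaller depth values.
```

With more than two views, each extra perspective contributes its own disparity estimates, which is what allows the more comprehensive depth map Dr Li describes.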
Lobster eye
Light-field photography (also known as plenoptic photography) captures many images of the same scene taken at the same time from different angles. There are several ways to do this, such as placing an array of micro-lenses in front of a camera, or using multiple cameras. However, the imaging system the team developed takes a different approach – it was inspired by how lobsters see. Instead of refracting light beams like the human eye, a lobster’s eye reflects them inside long and narrow square mirror tubes that act like a kaleidoscope. “The lens attachment we designed contains a rectangular mirror tube that generates a grid of nine images. It works like nine cameras placed slightly apart from each other, taking the same picture from different angles at the same time. Of course, it is much more portable than nine cameras, which can also be quite difficult to sync perfectly,” added Dr Li.
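To give a concrete sense of what the processing software starts from, the Python sketch below splits one captured frame into a 3 x 3 grid of sub-views. It assumes the nine views are laid out as a uniform grid on the sensor; in practice the reflections from the mirror tube would also need to be un-mirrored and geometrically corrected, steps the team's software performs and this sketch omits.

```python
import numpy as np

def split_into_subviews(frame, rows=3, cols=3):
    """Split a captured frame into a rows x cols grid of sub-views.
    Assumes the mirror tube lays the nine views out as a uniform grid;
    mirror-flipping and geometric correction of each tile are omitted."""
    h, w = frame.shape[:2]
    th, tw = h // rows, w // cols
    return [
        frame[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(rows)
        for c in range(cols)
    ]

# Example with a synthetic full-HD frame (1080 x 1920, 3 colour channels).
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
views = split_into_subviews(frame)
print(len(views), views[0].shape)   # 9 tiles, each 360 x 640 pixels
```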
The research team has developed two lens attachments – one for DSLR cameras and one for smartphones – essentially turning any camera into a light-field camera. The nine images or videos captured on the sensor are analysed, processed and corrected by the software. A depth map is then constructed and the spatial relationships among objects are worked out to re-enact the scene. 3D images or videos can be output in side-by-side two-view format for VR goggles, or in single-view format for 3D TVs. Dr Li said, “The native resolution of the captured images is full HD at 1920 x 1080 pixels. But the software is capable of upscaling the output to 4K (3840 x 2160 pixels).”
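As a rough illustration of the output stage only (the team's own rendering software is not public), the Python sketch below packs two rendered views into the common side-by-side format and scales the result to a 4K frame with OpenCV. The half-width-per-eye layout and the choice of cubic interpolation are assumptions for illustration, not details from the article.

```python
import cv2
import numpy as np

def side_by_side_4k(left_view, right_view):
    """Pack a left/right view pair into a side-by-side 4K frame
    (3840 x 2160), the layout many VR players and 3D TVs unpack:
    each eye is squeezed into half the frame width."""
    half = (3840 // 2, 2160)  # cv2.resize expects (width, height)
    left = cv2.resize(left_view, half, interpolation=cv2.INTER_CUBIC)
    right = cv2.resize(right_view, half, interpolation=cv2.INTER_CUBIC)
    return np.hstack([left, right])

# Synthetic full-HD views (1920 x 1080 each), just to show the shapes.
left = np.zeros((1080, 1920, 3), dtype=np.uint8)
right = np.zeros((1080, 1920, 3), dtype=np.uint8)
frame_4k = side_by_side_4k(left, right)
print(frame_4k.shape)   # (2160, 3840, 3)
```

In this sketch the two views stand in for any pair of the nine rendered perspectives; the upscaling step mirrors the full-HD-to-4K output Dr Li mentions.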
In April 2018, the Intelligent 3D Stereoscopic Imaging System on Plenoptic Camera won a Gold Medal with the Congratulations of the Jury at the 46th International Exhibition of Inventions of Geneva, Switzerland.