The Plenoptic Paradigm
Posted on Wed, Oct 10, 2012 @ 01:01 PM
At MIT in the early '90s, my Media Lab colleague Ted Adelson introduced and developed the concept of the plenoptic function, the function that describes everything that can be seen (from plenus, complete or full, and optic). The plenoptic function embodies a way of thinking about light not only as a series of images formed from 3D space onto 2D planes (whether retinas or camera sensors), but also as a three-dimensional field of co-existing rays, with an extension into the fourth dimension of time as well. This formalizes the concept of the light-field proposed by Andrey Gershun in the 1930s, with roots in ideas introduced by Faraday in the mid-19th century. There is plenty to be found on the web concerning the genesis and development of the light field model and the plenoptic function itself.
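For readers who want the notation, the plenoptic function is usually written, following Adelson and Bergen's formulation, as a function of seven variables. The symbols below come from that common formulation, not from anything specific to this post, so treat them as a conventional sketch:

```latex
P = P(\theta, \phi, \lambda, t, V_x, V_y, V_z)
```

Here $(V_x, V_y, V_z)$ is the viewing position in space, $(\theta, \phi)$ is the direction of the incoming ray, $\lambda$ is wavelength, and $t$ is time. A conventional photograph samples this function at one position and one instant, over a limited range of directions.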

Fast-forward to the present, when ubiquitous miniature cameras and real-time image processing make it nearly effortless to collect high-quality still and motion 2D imagery, and when the first plenoptic cameras are coming to market, capturing the world as more than a collapsed transformation onto a 2D sensor. Other visual information capture devices, such as those based on projecting patterned light (e.g., Microsoft Kinect) and time-of-flight laser scanning technologies, are also proliferating, providing the ability to capture higher-density 3D surface samples of our world. Finally, software applications can determine the position from which a photograph was taken and then triangulate that location against myriad other photos of the same subject matter to produce a “photosynth” or even a 3D point-cloud “geosynth” model. These applications intrinsically link the recording device to a precise position in space, allowing the collected rays of light to be recombined to form new images.

Crowd-sourced multi-perspectives: synchronized collections of images (ray intensities and directions, with GPS link) from ubiquitous sources can be assembled into point-cloud or polygon-based models for light-field display.
What are the implications for the future of information display when a critical mass of ray directions and their relative intensities (and how these change over time) is captured and available for broadcast and portrayal? And what if, in that process, a spatially co-registered set of rays were introduced representing the light-field of a synthesized entity, like a 3D computer graphic character or scene element? The result is a light-field database that can be presented to the viewer independently of both capture position and viewing position. If a display were able to visually re-create the light-field in physically accessible space, the viewer could change position, change focus, reach in and interact, and perceive the scene as a realistic and tangible representation. Two-dimensional planes containing information (such as the one you’re reading this on right now) are just a small subset of the image types that could be displayed. This plenoptic paradigm charts the trajectory of information portrayal and interaction, passive and active story-telling and gaming, and fundamentally natural human-machine interfaces. It enables accurate scene portrayal in fully volumetric form, independent of viewer position, emission source orientation, or even emission source topography (think wavy displays producing coherent images despite their local deflections).
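To make the idea of a light-field database a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any real capture or display system: the `Ray` record, the `rays_seen_from` query, and the `tolerance` parameter are all hypothetical names, and the sketch assumes free space with no occluders, so a ray's intensity is the same anywhere along its line.

```python
import math
from dataclasses import dataclass


@dataclass
class Ray:
    """One captured sample (hypothetical record layout)."""
    point: tuple      # (x, y, z) position the ray passes through, e.g. the camera location
    direction: tuple  # unit vector along the ray
    intensity: float  # relative intensity of the ray


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def distance_to_ray_line(viewpoint, ray):
    """Shortest distance from a viewpoint to the infinite line carrying the ray."""
    v = _sub(viewpoint, ray.point)
    t = _dot(v, ray.direction)  # projection of v onto the (unit) direction
    closest = tuple(p + t * d for p, d in zip(ray.point, ray.direction))
    diff = _sub(viewpoint, closest)
    return math.sqrt(_dot(diff, diff))


def rays_seen_from(viewpoint, rays, tolerance=0.05):
    """Return the stored rays that pass within `tolerance` of the viewpoint.

    Assumption: free space, so any viewer on a ray's line sees that ray,
    regardless of where along the line it was originally captured.
    """
    return [r for r in rays if distance_to_ray_line(viewpoint, r) <= tolerance]


# Two captured rays: one travelling along the x-axis, one parallel to the z-axis.
database = [
    Ray(point=(0.0, 0.0, 0.0), direction=(1.0, 0.0, 0.0), intensity=0.8),
    Ray(point=(0.0, 1.0, 0.0), direction=(0.0, 0.0, 1.0), intensity=0.3),
]

# A viewer far from the original capture position, but almost on the first ray's line.
visible = rays_seen_from((5.0, 0.0, 0.01), database)
```

The point of the sketch is the independence described above: the query depends only on where the viewer is, not on where the camera was. A real system would also need to handle occlusion, interpolation between sparse samples, and time.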

Plenoptic capture of a sporting event, processing to coherently combine captured rays and their intensities over time, and broadcast to a light-field display that mimics the stadium experience in reduced scale.
The Plenoptic Paradigm has other implications as well, which I will explore here in future posts. Generally speaking, however, a dense network of prolific and ubiquitous “light-ray intensity capture devices” (mostly GPS-located cameras) surrounding nearly any event requires a processing and broadcast infrastructure that can handle high volumes of visual information. The transmitted information may be linked to metadata, perhaps to preserve a record of the collector, so that when the light-field is recomposed on a volumetric display, contributors can see their component. What repercussions would this have in the social media milieu? How does this change story-telling, and story-receiving? Does this paradigm provide a truly tangible and appealing manifestation of crowd-sourcing? The Plenoptic Paradigm describes a rich new frontier of visually capturable and presentable information. Networked sensor systems, advances in the capability and speed of 3D computer graphics, and new display technologies (such as those embodied in the ZScape® Motion Display) together make it possible, and they are transforming how we as humans experience and influence our world.