A High-Resolution Monocular Panoramic Camera Using a Double Mirror Pyramid



High-resolution panoramic capture is highly desirable in many applications, such as immersive virtual environments, teleconferencing, surveillance, and robot navigation. In addition, some imaging applications (e.g., 3D reconstruction and rendering) call for a single viewpoint for all viewing directions, a large depth-of-field (omni-focus), and real-time acquisition. The field of view (FOV) of a conventional camera is limited by the size of its sensor and the focal length of its lens. For example, a typical 16mm lens with a 2/3" CCD sensor has a  FOV. The number of pixels on the sensor ( for an NTSC camera) determines the resolution. The depth-of-field is also limited; it is determined by imaging parameters such as the aperture, the focal length, and the location of the object in the scene.
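The dependence of FOV on sensor size and focal length can be illustrated with a minimal sketch based on the ideal pinhole relation FOV = 2·atan(d / 2f). The 8.8 mm × 6.6 mm active area used here is the nominal size commonly quoted for the 2/3" format and is an assumption, not a figure taken from this text; the function name is illustrative only.

```python
import math

def field_of_view_deg(sensor_dim_mm, focal_length_mm):
    """Full angular FOV (degrees) along one sensor dimension, ideal pinhole model."""
    return math.degrees(2.0 * math.atan(sensor_dim_mm / (2.0 * focal_length_mm)))

# Assumption: nominal active area of a 2/3" format sensor is 8.8 mm x 6.6 mm.
SENSOR_W_MM, SENSOR_H_MM = 8.8, 6.6
FOCAL_MM = 16.0  # the 16mm lens mentioned in the text

h_fov = field_of_view_deg(SENSOR_W_MM, FOCAL_MM)
v_fov = field_of_view_deg(SENSOR_H_MM, FOCAL_MM)
print(f"horizontal FOV: {h_fov:.1f} deg, vertical FOV: {v_fov:.1f} deg")
```

Under these assumptions the result is roughly 31 by 23 degrees, which shows how narrow a conventional camera's view is compared with a 360-degree panorama. Lens distortion and the exact active area of a particular sensor would change the numbers slightly.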

Many approaches have been presented to achieve various subsets of these properties: wide FOV, high resolution, large depth-of-field, a single viewpoint, and real-time acquisition. Among these, mirror-pyramid (MP)-based camera systems offer a promising approach to capturing high-resolution, wide-FOV panoramas as they provide single-viewpoint images at video rate. Such systems use planar mirrors assembled in pyramid or prism shapes, and as many cameras as the number of mirror faces, each located and oriented to capture the part of the scene reflected off one of the flat mirror faces. Images from the individual cameras are concatenated to yield a 360-degree wide panoramic image. Compared to designs using parabolic or hyperbolic mirrors, flat mirrors are easier to design and produce, and they introduce minimal optical aberrations.

We have developed a double-mirror-pyramid design that doubles the visual field of single-pyramid-based systems. Using this prototype, we have developed methods for optimally choosing the parameters of MP-based camera systems (e.g., camera placement and pyramid geometry) with respect to sensor usage and uniformity of image resolution, and for evaluating the resultant image quality.


2. Overview of panoramic imaging

The existing methods of capturing panoramas fall into one of two categories: dioptric methods, where only refractive elements (lenses) are employed, and catadioptric methods, where a combination of reflective and refractive components is used. Typical dioptric systems include: the camera cluster method, where multiple cameras point in different directions to cover a wide FOV; the fisheye method, where a single camera acquires a wide-FOV image through a fisheye lens; and the rotating camera method, where a conventional camera pans to generate mosaics, or a camera with a non-frontal, tilted sensor pans around its viewpoint to acquire panoramic omni-focused images. The catadioptric methods include sensors in which a single camera captures the scene as reflected off a single curved mirror, and sensors in which multiple cameras image the scene as reflected off planar mirror surfaces.

Dioptric camera clusters are capable of capturing high-resolution panoramas at video rate. However, the cameras in these clusters typically do not share a unique viewpoint due to physical constraints, which makes it difficult or even impossible to mosaic the individual images into a true panoramic view, although apparent continuity across images may be achieved by ad hoc image blending. Sensors with a fisheye lens are able to deliver large-FOV images at video rate, but suffer from low resolution, irreversible distortion for close-by objects, and non-unique viewpoints for different portions of the FOV. Rotating cameras deliver a high-resolution, wide FOV via panning, as well as omni-focus when used in conjunction with non-frontal imaging, but they have a limited vertical FOV. Furthermore, because they capture different parts of the FOV sequentially, moving objects may be imaged incorrectly.

Catadioptric sensors that use a parabolic or hyperbolic mirror to map an omni-directional view onto a single sensor are able to achieve a single viewpoint at video rate, but the resolution of the acquired image is limited to that of the sensor used and varies significantly with the viewing direction across the visual field. Analogous to the dioptric case, this resolution problem can be partially alleviated by replacing the simultaneous imaging of the entire FOV with panning and sequential imaging of its parts, followed by mosaicing of the images, at the expense of video rate. Another category of catadioptric sensors employs a number of planar mirrors assembled in the shape of a right mirror-pyramid, together with as many cameras as there are pyramid faces. Each of these cameras captures the part of the scene reflected off one of the faces and is located and oriented such that the mirror images of the camera viewpoints are co-located at a single point inside the pyramid. Effectively, this creates a virtual camera that captures wide-FOV, high-resolution panoramas at video rate.


Proposed Double-Mirror-Pyramid Camera

The main challenge in constructing a panoramic camera from multiple sensors is to co-locate the entrance pupils of the cameras so that adjacent cameras cover contiguous FOVs without obstructing their own view or that of other cameras. Nalwa first used a right mirror pyramid (MP) formed from planar mirrors for this purpose. He reported an implementation using a 4-sided right pyramid and 4 cameras. The pyramid stands on its horizontal base, and each triangular face forms a 45-degree angle with the base. The cameras are positioned in the horizontal plane that contains the pyramid’s vertex such that the entrance pupil of each camera is equidistant from the vertex and the mirror images of the entrance pupils coincide at a common point, C, on the axis of the pyramid. The cameras are pointed vertically downward at the pyramid faces such that their virtual optical axes are all contained in a plane parallel to the pyramid base, effectively viewing the world horizontally outward from the common virtual viewpoint.
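The co-location of the mirrored entrance pupils can be checked numerically with a small sketch. It assumes an idealized 4-sided right pyramid with its vertex at the origin, its axis along z, and 45-degree faces, and it treats each entrance pupil as a point; the distance `D` and the helper names are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(v, n):
    """Reflect vector v across the plane through the origin with unit normal n."""
    s = 2.0 * dot(v, n)
    return tuple(vi - s * ni for vi, ni in zip(v, n))

def face_normal(k, n_faces=4, slope_deg=45.0):
    """Unit outward normal of face k; vertex at the origin, axis along z, base below."""
    phi = 2.0 * math.pi * k / n_faces
    a = math.radians(slope_deg)
    return (math.sin(a) * math.cos(phi), math.sin(a) * math.sin(phi), math.cos(a))

D = 1.0  # hypothetical distance of each entrance pupil from the vertex
virtual_viewpoints, virtual_axes = [], []
for k in range(4):
    phi = 2.0 * math.pi * k / 4
    # Entrance pupil in the horizontal plane z = 0 that contains the vertex.
    pupil = (D * math.cos(phi), D * math.sin(phi), 0.0)
    n = face_normal(k)
    virtual_viewpoints.append(reflect(pupil, n))
    # The physical camera looks straight down; reflect its optical axis too.
    virtual_axes.append(reflect((0.0, 0.0, -1.0), n))

for c, a in zip(virtual_viewpoints, virtual_axes):
    print([round(x, 6) for x in c], [round(x, 6) for x in a])
```

Under these assumptions every mirrored pupil lands at the same point (0, 0, -D) on the pyramid axis, and every mirrored optical axis is horizontal and points outward through its own face, which is the single-viewpoint property described above.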

In each of the aforementioned cases, the vertical dimension of the panoramic FOV is the same as that of the individual cameras; only their horizontal FOVs are concatenated to obtain a wider, panoramic view. We have developed a panoramic design that uses a double mirror-pyramid (DMP), formed by joining two mirror-pyramids such that their bases coincide (Fig. 2), together with two layers of camera clusters. Such a DMP-based design doubles the vertical FOV while preserving the ability to acquire high-resolution panoramic images from an apparent single viewpoint at video rate.

In an MP-based panoramic system, it is critical to optimize the geometry of the pyramid or prism, the placement of the common viewpoint and the mirror surfaces, and the selection of camera parameters in order to maximize the overall FOV, sensor usage, and image uniformity. We have analyzed both the geometrical and the optical constraints in a DMP-based system and established relationships between the design parameters and the resultant image quality. This analysis can be generalized and applied to other MP-based designs.



Figure 2: A DMP-based panoramic camera: The physical positions of each layer of N cameras form an N-sided regular polygon where the apparent common viewpoint is centered at O.

As illustrated in Fig. 2-a, a DMP is formed by stacking two N-sided mirror-pyramids back to back such that their bases coincide. Without loss of generality, we assume the use of a right pyramid in which the surfaces are symmetric about the pyramid axis. A right DMP is characterized by the number of mirror faces in a single pyramid, the slope angle, and its base and cap radii. The slope angle of a pyramid is the angle formed by a mirror surface with the pyramid base. We assume that the two pyramids have the same slope angle. The base and cap radii of a pyramid are the radii of the inscribed circles of the base and cap polygons, respectively. We consider a unit DMP, which has a unit base radius; the cap radii of the pyramids A and D can then also be interpreted as the ratios of the actual cap radii to the base radius.
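To make the parameterization concrete, the following sketch derives a few basic dimensions of one truncated right pyramid from the number of faces, the slope angle, and the inscribed-circle radii of base and cap. It uses only elementary geometry; the function name, the dictionary keys, and the example cap ratio of 0.5 are hypothetical and are not values from the design described here.

```python
import math

def frustum_geometry(n_faces, slope_deg, base_radius, cap_radius):
    """Basic dimensions of an N-sided truncated right pyramid.

    base_radius and cap_radius are inscribed-circle radii of the base and
    cap polygons; slope_deg is the angle between a mirror face and the base.
    """
    half_angle = math.pi / n_faces
    alpha = math.radians(slope_deg)
    run = base_radius - cap_radius  # horizontal distance from base edge to cap edge
    return {
        "height": run * math.tan(alpha),        # base-to-cap distance along the axis
        "slant_length": run / math.cos(alpha),  # mirror length measured up the face
        "base_edge": 2.0 * base_radius * math.tan(half_angle),
        "cap_edge": 2.0 * cap_radius * math.tan(half_angle),
        "azimuth_per_face_deg": 360.0 / n_faces,
    }

# Unit pyramid (base radius 1) with a hypothetical cap ratio of 0.5.
geom = frustum_geometry(n_faces=6, slope_deg=40.0, base_radius=1.0, cap_radius=0.5)
for name, value in geom.items():
    print(f"{name}: {value:.4f}")
```

Because the base radius is fixed to one, the remaining quantities scale linearly with the physical base radius, which is why working with a unit DMP loses no generality.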



6. Prototype

Using the constraints and relationships derived in Sections 4 and 5, we designed a DMP panoramic camera with two right hexagonal (N = 6) truncated pyramids (Fig. 3). The pyramid has a base radius of  and a slope angle of 40 degrees. The ratio of camera size to pyramid size is about 0.2. The optimal cap radius of the pyramid is 56.3mm, and the shape factor of the pyramid is 0.43. This DMP design yields a total  non-occluded FOV. Each of the mirror surfaces effectively covers a FOV of 60 degrees horizontally and 41.2 degrees vertically.

The cameras are tilted at 20.58 degrees relative to the pyramid base, yielding optimal sensor utilization. The field angles corresponding to corners  are ,  , and , respectively. The aspect ratio of the reflective visual field is 1.64. Assuming a sensor with an aspect ratio of 4:3, the minimum FOV requirements for the cameras are  and  for the horizontal and vertical directions, respectively.

Fig. 3: A six-sided DMP camera implementation: (a) A CAD model of a system with a 6-sided DMP and 12 cameras; (b) The implementation of the six-sided DMP with four cameras installed.

The cameras selected are Pulnix cameras with 2/3",  black/white CCD sensors. Thus the maximum focal length is 7.13mm, which yields a sensor efficiency of 72% and an image non-uniformity of 27%. A 6.5mm lens is selected. This sensor-lens combination effectively provides a FOV of  and  for the horizontal and vertical directions, respectively. The pyramid-camera combination yields a panoramic system with a total of 2.176 million pixels, a sensor efficiency of 60%, and an image non-uniformity of 27%.



Figure 4. Sample images obtained by the DMP camera. (a)-(d) Original images acquired by the four cameras; (e) The mosaic of vertically adjacent camera images (a) and (c), after post-processing of keystone and radial distortions; (f) The mosaic of vertically adjacent camera images (b) and (d), after post-processing of keystone and radial distortions; (g) The cylindrical mosaic of the images (e)-(f).

Figure 4 shows results for a prototype containing only 4 physical cameras rather than the full complement of 12. Figures 4-a through 4-d show the images acquired by four adjacent cameras: two horizontally adjacent cameras in the upper layer and two horizontally adjacent cameras in the lower layer, which together also form two vertically adjacent pairs. After post-processing for keystone and radial distortions, the images from vertically adjacent cameras are concatenated to form a vertical mosaic covering double the vertical FOV of the individual cameras. The resulting vertical mosaics are shown in Figures 4-e and 4-f, respectively. The two vertical mosaics (there would be six in a full implementation) are projected onto a cylinder centered at the common viewpoint to form a seamless cylindrical mosaic. The final 4-camera mosaic of the two images 4-e and 4-f is shown in Figure 4-g.
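The projection onto a cylinder can be sketched as follows. This is the standard cylindrical warp for an ideal, distortion-free pinhole image whose optical axis is horizontal and passes through the cylinder axis; it is not necessarily the exact procedure used for Figure 4-g. The function names, the focal length in pixels, and the 60-degree yaw spacing between horizontally adjacent views are assumptions made for illustration.

```python
import math

def plane_to_cylinder(x, y, f):
    """Map an image-plane point to cylinder coordinates.

    x, y are pixel offsets from the principal point and f is the focal
    length in pixels. Returns (theta, h): azimuth in radians and height
    on a unit-radius cylinder centered at the viewpoint.
    """
    theta = math.atan2(x, f)
    h = y / math.hypot(x, f)
    return theta, h

def cylinder_pixel(x, y, f, yaw_rad):
    """Pixel position in the unrolled mosaic for a view rotated by yaw_rad."""
    theta, h = plane_to_cylinder(x, y, f)
    return f * (theta + yaw_rad), f * h

F_PIXELS = 500.0                                  # hypothetical focal length in pixels
edge = F_PIXELS * math.tan(math.radians(30.0))    # edge of a 60-degree wide view

# Right edge of view 0 and left edge of view 1 (yawed by 60 degrees).
u_right, _ = cylinder_pixel(edge, 0.0, F_PIXELS, 0.0)
u_left, _ = cylinder_pixel(-edge, 0.0, F_PIXELS, math.radians(60.0))
print(u_right, u_left)
```

Because all views share one viewpoint, the right edge of one 60-degree view and the left edge of its neighbor map to the same mosaic column, so the seam closes without parallax-dependent blending.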



