In order to apply a multidimensional linear transform over an arbitrarily shaped support, the usual practice is to fill out the support to a hypercube by zero padding. The problem that we tackle is: how do we redefine the transform over an arbitrary shaped region suited to a given application? We present a novel iterative approach to define any multidimensional linear transform over an arbitrary shape given that we know its definition over a hypercube. The proposed solution is extensible to all possible shapes of supports and adaptable to the needs of a particular application. Applications with this method are segmentation based image compression, shape based video encoding and region merging.
Discrete linear transforms in two (or more) dimensions are in most cases defined over a rectangular (hypercubic) support. The usual practice when we want to apply the transform over an arbitrarily shaped support is to fill out the rest of the support with zeros to make up the rectangle (hypercube) and then use the natural definition of the transform over a rectangle (hypercube). This is an extension of the 1-D case where we fill out an arbitrary-length data set with zeros to form a data set of length 2neither to increase the computational speed (through FFTs for Fourier transforms) or to satisfy the definition of the transform (in the case of dyadic wavelets). This, however, does not lead to a satisfactory definition of the linear transform in two or more dimensions for many applications. An example can be used to illustrate this point. The Fourier transform of a function that is constant on a circular support in 2-D is a Jinc. As can be seen from the figure, the magnitude of the Fourier coefficients does not have any relation to the smoothness of the function, which is a constant within its support.
The above discussion leads us to the following question: What should be the values attributed to the sample points which lie within the rectangle but not within the support of the function? The answer is evidently not unique and depends upon the application. With each possible choice of the values for the pixels which lie outside the support but within the rectangular (hypercubic) region, we can associate a possible function-transform pair. The aim of the proposed scheme is to algorithmically constrain the choice of the possible function-transform pairs in such a way as to lead to the optimal choice of the function-transform pair for the particular application under consideration.
The applications that we consider assume that we have a smooth 2-D function defined on an arbitrarily shaped, connected support; we would like to define the free pixels (the pixels within the rectangular region but outside the support) so as to minimize the high frequency content in the Fourier domain. The problem is formulated in terms of a Projection onto Convex Sets formalism and incorporates a few more constraints such as the bounded variation of the free pixels . The variation in the free pixels is controlled by a parameter Z.
Results for the Fourier transform as Z is varied are presented . The example function is a smooth region taken out of a natural image (Lena). As can be seen, an increase of Z allows more variation in the free pixels in the spatial domain, thus destroying the ad hoc shape information (the pixels within the support are unchanged in all images). Similar results maybe obtained if we replace the Fourier transform with either DCT or wavelets.
Related Publications:
- K. Ratakonda and N. Ahuja, Discrete Multi-Dimensional Linear Transforms over Connected Arbitrarily Shaped Supports, IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 4, Munich, Germany, April 1997, 3041-3044. Processing, Geneva, Switzerland, September 1996, 81-84.
- R. Dugad and N. Ahuja, A Fast Scheme for Downsampling and Unsampling in the DCT Domain, International Conference on Image Processing, Kobe, Japan, Oct. 1999, II-909-913.
- R. Dugad and N. Ahuja, A Fast Scheme for Altering Resolution in the Compressed Domain, EEE Conference on Computer Vision and Pattern Recognition, Ft. Collins, CO, June 1999, I-213-218.
- R. Dugad and N. Ahuja, A Fast Scheme for Image Size Change in the Compressed Domain, IEEE Trans. on Circuits and Systems for Video Technology, Vol. 11, No. 4, April 2001, 461-474.