We present a segmentation algorithm to detect low-level structure present in images. The algorithm is designed to partition a given image into regions, corresponding to image structures, regardless of their shapes, sizes, and levels of interior homogeneity. We model a region as a connected set of pixels that is surrounded by ramp edge discontinuities where the magnitude of these discontinuities is large compared to the variation inside the region.
Category: Low-Level Vision
Transform Domain Methods for Single Image Super-Resolution
Super-resolution of a single image is a highly ill-posed problem, since the number of high resolution pixels to be estimated far exceeds the number of low resolution pixels available. Therefore, appropriate regularization or priors play an important role in the quality of results. In this line of work, we propose a family of methods for learning transform domain priors for the single-image super-resolution problem.
Simultaneous Noise Removal and Super-Resolution of Natural Images
Our goal is to obtain a noise-free, high resolution (HR) image, from an observed, noisy, low resolution (LR) image. The conventional approach of preprocessing the image with a denoising algorithm, followed by applying a super-resolution (SR) algorithm, has an important limitation: Along with noise, some high frequency content of the image (particularly textural detail) is invariably lost during the denoising step.
Compressive Sampling
Compressive sampling (CS) aims to acquire a signal or image from data that is deemed insufficient by the Nyquist/Shannon sampling theorem. Its main idea is to recover a signal from limited measurements by exploiting the prior knowledge that the signal is sparse or compressible in some domain. In this paper, we propose a CS approach using a new total-variation measure TVL1, or equivalently TVL1, which enforces sparsity and directional continuity in the gradient domain.
Fusion of Median and Bilateral Filtering for Range Image Upsampling
We present a new upsampling method to enhance the spatial resolution of depth images. Given a low-resolution depth image from an active depth sensor and a potentially high-resolution color image from a passive RGB camera, we formulate upsampling as an adaptive cost aggregation problem and solve it using the bilateral filter.
Structure Based Optical Flow
Classical optical flow objective functions consist of a data term that enforces brightness constancy and a spatial smoothing term that encourages smooth flow fields. Structural information from images has conventionally been used to design more robust regularizers that prevent oversmoothing of motion discontinuities. In this line of work, we are looking at exploiting image structure in a more detailed manner, as compared to conventionally used gradient filters.
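For reference, one standard textbook instance of such a classical objective is the Horn-Schunck energy. It is shown here only to make the two terms concrete; it is not the objective proposed in this work. The symbols are assumptions introduced for illustration: $I_x, I_y, I_t$ are the spatial and temporal image derivatives, $(u, v)$ is the flow field, and $\alpha$ weights the smoothing term.

```latex
E(u, v) = \iint \Big[ \underbrace{(I_x u + I_y v + I_t)^2}_{\text{data term (brightness constancy)}}
        + \alpha^2 \underbrace{\big( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \big)}_{\text{spatial smoothing term}} \Big] \, dx \, dy
```

Because the quadratic smoothing term penalizes all flow gradients equally, it tends to blur motion boundaries, which is what structure-aware regularizers are meant to avoid.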
Learning Human Preferences For Image Sharpening
We propose an image sharpening method that automatically optimizes the perceived sharpness of an image. Image sharpness is defined in terms of the one-dimensional contrast across region boundaries. Regions are automatically extracted at all natural scales present in the image, and the scales themselves are identified automatically. Human judgments are collected and used to learn a function that determines the best sharpening parameter values at an image location as a function of certain local image properties.
Shadow Removal Using Bilateral Filtering
In this paper, we propose a simple but effective shadow removal method using a single input image. We first derive a 2-D intrinsic image from a single RGB camera image based solely on colors, particularly chromaticity. We next present a method to recover a 3-D intrinsic image based on bilateral filtering and the 2-D intrinsic image.
Real-time Specular Highlight Removal Using Bilateral Filtering
In this paper, we propose a simple but effective specular highlight removal method using a single input image. Our method is based on a key observation: the maximum fraction of the diffuse color component (the so-called maximum diffuse chromaticity in the literature) in local patches of color images changes smoothly.
SVM for Edge-Preserving Filtering
In this paper, we propose a new method to construct an edge-preserving filter which has very similar response to the bilateral filter. The bilateral filter is a normalized convolution in which the weighting for each pixel is determined by the spatial distance from the center pixel and its relative difference in intensity range.
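The bilateral filter definition above can be sketched in a few lines. This is a minimal brute-force 1D illustration, not the SVM-based construction proposed here; the Gaussian form of both kernels and the names `bilateral_filter_1d`, `radius`, `sigma_s`, and `sigma_r` are assumptions chosen for the example.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Brute-force bilateral filter on a 1D list of floats.

    Assumes Gaussian spatial and range kernels (a common choice, not
    something stated in the text above).
    """
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: decays with distance from the center sample.
            w_s = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            # Range weight: decays with intensity difference from the center.
            w_r = math.exp(-((signal[i] - signal[j]) ** 2) / (2.0 * sigma_r ** 2))
            num += w_s * w_r * signal[j]
            den += w_s * w_r
        # Normalization; den >= 1 because the center sample has weight 1.
        out.append(num / den)
    return out

# A noisy step edge: small fluctuations on each side of a large jump.
step = [0.0, 0.02, -0.01, 0.01, 1.0, 0.99, 1.02, 1.0]
smoothed = bilateral_filter_1d(step)
```

Samples on opposite sides of the jump receive a near-zero range weight, so each side is averaged only with itself and the edge is preserved.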
Real-time O(1) Bilateral Filtering
We propose a new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature. By showing that a bilateral filter can be decomposed into a number of constant time spatial filters, our method yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels.
Segmentation of periodically moving objects
We present a new approach for the identification and segmentation of objects undergoing periodic motion. Our method uses a combination of maximum likelihood estimation of the period, and segments moving objects using correlation of image segments over an estimated period of interest. Correlation provides the best locations of the moving objects in each frame.
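To make the idea of estimating a period concrete, here is a minimal sketch that picks the lag maximizing the autocorrelation of a 1D signal (for example, the mean intensity of an image segment over time). This is a simpler stand-in technique, not the maximum likelihood estimator used in this work; the function name `estimate_period` and its parameters are hypothetical.

```python
def estimate_period(signal, min_lag=1, max_lag=None):
    """Return the lag (in samples) with the highest autocorrelation.

    Stand-in for a proper period estimator: ties are resolved in favor of
    the smallest lag, so multiples of the true period are not preferred.
    """
    n = len(signal)
    if max_lag is None:
        max_lag = n // 2
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Average product of the signal with a copy of itself shifted by `lag`.
        overlap = n - lag
        score = sum(centered[t] * centered[t + lag] for t in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# A toy signal that repeats every 4 samples.
toy_signal = [0, 1, 0, -1] * 6
period = estimate_period(toy_signal)
```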
Extraction and Analysis of Multiple Periodic Motions in Video Sequences
The analysis of periodic or repetitive motions is useful in many applications, both in the natural and the man-made world. An important example is the recognition of human and animal activities. Existing methods for the analysis of periodic motions first extract motion trajectories, e.g. via correlation, or feature point matching. We present a new approach, which takes advantage of both the frequency and spatial information of the video.
Image Ensembles/Video Analysis Using Image-As-Matrix Representation
Tensor Manipulation
We explore new algorithms for computer vision based on multilinear algebra. Firstly, we learn the expression subspace and person subspace from a corpus of images based on Higher-Order Singular Value Decomposition (HOSVD), and investigate their applications in facial expression synthesis, face recognition and facial expression recognition. Secondly, we explore new algorithms for image ensembles/video representation and recognition using tensor rank-one decomposition and tensor rank-R approximation.
Video Encoding using Coset Codes
Video Compression using Wyner-Ziv Codes
Predictive coding is posed as a variant of Wyner-Ziv coding, and problems in source and channel coding of video are addressed in this framework.
Video Encoding using Coset Codes
This project deals with scalable coding and robust Internet streaming of predictively encoded media. We frame the problem of predictive coding as a variant of the Wyner-Ziv problem in information theory.
Compression of Image-based Rendering Data
This project concerns the design of compression techniques for streaming image-based rendering (IBR) data to remote viewers. A compression algorithm based on Wyner-Ziv codes is proposed, which satisfies the key constraints for IBR streaming, namely random access for interactivity and pre-compression.
Facial Expression Decomposition
Tensor Manipulation
We explore new algorithms for computer vision based on multilinear algebra. Firstly, we learn the expression subspace and person subspace from a corpus of images based on Higher-Order Singular Value Decomposition (HOSVD), and investigate their applications in facial expression synthesis, face recognition and facial expression recognition. Secondly, we explore new algorithms for image ensembles/video representation and recognition using tensor rank-one decomposition and tensor rank-R approximation.
Predictive Multiple Description Coding using Wyner-Ziv Codes
Two-channel predictive multiple description coding is posed as a variant of the Wyner-Ziv coding problem.
Human Computer Interaction
The second type of human-computer interface developed is a free-hand-sketch based interface for image editing (e.g., moving, size-scaling, or color-transforming parts of an image). The sketches drawn by the user on top of the image serve as a natural way of specifying an image part and the editing operation (e.g., move, delete) to be performed.
Block-based motion estimation for missing video frame interpolation, and spatially scalable (multi-resolution) video coding
Video frames are often dropped during compression at very low bit rates. At the decoder, a missing frame interpolation method synthesizes the missing frames. We propose a two-step motion estimation method for the interpolation. More specifically, the coarse motion vector field is refined at the decoder using mesh-based motion estimation instead of computationally intensive dense motion estimation.
Transform Domain Magnification or Super-resolution
In order to apply a multidimensional linear transform over an arbitrarily shaped support, the usual practice is to fill out the support to a hypercube by zero padding. This does not, however, yield a satisfactory definition for transforms in two or more dimensions. The problem that we tackle is: how do we redefine the transform over an arbitrarily shaped region suited to a given application?
Linear Transforms over arbitrary supports
In order to apply a multidimensional linear transform over an arbitrarily shaped support, the usual practice is to fill out the support to a hypercube by zero padding. The problem that we tackle is: how do we redefine the transform over an arbitrarily shaped region suited to a given application? We present a novel iterative approach to define any multidimensional linear transform over an arbitrary shape, given that we know its definition over a hypercube.
Transform-Domain Watermarking
A new method for digital image watermarking which does not require the original image for watermark detection is presented. Assuming that we are using a transform domain spread spectrum watermarking scheme, it is important to add the watermark in select coefficients with significant image energy in the transform domain in order to ensure non-erasability of the watermark.
Structure Based Image Denoising
Multiscale structure based image representation using a set of regions
The application fields are (i) image compression, using regions of appropriate granularity, (ii) image magnification or super-resolution, using regions of appropriately rescaled size, and (iii) image quality restoration through structure-preserving denoising, using region smoothing.
Structure Based Image Denoising
This work addresses the problem of denoising images corrupted by additive white Gaussian noise (AWGN).
Efficient spatio-temporal filtering for video denoising
Video Denoising
This work proposes a computationally fast scheme for denoising a video sequence. Temporal processing is done separately from spatial processing and the two are then combined to get the denoised frame. The temporal redundancy is exploited using a scalar state 1D Kalman filter. A novel way is proposed to estimate the variance of the state noise from the noisy frames.
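The temporal step can be illustrated with a scalar-state Kalman filter applied to one pixel's intensity over time. This is a minimal sketch under assumed conditions, not the scheme proposed here: it uses a random-walk state model, takes the state noise variance `q` and observation noise variance `r` as given inputs rather than estimating `q` from the noisy frames, and the name `kalman_denoise_pixel` is hypothetical.

```python
def kalman_denoise_pixel(observations, q, r):
    """Filter one pixel's intensity over time with a scalar-state Kalman filter.

    Assumed model (not taken from the text above):
        x[t] = x[t-1] + w,  Var(w) = q   (state noise)
        z[t] = x[t]   + v,  Var(v) = r   (observation noise)
    """
    x = observations[0]  # initial estimate: the first noisy sample
    p = r                # its error variance equals the observation noise
    estimates = [x]
    for z in observations[1:]:
        p = p + q                # predict: uncertainty grows by the state noise
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # correct the estimate with the new observation
        p = (1.0 - k) * p        # posterior error variance
        estimates.append(x)
    return estimates

# A static pixel (q = 0): the filter reduces to a running mean.
static_estimates = kalman_denoise_pixel([10.0, 12.0, 8.0, 10.0], q=0.0, r=4.0)
```

A small `q` favors temporal averaging (strong denoising on static content), while a large `q` makes the estimate follow each new frame, which is why the choice of state noise variance matters.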
Multiscale structure based video compression, by estimating and coding region motion instead of pixel motion
Segmentation Based Video Coding
We develop a very low bit rate video compression algorithm using multiscale image segmentation based hierarchical motion compensation and residual coding. The proposed algorithm outperforms the H.261-like coder by 3 dB and the H.263 version 2 by 1 dB. Such gains come from the use of image segmentation and reversed motion prediction.
Detection of photometric distribution discontinuities in video to locate shot changes
Video Shot Detection
We present a novel improvement to existing schemes for abrupt shot change detection. Existing schemes declare a shot change whenever the frame to frame histogram difference (FFD) value is above a particular threshold. In such an approach, a high value for the threshold results in a small number of false alarms and a large number of missed detections while a low value for the threshold decreases the number of missed detections at the expense of increasing the false alarms.
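The baseline scheme described above, thresholding the frame-to-frame histogram difference, can be sketched as follows. This illustrates only the existing approach, not the proposed improvement. The sum of absolute bin differences is an assumed choice of FFD measure, frames are assumed to be flat lists of 8-bit gray values, and all function names are hypothetical.

```python
def histogram(frame, bins=4, max_value=256):
    """Gray-level histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return counts

def frame_to_frame_difference(frame_a, frame_b, bins=4):
    """FFD as the sum of absolute histogram bin differences (assumed measure)."""
    h_a = histogram(frame_a, bins)
    h_b = histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(h_a, h_b))

def detect_shot_changes(frames, threshold, bins=4):
    """Return indices i where a cut is declared between frames[i-1] and frames[i]."""
    return [
        i for i in range(1, len(frames))
        if frame_to_frame_difference(frames[i - 1], frames[i], bins) > threshold
    ]

# Two dark frames followed by two bright frames: one abrupt change at index 2.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, bright, bright]
cuts_low_threshold = detect_shot_changes(frames, threshold=4)
cuts_high_threshold = detect_shot_changes(frames, threshold=10)
```

With the lower threshold the cut is found; with the higher one it is missed, which is the trade-off between false alarms and missed detections that a single global threshold imposes.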
Structure Based Image Magnification or Super-resolution
Resolution enhancement involves the problem of magnifying a small image to several times its size while avoiding blurring, ringing and other artifacts.
Structure Based Image Compression
Our novel reversible image compression method employs multiscale segmentation within a computationally efficient optimization framework to obtain consistently good performance over a wide variety of images.