Selected Publications
(for a full list, click here)
1. PAMI 2010 Qi Zhao, Zhi Yang and Hai Tao, "Differential Earth Mover's Distance with Its Applications to Visual Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), doi: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2008.299, February 2010.
[pdf]
2. NIPS 2010 Zhi Yang, Qi Zhao, Edward Keefer and Wentai Liu, "Noise Characterization, Modeling, and Reduction for In Vivo Neural Recording," Advances in Neural Information Processing Systems (NIPS 22), 2010.
[pdf]
3. Neurocomputing 2009 Zhi Yang*, Qi Zhao* and Wentai Liu, "Neural Signal Classification Using a Simplified Feature Set with Energy Based Nonparametric Clustering," Neurocomputing, doi: 10.1016/j.neucom.2009.07.013, in press. *Equal authorship.
[pdf]
4. JNE 2009 Zhi Yang, Qi Zhao and Wentai Liu, "Improving Spike Separation Using Waveform Derivative," to appear in Journal of Neural Engineering (JNE), doi: 10.1088/1741-2560/6/4/046006, July 2009.
[pdf]
5. CVIU 2009 Qi Zhao and Hai Tao, "A Motion Observable Representation Using Color Correlogram and Its Applications to Visual Tracking," in Computer Vision and Image Understanding (CVIU), Volume 113, Issue 2, Pages 273-290, February 2009. [pdf]
6. NIPS 2009 Zhi Yang, Qi Zhao and Wentai Liu, "Spike Feature Extraction using Informative Samples," Advances in Neural Information Processing Systems (NIPS Poster Spotlight Presentation), Pages 1865-1872, 2009. (acceptance rate = 123/1022 = 12.0%) [pdf]
7. ICCV 2007 Qi Zhao, Shane Brennan and Hai Tao, "Differential EMD Tracking," in IEEE International Conference on Computer Vision (ICCV Oral Presentation), Rio de Janeiro, Brazil, October 2007. (acceptance rate = 47/1190 = 3.9 %) [pdf]
8. ICCV 2007 Feng Tang, Shane Brennan, Qi Zhao and Hai Tao, "Co-Tracking Using Semi-Supervised Support Vector Machines," in IEEE International Conference on Computer Vision (ICCV) , Rio de Janeiro, Brazil, October 2007. (acceptance rate = 280/1190 = 23.5 %) [pdf]
Recent Projects
Differential Earth Mover's Distance Matching. The Earth Mover's Distance (EMD) is a similarity measure that captures the perceptual difference between two distributions. Its computational complexity, however, prevents direct use in many applications. This work proposes a novel Differential EMD (DEMD) algorithm based on the sensitivity analysis of the simplex method, which offers a speedup of orders of magnitude over its brute-force counterparts. The DEMD algorithm is discussed and empirically verified in the visual tracking context. The EMD accommodates the deformations of an object's distribution across time instances well, and the differential algorithm makes the use of EMD in real-time tracking possible. To further reduce computation, signatures, i.e., variable-size descriptions of distributions, are employed as the object representation. The new algorithm models and estimates local background scenes as well as foreground objects to handle scale changes in a principled way.
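As a minimal sketch of what the EMD measures (not the DEMD algorithm itself), the example below assumes two 1-D histograms with equal total mass and a unit ground distance between adjacent bins. Under those assumptions the EMD reduces to the sum of absolute differences between the running sums of the two histograms. The function names `emd_1d` and `l1` are illustrative only.

```python
from itertools import accumulate

def emd_1d(p, q):
    """EMD between two 1-D histograms with equal total mass.

    Assumes a unit ground distance between adjacent bins, in which case
    the EMD equals the sum of absolute differences of the running sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def l1(p, q):
    """Bin-by-bin L1 distance, shown for contrast."""
    return sum(abs(a - b) for a, b in zip(p, q))

reference    = [0.0, 1.0, 0.0, 0.0, 0.0]
shifted_near = [0.0, 0.0, 1.0, 0.0, 0.0]  # mass moved by one bin
shifted_far  = [0.0, 0.0, 0.0, 0.0, 1.0]  # mass moved by three bins

print(emd_1d(reference, shifted_near), l1(reference, shifted_near))
print(emd_1d(reference, shifted_far), l1(reference, shifted_far))
```

The bin-by-bin distance is the same for both shifted histograms, while the EMD grows with how far the mass has moved. This is the sense in which the EMD tolerates small deformations of a distribution.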
Evolving Mean Shift Clustering. This work presents a novel nonparametric clustering algorithm called the evolving mean shift (EMS) algorithm. The algorithm iteratively shrinks a dataset and generates well-formed clusters in just a couple of iterations. An energy function is defined to characterize the compactness of a dataset, and we prove that the energy converges to zero at an exponential rate. The single but critical user parameter of the mean shift clustering family, i.e., the bandwidth (also referred to as scale), is adaptively updated to accommodate the evolving data density and alleviate the conflict between global and local features. The algorithm has been applied to image segmentation and neural spike sorting, where it achieves improved accuracy at a much higher speed, as demonstrated both qualitatively and quantitatively.
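The idea of a dataset that shrinks onto its clusters can be illustrated with the classical blurring mean shift, sketched below for 1-D data. This is not the EMS algorithm: it assumes a Gaussian kernel and a fixed bandwidth, whereas EMS updates the bandwidth adaptively. The function name `blurring_mean_shift` and the sample data are hypothetical.

```python
import math

def blurring_mean_shift(points, bandwidth, iterations=10):
    """Repeatedly move every point to the Gaussian-weighted mean of the dataset.

    All points are updated together from the previous iteration's positions,
    so the dataset itself contracts toward its cluster centers.
    """
    pts = list(points)
    for _ in range(iterations):
        moved = []
        for x in pts:
            weights = [math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in pts]
            total = sum(weights)
            moved.append(sum(w * y for w, y in zip(weights, pts)) / total)
        pts = moved
    return pts

# Two well-separated groups of samples (illustrative data).
data = [0.0, 0.2, 0.4, 5.0, 5.1, 5.3]
shrunk = blurring_mean_shift(data, bandwidth=0.5)
print(shrunk)
```

With these inputs each group collapses to nearly a single value after a few iterations, and the two groups stay far apart. The result depends strongly on the bandwidth, which is the sensitivity that an adaptive update is meant to address.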
Neural Signal Feature Extraction. This work is co-developed through collaborations with the Integrated BioElectronics Research Lab. Most neurons in the brain transfer information by action potentials, which can be recorded with microelectrodes. A single electrode is very likely to record action potentials from several adjacent neurons, so further signal processing is required to separate the activities of individual neurons. We present a new spike feature extraction algorithm that targets real-time spike sorting and facilitates miniaturized microchip implementation. The proposed theoretical framework includes neuronal geometry signatures, noise shaping, and informative sample selection. The new algorithm has been evaluated on synthesized waveforms and experimentally recorded sequences. Compared with many spike sorting approaches, our algorithm demonstrates improved speed and accuracy and allows unsupervised execution. A preliminary integrated circuit implementation of the algorithm has been realized and tested.
A Motion Observable Representation Using Color Correlogram. This work presents a special form of color correlogram as a representation for object tracking and carries out a motion observability analysis to obtain the optimal correlogram in a kernel-based tracking framework. Compared with the color histogram, where the position information of each pixel is ignored, a simplified color correlogram (SCC) representation encodes the spatial information explicitly and enables an estimation algorithm to recover the object orientation. In this paper, based on the SCC representation, the mean shift algorithm is developed in a translation–rotation joint domain to track the positions and orientations of objects. The ability of the SCC to detect and estimate object motion is analyzed, and a principled way to obtain the optimal SCC as the object representation is proposed to ensure reliable tracking.
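The contrast between a histogram and a pairwise color statistic can be sketched as follows. This is a generic co-occurrence count along one displacement, not the SCC defined in the paper. The function names and the two tiny patches are hypothetical, and the colors are assumed to be already quantized into labels.

```python
from collections import Counter

def color_histogram(img):
    """Count how often each color label occurs; pixel positions are ignored."""
    return Counter(c for row in img for c in row)

def color_cooccurrence(img, dx, dy):
    """Count ordered color pairs (img[y][x], img[y + dy][x + dx]) inside the image."""
    h, w = len(img), len(img[0])
    pairs = Counter()
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                pairs[(img[y][x], img[yy][xx])] += 1
    return pairs

# The same two-color object in two orientations (illustrative data).
upright = [["R", "R"],
           ["B", "B"]]
rotated = [["R", "B"],
           ["R", "B"]]

print(color_histogram(upright) == color_histogram(rotated))
print(color_cooccurrence(upright, 1, 0))
print(color_cooccurrence(rotated, 1, 0))
```

Both patches have identical histograms, yet their horizontal co-occurrence counts differ. A pairwise statistic of this kind therefore carries orientation information that a histogram cannot.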
Robust Face Tracking in Real-World Videos. The algorithm was designed and implemented during my 3-month internship (mentors: Sanjiv Kumar and Henry Rowley) at Google Research, NYC in the summer of 2008. The objective of the project is to track faces in large-scale real-world videos, with applications to event recognition and face sequence indexing from YouTube data. The main novelty of the method is the use of importance sampling to incorporate independent tracking modules into the particle filtering framework in a principled manner. The new algorithm naturally combines the merits of both face-specific and generic trackers while achieving a speedup of tens of times over its particle filtering based counterpart.
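A single importance sampling step with a mixture proposal can be sketched in one dimension as below. This is a toy illustration, not the tracker described above: it assumes Gaussian models throughout, and the detector output, the noise levels, and the function names are all hypothetical. Particles are drawn either around an independent detector's output or from the motion prediction, then reweighted so that the estimate still targets the posterior.

```python
import math
import random

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_sampling_step(pred_mean, pred_std, detection, det_std,
                             obs, obs_std, n=5000, mix=0.5, seed=0):
    """Estimate the posterior mean of the state with a mixture proposal."""
    rng = random.Random(seed)
    particles, weights = [], []
    for _ in range(n):
        # Proposal: detector-guided with probability `mix`, else motion prediction.
        if rng.random() < mix:
            x = rng.gauss(detection, det_std)
        else:
            x = rng.gauss(pred_mean, pred_std)
        proposal = (mix * normal_pdf(x, detection, det_std)
                    + (1.0 - mix) * normal_pdf(x, pred_mean, pred_std))
        # Unnormalized target: observation likelihood times motion prior.
        target = normal_pdf(obs, x, obs_std) * normal_pdf(x, pred_mean, pred_std)
        particles.append(x)
        weights.append(target / proposal)
    total = sum(weights)
    weights = [w / total for w in weights]
    estimate = sum(w * x for w, x in zip(weights, particles))
    return estimate, weights

estimate, weights = importance_sampling_step(
    pred_mean=5.0, pred_std=2.0, detection=6.2, det_std=1.0,
    obs=6.0, obs_std=1.0)
print(estimate)
```

For these Gaussian assumptions the exact posterior mean is 5.8, so the weighted estimate should land close to that value. Dividing by the mixture density is what keeps the detector from biasing the result while still concentrating particles where the state is likely to be.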
Real-Time Tracking Using Camera Combo for Remote Collaboration. I participated in this project during my 3-month internship (manager: Zhengyou Zhang; mentor: Cha Zhang) at Microsoft Research, Redmond, WA in the summer of 2007. The project targets personal remote collaboration. A camera combo with one fisheye camera and one Pan-Tilt-Zoom (PTZ) camera is used to capture general objects of interest. The fisheye camera has a wide field of view, and the PTZ camera can pan, tilt and zoom based on analysis of the images captured by the wide-angle camera. At the core of the system is a semantic saliency map that overcomes many limitations of low-level saliency maps computed from preliminary image features. The map is used for PTZ camera control with a novel virtual director based on information loss optimization. The effectiveness of the proposed method is demonstrated with real-world sequences.
Part-Based Human Tracking in a Multiple Cue Fusion Framework. This project was developed during my 3-month internship (mentors: Jinman Kang and Wei Hua) at Vidient, Inc., Sunnyvale, CA in the summer of 2005. The objective of this project is a real-time video surveillance system that handles challenging issues in multiple human tracking such as occlusions, sharp motion changes and multi-person confusions. Toward this goal, we propose to intelligently fuse multiple cues, including human body decomposition results based on a head detector, color information, and motion information. Part-based methods are adopted to provide a second level of information fusion, in that parts with bad observability can be compensated for by tracking other, more visible ones.
Past Projects
Ink Cleanup. I worked on this “Digital Ink Cleanup” project during my 4-month internship (mentor: Zhouchen Lin) at Microsoft Research Asia in Beijing, China in 2003. Digital Ink technology is one of the novelties of the Tablet PC. It enables people to truly "write" on computers. We analyzed user inputs and designed algorithms to remove redundant strokes and clarify input words prior to processing by digital ink recognizers. The system is effective in cleaning up ink notes as well as increasing the recognition rate.
Texture Mapping on Talking Faces for Portable Devices. This work is part of the Chinese National Science Foundation project “Real-time 3-D Reconstruction of Speech-Driven Expression Animation”. I proposed a method that uses a single frontal-view face image for efficient texture mapping on talking faces. The algorithm does not require an exact match between the model and the presented texture. Satisfactory mapping can be achieved through an interactive adjustment scheme, where users define correspondences for feature locations by editing the key points and their influence regions. The new method balances efficiency and realism well.
Efficient Belief Propagation for Image Restoration. Markov Random Field (MRF) theory provides a consistent way of modeling context-dependent entities such as image pixels. Image restoration in the MRF framework leads to an optimization problem that is NP-hard, so approximation techniques such as belief propagation have been proposed. The drawback of standard belief propagation is its inefficiency. In this project, I implemented the efficient belief propagation method proposed by Felzenszwalb and Huttenlocher and applied it to additive noise removal and image inpainting. Further, other methods for additive noise removal, such as the total variation based, the bilateral filtering based and the mean shift based methods, were studied and compared with the efficient belief propagation based one.
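One source of efficiency in min-sum belief propagation is computing each message in time linear in the number of labels. The sketch below assumes a linear discontinuity cost c·|p − q| between neighboring labels, for which a forward and a backward pass over the label costs give the same message as the quadratic-time minimization. The function names and the sample costs are hypothetical, and the sketch covers only this message update, not a full restoration pipeline.

```python
def message_brute_force(h, c):
    """m[q] = min over p of h[p] + c * |p - q|, computed in O(k^2)."""
    k = len(h)
    return [min(h[p] + c * abs(p - q) for p in range(k)) for q in range(k)]

def message_two_pass(h, c):
    """Same message in O(k) using a forward and a backward sweep."""
    m = list(h)
    for q in range(1, len(m)):            # propagate cheap labels rightward
        m[q] = min(m[q], m[q - 1] + c)
    for q in range(len(m) - 2, -1, -1):   # propagate cheap labels leftward
        m[q] = min(m[q], m[q + 1] + c)
    return m

# h[p]: data cost plus incoming messages for label p (illustrative values).
h = [3.0, 1.0, 4.0, 1.5, 9.0, 2.0, 6.0]
print(message_two_pass(h, 1.0))
print(message_two_pass(h, 1.0) == message_brute_force(h, 1.0))
```

Because every pixel sends a message to each neighbor in every iteration, reducing the per-message cost from quadratic to linear in the number of intensity levels matters a great deal for images with 256 labels.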
Efficient Multiple Object Trajectory Tracking. Most tracking algorithms are based on the maximum a posteriori (MAP) solution of a probabilistic framework called the Hidden Markov Model, where the distribution of the object state at the current time instance is estimated from current and previous observations. However, this approach is prone to errors caused by temporal distractions such as occlusion, background clutter and multi-object confusion. Trajectory tracking algorithms instead seek the optimal state sequence that maximizes the joint state-observation probability. In this research topic, we proposed a probabilistic framework in which trajectory tracking is more mathematically sound. A recovery mechanism is incorporated for this purpose, which prevents the algorithm from getting stuck at a local maximum. Effort was also put into improving the efficiency of trajectory tracking by using a hierarchical scheme and other techniques, including hypothesis pruning and backward checking.
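For a discrete Hidden Markov Model, the state sequence that maximizes the joint state-observation probability can be found with the standard Viterbi recursion, sketched below. This is a textbook illustration, not the framework proposed in this project. The two-state model, its probabilities, and the observation sequence are hypothetical, and all probabilities are assumed to be nonzero so that logarithms are defined.

```python
import math

def viterbi(init, trans, emit, observations):
    """Most probable state sequence of a discrete HMM, computed in the log domain."""
    n = len(init)
    score = [math.log(init[s]) + math.log(emit[s][observations[0]])
             for s in range(n)]
    back = []
    for obs in observations[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            # Best predecessor of state s at this time step.
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s])
                         + math.log(emit[s][obs]))
        back.append(ptr)
    # Trace the best final state back to the first frame.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]      # objects rarely jump between states
emit = [[0.8, 0.2], [0.2, 0.8]]       # observations are usually correct
observations = [0, 0, 1, 0, 0]        # frame 3 is a one-frame distraction

print(viterbi(init, trans, emit, observations))
```

In this toy model the single misleading observation in the third frame does not pull the recovered trajectory away from state 0, because switching states twice costs more than explaining one unlikely observation.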