Image <=> Imagine <=> Intelligence



This webpage organizes the major topics in image processing, computer vision, and machine learning from the viewpoint of brain science, alongside my related project experience. I hope it sheds light on the relationships among these topics. My dissertation, by contrast, is organized and presented from the viewpoint of computer science. Comments and feedback are more than welcome.

Preface:


My primary research interest is finding a unified theory that explains how the brain, especially the visual perception system, works. The philosophy behind my research is not only to build computational models grounded in our understanding of the brain for applications such as dimension reduction and image segmentation but, even more importantly, to reveal unknown functions of the brain and to refine brain theories by comparing the results of neurobiologically motivated computational models with those of experimental psychology. This philosophy mirrors the way research in physics moves forward: 1) building a theory to explain the observations; 2) discovering phenomena that the current theory cannot explain; 3) refining the theory to explain these phenomena.


Chapter 1: Retina and LGN -- the Factory of INCEPTION?


Project 1. Receptive Field for Salient Object Segmentation in Nano Scale

  • Description: Existing segmentation methods often fail on noisy data (SNR ranging from 0.01 to 0.1), such as nano-scale data with intrinsic imaging limitations. This work explores the function of a simple receptive field model of the retina and LGN for three-dimensional (3D) salient object localization at the nano scale. The efficiency of this model is demonstrated on 600x1400x432 cryo-electron tomograms, for which our method yields satisfactory membrane segmentation, similar to that of the state-of-the-art method designed specifically for nano-scale membrane segmentation. (Note: an orientation-sensitive receptive field from V1 is also involved to produce accurate segmentation.)
  • Publication: Chapter 3 of my dissertation
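
A center-surround receptive field is commonly modeled as a difference of Gaussians (DoG). The sketch below is only an illustration of that idea on a toy volume, not the model from the dissertation: the kernel widths, the volume size, and the synthetic blob are all assumptions, and the noise level is far milder than in real tomograms.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth3d(volume, sigma):
    """Separable Gaussian smoothing along each of the three axes."""
    out = volume.astype(float)
    k = gaussian_kernel(sigma)
    for axis in range(3):
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="same"), axis, out)
    return out

def center_surround(volume, sigma_center=1.0, sigma_surround=2.0):
    """DoG response: narrow excitatory center minus wide inhibitory surround."""
    return smooth3d(volume, sigma_center) - smooth3d(volume, sigma_surround)

# Toy data (assumed): unit-variance noise with a small bright cube inside.
rng = np.random.default_rng(0)
volume = rng.normal(0.0, 1.0, size=(24, 24, 24))
volume[10:14, 10:14, 10:14] += 3.0

response = center_surround(volume)
peak = np.unravel_index(np.argmax(response), response.shape)
print("strongest response at voxel", peak)
```

The surround term cancels slowly varying background while the center term averages out voxel-level noise, so the response peaks at compact structures whose size roughly matches the center width.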

Project 2. Depth of Field for Three-Dimensional Reconstruction

  • Description: In LGN, temporal correlations/decorrelations (i.e., focus-of-attention shifts) and spatial correlations (i.e., receptive fields with different scales/depths) from both eyes are explored to benefit a 3D representation of object space. In this work, we explore the depth of field by developing a prototype system that allows efficient acquisition of 3D Drosophila reconstructions. The system first acquires (multiple) microscopic image stacks and then estimates the underlying surface by computing a range image for each stack from an estimate of focus. The range image is then segmented into pieces of biological significance, and parametric shape models (thin plate spline models) are derived to characterize the underlying surface of each component. We demonstrate the effectiveness of the proposed method by extracting the 3D shape of eyes and other parts from image stacks.
  • Publication: Section 4.2 of my dissertation
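
The range-image step can be sketched as a depth-from-focus computation: for each pixel, pick the slice of the stack in which the local focus measure is highest. The focus measure used here (windowed squared Laplacian), the window size, and the synthetic stack are assumptions for illustration; the dissertation may use a different estimator, and the thin plate spline fitting is not shown.

```python
import numpy as np

def box_sum(image, radius=2):
    """Sum over a (2r+1) x (2r+1) window, with wrap-around at the borders."""
    out = np.zeros_like(image)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            out += np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    return out

def focus_measure(image):
    """Squared discrete Laplacian: large where the image is sharply focused."""
    lap = (-4.0 * image
           + np.roll(image, 1, axis=0) + np.roll(image, -1, axis=0)
           + np.roll(image, 1, axis=1) + np.roll(image, -1, axis=1))
    return lap ** 2

def range_image(stack):
    """Per pixel, the index of the slice with the highest local focus."""
    focus = np.stack([box_sum(focus_measure(s)) for s in stack])
    return np.argmax(focus, axis=0)

# Synthetic stack (assumed): the left half is sharp in slice 0,
# the right half is sharp in slice 2, and slice 1 is blurred everywhere.
rng = np.random.default_rng(0)
texture = rng.random((32, 32))
blurred = box_sum(texture, radius=1) / 9.0
left = np.arange(32) < 16
stack = np.stack([
    np.where(left, texture, blurred),
    blurred,
    np.where(left, blurred, texture),
])

depth = range_image(stack)
print(depth[16, 8], depth[16, 24])
```

Estimates near the left/right boundary and the image border are unreliable in this toy version because the focus window mixes sharp and blurred regions there.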

  • Hot Keywords in Literature: Saliency Map, Clustering

    Chapter 2: Visual Cortex -- the Battle between BRIGHT and PREJUDICE?


    Project 3. Reducing the Dimensionality of Data with a Hybrid Model

  • Description: The feature space in the visual cortex consists of feature responses from receptive fields organized in a hierarchical, or deep, architecture. On one hand, simple receptive fields at the low level capture information from visual input; on the other hand, more complex receptive fields at higher levels are more adaptive to prior information from the Inferotemporal Stream and the Parietal Stream. Thus the feature space is maintained in an optimized manner. In order to explicitly optimize discrimination performance in a more generative way, this work proposes a hybrid dimension reduction model combining principal component analysis (PCA) and linear discriminant analysis (LDA). We also present a corresponding dimension reduction algorithm and illustrate the method with several classification experiments. Our results show that the hybrid model admits an optimized solution that outperforms PCA, LDA, and their combination in two separate stages.
  • Publication: Nan Zhao, Washington Mio, and Xiuwen Liu. "A hybrid PCA-LDA model for dimension reduction." Neural Networks (IJCNN), The 2011 International Joint Conference on. IEEE, 2011.
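
One way to picture a single-stage PCA/LDA hybrid is a generalized eigenproblem whose two sides are blended by a parameter. The blending rule below is an assumption made for illustration and is not taken from the paper: with `lam = 0` it reduces to PCA (eigenvectors of the total scatter), and with `lam = 1` it reduces to the Fisher criterion of LDA. The names `hybrid_projection`, `lam`, and `ridge` are hypothetical.

```python
import numpy as np

def scatter_matrices(X, y):
    """Total, within-class, and between-class scatter of the rows of X."""
    Xc = X - X.mean(axis=0)
    S_t = Xc.T @ Xc
    S_w = np.zeros_like(S_t)
    for label in np.unique(y):
        Dk = X[y == label] - X[y == label].mean(axis=0)
        S_w += Dk.T @ Dk
    return S_t, S_w, S_t - S_w          # S_b = S_t - S_w

def hybrid_projection(X, y, lam, n_components, ridge=1e-6):
    """Solve A w = mu B w with A, B blended between PCA (lam=0) and LDA (lam=1)."""
    S_t, S_w, S_b = scatter_matrices(X, y)
    d = X.shape[1]
    A = (1.0 - lam) * S_t + lam * S_b
    B = (1.0 - lam) * np.eye(d) + lam * S_w + ridge * np.eye(d)
    # Reduce to a symmetric problem using the Cholesky factor B = L L^T.
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = L_inv @ A @ L_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    order = np.argsort(vals)[::-1][:n_components]
    W = L_inv.T @ vecs[:, order]
    return W / np.linalg.norm(W, axis=0)

# Toy data (assumed): axis 0 has the largest variance, axis 1 separates the classes.
rng = np.random.default_rng(0)
n = 200
scales = np.array([5.0, 0.5, 0.5])
X0 = rng.normal(size=(n, 3)) * scales
X1 = rng.normal(size=(n, 3)) * scales + np.array([0.0, 2.0, 0.0])
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

w_pca_like = hybrid_projection(X, y, lam=0.0, n_components=1)[:, 0]
w_lda_like = hybrid_projection(X, y, lam=1.0, n_components=1)[:, 0]
print(np.round(w_pca_like, 3), np.round(w_lda_like, 3))
```

Intermediate values of `lam` trade variance preservation against class separation in one step, which is the kind of continuum in which an optimized solution between the two extremes can be searched for.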

    Project 4. Re-initialization-Free Level Set Segmentation Based on Prior Propagation

  • Description: Similar to Project 3, this work explores feature space optimization based on a hybrid model consisting of both low-level features from visual input and high-level features from prior information. Here, however, the feature optimization is carried out implicitly on a feature-based energy and is tested on segmentation of the HIV membrane in 3D cryo-electron tomograms. This energy consists of contributions from gradient information, a shape prior on the ideal level set function, and a spatial prior from nearby slices. Our results show that optimizing the feature-based energy function also allows us to extract an ideal segmentation in 3D.
  • Publication: Sections 4.3, 4.4 of my dissertation
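
The three contributions to the energy can be made concrete with a small sketch. The specific forms below are assumptions chosen for illustration, not the dissertation's definitions: an edge-weighted length term for the gradient information, a penalty on the deviation of |∇φ| from 1 (the property of a signed distance function that makes re-initialization unnecessary) for the shape prior, and a squared difference to the level set function of a neighboring slice for the spatial prior.

```python
import numpy as np

def energy_terms(phi, edge_indicator, phi_neighbor, eps=1.5):
    """Return (edge, shape, spatial) energy contributions for a 2D slice."""
    gy, gx = np.gradient(phi)
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    # Smoothed Dirac delta concentrates the edge term near the zero level set.
    dirac = (eps / np.pi) / (eps ** 2 + phi ** 2)
    edge = np.sum(edge_indicator * dirac * grad_mag)
    # Shape prior: an ideal level set function has unit gradient magnitude.
    shape = 0.5 * np.sum((grad_mag - 1.0) ** 2)
    # Spatial prior: stay close to the level set function of a nearby slice.
    spatial = 0.5 * np.sum((phi - phi_neighbor) ** 2)
    return edge, shape, spatial

# Toy example (assumed): a signed distance function of a circle of radius 20.
yy, xx = np.mgrid[0:64, 0:64]
sdf = np.sqrt((xx - 31.5) ** 2 + (yy - 31.5) ** 2) - 20.0
g = np.ones_like(sdf)                   # no image edges in this toy case

_, shape_sdf, spatial_same = energy_terms(sdf, g, sdf)
_, shape_scaled, _ = energy_terms(3.0 * sdf, g, sdf)
print(shape_sdf, shape_scaled, spatial_same)
```

A signed distance function incurs almost no shape penalty, while a rescaled copy with the same zero level set is penalized heavily, which is how such a term keeps the evolving function well behaved without periodic re-initialization.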

  • Hot Keywords in Literature: Deep Learning

    Chapter 3: Inferotemporal Stream (AIT/CIT/PIT) -- the library of objects?



    Hot Keywords in Literature: ImageNet, Classification

    Chapter 4: Parietal Stream (PP/MST/MT) -- THE MATRIX that correlates objects?


    Project 5. Nano-Scale Context-Sensitive Semantic Segmentation

  • Description: Small object segmentation is still a challenging problem, especially with low SNR, low contrast, and large data size. It is hence a typical big data problem. Following Project 1, this work proposes a new context-sensitive method for segmenting 3D volumes. By using robust context cues (relationships/interactions) between objects to efficiently narrow the search space of the target object, we achieve tractable and reliable nano-scale semantic segmentation. We demonstrate our method on a 600x1400x432 tomogram for segmenting microvilli spikes (typically 12x12x20), for which our method yields accurate spike segmentation in 1 to 2 hours, whereas state-of-the-art semantic segmentation methods fail because they cannot handle the problems mentioned above.
  • Publication: Nan Zhao and Xiuwen Liu. "Nano-Scale Context-Sensitive Semantic Segmentation." International Conference on Image Processing, 2015. (Oral Presentation)
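
The idea of using context to narrow the search space can be sketched as follows. The example assumes, purely for illustration, that the context cue is "target objects lie within a few voxels of an already segmented membrane"; the actual cues used in the paper may differ. The helper `dilate`, the shell width, and the toy membrane are hypothetical.

```python
import numpy as np

def dilate(mask, steps):
    """Grow a boolean mask by `steps` voxels using 6-connected neighbors."""
    out = mask.copy()
    for _ in range(steps):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = out.copy()
        for axis in range(out.ndim):
            for shift in (0, 2):        # neighbor at index - 1, then index + 1
                index = [slice(1, -1)] * out.ndim
                index[axis] = slice(shift, shift + out.shape[axis])
                grown |= padded[tuple(index)]
        out = grown
    return out

# Toy context (assumed): a flat membrane occupying one plane of the volume.
membrane = np.zeros((32, 32, 32), dtype=bool)
membrane[10, :, :] = True

# Candidate voxels: within 5 voxels of the membrane, excluding the membrane itself.
candidates = dilate(membrane, 5) & ~membrane
print("fraction of volume left to search:", candidates.mean())
```

A detector for the small target then only needs to be evaluated on `candidates` instead of the full volume, which is what keeps the search tractable on large tomograms.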

    Project 6. Context-Sensitive Tattoo Segmentation and Classification

  • Description: In object-centered semantic segmentation, information from the background is treated as noise. However, some objects are difficult to describe in terms of their own appearance characteristics. In this work, we developed a context-sensitive semantic segmentation algorithm for extracting tattoos from an image. Segmentation of tattoos with an unknown number of connected components and arbitrary shapes is recast as a figure-ground (tattoo-skin) segmentation. We also applied our segmentation results to multiple tattoo classification tasks and demonstrated state-of-the-art performance using our algorithm.
  • Publication: Allen, Josef D., Nan Zhao, Jiangbo Yuan, Xiuwen Liu. "Unsupervised tattoo segmentation combining bottom-up and top-down cues." SPIE Defense, Security, and Sensing. International Society for Optics and Photonics, 2011.

  • Hot Keywords in Literature: Context, Inference, Big Data



    Copyright © 2015 by the Florida State University, CAVIS and Nan Zhao