Collaborative Research: Support Efficient Similarity Searches for Multidimen= sional Non-ordered Discrete Data Spaces

Supported by the National Science Foundation= Research Grants
            =         

Principal Investigator at MSU

Dr. Sakti Pramanik
Department of Computer Science and Engineering,
Michigan State University, East Lansing, MI, USA
pramanik@msu.edu
(NSF Grant for MSU: IIS-0414576)
 

Principal Investigator at UM-D

Dr. Qiang Zhu
Department of Computer and Information Science,
The University of Michigan, Dearborn, MI, USA
qzhu@umich.edu
(NSF Grant for UM-D: IIS-0414594)
 

Graduate Students

Changqing Chen
Dashiell Matthews Kolbe
Hyun-Jeong Seok
Alok Watve
Qiang Xue

Undergraduate Students

Chad Klochko
Rachel Radziszewski

Alumni

Gang Qian

Project Overview

This collaborative research project conducted jointly by the Michigan State = University and the University of Michigan at Dearborn is to investigate the = issues and techniques for supporting efficient similarity searches in multid= imensional Non-ordered Discrete Data Spaces (NDDS). Similarity searches in N= DDSs are becoming increasingly important for application areas such as genom= e sequence databases. Efficient similarity searches require robust indexing = techniques. Unfortunately, existing indexing methods are either not suitable= for an NDDS (e.g., the R*-tree) or too generic to provide good performance = for an NDDS (e.g., metric trees). The main goal of this project is to study = the fundamental properties of NDDSs and develop indexing methods exploiting = these properties to support efficient similarity searches in NDDSs. We intro= duce a set of essential geometric concepts for an NDDS based on some similar= concepts for a traditional (ordered) Continuous Data Spaces (CDS). Using th= em, we explore and compare promising data-partitioning (DP) based and space-= partitioning (SP) based indexing techniques for NDDSs, including index tree = structures, building strategies, search algorithms and performance models. W= e also study other related issues including supporting various types of quer= ies, adopting different distance measures, indexing hybrid data spaces with = mixed ordered and non-ordered dimensions, developing efficient bulk loading = techniques, and utilizing effective compression schemes. This research will = provide contemporary database indexing techniques to solve relevant issues i= n the emerging fields such as bioinformatics, biometrics and E-commerce.

Project References

Project Web Sites

  • Web site at MSU
  • Web sit= e at UM-D

  • --=_2byo1chufy6p--