Collaborative Research: Support Efficient Similarity Searches for
Multidimensional Non-ordered Discrete Data Spaces
Supported by the National Science Foundation
Research Grants
Principal Investigator at MSU
Dr. Sakti Pramanik
Department of Computer Science and Engineering,
Michigan State University, East Lansing, MI, USA
pramanik@msu.edu
(NSF Grant for MSU: IIS-0414576)
Principal Investigator at UM-D
Dr. Qiang Zhu
Department of Computer and Information Science,
The University of Michigan, Dearborn, MI, USA
qzhu@umich.edu
(NSF Grant for UM-D: IIS-0414594)
Graduate Students
Changqing Chen
Dashiell Matthews Kolbe
Hyun-Jeong Seok
Alok Watve
Qiang Xue
Undergraduate Students
Chad Klochko
Rachel Radziszewski
Alumni
Gang Qian
Project Overview
This collaborative research project conducted jointly by the Michigan State
University and the University of Michigan at Dearborn is to investigate the
issues and techniques for supporting efficient similarity searches in multid
imensional Non-ordered Discrete Data Spaces (NDDS). Similarity searches in N
DDSs are becoming increasingly important for application areas such as genom
e sequence databases. Efficient similarity searches require robust indexing
techniques. Unfortunately, existing indexing methods are either not suitable
for an NDDS (e.g., the R*-tree) or too generic to provide good performance
for an NDDS (e.g., metric trees). The main goal of this project is to study
the fundamental properties of NDDSs and develop indexing methods exploiting
these properties to support efficient similarity searches in NDDSs. We intro
duce a set of essential geometric concepts for an NDDS based on some similar
concepts for a traditional (ordered) Continuous Data Spaces (CDS). Using th
em, we explore and compare promising data-partitioning (DP) based and space-
partitioning (SP) based indexing techniques for NDDSs, including index tree
structures, building strategies, search algorithms and performance models. W
e also study other related issues including supporting various types of quer
ies, adopting different distance measures, indexing hybrid data spaces with
mixed ordered and non-ordered dimensions, developing efficient bulk loading
techniques, and utilizing effective compression schemes. This research will
provide contemporary database indexing techniques to solve relevant issues i
n the emerging fields such as bioinformatics, biometrics and E-commerce.
Project References
-
Dashiell Kolbe, Qiang Zhu and Sakti Pramanik,
Reducing Non-Determinism
of k-NN Searching in Non-Ordered Discrete Data Spaces,
Journal of the Information Processing Letters (IPL), to appear.
-
Hyun-Jeong Seok, Gang Qian, Qiang Zhu, Alexander Oswald and Sakti Pramanik,
"Bulk-Loading the ND-Tree in Non-ordered Discrete Data Spaces",
Proc. of 13th International Conference on Database Systems for Advanced
Applications (DASFAA'08-Best Paper), pp 156-171, New Delhi, India, March 19 - 22,
2008.
Project Web Sites
Web site
at MSU
Web site
at UM-D