ACCEPTED TUTORIALS FOR SDM 2007

April 26 - 28, 2007

Radisson University Hotel

Minneapolis, Minnesota

 

We are pleased to announce the following accepted tutorials for presentation at SDM 2007.

 

INVITED TUTORIAL 1: Data Analytics for Marketing Decision Support

Presenters: Saharon Rosset (IBM) and Naoki Abe (IBM)

 

Abstract: In this tutorial, we will give an overview of the issues in applying data mining and analytics tools to marketing decision support, primarily for marketing/sales optimization and CRM purposes. In particular, we will review some challenges involved in designing analytics tools for real-life marketing decision support; discuss different analytical approaches to addressing these problems in a practically feasible manner; and present case studies from our own experiences in customer lifetime value modeling and customer wallet estimation. This tutorial is intended for data miners, whether from industry or academia, who have an interest in moving beyond generic tools and methods to design real, practically useful solutions to decision-making problems in marketing and related areas.

 

Biodata:

Saharon Rosset is a Research Staff Member in the Predictive Modeling Group, in the Mathematical Sciences Department at IBM Research. He received his B.Sc. degree in Mathematics and his M.Sc. degree in Statistics from Tel Aviv University, and his Ph.D. degree in Statistics from Stanford University. He joined IBM Research in 2003. He received a best paper award at KDD 2002 and two IBM Outstanding Technical Achievement Awards for his work in revenue and opportunity modeling. His recent research has focused on predictive modeling approaches to solving a broad range of operational business problems as well as scientific problems, especially in Computational Biology.

 

Naoki Abe has been a research staff member in the Predictive Modeling/Data Analytics Research group since June 2001, and is engaged in research in machine learning and data mining. Naoki obtained his B.S. and M.S. in computer science from MIT in 1984, and a Ph.D. in Computer and Information Science from the University of Pennsylvania in 1989. He worked at IBM T. J. Watson Research Center from 1984 to 1985, and was a post-doctoral researcher at U.C. Santa Cruz from 1989 to 1990, where he conducted research in computational learning theory. During the 1990s, he was with NEC research laboratories in Japan. From 1998 to 2000, he was an adjunct Associate Professor at the Tokyo Institute of Technology. Naoki has served on program committees for the ICML, COLT, ALT, KDD, ICDM and SDM conferences, and is currently on the editorial boards of the Journal of Machine Learning Research and the Data Mining and Knowledge Discovery journal. His current research interests are in the applications of novel machine learning methods to business analytics and optimization.

 

TUTORIAL 2: Mining Large Time-evolving Data Using Matrix and Tensor Tools

Presenters: Christos Faloutsos (CMU), Tamara G. Kolda (Sandia National Labs), and Jimeng Sun (CMU)

 

Abstract: How can we find patterns in sensor streams (e.g., a sequence of temperatures, water-pollutant measurements, or machine room measurements)? How can we mine an Internet traffic graph over time? Further, how can we make the process incremental? We review the state of the art in four related fields: (a) numerical analysis and linear algebra, (b) multi-linear/tensor analysis, (c) graph mining, and (d) stream mining. We will present both theoretical results and algorithms, as well as case studies on several real applications. Our emphasis is on the intuition behind each method, and on guidelines for the practitioner.

 

Biodata:

Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), seven "best paper" awards, and several teaching awards. He has served as a member of the executive committee of SIGKDD; he has published over 140 refereed articles and one monograph, and holds five patents. His research interests include data mining for streams and networks, fractals, indexing for multimedia and bio-informatics databases, and performance.

 

Tamara G. Kolda is a researcher at Sandia National Laboratories in Livermore, California and has received the Presidential Early Career Award for Scientists and Engineers (2003). She has published over 25 refereed articles and released several software packages including the MATLAB Tensor Toolbox. She is an associate editor for the SIAM Journal on Scientific Computing. Her research interests include multilinear algebra and tensor decompositions, data mining, optimization, nonlinear solvers, graph algorithms, parallel computing and the design of scientific software.

 

Jimeng Sun is a Ph.D. candidate in the Computer Science Department at Carnegie Mellon University. His research interests include data mining on streams, graphs and tensors, and anomaly detection. He has been actively applying data mining techniques to applications such as water sensor work monitoring, financial fraud detection, data center monitoring, and network anomaly detection.

 

 

TUTORIAL 3: Dimensionality Reduction for Data Mining - Techniques, Applications, and Trends

Presenters: Lei Yu (Binghamton U), Jieping Ye (Arizona State U), and Huan Liu (Arizona State U)

 

Abstract: The objective of this tutorial is to give a comprehensive overview of the techniques, applications, and recent developments in the broad field of dimensionality reduction. The increasingly large dimensionality of data from many application domains has posed unprecedented challenges to data mining; in the meantime, new types of data are evolving such as Web, biological, and streaming data. Dimensionality reduction is an essential step in successful data mining applications such as text and Web document categorization, image classification and clustering, and microarray data analysis. This tutorial will introduce the necessary background for dimensionality reduction, present two major techniques for dimensionality reduction (feature selection and feature extraction), demonstrate successful applications of these techniques in various domains, and discuss recent advances and current trends.

 

Biodata:

Lei Yu is an Assistant Professor in the Department of Computer Science at Binghamton University. He received his Ph.D. in Computer Science from Arizona State University. His research areas are machine learning, data mining, and bioinformatics. He is the author of two book chapters and many publications in prestigious forums on feature selection and high-dimensional data preprocessing.

 

Jieping Ye is an Assistant Professor in the Department of Computer Science and Engineering at Arizona State University. He received his Ph.D. in Computer Science from the University of Minnesota, Twin Cities. His research interests lie in machine learning, data mining, and bioinformatics. In 2004, his paper on generalized low rank approximations of matrices won the outstanding student paper award at the Twenty-First International Conference on Machine Learning.

 

Huan Liu is an Associate Professor in the Department of Computer Science and Engineering at Arizona State University (ASU). Before he joined ASU, he worked at Telecom Australia Research Labs (Telstra) and taught at the National University of Singapore. He received his Ph.D. in Computer Science from the University of Southern California. He has published books and journal and conference papers in the areas of data preprocessing for mining (including feature selection and discretization) and data mining (including Web, image and text mining), and has extensive experience with real-world applications.

 

 

TUTORIAL 4: A Statistical Framework for Mining Data Streams

Presenters: Simon Urbanek (AT&T Labs) and Tamraparni Dasu (AT&T Labs)

 

Abstract: Data streams are a predominant form of information today, arising in areas and applications ranging from telecommunications, meteorology and rocketry, to the monitoring and support of e-commerce sites. Data streams are characterized by large volumes and high rates of accumulation. They pose unique analytical, statistical and computing challenges that are just beginning to be addressed. This is an important area to which researchers in data mining can make significant contributions, an area rife with open research problems as well as important industrial and scientific applications. In this tutorial, we give an introduction to and overview of the analysis and monitoring of data streams, with an emphasis on a statistical framework. We discuss the analytical and computing challenges posed by the unique constraints associated with data streams. There is a wide variety of problems: data reduction, characterizing constantly changing distributions, detecting changes in these distributions, computing and updating models for evolving data streams, identifying outliers, tracking rare events, “correlating” multiple data streams, and others.

 

Biodata:

Tamraparni Dasu is a Member of Technical Staff in the Department of Statistics at AT&T Labs – Research, specializing in data mining, data quality and nonparametric statistics. She received her Ph.D. in Statistics from the University of Rochester in 1991. She has been at AT&T Labs – Research (Bell Labs until 1996) since then and has extensive experience in mining massive telecommunication data, data streams and network data. She has given tutorials at KDD, SDM and JSM in the past.

 

Simon Urbanek is a Member of Technical Staff in the Department of Statistics at AT&T Labs – Research, specializing in visualization, exploratory model analysis and data mining. He received his Ph.D. in Statistics from the University of Augsburg in 2004 and joined AT&T Labs – Research the same year. Simon has been working on visualization and analysis of large datasets and has developed, among others, the interactive visualization software packages iPlots and Klimt.