We are pleased to announce the following accepted tutorials for presentation at SDM 2007.
INVITED TUTORIAL 1: Data Analytics for Marketing Decision Support
Presenters: Saharon Rosset (IBM) and Naoki Abe (IBM)
Abstract: In this tutorial, we will give an overview of the issues in applying data mining
and analytics tools to marketing decision support, primarily for marketing/sales
optimization and CRM purposes. In particular, we will review some challenges
involved in designing analytics tools for real-life marketing decision support,
discuss different analytical approaches to addressing these problems in a
practically feasible manner, and present case studies from our own experiences in
customer lifetime value modeling and customer wallet estimation. This tutorial
is intended for data miners, whether from industry or academia, who have an
interest in moving beyond generic tools and methods to design real, practically
useful solutions to decision making problems in marketing and related areas.
Biodata:
Saharon Rosset is a Research
Staff Member in the Predictive Modeling Group, in the Mathematical Sciences
Department at IBM Research. He received
his B.Sc. degree in Mathematics and his M.Sc. degree in Statistics from
Naoki Abe has been a research
staff member in the Predictive Modeling/Data Analytics Research group since
June, 2001, and is engaged in research in machine learning and data mining.
Naoki obtained his B.S. and M.S. in computer science from MIT in 1984, and a
Ph.D. in Computer and Information Science from the
TUTORIAL 2: Mining Large Time-evolving Data Using Matrix and Tensor Tools
Presenters: Christos Faloutsos (CMU), Tamara G Kolda (Sandia National Labs), and Jimeng Sun (CMU)
Abstract: How can we find patterns in sensor streams (e.g., a
sequence of temperatures, water-pollutant measurements, or machine room measurements)?
How can we mine an Internet traffic graph over time? Further, how can we make the
process incremental? We review the state of the art in four related fields: (a)
numerical analysis and linear algebra, (b) multi-linear/tensor analysis, (c)
graph mining, and (d) stream mining. We will present both theoretical results
and algorithms as well as case studies on several real applications. Our
emphasis is on the intuition behind each method, and on guidelines for the
practitioner.
Biodata:
Christos Faloutsos is a
Professor at
Tamara G. Kolda is a
researcher at Sandia National Laboratories in
Jimeng Sun is a PhD candidate
in the Computer Science Department at
TUTORIAL 3: Dimensionality Reduction for Data Mining - Techniques, Applications, and Trends
Presenters: Lei Yu (Binghamton U), Jieping Ye (Arizona State U), and Huan Liu (Arizona State U)
Abstract: The objective of this tutorial is to give a
comprehensive overview of the techniques, applications, and recent developments
in the broad field of dimensionality reduction. The increasingly large
dimensionality of data from many application domains has posed unprecedented
challenges to data mining; meanwhile, new types of data are emerging, such
as Web, biological, and streaming data. Dimensionality reduction is an
essential step in successful data mining applications such as text and Web
document categorization, image classification and clustering, and microarray
data analysis. This tutorial will introduce the necessary background for
dimensionality reduction, present two major techniques for dimensionality
reduction (feature selection and feature extraction), demonstrate successful
applications of these techniques in various domains, and discuss recent
advances and current trends.
Biodata:
Lei Yu is an Assistant
Professor of the Department of Computer Science at
Jieping Ye is an Assistant
Professor of the Department of Computer Science and Engineering at the
Huan Liu is an Associate
Professor of the Department of Computer Science and Engineering at Arizona
State University (ASU). Before he joined ASU, he worked at Telecom Australia
Research Labs (Telstra) and taught at the National University of Singapore. He
received his Ph.D. in Computer Science from
TUTORIAL 4: A Statistical Framework for Mining Data Streams
Presenters: Simon Urbanek (AT&T Labs) and Tamraparni Dasu (AT&T Labs)
Abstract: Data streams are a predominant form of information
today, arising in areas and applications ranging from telecommunications,
meteorology and rocketry, to the monitoring and support of e-commerce sites.
Data streams are characterized by large volumes and high rates of accumulation.
They pose unique analytical, statistical and computing challenges that are just
beginning to be addressed. This is an important area to which researchers in data
mining can make significant contributions, and one rife with open research
problems as well as important industrial and scientific applications. In this
tutorial, we give an introduction and overview of the analysis and monitoring
of data streams, with an emphasis on a statistical framework. We discuss the
analytical and computing challenges posed by the unique constraints associated
with data streams. There is a wide variety of problems: data reduction,
characterizing constantly changing distributions, detecting changes in these
distributions, computing and updating models for evolving data streams,
identifying outliers, tracking rare events, “correlating” multiple data streams,
and others.
Biodata:
Tamraparni Dasu is a Member
of Technical Staff in the Department of Statistics at AT&T Labs – Research
specializing in data mining, data quality and nonparametric statistics. She
received her Ph.D. in Statistics from the
Simon Urbanek is a Member of
Technical Staff in the Department of Statistics at AT&T Labs – Research
specializing in visualization, exploratory model analysis and data mining. He
received his Ph.D. in Statistics from the