Air Pollution and Cancer Rate Data Analysis
CSE891, Spring 2013



The purpose of this project is to apply computational techniques to large-scale data analysis. We analyzed air pollution and cancer data to look for a relationship between them. Our project consists of four parts:


Association rule mining suggested that certain pollutants are related to certain types of cancer. Cluster analysis suggested that low sulfur dioxide pollution was associated with low cancer rates. Our data covered only 1999 to 2009, which was a limitation; a longer time span might have made regression feasible. Deeper analysis and richer data are needed to find more interesting relationships between air pollution and cancer.
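
To make the association rule mining step concrete, the following is a minimal sketch of how support and confidence can be computed for rules linking a pollutant level to a cancer rate level. It is not the project's actual code: the records, the label names (such as high_SO2), and the thresholds are all hypothetical and invented for illustration, and it assumes the measurements have already been discretized into "high" and "low" categories.

```python
from itertools import combinations

# Hypothetical toy records: each set holds the discretized labels for one
# region and year. These values are invented for illustration only.
records = [
    {"high_SO2", "high_lung_cancer"},
    {"high_SO2", "high_lung_cancer"},
    {"high_SO2", "low_lung_cancer"},
    {"low_SO2", "low_lung_cancer"},
    {"low_SO2", "low_lung_cancer"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= r for r in records) / len(records)


def confidence(antecedent, consequent, records):
    """Estimate of P(consequent | antecedent) from the records."""
    joint = support(set(antecedent) | set(consequent), records)
    return joint / support(antecedent, records)


def rules(records, min_support=0.3, min_confidence=0.6):
    """Enumerate single-item -> single-item rules above both thresholds.

    The thresholds are assumed defaults, not values from the project.
    """
    items = sorted(set().union(*records))
    found = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, records)
            # The support check runs first, so the antecedent is never empty
            # of matches when confidence is computed.
            if s >= min_support:
                c = confidence({lhs}, {rhs}, records)
                if c >= min_confidence:
                    found.append((lhs, rhs, s, c))
    return found


if __name__ == "__main__":
    for lhs, rhs, s, c in rules(records):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

A rule with high confidence in such a table only indicates co-occurrence in the data, which is consistent with the cautious wording of the findings above. On real data, a dedicated frequent-itemset algorithm such as Apriori would replace the brute-force pair enumeration used here.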