I am broadly interested in data mining, machine learning, and optimization, with applications in recommender systems, social graph analysis, text mining, and natural language processing. My PhD thesis, titled A Framework for Semantic and Social Recommender Systems, introduces a unified framework for recommendation with multiple sources of side information. In particular, it introduces matrix factorization with trust and distrust relations between users, distance metric learning from constraint graphs, semantic Levenshtein distance, weighted association rules, binary data clustering, and a few other algorithms for recommender systems.
Machine Learning for Recommendation
Recommender systems have become ubiquitous in recent years and are deployed by services such as Netflix and Amazon. However, a few challenges, such as data sparsity and cold-start users and items, still need to be addressed. We are interested in developing and applying machine learning methods such as matrix factorization, ranking, and clustering to resolve these issues.
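As a rough illustration of the matrix factorization idea mentioned above, the following minimal sketch fits low-rank user and item factors to a handful of observed ratings with stochastic gradient descent. The toy ratings, the function names (factorize, predict, rmse), and the hyperparameters are assumptions made up for this example; it is a generic textbook formulation, not the method developed in the thesis.

```python
import math
import random


def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.05,
              epochs=1000, seed=0):
    """Fit latent factors to (user, item, rating) triples with plain SGD."""
    rng = random.Random(seed)
    # Small random positive initial factors (an arbitrary choice for this sketch).
    P = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating is the inner product of user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the given rating triples."""
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq / len(ratings))


# Hypothetical sparse rating data: (user, item, rating) on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]

P0, Q0 = factorize(ratings, n_users=4, n_items=3, epochs=0)  # untrained factors
P, Q = factorize(ratings, n_users=4, n_items=3)
before, after = rmse(ratings, P0, Q0), rmse(ratings, P, Q)
print(f"training RMSE before: {before:.3f}, after: {after:.3f}")
print(f"predicted rating for unseen pair (user 0, item 2): {predict(P, Q, 0, 2):.2f}")
```

The unseen pair at the end shows how the learned factors fill in a missing entry of a sparse rating matrix. A cold-start user with no ratings at all would still have no informative factors here, which is where side information becomes useful.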
Document Clustering, Semantic Analysis, and Feature Selection
Document clustering is the application of cluster analysis to textual documents, with numerous applications in automatic document organization, topic extraction, recommender systems, and fast information retrieval or filtering. In the last couple of years, we have been working on developing efficient algorithms for document clustering. Also, because of the computational burden of high-dimensional textual data, selecting informative features, a problem known as feature selection, has been a focus of our research over the same period. Extracting the semantics of documents, ontology mapping, and part-of-speech tagging are other related problems we are interested in exploring.
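Most document clustering pipelines start from a vector representation of each document and a similarity measure between those vectors. The sketch below shows one common choice, TF-IDF weighting with cosine similarity, on three toy documents. The documents, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration, not the representation used in our work.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)  # smoothed IDF
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy corpus: two documents on recommendation, one on tagging.
docs = [
    "matrix factorization for movie recommendation",
    "movie recommendation with user ratings",
    "part of speech tagging for text",
]
vecs = tfidf_vectors(docs)
sim_same_topic = cosine(vecs[0], vecs[1])
sim_other_topic = cosine(vecs[0], vecs[2])
print(f"same topic: {sim_same_topic:.3f}, different topic: {sim_other_topic:.3f}")
```

A clustering algorithm would group documents 0 and 1 because of their higher similarity. The vocabulary size is also the dimensionality of the vectors, which is why feature selection matters for large corpora.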
Convex and Non-convex Optimization
Many problems in data mining, machine learning, and communication can be cast as optimization problems. Although in some cases these problems are convex and can be solved efficiently by off-the-shelf convex optimization methods, in general most of them are non-convex or combinatorial. One of our main lines of research is developing efficient meta-heuristic methods to tackle these hard problems. In particular, our main focus has been on optimization methods that balance exploration and exploitation when searching the solution space.
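To make the exploration-exploitation trade-off concrete, here is a minimal sketch of simulated annealing, one standard meta-heuristic, applied to a one-dimensional non-convex function. The objective, the starting point, and the cooling schedule are assumptions picked for illustration; this is not one of the specific methods developed in our research.

```python
import math
import random


def simulated_annealing(objective, x0, step=1.0, t_start=5.0, t_end=1e-3,
                        cooling=0.995, seed=0):
    """Minimize objective starting from x0; return the best (x, f(x)) seen."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    t = t_start
    while t > t_end:
        candidate = x + rng.uniform(-step, step)
        fc = objective(candidate)
        # Always accept improvements (exploitation). Accept worse moves with
        # probability exp(-(fc - fx) / t), which is high while the temperature
        # is high (exploration) and shrinks as the search cools down.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling  # geometric cooling schedule
    return best_x, best_f


def objective(x):
    """Non-convex test function with more than one local minimum."""
    return x * x + 10.0 * math.sin(x)


x0 = 4.0  # starts near a local minimum that is not the global one
best_x, best_f = simulated_annealing(objective, x0)
print(f"start f({x0}) = {objective(x0):.3f}")
print(f"best  f({best_x:.3f}) = {best_f:.3f}")
```

A pure descent method started at the same point would tend to stay in the nearby local minimum. The temperature parameter is what lets the search accept uphill moves early on and then settle as it cools.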
Many machine learning approaches rely on some similarity metric, including unsupervised learning such as clustering, learning to rank in information retrieval, face verification and identification, and recommender systems. In addition, aggregating and utilizing different rich sources of information, or covariates, has been a challenging problem in many machine learning applications. This problem is of great importance, especially when a single view of the data is sparse or incomplete. Despite recent developments in hybrid methods, the general problem of integrating and aggregating data from diverse sources remains open. We are interested in learning a similarity metric by aggregating multiple sources of side information into a single distance metric that can be used in different applications.
Social Graph Analysis and Collaborative Ranking
In recent years there has been an upsurge of interest in understanding and exploiting social information, such as trust and distrust relations among users, along with rating data to improve the performance of recommender systems and resolve sparsity and cold-start problems. We design novel algorithms to exploit social relations between users and to better understand problems such as the propagation of trust/distrust relations, link prediction, and influence diffusion in social networks.