**Scaling Parallel Graph Analysis and Machine Learning using Sparse Matrix Operations**

Aydin Buluc

Staff Scientist

Lawrence Berkeley National Lab

Friday, April 27, 2018

11 AM - 12 PM

EB 3105

Abstract:

Data-intensive applications from many scientific domains rely on algorithms with irregular data access patterns, such as those found in graph analysis and machine learning. Data from these scientific application domains are often sparse, with many missing entries. Due to this sparsity, hand-written implementations of common graph and machine learning routines cannot efficiently harness the capabilities of large-scale parallel computers.

Many graph algorithms have been defined in the language of linear algebra. Mapping sparse matrix algorithms onto modern architectures is relatively well understood, and several groups have built high-performance graph libraries based on sparse linear algebra. A group of researchers formed the GraphBLAS Forum to standardize linear-algebraic building blocks for graph algorithms.

In this talk, I will describe the GraphBLAS initiative and summarize our work on building efficient parallel algorithms for some of the most challenging GraphBLAS operations. I will demonstrate how novel concepts of GraphBLAS such as masks enable the most efficient implementations of various graph algorithms, including those that utilize the celebrated direction-optimized traversal. I will then show how sparse-matrix primitives in GraphBLAS can also be used to implement much-needed functionality for machine learning algorithms. In particular, I will describe our recent work on distributed-memory parallelization of two prevalent machine learning problems: graphical model estimation and flow-based clustering. I will highlight the importance of communication-avoiding sparse matrix operations to achieve scalability in these problems. I will conclude with open problems and future directions.

Biography:

Aydin Buluc is a Staff Scientist at the Lawrence Berkeley National Laboratory (LBNL) and an Adjunct Assistant Professor of EECS at UC Berkeley. His research interests include parallel computing, combinatorial scientific computing, high-performance graph analysis and machine learning, sparse matrix computations, and computational genomics. Previously, he was a Luis W. Alvarez postdoctoral fellow at LBNL and a visiting scientist at the Simons Institute for the Theory of Computing. He received his PhD in Computer Science from the University of California, Santa Barbara and his BS in Computer Science and Engineering from Sabanci University, Turkey. Dr. Buluc received the DOE Early Career Award in 2013 and the IEEE TCSC Award for Excellence for Early Career Researchers in 2015. He is a founding associate editor of the ACM Transactions on Parallel Computing.

Host:

Dr. H. Metin Aktulga