Lecture: Prof. Yi Zhu, 14:30-15:30, June 8, 2018

Date: 2018-05-29

Title: Network cross-validation by edge sampling

Speaker: Prof. Yi Zhu

Abstract: Many models and methods are now available for network analysis, but model selection and tuning remain challenging. Cross-validation is a useful general tool for these tasks in many settings, but it is not directly applicable to networks, since splitting network nodes into groups requires deleting edges and destroys some of the network structure. Here we propose a new network cross-validation strategy based on splitting edges rather than nodes, which avoids losing information and is applicable to a wide range of network problems. We provide a theoretical justification for our method in a general setting, and in particular show that the method has good asymptotic properties under the stochastic block model. Numerical results on simulated networks show that our approach performs well for a number of model selection and parameter tuning tasks. We also analyze a citation network of statisticians, with meaningful research communities emerging from the analysis. This is joint work with Tianxi Li and Elizaveta Levina.


Brief bio: Dr. Zhu received his B.Sc. in Physics from Peking University in China, and his Ph.D. in Statistics from Stanford University in 2003. He is now a Professor in the Department of Statistics at the University of Michigan. Dr. Zhu is recognized as a leading researcher in the areas of statistical machine learning and statistical network analysis. He is also interested in applications in science, health, engineering and business. Dr. Zhu has published more than 100 research papers, including 90+ journal articles, 5 refereed conference articles and 7 discussions. He received a CAREER award from the National Science Foundation (USA) in 2008, and he was elected a Fellow of the American Statistical Association in 2013 and a Fellow of the Institute of Mathematical Statistics in 2015. He also served as Chair-Elect and Chair of the Statistical Learning and Data Mining Section of the American Statistical Association from 2011 to 2013.