Supervised Link Prediction Using Multiple Sources
- Zhengdong Lu
- Berkant Savas
- Wei Tang
- Inderjit Dhillon
TR-10-35
SIGGRAPH 2008
Link prediction is a fundamental problem in social network analysis and in modern commercial applications such as Facebook and Myspace. Most existing research approaches this problem by exploring the topological structure of a social network, using only one source of information, in an unsupervised and heuristic manner. However, in many application domains, a number of auxiliary social networks and/or derived proximity networks are available in addition to the social network of interest. In this paper we propose a general framework for supervised link prediction from multiple heterogeneous sources. The contribution of the paper is twofold: (1) a supervised learning framework that can effectively and efficiently learn the dynamics of social networks in the presence of auxiliary networks; (2) a feature design scheme for constructing a rich variety of path-based features using multiple sources, and an effective feature selection strategy based on structured sparsity. Extensive experiments on three real-world collaboration networks show that our model can effectively learn to predict new links using multiple sources, yielding higher prediction accuracy than unsupervised and single-source supervised models.
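The abstract does not spell out how the path-based features are built, so the following is only a minimal sketch of one plausible reading: for each node pair, count the paths of bounded length whose edges come from a given sequence of sources. The function names (`matmul`, `path_features`), the source labels `A` and `B`, and the toy adjacency matrices are all hypothetical and are not taken from the paper.

```python
from itertools import product


def matmul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def path_features(sources, max_len=2):
    """Count paths for every sequence of sources up to max_len.

    sources: dict mapping a source name to its adjacency matrix.
    Returns a dict mapping a label such as "A-B" to a matrix whose
    (i, j) entry counts paths from i to j that take one edge from
    each named source, in order. (Assumed feature design.)
    """
    names = list(sources)
    n = len(sources[names[0]])
    feats = {}
    for length in range(1, max_len + 1):
        for combo in product(names, repeat=length):
            # Start from the identity and chain one source per step.
            m = [[int(i == j) for j in range(n)] for i in range(n)]
            for name in combo:
                m = matmul(m, sources[name])
            feats["-".join(combo)] = m
    return feats


# Toy data (hypothetical): 4 nodes, a target network A and an auxiliary network B.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]   # edges 0-1 and 1-2
B = [[0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 1],
     [1, 0, 1, 0]]   # edges 0-3 and 2-3

feats = path_features({"A": A, "B": B})

# Feature vector for the candidate link (0, 2).
pair_02 = {label: m[0][2] for label, m in feats.items()}
print(pair_02)
```

In this toy case, nodes 0 and 2 are not directly linked in either source, but each source contributes one length-2 path between them. Vectors of this kind, one per candidate pair, could then be passed to a supervised classifier with a sparsity-inducing penalty that keeps or drops features in groups, which is one way to read the "structured sparsity" selection mentioned above.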