Matrix Completion with Covariate Information

Journal of the American Statistical Association, 114(525), 198-210, 2019

X. Mao, S. X. Chen and R. K. W. Wong

Abstract

This paper investigates the problem of matrix completion from corrupted data when additional covariates are available. Although seldom considered in the matrix completion literature, these covariates often provide valuable information for completing the unobserved entries of the high-dimensional target matrix $\boldsymbol{A}_0$. Given a covariate matrix $\boldsymbol{X}$ whose rows represent the row covariates of $\boldsymbol{A}_0$, we consider a column-space-decomposition model $\boldsymbol{A}_0=\boldsymbol{X}\boldsymbol{\beta}_0+\boldsymbol{B}_0$, where $\boldsymbol{\beta}_0$ is a coefficient matrix and $\boldsymbol{B}_0$ is a low-rank matrix whose column space is orthogonal to that of $\boldsymbol{X}$. This model facilitates a clear separation between the interpretable covariate effects ($\boldsymbol{X}\boldsymbol{\beta}_0$) and the flexible hidden factor effects ($\boldsymbol{B}_0$). In addition, our work allows the probabilities of observation to depend on the covariate matrix, and hence permits a missing-at-random mechanism. We propose a novel penalized estimator for $\boldsymbol{A}_0$ that uses both Frobenius-norm and nuclear-norm regularizations and can be computed with an efficient and scalable algorithm. Asymptotic convergence rates of the proposed estimators are studied. The empirical performance of the proposed methodology is illustrated via both numerical experiments and a real data application.
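To illustrate the column-space-decomposition model, the following minimal sketch builds a fully observed, noise-free matrix of the form $\boldsymbol{A}_0=\boldsymbol{X}\boldsymbol{\beta}_0+\boldsymbol{B}_0$ and recovers the two components by projecting onto the column space of $\boldsymbol{X}$. It is not the penalized estimator proposed in the paper: there are no missing entries, no noise, and no regularization. The function name `column_space_decompose`, the dimensions, and the simulated data are assumptions made purely for illustration.

```python
import numpy as np

def column_space_decompose(A, X):
    """Split A into X @ beta (in the column space of X) and an orthogonal remainder B.

    Illustrative helper only; assumes A is fully observed and X has full column rank.
    """
    # Least-squares coefficients: beta minimizes ||A - X beta||_F.
    beta, *_ = np.linalg.lstsq(X, A, rcond=None)
    # The remainder is orthogonal to the column space of X by construction.
    B = A - X @ beta
    return beta, B

rng = np.random.default_rng(0)
n1, n2, m, r = 50, 40, 3, 2  # rows, columns, number of covariates, rank of B0 (assumed sizes)

X = rng.standard_normal((n1, m))        # row covariates
beta0 = rng.standard_normal((m, n2))    # coefficient matrix

# Build a rank-r matrix and remove its component in the column space of X,
# so that B0 satisfies the orthogonality condition of the model.
low_rank = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
Q, _ = np.linalg.qr(X)                  # orthonormal basis for the column space of X
B0 = low_rank - Q @ (Q.T @ low_rank)

A0 = X @ beta0 + B0

beta_hat, B_hat = column_space_decompose(A0, X)
```

Because $\boldsymbol{B}_0$ is orthogonal to the column space of $\boldsymbol{X}$, the projection separates the covariate effects from the hidden factor effects exactly in this idealized setting. With missing and noisy entries the projection cannot be applied directly, which is the situation the penalized estimator addresses.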