Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition

Abstract

The low-rank canonical polyadic tensor decomposition is useful in data analysis and can be computed by solving a sequence of overdetermined least squares subproblems. Motivated by consideration of sparse tensors, we propose sketching each subproblem using leverage scores to select a subset of the rows, with probabilistic guarantees of the solution accuracy. We randomly sample rows proportional to leverage score upper bounds that can be efficiently computed using the special Khatri-Rao subproblem structure inherent in tensor decomposition. Crucially, the number of rows in the sketched system is independent of the number of nonzeros in the full tensor and the number of rows in the full system. Along the way, we provide a practical solution to the generic matrix sketching problem of sampling overabundance for high-leverage-score rows, proposing to include such rows deterministically and combine repeated samples in the sketched system; we conjecture that this can lead to improved theoretical bounds. Numerical results on real-world large-scale tensors show the method is significantly faster than deterministic methods without sacrificing accuracy.

Publication
arXiv
Date
Citation
B. W. Larsen, T. G. Kolda. Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition. arXiv:2006.16438, 2020. http://arxiv.org/abs/2006.16438

Keywords

tensor decomposition, CANDECOMP/PARAFAC (CP), canonical polyadic (CP), matrix sketching, leverage score sampling, randomized numerical linear algebra (RandNLA)

Comments

Initial version posted June 30, 2020. Revised December 10, 2020 and submitted for publication at that time.

BibTeX

@misc{arXiv-LaKo20,  
author = {Brett W. Larsen and Tamara G. Kolda}, 
title = {Practical Leverage-Based Sampling for Low-Rank Tensor Decomposition}, 
month = {December}, 
year = {2020},
eprint = {2006.16438},
eprintclass = {math.NA},
}