[arXiv] BigDataFr recommends: Preconditioned Data Sparsification for Big Data with Applications to PCA and K-means

Published on 9 November 2015 by Big Data

Excerpt

We analyze a compression scheme for large data sets that randomly keeps a small percentage of the components of each data sample. The benefit is that the output is a sparse matrix and therefore subsequent processing, such as PCA or K-means, is significantly faster, especially in a distributed-data setting. Furthermore, the sampling is single-pass and applicable to streaming data. The sampling mechanism is a variant of previous methods proposed in the literature combined with a randomized preconditioning to smooth the data. [..]
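To make the idea in the excerpt concrete, here is a minimal Python sketch of "precondition, then keep a few random components per sample". The excerpt does not say which preconditioner or rescaling the authors use, so everything specific here is an assumption made for illustration: the preconditioner is taken to be random sign flips followed by an orthonormal Hadamard transform, the kept entries are chosen uniformly without replacement, and they are rescaled by `p / keep` so that each sparsified entry is unbiased. The function names (`hadamard_matrix`, `precondition`, `sparsify`) are hypothetical and not from the paper.

```python
import numpy as np


def hadamard_matrix(p):
    """Orthonormal Hadamard matrix (Sylvester construction); p must be a power of two."""
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")
    H = np.ones((1, 1))
    while H.shape[0] < p:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(p)


def precondition(X, rng):
    """Smooth each row x of X (shape n x p) by mapping it to H D x.

    D is a diagonal matrix of random signs (an assumed choice of preconditioner).
    The map is orthogonal, so row norms are preserved.
    """
    p = X.shape[1]
    signs = rng.choice([-1.0, 1.0], size=p)
    H = hadamard_matrix(p)
    return (X * signs) @ H.T, signs


def sparsify(Y, keep, rng):
    """Keep `keep` uniformly chosen entries per row and zero out the rest.

    Kept entries are scaled by p / keep so that E[S] = Y (assumed rescaling).
    Each row is processed independently, so this is a single pass over the data.
    """
    n, p = Y.shape
    S = np.zeros_like(Y)
    for i in range(n):
        idx = rng.choice(p, size=keep, replace=False)
        S[i, idx] = Y[i, idx] * (p / keep)
    return S


# Usage: 5 samples of dimension 8, keeping 2 of 8 components per sample.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
Y, signs = precondition(X, rng)
S = sparsify(Y, keep=2, rng=rng)
```

The result is returned as a dense array with zeros for readability; in practice one would store `S` in a sparse format so that downstream steps such as PCA or K-means benefit from the sparsity.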

Read paper
By Farhad Pourkamali-Anaraki, Stephen Becker
Source: arxiv.org
