Fully distributed robust singular value decomposition

    Low-rank matrix approximation is an important tool in data mining with a wide range of applications including recommender systems, clustering, and identifying topics in documents. The problem we tackle is implementing singular value decomposition (SVD)---a popular method for low rank approximation-in large fully distributed P2P systems in a robust and scalable manner. We assume that the matrix to be approximated is stored in a large network where each node knows one row of the matrix (personal attributes, documents, media ratings, etc). In our P2P model, we do not allow this personal information to leave the node, yet we want the nodes to collaboratively compute the SVD.
    Methods applied in large scale distributed systems such as synchronized parallel gradient search or distributed iterative methods are not preferable in our system model due to their requirements of synchronized rounds or their inherent issues with load balancing.
    Our approach overcomes these limitations with the help of a distributed stochastic gradient search in which the personal part of the decomposition remains local, and the global part (e.g., movie features) converges at all nodes to the correct value.
    We present a theoretical derivation of our algorithm, as well as a thorough experimental evaluation of real and synthetic data as well. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme failure scenarios.

    Év: 
    2014
    Szerzők: 
    Hegedus, I., Jelasity, M., Kocsis, L., Benczúr, A. A.
    Kiadvány: 
    Peer-to-Peer Computing (P2P), 14-th IEEE International Conference