OnlineNMF.jl Documentation
Overview
OnlineNMF.jl binarizes CSV files, summarizes the data matrix, and performs several online NMF functions on extremely large matrices in an out-of-core manner, without loading the whole data set into memory.
Online NMF is performed in the following two steps.
- Step 1. Binarization: We assume that the data is a matrix of integer count values saved as a comma-separated CSV, Matrix Market (MM), or Binary COO (BinCOO) file. Using the OnlinePCA package, these files are converted to Julia binary files by `csv2bin`, `mm2bin`, or `bincoo2bin`, respectively. This step greatly accelerates I/O.
- Step 2. Online NMF: `nmf` and `dnmf` can be performed on the binary file generated by `csv2bin`. `sparse_nmf` and `sparse_dnmf` can be performed on the binary file generated by `mm2bin`. `bincoo_nmf` and `bincoo_dnmf` can be performed on the binary file generated by `bincoo2bin`.
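The idea behind the two steps can be sketched in plain Julia without any packages. This is only an illustration of binarizing a count matrix and then streaming it row by row. The helper names `toy_csv2bin` and `toy_summary` and the file layout (two `Int64` header values followed by `UInt32` rows) are assumptions made for this sketch. They are not the functions or the binary format used by OnlinePCA or OnlineNMF.jl.

```julia
# Hypothetical helpers for illustration only; NOT the OnlinePCA.jl file format.

# Step 1 (sketch): convert a CSV of integer counts into a flat binary file.
# Assumed layout: header (nrow::Int64, ncol::Int64), then each row as UInt32 values.
function toy_csv2bin(csvfile::AbstractString, binfile::AbstractString)
    nrow = 0
    ncol = 0
    open(binfile, "w") do out
        write(out, Int64(0))                  # placeholder for nrow
        write(out, Int64(0))                  # placeholder for ncol
        for line in eachline(csvfile)         # only one line is held in memory
            isempty(strip(line)) && continue
            counts = parse.(UInt32, split(line, ','))
            ncol = length(counts)
            write(out, counts)
            nrow += 1
        end
        seekstart(out)                        # go back and fill in the real header
        write(out, Int64(nrow))
        write(out, Int64(ncol))
    end
    return nrow, ncol
end

# Step 2 (sketch): stream the binary file one row at a time and
# accumulate row and column sums without building the full matrix.
function toy_summary(binfile::AbstractString)
    open(binfile, "r") do io
        nrow = read(io, Int64)
        ncol = read(io, Int64)
        rowsums = zeros(Int, nrow)
        colsums = zeros(Int, ncol)
        buf = Vector{UInt32}(undef, ncol)     # reusable buffer for a single row
        for i in 1:nrow
            read!(io, buf)
            rowsums[i] = sum(Int, buf)
            colsums .+= buf
        end
        return rowsums, colsums
    end
end

# Usage on a tiny 2 x 3 count matrix written to a temporary directory.
dir = mktempdir()
csvfile = joinpath(dir, "Data.csv")
binfile = joinpath(dir, "Data.bin")
write(csvfile, "1,0,3\n0,2,4\n")
dims = toy_csv2bin(csvfile, binfile)
rowsums, colsums = toy_summary(binfile)
```

Reading fixed-width binary rows avoids re-parsing text on every pass over the data, which is why a one-time binarization step pays off for algorithms that sweep the matrix many times.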
All programs are available both as a Julia API (OnlineNMF.jl (Julia API)) and as a command-line tool (OnlineNMF.jl (Command line tool)).
Reference
- Multiplicative Update (MU)
  - Alpha-divergence: Cichocki, A. et al., 2008
    - Alpha=2: Pearson divergence-based NMF
    - Alpha=0 or 1: Kullback–Leibler (KL) divergence-based NMF
    - Alpha=0.5: Hellinger divergence-based NMF
  - Beta-divergence: Févotte, C. et al., 2011, Nakano, M. et al., 2010
    - Beta=2: Euclidean distance-based NMF with Gaussian distribution
    - Beta=1: Kullback–Leibler divergence-based NMF with Poisson distribution
    - Beta=0: Itakura-Saito divergence-based NMF with Gamma distribution
- Discretized Non-negative Matrix Factorization (DNMF): Koki Tsuyuzaki, 2023
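To make the beta-divergence cases listed above concrete, here is a minimal sketch of the widely used multiplicative update rule for the factorization X ≈ WH. It is a small in-memory (batch) illustration, not the online, out-of-core implementation in OnlineNMF.jl. The function names `toy_betadiv` and `toy_mu_nmf`, the keyword arguments, and the small constant `tiny` added for numerical safety are all assumptions of this sketch.

```julia
using Random

# Beta-divergence between X and its reconstruction Y, summed over all entries.
# beta=2: Euclidean, beta=1: Kullback–Leibler, beta=0: Itakura-Saito.
function toy_betadiv(X, Y, beta; tiny=1e-10)
    Xe = X .+ tiny
    Ye = Y .+ tiny
    if beta == 2
        return 0.5 * sum(abs2, X .- Y)
    elseif beta == 1
        return sum(Xe .* log.(Xe ./ Ye) .- Xe .+ Ye)
    elseif beta == 0
        return sum(Xe ./ Ye .- log.(Xe ./ Ye) .- 1)
    else
        return sum((Xe .^ beta .+ (beta - 1) .* Ye .^ beta .-
                    beta .* Xe .* Ye .^ (beta - 1)) ./ (beta * (beta - 1)))
    end
end

# Batch multiplicative updates for X ≈ W * H with W (n x dim) and H (dim x m).
function toy_mu_nmf(X::AbstractMatrix, dim::Integer;
                    beta=2, numepoch=100, seed=1, tiny=1e-10)
    rng = MersenneTwister(seed)
    n, m = size(X)
    W = rand(rng, n, dim) .+ tiny             # strictly positive initial factors
    H = rand(rng, dim, m) .+ tiny
    history = Float64[]                       # divergence after each epoch
    for _ in 1:numepoch
        WH = W * H .+ tiny
        W .*= ((X .* WH .^ (beta - 2)) * H') ./ ((WH .^ (beta - 1)) * H' .+ tiny)
        WH = W * H .+ tiny                    # recompute with the updated W
        H .*= (W' * (X .* WH .^ (beta - 2))) ./ (W' * (WH .^ (beta - 1)) .+ tiny)
        push!(history, toy_betadiv(X, W * H, beta))
    end
    return W, H, history
end

# Usage: factorize a small random count matrix with the KL (beta=1) objective.
X = Float64.(rand(MersenneTwister(0), 0:10, 20, 15))
W, H, history = toy_mu_nmf(X, 3; beta=1, numepoch=50)
```

Because each update multiplies the current factor by a ratio of non-negative terms, W and H stay non-negative without any explicit projection step.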