mir.model.lda.hoffman
Online variational Bayes for latent Dirichlet allocation
References: Hoffman, Matthew D., Blei, David M., and Bach, Francis R. "Online Learning for Latent Dirichlet Allocation." Paper presented at NIPS, 2010.
License:
Authors:
Ilya Yaroshenko
- struct LdaHoffman(F) if (isFloatingPoint!F);
  Batch variational Bayes for LDA with mini-batches.
- this(size_t K, size_t W, size_t D, F alpha, F eta, F tau0, F kappa, F eps = 1e-05, TaskPool tp = taskPool());
  Parameters:
  size_t K: topic count
  size_t W: dictionary size
  size_t D: approximate total number of documents in the collection
  F alpha: Dirichlet document-topic prior (0.1)
  F eta: Dirichlet word-topic prior (0.1)
  F tau0: tau0 ≥ 0 slows down the early iterations of the algorithm
  F kappa: kappa belongs to (0.5, 1] and controls the rate at which old values of lambda are forgotten: lambda = (1 - rho(tau)) lambda + rho(tau) lambda', where rho(tau) = (tau0 + tau)^(-kappa). Use kappa = 0 for batch variational Bayes LDA.
  F eps: stop iterations if ||lambda - lambda'||_l1 < s * eps, where s is the number of documents in a batch
  TaskPool tp: task pool
- void updateBeta();
- @property Slice!(F*, 2) beta();
  Posterior over the topics.
- @property Slice!(F*, 2) lambda();
  Parameterized posterior over the topics.
- const @property F tau();
  @property void tau(F v);
  Count of documents already seen. Slows down the iterations of the algorithm.
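The way tau, tau0 and kappa combine into a step size can be illustrated with a small standalone sketch. It only restates the documented formulas rho(tau) = (tau0 + tau)^(-kappa) and lambda = (1 - rho) lambda + rho lambda'. The helper names `learningRate` and `blend` and the numeric values are assumptions made for illustration; this is not the library's internal code.

```d
import std.math : pow, isClose;
import std.stdio : writefln;

/// Step size rho(tau) = (tau0 + tau)^(-kappa), as stated in the kappa description.
/// Hypothetical helper, not part of mir.
double learningRate(double tau0, double tau, double kappa)
{
    return pow(tau0 + tau, -kappa);
}

/// Element-wise blend: lambda = (1 - rho) * lambda + rho * lambdaNew.
/// Hypothetical helper, not part of mir.
void blend(double[] lambda, const(double)[] lambdaNew, double rho)
{
    assert(lambda.length == lambdaNew.length);
    foreach (i, ref x; lambda)
        x = (1 - rho) * x + rho * lambdaNew[i];
}

void main()
{
    // Illustrative values only: tau0 = 1, tau = 3, kappa = 1.
    immutable rho = learningRate(1.0, 3.0, 1.0); // (1 + 3)^(-1)

    double[] lambda = [1.0, 2.0];
    blend(lambda, [3.0, 6.0], rho);
    writefln("rho = %s, lambda = %s", rho, lambda);

    // With kappa = 0 the step size is 1, so the old lambda is fully replaced.
    assert(isClose(learningRate(1.0, 3.0, 0.0), 1.0));
}
```

As tau grows, rho shrinks, so each new mini-batch moves lambda less. A larger tau0 has the same damping effect from the very first update.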
- size_t putBatch(SliceKind kind, C, I, J)(Slice!(ChopIterator!(J*, Series!(I*, C*)), 1, kind) n, size_t maxIterations);
  Accepts a mini-batch and performs multiple E-step iterations for each document, followed by a single M-step.
  This implementation is optimized for sparse documents, which contain far fewer unique words than the dictionary.
  Parameters:
  Slice!(ChopIterator!(J*, Series!(I*, C*)), 1, kind) n: mini-batch, a collection of compressed documents
  size_t maxIterations: maximum number of E-step iterations for a single document in a batch
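The stopping rule documented for eps, ||lambda - lambda'||_l1 < s * eps, can be sketched as a standalone check. The helper names `l1Distance` and `converged` are hypothetical, and plain arrays stand in for the library's slices; the sketch only restates the documented inequality and makes no claim about how the library evaluates it internally.

```d
import std.math : fabs;
import std.stdio : writeln;

/// L1 distance between two equally sized arrays.
/// Hypothetical helper, not part of mir.
double l1Distance(const(double)[] a, const(double)[] b)
{
    assert(a.length == b.length);
    double sum = 0;
    foreach (i; 0 .. a.length)
        sum += fabs(a[i] - b[i]);
    return sum;
}

/// True when the documented criterion ||previous - current||_l1 < s * eps holds,
/// where s is the number of documents in the batch.
/// Hypothetical helper, not part of mir.
bool converged(const(double)[] previous, const(double)[] current,
               size_t s, double eps)
{
    return l1Distance(previous, current) < s * eps;
}

void main()
{
    // Illustrative values: a batch of 2 documents with eps = 1e-5.
    writeln(converged([1.0, 2.0], [1.0, 2.0], 2, 1e-5)); // identical arrays
    writeln(converged([1.0, 2.0], [1.1, 2.0], 2, 1e-5)); // differ by about 0.1
}
```

Because the threshold scales with s, a larger batch tolerates a proportionally larger total change before the iterations stop.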
Copyright © 2016-2020 by Ilya Yaroshenko | Page generated by
Ddoc on Sun Nov 15 09:37:38 2020