# Sparse binary matrix decomposition in Matlab

Sparse binary matrices are a cornerstone of the deep learning craze a la Geoff Hinton. A mixture of Bernoullis is a Bayesian, that is, probabilistic, technique; Bernoulli is a fancy word for binary. The idea is that each example comes from one of a handful of Bernoulli distributions, or components - a component is like a topic in LDA. The downside of this method is that each row is supposed to come from exactly one component. Still, for each row it gives you a vector of probabilities, one per component, that the example belongs to that component, which might be OK after all.
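To make the idea concrete, here is a minimal EM sketch for fitting a mixture of Bernoullis with numpy. The function name and defaults are our own invention; the aspect Bernoulli model used in the Matlab code differs in details, so treat this as an illustration of the general technique, not that implementation.

```python
import numpy as np

def bernoulli_mixture_em(X, k, n_iter=50, seed=0):
    """Fit a k-component mixture of Bernoullis to a binary matrix X (n x d) with EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                   # mixing weights
    mu = rng.uniform(0.25, 0.75, size=(k, d))  # per-component Bernoulli parameters
    for _ in range(n_iter):
        # E-step: log p(x, component), shifted row-wise for numerical stability
        log_lik = X @ np.log(mu).T + (1 - X) @ np.log(1 - mu).T + np.log(pi)
        log_lik -= log_lik.max(axis=1, keepdims=True)
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)  # responsibilities; rows sum to 1
        # M-step: re-estimate mixing weights and Bernoulli parameters
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, mu, resp
```

The returned `resp` is exactly the row of per-component probabilities described above.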

Also, as is often the case with Bayesian methods, it is probably not the fastest. We have used the aspect Bernoulli model; the code is available as a single Matlab file.

Results are on par with other methods. The supervised score is apparently better when these techniques are applied to TF-IDF-transformed data.

For numbers, refer to the updated results. We provide a script using Gensim. Inverse document frequency means that rare features weigh more than frequent ones; the sparsity stays the same, meaning that zeros are still zeros. Alternatively, you can just apply vanilla PCA.

*Principal components of adult, as shown by FastICA.*
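To see why IDF upweights rare features, here is a toy sketch in plain Python. The helper is our own and uses the simple `tf * log(N / df)` weighting; Gensim's default scheme differs in details. It also shows that the transform preserves the sparsity pattern.

```python
import math

def tfidf_rows(rows):
    """Toy TF-IDF for binary rows: weight = tf * log(N / df).
    Zeros stay zeros, so the sparsity pattern is preserved."""
    n = len(rows)
    d = len(rows[0])
    df = [sum(row[j] for row in rows) for j in range(d)]          # document frequency
    idf = [math.log(n / df[j]) if df[j] else 0.0 for j in range(d)]
    return [[row[j] * idf[j] for j in range(d)] for row in rows]
```

With this weighting, a feature present in every row gets weight zero, while a feature present in a single row gets the largest weight.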

To sum up, LiangJie Hong, a research scientist at Yahoo! Labs, reviewed a few papers on binary matrix decomposition, and his punch line is this: "In all, it seems that the performance advantages of specifically designed binary data models are small." We would generally agree.



All methods we have tried seem to be rather close to each other as regards the supervised score they achieve. The main difference is in speed and scalability.