CMStatistics 2018
B0332
Title: Scalable Bayesian non-linear SVMs for big data problems
Authors: Sounak Chakraborty - University of Missouri, Columbia (United States) [presenting]
Abstract: Bayesian non-linear SVM models are developed for Big Data platforms, where non-linear SVMs are not very popular because of the difficulty of computing and storing the Gram/kernel matrix. We employ an MCMC and quasi-MCMC based solution to extract low-dimensional random features, which approximate the kernel matrix very efficiently and are then used in the model for faster and more accurate computation. Our Bayesian SVM model is primarily aimed at classification problems (binary and multiclass support vector machines). Feature selection is integrated into the framework through Gaussian spike-and-slab priors, and we propose a computationally scalable Gibbs sampling algorithm whose computational complexity is linear in the number of covariates considered for selection. In addition, we consider Bayesian semi-supervised learning and propose a novel Bayesian approach to variable selection with a scalable Gibbs algorithm. Our proposed novel Gibbs sampler, called Skinny Gibbs, is much more scalable to high-dimensional problems, both in memory and in computational efficiency, and avoids the large matrix computations needed in standard Gibbs sampling algorithms; its computational complexity grows only linearly in the number of predictors. The efficiency of our method for supervised and semi-supervised SVM models is demonstrated through several simulation studies and data analyses.
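The abstract does not give implementation details of the random-feature step. As a rough, hedged illustration of the general idea it alludes to, the sketch below uses random Fourier features (a standard Monte Carlo approximation of a shift-invariant kernel) to build a low-dimensional feature map whose inner products approximate an RBF Gram matrix without ever forming the full n x n kernel. The function name, the RBF kernel choice, and all parameter values are assumptions for illustration only, not the authors' actual method.

```python
import numpy as np

def random_fourier_features(X, n_features=200, gamma=1.0, seed=None):
    """Map X (n x d) to Z (n x n_features) so that Z @ Z.T approximates
    the RBF Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2).
    Illustrative sketch only; names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the spectral density of the RBF kernel
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phase shifts
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: the low-rank product Z @ Z.T stands in for the kernel matrix,
# so downstream computations scale with n_features rather than n.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
Z = random_fourier_features(X, n_features=300, gamma=0.5, seed=1)
K_approx = Z @ Z.T  # approximate 500 x 500 Gram matrix
```

Working with Z in place of the exact Gram matrix is what makes the downstream Gibbs updates feasible on large n, since storage and matrix products then grow with the number of random features rather than the sample size.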