# Scalable Bayesian Regression in High Dimensions With Multiple Data Sources

```bibtex
@article{Perrakis2019ScalableBR,
  title   = {Scalable Bayesian Regression in High Dimensions With Multiple Data Sources},
  author  = {Konstantinos Perrakis and Sach Mukherjee and the Alzheimer’s Disease Neuroimaging Initiative},
  journal = {Journal of Computational and Graphical Statistics},
  year    = {2019},
  volume  = {29},
  pages   = {28--39}
}
```

#### Abstract

Applications of high-dimensional regression often involve multiple sources or types of covariates. We propose methodology for this setting, emphasizing the “wide data” regime with large total dimensionality p and sample size n. We focus on a flexible ridge-type prior with shrinkage levels that are specific to each data type or source and that are set automatically by empirical Bayes. All estimation, including setting of shrinkage levels, is formulated mainly in terms of inner product…
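The core idea of source-specific shrinkage can be illustrated with a minimal numpy sketch. This is not the paper's empirical Bayes procedure: the shrinkage levels below are fixed by hand for illustration, and the two data sources and their dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical data sources with different signal strengths
n, p1, p2 = 50, 200, 100            # "wide" regime: p1 + p2 >> n
X1 = rng.standard_normal((n, p1))   # e.g. imaging features
X2 = rng.standard_normal((n, p2))   # e.g. genetic features
beta1 = np.zeros(p1)
beta1[:5] = 2.0                     # signal concentrated in source 1
beta2 = np.zeros(p2)                # source 2 is pure noise here
y = X1 @ beta1 + X2 @ beta2 + rng.standard_normal(n)

def multi_source_ridge(Xs, y, lambdas):
    """Ridge estimate with a separate shrinkage level per source:
    beta_hat = (X'X + D)^{-1} X'y, with D = blockdiag(lambda_g * I_{p_g})."""
    X = np.hstack(Xs)
    D = np.concatenate([lam * np.ones(X_g.shape[1])
                        for X_g, lam in zip(Xs, lambdas)])
    return np.linalg.solve(X.T @ X + np.diag(D), X.T @ y)

# Shrinking the noise source more heavily than the signal source
beta_hat = multi_source_ridge([X1, X2], y, lambdas=[1.0, 100.0])
```

In the paper the per-source lambdas are set automatically by empirical Bayes rather than chosen manually, and the computation is organized around inner products so that it scales in the wide-data regime; the block-diagonal penalty above only shows the form of the estimator.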

#### 3 Citations

High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking

- Computer Science, Mathematics
- Stat. Comput.
- 2020

A large-scale comparison of penalized regression methods is presented, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods.

Fast Cross-validation for Multi-penalty High-dimensional Ridge Regression

- Computer Science
- 2021

High-dimensional prediction with multiple data types needs to account for potentially strong differences in predictive signal. Ridge regression is a simple model for high-dimensional data that has ...
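The "fast cross-validation" in these follow-up papers builds on a classical shortcut: for ridge regression, leave-one-out CV can be computed from a single fit via the hat matrix, with no refitting. The sketch below shows the single-penalty version of that identity and checks it against literal refits; the multi-penalty generalizations in the cited work are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 10
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

def loocv_ridge(X, y, lam):
    """Leave-one-out CV error from one fit, using the exact identity
    e_i^loo = (y_i - yhat_i) / (1 - h_ii), where H = X (X'X + lam I)^{-1} X'."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def loocv_ridge_naive(X, y, lam):
    """Literal n refits, only for verifying the shortcut."""
    errs = []
    for i in range(X.shape[0]):
        mask = np.arange(X.shape[0]) != i
        Xi, yi = X[mask], y[mask]
        b = np.linalg.solve(Xi.T @ Xi + lam * np.eye(X.shape[1]), Xi.T @ yi)
        errs.append((y[i] - X[i] @ b) ** 2)
    return np.mean(errs)
```

The shortcut replaces n model fits with one, which is what makes repeated or multi-penalty CV affordable in high dimensions.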

Fast cross-validation for multi-penalty ridge regression

- Mathematics, Computer Science
- 2020

A very flexible framework is developed that includes prediction of several types of response, allows for unpenalized covariates, can optimize several performance criteria, and implements repeated CV.

#### References

Showing 1-10 of 54 references.

Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions

- Computer Science, Medicine
- Journal of the American Statistical Association
- 2012

This work proposes a conjugate prior only on the full model parameters and uses sparse solutions within posterior credible regions to perform selection; these sparse solutions can be computed via existing algorithms.

An Information Matrix Prior for Bayesian Analysis in Generalized Linear Models with High Dimensional Data.

- Mathematics, Medicine
- Statistica Sinica
- 2009

A novel specification for a general class of prior distributions, called Information Matrix (IM) priors, is developed for high-dimensional generalized linear models, based on a broad generalization of Zellner's g-prior for Gaussian linear models.

A Sparse-Group Lasso

- Mathematics
- 2013

For high-dimensional supervised learning problems, using problem-specific assumptions can often lead to greater accuracy. For problems with grouped covariates, which are believed to have sparse…
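For context, the sparse-group lasso combines a group-level and an element-level penalty; a standard form of the objective (with G groups of sizes p_g and a mixing parameter alpha) is:

```latex
\min_{\beta} \; \frac{1}{2n}\,\|y - X\beta\|_2^2
  + (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\|\beta^{(g)}\|_2
  + \alpha\,\lambda\,\|\beta\|_1
```

The group-norm term zeroes out whole groups, while the L1 term allows sparsity within the groups that survive; alpha = 0 recovers the group lasso and alpha = 1 the ordinary lasso.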

Penalized regression, standard errors, and Bayesian lassos

- Mathematics
- 2010

Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso of Tibshirani (1996), have received a great deal of attention in recent…

Inference with normal-gamma prior distributions in regression problems

- Mathematics
- 2010

This paper considers the effects of placing an absolutely continuous prior distribution on the regression coefficients of a linear model. We show that the posterior expectation is a matrix-shrunken…

Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties

- Mathematics
- 2001

Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally…

Model uncertainty and variable selection in Bayesian lasso regression

- Mathematics, Computer Science
- Stat. Comput.
- 2010

This paper describes how the marginal likelihood can be accurately computed when the number of predictors in the model is not too large, allowing for model space enumeration when the total number of possible predictors is modest.

GENERALIZED DOUBLE PARETO SHRINKAGE.

- Computer Science, Mathematics
- Statistica Sinica
- 2013

As sparse estimation plays an important role in many problems, the properties of the maximum a posteriori estimator are investigated, connections with some well-established regularization procedures are revealed, and some asymptotic results are shown.

On Bayesian lasso variable selection and the specification of the shrinkage parameter

- Mathematics, Computer Science
- Stat. Comput.
- 2013

A Bayesian implementation of lasso regression is proposed that accomplishes both shrinkage and variable selection through Bayes factors evaluating the inclusion of each covariate in the model formulation.

Sparsity and smoothness via the fused lasso

- Mathematics
- 2005

The lasso penalizes a least squares regression by the sum of the absolute values (L1-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients…
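For reference, the fused lasso adds a second L1 penalty on successive differences of ordered coefficients, encouraging both sparsity and piecewise-constant smoothness:

```latex
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2}\,\|y - X\beta\|_2^2
  + \lambda_1 \sum_{j=1}^{p} |\beta_j|
  + \lambda_2 \sum_{j=2}^{p} |\beta_j - \beta_{j-1}|
```

Setting lambda_2 = 0 recovers the ordinary lasso; the difference term is what makes the method suited to covariates with a natural ordering.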