This paper examines the investment decisions of 373 large Brazilian firms from 1997 to 2004 in the presence of financial constraints, using panel data. A Bayesian econometric model employing ridge regression was used to address multicollinearity among the variables in the model. Prior distributions are assumed for the parameters, classifying the model as random- or fixed-effects. We used a Bayesian approach to estimate the parameters, considering normal and Student-t distributions for the errors, and assumed that the initial values of the lagged dependent variable are not fixed but generated by a random process. The recursive predictive density criterion was used for model comparison. Twenty models were tested, and the results indicate that multicollinearity does influence the values of the estimated parameters. Controlling for capital intensity, financial constraints are found to be more important for capital-intensive firms, probably due to their lower profitability indexes, higher fixed costs, and higher degree of property diversification.
The objective was to obtain estimates of additive and non-additive genetic effects for pre- and post-weaning traits of Hereford x Nellore animals by means of multiple linear regression analyses, with and without the use of the ridge regression technique. The traits evaluated were average daily gain from birth to weaning; conformation, precocity, and muscling at weaning; average daily gain from weaning to yearling; conformation, precocity, and muscling at yearling; and scrotal circumference adjusted for age and for age and weight. The results obtained without the technique showed markedly high variance inflation factors. To better interpret the estimated effects, the performance of five generations in the formation of the ½ Braford relative to the Hereford breed, starting from Nellore cows, was predicted. The F1 animals showed high performance, owing to the maximum benefit of direct heterosis and of the maternal additive effect. The full expression of direct epistasis significantly reduced the performance of the F2 animals. For the weaning traits, the F3 animals showed lower performance, owing to the maximum maternal epistatic effect...
In multivariate linear regression, it is often assumed that the response matrix is intrinsically of lower rank. This could be because of the correlation structure among the predictor variables or because the coefficient matrix itself is low-rank. To accommodate both, we propose a reduced rank ridge regression for multivariate linear regression. Specifically, we combine the ridge penalty with the reduced rank constraint on the coefficient matrix to come up with a computationally straightforward algorithm. Numerical studies indicate that the proposed method consistently outperforms relevant competitors. A novel extension of the proposed method to the reproducing kernel Hilbert space (RKHS) set-up is also developed.
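The two-step construction described in this abstract can be illustrated with a short sketch (an illustrative implementation, not the authors' code; the data, penalty `lam`, and `rank` below are made up): solve the ordinary ridge problem first, then project the fitted values onto their leading singular directions to enforce the rank constraint.

```python
import numpy as np

def reduced_rank_ridge(X, Y, lam, rank):
    """Ridge-penalised multivariate regression followed by a rank constraint.

    Solves the ordinary ridge problem, then projects the fitted values
    onto their top-`rank` right singular directions, which yields a
    coefficient matrix of at most that rank.
    """
    n, p = X.shape
    # Ridge solution: (X'X + lam I)^{-1} X'Y
    B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)
    # Leading right singular vectors of the fitted value matrix
    _, _, Vt = np.linalg.svd(X @ B_ridge, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]   # rank-`rank` projection matrix
    return B_ridge @ P            # reduced rank coefficient matrix

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
Y = X @ rng.standard_normal((8, 5)) + 0.1 * rng.standard_normal((50, 5))
B = reduced_rank_ridge(X, Y, lam=1.0, rank=2)
print(np.linalg.matrix_rank(B))   # at most 2
```

The projection step is what makes the algorithm computationally straightforward: the rank constraint reduces to one thin SVD of the n × q fitted matrix.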
Computational efficiency is important for learning algorithms operating in the “large p, small n” setting. In computational biology, the analysis of data sets containing tens of thousands of features (“large p”), but only a few hundred samples (“small n”), is nowadays routine, and regularized regression approaches such as ridge-regression, lasso, and elastic-net are popular choices. In this paper we propose a novel and highly efficient Bayesian inference method for fitting ridge-regression. Our method is fully analytical, and bypasses the need for expensive tuning parameter optimization, via cross-validation, by employing Bayesian model averaging over the grid of tuning parameters. Additional computational efficiency is achieved by adopting the singular value decomposition re-parametrization of the ridge-regression model, replacing computationally expensive inversions of large p × p matrices by efficient inversions of small and diagonal n × n matrices. We show in simulation studies and in the analysis of two large cancer cell line data panels that our algorithm achieves slightly better predictive performance than cross-validated ridge-regression while requiring only a fraction of the computation time. Furthermore, in comparisons based on the cell line data sets...
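The SVD re-parametrization mentioned above can be sketched as follows (a minimal illustration on made-up data, not the paper's Bayesian model-averaging code): with X = U diag(d) V', the ridge solution for every penalty on a grid follows from a single thin SVD, with no p × p inversion.

```python
import numpy as np

def ridge_path_svd(X, y, lambdas):
    """Ridge solutions for a grid of penalties via one thin SVD of X.

    With X = U diag(d) V', the ridge coefficients are
    beta(lam) = V diag(d / (d**2 + lam)) U'y, so after a single SVD
    each additional lambda costs only O(n) extra work.
    """
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    return [Vt.T @ ((d / (d**2 + lam)) * Uty) for lam in lambdas]

rng = np.random.default_rng(1)
n, p = 50, 500                       # "large p, small n"
X = rng.standard_normal((n, p))
y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)
betas = ridge_path_svd(X, y, lambdas=[0.1, 1.0, 10.0])

# Agrees with the direct (p x p) formula
lam = 1.0
direct = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
print(np.allclose(betas[1], direct))   # True
```

This is the core of the reported speed-up: tuning over a grid of penalties reuses one decomposition instead of repeating a p × p inversion per grid point.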
Ridge regression with heteroscedastic marker variances provides an alternative to Bayesian genome-wide prediction methods. Our objectives were to suggest new methods to determine marker-specific shrinkage factors for heteroscedastic ridge regression and to investigate their properties with respect to computational efficiency and accuracy of estimated effects. We analyzed published data sets of maize, wheat, and sugar beet as well as simulated data with the new methods. Ridge regression with shrinkage factors that were proportional to single-marker analysis of variance estimates of variance components (i.e., RRWA) was the fastest method. It required computation times of less than 1 sec for medium-sized data sets, which have dimensions that are common in plant breeding. A modification of the expectation-maximization algorithm that yields heteroscedastic marker variances (i.e., RMLV) resulted in the most accurate marker effect estimates. It outperformed the homoscedastic ridge regression approach for best linear unbiased prediction in particular for situations with high marker density and strong linkage disequilibrium along the chromosomes, a situation that occurs often in plant breeding populations. We conclude that the RRWA and RMLV approaches provide alternatives to the commonly used Bayesian methods...
To date, numerous genetic variants have been identified as associated with diverse phenotypic traits. However, identified associations generally explain only a small proportion of trait heritability and the predictive power of models incorporating only known-associated variants has been small. Multiple regression is a popular framework in which to consider the joint effect of many genetic variants simultaneously. Ordinary multiple regression is seldom appropriate in the context of genetic data, due to the high dimensionality of the data and the correlation structure among the predictors. There has been a resurgence of interest in the use of penalised regression techniques to circumvent these difficulties. In this paper, we focus on ridge regression, a penalised regression approach that has been shown to offer good performance in multivariate prediction problems. One challenge in the application of ridge regression is the choice of the ridge parameter that controls the amount of shrinkage of the regression coefficients. We present a method to determine the ridge parameter based on the data, with the aim of good performance in high-dimensional prediction problems. We establish a theoretical justification for our approach, and demonstrate its performance on simulated genetic data and on a real data example. Fitting a ridge regression model to hundreds of thousands to millions of genetic variants simultaneously presents computational challenges. We have developed an R package...
In recent years, there has been a considerable amount of research on the use of regularization methods for inference and prediction in quantitative genetics. Such research mostly focuses on selection of markers and shrinkage of their effects. In this review paper, the use of ridge regression for prediction in quantitative genetics using single-nucleotide polymorphism data is discussed. In particular, we consider (i) the theoretical foundations of ridge regression, (ii) its link to commonly used methods in animal breeding, (iii) the computational feasibility, and (iv) the scope for constructing prediction models with nonlinear effects (e.g., dominance and epistasis). Based on a simulation study we gauge the current and future potential of ridge regression for prediction of human traits using genome-wide SNP data. We conclude that, for outcomes with a relatively simple genetic architecture, given current sample sizes in most cohorts (i.e., N < 10,000) the predictive accuracy of ridge regression is slightly higher than the classical genome-wide association study approach of repeated simple regression (i.e., one regression per SNP). However, both capture only a small proportion of the heritability. Nevertheless, we find evidence that for large-scale initiatives...
The multiple linear regression model plays a key role in statistical inference, and it has extensive applications in business, environmental, physical and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods can be used; those discussed in this thesis are the ridge regression, Liu, two-parameter biased and LASSO estimators. Firstly, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimension. Secondly, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators by their mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
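Under an orthonormal design (X'X = I), the estimators compared in this thesis abstract have simple closed forms in terms of the OLS estimate, which a short sketch can make concrete (the shrinkage constants k, d, and t below are illustrative choices, not values from the thesis):

```python
import numpy as np

def ridge_orth(b, k):
    """Ridge under X'X = I: uniform shrinkage by 1/(1+k)."""
    return b / (1.0 + k)

def liu_orth(b, d):
    """Liu estimator under X'X = I: (X'X + I)^{-1}(X'y + d*b) = (1+d)/2 * b."""
    return (1.0 + d) / 2.0 * b

def lasso_orth(b, t):
    """LASSO under X'X = I: soft-thresholding of each coefficient at t."""
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

b_ols = np.array([3.0, -0.5, 0.1])
print(ridge_orth(b_ols, k=1.0))    # halves every coefficient
print(liu_orth(b_ols, d=0.5))      # scales every coefficient by 0.75
print(lasso_orth(b_ols, t=0.4))    # sets the smallest coefficient to zero
```

The contrast is visible immediately: ridge and Liu shrink all coefficients by a common factor, while LASSO translates them toward zero and zeroes out the small ones, which is why their risks differ across regions of the parameter space.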
Anomalies persist in the foundations of ridge regression as set forth in Hoerl and Kennard (1970) and subsequently. Conventional ridge estimators and their properties do not follow from constraining the lengths of solution vectors using Lagrange's method, as claimed. Estimators so constrained have singular distributions; the proposed solutions are not necessarily minimizing; and heretofore undiscovered bounds are exhibited for the ridge parameter. None of the considerable literature on estimation, prediction, cross-validation, choice of the ridge parameter, and related issues, collectively known as ridge regression, is consistent with constrained optimization, nor with the corresponding inequality constraints. The problem is traced to a misapplication of Lagrange's principle, failure to recognize the singularity of the distributions, and misplaced links between the constraints and the ridge parameter. Other principles, based on condition numbers, are seen to validate both conventional ridge regression and a surrogate ridge regression to be defined. Numerical studies illustrate that ridge analysis often exhibits some of the same pathologies it is intended to redress.
We establish optimal convergence rates for a decomposition-based scalable
approach to kernel ridge regression. The method is simple to describe: it
randomly partitions a dataset of size N into m subsets of equal size, computes
an independent kernel ridge regression estimator for each subset, then averages
the local solutions into a global predictor. This partitioning leads to a
substantial reduction in computation time versus the standard approach of
performing kernel ridge regression on all N samples. Our two main theorems
establish that despite the computational speed-up, statistical optimality is
retained: as long as m is not too large, the partition-based estimator achieves
the statistical minimax rate over all estimators using the set of N samples. As
concrete examples, our theory guarantees that the number of processors m may
grow nearly linearly for finite-rank kernels and Gaussian kernels and
polynomially in N for Sobolev spaces, which in turn allows for substantial
reductions in computational cost. We conclude with experiments on both
simulated data and a music-prediction task that complement our theoretical
results, exhibiting the computational and statistical benefits of our approach.
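The partition-and-average scheme described above is simple enough to sketch directly (a toy illustration with a Gaussian kernel and made-up data; the subset count m, penalty, and bandwidth are arbitrary choices, not tuned values from the paper):

```python
import numpy as np

def krr_fit(X, y, lam, gamma):
    """Kernel ridge regression with a Gaussian kernel; returns a predictor."""
    K = np.exp(-gamma * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    def predict(Xnew):
        Kx = np.exp(-gamma * np.sum((Xnew[:, None, :] - X[None, :, :]) ** 2, axis=-1))
        return Kx @ alpha
    return predict

def divide_and_conquer_krr(X, y, m, lam, gamma):
    """Randomly split the data into m disjoint subsets, fit KRR on each,
    and average the local predictors into the global one."""
    idx = np.array_split(np.random.default_rng(0).permutation(len(X)), m)
    local = [krr_fit(X[i], y[i], lam, gamma) for i in idx]
    return lambda Xnew: np.mean([f(Xnew) for f in local], axis=0)

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(400, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(400)

# Each of the m = 4 local fits inverts a 100 x 100 system instead of 400 x 400
f_hat = divide_and_conquer_krr(X, y, m=4, lam=1e-3, gamma=10.0)
Xtest = np.linspace(-1, 1, 50)[:, None]
mse = np.mean((f_hat(Xtest) - np.sin(3 * Xtest[:, 0])) ** 2)
print(mse)   # small: the averaged predictor tracks sin(3x) closely
```

The computational saving is the point: each local solve costs O((N/m)^3) instead of O(N^3), while the theorems cited above guarantee that the averaged predictor loses nothing statistically for moderate m.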
Unlike the ordinary least-squares (OLS) estimator for the linear model, a
ridge regression linear model provides coefficient estimates via shrinkage,
usually with improved mean-square and prediction error. This is true especially
when the observed design matrix is ill-conditioned or singular, either as a
result of highly-correlated covariates or the number of covariates exceeding
the sample size. This paper introduces novel and fast marginal maximum
likelihood (MML) algorithms for estimating the shrinkage parameter(s) for the
Bayesian ridge and power ridge regression models, and an automatic plug-in MML
estimator for the Bayesian generalized ridge regression model. With the aid of
the singular value decomposition of the observed covariate design matrix, these
MML estimation methods are quite fast even for data sets where either the
sample size (n) or the number of covariates (p) is very large, and even when
p>n. On several real data sets varying widely in terms of n and p, the
computation times of the MML estimation methods for the three ridge models,
respectively, are compared with the times of other methods for estimating the
shrinkage parameter in ridge, LASSO and Elastic Net (EN) models, with the other
methods based on minimizing prediction error according to cross-validation or
information criteria. Also...
Cross-correlation techniques provide a promising avenue for calibrating
photometric redshifts and determining redshift distributions using spectroscopy
which is systematically incomplete (e.g., current deep spectroscopic surveys
fail to obtain secure redshifts for 30-50% or more of the galaxies targeted).
In this paper we improve on the redshift distribution reconstruction methods
presented in Matthews & Newman (2010) by incorporating full covariance
information into our correlation function fits. Correlation function
measurements are strongly covariant between angular or spatial bins, and
accounting for this in fitting can yield substantial reduction in errors.
However, frequently the covariance matrices used in these calculations are
determined from a relatively small set (dozens rather than hundreds) of
subsamples or mock catalogs, resulting in noisy covariance matrices whose
inversion is ill-conditioned and numerically unstable. We present here a method of conditioning the covariance matrix, known as ridge regression, which results in a better-behaved inversion than other techniques common in large-scale structure studies. We demonstrate that ridge regression significantly improves
the determination of correlation function parameters. We then apply these
improved techniques to the problem of reconstructing redshift distributions. By
incorporating full covariance information...
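The conditioning idea can be sketched in a few lines (a toy simulation, not the paper's pipeline; the bin count, subsample count, and ridge strength are made up): adding a small multiple of the identity to a noisily estimated covariance matrix sharply reduces its condition number, and with it the numerical instability of the inversion used in the correlation function fits.

```python
import numpy as np

rng = np.random.default_rng(3)
nbins, nsub = 20, 25    # many bins relative to the number of mock subsamples
A = rng.standard_normal((nbins, nbins))
true_cov = A @ A.T / nbins + np.eye(nbins)

# Covariance estimated from only a few subsamples: full rank but noisy
draws = rng.multivariate_normal(np.zeros(nbins), true_cov, size=nsub)
noisy_cov = np.cov(draws, rowvar=False)

def ridge_condition(cov, lam):
    """Condition a noisy covariance estimate before inversion by adding
    a ridge term lam * mean(diag(cov)) * I to its diagonal."""
    return cov + lam * np.mean(np.diag(cov)) * np.eye(cov.shape[0])

print(np.linalg.cond(noisy_cov))                        # large
print(np.linalg.cond(ridge_condition(noisy_cov, 0.1)))  # strictly smaller
```

Because adding c·I shifts every eigenvalue by c, the condition number (largest over smallest eigenvalue) always decreases, at the price of a small, controllable bias in the fitted parameters.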
We provide a unified analysis of the predictive risk of ridge regression and
regularized discriminant analysis in a dense random effects model. We work in a
high-dimensional asymptotic regime where $p, n \to \infty$ and $p/n \to \gamma
\in (0, \, \infty)$, and allow for arbitrary covariance among the features. For
both methods, we provide an explicit and efficiently computable expression for
the limiting predictive risk, which depends only on the spectrum of the
feature-covariance matrix, the signal strength, and the aspect ratio $\gamma$.
Especially in the case of regularized discriminant analysis, we find that
predictive accuracy has a nuanced dependence on the eigenvalue distribution of
the covariance matrix, suggesting that analyses based on the operator norm of
the covariance matrix may not be sharp. Our results also uncover several
qualitative insights about both methods: for example, with ridge regression,
there is an exact inverse relation between the limiting predictive risk and the
limiting estimation risk given a fixed signal strength. Our analysis builds on
recent advances in random matrix theory.
We consider the application of a popular penalised regression method, Ridge
Regression, to data with very high dimensions and many more covariates than
observations. Our motivation is the problem of out-of-sample prediction and the
setting is high-density genotype data from a genome-wide association or
resequencing study. Ridge regression has previously been shown to offer
improved performance for prediction when compared with other penalised
regression methods. One problem with ridge regression is the choice of an
appropriate parameter for controlling the amount of shrinkage of the
coefficient estimates. Here we propose a method for choosing the ridge parameter based on controlling the variance of the predicted observations. Using simulated data, we demonstrate that our method outperforms subset
selection based on univariate tests of association and another penalised
regression method, HyperLasso regression, in terms of improved prediction
error. We extend our approach to regression problems when the outcomes are
binary (representing cases and controls, as is typically the setting for
genome-wide association studies) and demonstrate the method on a real data
example consisting of case-control and genotype data on Bipolar Disorder...
We introduce single-set spectral sparsification as a deterministic sampling
based feature selection technique for regularized least squares classification,
which is the classification analogue to ridge regression. The method is
unsupervised and gives worst-case guarantees of the generalization power of the
classification function after feature selection with respect to the
classification function obtained using all features. We also introduce
leverage-score sampling as an unsupervised randomized feature selection method
for ridge regression. We provide risk bounds for both single-set spectral
sparsification and leverage-score sampling on ridge regression in the fixed
design setting and show that the risk in the sampled space is comparable to the
risk in the full-feature space. We perform experiments on synthetic and
real-world datasets, namely a subset of TechTC-300 datasets, to support our
theory. Experimental results indicate that the proposed methods perform better
than the existing feature selection methods.
The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty (i.e., a function of the regression coefficients). The ridge penalty is the sum of the squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g., its moments, mean squared error, equivalence to constrained estimation, and relation to Bayesian regression. Finally, its behaviour and use are illustrated in simulation and on omics data.
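The review's stated link between ridge regression and Bayesian regression can be verified numerically in a few lines (a minimal sketch with simulated data; the dimensions and penalty are illustrative): with Gaussian noise of variance sigma², the ridge estimate with penalty lam equals the posterior mean under an i.i.d. Gaussian prior with variance sigma²/lam on the coefficients.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 30, 60                       # high-dimensional: p > n
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + 0.1 * rng.standard_normal(n)

lam, sigma2 = 2.0, 0.1 ** 2

# Penalised least squares: argmin ||y - Xb||^2 + lam ||b||^2
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Bayesian view: posterior mean with prior b ~ N(0, (sigma2/lam) I),
# computed via the n x n (not p x p) system
prior_var = sigma2 / lam
b_bayes = prior_var * X.T @ np.linalg.solve(
    X @ (prior_var * X.T) + sigma2 * np.eye(n), y)

print(np.allclose(b_ridge, b_bayes))   # True: the two coincide
```

The second formula also illustrates why the Bayesian route is attractive when p > n: it only ever inverts an n × n matrix.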
Regression-based adjusted plus-minus statistics were developed in basketball
and have recently come to hockey. The purpose of these statistics is to provide
an estimate of each player's contribution to his team, independent of the
strength of his teammates, the strength of his opponents, and other variables
that are out of his control. One of the main downsides of the ordinary least
squares regression models is that the estimates have large error bounds. Since
certain pairs of teammates play together frequently, collinearity is present in
the data and is one reason for the large errors. In hockey, the relative lack
of scoring compared to basketball is another reason. To deal with these issues,
we use ridge regression, a method that is commonly used in lieu of ordinary
least squares regression when collinearity is present in the data. We also
create models that use not only goals, but also shots, Fenwick rating (shots
plus missed shots), and Corsi rating (shots, missed shots, and blocked shots).
One benefit of using these statistics is that there are roughly ten times as
many shots as goals, so there is much more data when using these statistics and
the resulting estimates have smaller error bounds. The results of our ridge
regression models are estimates of the offensive and defensive contributions of
forwards and defensemen during even strength...
Regularization aims to improve prediction performance of a given statistical
modeling approach by moving to a second approach which achieves worse training
error but is expected to have fewer degrees of freedom, i.e., better agreement
between training and prediction error. We show here, however, that this
expected behavior does not hold in general. In fact, counter examples are given
that show regularization can increase the degrees of freedom in simple
situations, including lasso and ridge regression, which are the most common
regularization approaches in use. In such situations, the regularization
increases both training error and degrees of freedom, and is thus inherently
without merit. On the other hand, two important regularization scenarios are
described where the expected reduction in degrees of freedom is indeed
guaranteed: (a) all symmetric linear smoothers, and (b) linear regression
versus convex constrained linear regression (as in the constrained variant of
ridge regression and lasso).
In this paper we review some existing estimators of the ridge parameter and propose some new ones. All in all, 19 different estimators have been studied. The investigation was carried out using Monte Carlo simulations. A large number of different models were investigated, in which the variance of the random error, the number of variables included in the model, the correlations among the explanatory variables, the sample size and the unknown coefficient vector were varied. For each model we performed 2000 replications and present the results in both figures and tables. Based on the simulation study, we found that increasing the number of correlated variables, increasing the variance of the random error and increasing the correlation between the independent variables all have a negative effect on the mean squared error. When the sample size increases, the mean squared error decreases even when the correlation between the independent variables and the variance of the random error are large. In all situations, the proposed estimators have smaller mean squared error than the ordinary least squares and other existing estimators.
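As a concrete example of the kind of estimator being compared, the classic Hoerl-Kennard choice of the ridge parameter can be sketched in a small Monte Carlo study (an illustrative setup with made-up dimensions, correlation and replication count, not one of the paper's simulation designs):

```python
import numpy as np

def hoerl_kennard_k(X, y):
    """Classic Hoerl-Kennard ridge-parameter estimate
    k = p * sigma_hat^2 / (b_ols' b_ols), built from the OLS fit."""
    n, p = X.shape
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b_ols
    sigma2 = resid @ resid / (n - p)
    return p * sigma2 / (b_ols @ b_ols)

rng = np.random.default_rng(5)
n, p, rho = 60, 4, 0.95             # strongly correlated regressors
cov = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)
beta = np.ones(p)

mse_ols = mse_ridge = 0.0
for _ in range(200):                # small Monte Carlo study
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = X @ beta + rng.standard_normal(n)
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    k = hoerl_kennard_k(X, y)
    b_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
    mse_ols += np.sum((b_ols - beta) ** 2) / 200
    mse_ridge += np.sum((b_ridge - beta) ** 2) / 200

print(mse_ridge < mse_ols)   # ridge typically wins under strong collinearity
```

Replacing `hoerl_kennard_k` with an alternative formula and varying rho, n, p and the error variance is exactly the kind of comparison the simulation study above performs at scale.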