## Cross-validation for the selection of spectral variables using the successive projections algorithm

Source: Sociedade Brasileira de Química
Publisher: Sociedade Brasileira de Química

Type: Journal article
Format: text/html

Published on 01/01/2007
Portuguese

Search relevance

66.1%

#multiple linear regression#variable selection#successive projections algorithm#cross-validation#near-infrared spectrometry

This work compares the use of a separate validation set and leave-one-out cross-validation to guide the selection of variables in the Successive Projections Algorithm (SPA) for multivariate calibration. Two case studies involving diesel and corn analysis by NIR spectrometry are presented. A graphical interface for SPA is available at www.ele.ita.br/~kawakami/spa/
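The two ways of scoring a candidate model that this abstract contrasts can be illustrated with a minimal sketch. It assumes a one-variable least-squares fit rather than the multivariate SPA models of the paper, and the function names and data are hypothetical, chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pairs):
    """Root-mean-square error over (observed, predicted) pairs."""
    return math.sqrt(sum((y - p) ** 2 for y, p in pairs) / len(pairs))


def holdout_rmse(xs, ys, n_val):
    """Fit on all but the last n_val samples, score on the last n_val."""
    a, b = fit_line(xs[:-n_val], ys[:-n_val])
    return rmse([(y, a + b * x) for x, y in zip(xs[-n_val:], ys[-n_val:])])


def loo_rmse(xs, ys):
    """Leave-one-out CV: refit without sample i, then predict sample i."""
    pairs = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        pairs.append((ys[i], a + b * xs[i]))
    return rmse(pairs)


# Synthetic illustration data (not taken from the paper).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9, 7.2]
print("hold-out RMSE:", holdout_rmse(xs, ys, n_val=2))
print("leave-one-out RMSE:", loo_rmse(xs, ys))
```

Leave-one-out uses every sample for both fitting and scoring, which matters when the calibration set is too small to spare a separate validation set.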

## Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data

Source: Oxford University Press
Publisher: Oxford University Press

Type: Journal article

Portuguese

Search relevance

46.22%

Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell’s concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone.

## Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

Source: PubMed
Publisher: PubMed

Type: Journal article

Published on 01/11/2011
Portuguese

Search relevance

46.19%

When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.
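The variability described above can be probed by repeating m-fold cross-validation over different random partitions and tallying the outcome of each run. The sketch below is only a scaffold under stated assumptions: `count_selected` is a hypothetical user-supplied callable (it is not part of the paper or of any library) that tunes the penalty on the given folds and returns the number of variables in the chosen model.

```python
import random
from collections import Counter


def m_fold_indices(n, m, seed):
    """Shuffle 0..n-1 with the given seed and deal the indices into m folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::m] for k in range(m)]


def selection_count_distribution(n, m, repeats, count_selected):
    """Tally the number of selected variables over repeated m-fold CV runs.

    count_selected is a hypothetical callable: given a list of folds, it
    tunes the penalty by cross-validation on those folds and returns the
    number of variables in the resulting model.
    """
    return Counter(
        count_selected(m_fold_indices(n, m, seed)) for seed in range(repeats)
    )


# Each seed yields a different but valid partition of the n samples.
folds = m_fold_indices(n=23, m=10, seed=0)
print([len(f) for f in folds])
```

A wide spread in the resulting tally would be a sign that a single run of m-fold cross-validation is not a stable basis for choosing the model.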

## Cross-Validation for Nonlinear Mixed Effects Models

Source: PubMed
Publisher: PubMed

Type: Journal article

Portuguese

Search relevance

46.23%

Cross-validation is frequently used for model selection in a variety of applications. However, it is difficult to apply cross-validation to mixed effects models (including nonlinear mixed effects models or NLME models) due to the fact that cross-validation requires “out-of-sample” predictions of the outcome variable, which cannot be easily calculated when random effects are present. We describe two novel variants of cross-validation that can be applied to nonlinear mixed effects models. One variant, where out-of-sample predictions are based on post hoc estimates of the random effects, can be used to select the overall structural model. Another variant, where cross-validation seeks to minimize the estimated random effects rather than the estimated residuals, can be used to select covariates to include in the model. We show that these methods produce accurate results in a variety of simulated data sets and apply them to two publicly available population pharmacokinetic data sets.

## Cross-validation in cryo-EM–based structural modeling

Source: National Academy of Sciences
Publisher: National Academy of Sciences

Type: Journal article

Portuguese

Search relevance

46.23%

Single-particle cryo-EM is a powerful approach to determine the structure of large macromolecules and assemblies thereof, in many cases at subnanometer resolution. It has become popular to refine or flexibly fit atomic models into density maps derived from cryo-EM experiments. These density maps are typically significantly lower in resolution than electron density maps obtained from X-ray diffraction experiments, such that the number of parameters that need to be determined is much larger than the number of experimental observables. Overfitting and misinterpretation of the density thus become a serious problem. For diffraction data, a cross-validation approach was introduced almost 20 years ago; however, no such approach has been described yet for structure refinement against cryo-EM density maps, although the overfitting problem is, because of the lower resolution, significantly larger. We present a cross-validation approach for real-space refinement against cryo-EM density maps in analogy to cross-validation typically used in crystallography. Our approach is able to detect overfitting and allows for optimizing the choice of restraints used in the refinement. The approach is shown on three protein structures with simulated data and experimental data of the rotavirus double-layer particle. Because cross-validation requires splitting the dataset into at least two independent sets...

## Cross-validation in association mapping and its relevance for the estimation of QTL parameters of complex traits

Source: Nature Publishing Group
Publisher: Nature Publishing Group

Type: Journal article

Portuguese

Search relevance

46.21%

Association mapping has become a widely applied genomic approach to identify quantitative
trait loci (QTL) and dissect the genetic architecture of complex traits. However,
approaches to assess the quality of the obtained QTL results are lacking. We therefore
evaluated the potential of cross-validation in association mapping based on a large sugar
beet data set. Our results show that the proportion of the population that should be used
as estimation and validation sets, respectively, depends on the size of the mapping
population. Generally, a fivefold cross-validation, that is, 20% of the lines as
independent validation set, appears appropriate for commonly used population sizes. The
predictive power for the proportion of genotypic variance explained by QTL was
overestimated by on average 38% indicating a strong bias in the estimated QTL
effects. The cross-validated predictive power ranged between 4 and 50%, which are
more realistic estimates of this parameter for complex traits. In addition, QTL frequency
distributions can be used to assess the precision of QTL position estimates and the
robustness of the detected QTL. In summary, cross-validation can be a valuable tool to
assess the quality of QTL parameters in association mapping.

## Prediction of Maize Single Cross Hybrids Using the Total Effects of Associated Markers Approach Assessed by Cross-Validation and Regional Trials

Source: Hindawi Publishing Corporation
Publisher: Hindawi Publishing Corporation

Type: Journal article

Portuguese

Search relevance

46.21%

The present study aimed to predict the performance of maize hybrids and assess whether the total effects of associated markers (TEAM) method can correctly predict hybrids using cross-validation and regional trials. The training was performed in 7 locations of Southern Brazil during the 2010/11 harvest. The regional assays were conducted in 6 different South Brazilian locations during the 2011/12 harvest. In the training trial, 51 lines from different backgrounds were used to create 58 single cross hybrids. Seventy-nine microsatellite markers were used to genotype these 51 lines. In the cross-validation method the predictive accuracy ranged from 0.10 to 0.96, depending on the sample size. Furthermore, the accuracy was 0.30 when the values of hybrids that were not used in the training population (119) were predicted for the regional assays. Regarding selective loss, the TEAM method correctly predicted 50% of the hybrids selected in the regional assays. There was also loss in only 33% of cases; that is, only 33% of the materials predicted to be good in training trial were considered to be bad in regional assays. Our results show that the predictive validation of different crop conditions is possible, and the cross-validation results strikingly represented the field performance.

## Local Cross-validation for Spectrum Bandwidth Choice

Source: Blackwell
Publisher: Blackwell

Type: Journal article
Format: application/pdf

Published in 05/2000
Portuguese

Search relevance

66.05%

#Bandwidth selection#Nonparametric spectral estimation#Cross-validation#Time series#Periodogram#Economics

We investigate an automatic method of determining a local bandwidth for non-parametric kernel spectral density estimates at a single frequency. This procedure is a modification of a cross-validation technique for global bandwidth choices, avoiding the computation of any pilot estimate based on initial bandwidths or on approximate parametric models. Only local conditions on the spectral density around the frequency of interest are assumed. We illustrate with a Monte Carlo study the performance in finite samples of the bandwidth estimates proposed.

## Local cross validation for spectrum bandwidth choice

Source: Universidade Carlos III de Madrid
Publisher: Universidade Carlos III de Madrid

Type: Working paper
Format: application/pdf

Published in 02/1998
Portuguese

Search relevance

66.05%

#Bandwidth selection#Nonparametric spectral estimation#Cross-validation#Time series#Periodogram#Statistics

We investigate an automatic method of determining a local bandwidth for nonparametric kernel spectral density estimates at a single frequency. This procedure is a modification of a cross-validation technique for global bandwidth choices, avoiding the computation of any pilot estimate based on initial bandwidths or on approximate parametric models. Only local conditions on the spectral density around the frequency of interest are assumed. We illustrate with a Monte Carlo study the performance in finite samples of the bandwidth estimates proposed.

## Concentration inequalities of the cross-validation estimator for Empirical Risk Minimiser

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 30/10/2010
Portuguese

Search relevance

46.34%

In this article, we derive concentration inequalities for the
cross-validation estimate of the generalization error for empirical risk
minimizers. In the general setting, we prove sanity-check bounds in the spirit
of \cite{KR99}: "bounds showing that the worst-case error of this estimate is
not much worse than that of the training error estimate". General loss
functions and classes of predictors with finite VC-dimension are considered.
We closely follow the formalism introduced by \cite{DUD03} to cover a large
variety of cross-validation procedures, including leave-one-out
cross-validation, $k$-fold cross-validation, hold-out cross-validation (or
split sample), and leave-$\upsilon$-out cross-validation.
In particular, we focus on proving the consistency of the various
cross-validation procedures. We point out the interest of each cross-validation
procedure in terms of rate of convergence. An estimation curve with transition
phases depending on the cross-validation procedure and not only on the
percentage of observations in the test sample gives a simple rule on how to
choose the cross-validation. An interesting consequence is that the size of the
test sample is not required to grow to infinity for the consistency of the
cross-validation procedure. Comment: 24 pages...

## Choice of V for V-Fold Cross-Validation in Least-Squares Density Estimation

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Portuguese

Search relevance

46.23%

This paper studies V-fold cross-validation for model selection in
least-squares density estimation. The goal is to provide theoretical grounds
for choosing V in order to minimize the least-squares loss of the selected
estimator. We first prove a non-asymptotic oracle inequality for V-fold
cross-validation and its bias-corrected version (V-fold penalization). In
particular, this result implies that V-fold penalization is asymptotically
optimal in the nonparametric case. Then, we compute the variance of V-fold
cross-validation and related criteria, as well as the variance of key
quantities for model selection performance. We show that these variances depend
on V like 1+4/(V-1), at least in some particular cases, suggesting that the
performance improves markedly from V=2 to V=5 or 10, and is then almost constant.
Overall, this can explain the common advice to take V=5 (at least in our
setting and when computational power is limited), as supported by some
simulation experiments. An oracle inequality and exact formulas for the
variance are also proved for Monte-Carlo cross-validation, also known as
repeated cross-validation, where the parameter V is replaced by the number B of
random splits of the data.
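To make the stated dependence on V concrete, the short sketch below simply evaluates the factor 1+4/(V-1) quoted in the abstract for a few values of V. The function name is hypothetical, and the factor is taken from the abstract as is, which says it holds "at least in some particular cases".

```python
def variance_factor(V):
    """Evaluate the factor 1 + 4/(V - 1) quoted in the abstract (V >= 2)."""
    if V < 2:
        raise ValueError("V-fold cross-validation needs V >= 2")
    return 1 + 4 / (V - 1)


for V in (2, 5, 10, 20):
    print(V, round(variance_factor(V), 3))
# 2 5.0
# 5 2.0
# 10 1.444
# 20 1.211
```

The factor falls from 5 to 2 between V=2 and V=5 but only from 2 to about 1.44 between V=5 and V=10, and it tends to 1 as V grows, which matches the abstract's remark that the gain flattens out.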

## Cross-Validation for Nonlinear Mixed Effects Models

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 09/04/2013
Portuguese

Search relevance

46.23%

Cross-validation is frequently used for model selection in a variety of
applications. However, it is difficult to apply cross-validation to mixed
effects models (including nonlinear mixed effects models or NLME models) due to
the fact that cross-validation requires "out-of-sample" predictions of the
outcome variable, which cannot be easily calculated when random effects are
present. We describe two novel variants of cross-validation that can be applied
to nonlinear mixed effects models. One variant, where out-of-sample predictions
are based on post hoc estimates of the random effects, can be used to select
the overall structural model. Another variant, where cross-validation seeks to
minimize the estimated random effects rather than the estimated residuals, can
be used to select covariates to include in the model. We show that these
methods produce accurate results in a variety of simulated data sets and apply
them to two publicly available population pharmacokinetic data sets.
Comment: 38 pages, 15 figures. To be published in the Journal of
Pharmacokinetics and Pharmacodynamics.

## Concentration inequalities of the cross-validation estimate for stable predictors

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 23/11/2010
Portuguese

Search relevance

46.32%

In this article, we derive concentration inequalities for the
cross-validation estimate of the generalization error for stable predictors in
the context of risk assessment. The notion of stability was first
introduced by \cite{DEWA79} and extended by \cite{KEA95}, \cite{BE01} and
\cite{KUNIY02} to characterize classes of predictors with infinite VC dimension.
In particular, this covers $k$-nearest neighbors rules, Bayesian algorithm
(\cite{KEA95}), boosting, etc. General loss functions and classes of predictors are
considered. We use the formalism introduced by \cite{DUD03} to cover a large
variety of cross-validation procedures including leave-one-out
cross-validation, $k$-fold cross-validation, hold-out cross-validation (or
split sample), and the leave-$\upsilon$-out cross-validation.
In particular, we give a simple rule on how to choose the cross-validation,
depending on the stability of the class of predictors. In the special case of
uniform stability, an interesting consequence is that the number of elements in
the test set is not required to grow to infinity for the consistency of the
cross-validation procedure. In this special case, the particular interest of
leave-one-out cross-validation is emphasized.

## Estimating Subagging by cross-validation

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Published on 23/11/2010
Portuguese

Search relevance

46.26%

In this article, we derive concentration inequalities for the
cross-validation estimate of the generalization error for subagged estimators,
for both classification and regression. General loss functions and classes of
predictors with both finite and infinite VC-dimension are considered. We
slightly generalize the formalism introduced by \cite{DUD03} to cover a large
variety of cross-validation procedures, including leave-one-out
cross-validation, $k$-fold cross-validation, hold-out cross-validation (or
split sample), and leave-$\upsilon$-out cross-validation.
An interesting consequence is that the probability upper bound is
bounded by the minimum of a Hoeffding-type bound and a Vapnik-type bound, and
thus is smaller than 1 even for a small learning set. Finally, we give a simple
rule on how to subag the predictor.

## Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Portuguese

Search relevance

46.28%

In regular statistical models, the leave-one-out cross-validation is
asymptotically equivalent to the Akaike information criterion. However, since
many learning machines are singular statistical models, the asymptotic behavior
of the cross-validation remains unknown. In previous studies, we established
the singular learning theory and proposed a widely applicable information
criterion, the expectation value of which is asymptotically equal to the
average Bayes generalization loss. In the present paper, we theoretically
compare the Bayes cross-validation loss and the widely applicable information
criterion and prove two theorems. First, the Bayes cross-validation loss is
asymptotically equivalent to the widely applicable information criterion as a
random variable. Therefore, model selection and hyperparameter optimization
using these two values are asymptotically equivalent. Second, the sum of the
Bayes generalization error and the Bayes cross-validation error is
asymptotically equal to $2\lambda/n$, where $\lambda$ is the real log canonical
threshold and $n$ is the number of training samples. Therefore the relation
between the cross-validation error and the generalization error is determined
by the algebraic geometrical structure of a learning machine. We also clarify
that the deviance information criteria are different from the Bayes
cross-validation and the widely applicable information criterion.

## A computationally fast alternative to cross-validation in penalized Gaussian graphical models

Source: Cornell University
Publisher: Cornell University

Type: Journal article

Portuguese

Search relevance

46.23%

We study the problem of selecting the regularization parameter in penalized
Gaussian graphical models. When the goal is to obtain a model with good
predictive power, cross-validation is the gold standard. We present a new
estimator of the Kullback-Leibler loss in Gaussian graphical models which
provides a computationally fast alternative to cross-validation. The estimator
is obtained by approximating leave-one-out cross-validation. Our approach is
demonstrated on simulated data sets for various types of graphs. The proposed
formula exhibits superior performance, especially in the typical small sample
size scenario, compared to other available alternatives to cross-validation,
such as Akaike's information criterion and generalized approximate
cross-validation. We also show that the estimator can be used to improve the
performance of the BIC when the sample size is small. Comment: 16 pages, 5 figures

## Cross-validation for choosing resolution level for nonlinear wavelet curve estimators

Source: Chapman & Hall
Publisher: Chapman & Hall

Type: Journal article

Portuguese

Search relevance

66.23%

#Curve estimation#Density estimation#Generalized kernel methods#Kernel estimator#Least-squares cross-validation#Linear wavelet estimator#Nonparametric regression#Primary resolution level#Thresholding

We show that unless the target density is particularly smooth, cross-validation applied directly to nonlinear wavelet estimators produces an empirical value of primary resolution which fails, by an order of magnitude, to give asymptotic optimality. We note, too, that in the same setting, but for different reasons, cross-validation of the linear component of a wavelet estimator fails to give asymptotic optimality, if the primary resolution level that it suggests is applied to the nonlinear form of the estimator. We propose an alternative technique, based on multiple cross-validation of the linear component. Our method involves dividing the region of interest into a number of subregions, choosing a resolution level by cross-validation of the linear part of the estimator in each subregion, and taking the final empirically chosen level to be the minimum of the subregion values. This approach exploits the relative resistance of wavelet methods to over-smoothing: the final resolution level is too small in some parts of the main region, but that has a relatively minor effect on performance of the final estimator. The fact that we use the same resolution level throughout the region, rather than a different level in each subregion, means that we do not need to splice together different estimates and remove artificial jumps where the subregions abut.

## On Gauss quadrature and partial cross validation

Source: Elsevier
Publisher: Elsevier

Type: Journal article

Portuguese

Search relevance

66.14%

#Approximation theory#Data reduction#Estimation#Numerical methods#Optimization#Computational costs#Finite intervals#Computational methods#Regression analysis#Statistical analysis#Asymptotic distributions#Cross-validation

New estimators of expected values E[w(X)] of functions of a random variable X are introduced. The new estimators are based on Gauss quadrature, a numerical method frequently used to approximate integrals over finite intervals. The estimators need only a small number of numerical evaluations and hence are useful in partial cross validation (PCV), a numerical method for finding optimal smoothing parameters in nonparametric curve estimation. PCV can considerably reduce the computational cost of the generalized cross validation method typically used to determine the optimal smoothing parameter.
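As background for the quadrature idea mentioned above, the following sketch approximates an expected value with only three function evaluations using the standard 3-point Gauss-Legendre rule. It assumes X is uniform on a finite interval [a, b], which is an assumption made here for illustration. It is not the estimator proposed in the paper, and the function name is hypothetical.

```python
import math

# Standard 3-point Gauss-Legendre nodes and weights on [-1, 1].
NODES = (-math.sqrt(3 / 5), 0.0, math.sqrt(3 / 5))
WEIGHTS = (5 / 9, 8 / 9, 5 / 9)


def expected_value_uniform(w, a, b):
    """Approximate E[w(X)] for X ~ Uniform(a, b) with three evaluations of w.

    The integral of w over [a, b] is approximated by
    (b - a)/2 * sum(weight * w(node)); dividing by the interval length
    (b - a) leaves the factor 1/2 used below.
    """
    half = (b - a) / 2
    mid = (a + b) / 2
    return 0.5 * sum(wt * w(mid + half * t) for t, wt in zip(NODES, WEIGHTS))


# E[X^2] for X ~ Uniform(0, 1) is 1/3; the 3-point rule is exact for
# polynomials up to degree 5.
print(expected_value_uniform(lambda x: x * x, 0.0, 1.0))
```

The appeal in a cross-validation setting is the small, fixed number of evaluations of w, each of which may be expensive.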

## Body fat in judokas: cross-validation of Lohman’s equation

Source: Universidade Federal de Santa Catarina. Florianópolis, SC. Brasil
Publisher: Universidade Federal de Santa Catarina. Florianópolis, SC. Brasil

Type: info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Peer reviewed; Descriptive
Format: application/pdf

Published on 05/09/2007
Portuguese

Search relevance

56.09%

#Kinanthropometry#Body composition#Anthropometry#Validity of tests#Skinfold thickness#Validation studies

Combat sports are contested in weight categories. The greater the proportion of lean mass per kilogram of body mass, the greater a fighter’s capacity to exert force. Estimating percentage body fat (%F) is therefore of fundamental importance for deciding in which category a fighter will compete. The objective of this study was to verify the cross-validity of Lohman’s equation (LE) [7] for the estimation of %F in fighters. The sample comprised 30 male judokas, resident in the Distrito Federal, Brazil, with a mean age of 25.1±4.5 years, mean body mass of 81.8±12.5 kg and mean height of 176.3±7.1 cm. Hydrostatic weighing (HW) was used as the gold standard for cross-validation. The statistical criteria employed were those proposed by Lohman [7], with the addition of residual score analysis [17]. Correlation was high (r = 0.80) and significant (p≤0.0005). Both the constant error and the standard error of estimation were less than 3.5%. The %F estimated by LE (15.1±4.7) was significantly different (p≤0.0005) from the %F measured by HW (11.9±4.2). Lohman’s equation significantly overestimated the %F. The residual scores demonstrated a lack of agreement between the LE and HW values of up to 8.5 %F. Lohman’s equation therefore does not exhibit cross-validity for this sample of judokas.

## PARAMETER SELECTION IN LEAST SQUARES-SUPPORT VECTOR MACHINES REGRESSION ORIENTED, USING GENERALIZED CROSS-VALIDATION

Source: DYNA
Publisher: DYNA

Type: Journal article
Format: text/html

Published on 01/02/2012
Portuguese

Search relevance

66.05%

#parameter selection#least squares-support vector machines#multidimensional generalized cross validation#regression

In this work, a new methodology for automatic selection of the free parameters in the least squares-support vector machines (LS-SVM) regression oriented algorithm is proposed. We employ a multidimensional generalized cross-validation analysis in the linear equation system of LS-SVM. Our approach does not require prior knowledge about the influence of the LS-SVM free parameters in the results. The methodology is tested on two artificial and two real-world data sets. According to the results, our methodology computes suitable regressions with competitive relative errors.
