Bayesian Learning for Neural Networks
Springer Science & Business Media, Dec 6, 2012 - 204 pages

Artificial "neural networks" are widely used as flexible models for classification and regression applications, but questions remain about how the power of these models can be safely exploited when training data is limited. This book demonstrates how Bayesian methods allow complex neural network models to be used without fear of the "overfitting" that can occur with traditional training methods. Insight into the nature of these complex Bayesian models is provided by a theoretical investigation of the priors over functions that underlie them. A practical implementation of Bayesian neural network learning using Markov chain Monte Carlo methods is also described, and software for it is freely available over the Internet. Presupposing only basic knowledge of probability and statistics, this book should be of interest to researchers in statistics, engineering, and artificial intelligence.
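A central result of the book's investigation of priors over functions (Chapter 2, "Priors for Infinite Networks") is that a one-hidden-layer tanh network with Gaussian priors on its weights, and hidden-to-output weights scaled by 1/sqrt(H), converges to a Gaussian process prior as the number of hidden units H grows. The following is a minimal NumPy sketch of drawing functions from such a prior; it is not part of the book's software (Appendix B covers obtaining that), and the specific prior standard deviations sigma_a, sigma_b, sigma_v here are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(0)

    def draw_prior_function(x, n_hidden, sigma_a=5.0, sigma_b=5.0, sigma_v=1.0):
        # Input-to-hidden weights and hidden biases get fixed Gaussian priors.
        a = rng.normal(0.0, sigma_a, n_hidden)
        b = rng.normal(0.0, sigma_b, n_hidden)
        # Hidden-to-output weights are scaled by 1/sqrt(H), so the prior
        # variance of f(x) stays fixed as the number of hidden units grows.
        v = rng.normal(0.0, sigma_v / np.sqrt(n_hidden), n_hidden)
        h = np.tanh(np.outer(x, a) + b)   # hidden unit values, shape (len(x), n_hidden)
        return h @ v                      # one function drawn from the prior

    x = np.linspace(-1.0, 1.0, 5)
    for n_hidden in (1, 10, 10000):
        f = draw_prior_function(x, n_hidden)
        print(f"H = {n_hidden:5d}   f(x) = {np.round(f, 3)}")

Draws with small H are visibly shaped by individual tanh units, while for large H the scaled sum approaches its Gaussian process limit; the book also shows that Cauchy or other stable priors on the output weights lead instead to non-Gaussian stable processes.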
Table of Contents
Introduction | 1
Priors for Infinite Networks | 15
Monte Carlo Implementation | 55
Evaluation of Neural Network Models | 104
Conclusions and Further Work | 145
A Details of the Implementation | 153
B Obtaining the Software | 168
| 177
Common terms and phrases
ARD model, autocorrelations, average squared, Bayesian inference, Bayesian learning, Bayesian neural network, canonical distribution, Cauchy prior, chain Monte Carlo, Chapter, complex, conditional distribution, converge, data sets, energy, equation, estimate, Figure, fractional Brownian, Functions drawn, Gamma distribution, Gamma prior, Gaussian approximation, Gaussian distributions, Gaussian priors, Gaussian process, Gibbs sampling updates, given, guess, hidden units, hidden-to-output weights, hybrid Monte Carlo, hyperparameters, initial phase, input unit, irrelevant inputs, likelihood, MacKay, magnitudes, Markov chain Monte, Monte Carlo implementation, Monte Carlo method, Monte Carlo updates, multilayer perceptron, network parameters, neural network, neural network models, noise, non-Gaussian, number of hidden, obtained, output units, overfitting, partial gradients, performance, posterior distribution, predictive distribution, prior distribution, probability density, procedure, random walk, robot arm problem, sampling phase, Section, squared error, stable distribution, standard deviation, stepsize adjustment factor, super-transitions, t-distribution, tanh hidden units, test set, training data, training set, vague priors, variance, weights and biases, zero