Artificial Neural Networks Learning Process
Another day, another surprise from the New York Times! This time it's a front page article on "deep-learning," an integral part of my own work and something that defies many attempts at simple explanation. Sadly, that's also true of the Times article, which never actually explains what deep learning is! Indeed, the reader is left to wonder if in this context "deep" refers to the nature of the philosophical problem that artificial intelligence presents.
The closest we get is this:
But what is new in recent months is the growing speed and accuracy of deep-learning programs, often called artificial neural networks or just “neural nets” for their resemblance to the neural connections in the brain.
The same sentence could have been written about the perceptron networks in the 1960's, "classic" neural networks in the 1980's, or spiking networks in the past decade. In fact, it was -- the article references the "AI winters" that followed the first two highly-publicized advances.
Ironically, "deep" learning has nothing at all to do with any resemblance to neural structure; the "deep" refers to the structure of the AI model itself. The technique was popularized by an excellent piece of work by Geoffrey Hinton at the University of Toronto -- and Hinton is deservedly mentioned in the Times article. (Hinton is a star of this field; his work on training algorithms enabled the neural networks of the 80's in the first place!)

Hinton was working with a class of stochastic models called Restricted Boltzmann Machines. Like many other networks, RBM's have two layers, one for input and one for processing (some other networks have a third layer for output). Researchers knew that adding more layers could improve their results -- and their models' perceived "intelligence" -- but for a variety of reasons those extra layers made training the models extremely difficult.

Hinton reasoned that perhaps one could fully train the first two layers, then "lock" them, add a third layer, and begin training it. Once the third layer was trained, one would lock it as well and add a fourth layer, and so on as desired. By initializing all the layers of the network in this way, it became possible to use classical algorithms to train all the layers together in a "fine-tuning" procedure. The result was a model consisting of many layers -- a "deep" model -- in contrast to the simple or "shallow" models that had been commonplace up to that time. Hinton called this "greedy layer-wise training," referencing the surprising fact that each layer did its learning without knowing it would pass its knowledge on to another, and yet all the layers together arrived at a cohesive representation of the data.
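The greedy procedure can be sketched in a few dozen lines. What follows is a minimal NumPy illustration, not Hinton's actual implementation: each RBM is trained with one-step contrastive divergence (CD-1), then frozen, and its hidden activations become the input to the next RBM in the stack. The layer sizes, learning rate, and epoch count here are arbitrary choices for a toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """A Restricted Boltzmann Machine: one visible and one hidden layer."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible-unit biases
        self.b_h = np.zeros(n_hidden)    # hidden-unit biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        """One update using one-step contrastive divergence (CD-1)."""
        h0 = self.hidden_probs(v0)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)   # one reconstruction step
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)

def greedy_pretrain(data, layer_sizes, epochs=50):
    """Fully train each RBM, 'lock' it, and feed its hidden
    activations to the next layer -- greedy layer-wise training."""
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # locked layer's output becomes next input
    return rbms

# Toy binary data: 100 samples with 8 features
data = (rng.random((100, 8)) < 0.5).astype(float)
stack = greedy_pretrain(data, layer_sizes=[6, 4, 2])
```

After this greedy pass, the stacked weights would serve as the initialization for the "fine-tuning" step, in which a classical algorithm (e.g. backpropagation) trains all the layers jointly.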
Source: This is the Green Room
For anyone with an actual interest in AI (2008-03-02 20:38:48, by MsLoree)
I found this book interesting:
Massively Parallel Artificial Intelligence
Edited by Hiroaki Kitano and James A. Hendler
"The increased sophistication and availability of massively parallel supercomputers has had two major impacts on research in artificial intelligence, both of which are addressed in this collection of exciting new AI theories and experiments. Massively parallel computers have been used to push forward research in traditional AI topics such as vision, search, and speech. More important, these machines allow AI to expand in exciting new ways by taking advantage of research in neuroscience and developing new models and paradigms, among them associative memory, neural networks, genetic algorithms, artificial life, society-of-mind models, and subsumption...