Introduction to Neural Networks

What is ML?

In traditional programming, we feed a program rules and data, and we get out answers. In ML, we feed a program data and answers, and we get out rules. So ML is good for problems where we can't easily define the rules ourselves. But to learn how it works, we'll start with some simpler examples that are easily solved by traditional programming.
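The contrast can be sketched in a few lines. The Celsius-to-Fahrenheit conversion below is just an illustrative example (it isn't from these notes): the traditional version writes the rule by hand, while the "ML" version recovers the same rule from data and answers alone, here with an ordinary least-squares line fit.

```python
import numpy as np

# Traditional programming: we write the rule ourselves.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

# Machine learning: we give the program data and answers,
# and it recovers the rule (here, a least-squares line fit).
celsius = np.array([-40.0, 0.0, 10.0, 20.0, 37.0, 100.0])
fahrenheit = celsius_to_fahrenheit(celsius)

# Fit fahrenheit ~ slope * celsius + intercept from the data alone.
A = np.stack([celsius, np.ones_like(celsius)], axis=1)
(slope, intercept), *_ = np.linalg.lstsq(A, fahrenheit, rcond=None)

print(round(slope, 2), round(intercept, 2))  # the learned "rule": 1.8 and 32.0
```

Because the data is exactly linear, the fit recovers the hand-written rule (multiply by 1.8, add 32) without ever being told it.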

Neural networks simulate the architecture of the neuron, arguably nature and evolution's greatest achievement. Neurons appear in many forms of life and all do the same thing: they take in input and make a choice, fire or don't.

When you connect many neurons together, each individual neuron deciding based on its inputs (what you see, for instance) and on the strength, or weight, of its connections, you get emergent behavior. That is, behavior more complex than any of its parts. This emergent behavior is what we call thinking, or predicting, or classifying something as this or that. These are things we do all the time with our brains. To see it emerge from such a simple concept is amazing.

What is a Neural Network?

A perceptron simulates a neuron. A neuron accepts chemical signals through its dendrites. The cell body performs a calculation on those signals, and if they are over a certain threshold, the neuron fires a signal down the axon, which is passed to other neurons as the input for another calculation.

Artificial neural networks are a software representation of that process. Their goal is to simulate the neurons in the brain, with weights attached to the connections between them. In so doing we create a data structure that is capable of learning.

Activation Function

An activation function is the maker of the decision. If the calculated inputs reach a certain threshold, the neuron fires; if they fall under it, it doesn't. There are three common types: step, sigmoid, and sign.
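The three activation functions named above can be sketched directly; the threshold values here are just illustrative defaults.

```python
import math

# Step: fires (1) only if the input reaches the threshold.
def step(x, threshold=0.0):
    return 1 if x >= threshold else 0

# Sigmoid: a smooth curve from 0 to 1; sigmoid(0) == 0.5.
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Sign: the "signed" variant, returning -1 or +1 instead of 0 or 1.
def sign(x):
    return 1 if x >= 0 else -1

print(step(0.7), round(sigmoid(0.0), 2), sign(-2.5))  # 1 0.5 -1
```

Step and sign make a hard yes/no choice; sigmoid makes a soft one, which is what lets later training methods nudge a neuron gradually rather than flip it.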

Prioritizing Inputs with Weights
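The idea in this heading can be sketched in a few lines: the neuron multiplies each input by its weight and sums the results, so the inputs with the largest weights dominate the decision. The numbers below are made up purely for illustration.

```python
# Hypothetical inputs and weights, chosen only for illustration:
# the second input matters most because its weight is largest.
inputs  = [1.0, 1.0, 1.0]
weights = [0.1, 0.9, 0.2]

# The neuron's raw signal is the weighted sum of its inputs.
weighted_sum = sum(i * w for i, w in zip(inputs, weights))
print(weighted_sum)  # 1.2
```

Learning, at its core, is the process of adjusting these weights until the weighted sums produce the right decisions.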

What is a CNN?

The short answer: it's the kind of network we use to process images and do image-recognition tasks. That doesn't do it justice, though. ConvNets, or CNNs, were inspired by our own visual cortex, the one you're using to read this now.

The visual cortex has specialized neurons that only fire when they see something specific, like vertical edges, horizontal edges, or diagonal edges. So very smart people decided to simulate that in code.
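That simulation is convolution: slide a small filter over the image and record how strongly each patch matches it. The tiny image and the vertical-edge kernel below are made up for illustration, not taken from these notes.

```python
import numpy as np

# A tiny 5x5 "image": dark on the left, bright on the right,
# so it contains one vertical edge.
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# A vertical-edge filter: responds where brightness changes
# left-to-right, like the edge-detecting neurons in the cortex.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

# Valid "convolution" (cross-correlation, as most CNN libraries do it):
# slide the kernel over every 3x3 patch and sum the products.
h, w = image.shape
k = kernel.shape[0]
out = np.zeros((h - k + 1, w - k + 1))
for r in range(out.shape[0]):
    for c in range(out.shape[1]):
        out[r, c] = np.sum(image[r:r+k, c:c+k] * kernel)

print(out)  # large values where the vertical edge is, zeros elsewhere
```

In a real CNN the kernel values aren't hand-written like this; they are weights the network learns.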

Bias

But weights and inputs alone aren't really enough. A bias is an extra term added to the weighted sum that shifts the neuron's threshold, letting it fire (or stay quiet) even when the inputs alone wouldn't get it there.
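A minimal sketch of why: with all-zero inputs, the weighted sum is zero no matter what the weights are, so only a bias can move the neuron past its threshold. The specific numbers are illustrative.

```python
# Without a bias, a neuron seeing all-zero inputs can never fire:
# the weighted sum is 0 regardless of the weights.
inputs  = [0.0, 0.0]
weights = [0.5, 0.8]
threshold = 0.0  # illustrative step-activation threshold

weighted_sum = sum(i * w for i, w in zip(inputs, weights))

# The bias is an extra learnable term added to the sum. It shifts
# the effective threshold, so the neuron's decision no longer
# depends only on its inputs.
bias = 0.3
fires_without_bias = weighted_sum > threshold
fires_with_bias = (weighted_sum + bias) > threshold
print(fires_without_bias, fires_with_bias)  # False True
```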

Overfitting

When the network latches onto quirks and noise in the training data, and too many epochs reinforce that wrongness. The model then does well on the data it has seen but poorly on data it hasn't.

Dropout

A strategy against overfitting that randomly removes a fraction of neurons (commonly 50%) at each training pass. This forces the model to spread the learning out across neurons instead of relying on any one of them.
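A minimal sketch of the mechanic with NumPy, assuming the common "inverted dropout" formulation: zero each activation with probability 0.5 and scale the survivors so the expected activation is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Activations of a hypothetical hidden layer.
activations = np.ones(10)

# Dropout with rate 0.5: each neuron is independently zeroed with
# probability 0.5 during training. Scaling survivors by 1/(1 - rate)
# ("inverted dropout") keeps the expected activation unchanged.
rate = 0.5
mask = rng.random(activations.shape) >= rate
dropped = activations * mask / (1 - rate)

print(dropped)  # each entry is either 0.0 (dropped) or 2.0 (kept, rescaled)
```

At inference time the mask is turned off and all neurons participate; the training-time rescaling is what makes that switch seamless.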

Model Zoos

Collections of proven, often pretrained, network architectures. Browsing them is a good source of advice on how many layers and neurons you need to solve a particular problem. This arrangement of layers and neurons is called your topology.

Tuning topology

Mostly trial and error. Start with a smaller network with fewer neurons in the hidden layers; smaller networks yield faster models, while adding layers and neurons buys capacity at the cost of speed.

Classification

Is when the labels are discrete. Black or white.

Regression

Is when the labels are real numbers. You're trying to predict the line (or curve) that fits them.

ReLU

The rectifier function, used in CNNs (and most modern networks). It increases non-linearity, which matters because images are highly non-linear.
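ReLU itself is one line: max(0, x). Clipping negatives to zero is the non-linearity.

```python
# ReLU: max(0, x). Negative signals are clipped to zero; positive
# signals pass through unchanged.
def relu(x):
    return max(0.0, x)

print([relu(x) for x in [-2.0, -0.5, 0.0, 1.5]])  # [0.0, 0.0, 0.0, 1.5]
```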

Pooling (Mean/Max/Sum)

Removes information but keeps the features: step through the grid two cells at a time (the stride) and record the max (or mean, or sum) of each window. This helps in terms of processing and the number of parameters, which in turn helps prevent overfitting.

It's important that the network sees the features, and not the noise that surrounds them.
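Max pooling on a 4x4 grid with 2x2 windows and stride 2 can be sketched as follows; the feature-map values are made up for illustration.

```python
import numpy as np

# A hypothetical 4x4 feature map.
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 2, 8, 3],
])

# 2x2 max pooling with stride 2: split the grid into 2x2 blocks and
# keep only the largest value in each block (the strongest feature).
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2] [2 8]]
```

The grid shrinks from 4x4 to 2x2, but the strongest responses (the features) survive while weaker surrounding values (the noise) are discarded.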

http://ais.uni-bonn.de/papers/icann2010_maxpool.pdf

MNIST CNN Example

http://scs.ryerson.ca/~aharley/vis/conv/flat.html