{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Neural Networks\n", "\n", "A deep feedforward neural network consists of multiple nodes which mimic the biological neurons of a human brain. \n", "\n", "**Network:** The model is associated with a directed acyclic graph describing how the functions (layers) are composed together.\n", "\n", "**Feedforward:** The information flows from inputs to outputs through the layers of the network.\n", "\n", "**Neural:** The inspiration originated from neuroscience. Each element of a layer plays a role analogous to a neuron.\n", "\n", "The goal of a feedforward neural network is to approximate some function $\\large f^*$. For example, for a classifier, $\\large y = f^*(x)$ maps an input $\\large x$ to a category $\\large y$. A feedforward network defines a mapping $\\large y = f(x; \\theta)$ and learns the value of the parameters $\\large \\theta$ that result in the best function approximation for $\\large f^*$.\n", "\n", "![python-scistack](https://icdn5.digitaltrends.com/image/artificial_neural_network_1-791x388.jpg)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Timeline of Deep Learning\n", "\n", "Some of the biggest accomplishments and moments in deep learning:\n", "
- 1943: First mathematical model of a neural network (McCulloch and Pitts).\n",
"- 1957: The perceptron model (Rosenblatt).\n",
"- 1969: First AI Winter - Minsky and Papert showed that a single-layer perceptron cannot learn the simple XOR function.\n",
"- 1986: Backpropagation made training of multi-layer networks practical.\n",
"- 1989: Universal approximation theorem.\n",
"- 1995: Second AI Winter - neural network training did not scale to larger problems, and SVMs became the method of choice.\n",
"- 2010: The annual ImageNet competition (ILSVRC) kicked off.\n",
"- 2012: AlexNet, a deep convolutional network, nearly halved the best existing ImageNet competition error.\n",
"- 2014: Generative adversarial networks (GANs) were introduced.\n",
"- 2017: DeepMind's AlphaGo beat Ke Jie, the world number-one Go player.\n",
• \n", "\n", "\n", "![python-scistack](http://beamandrew.github.io//images/deep_learning_101/imagenet_progress.png)\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Universal Approximation Theorem\n", "\n", "**Informal Statement:**\n", "\n", "A feedforward network with a single hidden layer containing a finite number of neurons can approximate \"any\" function on $\\large \\mathbb{R}^n$, under mild assumptions on the activation function and the target function.\n", "\n", "**Formal Statement:**\n", "\n", "![python-scistack](http://neuralnetworksanddeeplearning.com/images/tikz10.png)\n", "\n", "\n", "Let ${ \\large \\varphi :\\mathbb {R} \\to \\mathbb {R} }$ be a nonconstant, bounded, and continuous function. Let ${\\large I_{m}}$ denote a compact subset of ${\\large \\mathbb{R}^m}$. The space of real-valued continuous functions on ${\\large \\displaystyle I_{m}}$ is denoted by ${\\large \\displaystyle C(I_{m})}$. Then, given any ${\\large \\displaystyle \\varepsilon >0}$ and any function ${\\large \\displaystyle f\\in C(I_{m})}$, there exist an integer ${\\large \\displaystyle N}$, real constants ${\\large \\displaystyle v_{i},b_{i}\\in \\mathbb {R} }$ and real vectors ${\\large \\displaystyle w_{i}\\in \\mathbb {R} ^{m}}$ for ${\\large \\displaystyle i=1,\\ldots ,N}$ such that we may define:\n", "\n", "$${\\large \\displaystyle F(x)=\\sum _{i=1}^{N}v_{i}\\varphi \\left(w_{i}^{T}x+b_{i}\\right)}$$\n", "\n", "as an approximate realization of the function ${\\large \\displaystyle f}$; that is,\n", "\n", "$${\\large \\displaystyle |F(x)-f(x)|<\\varepsilon }$$ \n", "\n", "for all ${\\large \\displaystyle x\\in I_{m}}$. In other words, functions of the form ${\\large \\displaystyle F(x)}$ are dense in ${\\large \\displaystyle C(I_{m})}$.\n", "\n", "### No Free Lunch Theorem\n", "\n", "Averaged over all possible data-generating distributions, every classification algorithm has the same error rate when classifying previously unobserved points. 
In other words, no machine learning algorithm is **universally** any better than any other. \n", "\n", "### Conclusion\n", "\n", "Feedforward networks provide a universal system for representing functions. However, there is no universal procedure for examining a training set of specific examples and choosing a function that will **generalize to points not in the training set**.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Types of Neural Networks\n", "\n", "**Convolutional Neural Networks (CNNs):** \n", "\n", "
- Convolutions are translation-equivariant, so features are detected regardless of where they appear in the image; pooling adds a degree of translation invariance.\n",
"- Convolutional layers are far more parameter-efficient than fully-connected (dense) layers for images, because a small kernel is reused at every spatial position.\n",
"- Applications: image classification, object detection, segmentation, etc.\n",
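As a back-of-the-envelope illustration of that parameter saving, here is a quick count for one conv layer versus one dense layer; the layer sizes are invented purely for the sake of the example:

```python
# Parameters needed to map a 32x32x3 image to a 32x32x16 feature map.
# (Illustrative sizes only.)
h, w, c_in, c_out = 32, 32, 3, 16
k = 3  # 3x3 convolution kernel

# Conv layer: one small kernel shared across all spatial positions, plus biases.
conv_params = k * k * c_in * c_out + c_out  # 448

# Dense layer: every input pixel connects to every output unit, plus biases.
dense_params = (h * w * c_in) * (h * w * c_out) + (h * w * c_out)  # ~50 million

print(conv_params, dense_params)
```

The convolution gets away with a few hundred parameters where the dense layer needs tens of millions — the efficiency referred to above.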
• \n", "\n", "\n", "![python-scistack](https://s3.amazonaws.com/cdn.ayasdi.com/wp-content/uploads/2018/06/21100605/Fig2GCNN1.png)\n", "\n", "**Recurrent Neural Networks (RNNs):**\n", "\n", "![python-scistack](https://cdn-images-1.medium.com/max/1280/1*xLcQd_xeBWHeC6CeYSJ9bA.png)\n", "\n", "
- Unlike feedforward networks, RNNs contain loops, which let information persist from one time step to the next (a form of memory).\n",
"- RNNs are well suited to processing sequences of data such as sentences and time series.\n",
"- Applications: speech recognition, handwriting recognition, language modeling, time-series analysis, etc.\n",
- Long Short-Term Memory (LSTM) networks add gating mechanisms that let the network retain information over long time spans, capturing both short- and long-term dependencies.\n", "\n", "![python-scistack](https://i.ytimg.com/vi/kMLl-TKaEnc/maxresdefault.jpg)\n", "\n", "\n", "**Autoencoders:**\n", "\n", "
- Unsupervised method: trained to reconstruct their own input, so no labels are required.\n",
"- Encoder: compresses the input into a low-dimensional representation. Decoder: reconstructs the original input from that compressed representation.\n",
"- Applications: compression, feature extraction, transfer learning, etc.\n",
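A minimal sketch of the compress-then-reconstruct idea, using the well-known fact that a *linear* autoencoder trained with squared error recovers the PCA subspace (the toy data here is invented for the example, and a real autoencoder would use nonlinear layers and gradient descent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in R^10 that secretly lie on a 3-dimensional subspace,
# so a 3-unit bottleneck can represent them essentially losslessly.
basis = rng.normal(size=(3, 10))
X = rng.normal(size=(200, 3)) @ basis           # shape (200, 10)

# A linear autoencoder with squared-error loss converges to the PCA subspace,
# so we can read the optimal bottleneck weights straight off the SVD.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:3]                                      # (3, 10) bottleneck weights

def encode(x):
    return x @ W.T                              # R^10 -> R^3  (compress)

def decode(z):
    return z @ W                                # R^3  -> R^10 (reconstruct)

X_rec = decode(encode(X))
print(float(np.abs(X - X_rec).max()))           # near-zero reconstruction error
```

Because the data genuinely lives in three dimensions, the 3-unit code loses nothing; on real data the bottleneck forces the network to keep only the most informative features.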
• \n", "\n", "\n", "![python-scistack](https://cdn-images-1.medium.com/max/1600/1*44eDEuZBEsmG_TCAKRI3Kw@2x.png)\n", "\n", "\n", "**Generative Adversarial Neural Networks (GANs):**\n", "\n", "
- Applications: image generation (faces, anime characters), style transfer, transfer learning, etc.
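The "adversarial" part of these applications comes from the original GAN training objective (Goodfellow et al., 2014): a generator ${\large G}$ and a discriminator ${\large D}$ play a two-player minimax game,

$${\large \min_{G}\max_{D}\; \mathbb{E}_{x\sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z\sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]}$$

where ${\large D(x)}$ is the discriminator's estimate of the probability that ${\large x}$ is real, and ${\large G(z)}$ maps random noise ${\large z}$ to a synthetic sample. At the optimum of this game, the generator's distribution matches the data distribution.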