Ch14. Deep Neural Networks

Challenges of Image Classification

  • Viewpoint variation
  • Illumination
  • Deformation
  • Occlusion
  • Background Clutter

Object detection / Semantic segmentation / Instance segmentation

Before Neural Nets: Classical Approaches

Template-based Approach

  • Find a sub-image of the target image that matches a “template” image
  • Straightforward to interpret: we know exactly what we are looking for
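
A minimal template-matching sketch, assuming OpenCV; the image paths are placeholders, not from the notes:

```python
import cv2

scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)        # target image (placeholder path)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)  # the "template" we look for
h, w = template.shape

# Slide the template over the scene and score every position
# with normalized cross-correlation.
scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, top_left = cv2.minMaxLoc(scores)

# Best-matching sub-image: the w x h window starting at top_left.
bottom_right = (top_left[0] + w, top_left[1] + h)
print(f"best match at {top_left}, score {best_score:.3f}")
```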

Feature-based Approach

  • Find strong, representative features in images, such as edges or corners
  • How to classify the result? Let a classifier “learn” from the extracted features
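
A sketch of classical feature extraction (edges and corners), again assuming OpenCV with a placeholder image path:

```python
import cv2
import numpy as np

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)

# Edges: locations of strong intensity change.
edges = cv2.Canny(img, 100, 200)

# Corners: points where intensity changes in two directions.
# Harris response with blockSize=2, ksize=3, k=0.04.
response = cv2.cornerHarris(np.float32(img), 2, 3, 0.04)
corners = np.argwhere(response > 0.01 * response.max())

# These features (or descriptors built around them) are what a
# classifier is then trained on.
print(f"{int(edges.sum() // 255)} edge pixels, {len(corners)} corner candidates")
```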

Data-driven approach

Use Machine Learning to train a classifier

Use a feature descriptor combined with a classifier (a minimal sketch combining the two follows this list)

  • Feature descriptor is a representation of useful information on an image
    • Points, edges, curves, color ...

    Histogram of Oriented Gradients (HOG)

    • Characterizes object appearance and shape
    • Gradient: changes in pixel intensity across an image
    • The image is divided into small connected regions called cells; a gradient histogram is computed for the pixels within each cell
    • Normalization of histograms: adjacent cells are grouped into regions called blocks, for invariance to illumination and shadowing
    • Famous application: Histograms of Oriented Gradients for Human Detection (Dalal & Triggs, 2005); figure panels from that paper:
      • (b): each pixel shows the maximum positive SVM weight in the block centred on the pixel
      • (c): likewise for the negative SVM weights
      • (e): the computed R-HOG (rectangular HOG) descriptor
      • (f), (g): the R-HOG descriptor weighted respectively by the positive and the negative SVM weights
  • Classifier is a method for determining class of unknown object
    • Nearest Neighbours, K-Nearest Neighbours, Support Vector Machine (SVM), ...

    Nearest Neighbour

    • Simple training: just memorize all training data and labels
    • Compare images using a distance metric (e.g. L1 or L2 distance)

    K-Nearest Neighbour

    • Take a majority vote over the K closest points
    • Compare images using a distance metric

    Support Vector Machine (SVM)

    • Suppose we have points in an n-dimensional space with class labels attached to those points. SVM finds a hyperplane that divides the space so that different classes end up on different sides
    • Two types: linear and non-linear
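
A minimal sketch of the descriptor + classifier pipeline above: HOG features fed to both k-NN and a linear SVM. It assumes scikit-image and scikit-learn, and uses the small `digits` dataset purely for illustration (not part of the notes):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from skimage.feature import hog

digits = load_digits()                       # 8x8 grayscale digit images
images, labels = digits.images, digits.target

# Feature descriptor: HOG per image (gradients binned per cell,
# then block-normalized for illumination invariance).
features = np.array([
    hog(img, orientations=8, pixels_per_cell=(4, 4), cells_per_block=(1, 1))
    for img in images
])

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

# Classifier option 1: k-nearest neighbours (majority vote of the k closest points).
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

# Classifier option 2: linear SVM (separating hyperplane between classes).
svm = LinearSVC(max_iter=5000).fit(X_train, y_train)

print("k-NN accuracy:", knn.score(X_test, y_test))
print("SVM accuracy:", svm.score(X_test, y_test))
```

HOG features with a linear SVM is exactly the pipeline of the human-detection paper cited above.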

Neural Network (NN)

Different options for Activation functions

Sigmoid / tanh → problem of vanishing gradient

ReLU: advantageous during backpropagation (the gradient does not vanish for positive inputs)
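
A small numeric illustration (assumed, not from the lecture) of why sigmoid/tanh gradients vanish while ReLU's does not:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])

# Derivatives used during backpropagation:
d_sigmoid = sigmoid(x) * (1.0 - sigmoid(x))   # at most 0.25, ~0 for large |x|
d_tanh = 1.0 - np.tanh(x) ** 2                # also ~0 for large |x|
d_relu = (x > 0).astype(float)                # exactly 1 for every positive input

print("sigmoid':", np.round(d_sigmoid, 4))
print("tanh'   :", np.round(d_tanh, 4))
print("ReLU'   :", d_relu)
```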

Deep Neural Network (DNN)

  • Shallow Neural Networks: 1 hidden layer
  • Deep Neural Networks: 2 or more hidden layers
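
A minimal sketch of the distinction, assuming PyTorch and an arbitrary 784-dimensional input with 10 output classes (both are assumptions, not from the notes):

```python
import torch.nn as nn

# Shallow network: a single hidden layer.
shallow = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 10),
)

# Deep network: several hidden layers.
deep = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)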

Convolutional Neural Network (CNN)

  • Local connectivity
  • Weight sharing

Convolutional Layers

  • Slide a filter (shared weights representing a feature) over the image, computing a dot product at each position
  • Produces a feature map
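
A hedged sketch of a single convolution: one small filter (shared weights) slides over the image, each output looks only at a local patch, and the dot products are collected into a feature map:

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]          # local connectivity
            feature_map[i, j] = np.sum(patch * kernel)  # dot product with shared weights
    return feature_map

image = np.random.rand(8, 8)
vertical_edge_filter = np.array([[1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0]])
print(conv2d(image, vertical_edge_filter).shape)  # (6, 6) feature map
```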

ReLU Layers

Allows the CNN to model non-linear relationships

Can use other activation functions

  • ReLU generally works better in practice
  • Sigmoid not recommended

Pooling Layers

  • Translational invariance ⇒ output remains the same even if the feature is moved a little
  • Reduce the size of feature map
  • Different ways to implement
    • Max Pooling
    • Average Pooling
    • Min Pooling
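
A minimal 2x2 max-pooling sketch in NumPy (assumed, not from the notes), showing both effects: the feature map shrinks by a factor of two per axis, and the output is unchanged when the feature shifts slightly within a pooling window:

```python
import numpy as np

def max_pool2x2(fmap):
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.zeros((4, 4))
fmap[0, 0] = 1.0              # one strong activation
shifted = np.zeros((4, 4))
shifted[1, 1] = 1.0           # same feature, moved one pixel

print(max_pool2x2(fmap))      # 2x2 output (size reduced)
print(np.array_equal(max_pool2x2(fmap), max_pool2x2(shifted)))  # True
```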

Fully-Connected Layers

  • Final layer
  • Returns class probabilities for the objects in the image
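
Putting the layer types together: a hedged end-to-end sketch in PyTorch, assuming 28x28 grayscale inputs and 10 classes (both assumptions, not from the notes):

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolutional layer: 8 feature maps
    nn.ReLU(),                                   # non-linearity
    nn.MaxPool2d(2),                             # pooling: 28x28 -> 14x14
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling: 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 10),                   # fully-connected layer: class scores
)

x = torch.randn(1, 1, 28, 28)                    # one dummy grayscale image
probs = torch.softmax(cnn(x), dim=1)             # class probabilities
print(probs.shape)                               # torch.Size([1, 10])
```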