Neural Networks

Motivation

Neural Networks: General-purpose learning algorithm for modeling non-linearity

... if you train it with "enough" data

Non-linear inputs

  • Images
  • Text
  • Speech
  • XOR

Limitations of linear models

Not "linearly separable"

xor

Can't draw a single straight-line boundary that separates the x's from the o's

Modeling non-linearity

Transform $x$ into $\phi(x)$ so that the data becomes linearly separable

xor
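
As a minimal sketch of this idea, the snippet below applies a hand-picked feature map to the four XOR points and then classifies them with an ordinary linear rule. The particular $\phi(x)$ (appending the product $x_1 x_2$), the weights, and the helper names are illustrative assumptions, not something taken from the slides; a neural network would learn its own transform instead.

```python
# XOR truth table: inputs (x1, x2) and the target label.
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def phi(x):
    """Hand-picked (not learned) feature map: append the product x1 * x2."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def linear_classifier(features, weights, bias):
    """Plain linear model: predict 1 when w . features + b > 0."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# In phi-space, the plane x1 + x2 - 2*x1*x2 = 0.5 separates the two classes.
WEIGHTS, BIAS = (1.0, 1.0, -2.0), -0.5

predictions = [linear_classifier(phi(x), WEIGHTS, BIAS) for x, _ in XOR_DATA]
print(predictions)  # [0, 1, 1, 0]
```

No choice of weights solves XOR on the raw two inputs, but after the transform a single linear boundary is enough.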

$\phi(x)$ is the basis for a "neuron"

Neuron

$$y = W\phi(x) + b$$

$$\phi(x) = g(W'x + b')$$

Trainable: $W', b', W, b$

$g(x)$ is a non-linear function, e.g. Sigmoid

$$y = \mathrm{sigmoid}(Wx + b)$$

$$y = \mathrm{relu}(Wx + b)$$
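
The two equations $y = W\phi(x) + b$ and $\phi(x) = g(W'x + b')$ can be sketched in NumPy as follows. The parameter values here are hand-picked for illustration (they are not trained and are not from the slides); with ReLU as $g$, they happen to reproduce XOR.

```python
import numpy as np


def relu(z):
    """Non-linear function g: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)


# Hand-picked (not trained) parameters, chosen so the output equals XOR.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])   # W'
b_hidden = np.array([0.0, -1.0])    # b'
W_out = np.array([1.0, -2.0])       # W
b_out = 0.0                         # b


def forward(x):
    phi = relu(W_hidden @ x + b_hidden)   # phi(x) = g(W'x + b')
    return W_out @ phi + b_out            # y = W phi(x) + b


inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = [float(forward(x)) for x in inputs]
print(outputs)
```

In practice $W', b', W, b$ are found by training rather than set by hand.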

Neuron (Perceptron)

neuron

(image: Neural Network Methods in Natural Language Processing, Goldberg, 2017)

Neural Network

Multiple neurons in one layer make up an "Artificial Neural Network"

neural network

(image: Wikipedia)

Neural Network (Deep)

Multiple "hidden" layers of neurons make up a "Deep Neural Network"

multi-layer perceptron

(image: Goldberg, 2017)

Properties of a Neural Network

| Term | Description | Examples |
|---|---|---|
| Input dimension | How many inputs | 4 |
| Output dimension | How many outputs | 3 |
| Number of hidden layers | Number of layers, excluding input and output | 2 |
| Activation type | Type of non-linear function | sigmoid, ReLU, tanh |
| Hidden layer type | How the neurons are connected together | Fully-connected, Convolutional |
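
To make these properties concrete, here is a rough NumPy sketch of a fully-connected network using the example values from the table (4 inputs, 3 outputs, 2 hidden layers). The hidden-layer width of 5, the tanh activation, and the random weights are arbitrary assumptions for illustration.

```python
import numpy as np

INPUT_DIM = 4            # "Input dimension" example from the table
OUTPUT_DIM = 3           # "Output dimension" example from the table
HIDDEN_UNITS = [5, 5]    # 2 hidden layers; the width 5 is an assumption

layer_sizes = [INPUT_DIM] + HIDDEN_UNITS + [OUTPUT_DIM]

rng = np.random.default_rng(0)
# One (W, b) pair per fully-connected layer.
params = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]


def forward(x):
    for i, (W, b) in enumerate(params):
        x = W @ x + b
        if i < len(params) - 1:   # activation on hidden layers only
            x = np.tanh(x)
    return x


# Each layer contributes (inputs * outputs) weights plus one bias per output.
n_params = sum(W.size + b.size for W, b in params)
y = forward(np.ones(INPUT_DIM))
print(y.shape, n_params)  # (3,) 73
```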

Activation types

What non-linearity is applied

dnn

(image: Goldberg, 2017)
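
For reference, the three activation types named above can be written in a few lines of NumPy. This is a minimal sketch for comparing their output ranges; the sample inputs are arbitrary.

```python
import numpy as np


def sigmoid(z):
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    """Passes positive inputs through unchanged and clips negatives to 0."""
    return np.maximum(z, 0.0)


z = np.array([-2.0, 0.0, 2.0])
# tanh is built into NumPy and squashes inputs into the range (-1, 1).
for name, g in [("sigmoid", sigmoid), ("relu", relu), ("tanh", np.tanh)]:
    print(name, np.round(g(z), 3))
```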

Layer types

How the neurons are connected together, and what operations are performed with x, W, and b:

  • Dense
  • Convolutional
  • Recurrent
  • Residual
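
As a small illustration of how the connection pattern changes the operation on x, W, and b, the sketch below contrasts a dense layer with a 1D convolutional layer on the same input. The input values, layer sizes, and kernel are made up for the example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Dense: every output is connected to every input through its own weight.
W_dense = np.ones((3, 5))
b_dense = np.zeros(3)
dense_out = W_dense @ x + b_dense

# Convolutional (1D, stride 1, no padding): one small kernel is slid across
# the input, so the same 3 weights are reused at every position.
kernel = np.array([1.0, 0.0, -1.0])
b_conv = 0.0
conv_out = np.array([kernel @ x[i:i + 3] + b_conv
                     for i in range(len(x) - len(kernel) + 1)])

dense_params = W_dense.size + b_dense.size   # 3*5 weights + 3 biases
conv_params = kernel.size + 1                # 3 shared weights + 1 bias
print(dense_out, conv_out, dense_params, conv_params)
```

Both layers produce three outputs here, but the convolutional layer needs far fewer parameters because its weights are shared across positions.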

More detail to come...

Walkthrough: Neural Network Architectures in keras

In this walkthrough, we will use Keras to examine the architecture of some well-known neural networks.

Setup - Conda environment (from scratch)

  1. Create a new conda environment called mldds03:
     a. Launch an Anaconda Python command window
     b. Run: conda create -n mldds03 python=3
  2. Activate the conda environment: conda activate mldds03
  3. Install: conda install jupyter numpy pandas matplotlib keras pydot python-graphviz
  4. Navigate to the courseware folder: cd mldds-courseware
  5. Launch Jupyter: jupyter notebook and open this notebook

Setup - Conda environment (from existing)

Install: conda install keras

Install: conda install pydot python-graphviz

Pre-trained Neural Networks in Keras

"Pre-trained" neural networks are available under keras.applications

https://keras.io/applications/

These are trained on the ImageNet dataset (http://www.image-net.org/), which contains millions of images.

The neural network architectures in keras.applications come from previous years' submissions to the annual ImageNet challenge.

In [1]:
import keras

print(keras.__version__)
Using TensorFlow backend.
2.2.0

MobileNet

MobileNet is a pre-trained ImageNet DNN optimized to run on smaller devices.

Documentation: https://keras.io/applications/#mobilenet

Implementation: https://github.com/keras-team/keras-applications/blob/master/keras_applications/mobilenet.py

In [2]:
from keras.applications import mobilenet

mobilenet_model = mobilenet.MobileNet(weights='imagenet')
mobilenet_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
conv1_pad (ZeroPadding2D)    (None, 226, 226, 3)       0         
_________________________________________________________________
conv1 (Conv2D)               (None, 112, 112, 32)      864       
_________________________________________________________________
conv1_bn (BatchNormalization (None, 112, 112, 32)      128       
_________________________________________________________________
conv1_relu (Activation)      (None, 112, 112, 32)      0         
_________________________________________________________________
conv_pad_1 (ZeroPadding2D)   (None, 114, 114, 32)      0         
_________________________________________________________________
conv_dw_1 (DepthwiseConv2D)  (None, 112, 112, 32)      288       
_________________________________________________________________
conv_dw_1_bn (BatchNormaliza (None, 112, 112, 32)      128       
_________________________________________________________________
conv_dw_1_relu (Activation)  (None, 112, 112, 32)      0         
_________________________________________________________________
conv_pw_1 (Conv2D)           (None, 112, 112, 64)      2048      
_________________________________________________________________
conv_pw_1_bn (BatchNormaliza (None, 112, 112, 64)      256       
_________________________________________________________________
conv_pw_1_relu (Activation)  (None, 112, 112, 64)      0         
_________________________________________________________________
conv_pad_2 (ZeroPadding2D)   (None, 114, 114, 64)      0         
_________________________________________________________________
conv_dw_2 (DepthwiseConv2D)  (None, 56, 56, 64)        576       
_________________________________________________________________
conv_dw_2_bn (BatchNormaliza (None, 56, 56, 64)        256       
_________________________________________________________________
conv_dw_2_relu (Activation)  (None, 56, 56, 64)        0         
_________________________________________________________________
conv_pw_2 (Conv2D)           (None, 56, 56, 128)       8192      
_________________________________________________________________
conv_pw_2_bn (BatchNormaliza (None, 56, 56, 128)       512       
_________________________________________________________________
conv_pw_2_relu (Activation)  (None, 56, 56, 128)       0         
_________________________________________________________________
conv_pad_3 (ZeroPadding2D)   (None, 58, 58, 128)       0         
_________________________________________________________________
conv_dw_3 (DepthwiseConv2D)  (None, 56, 56, 128)       1152      
_________________________________________________________________
conv_dw_3_bn (BatchNormaliza (None, 56, 56, 128)       512       
_________________________________________________________________
conv_dw_3_relu (Activation)  (None, 56, 56, 128)       0         
_________________________________________________________________
conv_pw_3 (Conv2D)           (None, 56, 56, 128)       16384     
_________________________________________________________________
conv_pw_3_bn (BatchNormaliza (None, 56, 56, 128)       512       
_________________________________________________________________
conv_pw_3_relu (Activation)  (None, 56, 56, 128)       0         
_________________________________________________________________
conv_pad_4 (ZeroPadding2D)   (None, 58, 58, 128)       0         
_________________________________________________________________
conv_dw_4 (DepthwiseConv2D)  (None, 28, 28, 128)       1152      
_________________________________________________________________
conv_dw_4_bn (BatchNormaliza (None, 28, 28, 128)       512       
_________________________________________________________________
conv_dw_4_relu (Activation)  (None, 28, 28, 128)       0         
_________________________________________________________________
conv_pw_4 (Conv2D)           (None, 28, 28, 256)       32768     
_________________________________________________________________
conv_pw_4_bn (BatchNormaliza (None, 28, 28, 256)       1024      
_________________________________________________________________
conv_pw_4_relu (Activation)  (None, 28, 28, 256)       0         
_________________________________________________________________
conv_pad_5 (ZeroPadding2D)   (None, 30, 30, 256)       0         
_________________________________________________________________
conv_dw_5 (DepthwiseConv2D)  (None, 28, 28, 256)       2304      
_________________________________________________________________
conv_dw_5_bn (BatchNormaliza (None, 28, 28, 256)       1024      
_________________________________________________________________
conv_dw_5_relu (Activation)  (None, 28, 28, 256)       0         
_________________________________________________________________
conv_pw_5 (Conv2D)           (None, 28, 28, 256)       65536     
_________________________________________________________________
conv_pw_5_bn (BatchNormaliza (None, 28, 28, 256)       1024      
_________________________________________________________________
conv_pw_5_relu (Activation)  (None, 28, 28, 256)       0         
_________________________________________________________________
conv_pad_6 (ZeroPadding2D)   (None, 30, 30, 256)       0         
_________________________________________________________________
conv_dw_6 (DepthwiseConv2D)  (None, 14, 14, 256)       2304      
_________________________________________________________________
conv_dw_6_bn (BatchNormaliza (None, 14, 14, 256)       1024      
_________________________________________________________________
conv_dw_6_relu (Activation)  (None, 14, 14, 256)       0         
_________________________________________________________________
conv_pw_6 (Conv2D)           (None, 14, 14, 512)       131072    
_________________________________________________________________
conv_pw_6_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_6_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_7 (ZeroPadding2D)   (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_7 (DepthwiseConv2D)  (None, 14, 14, 512)       4608      
_________________________________________________________________
conv_dw_7_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_dw_7_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pw_7 (Conv2D)           (None, 14, 14, 512)       262144    
_________________________________________________________________
conv_pw_7_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_7_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_8 (ZeroPadding2D)   (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_8 (DepthwiseConv2D)  (None, 14, 14, 512)       4608      
_________________________________________________________________
conv_dw_8_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_dw_8_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pw_8 (Conv2D)           (None, 14, 14, 512)       262144    
_________________________________________________________________
conv_pw_8_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_8_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_9 (ZeroPadding2D)   (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_9 (DepthwiseConv2D)  (None, 14, 14, 512)       4608      
_________________________________________________________________
conv_dw_9_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_dw_9_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pw_9 (Conv2D)           (None, 14, 14, 512)       262144    
_________________________________________________________________
conv_pw_9_bn (BatchNormaliza (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_9_relu (Activation)  (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_10 (ZeroPadding2D)  (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_10 (DepthwiseConv2D) (None, 14, 14, 512)       4608      
_________________________________________________________________
conv_dw_10_bn (BatchNormaliz (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_dw_10_relu (Activation) (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pw_10 (Conv2D)          (None, 14, 14, 512)       262144    
_________________________________________________________________
conv_pw_10_bn (BatchNormaliz (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_10_relu (Activation) (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_11 (ZeroPadding2D)  (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_11 (DepthwiseConv2D) (None, 14, 14, 512)       4608      
_________________________________________________________________
conv_dw_11_bn (BatchNormaliz (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_dw_11_relu (Activation) (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pw_11 (Conv2D)          (None, 14, 14, 512)       262144    
_________________________________________________________________
conv_pw_11_bn (BatchNormaliz (None, 14, 14, 512)       2048      
_________________________________________________________________
conv_pw_11_relu (Activation) (None, 14, 14, 512)       0         
_________________________________________________________________
conv_pad_12 (ZeroPadding2D)  (None, 16, 16, 512)       0         
_________________________________________________________________
conv_dw_12 (DepthwiseConv2D) (None, 7, 7, 512)         4608      
_________________________________________________________________
conv_dw_12_bn (BatchNormaliz (None, 7, 7, 512)         2048      
_________________________________________________________________
conv_dw_12_relu (Activation) (None, 7, 7, 512)         0         
_________________________________________________________________
conv_pw_12 (Conv2D)          (None, 7, 7, 1024)        524288    
_________________________________________________________________
conv_pw_12_bn (BatchNormaliz (None, 7, 7, 1024)        4096      
_________________________________________________________________
conv_pw_12_relu (Activation) (None, 7, 7, 1024)        0         
_________________________________________________________________
conv_pad_13 (ZeroPadding2D)  (None, 9, 9, 1024)        0         
_________________________________________________________________
conv_dw_13 (DepthwiseConv2D) (None, 7, 7, 1024)        9216      
_________________________________________________________________
conv_dw_13_bn (BatchNormaliz (None, 7, 7, 1024)        4096      
_________________________________________________________________
conv_dw_13_relu (Activation) (None, 7, 7, 1024)        0         
_________________________________________________________________
conv_pw_13 (Conv2D)          (None, 7, 7, 1024)        1048576   
_________________________________________________________________
conv_pw_13_bn (BatchNormaliz (None, 7, 7, 1024)        4096      
_________________________________________________________________
conv_pw_13_relu (Activation) (None, 7, 7, 1024)        0         
_________________________________________________________________
global_average_pooling2d_1 ( (None, 1024)              0         
_________________________________________________________________
reshape_1 (Reshape)          (None, 1, 1, 1024)        0         
_________________________________________________________________
dropout (Dropout)            (None, 1, 1, 1024)        0         
_________________________________________________________________
conv_preds (Conv2D)          (None, 1, 1, 1000)        1025000   
_________________________________________________________________
act_softmax (Activation)     (None, 1, 1, 1000)        0         
_________________________________________________________________
reshape_2 (Reshape)          (None, 1000)              0         
=================================================================
Total params: 4,253,864
Trainable params: 4,231,976
Non-trainable params: 21,888
_________________________________________________________________
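
The "Param #" column can be checked by hand. The sketch below reproduces a few of the counts from the summary above, assuming 3x3 kernels for the first convolution and the depthwise convolutions, 1x1 kernels for the pointwise convolutions, and no bias terms except in conv_preds; these assumptions are inferred from the numbers themselves, and the helper function names are hypothetical. It also shows why splitting a convolution into depthwise and pointwise steps keeps MobileNet small.

```python
def conv2d_params(kernel_h, kernel_w, in_ch, out_ch, use_bias=False):
    """Weights of a standard convolution, plus one bias per filter if used."""
    n = kernel_h * kernel_w * in_ch * out_ch
    return n + out_ch if use_bias else n


def depthwise_params(kernel_h, kernel_w, channels):
    """A depthwise convolution has one kernel per input channel."""
    return kernel_h * kernel_w * channels


def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (not trainable)."""
    return 4 * channels


conv1 = conv2d_params(3, 3, 3, 32)                           # summary: 864
conv1_bn = batchnorm_params(32)                              # summary: 128
conv_dw_1 = depthwise_params(3, 3, 32)                       # summary: 288
conv_pw_1 = conv2d_params(1, 1, 32, 64)                      # summary: 2048
conv_preds = conv2d_params(1, 1, 1024, 1000, use_bias=True)  # summary: 1025000

# A single standard 3x3 convolution from 32 to 64 channels, for comparison
# with the depthwise + pointwise pair that MobileNet uses instead.
standard_3x3 = conv2d_params(3, 3, 32, 64)
separable = conv_dw_1 + conv_pw_1
print(standard_3x3, separable)  # 18432 2336
```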

ResNet50

ResNet50 is another pre-trained ImageNet DNN. This is a larger network than MobileNet (almost 26 million parameters). It improves accuracy by introducing residual connections, which are connections that skip layers.

Documentation: https://keras.io/applications/#resnet50

Implementation: https://github.com/keras-team/keras-applications/blob/master/keras_applications/resnet50.py
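
A residual connection can be sketched in a few lines of NumPy: the block computes some function F(x) and then adds the original input back before the final activation. In the model summary for ResNet50 these additions show up as `Add` layers that take two inputs. The sketch uses two small dense layers as a stand-in for F; the real network uses convolutions and batch normalization, so this is an illustration of the idea rather than the Keras implementation.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def residual_block(x, W1, W2):
    """y = relu(F(x) + x), where F is two dense layers (a stand-in for convs)."""
    f = W2 @ relu(W1 @ x)
    return relu(f + x)   # the skip connection adds the input back


x = np.array([1.0, 2.0, 3.0])

# With all-zero weights F(x) = 0, so the block simply passes x through.
zeros = np.zeros((3, 3))
y_identity = residual_block(x, zeros, zeros)
print(y_identity)  # [1. 2. 3.]
```

Because the block can fall back to passing its input through, stacking many such blocks tends to be easier to train than stacking plain layers.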

In [3]:
from keras.applications import resnet50

resnet_model = resnet50.ResNet50(weights='imagenet')
resnet_model.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_2[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 112, 112, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 112, 112, 64) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 55, 55, 64)   0           activation_1[0][0]               
__________________________________________________________________________________________________
res2a_branch2a (Conv2D)         (None, 55, 55, 64)   4160        max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
bn2a_branch2a (BatchNormalizati (None, 55, 55, 64)   256         res2a_branch2a[0][0]             
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 55, 55, 64)   0           bn2a_branch2a[0][0]              
__________________________________________________________________________________________________
res2a_branch2b (Conv2D)         (None, 55, 55, 64)   36928       activation_2[0][0]               
__________________________________________________________________________________________________
bn2a_branch2b (BatchNormalizati (None, 55, 55, 64)   256         res2a_branch2b[0][0]             
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 55, 55, 64)   0           bn2a_branch2b[0][0]              
__________________________________________________________________________________________________
res2a_branch2c (Conv2D)         (None, 55, 55, 256)  16640       activation_3[0][0]               
__________________________________________________________________________________________________
res2a_branch1 (Conv2D)          (None, 55, 55, 256)  16640       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
bn2a_branch2c (BatchNormalizati (None, 55, 55, 256)  1024        res2a_branch2c[0][0]             
__________________________________________________________________________________________________
bn2a_branch1 (BatchNormalizatio (None, 55, 55, 256)  1024        res2a_branch1[0][0]              
__________________________________________________________________________________________________
add_1 (Add)                     (None, 55, 55, 256)  0           bn2a_branch2c[0][0]              
                                                                 bn2a_branch1[0][0]               
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 55, 55, 256)  0           add_1[0][0]                      
__________________________________________________________________________________________________
res2b_branch2a (Conv2D)         (None, 55, 55, 64)   16448       activation_4[0][0]               
__________________________________________________________________________________________________
bn2b_branch2a (BatchNormalizati (None, 55, 55, 64)   256         res2b_branch2a[0][0]             
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 55, 55, 64)   0           bn2b_branch2a[0][0]              
__________________________________________________________________________________________________
res2b_branch2b (Conv2D)         (None, 55, 55, 64)   36928       activation_5[0][0]               
__________________________________________________________________________________________________
bn2b_branch2b (BatchNormalizati (None, 55, 55, 64)   256         res2b_branch2b[0][0]             
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 55, 55, 64)   0           bn2b_branch2b[0][0]              
__________________________________________________________________________________________________
res2b_branch2c (Conv2D)         (None, 55, 55, 256)  16640       activation_6[0][0]               
__________________________________________________________________________________________________
bn2b_branch2c (BatchNormalizati (None, 55, 55, 256)  1024        res2b_branch2c[0][0]             
__________________________________________________________________________________________________
add_2 (Add)                     (None, 55, 55, 256)  0           bn2b_branch2c[0][0]              
                                                                 activation_4[0][0]               
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 55, 55, 256)  0           add_2[0][0]                      
__________________________________________________________________________________________________
res2c_branch2a (Conv2D)         (None, 55, 55, 64)   16448       activation_7[0][0]               
__________________________________________________________________________________________________
bn2c_branch2a (BatchNormalizati (None, 55, 55, 64)   256         res2c_branch2a[0][0]             
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 55, 55, 64)   0           bn2c_branch2a[0][0]              
__________________________________________________________________________________________________
res2c_branch2b (Conv2D)         (None, 55, 55, 64)   36928       activation_8[0][0]               
__________________________________________________________________________________________________
bn2c_branch2b (BatchNormalizati (None, 55, 55, 64)   256         res2c_branch2b[0][0]             
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 55, 55, 64)   0           bn2c_branch2b[0][0]              
__________________________________________________________________________________________________
res2c_branch2c (Conv2D)         (None, 55, 55, 256)  16640       activation_9[0][0]               
__________________________________________________________________________________________________
bn2c_branch2c (BatchNormalizati (None, 55, 55, 256)  1024        res2c_branch2c[0][0]             
__________________________________________________________________________________________________
add_3 (Add)                     (None, 55, 55, 256)  0           bn2c_branch2c[0][0]              
                                                                 activation_7[0][0]               
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 55, 55, 256)  0           add_3[0][0]                      
__________________________________________________________________________________________________
res3a_branch2a (Conv2D)         (None, 28, 28, 128)  32896       activation_10[0][0]              
__________________________________________________________________________________________________
bn3a_branch2a (BatchNormalizati (None, 28, 28, 128)  512         res3a_branch2a[0][0]             
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 28, 28, 128)  0           bn3a_branch2a[0][0]              
__________________________________________________________________________________________________
res3a_branch2b (Conv2D)         (None, 28, 28, 128)  147584      activation_11[0][0]              
__________________________________________________________________________________________________
bn3a_branch2b (BatchNormalizati (None, 28, 28, 128)  512         res3a_branch2b[0][0]             
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 28, 28, 128)  0           bn3a_branch2b[0][0]              
__________________________________________________________________________________________________
res3a_branch2c (Conv2D)         (None, 28, 28, 512)  66048       activation_12[0][0]              
__________________________________________________________________________________________________
res3a_branch1 (Conv2D)          (None, 28, 28, 512)  131584      activation_10[0][0]              
__________________________________________________________________________________________________
bn3a_branch2c (BatchNormalizati (None, 28, 28, 512)  2048        res3a_branch2c[0][0]             
__________________________________________________________________________________________________
bn3a_branch1 (BatchNormalizatio (None, 28, 28, 512)  2048        res3a_branch1[0][0]              
__________________________________________________________________________________________________
add_4 (Add)                     (None, 28, 28, 512)  0           bn3a_branch2c[0][0]              
                                                                 bn3a_branch1[0][0]               
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 28, 28, 512)  0           add_4[0][0]                      
__________________________________________________________________________________________________
res3b_branch2a (Conv2D)         (None, 28, 28, 128)  65664       activation_13[0][0]              
__________________________________________________________________________________________________
bn3b_branch2a (BatchNormalizati (None, 28, 28, 128)  512         res3b_branch2a[0][0]             
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 28, 28, 128)  0           bn3b_branch2a[0][0]              
__________________________________________________________________________________________________
res3b_branch2b (Conv2D)         (None, 28, 28, 128)  147584      activation_14[0][0]              
__________________________________________________________________________________________________
bn3b_branch2b (BatchNormalizati (None, 28, 28, 128)  512         res3b_branch2b[0][0]             
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 28, 28, 128)  0           bn3b_branch2b[0][0]              
__________________________________________________________________________________________________
res3b_branch2c (Conv2D)         (None, 28, 28, 512)  66048       activation_15[0][0]              
__________________________________________________________________________________________________
bn3b_branch2c (BatchNormalizati (None, 28, 28, 512)  2048        res3b_branch2c[0][0]             
__________________________________________________________________________________________________
add_5 (Add)                     (None, 28, 28, 512)  0           bn3b_branch2c[0][0]              
                                                                 activation_13[0][0]              
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 28, 28, 512)  0           add_5[0][0]                      
__________________________________________________________________________________________________
res3c_branch2a (Conv2D)         (None, 28, 28, 128)  65664       activation_16[0][0]              
__________________________________________________________________________________________________
bn3c_branch2a (BatchNormalizati (None, 28, 28, 128)  512         res3c_branch2a[0][0]             
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 28, 28, 128)  0           bn3c_branch2a[0][0]              
__________________________________________________________________________________________________
res3c_branch2b (Conv2D)         (None, 28, 28, 128)  147584      activation_17[0][0]              
__________________________________________________________________________________________________
bn3c_branch2b (BatchNormalizati (None, 28, 28, 128)  512         res3c_branch2b[0][0]             
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 28, 28, 128)  0           bn3c_branch2b[0][0]              
__________________________________________________________________________________________________
res3c_branch2c (Conv2D)         (None, 28, 28, 512)  66048       activation_18[0][0]              
__________________________________________________________________________________________________
bn3c_branch2c (BatchNormalizati (None, 28, 28, 512)  2048        res3c_branch2c[0][0]             
__________________________________________________________________________________________________
add_6 (Add)                     (None, 28, 28, 512)  0           bn3c_branch2c[0][0]              
                                                                 activation_16[0][0]              
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 28, 28, 512)  0           add_6[0][0]                      
__________________________________________________________________________________________________
res3d_branch2a (Conv2D)         (None, 28, 28, 128)  65664       activation_19[0][0]              
__________________________________________________________________________________________________
bn3d_branch2a (BatchNormalizati (None, 28, 28, 128)  512         res3d_branch2a[0][0]             
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 28, 28, 128)  0           bn3d_branch2a[0][0]              
__________________________________________________________________________________________________
res3d_branch2b (Conv2D)         (None, 28, 28, 128)  147584      activation_20[0][0]              
__________________________________________________________________________________________________
bn3d_branch2b (BatchNormalizati (None, 28, 28, 128)  512         res3d_branch2b[0][0]             
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 28, 28, 128)  0           bn3d_branch2b[0][0]              
__________________________________________________________________________________________________
res3d_branch2c (Conv2D)         (None, 28, 28, 512)  66048       activation_21[0][0]              
__________________________________________________________________________________________________
bn3d_branch2c (BatchNormalizati (None, 28, 28, 512)  2048        res3d_branch2c[0][0]             
__________________________________________________________________________________________________
add_7 (Add)                     (None, 28, 28, 512)  0           bn3d_branch2c[0][0]              
                                                                 activation_19[0][0]              
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 28, 28, 512)  0           add_7[0][0]                      
__________________________________________________________________________________________________
res4a_branch2a (Conv2D)         (None, 14, 14, 256)  131328      activation_22[0][0]              
__________________________________________________________________________________________________
bn4a_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4a_branch2a[0][0]             
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 14, 14, 256)  0           bn4a_branch2a[0][0]              
__________________________________________________________________________________________________
res4a_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_23[0][0]              
__________________________________________________________________________________________________
bn4a_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4a_branch2b[0][0]             
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 14, 14, 256)  0           bn4a_branch2b[0][0]              
__________________________________________________________________________________________________
res4a_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_24[0][0]              
__________________________________________________________________________________________________
res4a_branch1 (Conv2D)          (None, 14, 14, 1024) 525312      activation_22[0][0]              
__________________________________________________________________________________________________
bn4a_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4a_branch2c[0][0]             
__________________________________________________________________________________________________
bn4a_branch1 (BatchNormalizatio (None, 14, 14, 1024) 4096        res4a_branch1[0][0]              
__________________________________________________________________________________________________
add_8 (Add)                     (None, 14, 14, 1024) 0           bn4a_branch2c[0][0]              
                                                                 bn4a_branch1[0][0]               
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 14, 14, 1024) 0           add_8[0][0]                      
__________________________________________________________________________________________________
res4b_branch2a (Conv2D)         (None, 14, 14, 256)  262400      activation_25[0][0]              
__________________________________________________________________________________________________
bn4b_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4b_branch2a[0][0]             
__________________________________________________________________________________________________
activation_26 (Activation)      (None, 14, 14, 256)  0           bn4b_branch2a[0][0]              
__________________________________________________________________________________________________
res4b_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_26[0][0]              
__________________________________________________________________________________________________
bn4b_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4b_branch2b[0][0]             
__________________________________________________________________________________________________
activation_27 (Activation)      (None, 14, 14, 256)  0           bn4b_branch2b[0][0]              
__________________________________________________________________________________________________
res4b_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_27[0][0]              
__________________________________________________________________________________________________
bn4b_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4b_branch2c[0][0]             
__________________________________________________________________________________________________
add_9 (Add)                     (None, 14, 14, 1024) 0           bn4b_branch2c[0][0]              
                                                                 activation_25[0][0]              
__________________________________________________________________________________________________
activation_28 (Activation)      (None, 14, 14, 1024) 0           add_9[0][0]                      
__________________________________________________________________________________________________
res4c_branch2a (Conv2D)         (None, 14, 14, 256)  262400      activation_28[0][0]              
__________________________________________________________________________________________________
bn4c_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4c_branch2a[0][0]             
__________________________________________________________________________________________________
activation_29 (Activation)      (None, 14, 14, 256)  0           bn4c_branch2a[0][0]              
__________________________________________________________________________________________________
res4c_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_29[0][0]              
__________________________________________________________________________________________________
bn4c_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4c_branch2b[0][0]             
__________________________________________________________________________________________________
activation_30 (Activation)      (None, 14, 14, 256)  0           bn4c_branch2b[0][0]              
__________________________________________________________________________________________________
res4c_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_30[0][0]              
__________________________________________________________________________________________________
bn4c_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4c_branch2c[0][0]             
__________________________________________________________________________________________________
add_10 (Add)                    (None, 14, 14, 1024) 0           bn4c_branch2c[0][0]              
                                                                 activation_28[0][0]              
__________________________________________________________________________________________________
activation_31 (Activation)      (None, 14, 14, 1024) 0           add_10[0][0]                     
__________________________________________________________________________________________________
res4d_branch2a (Conv2D)         (None, 14, 14, 256)  262400      activation_31[0][0]              
__________________________________________________________________________________________________
bn4d_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4d_branch2a[0][0]             
__________________________________________________________________________________________________
activation_32 (Activation)      (None, 14, 14, 256)  0           bn4d_branch2a[0][0]              
__________________________________________________________________________________________________
res4d_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_32[0][0]              
__________________________________________________________________________________________________
bn4d_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4d_branch2b[0][0]             
__________________________________________________________________________________________________
activation_33 (Activation)      (None, 14, 14, 256)  0           bn4d_branch2b[0][0]              
__________________________________________________________________________________________________
res4d_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_33[0][0]              
__________________________________________________________________________________________________
bn4d_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4d_branch2c[0][0]             
__________________________________________________________________________________________________
add_11 (Add)                    (None, 14, 14, 1024) 0           bn4d_branch2c[0][0]              
                                                                 activation_31[0][0]              
__________________________________________________________________________________________________
activation_34 (Activation)      (None, 14, 14, 1024) 0           add_11[0][0]                     
__________________________________________________________________________________________________
res4e_branch2a (Conv2D)         (None, 14, 14, 256)  262400      activation_34[0][0]              
__________________________________________________________________________________________________
bn4e_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4e_branch2a[0][0]             
__________________________________________________________________________________________________
activation_35 (Activation)      (None, 14, 14, 256)  0           bn4e_branch2a[0][0]              
__________________________________________________________________________________________________
res4e_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_35[0][0]              
__________________________________________________________________________________________________
bn4e_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4e_branch2b[0][0]             
__________________________________________________________________________________________________
activation_36 (Activation)      (None, 14, 14, 256)  0           bn4e_branch2b[0][0]              
__________________________________________________________________________________________________
res4e_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_36[0][0]              
__________________________________________________________________________________________________
bn4e_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4e_branch2c[0][0]             
__________________________________________________________________________________________________
add_12 (Add)                    (None, 14, 14, 1024) 0           bn4e_branch2c[0][0]              
                                                                 activation_34[0][0]              
__________________________________________________________________________________________________
activation_37 (Activation)      (None, 14, 14, 1024) 0           add_12[0][0]                     
__________________________________________________________________________________________________
res4f_branch2a (Conv2D)         (None, 14, 14, 256)  262400      activation_37[0][0]              
__________________________________________________________________________________________________
bn4f_branch2a (BatchNormalizati (None, 14, 14, 256)  1024        res4f_branch2a[0][0]             
__________________________________________________________________________________________________
activation_38 (Activation)      (None, 14, 14, 256)  0           bn4f_branch2a[0][0]              
__________________________________________________________________________________________________
res4f_branch2b (Conv2D)         (None, 14, 14, 256)  590080      activation_38[0][0]              
__________________________________________________________________________________________________
bn4f_branch2b (BatchNormalizati (None, 14, 14, 256)  1024        res4f_branch2b[0][0]             
__________________________________________________________________________________________________
activation_39 (Activation)      (None, 14, 14, 256)  0           bn4f_branch2b[0][0]              
__________________________________________________________________________________________________
res4f_branch2c (Conv2D)         (None, 14, 14, 1024) 263168      activation_39[0][0]              
__________________________________________________________________________________________________
bn4f_branch2c (BatchNormalizati (None, 14, 14, 1024) 4096        res4f_branch2c[0][0]             
__________________________________________________________________________________________________
add_13 (Add)                    (None, 14, 14, 1024) 0           bn4f_branch2c[0][0]              
                                                                 activation_37[0][0]              
__________________________________________________________________________________________________
activation_40 (Activation)      (None, 14, 14, 1024) 0           add_13[0][0]                     
__________________________________________________________________________________________________
res5a_branch2a (Conv2D)         (None, 7, 7, 512)    524800      activation_40[0][0]              
__________________________________________________________________________________________________
bn5a_branch2a (BatchNormalizati (None, 7, 7, 512)    2048        res5a_branch2a[0][0]             
__________________________________________________________________________________________________
activation_41 (Activation)      (None, 7, 7, 512)    0           bn5a_branch2a[0][0]              
__________________________________________________________________________________________________
res5a_branch2b (Conv2D)         (None, 7, 7, 512)    2359808     activation_41[0][0]              
__________________________________________________________________________________________________
bn5a_branch2b (BatchNormalizati (None, 7, 7, 512)    2048        res5a_branch2b[0][0]             
__________________________________________________________________________________________________
activation_42 (Activation)      (None, 7, 7, 512)    0           bn5a_branch2b[0][0]              
__________________________________________________________________________________________________
res5a_branch2c (Conv2D)         (None, 7, 7, 2048)   1050624     activation_42[0][0]              
__________________________________________________________________________________________________
res5a_branch1 (Conv2D)          (None, 7, 7, 2048)   2099200     activation_40[0][0]              
__________________________________________________________________________________________________
bn5a_branch2c (BatchNormalizati (None, 7, 7, 2048)   8192        res5a_branch2c[0][0]             
__________________________________________________________________________________________________
bn5a_branch1 (BatchNormalizatio (None, 7, 7, 2048)   8192        res5a_branch1[0][0]              
__________________________________________________________________________________________________
add_14 (Add)                    (None, 7, 7, 2048)   0           bn5a_branch2c[0][0]              
                                                                 bn5a_branch1[0][0]               
__________________________________________________________________________________________________
activation_43 (Activation)      (None, 7, 7, 2048)   0           add_14[0][0]                     
__________________________________________________________________________________________________
res5b_branch2a (Conv2D)         (None, 7, 7, 512)    1049088     activation_43[0][0]              
__________________________________________________________________________________________________
bn5b_branch2a (BatchNormalizati (None, 7, 7, 512)    2048        res5b_branch2a[0][0]             
__________________________________________________________________________________________________
activation_44 (Activation)      (None, 7, 7, 512)    0           bn5b_branch2a[0][0]              
__________________________________________________________________________________________________
res5b_branch2b (Conv2D)         (None, 7, 7, 512)    2359808     activation_44[0][0]              
__________________________________________________________________________________________________
bn5b_branch2b (BatchNormalizati (None, 7, 7, 512)    2048        res5b_branch2b[0][0]             
__________________________________________________________________________________________________
activation_45 (Activation)      (None, 7, 7, 512)    0           bn5b_branch2b[0][0]              
__________________________________________________________________________________________________
res5b_branch2c (Conv2D)         (None, 7, 7, 2048)   1050624     activation_45[0][0]              
__________________________________________________________________________________________________
bn5b_branch2c (BatchNormalizati (None, 7, 7, 2048)   8192        res5b_branch2c[0][0]             
__________________________________________________________________________________________________
add_15 (Add)                    (None, 7, 7, 2048)   0           bn5b_branch2c[0][0]              
                                                                 activation_43[0][0]              
__________________________________________________________________________________________________
activation_46 (Activation)      (None, 7, 7, 2048)   0           add_15[0][0]                     
__________________________________________________________________________________________________
res5c_branch2a (Conv2D)         (None, 7, 7, 512)    1049088     activation_46[0][0]              
__________________________________________________________________________________________________
bn5c_branch2a (BatchNormalizati (None, 7, 7, 512)    2048        res5c_branch2a[0][0]             
__________________________________________________________________________________________________
activation_47 (Activation)      (None, 7, 7, 512)    0           bn5c_branch2a[0][0]              
__________________________________________________________________________________________________
res5c_branch2b (Conv2D)         (None, 7, 7, 512)    2359808     activation_47[0][0]              
__________________________________________________________________________________________________
bn5c_branch2b (BatchNormalizati (None, 7, 7, 512)    2048        res5c_branch2b[0][0]             
__________________________________________________________________________________________________
activation_48 (Activation)      (None, 7, 7, 512)    0           bn5c_branch2b[0][0]              
__________________________________________________________________________________________________
res5c_branch2c (Conv2D)         (None, 7, 7, 2048)   1050624     activation_48[0][0]              
__________________________________________________________________________________________________
bn5c_branch2c (BatchNormalizati (None, 7, 7, 2048)   8192        res5c_branch2c[0][0]             
__________________________________________________________________________________________________
add_16 (Add)                    (None, 7, 7, 2048)   0           bn5c_branch2c[0][0]              
                                                                 activation_46[0][0]              
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 7, 7, 2048)   0           add_16[0][0]                     
__________________________________________________________________________________________________
avg_pool (AveragePooling2D)     (None, 1, 1, 2048)   0           activation_49[0][0]              
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 2048)         0           avg_pool[0][0]                   
__________________________________________________________________________________________________
fc1000 (Dense)                  (None, 1000)         2049000     flatten_1[0][0]                  
==================================================================================================
Total params: 25,636,712
Trainable params: 25,583,592
Non-trainable params: 53,120
__________________________________________________________________________________________________
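The Param # column in the summary above can be reproduced by hand. The following is a minimal sketch, assuming the usual ResNet50 bottleneck kernel sizes (1x1, then 3x3, then 1x1), which the summary itself does not print. The helper names are hypothetical and are not Keras functions.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (one kernel per input/output channel pair) plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


def dense_params(inputs, units):
    """One weight per input/unit pair plus one bias per unit."""
    return inputs * units + units


# Kernel sizes are assumed (1x1 -> 3x3 -> 1x1); channel counts come from the summary.
print(conv2d_params(1, 1, 512, 128))   # res3c_branch2a
print(conv2d_params(3, 3, 128, 128))   # res3c_branch2b
print(conv2d_params(1, 1, 128, 512))   # res3c_branch2c
print(batchnorm_params(128))           # bn3c_branch2a
print(dense_params(2048, 1000))        # fc1000
```

These evaluate to 65664, 147584, 66048, 512 and 2049000, matching the summary rows. Half of each BatchNormalization layer's parameters (the moving mean and variance) are not updated by gradient descent, which is where the non-trainable parameters come from.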

Creating Neural Networks using Keras

Finally, let's try something simpler.

Let's create a 1-layer network. With a single sigmoid output unit, it is equivalent to logistic regression (a linear model followed by a sigmoid).

In [4]:
# Reference: https://gist.github.com/fchollet/b7507f373a3446097f26840330c1c378
from keras.models import Sequential
from keras.layers import Dense

simple_model = Sequential()
simple_model.add(Dense(1, input_dim=4, activation='sigmoid')) # 4 inputs, 1 output
simple_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 1)                 5         
=================================================================
Total params: 5
Trainable params: 5
Non-trainable params: 0
_________________________________________________________________
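The 5 parameters reported for dense_1 are 4 weights plus 1 bias. As a rough sketch of what this single unit computes, here is the forward pass in plain Python. The weight, bias and input values are hypothetical, chosen only so the arithmetic is easy to follow; Keras initialises its own.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_sigmoid_unit(x, weights, bias):
    """Compute sigmoid(w . x + b) for one neuron."""
    z = sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
    return sigmoid(z)


# Hypothetical values: 4 weights + 1 bias = the 5 parameters in the summary.
weights = [0.5, -0.25, 0.1, 0.0]
bias = 0.0
x = [1.0, 2.0, 0.0, 3.0]  # one sample with 4 features

y = dense_sigmoid_unit(x, weights, bias)
print(y)  # z = 0.5 - 0.5 + 0 + 0 = 0, and sigmoid(0) = 0.5
```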
In [5]:
import keras  # binds the name `keras` for the help lookups in this and the next cells
keras.models.Sequential?
In [6]:
keras.layers.Dense?
In [7]:
keras.Model.compile?

How about a 2-layer network, with one hidden layer, as a first step towards a deep neural network?

In [8]:
deeper_model = Sequential()
deeper_model.add(Dense(256, input_dim=16, activation='relu'))
deeper_model.add(Dense(1, activation='sigmoid'))

deeper_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_2 (Dense)              (None, 256)               4352      
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 257       
=================================================================
Total params: 4,609
Trainable params: 4,609
Non-trainable params: 0
_________________________________________________________________
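The shapes and parameter counts above follow directly from the layer sizes. Below is a minimal plain-Python sketch of the same 16 -> 256 -> 1 forward pass. The random weights and the all-ones input are hypothetical stand-ins, and the `dense` helper is an illustration, not the Keras implementation.

```python
import math
import random


def dense(x, weights, biases, activation):
    """Fully-connected layer: one output per row of weights."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


random.seed(0)  # hypothetical random initialisation, for illustration only
n_in, n_hidden, n_out = 16, 256, 1

w1 = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[random.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [0.0] * n_out

x = [1.0] * n_in
hidden = dense(x, w1, b1, relu)          # 256 values, all >= 0
output = dense(hidden, w2, b2, sigmoid)  # 1 value in (0, 1)

params_layer1 = n_in * n_hidden + n_hidden  # 16*256 + 256 = 4352
params_layer2 = n_hidden * n_out + n_out    # 256*1 + 1 = 257
print(len(hidden), len(output), params_layer1 + params_layer2)
```

The printed sizes (256, 1) and total (4609) line up with the Output Shape and Param # columns of the summary.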

Visualizing Neural Net Architectures in Keras

https://keras.io/visualization/

In [9]:
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

model_to_dot?
In [10]:
SVG(model_to_dot(simple_model, show_shapes=True).create(prog='dot', format='svg'))
Out[10]:
(graph: dense_1: Dense, input (None, 4), output (None, 1))
In [11]:
SVG(model_to_dot(deeper_model, show_shapes=True).create(prog='dot', format='svg'))
Out[11]:
(graph: dense_2: Dense, input (None, 16), output (None, 256) -> dense_3: Dense, input (None, 256), output (None, 1))
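The rendered graph is a chain of boxes, one per layer, each labelled with its input and output shapes. To make that structure concrete, here is a hand-written sketch that builds comparable Graphviz DOT text for a list of layers. It is not how `model_to_dot` is implemented; `layers_to_dot` and the tuple format are hypothetical.

```python
def layers_to_dot(layers):
    """Build DOT text for a linear chain of (name, type, input_shape, output_shape)."""
    lines = ["digraph G {", "  node [shape=record];"]
    for i, (name, kind, in_shape, out_shape) in enumerate(layers):
        # Record label: layer name and type, then input/output shapes side by side.
        label = f"{name}: {kind}|{{input:|output:}}|{{{in_shape}|{out_shape}}}"
        lines.append(f'  n{i} [label="{label}"];')
        if i > 0:
            lines.append(f"  n{i - 1} -> n{i};")  # edge from the previous layer
    lines.append("}")
    return "\n".join(lines)


# Shapes copied from the deeper_model summary.
deeper_layers = [
    ("dense_2", "Dense", "(None, 16)", "(None, 256)"),
    ("dense_3", "Dense", "(None, 256)", "(None, 1)"),
]
dot_text = layers_to_dot(deeper_layers)
print(dot_text)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command-line tool.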
In [12]:
SVG(model_to_dot(mobilenet_model, show_shapes=True).create(prog='dot', format='svg'))
Out[12]:
(graph: MobileNet layer chain, starting at input_1: InputLayer (None, 224, 224, 3), then conv1_pad, conv1, conv1_bn, conv1_relu (None, 112, 112, 32), followed by blocks 1 to 13 of conv_pad_N: ZeroPadding2D, conv_dw_N: DepthwiseConv2D, conv_dw_N_bn, conv_dw_N_relu, conv_pw_N: Conv2D, conv_pw_N_bn, conv_pw_N_relu, with the feature map shrinking from (None, 112, 112, 64) after block 1 to (None, 7, 7, 1024) in block 13)
1024) 2687844974888->2687845333424 2687845959216 global_average_pooling2d_1: GlobalAveragePooling2D input: output: (None, 7, 7, 1024) (None, 1024) 2687845333424->2687845959216 2687846067672 reshape_1: Reshape input: output: (None, 1024) (None, 1, 1, 1024) 2687845959216->2687846067672 2687846871952 dropout: Dropout input: output: (None, 1, 1, 1024) (None, 1, 1, 1024) 2687846067672->2687846871952 2687847086904 conv_preds: Conv2D input: output: (None, 1, 1, 1024) (None, 1, 1, 1000) 2687846871952->2687847086904 2687846358880 act_softmax: Activation input: output: (None, 1, 1, 1000) (None, 1, 1, 1000) 2687847086904->2687846358880 2687847200528 reshape_2: Reshape input: output: (None, 1, 1, 1000) (None, 1000) 2687846358880->2687847200528
In [13]:
SVG(model_to_dot(resnet_model, show_shapes=True).create(prog='dot', format='svg'))
Out[13]:
[Figure: layer graph of resnet_model. input_2 (None, 224, 224, 3), conv1_pad, conv1, bn_conv1, activation_1, max_pooling2d_1, then residual blocks res2a to res2c, res3a to res3d, res4a to res4f and res5a to res5c (each block ends in an Add layer, add_1 to add_16), then avg_pool, flatten_1 and fc1000 with output shape (None, 1000)]

Troubleshooting: Graphviz Installation

If pydot is not able to find graphviz, you can try installing graphviz manually.

  1. Download and install graphviz binaries from: https://graphviz.gitlab.io/download/
  2. Add the path to graphviz to your PATH environment variable, e.g. C:/Program Files (x86)/Graphviz2.38/bin
  3. Launch a new Anaconda Prompt and re-run the Jupyter notebook.
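
Before re-running the notebook, it may help to confirm that the Graphviz `dot` executable is visible from Python. The following is a minimal sketch using only the standard library; the helper name `find_dot` is our own and is not part of pydot or Keras.

```python
import shutil

def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")

dot_path = find_dot()
if dot_path is None:
    print("dot not found: add the Graphviz bin folder to PATH and open a new prompt")
else:
    print("dot found at:", dot_path)
```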

Training a neural network

A neural network is trained using Stochastic Gradient Descent

  • Forward Propagation to compute the output at each layer
  • Back Propagation to compute gradients
  • Update weights and biases using gradients

Forward Propagation

For 1 neuron:

$$y = W'g(Wx + b) + b'$$
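
As a rough illustration of the formula above, here is a NumPy sketch of one forward pass. The names `W_out` and `b_out` stand in for $W'$ and $b'$, and the sigmoid choice for $g$ and all shapes are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, W, b, W_out, b_out, g=sigmoid):
    """Compute y = W' g(Wx + b) + b', with W_out and b_out playing the role of W' and b'."""
    return W_out @ g(W @ x + b) + b_out

x = np.array([1.0, 0.0])
W = np.zeros((3, 2))        # 2 inputs -> 3 hidden units
b = np.zeros(3)
W_out = np.ones((1, 3))     # 3 hidden units -> 1 output
b_out = np.array([0.5])

y = neuron_forward(x, W, b, W_out, b_out)
print(y)  # each hidden unit gives sigmoid(0) = 0.5, so y = 3 * 0.5 + 0.5 = [2.]
```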

Forward Propagation

2 layers of neurons:

$$x_1 = W_1'g(W_1x + b_1) + b_1'$$

$$y = x_2 = W_2'g(W_2x_1 + b_2) + b_2'$$

Forward Propagation

For layer $l$, single layer operation:

$$x_l = \sigma_l(W_lx_{l-1} + b_l)$$

where $\sigma_l(z) = W_l'g(z) + b_l'$

Feedforward through Layers

for $l = 1$ to $\,L$:

$\,\,\,\,x_l = \sigma_l(W_lx_{l-1} + b_l)$

Where:

  • Number of layers: $L$
  • Input: $x_0$, Output: $x_L$
  • Note: $x_l$ are tensors with the input & output dimensions of that layer
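
The loop above can be sketched in NumPy as follows. For brevity this sketch assumes each $\sigma_l$ is a plain activation function (that is, $W_l' = I$ and $b_l' = 0$); the layer sizes and the `feedforward` helper are illustrative only.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def feedforward(x0, layers):
    """Apply x_l = sigma_l(W_l x_{l-1} + b_l) for l = 1..L.

    `layers` is a list of (W, b, sigma) tuples; returns [x_0, x_1, ..., x_L].
    """
    xs = [x0]
    for W, b, sigma in layers:
        xs.append(sigma(W @ xs[-1] + b))
    return xs

rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(5, 4)), np.zeros(5), relu),  # layer 1: 4 -> 5
    (rng.normal(size=(3, 5)), np.zeros(3), relu),  # layer 2: 5 -> 3
]
xs = feedforward(np.ones(4), layers)
print([x.shape for x in xs])  # [(4,), (5,), (3,)]
```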

Backward Propagation

Objective

  • Compute the gradients of the cost function $J$ w.r.t. $W^j_l$ and $b^j_l$ (layer $l$, neuron $j$)
    • Partial derivatives $\frac{\partial J}{\partial W^j_l}$, $\frac{\partial J}{\partial b^j_l}$
  • E.g. quadratic cost function, $n$ training samples, output $x_L$: $$J(\{W_l\},\{b_l\}) = \frac{1}{2n}\sum_{i=1}^n {\|y^i - x_{L}^i\|}^2$$
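
To make the quadratic cost concrete, here is a small NumPy sketch. It assumes the targets and outputs are stored with one training sample per row; the function name `quadratic_cost` is ours.

```python
import numpy as np

def quadratic_cost(y, x_L):
    """J = 1/(2n) * sum_i ||y^i - x_L^i||^2, with one sample per row."""
    n = y.shape[0]
    return np.sum((y - x_L) ** 2) / (2 * n)

y = np.array([[1.0, 0.0], [0.0, 1.0]])    # targets for n = 2 samples
x_L = np.array([[0.5, 0.5], [0.0, 1.0]])  # network outputs

J = quadratic_cost(y, x_L)
print(J)  # (0.25 + 0.25 + 0 + 0) / (2 * 2) = 0.125
```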

Backward Propagation

  1. Feedforward from layer 1 to L
  2. Compute the output error vector at layer L ($\delta_L$)
  3. Propagate the error backward (from layer $L-1$ down to layer 1) to compute the per-layer error vectors ($\delta_l$)
  4. Compute gradient of cost function for layer $l$, neuron $j$: $$\frac{\partial J}{\partial W_l^j} = x_{l-1}^j\delta_l^j$$ $$\frac{\partial J}{\partial b_l^j} = \delta_l^j$$
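
The four steps can be sketched in NumPy for a small fully-connected network. This is an illustration under stated assumptions, not the notebook's own implementation: every layer uses a sigmoid activation, the cost is the quadratic cost for a single sample ($n = 1$), and the helper names are ours. A finite-difference check on one weight is included as a sanity test.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x0, Ws, bs):
    """Step 1: feedforward, keeping every activation x_0 .. x_L."""
    xs = [x0]
    for W, b in zip(Ws, bs):
        xs.append(sigmoid(W @ xs[-1] + b))
    return xs

def cost(x0, y, Ws, bs):
    """Quadratic cost for one sample: J = 0.5 * ||y - x_L||^2."""
    return 0.5 * np.sum((y - forward(x0, Ws, bs)[-1]) ** 2)

def backprop(x0, y, Ws, bs):
    xs = forward(x0, Ws, bs)
    L = len(Ws)
    grads_W, grads_b = [None] * L, [None] * L
    # Step 2: output error; for a sigmoid, g'(z) = x * (1 - x)
    delta = (xs[-1] - y) * xs[-1] * (1.0 - xs[-1])
    for l in range(L - 1, -1, -1):
        # Step 4: gradients for the layer that maps xs[l] to xs[l + 1]
        grads_W[l] = np.outer(delta, xs[l])
        grads_b[l] = delta
        if l > 0:
            # Step 3: propagate the error to the previous layer
            delta = (Ws[l].T @ delta) * xs[l] * (1.0 - xs[l])
    return grads_W, grads_b

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
bs = [rng.normal(size=3), rng.normal(size=1)]
x0, y = np.array([0.5, -1.0]), np.array([1.0])

grads_W, grads_b = backprop(x0, y, Ws, bs)

# Compare one analytic gradient entry with a central finite difference
eps = 1e-6
Ws[0][1, 0] += eps
J_plus = cost(x0, y, Ws, bs)
Ws[0][1, 0] -= 2 * eps
J_minus = cost(x0, y, Ws, bs)
Ws[0][1, 0] += eps  # restore the original weight
numeric = (J_plus - J_minus) / (2 * eps)
print("analytic:", grads_W[0][1, 0], "numeric:", numeric)
```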

Gradient Descent Update Rule

$$W_l^j := W_l^j - \epsilon \frac{\partial J}{\partial W_l^j}$$

$$b_l^j := b_l^j - \epsilon \frac{\partial J}{\partial b_l^j}$$

$\epsilon$ = learning rate
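
A gradient descent step moves each parameter a small distance against its gradient, so that the cost decreases. The sketch below applies that step to a toy cost $J(w) = \frac{1}{2}\|w\|^2$, whose gradient is $w$ itself; the toy cost and the `sgd_step` helper are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def sgd_step(params, grads, lr):
    """Return new parameters after one step against the gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

# Toy cost J(w) = 0.5 * ||w||^2, so dJ/dw = w
w = [np.array([1.0, -2.0])]
for _ in range(100):
    grads = [w[0]]
    w = sgd_step(w, grads, lr=0.1)

print(np.linalg.norm(w[0]))  # shrinks toward 0, the minimum of J
```

With a learning rate that is too large (here, above 2) the same loop would diverge instead, which is why $\epsilon$ usually needs tuning.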

Workshop: Neural Network for Logistic Regression

In this workshop, you'll train a neural network to perform logistic regression on the MNIST dataset.

Credits: https://medium.com/@the1ju/simple-logistic-regression-using-keras-249e0cc9a970

In [25]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as K
In [26]:
# Training settings
BATCH_SIZE = 128
NUM_CLASSES = 10
EPOCHS = 30

# Input size settings
IMG_ROWS = 28 # 28 pixels high
IMG_COLS = 28 # 28 pixels wide
In [27]:
# Import the dataset, split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
In [35]:
# Input processing
input_dim = IMG_ROWS * IMG_COLS
# Flatten each 28x28 image into a 784-element vector
X_train = X_train.reshape(X_train.shape[0], input_dim)
X_test = X_test.reshape(X_test.shape[0], input_dim)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255 # scale to between 0 and 1 (pixel: 0-255)
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
X_train shape: (60000, 784)
60000 train samples
10000 test samples
In [36]:
import matplotlib.pyplot as plt
# Reload the raw (unflattened) 28x28 images to display one sample
(X_train1, y_train1), (X_test1, y_test1) = mnist.load_data()
plt.imshow(X_train1[3, :, :], cmap=plt.cm.gray)
plt.show()
print(X_train1[3, :, :]/255)
[[0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.48627451 0.99215686 1.         0.24705882 0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.37647059
  0.95686275 0.98431373 0.99215686 0.24313725 0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.49803922
  0.98431373 0.98431373 0.99215686 0.24313725 0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.26666667 0.9254902
  0.98431373 0.82745098 0.12156863 0.03137255 0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.23529412 0.89411765 0.98431373
  0.98431373 0.36862745 0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.60784314 0.99215686 0.99215686
  0.74117647 0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.07843137 0.99215686 0.98431373 0.92156863
  0.25882353 0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.1254902  0.80392157 0.99215686 0.98431373 0.49411765
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.40784314 0.98431373 0.99215686 0.72156863 0.05882353
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.31372549 0.94117647 0.98431373 0.75686275 0.09019608 0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.1254902
  0.99215686 0.99215686 0.99215686 0.62352941 0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.59215686
  0.98431373 0.98431373 0.98431373 0.15294118 0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.18823529 0.86666667
  0.98431373 0.98431373 0.6745098  0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.91764706 0.98431373
  0.98431373 0.76862745 0.04705882 0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.99215686 0.98431373
  0.98431373 0.34901961 0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.62352941 1.         0.99215686
  0.99215686 0.12156863 0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.18823529 0.89411765 0.99215686 0.96862745
  0.54901961 0.03137255 0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.25098039 0.98431373 0.99215686 0.8627451
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.25098039 0.98431373 0.99215686 0.8627451
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.09411765 0.75686275 0.99215686 0.8627451
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]]
In [37]:
# We are doing multi-class classification
# Convert class vectors to binary class matrices
y_train_cat = keras.utils.to_categorical(y_train, NUM_CLASSES)
y_test_cat = keras.utils.to_categorical(y_test, NUM_CLASSES)

# Show what the classes look like
print('y_train shape:', y_train_cat.shape)
print('Second y_train sample:', y_train_cat[1])
y_train shape: (60000, 10)
Second y_train sample: [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
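
To see what the conversion to a binary class matrix does, here is a rough NumPy equivalent. It is a sketch, not the Keras implementation; `one_hot` and `sample_labels` are hypothetical names used only for illustration.

```python
import numpy as np

NUM_CLASSES = 10

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float matrix with a 1 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype='float32')
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Hypothetical labels for illustration (not taken from the dataset)
sample_labels = [0, 3, 9]
encoded = one_hot(sample_labels, NUM_CLASSES)
print(encoded)
```

A label of 0 becomes a row with a 1 in position 0 and zeros elsewhere, which is the pattern shown in the printed sample above.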
In [38]:
# Create the model
from keras.layers import Activation

model = Sequential() 
model.add(Dense(NUM_CLASSES, input_dim=input_dim))
model.add(Activation('softmax'))
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_6 (Dense)              (None, 10)                7850      
_________________________________________________________________
activation_52 (Activation)   (None, 10)                0         
=================================================================
Total params: 7,850
Trainable params: 7,850
Non-trainable params: 0
_________________________________________________________________
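
The 7,850 parameters in the summary can be checked by hand: one weight for each of the 784 inputs per output, plus one bias per output. The sketch below also runs the same kind of forward pass (a dense layer followed by softmax) in NumPy. The weights and inputs are random placeholders, so the probabilities themselves are meaningless; only the shapes and the arithmetic are the point.

```python
import numpy as np

input_dim = 28 * 28   # 784 inputs per image
num_classes = 10

# One weight per (input, output) pair, plus one bias per output
num_params = input_dim * num_classes + num_classes
print(num_params)  # 7850

def softmax(z):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical random weights and inputs, only to show the shapes involved
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(input_dim, num_classes))
b = np.zeros(num_classes)
x = rng.random((3, input_dim))        # batch of 3 flattened images

probs = softmax(x @ W + b)            # shape (3, 10), each row sums to 1
print(probs.shape, probs.sum(axis=1))
```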

Exercise: Training the Neural Network

  1. Compile the model
  2. Train the model using sgd, minibatch size 128
    • Training set: X_train, y_train_cat
    • Test set: X_test, y_test_cat
  3. Plot the learning curve, using the accuracy metrics
  4. Analyze the learning curve to determine if overfitting or underfitting occurred.
    • If overfitting occurred, at which epoch should training stop?
    • If underfitting occurred, train for more epochs to determine the optimum number of epochs

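How to get the accuracy metrics (note that metrics is an argument to compile, not fit):

model.compile(..., metrics=['accuracy'])
history = model.fit(...)
...
loss = history.history['loss']
val_loss = history.history['val_loss']
acc = history.history['acc']          # key names as printed in the training log
val_acc = history.history['val_acc']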

You may reference this example for steps 1 and 2: https://medium.com/@the1ju/simple-logistic-regression-using-keras-249e0cc9a970

In [39]:
# Compile and train model
# Your code here

model.compile(optimizer='sgd',
              loss='categorical_crossentropy', metrics=['accuracy']) 

history = model.fit(X_train, y_train_cat, batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    verbose=1,
                    validation_data=(X_test, y_test_cat)) 

score = model.evaluate(X_test, y_test_cat, verbose=0) 
print('Test score:', score[0]) 
print('Test accuracy:', score[1])

pred = model.predict(X_test)
Train on 60000 samples, validate on 10000 samples
Epoch 1/30
60000/60000 [==============================] - 3s 43us/step - loss: 1.2983 - acc: 0.6889 - val_loss: 0.8124 - val_acc: 0.8318
Epoch 2/30
60000/60000 [==============================] - 2s 27us/step - loss: 0.7178 - acc: 0.8397 - val_loss: 0.6065 - val_acc: 0.8625
Epoch 3/30
60000/60000 [==============================] - 2s 28us/step - loss: 0.5879 - acc: 0.8587 - val_loss: 0.5249 - val_acc: 0.8746
Epoch 4/30
60000/60000 [==============================] - 2s 30us/step - loss: 0.5260 - acc: 0.8691 - val_loss: 0.4793 - val_acc: 0.8813
Epoch 5/30
60000/60000 [==============================] - 3s 46us/step - loss: 0.4882 - acc: 0.8752 - val_loss: 0.4497 - val_acc: 0.8860
Epoch 6/30
60000/60000 [==============================] - 2s 41us/step - loss: 0.4623 - acc: 0.8793 - val_loss: 0.4283 - val_acc: 0.8915
Epoch 7/30
60000/60000 [==============================] - 2s 28us/step - loss: 0.4430 - acc: 0.8829 - val_loss: 0.4121 - val_acc: 0.8941
Epoch 8/30
60000/60000 [==============================] - 1s 24us/step - loss: 0.4281 - acc: 0.8862 - val_loss: 0.3991 - val_acc: 0.8959
Epoch 9/30
60000/60000 [==============================] - 2s 32us/step - loss: 0.4160 - acc: 0.8883 - val_loss: 0.3893 - val_acc: 0.8978
Epoch 10/30
60000/60000 [==============================] - 2s 37us/step - loss: 0.4060 - acc: 0.8903 - val_loss: 0.3802 - val_acc: 0.8994
Epoch 11/30
60000/60000 [==============================] - 2s 32us/step - loss: 0.3975 - acc: 0.8920 - val_loss: 0.3727 - val_acc: 0.9011
Epoch 12/30
60000/60000 [==============================] - 1s 22us/step - loss: 0.3902 - acc: 0.8933 - val_loss: 0.3665 - val_acc: 0.9016
Epoch 13/30
60000/60000 [==============================] - 1s 24us/step - loss: 0.3838 - acc: 0.8949 - val_loss: 0.3611 - val_acc: 0.9026
Epoch 14/30
60000/60000 [==============================] - 1s 18us/step - loss: 0.3782 - acc: 0.8964 - val_loss: 0.3558 - val_acc: 0.9038
Epoch 15/30
60000/60000 [==============================] - 1s 22us/step - loss: 0.3731 - acc: 0.8973 - val_loss: 0.3516 - val_acc: 0.9051
Epoch 16/30
60000/60000 [==============================] - 1s 23us/step - loss: 0.3686 - acc: 0.8982 - val_loss: 0.3476 - val_acc: 0.9064
Epoch 17/30
60000/60000 [==============================] - 1s 21us/step - loss: 0.3644 - acc: 0.8992 - val_loss: 0.3439 - val_acc: 0.9065
Epoch 18/30
60000/60000 [==============================] - 1s 22us/step - loss: 0.3607 - acc: 0.9000 - val_loss: 0.3408 - val_acc: 0.9073
Epoch 19/30
60000/60000 [==============================] - 1s 21us/step - loss: 0.3572 - acc: 0.9008 - val_loss: 0.3378 - val_acc: 0.9074
Epoch 20/30
60000/60000 [==============================] - 1s 18us/step - loss: 0.3540 - acc: 0.9015 - val_loss: 0.3350 - val_acc: 0.9080
Epoch 21/30
60000/60000 [==============================] - 1s 25us/step - loss: 0.3511 - acc: 0.9024 - val_loss: 0.3324 - val_acc: 0.9085
Epoch 22/30
60000/60000 [==============================] - 1s 18us/step - loss: 0.3483 - acc: 0.9030 - val_loss: 0.3302 - val_acc: 0.9091
Epoch 23/30
60000/60000 [==============================] - 1s 18us/step - loss: 0.3458 - acc: 0.9035 - val_loss: 0.3281 - val_acc: 0.9099
Epoch 24/30
60000/60000 [==============================] - 1s 20us/step - loss: 0.3435 - acc: 0.9040 - val_loss: 0.3261 - val_acc: 0.9104
Epoch 25/30
60000/60000 [==============================] - 1s 20us/step - loss: 0.3412 - acc: 0.9046 - val_loss: 0.3243 - val_acc: 0.9109
Epoch 26/30
60000/60000 [==============================] - 2s 25us/step - loss: 0.3391 - acc: 0.9052 - val_loss: 0.3224 - val_acc: 0.9109
Epoch 27/30
60000/60000 [==============================] - 1s 23us/step - loss: 0.3371 - acc: 0.9055 - val_loss: 0.3206 - val_acc: 0.9119
Epoch 28/30
60000/60000 [==============================] - 1s 22us/step - loss: 0.3353 - acc: 0.9063 - val_loss: 0.3191 - val_acc: 0.9117
Epoch 29/30
60000/60000 [==============================] - 1s 25us/step - loss: 0.3335 - acc: 0.9066 - val_loss: 0.3180 - val_acc: 0.9126
Epoch 30/30
60000/60000 [==============================] - 1s 19us/step - loss: 0.3318 - acc: 0.9073 - val_loss: 0.3162 - val_acc: 0.9127
Test score: 0.31617396712303164
Test accuracy: 0.9127
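
For step 4 of the exercise, one rough way to read the log above is to find the epoch with the lowest validation loss. If that is the final epoch, the validation loss was still falling when training stopped, which suggests more epochs could help. If it is an earlier epoch, later epochs may have been overfitting. The helpers `best_epoch` and `still_improving` are hypothetical names for this sketch, and the second list is made-up data.

```python
def best_epoch(val_loss):
    """Return the 1-based epoch number with the lowest validation loss."""
    best_index = min(range(len(val_loss)), key=lambda i: val_loss[i])
    return best_index + 1

def still_improving(val_loss):
    """True if the lowest validation loss occurs at the final epoch."""
    return best_epoch(val_loss) == len(val_loss)

# Validation losses for epochs 26-30, copied from the training log above
run_val_loss = [0.3224, 0.3206, 0.3191, 0.3180, 0.3162]
print(still_improving(run_val_loss))

# Hypothetical curve that bottoms out and then rises (a sign of overfitting)
overfit_val_loss = [0.80, 0.55, 0.42, 0.40, 0.43, 0.47]
print(best_epoch(overfit_val_loss))
```

In the notebook itself, the full list would come from `history.history['val_loss']` rather than being typed in by hand.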
In [40]:
pred = model.predict(X_test)
print(pred)

pred_classes = model.predict_classes(X_test)
print(pred_classes)
[[1.92300795e-04 3.05666418e-07 2.73727986e-04 ... 9.94899571e-01
  1.48842766e-04 2.02709646e-03]
 [7.04828463e-03 1.23849735e-04 9.05364752e-01 ... 6.24142373e-08
  5.18862577e-03 1.11505744e-06]
 [1.44006510e-04 9.55273390e-01 1.45111205e-02 ... 5.02551626e-03
  9.41167492e-03 1.76334998e-03]
 ...
 [1.89898765e-06 8.51190543e-06 7.51070256e-05 ... 4.82808892e-03
  1.33694988e-02 4.35379557e-02]
 [3.01771378e-03 4.32022521e-03 1.24727038e-03 ... 7.10607506e-04
  3.68826538e-01 1.30194030e-03]
 [2.76002334e-04 8.18013834e-09 8.10557511e-04 ... 1.97109102e-08
  4.32123943e-06 7.32404715e-07]]
[7 2 1 ... 4 5 6]
In [41]:
(X_train3, y_train3), (X_test3, y_test3) = mnist.load_data()

test = X_test3[:20]
plt.imshow(test[8], cmap=plt.cm.gray)
plt.show()

# preprocessing
test = test.reshape(test.shape[0], input_dim) 
test = test.astype('float32')
test /= 255

predicted_numbers = model.predict_classes(test)
predicted_prob = model.predict(test)

print('prediction', predicted_numbers)
print('truth', y_test3[:20])

print(predicted_prob[8])
print(predicted_numbers[8])
print(y_test3[8])
prediction [7 2 1 0 4 1 4 9 6 9 0 6 9 0 1 5 9 7 3 4]
truth [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4]
[8.9296000e-03 1.3875663e-04 3.4160007e-02 2.3498736e-05 1.7972823e-02
 8.6964490e-03 9.2203307e-01 2.3101331e-05 6.4728083e-03 1.5499224e-03]
6
5
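
The numbers printed above can be reproduced with plain NumPy, which helps show how class predictions, the loss, and accuracy relate. This is a sketch using only values copied from the output above. The predicted class is the argmax of the probability vector, and taking `np.argmax` directly is a common substitute where `predict_classes` is not available in the installed Keras version.

```python
import numpy as np

# Predicted probabilities for test sample 8, copied from the output above
probs_8 = np.array([8.9296000e-03, 1.3875663e-04, 3.4160007e-02, 2.3498736e-05,
                    1.7972823e-02, 8.6964490e-03, 9.2203307e-01, 2.3101331e-05,
                    6.4728083e-03, 1.5499224e-03])
true_label_8 = 5

# The predicted class is the index of the largest probability
predicted_8 = int(np.argmax(probs_8))
print(predicted_8)

# Cross-entropy for one sample: -log of the probability assigned to the true class
loss_8 = -np.log(probs_8[true_label_8])
print(loss_8)

# Accuracy over the 20 samples shown above
prediction = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4])
truth = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4])
accuracy = float(np.mean(prediction == truth))
print(accuracy)
```

Sample 8 is the only mismatch among the 20: the model puts about 92% of its probability on class 6 and less than 1% on the true class 5, so its individual loss is much larger than the average test loss.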

Keras Model API

https://keras.io/models/model/

Workflow:

  • Create model
  • compile: sets up the loss, the optimizer, and the metrics to record in the history
  • fit: trains the model. This returns a History object (the per-epoch values are in its history attribute, e.g. history.history['loss']). Optionally set up callbacks for Tensorboard, etc.
  • evaluate: computes the loss and metrics on a dataset
  • predict: performs a prediction

Cheatsheet: https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Keras_Cheat_Sheet_Python.pdf

In [ ]:
# Plot learning curve


# How to get the training metrics:
#
# model.compile(..., metrics=['accuracy'])
# history = model.fit(...)
# ...
# loss = history.history['loss']
# val_loss = history.history['val_loss']

print(history.history.keys())
print(history.history['val_loss'])

# Use matplotlib to plot 'val_loss' and 'loss' vs. number of epochs
# Your code here
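
One possible shape for the plotting code, as a sketch rather than the reference solution. To keep it self-contained, `history_dict` is a hypothetical stand-in for `history.history`, filled with the loss values of epochs 1-5 copied from the training log earlier in this notebook.

```python
import matplotlib.pyplot as plt

# Stand-in for history.history (values of epochs 1-5 from the training log)
history_dict = {
    'loss': [1.2983, 0.7178, 0.5879, 0.5260, 0.4882],
    'val_loss': [0.8124, 0.6065, 0.5249, 0.4793, 0.4497],
}

# Epochs are numbered from 1, matching the Keras log
epochs = range(1, len(history_dict['loss']) + 1)

fig, ax = plt.subplots()
ax.plot(epochs, history_dict['loss'], label='loss')
ax.plot(epochs, history_dict['val_loss'], label='val_loss')
ax.set_xlabel('epoch')
ax.set_ylabel('loss')
ax.set_title('Learning curve')
ax.legend()
```

In the notebook, replace `history_dict` with `history.history` and call `plt.show()` if the figure does not display automatically. The same pattern works for the accuracy keys printed by `history.history.keys()`.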

Reading List

Material | Read it for | URL
Lecture 1: Deep Learning Challenge. Is There Theory? | Intro to Deep Learning | https://stats385.github.io/lecture_slides (lecture 1)
Lecture 2: Overview of Deep Learning from a Practical Point of View | More background on Neural Nets | https://stats385.github.io/lecture_slides (lecture 2)
Neural Networks and Deep Learning, Chapter 2 | Understanding Back Propagation | http://neuralnetworksanddeeplearning.com/chap2.html
Guide to the Sequential Model | Basic usage of Keras for neural net training | https://keras.io/getting-started/sequential-model-guide/