this post mainly focuses on image recognition.
translation invariance
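a plain neural network will recognize typical images, where the object is large and centered, reasonably well. but what if our target is very small and sits in the top left corner?
in this case, a plain neural network is not so helpful, because it tends to learn an object only at the positions where it appeared during training.
what we want is a machine learning model that can recognize an object no matter where it is moved (or translated) in the image.
this property is called translation invariance.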
to achieve translation invariance, we introduce the convolutional layer.
a convolutional layer breaks the image into small overlapping tiles and passes each tile through the same small neural network, using the same weights for every tile. the output is a 2D array with one value per tile.
after that, we repeat the process with a different set of weights on the nodes. because the weights are different, the nodes look for a different pattern than the first time.
repeating this again and again, the 2D arrays stack into a 3D array. this 3D array is passed to the next layer of the neural network, which uses it to decide which patterns are most important.
there can be more than one convolutional layer.
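
to make the tiling idea concrete, here is a minimal sketch in plain python, not a real framework layer. the two 2x2 filters (`VERTICAL`, `HORIZONTAL`), the 4x4 images and the function names are all made up for illustration. it slides each filter over every overlapping tile, stacks the 2D results into a 3D array, and checks that the strongest response stays the same when the object moves.

```python
def convolve(image, kernel):
    """Slide `kernel` over every overlapping tile of `image` (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of one tile; the same weights are reused for every tile.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


def conv_layer(image, kernels):
    """Apply each kernel in turn and stack the 2D results into a 3D array."""
    return [convolve(image, k) for k in kernels]


# Hypothetical filters: one reacts to vertical edges, one to horizontal edges.
VERTICAL = [[1, -1], [1, -1]]
HORIZONTAL = [[1, 1], [-1, -1]]

# A bright 2x2 square in the top left corner...
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# ...and the same square moved one step down and to the right.
shifted = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

feature_maps = conv_layer(image, [VERTICAL, HORIZONTAL])
strongest = max(max(row) for row in convolve(image, VERTICAL))
strongest_shifted = max(max(row) for row in convolve(shifted, VERTICAL))
print(strongest, strongest_shifted)  # 2 2
```

the position of the strongest response moves with the square, but its value does not, which is the behaviour the convolutional layer is meant to give us.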
maxpooling
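the idea of maxpooling is to downsample the data by passing on only the most important bits.
for example, a 2x2 max pool over a 4x4 array keeps only the largest value in each 2x2 tile, so the output is a 2x2 array.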
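
a small sketch of 2x2 max pooling in plain python. the 4x4 numbers and the function name `max_pool` are invented for illustration, and the tiles here do not overlap, which is one common choice and an assumption on my part.

```python
def max_pool(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size tile."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            tile = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(tile))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 0, 7, 3],
]
pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```

the 16 input values shrink to 4, and each surviving value is the strongest signal from its tile.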
dropout
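dropout addresses the problem that a neural network tends to memorize its inputs instead of actually learning from them (overfitting).
during training, a dropout layer randomly throws away some of the data by cutting some of the connections, so the network cannot lean too heavily on any single signal.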
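
a rough sketch of what a dropout layer does to a list of values. the function name, the 0.5 rate and the fixed seed are my own choices for illustration. scaling the survivors up (often called inverted dropout) is one common way to do it, not something this post prescribes.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Zero each value with probability `rate` while training; pass through otherwise."""
    if not training or rate == 0:
        return list(activations)
    keep = 1.0 - rate
    # Survivors are scaled up so the expected total stays roughly the same.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
untouched = dropout(activations, rate=0.5, training=False)
```

with `training=False` nothing is dropped, since dropout is only meant to be active while the network is learning.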
convolutional block
convolutional layer + maxpooling layer + dropout layer = convolutional block
the convolutional block is a standard building block, and a network usually stacks several of them.
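
the three pieces can be chained as a rough plain python sketch. the helper names, the 5x5 input, the two filters and the 0.25 dropout rate are all assumptions for illustration. in practice a deep learning library would provide these layers.

```python
import random


def convolve(image, kernel):
    """One filter slid over every overlapping tile (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(len(image[0]) - kw + 1)
        ]
        for r in range(len(image) - kh + 1)
    ]


def max_pool(fmap, size=2):
    """Largest value of each non-overlapping size x size tile."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def dropout(fmap, rate, training, rng):
    """Randomly zero values while training; do nothing otherwise."""
    if not training:
        return fmap
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in fmap]


def conv_block(image, kernels, rate=0.25, training=True, rng=None):
    """Convolution -> max pooling -> dropout, applied once per kernel."""
    rng = rng or random.Random()
    block_output = []
    for kernel in kernels:
        fmap = convolve(image, kernel)
        fmap = max_pool(fmap)
        fmap = dropout(fmap, rate, training, rng)
        block_output.append(fmap)
    return block_output


# Hypothetical 5x5 input and two 2x2 filters, chosen only for illustration.
image = [[(r * 5 + c) % 3 for c in range(5)] for r in range(5)]
kernels = [[[1, -1], [1, -1]], [[1, 1], [-1, -1]]]
output = conv_block(image, kernels, training=False)
```

here the 5x5 image becomes a 4x4 map per filter after the convolution and a 2x2 map after pooling, so the block outputs a 2x2x2 array.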
image recognition models
- VGG
- ResNet-50
- Inception v3
- MobileNet
- NASNet
transfer learning
the essence of transfer learning is to reuse the convolutional layers of an already trained model and then add our own dense layer to train.
the convolutional layers have already learned to capture general patterns, and we don’t want to start from scratch, so we just feed those patterns into a new dense layer that learns what the new objects are.
training from scratch is like teaching a baby to read before it can recognize anything. reusing the convolutional layers is like asking an adult who already knows how to read and just telling them what the new thing is.
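
the same idea as a toy sketch in plain python, under heavy assumptions: `FROZEN_KERNELS` stands in for pretrained convolutional layers (a real project would load one of the models listed above through a library), and the "new dense layer" is a single logistic unit. every name and image here is invented for illustration. the point is only that the frozen part is never updated while the new layer is trained.

```python
import math

# Stand-in for pretrained convolutional layers: fixed 2x2 filters, never updated.
FROZEN_KERNELS = [[[1, -1], [1, -1]], [[1, 1], [-1, -1]]]


def extract_features(image):
    """Frozen part: strongest response of each fixed filter anywhere in the image."""
    features = []
    for kernel in FROZEN_KERNELS:
        best = max(
            sum(image[r + i][c + j] * kernel[i][j] for i in range(2) for j in range(2))
            for r in range(len(image) - 1)
            for c in range(len(image[0]) - 1)
        )
        features.append(best)
    return features


def predict(weights, bias, features):
    """New dense layer: weighted sum of the features squashed to a 0..1 score."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_dense(samples, labels, epochs=500, lr=0.5):
    """Trainable part: fit only the new layer's weights on the frozen features."""
    feats = [extract_features(img) for img in samples]
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            error = predict(weights, bias, f) - y  # gradient of the log loss
            weights = [w - lr * error * x for w, x in zip(weights, f)]
            bias -= lr * error
    return weights, bias


# Made-up training data: a vertical bar (label 1) and a horizontal bar (label 0).
vertical = [[0, 1, 0, 0] for _ in range(4)]
horizontal = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
weights, bias = train_dense([vertical, horizontal], [1, 0])

# Bars at positions the new layer never saw during training.
moved_vertical = [[0, 0, 1, 0] for _ in range(4)]
moved_horizontal = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
score_vertical = predict(weights, bias, extract_features(moved_vertical))
score_horizontal = predict(weights, bias, extract_features(moved_horizontal))
```

only two numbers per image reach the new layer, so training it is cheap compared with learning the filters themselves, which is the saving the adult-versus-baby analogy describes.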