Want to implement a deep neural network? An Excel spreadsheet is enough

Convolutional neural networks (CNNs) are widely used in image recognition, speech processing, and other fields, and they have been an important part of the rapid progress of artificial intelligence in recent years. For beginners, however, the principles can seem hard to grasp. In fact, the concept of "convolution" is not out of reach. In this article, Blake West shows how to implement a convolutional neural network in a spreadsheet such as Excel or Google Sheets.

Deep convolutional neural networks are not as scary as they sound. This article implements one in Google Sheets to prove it. Open the demo link, download it as an Excel spreadsheet, and then edit it to see how the different layers affect the model's final predictions.

Demo address: https://docs.google.com/spreadsheets/d/1SwfVctd4TjdN2S8BL09ktpQN_41sARYzD3NEHyr-8Z0/edit?usp=sharing

Below is a brief, high-level introduction to how convolutional neural networks work.

Before we get started, I want to mention FastAI. I recently completed their deep learning course, and all the inspiration and credit belong to them. Jeremy Howard is a great teacher, and he and his co-founder Rachel Thomas showed how to build a convolutional neural network in Excel. As far as I can tell, however, that spreadsheet is not available online, and I don't think it was a complete network. I made a small extension and put it in Google Sheets so that everyone can play with it.

How was it built?

I trained a very simple convolutional neural network on the MNIST dataset to predict which digit appears in an image of a handwritten digit. Each image is 28 x 28 pixels, and each pixel is represented by a value from 0 (blank) to 1 (dark). MNIST is a classic dataset that is often used to try out new techniques, such as the capsule networks of Geoffrey Hinton et al. The dataset is small, so training is fast, yet it contains enough data to show the complexity of machine learning. The model's job is to predict which digit is in the image. Each image clearly contains a single digit from 0 to 9.

An example from the MNIST dataset, 28 x 28 pixels in size. Note: I added conditional formatting so that pixels with larger values look redder.

I trained the model with Keras, a very well-known deep learning library, and then copied the trained weights from my model into the spreadsheet. The trained weights are just numbers. Putting them in a spreadsheet means copying them from the model and pasting them into cells. The final step is to add formulas to the spreadsheet that replicate what the model does, using ordinary addition, multiplication, and so on. Let me say it again: the math needed to reproduce a deep learning model's predictions is multiplication and addition.

Keras documentation: https://keras.io

The figure below shows the weights (parameters) of each layer of the model. Weights are what a machine learning model learns automatically. This model has roughly 1,000 weights. More complex models can easily have hundreds of millions of parameters. In the image below you can see all of the roughly 1,000 parameters of this model.

When should I use a convolutional neural network?

Convolutional neural networks work by looking for patterns in sequential data that may be difficult to put into words or to capture with simple rules. A convolutional neural network assumes that the order of the data matters.

For example, image classification is one of the main uses of convolutional neural networks, because pixels are logically ordered and it is obvious to any human that images are full of patterns. Yet as soon as you try to describe in words exactly what distinguishes a cat from a chihuahua, you see why convolutional neural networks are useful: we recognize such patterns through a subconscious perceptual process rather than through explicit rules.

On the other hand, if you collect all kinds of statistics about two baseball teams and hope to predict the outcome of a game between them, a convolutional neural network is a poor choice. The data you have (such as the number of wins and losses or the team batting average) is not inherently sequential. The order does not really matter here, and we have already extracted the features that we think are useful. In this case, convolutional neural networks are not a good fit.

Understanding CNNs

To understand how a CNN works, we break "deep convolutional neural network" into three parts, "depth", "convolution", and "neural network", and explain each one separately.

Convolution

Imagine that your eyes are closed and you are trying to identify the digit in a handwritten image. You can talk to someone who is looking at the image, but that person does not know what numbers are. So you can only ask simple questions. What would you ask?

Some workable questions are, for example, "Is it mostly straight at the top?" or "Does it have a diagonal line?" After asking enough questions, you can make a good guess as to whether the digit is a 7, a 2, or something else.

Intuitively, this is how convolution works. The computer is blind, so in its own way it asks a lot of questions about small local patterns.

Each number in box 1 is multiplied by the corresponding number in box 2, and the products are then added up to give the number in box 3. This is the convolution operation.
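The multiply-then-sum step in the caption above can be sketched in a few lines of plain Python. This is only an illustration: the 3 x 3 size and all the values in `patch` and `kernel` are made up rather than taken from the spreadsheet, and `convolve_patch` is a hypothetical helper name. It plays roughly the same role as a SUMPRODUCT formula over two cell ranges.

```python
# Hypothetical 3x3 image patch (box 1) and 3x3 kernel (box 2); values are made up.
patch = [
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
]
kernel = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]


def convolve_patch(patch, kernel):
    """Multiply matching cells and add up all the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


result = convolve_patch(patch, kernel)  # the single number that lands in box 3
print(result)  # 3.0
```

Nothing beyond multiplication and addition is involved, which is why the same calculation fits in a single spreadsheet cell.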

To ask these questions, a function (the convolution operation) is run at every pixel of the image, producing a corresponding output value that answers a question about the small region around that pixel. Convolution operations use convolution kernels to find patterns. For example, the numbers in box 2 above are larger (darker) on the right and smaller (lighter) on the left. This kernel therefore looks for vertical edges where the image changes from light on the left to dark on the right (referred to here as a left edge).

This may not seem intuitive at first, but if you play with the values in the spreadsheet you can see how a convolution kernel works. A convolution kernel looks for a pattern similar to itself. A CNN usually has hundreds of convolution kernels, so it can capture many different types of patterns at every pixel, such as left edges, top edges, diagonals, corners, and so on.
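To see a kernel "looking for a pattern similar to itself", here is a minimal sketch that slides one kernel over a whole image. The 5 x 5 image, the kernel values, and the function name `convolve2d` are all assumptions made for illustration. The kernel is only applied where it fits entirely inside the image (no padding), and, as is usual in deep learning code, it is not flipped before being applied.

```python
def convolve2d(image, kernel):
    """Slide the kernel over every position where it fits entirely (no padding)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x5 image: light (0) in the two left columns, dark (1) elsewhere.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Negative on the left, positive on the right, so the kernel responds
# to light-to-dark transitions (the "left edge" described above).
left_edge_kernel = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

feature_map = convolve2d(image, left_edge_kernel)
for row in feature_map:
    print(row)  # each row is [3.0, 3.0, 0.0]
```

The output is large where the window straddles the light-to-dark boundary and zero where the window is uniformly dark, so the output can be read as a map of where the kernel's pattern occurs.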

Depth

Edges are very basic building blocks. How do you deal with more complex shapes? This is where "depth", the use of multiple layers, comes in. After the first convolutional layer, we already have maps showing where simple patterns such as "left edge" and "top edge" occur in the image. Now we add more layers: we convolve again over all the pattern maps obtained earlier and combine them. For example, combining 50% left edge with 50% top edge can give you a top-left corner. Pretty powerful, isn't it?

The second convolutional layer takes the output pixels of the previous convolutional layer as its input and convolves them with its own convolution kernels. As before, we get a new output value for each pixel of the second convolutional layer.
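A rough sketch of what such a second layer does is shown below, under simplifying assumptions: there are only two first-layer maps, their values are made up, and the second-layer kernels are hand-picked (not learned) so that a left edge and a top edge each contribute half of the response. The name `conv_multi_channel` is hypothetical.

```python
def conv_multi_channel(feature_maps, kernels):
    """Convolve each input map with its own kernel and add the results together."""
    k_h, k_w = len(kernels[0]), len(kernels[0][0])
    out_h = len(feature_maps[0]) - k_h + 1
    out_w = len(feature_maps[0][0]) - k_w + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for fmap, kernel in zip(feature_maps, kernels):
        for top in range(out_h):
            for left in range(out_w):
                for i in range(k_h):
                    for j in range(k_w):
                        output[top][left] += fmap[top + i][left + j] * kernel[i][j]
    return output


# Hypothetical first-layer outputs (4x4): where a left edge / top edge was found.
left_edge_map = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
top_edge_map = [
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Hand-picked second-layer kernels: half of the total weight looks for a left
# edge running downward, the other half for a top edge running to the right.
kernel_for_left_edges = [
    [0.00, 0.0, 0.0],
    [0.25, 0.0, 0.0],
    [0.25, 0.0, 0.0],
]
kernel_for_top_edges = [
    [0.0, 0.25, 0.25],
    [0.0, 0.00, 0.00],
    [0.0, 0.00, 0.00],
]

corner_map = conv_multi_channel(
    [left_edge_map, top_edge_map],
    [kernel_for_left_edges, kernel_for_top_edges],
)
print(corner_map)  # [[1.0, 0.5], [0.5, 0.0]]
```

The response is strongest (1.0) at the position where both edges meet, weaker (0.5) where only one of them is present, and zero where neither is, which is how stacking layers turns simple patterns into more complex ones.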

CNNs used in practice have many layers, which lets the network build up increasingly abstract and complex shapes. Even after only 4 to 5 convolutional layers, the model can begin to pick out key features of objects such as faces and animals.

Neural Networks

Now you may be asking yourself, "That's all well and good, but a pile of convolution results is not much use by itself. How can we combine the results of all these convolution kernels into something meaningful?"

At a high level, our convolutional network can be divided into two parts. The first part is the convolutions, which find useful features in the image data.

The second part is the "dense layers" toward the end of the spreadsheet (that is, the fully connected layers, so named because each neuron here has a lot of parameters), which are used for classification. Once you have these features, the job of the dense layers is to compute a set of linear regressions for each possible digit and give each one a score. The digit with the highest score is the model's final prediction.

Matrix 1 is our convolution output. Each pixel in matrix 1 is multiplied by the corresponding number in matrix 2, and the products are summed to produce the number in box 3. The same operation is then repeated for the matrices in the green boxes. We end up with 8 outputs, which are called "neurons" in deep learning.
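The scoring step can be sketched as follows. This is a simplified illustration, not the spreadsheet's actual layer: the sizes are shrunk to 4 input values and 3 candidate digits, all numbers are made up, bias (intercept) terms are left out for brevity, and `dense` is a hypothetical helper name.

```python
def dense(inputs, weights):
    """Each output is a weighted sum of all inputs (one row of weights per output)."""
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]


# Hypothetical flattened convolution output (4 values instead of a full matrix).
features = [1.0, 0.0, 0.5, 0.0]

# Hypothetical weights: one row per candidate digit (only 3 digits here, not 10).
weights = [
    [0.2, 0.9, 0.0, 0.1],  # weights used to score digit 0
    [1.0, 0.0, 1.0, 0.0],  # weights used to score digit 1
    [0.0, 0.5, 0.4, 0.3],  # weights used to score digit 2
]

scores = dense(features, weights)

# The prediction is the digit whose score is highest.
prediction = max(range(len(scores)), key=lambda digit: scores[digit])
print(prediction)  # 1
```

Each row of `weights` acts like one small linear regression over the same features, and picking the largest score corresponds to taking the maximum over a column of cells in a spreadsheet.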

Finding the right weights for all of these convolution kernels and dense layers by hand would be extremely tedious. Fortunately, updating the weights automatically is part of how a neural network is trained, so we don't have to worry about it. If you are curious, look up "backpropagation" (see: Abandoned by Geoffrey Hinton, why is backpropagation questioned? (with BP derivation)).
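To give a feel for what "automatically updating weights" means, here is a deliberately tiny sketch: gradient descent on a single weight with made-up data. It is not full backpropagation through a CNN, which applies the same idea to every weight in every layer, and the data, starting value, and learning rate are all assumptions chosen for illustration.

```python
# Toy data following y = 2 * x, so the "right" weight is 2.0.
inputs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]

weight = 0.0          # start from an arbitrary guess
learning_rate = 0.05  # hypothetical step size

for step in range(100):
    # Gradient of the mean squared error with respect to the weight.
    gradient = sum(
        2 * (weight * x - y) * x for x, y in zip(inputs, targets)
    ) / len(inputs)
    # Nudge the weight in the direction that reduces the error.
    weight -= learning_rate * gradient

print(round(weight, 3))  # 2.0
```

Each pass measures how the error changes as the weight changes and moves the weight a small step in the direction that lowers the error, so nobody has to pick the value by hand.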

To sum up

Every convolutional neural network consists of two parts: the convolutions, which always come first and find useful features in the image; and the layers at the end, often called "dense" layers, which classify things based on those features.

To get a clear understanding of these concepts, I recommend experimenting with the spreadsheet yourself.

There, you can trace a pixel from start to finish, watching how it passes through the filters and what happens to it in the end. I also added more technical details in the comments in the spreadsheet.
