Neural networks — summarized by beginners, for beginners

Put in simple terms, a neural network is a kind of statistical model used in computer science to approximate a function. There are many different types of neural networks, each able to learn a different kind of task depending on its structure, how it is trained, the data it is given, and certain other parameters, such as the learning rate.

Some compare neural networks and machine learning to the new electricity or the new fire of our generation, because they bring unprecedented utility and efficiency to our economy and our day-to-day lives; others, however, see them as a threat to humanity, since anything that exceeds us in intelligence could be dangerous and could even destroy human civilization if left in the wrong hands. Either way, neural networks are a powerful creation, and this article will give you an introduction to the world of deep learning and how it works.

Due to their versatility, neural networks are used in all sorts of applications, ranging from self-driving cars to chess bots. These applications will be discussed in further detail later; before that, let’s take a look at how neural networks actually work.

Neural networks are a method of implementing machine learning: they learn from a set of data to perform a specific task. Usually, the dataset must contain at least a few thousand training examples for the network to learn the task properly. For example, if a neural network were made to recognize handwriting, many labeled images of human handwriting would have to be prepared beforehand in order to train it to recognize handwritten symbols.

Based loosely on the human brain, a neural network is made of many processing nodes that are densely interconnected through “weights.” Most neural networks are “feed-forward” networks: they are made of layers of nodes, where nodes in adjacent layers are connected but nodes within the same layer usually are not, so information flows through the network in one direction. Any node might be connected to several nodes in the layer before it, from which it receives data, and to several nodes in the layer after it, to which it sends data.

For every connection between nodes, a value known as a “weight” is assigned, and it is used whenever data passes from one node to the other. Generally, a neuron “fires” a value into the neurons it connects to only if the value computed at that node exceeds a certain threshold. The weights play an important role in that computation (and hence in whether or not the neuron fires): the data from each incoming node is multiplied by the corresponding weight, these products are summed, and if the sum exceeds the threshold, the neuron fires; otherwise, it sends no value to the nodes in the next layer.
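
To make the arithmetic concrete, here is a minimal sketch of a single artificial neuron in Python; the input values, weights, and threshold are made up purely for illustration:

```python
import numpy as np

def neuron_fires(inputs, weights, threshold):
    """Weighted sum of the inputs; the neuron 'fires' only above the threshold."""
    total = np.dot(inputs, weights)  # multiply each input by its weight, then sum
    return total if total > threshold else 0.0  # below the threshold: no output

# Hypothetical values: three incoming nodes and one threshold
inputs = np.array([0.5, 0.9, 0.1])
weights = np.array([0.4, 0.7, -0.2])
print(neuron_fires(inputs, weights, threshold=0.5))  # 0.81 > 0.5, so it fires
```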

When a neural net is being trained, all of its weights and thresholds are initially set to random values. Training data is fed into the input layer and then passed forward through the layers of the network, being multiplied and summed along the way, until the network produces an output value. The network is trained, through the adjustment and fine-tuning of its weights, until the error in its output reaches a certain minimum. The techniques used to train neural networks are discussed in later sections.
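
As a rough illustration of that forward flow, the sketch below pushes one input through a tiny two-layer network with randomly initialized weights. The layer sizes are arbitrary, and a simple ReLU cutoff stands in for the firing rule described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights start out random, exactly as described above (the sizes are made up).
W1 = rng.normal(size=(4, 3))  # input layer (3 nodes) -> hidden layer (4 nodes)
W2 = rng.normal(size=(1, 4))  # hidden layer (4 nodes) -> output layer (1 node)

def forward(x):
    hidden = np.maximum(0.0, W1 @ x)  # multiply, sum, apply a simple cutoff (ReLU)
    return W2 @ hidden                # the final weighted sum is the network's output

print(forward(np.array([0.2, -0.5, 1.0])))  # meaningless until trained
```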

There are many different types of neural networks, with different structures and methods of learning, that are specifically designed to perform certain tasks. The details will not be touched upon in depth in this article but will be briefly mentioned.

CNN: a convolutional neural network excels at recognizing patterns in images and is often used in optical character recognition (OCR), face recognition, and many other technologies. Rather than processing the image as a whole, it trains small parts of the network to identify patterns in small regions of the image and then uses those small patterns to construct larger objects; for example, it might learn the small strokes, curves, and lines that define a number.
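
The sketch below shows the core convolution operation on a toy image. In a real CNN the filters are learned during training; here a vertical-edge filter is written by hand purely for illustration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a small filter over the image; large responses mark matching patterns."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A toy 5x5 "image" with a bright vertical stroke down the middle
image = np.zeros((5, 5))
image[:, 2] = 1.0
edge_filter = np.array([[-1.0, 0.0, 1.0]] * 3)  # hand-made vertical-edge detector
print(convolve2d(image, edge_filter))  # strongest responses flank the stroke
```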

RNN: a recurrent neural network is one in which information is passed not only to the next layer but also to future versions of the current node or layer. This gives the node a “memory,” letting it recall previous information to aid its predictions on the current piece of data. Biological brains don’t simply start thinking from scratch at every instant: as humans read a text, their understanding of each word depends on their understanding of the previous words; they carry a “train of thought” that relies on information already built up. Traditional, or naive, neural networks have no such “memory,” which can be a major drawback when attempting to understand language, or any sequential data where the order of the information matters greatly. Recurrent neural networks address this issue with “loops” inside the network: certain nodes output their information both to the next node and back to themselves, so the understanding they have produced so far can be used in future calculations.
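
A minimal sketch of that loop is shown below: at each time step, the new hidden state (the network’s “memory”) is computed from both the current input and the previous hidden state. All sizes and weights here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
Wx = rng.normal(scale=0.5, size=(4, 3))  # input -> hidden weights (sizes are made up)
Wh = rng.normal(scale=0.5, size=(4, 4))  # hidden -> hidden: the "loop" back to itself
b = np.zeros(4)

def rnn_step(x, h):
    """One time step: the new memory mixes the current input with the old memory."""
    return np.tanh(Wx @ x + Wh @ h + b)

h = np.zeros(4)                    # the network starts with an empty memory
for x in rng.normal(size=(5, 3)):  # a toy sequence of five inputs
    h = rnn_step(x, h)             # h now carries information from earlier steps
print(h)
```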

However, there is still a flaw in RNNs: the network is unable to “remember” things for very long; the further in the past an output was created, the less likely the network is to form a connection between that output and its current input. Hence, recurrent neural networks were further improved to avoid this long-term memory loss: they were given objects called “gates,” which, in simple terms, learn what information might be relevant in future situations and pass that relevant information along a chain of sequences to make future predictions. For example, a human watching a movie heavy in symbolism would have no trouble connecting a symbol displayed early in the movie with the reoccurrence of the same symbol much later on. Long short-term memory (LSTM) layers in neural networks achieve the same purpose, in that they remember previously produced knowledge. Having neural networks that learn to remember important information, such as previous trends in stock prices, is essential when attempting to predict changes in the stock market.
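
The sketch below shows one step of a standard LSTM cell, with the forget, input, and output gates described above; bias terms are omitted and the weights are random, purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n_in, n_h = 3, 4  # made-up sizes
W = {g: rng.normal(scale=0.5, size=(n_h, n_in + n_h)) for g in "fioc"}

def lstm_step(x, h, c):
    """One LSTM step: the gates decide what to forget, what to store, what to reveal."""
    z = np.concatenate([x, h])
    f = sigmoid(W["f"] @ z)              # forget gate: which old memories to keep
    i = sigmoid(W["i"] @ z)              # input gate: which new information to store
    o = sigmoid(W["o"] @ z)              # output gate: how much memory to reveal
    c = f * c + i * np.tanh(W["c"] @ z)  # long-term cell state, edited by the gates
    return o * np.tanh(c), c

h, c = np.zeros(n_h), np.zeros(n_h)
for x in rng.normal(size=(6, n_in)):     # a toy sequence of six inputs
    h, c = lstm_step(x, h, c)
print(h)
```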

There are still many more different types of neural networks out there, and the possibilities are truly endless.

Supervised versus Unsupervised Learning

  • Supervised learning is when the data is processed, curated, and labeled before being presented to the network for training. This would be like labeling a bunch of images that you want the neural network to classify.
  • Unsupervised learning is when the data is given to the neural network without being processed or labeled; sometimes even humans don’t know the answer to the problem. Here the neural network takes a set of data and tries to learn the patterns in it all by itself. This could be used to recommend movies based on similar ones someone has watched and their viewing history, or to detect anomalies such as suspicious credit card transactions (a minimal sketch follows below).
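
As a minimal sketch of unsupervised learning, the k-means clustering loop below discovers two groups in unlabeled data; in supervised learning, by contrast, the correct group for each point would be provided up front:

```python
import numpy as np

rng = np.random.default_rng(3)
# Unlabeled data: two blobs of points, but the program is never told which is which.
data = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

centers = data[rng.choice(len(data), 2, replace=False)]  # start with random guesses
for _ in range(10):                                      # a few k-means iterations
    labels = np.argmin(((data[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([data[labels == k].mean(axis=0) for k in range(2)])

print(centers)  # the two discovered groups, found without any human-provided labels
```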

Reinforcement Learning

  • Reinforcement learning is where the neural network sits in an environment where there is no “data” to learn from; instead, its actions earn rewards, such as in a video game. In this environment, it learns to choose its actions so as to obtain the maximum possible reward (see the sketch below).
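
Here is a minimal sketch of that reward-driven idea using tabular Q-learning (a table of value estimates rather than a neural network) on a made-up five-square game:

```python
import numpy as np

# A toy "game": 5 squares in a row; reaching square 4 earns a reward of +1.
# Actions: 0 = step left, 1 = step right. No dataset, only trial and error.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))    # the agent's value estimates
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate
rng = np.random.default_rng(4)

for episode in range(200):
    s = 0
    while s != 4:
        # Mostly exploit the best known action, sometimes explore a random one.
        a = rng.integers(n_actions) if rng.random() < epsilon else np.argmax(Q[s])
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else 0.0  # reward arrives only at the goal
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.argmax(Q, axis=1))  # best action per square: right (1) except at the goal
```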

Supervised neural networks (the main focus of this article) learn by minimizing the error of their predictions. They measure this error by comparing their output for a specific piece of data to the labeled, correct output for that data, and they attribute the error to each and every weight in the network through a method called “backpropagation,” which works out how much each weight contributes to the total error. The network then slightly adjusts its weights to reduce the error. Over the course of training, a neural network does this many, many times, finally reaching a minimum point where the error cannot be further reduced. This method is known as gradient descent.
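
The sketch below shows this loop for the simplest possible “network,” a single neuron trained by gradient descent on a tiny made-up labeled dataset; the gradient computed from the error is the one-neuron case of backpropagation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny made-up labeled dataset: inputs and their "correct" outputs (labels).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])    # an AND-like task

w, b = np.zeros(2), 0.0
for step in range(2000):
    pred = sigmoid(X @ w + b)         # forward pass: the network's predictions
    error = pred - y                  # compare predictions to the labels
    w -= 0.5 * (X.T @ error) / len(y) # adjust each weight by its share of the error
    b -= 0.5 * error.mean()           # (0.5 is the learning rate)

print(np.round(sigmoid(X @ w + b)))   # -> [0. 0. 0. 1.]
```

Real networks repeat the same compare-and-adjust loop, just with many more weights spread across many layers.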

As mentioned in the introduction, neural networks are very versatile, and now that you have a deeper understanding of how they work, it’s time to explore some applications in further detail.

Perhaps the most popular recent technological innovation is the self-driving car. Many different components work together to make a self-driving car function properly. The car’s ability to see clearly is governed by its sensors; however, to make sense of that information, the program needs to be able to identify what it sees. This process is known as perception. While one might think that neural networks (specifically, in this case, deep neural networks) are only involved in perception, in actual applications the network is also involved in decision-making. Instead of the program deciding to go at a green light because that was explicitly stated in the code, the program learns to make the association between the green light and “go” on its own.

Given the large differences in the types of input such a program receives, it is clear why multiple deep neural networks are used. Some make decisions about road routes; others handle stop signs and pedestrians; still others identify intersections and other road signs. Sometimes the same task is carried out by multiple networks. This overlap is intentional: even if one network were to fail, the chances of complete failure are minimized. In this context, the deep neural networks can be divided into two main categories: pathfinding networks and object-classification networks.

There are very interesting applications of neural networks in games like chess and Go, and even in multiplayer online strategy games such as DOTA 2 and Starcraft 2.

Starcraft 2 is a real-time strategy game that requires extreme mechanical skill and advanced strategy, and involves imperfect information, as only a small portion of the map is visible at any time.

DeepMind, a company based in the UK, created AlphaStar, which was able to defeat top-tier players in the game.

DOTA 2 is another multiplayer online strategy game, known for the long-term strategic planning it involves.

In 2017, a bot created by OpenAI was able to defeat a world-champion DOTA 2 player in a 1v1 game with a single hero.

In 2019, the team’s full 5v5 system, OpenAI Five, challenged the reigning world champions. In game 1, the AI predicted its win probability to be 67%. At first, it seemed the AI was falling behind and playing aimlessly; it had a very aggressive playing style. As the round progressed, the AI was asked what it thought its chances of winning were. It replied 95%, and within a few minutes the round was over and OpenAI Five had come out victorious. The second game was quite one-sided, though an interesting play occurred in which the AI traded one of its heroes for two of the opponents’. There is some speculation as to whether the AI had predicted this.

In chess, DeepMind’s AlphaZero went up against Stockfish, a hand-crafted (non-machine-learning) program, and after just 4 hours of training by playing against itself, the results were very impressive: almost every game that was not a draw was a win for AlphaZero. The most impressive part is that AlphaZero is a general algorithm that can play several two-player, perfect-information games (unlike DOTA 2 or Starcraft 2, where not all information is visible to the players at all times) at a superhuman level. The other games AlphaZero mastered were Go and Shogi (a Japanese variant of chess).

Ultimately, human chess players advance their rating and level of play through pattern recognition, and the same is true for deep neural networks. One example of such a network uses four different layers and looks at a position in three ways: the global state of the game; a piece-wise perspective, observing the location of each piece; and finally, an observation of all the squares, and of the pieces on them, that each piece can attack.

This network is then trained on a large, varied dataset. Lastly, a bootstrapping strategy makes the program play against itself in order to learn and improve its future position evaluations.

A much lesser-known application of neural networks is in physics simulations. Physics problems, and fluid simulations in particular, already have well-established solution methods, so one might wonder why a neural network would be needed at all. After all, the situation is unlike computer vision, where there is no predefined equation the program can use to classify, say, cats and dogs.

In fluid dynamics, however, the Navier-Stokes equations play exactly that role, and fluid simulators built on them already exist. The main problem is that these solvers are very slow to compute. Instead, one particular approach uses pre-existing solutions to model, for example, a rising plume of smoke, and then asks a neural network to continue the simulation. Similarly, a neural network can be fed the output of a classical fluid simulator and then model its own fluid using what it has seen. In other words, the classical model, complete with splashes and drops of fluid or the dispersal of gas, is treated as training data that the network uses to model similar phenomena. This method cuts the time taken to model fluids from minutes to milliseconds.
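
As a toy illustration of that idea, the sketch below uses a classical solver (one step of 1D diffusion, standing in for a real fluid simulator) to generate training data, then fits a fast surrogate model to imitate it. Real systems use deep networks rather than the single linear layer assumed here:

```python
import numpy as np

rng = np.random.default_rng(5)

def classical_step(u):
    """A slow, 'exact' solver stands in here: one step of 1D diffusion."""
    return u + 0.1 * (np.roll(u, 1) - 2 * u + np.roll(u, -1))

# Generate training data by running the classical simulator on random states.
states = rng.normal(size=(500, 32))
targets = np.array([classical_step(u) for u in states])

# Learn a fast surrogate: a single linear layer fit by least squares.
W, *_ = np.linalg.lstsq(states, targets, rcond=None)

u = rng.normal(size=32)
print(np.abs(u @ W - classical_step(u)).max())  # tiny error, computed far faster
```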

There are still endless possibilities as to what neural networks can be used for; these are just a few.

Hopefully, this article has enriched your understanding of neural networks and your general knowledge of the subject. Whether it has merely piqued your interest or has clarified a lot of confusing topics, perhaps the most important thing to take away (if nothing else) is this: neural networks are our gateway into a much more efficient, reliable, and exciting future. Applications of this technology can be seen in almost every field, and the potential applications are virtually unlimited due to the mind-boggling versatility of this computer-science tool. On a smaller scale, the techniques involved are extremely useful additions to your academic repertoire and will invariably be of use in many of your personal projects. This technology should inspire the creator in you, simply because of how accessible the resources are. The next material-science breakthrough may come from some exclusive lab in the middle of nowhere. The next space-travel breakthrough may come from the launch center of a major space agency. The next energy-production breakthrough may come from some plasma-physics research center funded by billions of dollars. But the next paradigm-shifting application of neural networks may very well come from your laptop. Good luck!

Written by Jack Gao and Vinith Thyagarajan (for the IB CAS project).