Generative Adversarial Networks

Generative Adversarial Networks (GANs) were first proposed in 2014 by Ian Goodfellow and his colleagues at the University of Montreal, including Yoshua Bengio. In 2016, Facebook’s AI research director and New York University professor Yann LeCun called them “the most interesting idea in the last 10 years in machine learning”.

To understand what GANs are, it helps to compare them with discriminative algorithms such as simple Deep Neural Networks (DNNs). For an introduction to neural networks, see the article “Introduction to neural networks”. For more information on Convolutional Neural Networks, see the article “Convolutional neural networks”.

Let us take the task of predicting whether a given email is spam as an example. The words that make up the body of the email are the variables that determine one of two labels: “spam” or “non-spam”. A discriminative algorithm takes the input vector (the words occurring in a given message, converted into a mathematical representation) and outputs the probability that the email is spam. In other words, it learns the relationship between the input and the label.
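The spam example can be sketched in a few lines of Python. This is only an illustration: the vocabulary, weights and bias below are made-up values, whereas a real discriminative model would learn them from labelled emails.

```python
import math

# Hypothetical vocabulary and weights, chosen only for illustration.
VOCABULARY = ["free", "winner", "meeting", "invoice"]
WEIGHTS = [1.8, 2.2, -1.5, -0.7]
BIAS = -0.5


def to_vector(text):
    """Bag-of-words: count how often each vocabulary word occurs in the email."""
    words = text.lower().split()
    return [words.count(term) for term in VOCABULARY]


def spam_probability(text):
    """Estimate the probability of the label "spam" given the words (logistic model)."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, to_vector(text)))
    return 1.0 / (1.0 + math.exp(-score))


print(spam_probability("free winner"))      # high score: looks like spam
print(spam_probability("meeting invoice"))  # low score: looks like non-spam
```

The model maps an input vector to the probability of a label, which is exactly the direction a discriminative algorithm works in.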

Generative algorithms such as GANs do the exact opposite. Instead of predicting the label from the input data, they try to predict the data given a label. More specifically, they try to answer the following question: assuming this email is spam, how likely is this data?

More precisely, the task of Generative Adversarial Networks is generative modelling, which can be approached in two ways (in practice the data is usually high-dimensional, e.g. images or sound). The first is density estimation: given numerous examples, you want to find the probability density function that describes them. The second is to build an algorithm that learns to generate new samples from the same distribution as the training dataset (the aim is not to re-create the training examples but to create new data that could plausibly have come from the same source).

What generative modelling approach do GANs use?

This approach can be likened to a game played by two agents. One is a generator that attempts to create data. The other is a discriminator that predicts whether this data is real or not. The generator’s goal is to fool the other player. Over time, as both get better at their tasks, the generator is forced to produce data that is as similar as possible to the training data.

What does the learning process look like?

The first agent, the discriminator (a differentiable function D, usually a neural network), receives a piece of the training data as input (e.g. a photo of a face). This input is called x (simply the name of the model input), and the goal is for D(x) to be as close to 1 as possible, meaning that x is judged to be a real example.

The second agent, the generator (a differentiable function G, usually a neural network as well), receives white noise z as input (random values that allow it to generate a variety of plausible images). Applying the function G to the noise z yields a sample x (in other words, G(z) = x). We hope that this sample will be quite similar to the original training data, but at least early on it will have flaws, such as noticeable noise, that may allow the discriminator to recognise it as a fake example. The next step is to apply the discriminator function D to the fake sample x from the generator. At this point, the goal of D is to make D(G(z)) as close to zero as possible, whereas the goal of G is for D(G(z)) to be close to one.
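The two opposing goals can be written down as loss functions. The sketch below is a minimal illustration and rests on assumptions: it uses binary cross-entropy for the discriminator and the commonly used “non-saturating” form for the generator, and the scores passed in are hypothetical numbers rather than outputs of a trained network.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss the discriminator minimises.

    d_real: scores D(x) for real samples (should approach 1).
    d_fake: scores D(G(z)) for generated samples (should approach 0).
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Loss the generator minimises: small when D(G(z)) is close to 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator scores, chosen only for illustration.
confident_d = discriminator_loss(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
fooled_d = discriminator_loss(d_real=[0.9, 0.95], d_fake=[0.8, 0.9])
print(confident_d, fooled_d)
```

When the generator manages to push D(G(z)) towards one, the discriminator’s loss rises and the generator’s loss falls, which is the tug-of-war the text describes.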

This is akin to the struggle between money counterfeiters and the police. The police want the public to be able to use real banknotes without the possibility of being cheated, as well as to detect counterfeit ones and remove them from circulation, and punish the criminals. At the same time, counterfeiters want to fool the police and use the money they have created. Consequently, both the police and the criminals are learning to do their jobs better and better.

Assuming that the hypothetical capabilities of the police and the counterfeiters — the discriminator and the generator — are unlimited, then the equilibrium point of this game is as follows: the generator has learned to produce perfect fake data that is indistinguishable from real data, and as such, the discriminator’s score is always 0.5 — it cannot tell if a sample is true or not.
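The value of 0.5 follows from a result in the original 2014 paper: for a fixed generator, the best possible discriminator outputs p_data(x) / (p_data(x) + p_g(x)), where p_data and p_g are the densities of the real and the generated data. A small sketch with made-up density values:

```python
def optimal_discriminator(p_data, p_gen):
    """Best possible D(x) for fixed densities: p_data / (p_data + p_gen)."""
    return p_data / (p_data + p_gen)


# Toy density values at a single point x (hypothetical numbers).
early = optimal_discriminator(p_data=0.8, p_gen=0.2)        # generator still poor
equilibrium = optimal_discriminator(p_data=0.5, p_gen=0.5)  # generator matches the data
print(early, equilibrium)
```

Once the generator’s density equals the data density everywhere, the ratio is one half for every input, so the discriminator can do no better than guess.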

What are the uses of GANs?

GANs are used extensively in image-related operations. This is not their only application, however, as they can be used for any type of data.

Figure 1 Style Transfer carried out by CycleGAN

For example, the DiscoGAN network can transfer a style or design from one domain to another (e.g. transform a handbag design into a shoe design). It can also generate a plausible image from an item’s sketch (many other networks can do this, too, e.g. Pix2Pix). Known as Style Transfer, this is one of the more common uses of GANs. Other examples of this application include the CycleGAN network, which can transform an ordinary photograph into a painting reminiscent of artworks by Van Gogh, Monet, etc. GANs also enable the generation of images based on a description (StackGAN network) and can even be used to enhance image resolution (SRGAN network).

Useful resources

[1] Salimans T., Goodfellow I. et al., Improved Techniques for Training GANs, https://arxiv.org/abs/1606.03498

[2] Chintala S., How to train a GAN, https://github.com/soumith/ganhacks

[3] White T., Sampling Generative Networks, School of Design, Victoria University of Wellington, Wellington, 2016, https://arxiv.org/pdf/1609.04468.pdf

[4] LeCun Y., Mathieu M., Zhao J., Energy-based Generative Adversarial Networks, Department of Computer Science, New York University, Facebook Artificial Intelligence Research, 2016, https://arxiv.org/pdf/1609.03126v2.pdf

References

[1] Goodfellow I., Tutorial: Generative Adversarial Networks [online], “NIPS”, 2016, https://arxiv.org/pdf/1701.00160.pdf
[2] Skymind, A Beginner’s Guide to Generative Adversarial Networks (GANs) [online], San Francisco, Skymind, accessed on: 31 May 2019
[3] Goodfellow, Ian, Pouget-Abadie, Jean, Mirza, Mehdi, Xu, Bing, Warde-Farley, David, Ozair, Sherjil, Courville, Aaron, and Bengio, Yoshua. Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680, 2014
[4] LeCun, Y., What are some recent and potentially upcoming breakthroughs in deep learning?, “Quora”, 2016, accessed on: 31 May 2019, https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning
[5] Kim T., DiscoGAN in PyTorch, accessed on: 31 May 2019, https://github.com/carpedm20/DiscoGAN-pytorch

Convolutional neural networks

Artificial intelligence is bringing the capabilities of machines closer to a human-like level at an increasing rate. Because the subject attracts great interest, many fields of science have taken a big leap forward in recent years.

One of the goals of artificial intelligence is to enable machines to observe the world around them in a human-like way. This is possible through the application of neural networks. Neural networks are mathematical structures that, at their base, are inspired by the natural neurons found in the human nerves and brain.

Surely you have felt the presence of neural networks in everyday life many times, for example in:

  • face detection and recognition in smartphone photos,
  • recognition of voice commands by the virtual assistant,
  • autonomous cars.

The potential of neural networks is enormous. The examples listed above represent merely a fraction of current applications. They are, however, all related to a special class of neural networks called convolutional neural networks (CNNs or ConvNets).

Image processing and neural networks

To explain the idea of convolutional neural networks, we will focus on their most common application: image processing. A CNN is an algorithm that can take an input image and classify it according to predefined categories (e.g. the breed of a dog). This is achieved by assigning weights to different shapes, structures and objects.

Through training, convolutional networks learn which specific features of an image help to classify it. Their advantage over standard deep networks is that they are better at detecting intricate relationships within an image. This is possible thanks to the use of filters that examine the relationship between adjacent pixels.

Figure 1 General RGB image sizing scheme

Each image is a matrix of values whose number is proportional to its width and height in pixels. An RGB image is described by three primary colours, so each pixel is represented by three values. The task of a ConvNet is to reduce the image to a more compact form without losing valuable features, i.e. those that carry information crucial for classification.

A CNN has two key types of layer. The first one is the convolutional layer.

Convolutional layer
Figure 2 Animation of RGB image filtering with a 3x3x3 filter

The animation above shows an RGB image and a 3x3x3 filter moving across it with a defined step. The step (often called the stride) is the number of pixels by which the filter moves each time. We can also apply “zero padding”, i.e. surround the image with zeros (the white squares). This helps preserve more information, especially at the image borders, at the expense of efficiency.
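The filter size, step and padding together determine the size of the output. A minimal sketch of the usual formula for one dimension follows; the function name and the example sizes are illustrative assumptions, and integer division is assumed when the filter does not fit evenly.

```python
def conv_output_size(input_size, filter_size, step=1, padding=0):
    """Output width (or height) of a convolution along one dimension."""
    return (input_size + 2 * padding - filter_size) // step + 1


print(conv_output_size(5, 3))             # no padding: the output shrinks
print(conv_output_size(5, 3, padding=1))  # zero padding keeps the size
print(conv_output_size(7, 3, step=2))     # a larger step shrinks it faster
```

With a 3x3 filter and a step of 1, one ring of zero padding is enough to keep the output the same width and height as the input.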

Subsequent values of the output matrix are calculated as follows:

  • multiplying the values in a given section of the image by the filter values, element by element,
  • summing up the calculated values for a given channel,
  • summing up the values across all channels and adding the bias (in this case equal to 1).
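The three steps above can be sketched in plain Python. This is an illustrative sketch under a few assumptions: images and filters are nested lists indexed as [channel][row][column], the filter is square, and no zero padding is applied.

```python
def conv_value(patch, kernel, bias):
    """One output value: element-wise products, summed over all channels, plus bias.

    patch and kernel are indexed as [channel][row][column] and have the same shape.
    """
    total = bias
    for patch_channel, kernel_channel in zip(patch, kernel):
        for patch_row, kernel_row in zip(patch_channel, kernel_channel):
            total += sum(p * k for p, k in zip(patch_row, kernel_row))
    return total


def convolve(image, kernel, bias=1, step=1):
    """Slide a square kernel over a [channel][row][column] image without padding."""
    size = len(kernel[0])
    height, width = len(image[0]), len(image[0][0])
    output = []
    for top in range(0, height - size + 1, step):
        row = []
        for left in range(0, width - size + 1, step):
            # Cut the same window out of every channel.
            patch = [[r[left:left + size] for r in channel[top:top + size]]
                     for channel in image]
            row.append(conv_value(patch, kernel, bias))
        output.append(row)
    return output


# A 3x3 RGB image of ones and a 3x3x3 filter of ones: 27 products plus the bias of 1.
ones = [[[1] * 3 for _ in range(3)] for _ in range(3)]
print(convolve(ones, ones))  # [[28]]
```

Each filter produces one output matrix, no matter how many channels the input has, because the per-channel sums are added together.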

It is worth noting that the filter values may differ from channel to channel. The task of the convolution layer is to extract features such as edges, colours and gradients. Subsequent layers of the network, using what the previous layers have determined, can detect increasingly complex shapes. Much like the layers of an ordinary network, the convolution layer is followed by an activation layer (usually a ReLU function), which introduces non-linearity into the network.
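ReLU itself is very simple: it replaces every negative value with zero and leaves positive values unchanged. A minimal sketch, assuming the feature map is a nested list of numbers:

```python
def relu(feature_map):
    """Apply ReLU element-wise: negative values become 0, the rest pass through."""
    return [[max(0, value) for value in row] for row in feature_map]


print(relu([[-2, 3], [0, -1]]))  # [[0, 3], [0, 0]]
```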

We can interpret the result of the convolution with each filter as an image. The images produced by convolution with multiple filters together form a multi-channel image. An RGB image is very similar: it consists of 3 channels, one for each colour. The output of the convolution layer, however, does not consist of colours per se, but of certain “colour-shapes” that each filter represents.

Typically multiple filters are used, so that the convolution layer increases the depth, i.e. the number of image channels.

Pooling layer

The second key layer, called the pooling layer, has the task of reducing the remaining dimensions of the image (width and height) while retaining the key information needed, e.g. for image classification. It also contributes to noise reduction. The most popular method is “max pooling”.

Figure 3 Diagram of the pooling operation

The pooling operation is similar to the one applied in the convolution layer: a filter (window) size and a step are defined. Each value of the output matrix is the maximum of the values covered by the filter.
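Taking the maximum value under a moving window can be sketched as follows. The window size and step of 2 are common choices but are assumptions here, as is the nested-list representation of the feature map.

```python
def max_pool(feature_map, size=2, step=2):
    """Keep the largest value in each size x size window of a 2D feature map."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - size + 1, step):
        row = []
        for left in range(0, width - size + 1, step):
            window = [feature_map[top + i][left + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
print(max_pool(feature_map))  # [[6, 4], [7, 9]]
```

A 4x4 input becomes a 2x2 output: width and height are halved while the strongest response in each region is kept.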

Together, these two layers form a single block of the convolutional network. As successive blocks are applied, the width and height dimensions are gradually traded for depth. Once the selected number of blocks has been applied, the resulting matrix is “flattened” into a single dimension. This flattened output becomes the input to the next layers of the network, usually standard fully connected (dense) layers. This allows the algorithm to learn non-linear relationships between the features determined by the convolution layers.
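Flattening and a fully connected layer can be sketched like this. The function names, the [channel][row][column] layout and the toy weights are assumptions made for illustration; in a real network the weights are learned and an activation function follows the dense layer.

```python
def flatten(feature_maps):
    """Turn [channel][row][column] feature maps into one flat list of values."""
    return [value for channel in feature_maps for row in channel for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(neuron_weights, inputs)) + b
            for neuron_weights, b in zip(weights, biases)]


flat = flatten([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(flat)                                    # [1, 2, 3, 4, 5, 6, 7, 8]
print(dense([1, 2], [[1, 1], [0, -1]], [0, 1]))  # [3, -1]
```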

The last layer of the network is the softmax layer. It yields the probabilities of class membership (for example, the probability that there is a cat in the image). During training, these probabilities are compared with the desired classification result by the cost function. Then, through the back-propagation algorithm, the network adjusts its weights to minimise the error.
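The sketch below shows how raw scores are turned into class probabilities and how a cost is computed from them. It assumes cross-entropy as the cost function, which is a common choice for classification but is not named in the text; the scores are made-up values.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_class):
    """Cost for one example: low when the true class gets a high probability."""
    return -math.log(probabilities[true_class])


# Hypothetical scores for the classes cat, dog and bird.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(cross_entropy(probs, true_class=0))
```

Back-propagation then computes how this cost changes with each weight, so that the weights can be nudged in the direction that lowers it.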

Convolutional neural networks are an important part of the development of machine learning. They contribute to the progress of automation and help extend human perceptual abilities. Their capabilities will continue to grow with the computing power of computers and the amount of available data.

References

[1] https://medium.com/@raycad.seedotech/convolutional-neural-network-cnn-8d1908c010ab

[2] https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148

[3] https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53

Innovation in a company

Today’s world is characterised by constant technological progress. You hear about new products, services, methods and other things virtually every day, and they are often described as “innovative”. Companies, too, increasingly call themselves “innovative”. In today’s article, we will take a look at what innovation means in a company and how to promote it.

What is innovation?

Innovation is defined as “a set of competencies of an organisation to continuously discover sources of new solutions, as well as to absorb them from outside and generate on one’s own, and to implement and diffuse them (make them commonplace)”. Put simply, it is the ability to generate new ideas; the desire to improve, to create something new, and then implement and commercialise these new solutions. Innovation manifests itself in thinking outside the box, seeking solutions and going beyond the daily routine.

Virtually everyone knows companies like Apple, Google and Microsoft. Undoubtedly, these companies have achieved enormous global success through their innovation. This shows that the world is open to innovation and that the demand for it is increasing. It also means that companies that do not follow the path of innovation may lose their competitiveness and ultimately cease to exist within a few years. So do companies that have neither a charismatic leader like Steve Jobs nor capital equal to that of Google have a chance to become innovative? The answer is a resounding YES! Innovation is not a trait that only the chosen few can attain; it is an attitude that anyone can develop.

Attitude is key

Some people are born innovators. They find it remarkably easy to come up with new ideas. But what about the people who spend hours trying to come up with something new and whose results still leave much to be desired? Well, we have one piece of advice for them: attitude is key! Innovation is primarily a kind of attitude that you can develop. The most important thing about being innovative is having an open mind. This is the driving force behind innovation. You will not invent anything new by repeating the same activities every day and cutting yourself off from any contact with the outside world.

This is where another innovation-driving factor comes in, i.e. contact. A lot of ideas come from outside as a result of conversations with others. That is why it is so vital to spend time with people, as well as to talk to them, and get their opinions on various topics. This allows you to trigger something within yourself, which may result in new ideas and solutions. Therefore, if you want to create innovation in your company, you have to start by changing your mindset.

“Architects of Innovation”

A key role in driving innovation in a company is played by leaders, who were dubbed “innovation architects” in “Innovation as Usual”, a book by Thomas Wedell-Wedellsborg and Paddy Miller. The above authors believe that the leader’s primary task is to create a culture of innovation in the company, i.e. conditions in which creativity is inherent in the work of every employee, regardless of their position. Here, they point to a mistake often made, which is the desire to create something innovative at a moment’s notice. To that end, companies hold brainstorming sessions and send their staff off to workshops that are meant to help them come up with new ideas.

However, this often has the opposite effect. Employees return to a job where they repeat the same thing every day, which kills their creativity. This is why it is so important to develop a culture of innovation that drives innovation on a daily basis. Such culture can manifest itself in the way work is organised, as well as the development of new habits, practices and rituals to help trigger new ideas.

Yet another task facing managers is to motivate and support their employees. Leaders should serve as guides for their teams, spark their creativity and mobilise them to generate new ideas. To enable this, the book’s authors have proposed a set of “5+1 keystone behaviours”, which include focus, insight, modification, selection and diplomacy. All these behaviours should be supported by perseverance in introducing innovation on a daily basis. Introducing the “5+1 keystone behaviours” model in a company has a significant impact on shaping an attitude of innovation among employees. This ensures that the creation of new ideas is not a one-off activity but rather a permanent part of the company’s system.

Innovation management

Innovation is becoming increasingly vital. Many companies now set up dedicated departments to handle their innovation activities. Therefore, the introduction of an innovation management process is a key step in creating an innovative company.

The figure below shows the four pillars that should comprise an innovation management process according to Instytut Innowacyjności Polska.

Figure 1 Pillars of the innovation management process according to Instytut Innowacyjności Polska

The first and most important pillar in innovation management is diagnosis. Diagnosis is construed as the determination of the company’s previous innovation level, as well as an analysis of its environment in terms of its ability to create innovation. A company may carry out an innovation diagnosis on its own or have an outside company carry out a so-called “innovation audit”.

In the second step, an organisational structure and processes need to be put in place to implement the process of generating innovative ideas in the company.

The next step is to come up with new ideas and manage the process of their implementation.

The final pillar of innovation management is determining how innovation is to be funded. Funding may be provided through both internal and external sources (grants, investors, etc.).

The innovation management process is a must for any company that wants to successfully implement innovation. It makes it possible to effectively supervise the implementation of innovations, measure the company’s innovation level and control the expenses incurred in this area. By introducing this process, the company demonstrates that it deems innovation a top priority.

Conclusions

Innovation is certainly an issue that is becoming increasingly important. The high level of computerisation and technological progress makes the demand for innovation ever greater. Therefore, to stay in the market, companies should follow the path of innovation and shape this trait within their structures. As “innovation architects”, leaders play a vital role in this process and are tasked with creating a company system that triggers creative ideas in employees every day. In addition, a leader should be a kind of guide who motivates his or her team to act creatively. Creating innovation in a company is therefore a continuous, day-to-day process. However, there are solutions that support process management, such as Data Engineering. Utilising cutting-edge IoT technology to collect and analyse information, Data Engineering enables companies to make quick and accurate decisions.

References

[1] https://www.instytutinnowacyjnosci.pl/

[2] http://it-manager.pl/kultura-innowacyjnosci/

[3] Miller P., Wedell-Wedellsborg T., “Innovation as Usual: How to Help Your People Bring Great Ideas to Life”

Introduction to neural networks

Neural networks have become a very popular topic in IT in recent years. They are not a new concept, as they were already popular in the 1970s. However, their real development took place in the 21st century thanks to huge leaps in technology. Neural networks are one of the areas of artificial intelligence (AI). Interest in them keeps growing, which drives their constant development and improvement.

Characteristics

To describe how neural networks work, it is worth referring (with some simplification) to the way the human nervous system works. Their defining characteristic is that they are modelled on a biological system. Despite enormous progress and the use of innovative solutions, today’s networks are still not able to perform as well as the human brain. However, it cannot be ruled out that such an advanced stage of development will be reached in the future.

Neural network structure

A neural network consists of a certain number of neurons. The simplest neural network is called a perceptron and consists of only one artificial neuron. Input data is sent to the perceptron together with assigned weights, which determine how strongly each input affects the final result. The weighted inputs are then passed to the summation block. The summation block is simply a formula, an algorithm prepared by programmers. Summing all the weighted inputs gives a result which, in today’s advanced types of artificial neurons, takes the form of a real number. The result indicates which decision was made on the basis of the calculations.

Figure 1 Schematic diagram of how a perceptron works. Each of the 4 input elements is multiplied by its corresponding weight. The products are summed (summation block) and the sum is passed to the activation function (activation block), whose output is also the output of the perceptron.
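The perceptron in the diagram can be sketched in a few lines of Python. The step activation, the bias term and the example weights are assumptions made for illustration; the diagram itself only specifies weighted inputs, a summation block and an activation block.

```python
def step(value):
    """Simple threshold activation: 1 if the sum is non-negative, otherwise 0."""
    return 1 if value >= 0 else 0


def perceptron(inputs, weights, bias=0.0, activation=step):
    """Multiply each input by its weight, sum the products, then apply the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Four inputs, as in the diagram, with hypothetical weights.
weights = [0.5, -0.6, 0.2, -0.1]
print(perceptron([1, 0, 1, 1], weights, bias=-0.5))  # 1
print(perceptron([0, 1, 0, 0], weights, bias=-0.5))  # 0
```

Changing the weights changes which combinations of inputs produce a 1, and finding good weights is what training a network amounts to.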

Uses of neural networks

The development of AI is closely connected to the development of neural networks. An unquestionable advantage of neural networks is their wide range of applications. Furthermore, they leave a great deal of room for further development. Another advantage is that they deal well with large data sets, which are sometimes very difficult for humans to handle. What’s more, they can adapt to new situations when new variables appear, whereas most programs available on the market cannot.

The ability of neural networks to work with damaged data is still an area of active development. They will find applications in a growing number of areas, mainly in finance, medicine and technology. Neural networks will gradually appear in areas that require solutions related to prediction, classification and control. They will find their application wherever creating scenarios or making decisions is based on many variables.

References

[1] http://businessinsider.com.pl/technologie/czym-sa-sieci-neuronowe/pwtfrsy

[2] http://pclab.pl/art71255-2.html