SEO

Why do some websites appear at the top of the results as soon as we enter a search query, while others get lost among competing sites? How can we make it easier for users to find our website? SEO is responsible for these and other aspects, and it has nothing to do with randomness. Whether you are just starting to run a website or have been doing it for a long time, whether you handle everything yourself or delegate it to someone else, it’s important to know the basic principles of SEO. After reading this article, you will know what SEO is, what it consists of, and how to use it properly.

What is SEO?

Let’s start with what SEO actually is and what it consists of. SEO (Search Engine Optimization) is a set of activities undertaken to improve the positioning of a website in search results [1]. It consists of various practices and strategies, such as proper text editing and building a link profile. SEO also involves adapting the website to algorithms used by search engines. These algorithms determine which pages will be displayed on the first page of search results and in what order. Through optimization, a website can gain a better position in the search results, which increases its visibility.

It is important to remember, of course, that SEO is only one way to improve the popularity of a website. It doesn’t produce results as quickly as, for example, paid advertising, but it’s relatively inexpensive. Furthermore, the achieved effect lasts longer and doesn’t disappear when a subscription expires, as is the case with many other marketing techniques.

On-site positioning

We can divide SEO into two types: on-site and off-site. On-site SEO includes all activities that take place on the website itself. These cover editorial and technical matters, as well as anything else that affects how quickly content loads. Taking care of these aspects makes the website more readable for both users and Google’s robots. Good on-site SEO requires attention to:

  • Metadata and ALT descriptions – even if a page is readable for users, what about search engine algorithms? To make it readable for them as well, it’s worth taking care of meta titles and descriptions, which help search engines find our website. It is also worth adding ALT descriptions, also known as alternative text. Algorithms don’t understand what’s in images. With this short description, they can match an image’s content to the searched phrase, which improves positioning.
  • Headers – these also affect more than just human perception. Distributing headers properly and optimizing the content in them can significantly improve positioning.
  • Hyperlinks – the set of links, also known as the link profile. Here we can distinguish between external and internal linking. External linking refers to links coming from websites other than our own and is considered off-site SEO. Internal linking, on the other hand, refers to links within a single website that take users to other pages or articles.
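The on-site elements listed above can be checked automatically. The following is a minimal sketch using only Python’s standard library; the class name `OnPageAudit`, the `audit` helper and the sample page are hypothetical, and treating every relative link as internal is a simplifying assumption. It illustrates the idea and is not a substitute for a real SEO audit tool.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collects the on-site elements discussed above from one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.headings = []            # heading tags in document order
        self.images_missing_alt = []  # src values of images without ALT text
        self.internal_links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.headings.append(tag)
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt.append(attrs.get("src"))
        elif tag == "a" and attrs.get("href"):
            href = attrs["href"]
            # Assumption: links that are not absolute URLs stay within the site.
            if not href.startswith(("http://", "https://")):
                self.internal_links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Parse `html` and return the populated audit object."""
    parser = OnPageAudit()
    parser.feed(html)
    return parser


# Hypothetical sample page, used only to exercise the sketch.
SAMPLE = """
<html><head>
<title>Handmade ceramic mugs</title>
<meta name="description" content="Small-batch ceramic mugs, made to order.">
</head><body>
<h1>Ceramic mugs</h1>
<img src="mug-blue.jpg" alt="Blue glazed ceramic mug">
<img src="mug-red.jpg">
<h2>Care instructions</h2>
<a href="/blog/how-to-clean-a-mug">How to clean a mug</a>
</body></html>
"""

report = audit(SAMPLE)
print("Title:", report.title)
print("Meta description:", report.meta_description)
print("Headings:", report.headings)
print("Images without ALT:", report.images_missing_alt)
print("Internal links:", report.internal_links)
```

In this sample, the second image would be reported because it has no ALT description. Backlinks from other websites cannot be seen from the page itself, which is why they belong to off-site SEO.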

Off-site positioning

Off-site SEO refers to all activities undertaken outside the website to increase its visibility and recognition on the web. This helps generate traffic to the site from external sources. Such activities include:

  • Hyperlinks – again, a link profile that builds a site’s popularity and recognition on the web. Off-site SEO includes external linking, i.e. from other sources. It is worth ensuring that these are of good quality, i.e. from reliable sources. Gone are the days when only quantity mattered. Nowadays, search engine algorithms pay much more attention to value.
  • Internet marketing – this includes activities such as running profiles on social media, engaging in discussions with users on forums, or collaborating with influencers. These aspects do not directly affect search results but can indirectly contribute a great deal to boosting the number of queries about our website. 
  • Reviews – after some time, opinions about a website or business naturally appear on the web. It’s worth taking care of them and responding to users who leave them. Maintaining a good customer opinion is one aspect of building a trustworthy brand image [3].

Link building and positioning

Link building is the process of acquiring links that lead to our website. These can be links from external sources (so-called backlinks) or internal links, that is, links that take us from one page to another within the same website. A well-built link profile significantly affects positioning, as discussed above [4]. But how has the significance of such practices changed?

For many years, Google allowed SEO practitioners a lot of leeway in this regard. It was commonplace to encounter sites that had hundreds of thousands of links leading to them, because the number of links had a significant impact on positioning and their quality was not as crucial. The vast majority of these were low-quality links posted in forums, guestbooks, directories, comments, etc. Often this was not done by a human but by special applications that posted the links automatically. This approach brought significant results and could be carried out relatively inexpensively, but not for long. It all changed in April 2012, when Google introduced a new algorithm called Penguin, which proved to be a kind of revolution.

How did Penguin change SEO?

What is Penguin? It is an algorithm created by Google and introduced on 24 April 2012 to combat unethical SEO practices. SEO specialists tried to trick Google’s algorithms by buying links and placing them in inappropriate places, but Penguin effectively caught them.

Let’s try to answer how Penguin works. The algorithm analyses the links leading to a particular website and decides on their value. If it deems them to be of low quality, it lowers the rankings of the sites they lead to. Such links include purchased ones (also from link exchanges) and those created by bots. The same applies to spam links, such as those placed in forum comments or on completely unrelated websites. However, its action is not permanent: when low-quality links are removed, a website can regain its position. It’s worth mentioning that Penguin was not created only to detect fraud and reduce the visibility of websites in search results. Its role is also to reward honestly run websites. If it deems a link profile valuable, it increases the visibility of such sites [6].

Ethical and unethical positioning

Depending on what we base our SEO techniques on, a distinction can be made between White Hat SEO and Black Hat SEO. These terms allude to the good and evil characters in Westerns. By convention, the heroes usually wore white hats and the villains black ones, hence the association. But what do the terms mean, and how do these techniques differ?

White Hat SEO is ethical SEO, applied according to guidelines recommended by search engines. It involves practices such as creating good-quality content (free of duplicates). Using headings and bullet points and keeping paragraphs the right length is also important. Black Hat SEO, on the other hand, is characterized by unethical behavior aimed at artificially boosting popularity. It includes practices such as overusing key phrases out of context, hiding text or buying links. Such actions can result in a decrease in trust in the site and the imposition of filters lowering its position. Even exclusion from search results is possible [7].
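One rough way to spot overused key phrases is to measure what share of a text a single keyword takes up. The sketch below is only an illustration: the function name `keyword_density` and both sample sentences are hypothetical, and search engines do not publish a density threshold, so the number is a heuristic for comparing texts rather than a rule.

```python
import re


def keyword_density(text, keyword):
    """Share of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Hypothetical examples: natural copy versus keyword-stuffed copy.
natural = "Our mugs are shaped by hand and fired twice, so every mug is slightly different."
stuffed = "Mugs mugs cheap mugs buy mugs best mugs mugs online mugs."

print(round(keyword_density(natural, "mugs"), 2))
print(round(keyword_density(stuffed, "mugs"), 2))
```

The stuffed sentence yields a far higher share than the natural one, which is the kind of pattern that reads as written for algorithms rather than for users.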

Summary

The key to increasing traffic to a website and improving its positioning is the skilful use of SEO tools. These are both on-site and off-site techniques that can significantly increase reach. When using SEO, it is important to remember to do it properly. By following the recommendations of search engines and adapting the content to both the user and the algorithms, we can count on positive results and improved statistics. Unethical practices, on the other hand, can lead to the opposite effect.

References

[1] https://searchengineland.com/guide/what-is-seo 

[2] https://www.semstorm.com/pl/blog/seo-and-ppc/czym-sie-rozni-on-site-seo-od-off-site-seo 

[3] https://www.semrush.com/blog/off-page-seo/

[4] https://greenparrot.pl/blog/co-to-jest-off-site-seo/ 

[5] https://1stplace.pl/blog/algorytm-google-pingwin/  

[6] https://www.business2community.com/infographics/history-google-penguin-infographic-01468714 

[7] https://www.semrush.com/blog/black-hat-seo/

Moral dilemmas associated with Artificial Intelligence

Artificial intelligence is one of the most exciting technological developments of recent years. It has the potential to fundamentally change the way we work and use modern technologies in many areas. We are talking about text and image generators, various types of algorithms and autonomous cars. However, as the use of artificial intelligence becomes more widespread, it is also good to be aware of the potential problems it brings with it. Given the increasing dependence of our systems on artificial intelligence, how we approach these dilemmas could have a crucial impact on the future shape of society. In this article, we will present these moral dilemmas. We will discuss the problems associated with putting autonomous vehicles on the roads. Next, we will move on to the dangers of using artificial intelligence to sow disinformation. Finally, we will turn to the concerns about the intersection of artificial intelligence and art.

The problem of data acquisition and bias

As a rule, human judgements are burdened by a subjective perspective; machines and algorithms are expected to be more objective. However, how machine learning algorithms work depends heavily on the data used to train them. Therefore, training data selected with any bias, even an unconscious one, can cause the algorithm to act in undesirable ways. Please have a look at our earlier article for more information on this topic: https://bfirst.tech/problemy-w-danych-historycznych-i-zakodowane-uprzedzenia/

Levels of automation in autonomous cars

In recent years, we have seen great progress in the development of autonomous cars. There has been a lot of footage on the web showing prototypes of vehicles moving without the driver’s assistance or even presence. When discussing autonomous cars, it is worth pointing out that there are multiple levels of autonomy. It is worth identifying which level one is referring to before the discussion. [1]

  • Level 0 indicates vehicles that require full control by the driver, who performs all driving actions (steering, braking, acceleration, etc.). However, the vehicle can inform the driver of hazards on the road, using systems such as collision warning or lane departure warning.
  • Level 1 includes vehicles that are already common on the road today. The driver is still in control of the vehicle, which is equipped with driving assistance systems such as cruise control or lane-keeping assist. 
  • Level 2, in addition to having the capabilities of the previous levels, is – under certain conditions – able to take partial control of the vehicle. It can influence the speed or direction of travel, under the constant supervision of the driver. The support functions include controlling the car in traffic jams or on the motorway. 
  • Level 3 of autonomy refers to vehicles that are not yet widely available commercially. Cars of this type are able to drive autonomously under certain conditions, under the supervision of the driver. The driver still has to be ready to take control of the vehicle if necessary.
  • Level 4 means that the on-board computer performs all driving actions, but only on certain previously approved routes. In this situation, all persons in the vehicle act as passengers, although it is still possible for a human to take control of the vehicle.
  • Level 5 is the highest level of autonomy – the on-board computer is fully responsible for driving the vehicle under all conditions, without any need for human intervention. [2] 
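Because a discussion should state which level it refers to, it can help to treat the levels as an explicit data structure. The sketch below is one possible encoding: the member names follow common SAE-style wording that the list above does not use, and the `human_role` helper with its return strings is a hypothetical summary of the list, not an official definition.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    # Names are an assumption based on common SAE-style wording.
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def human_role(level):
    """Summarise what the list above expects of the person in the driver's seat."""
    level = AutomationLevel(level)  # raises ValueError for values outside 0-5
    if level <= AutomationLevel.PARTIAL_AUTOMATION:
        return "drives or constantly supervises"
    if level == AutomationLevel.CONDITIONAL_AUTOMATION:
        return "must be ready to take over"
    if level == AutomationLevel.HIGH_AUTOMATION:
        return "passenger on approved routes"
    return "passenger under all conditions"


for level in AutomationLevel:
    print(int(level), level.name, "->", human_role(level))
```

Making the human role explicit per level also shows where the questions of responsibility become hardest: at the levels where the person is neither fully driving nor purely a passenger.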

Moral dilemmas in the face of autonomous vehicles

Vehicles with autonomy levels 0-2 are not particularly controversial. Technologies such as car control on the motorway are already available and make travelling easier. However, the potential introduction of vehicles with higher autonomy levels into general traffic raises some moral dilemmas. What happens when an autonomous car, under the care of a driver, is involved in an accident? Who is then responsible for causing it? The driver? The vehicle manufacturer? Or perhaps the car itself? There is no clear answer to this question.

Putting autonomous vehicles on the roads also introduces another problem – these vehicles may have security vulnerabilities. Something like this could potentially lead to data leaks or even a hacker taking control of the vehicle. A car taken over in this way could be used to deliberately cause an accident or even carry out a terrorist attack. There is also the problem of dividing responsibility between the manufacturer, the hacker and the user. [3]

One of the most crucial issues related to autonomous vehicles is the ethical training of vehicles to make decisions. It is especially important in the event of danger to life and property. Who should make decisions in this regard: software developers, ethicists and philosophers, or perhaps country leaders? These decisions will affect who survives in the event of an unavoidable accident. Many of the situations that autonomous vehicles may encounter will require decisions that do not have one obvious answer (Figure 1). Should the vehicle prioritise saving pedestrians or passengers, the young or the elderly? How important is it for the vehicle not to interfere with the course of events? Should compliance with the law by the other party to the accident influence the decision? [4]

Fig. 1. An illustration of one of the situations that autonomous vehicles may encounter. Source: https://www.moralmachine.net/  

Deepfake – what is it and why does it lead to misinformation?

People who use modern technology today are bombarded with information from every direction. The sheer volume and speed of information delivery mean that not all of it can be verified. This enables those who fabricate information to reach a relatively large group of people and to manipulate their victims into changing their minds about a subject, or even to defraud them. Practices like this have been around for some time, but they did not previously raise moral dilemmas on this scale. The advent of artificial intelligence dramatically simplifies the process of creating fake news and thus allows it to be created and disseminated more quickly.

Among disinformation techniques, artificial intelligence can be used particularly effectively to produce so-called deepfakes. Deepfake is a technique for manipulating images of people that relies on artificial intelligence. With the help of machine learning algorithms, modified images are superimposed on existing source material, creating realistic videos and images depicting events that never took place. Until recently, the technology mainly allowed for the processing of static images, and video editing was far more difficult to perform. The popularisation of artificial intelligence has removed these technical barriers, which has translated into a drastic increase in the frequency of this phenomenon. [5]

Video 1. Deepfake in the form of video footage using the image of President Obama.

Moral dilemmas associated with deepfakes

Deepfakes can be used for a variety of purposes. The technology can serve harmless projects, for example educational materials such as the video showing President Obama warning about the dangers of deepfakes (see Video 1). It also finds applications in the entertainment industry, such as the use of digital replicas of actors, although this application can itself raise moral dilemmas. An example is the use of a digital likeness of the late actor Peter Cushing to play Grand Moff Tarkin in the film Rogue One: A Star Wars Story (see Figure 2).

Fig. 2. A digital replica of actor Peter Cushing as Grand Moff Tarkin. Source: https://screenrant.com/star-wars-rogue-one-tarkin-ilm-peter-cushing-video/ 

However, there are also many other uses of deepfakes that have the potential to pose a serious threat to the public. Such fabricated videos can be used to disgrace a person, for example by using their likeness in pornographic videos. Fake content can also be used in all sorts of scams, such as attempts to extort money. An example of such use is the case of a doctor whose image was used in an advertisement for cardiac pseudo-medications, which we cited in a previous article [6]. There is also a lot of controversy surrounding the use of deepfakes for the purpose of sowing disinformation, particularly in the area of politics. Used successfully, fake content can lead to diplomatic incidents, change the public’s reaction to certain political topics, discredit politicians and even influence election results. [7]

By its very nature, the spread of deepfakes is not something that can be easily prevented. Legal solutions are not fully effective due to the global scale of the problem and the way social networks operate. Other proposed solutions include developing algorithms to detect fabricated content and educating the public about it.

AI-generated art

There are currently many AI-based text, image and video generators on the market. Midjourney, DALL-E, Stable Diffusion and many others, despite the different implementations and algorithms underlying them, have one thing in common: they require huge amounts of data which, due to their size, can be obtained only from the Internet, often without the consent of the authors of these works. As a result, a number of artists and companies have decided to file lawsuits against the companies developing artificial intelligence models. According to the plaintiffs, the latter are illegally using millions of copyrighted images retrieved from the Internet. So far, the most high-profile lawsuit is the one filed by Getty Images, an agency that offers images for business use, against Stability AI, creators of the open-source image generator Stable Diffusion. The agency accuses Stability AI of copying more than 12 million images from its database without prior consent or compensation (see Figure 3). The outcome of this and other legal cases related to AI image generation will shape the future applications and possibilities of this technology. [8]

Fig. 3. An illustration used in Getty Images’ lawsuit showing an original photograph and a similar image with a visible Getty Images watermark generated by Stable Diffusion. The graphic shows football players during a match. Source: https://www.theverge.com/2023/2/6/23587393/ai-art-copyright-lawsuit-getty-images-stable-diffusion

In addition to the legal problems of training generative models on the basis of copyrighted data, there are also moral dilemmas about artworks made with artificial intelligence. [9]

Will AI replace artists?

Many artists believe that artificial intelligence cannot replicate the emotional aspects of art that human works offer. When we watch films, listen to music and play games, we feel emotions that, in their view, algorithms cannot give us, because algorithms are not creative in the same way that humans are. There are also concerns about the financial situation of many artists. These stem both from the lack of compensation for works included in the algorithms’ training sets and from a reduced number of commissions due to the popularity and ease of use of the generators. [10]

On the other hand, some artists believe that artificial intelligence’s different way of “thinking” is an asset. It can create works that humans are unable to produce. This is one way in which generative models can become another tool in the hands of artists. With them they will be able to create art forms and genres that have not existed before, expanding human creativity.

The popularity and possibilities of generative artificial intelligence continue to grow. Consequently, there are numerous debates about the legal and ethical issues surrounding this technology. It has the potential to drastically change the way we interact with art.

Conclusions

Used appropriately, artificial intelligence has the potential to become an important and widely used tool in the hands of humanity. It can increase productivity, facilitate a wide range of activities and expand our creative capabilities. However, the technology carries certain risks that should not be underestimated. Reckless use of autonomous vehicles, AI art or deepfakes can lead to many problems. These can include financial or reputational losses, and even threats to health and life. Further development of deepfake detection technologies, new methods of recognising disinformation and fake video footage, new legal solutions and educating the public about the dangers of AI will all be important in reducing the occurrence of these problems.

References

[1] https://www.nhtsa.gov/vehicle-safety/automated-vehicles-safety

[2] https://blog.galonoleje.pl/pojazdy-autonomiczne-samochody-bez-kierowcow-juz-sa-na-ulicach

[3] https://www.forbes.com/sites/naveenjoshi/2022/08/05/5-moral-dilemmas-that-self-driving-cars-face-today/

[4] https://www.bbc.com/news/technology-45991093

[5] https://studiadesecuritate.uken.krakow.pl/wp-content/uploads/sites/43/2019/10/2-1.pdf

[6] https://www.medonet.pl/zdrowie/wiadomosci,kolejny-lekarz-ofiara-oszustow–zostal-twarza-pseudolekow–dr-sutkowski–to-jest-kradziez,artykul,26668977.html

[7] https://businessinsider.com.pl/technologie/nowe-technologie/deepfakes-historia-falszywych-filmow-i-pomysly-na-walke-z-nimi/s17z2p0

[8] https://apnews.com/article/getty-images-artificial-intelligence-ai-image-generator-stable-diffusion-a98eeaaeb2bf13c5e8874ceb6a8ce196

[9] https://www.benchmark.pl/aktualnosci/dzielo-sztucznej-inteligencji-docenione.html

[10] https://businessinsider.com.pl/technologie/digital-poland/sztuczna-inteligencja-w-sztuce-szansa-czy-zagrozenie/7lq70sx

Application of Machine Learning in Data Lakes

In the digital age, there is a growing need for advanced technologies, not only for collecting data but above all for analysing it. Companies are accumulating increasing amounts of varied information that can improve their efficiency and innovation. Data Engineering offered by BFirst.Tech can play a key role in using data for the benefit of a company. This is an area of sustainable products for effective information management and processing. This article presents one of the opportunities offered by the Data Engineering area: the integration of Machine Learning with Data Lakes.

Data Engineering – an area of sustainable products dedicated to collecting, analysing and aggregating data

Data engineering is the process of designing and implementing systems for the effective collection, storage and processing of large sets of data. This supports the accumulation of information such as website traffic analysis, data from IoT sensors or consumer purchasing trends. The task of data engineering is to ensure that information is not only skilfully collected and stored, but also easily accessible and ready for analysis. Data can be effectively stored in Data Lakes, data stores and Data Warehouses. Such integrated data sources can be used to create analyses or feed artificial intelligence engines, which ensures comprehensive use of the collected information (see img 1 for a detailed view of the Data Engineering area).

img 1 – Data Engineering

Data lakes used for storing sets of information    

Data lakes enable storing a huge amount of raw data in its original, unprocessed format. Thanks to the possibilities offered by Data Engineering, data lakes are capable of accepting and integrating data from a wide variety of sources, for instance text documents, images and IoT sensor data. This makes it possible to analyse and utilise complex sets of information in one place. The flexibility of data lakes and their ability to integrate diverse types of data make them extremely valuable to organisations facing the challenge of managing and analysing dynamically changing data sets. Unlike Data Warehouses, Data Lakes offer greater versatility in handling a variety of data types, made possible by advanced data processing and management techniques used in Data Engineering. However, that versatility also raises challenges in storing and managing such complex sets of data, and it requires data engineers to constantly adapt and implement innovative approaches. [1, 2]
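The idea of keeping raw data in its original format can be illustrated with a toy example. The sketch below uses a temporary local folder as a stand-in for a data lake; the `land_raw` function, the `<source>/<date>/<name>` layout and the sample payloads are all hypothetical, and a production lake would normally sit on dedicated storage with its own cataloguing and access controls.

```python
import tempfile
from datetime import date
from pathlib import Path


def land_raw(lake_root, source, name, payload, day=None):
    """Store `payload` bytes unchanged under <lake_root>/<source>/<YYYY-MM-DD>/<name>."""
    day = day or date.today()
    target_dir = Path(lake_root) / source / day.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)  # no parsing and no schema: the raw format is preserved
    return target


# Hypothetical payloads of three different kinds landing in the same lake.
lake = tempfile.mkdtemp(prefix="toy_lake_")
day = date(2024, 1, 15)
sensor = land_raw(lake, "iot_sensors", "reading.json", b'{"temp_c": 21.5}', day)
note = land_raw(lake, "documents", "note.txt", "meeting notes".encode("utf-8"), day)
image = land_raw(lake, "images", "pixel.bin", bytes([0, 255, 0]), day)

for path in sorted(Path(lake).rglob("*")):
    if path.is_file():
        print(path.relative_to(lake))
```

Nothing is interpreted at write time, so structure is applied only when the data is read for analysis. That is the source of both the flexibility and the management challenges described above.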

Information processing in data lakes and the application of machine learning   

The increasing volume of stored data and its diversity pose a challenge for effective processing and analysis. Traditional methods are often unable to keep up with the growing complexity, which leads to delays and limitations in accessing key information. Machine Learning, supported by innovations in Data Engineering, can significantly improve those processes. Using extensive data sets, Machine Learning algorithms identify patterns, predict outcomes and automate decisions. Thanks to the integration with Data Lakes (img 2), they can work with a variety of data types, from structured to unstructured, which enables more complex analyses. Such comprehensiveness enables a more thorough understanding and use of data that would be inaccessible in traditional systems.

Applying Machine Learning to Data Lakes enables deeper analysis and more efficient processing, facilitated by advanced Data Engineering tools and strategies. This enables organisations to transform great amounts of raw data into useful and valuable information, which is important for increasing their operational and strategic efficiency. Moreover, the use of Machine Learning supports the interpretation of collected data and contributes to more informed business decision-making. As a result, companies can adapt to market demands more dynamically and create data-driven strategies in an innovative way.

img 2 – Data Lakes

Fundamentals of Machine Learning, key techniques and their application  

Machine Learning is an integral part of artificial intelligence. It enables information systems to learn and improve based on data. Three types of learning are distinguished in the field: Supervised Learning, Unsupervised Learning and Reinforcement Learning. In Supervised Learning, each training example is assigned a label or score, which allows machines to learn, for example, to recognise patterns and create forecasts. That type of learning is used in image classification and financial forecasting, among others. Unsupervised Learning, in turn, uses unlabeled data, focuses on finding hidden patterns and is useful in tasks such as grouping elements or detecting anomalies. Reinforcement Learning is based on a system of rewards and punishments. It helps machines to optimise their actions under dynamically changing conditions, e.g. in games or automation. [3]

In terms of algorithms, neural networks are excellent for recognising patterns in complex data, such as images or sound. They also form the basis of many advanced AI systems. Decision trees are used for classification and predictive analysis, for example in recommendation systems or sales forecasting. Each of those algorithms has unique applications and can be tailored to the specific needs of a task or problem. This makes Machine Learning a versatile tool in the world of data.
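The difference between Supervised and Unsupervised Learning described above can be shown on a few numbers. The sketch below is deliberately minimal and uses only the standard library: a nearest-centroid classifier stands in for supervised learning and a two-group k-means loop for unsupervised learning. The function names and the order values are hypothetical, the clustering assumes at least two distinct values, and Reinforcement Learning is not covered.

```python
from statistics import mean


# Supervised learning: every training example carries a label.
def train_centroids(examples):
    """examples: list of (value, label) pairs. Returns the mean value per label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign the label whose centroid is closest to `value`."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Unsupervised learning: the same kind of values, but without labels.
def two_means(values, iterations=10):
    """Split 1-D values into two groups with a basic k-means loop (k = 2)."""
    low, high = min(values), max(values)  # deterministic starting centres
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(group_low), mean(group_high)
    return sorted(group_low), sorted(group_high)


# Hypothetical order values, first labelled and then unlabelled.
labelled = [(12, "small"), (15, "small"), (9, "small"),
            (210, "large"), (190, "large"), (250, "large")]
centroids = train_centroids(labelled)
print(predict(centroids, 20), predict(centroids, 180))

unlabelled = [11, 14, 205, 8, 230, 198]
print(two_means(unlabelled))
```

In the supervised part the groups are given in advance by the labels; in the unsupervised part the algorithm has to discover the two groups on its own.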

Examples of applications of Machine Learning 

The application of Machine Learning to Data Lakes opens up a wide spectrum of possibilities, from anomaly detection, through personalisation of offers, to optimisation of supply chains. In the financial sector, such algorithms analyse transaction patterns and identify anomalies or potential fraud in real time, which is crucial in preventing financial fraud. In retail and marketing, Machine Learning enables the personalisation of offers by analysing customers’ purchase behaviour and preferences, which increases customer satisfaction and sales efficiency. [4] In industry, the algorithms contribute to the optimisation of supply chains by analysing data from various sources, such as weather forecasts or market trends. This helps predict demand and manage inventory and logistics. [5]

Machine Learning algorithms can also be used for pre-design or product optimisation. Another interesting application of Machine Learning in Data Lakes is image analysis. The algorithms are able to process and analyse large sets of images. They are used in fields such as medical diagnostics, where they can help detect and classify lesions in radiological images, or in security systems, where camera image analysis can be used to identify and track objects or people.
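Anomaly detection in transaction data, mentioned above for the financial sector, can be sketched with a simple statistical rule. The example below flags amounts that lie far from the median, measured in units of the median absolute deviation (a so-called modified z-score). The function name, the sample amounts, the 0.6745 scaling constant and the 3.5 threshold are assumptions taken from common statistical practice; real fraud detection systems use far richer features and models.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return amounts whose modified z-score (median/MAD based) exceeds `threshold`."""
    centre = median(amounts)
    mad = median([abs(a - centre) for a in amounts])  # median absolute deviation
    if mad == 0:
        return []  # no spread to compare against; this sketch simply gives up
    return [a for a in amounts if 0.6745 * abs(a - centre) / mad > threshold]


# Hypothetical card transactions: six ordinary amounts and one unusual one.
transactions = [20, 22, 19, 21, 23, 20, 500]
print(flag_anomalies(transactions))
```

The median is used instead of the mean because a single extreme value shifts the mean and inflates the standard deviation, which can hide the very outlier being looked for in a small sample.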

Conclusions

The article emphasises developments in the field of data analytics, highlighting how Machine Learning, Data Lakes and data engineering influence the way organisations process and use information. Introducing such technologies into business improves existing processes and opens the way to new opportunities. The Data Engineering area introduces modernisation into information processing, characterised by greater precision, deeper conclusions and faster decision-making.  That progress emphasises the growing value of Data Engineering in the modern business world, which is an important factor in adapting to dynamic market changes and creating data-driven strategies. 

References 

[1] https://bfirst.tech/data-engineering/ 

[2] https://www.netsuite.com/portal/resource/articles/data-warehouse/data-lake.shtml 

[3] https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained 

[4] https://www.tableau.com/learn/articles/machine-learning-examples 

[5] https://neptune.ai/blog/use-cases-algorithms-tools-and-example-implementations-of-machine-learning-in-supply-chain

Artificial intelligence and voice creativity

Artificial intelligence (AI) has recently ceased to be a catchphrase that belongs in science-fiction writing and has become part of our reality. From all kinds of assistants to text, image, and sound generators, the machine and the responses it produces have made their way into our everyday lives. Are there any drawbacks to this situation? If so, can they be counterbalanced by benefits? This post addresses these questions and other dilemmas related to the use of AI in areas involving the human voice. 

How does artificial intelligence get its voice? The development of AI voices encompasses a number of cutting-edge areas, but the most commonly used methods include:

 

  • machine learning algorithms that allow systems to learn from data and improve their performance over time. Supervised learning is often employed to train AI voice models using large data sets related to human speech. With supervised learning, an AI model learns to recognise patterns and correlations between text input and corresponding voice messages. The AI learns from multiple examples of human speech and adjusts its settings so that the output it generates is as close as possible to real human speech. As the model processes more data, it refines its understanding of phonetics, intonation, and other speech characteristics, which results in increasingly natural and expressive voices;  

 

  • natural language processing (NLP) enables machines to understand and interpret human language. Applying NLP techniques allows artificial intelligence to break down written words and sentences to find important details such as grammar, meaning, and emotions. NLP allows AI voices to interpret and speak complex sentences, even if the words have multiple meanings or sound the same. Thanks to this, the AI voice sounds natural and makes sense, regardless of the type of language used. NLP is the magic that bridges the gap between written words and speech, making AI voices sound like real people, even when complex language patterns are involved.  

 

  • Speech synthesis techniques allow machines to transform processed text into intelligible and expressive speech. This can be done in a variety of ways, for example, by assembling recorded speech to form sentences (concatenative synthesis) or using mathematical models to create speech (parametric synthesis), which allows for greater customisation. Recently, a breakthrough method called neural TTS (Text-to-Speech) has emerged. It uses deep learning models, such as neural networks, to generate speech from text. This technique makes AI voices sound even more natural and expressive, capturing the finer details, such as rhythm and tone, that make human speech unique.  

 

 

In practice, the available tools can be divided into two main categories:  Text-to-Speech and Voice-to-Voice. Each allows you to clone a person’s voice, but TTS is much more limited when it comes to reproducing unusual words, noises, reactions, and expressing emotions. Voice-to-Voice, put simply, “replaces” the sound of one voice with another, making it possible, for example, to create an artificial performance of one singer’s song by a completely different singer, while Text-to-Speech uses the created voice model to read the input text (creating a spectrogram from the text and then passing it to a vocoder, which generates an audio file) [1]. As with any machine learning issue, the quality of the generated speech depends to a large extent on the model and the data on which the model was trained.  

While the beginnings of the research on human speech can be traced back to as early as the late 18th century, work on speech synthesis gained momentum much later, in the 1920s-30s, when the first vocoder was developed at Bell Labs [2]. The issues related to voice imitation and cloning (which is also referred to as voice deepfakes) were first addressed on a wider scale in a scientific paper published in 1997, while the fastest development of the technologies we know today occurred after 2010. The specific event that fuelled the popularity and availability of voice cloning tools was Google’s publication of the Tacotron speech synthesis algorithm in 2017 [3].   

 

Artificial intelligence can already “talk” to us in many everyday situations: virtual assistants such as Siri or Alexa built into devices, and the automated customer service lines used by various companies and institutions, are already widespread. However, the technology also opens up possibilities that could cause problems and that raise controversy about the ethics of developing it further.

At the forefront here are the problems raised by voice workers, who fear the prospect of losing their jobs to machines. For these people, apart from being part of their identity, their voice is also a means of artistic expression and a work tool. If a sufficiently accurate model of a person’s voice is created, then suddenly, at least in theory, that person’s work becomes redundant. This very topic was the subject of a discussion that ignited the Internet in August 2023, when a YouTube creator posted a self-made animation produced in Blender, inspired by the iconic TV series Scooby-Doo [4]. The controversy was caused by the application of AI by the novice author to generate dialogues for the four characters featured in the cartoon, using the voice models of the original cast (who were still professionally active). A wave of criticism fell on the artist for using someone else’s voice for his own purposes, without permission. The issue was discussed among animation professionals, and one of the voice actresses from the original cast of the series also commented on it. She expressed her outrage, adding that she would never work with this artist and that she would warn her colleagues in the industry against him. As the artist published an apology (admitting his mistake and explaining that his actions were motivated by the lack of funds to hire voice-overs and the entirely amateur and non-profit nature of the animation he had created), the decision to blacklist him was revoked and the parties reconciled. However, what emerged from the discussion was the acknowledgment that the use of artificial intelligence for such purposes needs to be legally regulated. The list of professions affected by this issue is long, and there are already plenty of works using people’s voices in a similar way. 
Even though this is mostly content created by and for fans paying a kind of tribute to the source material, technically speaking, it still involves using part of someone’s identity without their permission. 

 

Another dilemma has to do with the ethical concerns that arise when someone considers using the voice of a deceased person to create new content. The Internet is already full of “covers” in which newly released songs are “performed” by deceased artists. This is an extremely sensitive topic, considering the feelings of the family, loved ones, and fans of the deceased person, as well as how the deceased person would feel knowing that part of their image was used this way.  

Another danger is that the technology may be used for deception and misrepresentation. While remakes featuring politicians playing multiplayer games remain in the realm of innocent jokes, putting words into politicians’ mouths that they have never said, for example during an election campaign, is dangerous and can have serious consequences for society as a whole. Currently, the elderly are particularly vulnerable to such fakes and manipulation. However, with the improvement of models and the parallel development of methods for generating images and mouth movements, even those who are familiar with the phenomenon may find it increasingly difficult to tell the difference between what is false and what is real [5].

In the worst-case scenario, such deceptions can result in identity theft. From time to time, we learn about celebrities appearing in advertisements that they have never heard of [6]. Experts and authorities in specific fields, such as doctors, can also fall victim to this kind of identity theft when their artificially created image is used to advertise various preparations that often have nothing to do with medicine. Such situations, already occurring in our country [7], are particularly harmful, as potential recipients of such advertisements are not only exposed to needless expenses but also risk their health and potentially even their lives. Biometric verification by voice is also quite common. If a faithful model of a customer’s voice is created and there is a leak of his or her personal data, the consequences may be disastrous. The risk of such a scenario has already materialised for an application developed by the Australian government [8]. 

 

It is extremely difficult to predict in what direction the development of artificial intelligence will go with regard to human voice generation applications. It seems necessary to regulate the possibility of using celebrity voice models for commercial purposes and to ensure that humans are not completely replaced by machines in this sphere of activity. Failure to make significant changes in this matter could lead to a further loss of confidence in tools using artificial intelligence. This topic is divisive and has many supporters as well as opponents.  Like any tool, it is neither good nor bad in itself – rather, it all depends on how it is used and on the user’s intentions. We already have tools that can detect whether a given recording has been artificially generated. We should also remember that it takes knowledge, skill, and effort to clone a human voice in a convincing way. Otherwise, the result is clumsy and one can immediately tell that something is not right. This experience is referred to as the uncanny valley. The subtleties, emotions, variations, accents, and imperfections present in the human voice are extremely difficult to reproduce. This gives us hope that machines will not replace human beings completely, and this is only due to our perfect imperfection.

Sight-playing – part 3

We already created the harmony of the piece in the previous article. What we need now is a good melody which will match this harmony. Melodies consist of motifs, i.e. small fragments of about 2-5 notes and their variations (transformations).

We will start by generating the first motif: its rhythm and its sounds. As we did when generating the harmony, we will use N-gram statistics of musical pieces. These statistics will be prepared using the Essen Folksong Collection. You may just as well use any other collection of melodies; this choice will affect the type of melodies that are generated. For each piece, we must isolate the melody, convert it into a sequence of rhythmic values and a sequence of sounds, and extract the statistics from these sequences. When compiling the sound statistics, it is a good idea to prepare the melodies first by transposing them all to two keys, e.g. C major and c minor. This reduces the number of possible (probable) N-grams by a factor of 12, so the statistics will be estimated more reliably.

A good motif

We will begin creating the first motif by generating its rhythm. Recall the simplification we made earlier: each motif and its variations last exactly one bar. The rhythm of a motif is generated in the following steps:

  • we draw the first rhythmic value using unigrams,
  • we draw the next rhythmic value using bigrams and unigrams,
  • we continue to draw consecutive rhythmic values, using N-grams of increasingly higher order (up to 5-grams),
  • we stop when the total of the drawn rhythmic values reaches the length of one bar,
  • if we have exceeded the length of one bar, we start the whole process from the beginning (generation is fast enough that we can afford such a sub-optimal trial-and-error method).
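The rhythm-drawing loop can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used for the article: `RHYTHM_NGRAMS` is a tiny made-up table standing in for statistics extracted from a melody collection, durations are expressed in quarter notes, and falling back to the longest available context is one assumed way of combining the N-gram levels.

```python
import random

BAR_LENGTH = 4.0   # one 4/4 bar, measured in quarter notes
MAX_ORDER = 5      # highest N-gram order used (5-grams)

# Hypothetical toy statistics: context (previous durations) -> {next duration: count}.
# Real statistics would be extracted from a melody collection.
RHYTHM_NGRAMS = {
    (): {1.0: 50, 0.5: 30, 2.0: 20},
    (1.0,): {1.0: 25, 0.5: 15, 2.0: 10},
    (0.5,): {0.5: 20, 1.0: 10},
    (2.0,): {1.0: 10, 2.0: 10},
    (1.0, 1.0): {1.0: 10, 2.0: 8, 0.5: 5},
}


def draw_next_duration(history, rng):
    """Draw the next duration using the longest context present in the statistics."""
    longest = min(MAX_ORDER - 1, len(history))
    for order in range(longest, -1, -1):
        # Slice from an explicit index so that order == 0 yields the empty context.
        context = tuple(history[len(history) - order:])
        counts = RHYTHM_NGRAMS.get(context)
        if counts:
            durations = list(counts)
            weights = [counts[d] for d in durations]
            return rng.choices(durations, weights=weights)[0]
    raise ValueError("no statistics available for any context")


def generate_motif_rhythm(rng, max_restarts=1000):
    """Draw durations until one bar is filled exactly; restart on overshoot."""
    for _ in range(max_restarts):
        rhythm = []
        while sum(rhythm) < BAR_LENGTH:
            rhythm.append(draw_next_duration(rhythm, rng))
        if sum(rhythm) == BAR_LENGTH:
            return rhythm
    raise RuntimeError("could not fill one bar exactly")


rhythm = generate_motif_rhythm(random.Random(0))
print(rhythm)
```

The durations used here (0.5, 1 and 2 quarter notes) are exactly representable as floats, so comparing the running total with the bar length is safe; with triplets or dotted values, fractions would be a safer choice.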

The next step is to generate the sounds of the motif. Another simplification we made earlier is that we generate pieces only in the key of C major, so we will use the N-gram statistics built from pieces transposed to this key, excluding pieces in minor keys. The procedure is similar to that for generating the rhythm:

  • we draw the first sound using unigrams,
  • we draw the next sound using bigrams and unigrams,
  • we continue until we have drawn as many sounds as we previously drew rhythmic values,
  • we check whether the motif matches the harmony; if not, we go back and start again,
  • if after approx. 100 attempts we have failed to generate a motif matching the harmony, this could mean that, with the preset harmony and the preset motif rhythm, the probability of drawing sounds that match the harmony is very low. In this case, we go back and generate a new motif rhythm.
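A hedged sketch of this retry loop is shown below. The pitch statistics are invented toy values, and the criterion in `matches_harmony` (first note is a chord tone and at least half of the notes belong to the chord) is purely an assumption made for illustration, since the exact matching rule is not specified here.

```python
import random

# Hypothetical toy statistics for melodies in C major (counts, not real data).
PITCH_UNIGRAMS = {"C": 5, "D": 3, "E": 4, "F": 2, "G": 5, "A": 2, "B": 1}
PITCH_BIGRAMS = {
    "C": {"D": 3, "E": 3, "G": 2},
    "D": {"C": 3, "E": 3},
    "E": {"D": 2, "F": 2, "G": 3},
    "G": {"E": 3, "A": 2, "C": 2},
}


def draw_pitch(previous, rng):
    """Use bigram counts when available, otherwise fall back to unigrams."""
    counts = PITCH_BIGRAMS.get(previous) if previous is not None else None
    if not counts:
        counts = PITCH_UNIGRAMS
    names = list(counts)
    return rng.choices(names, weights=[counts[n] for n in names])[0]


def matches_harmony(pitches, chord_tones):
    """Assumed criterion: first note in the chord, at least half of all notes in the chord."""
    in_chord = sum(p in chord_tones for p in pitches)
    return pitches[0] in chord_tones and 2 * in_chord >= len(pitches)


def generate_motif_sounds(n_notes, chord_tones, rng, max_attempts=100):
    """Return matching pitches, or None so the caller can draw a new rhythm."""
    for _attempt in range(max_attempts):
        pitches = []
        for _note in range(n_notes):
            previous = pitches[-1] if pitches else None
            pitches.append(draw_pitch(previous, rng))
        if matches_harmony(pitches, chord_tones):
            return pitches
    return None


sounds = generate_motif_sounds(4, {"C", "E", "G"}, random.Random(1))
print(sounds)
```

Returning `None` after the attempt limit, rather than raising, keeps the "go back and generate a new motif rhythm" decision in the calling code.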

Generate until you succeed

When generating both the motif rhythm and its sounds, we use the trial-and-error method. It will also be used in the generation of motif variations described below. Even if this method may seem “stupid”, it’s simple and it works. Although very often such randomly generated motifs don’t match the harmony, we can afford to make many such mistakes. Even 1000 attempts take a short time to calculate on today’s computers, and this is enough to find the right motif.

Variations with repetitions

We have the first motif, and now need the rest of the melody. However, we will not continue to generate new motifs, as the piece would become chaotic. We also cannot keep repeating the same motif, as the piece would become too boring. A reasonable solution would be, in addition to repeating the motif, to create a modification of that motif, ensuring variation, but without making the piece chaotic.

There are many methods to create motif variations. One such method is chromatic transposition. It involves transposing all notes upward or downward by the same interval. This method can lead to a situation where a motif variation has sounds from outside the key of the piece. This, in turn, means that the probability that the variation will match the harmony is very low. Another method is diatonic transposition, whereby all notes are transposed by the same number of scale steps. Unlike the previous method, diatonic variations do not have off-key sounds.
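The difference between the two transpositions can be illustrated with a short Python sketch. It assumes, for illustration only, that notes are represented as MIDI note numbers (60 = middle C) and that the piece is in C major; the function names are hypothetical.

```python
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]  # pitch classes of the C major scale


def chromatic_transpose(motif, semitones):
    """Shift every note by the same number of semitones."""
    return [note + semitones for note in motif]


def diatonic_transpose(motif, steps):
    """Shift every note by the same number of scale steps within C major."""
    result = []
    for note in motif:
        octave, pitch_class = divmod(note, 12)
        degree = C_MAJOR.index(pitch_class)  # raises ValueError for off-key notes
        octave_shift, new_degree = divmod(degree + steps, 7)
        result.append(12 * (octave + octave_shift) + C_MAJOR[new_degree])
    return result


def is_in_key(motif):
    return all(note % 12 in C_MAJOR for note in motif)


motif = [60, 62, 64]                       # C, D, E
chromatic = chromatic_transpose(motif, 1)  # [61, 63, 65] -> C#, D#, F
diatonic = diatonic_transpose(motif, 1)    # [62, 64, 65] -> D, E, F
print(chromatic, is_in_key(chromatic))     # off-key notes appear
print(diatonic, is_in_key(diatonic))       # stays within C major
```

Note that the diatonic variant changes the interval sizes (a whole tone C–D followed by D–E becomes D–E followed by the semitone E–F), which is the price paid for staying in key.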

Yet another method is to change a single interval; one of the motif intervals is changed, while all other intervals remain unchanged. That way, only one part of the motif (the beginning or the end) is transposed (via chromatic or diatonic transposition). Further methods are to convert two notes with the same rhythmic value to one or to convert one note to two notes with the same rhythmic value. For the first method, if the motif has two notes with the same rhythmic value, its rhythm can be changed by combining these two notes. For the second method, a note is selected at random and converted to two “shorter” notes.
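The two rhythmic variations (merging two notes and splitting one note) can be sketched as below. The representation of a motif as a list of `(pitch, duration)` pairs is an assumption, as are the choices to merge only adjacent notes, to keep the pitch of the first merged note, and to give both halves of a split note the same pitch.

```python
import random


def merge_two_notes(motif, rng):
    """Combine two adjacent notes of equal duration into one longer note."""
    candidates = [i for i in range(len(motif) - 1) if motif[i][1] == motif[i + 1][1]]
    if not candidates:
        return None  # this variation is not applicable to the motif
    i = rng.choice(candidates)
    pitch, duration = motif[i]
    return motif[:i] + [(pitch, 2 * duration)] + motif[i + 2:]


def split_one_note(motif, rng):
    """Replace a randomly chosen note with two notes of half its duration."""
    i = rng.randrange(len(motif))
    pitch, duration = motif[i]
    half = duration / 2
    return motif[:i] + [(pitch, half), (pitch, half)] + motif[i + 1:]


rng = random.Random(0)
motif = [(60, 1.0), (62, 1.0), (64, 2.0)]  # (MIDI pitch, duration in quarter notes)
merged = merge_two_notes(motif, rng)       # [(60, 2.0), (64, 2.0)]
split = split_one_note(motif, rng)
print(merged)
print(split)
```

Both operations preserve the total duration of the motif, so the variation still fills exactly one bar.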

Each of the described methods for creating variations makes it possible to generate different motifs.

Etc., etc.

There are many more methods for generating motif variations, and it is possible to come up with plenty of others. The only restriction is that the generated variation should not differ too much from the original motif. Otherwise, it would constitute a new motif rather than a variation. The border where a variation ends and another motif begins is rather conventional in nature, and everyone “feels” and defines it a little differently.

Is that all?

That would be all when it comes to piece generation. Let us summarise the steps that we have taken:

  1. Generating piece harmony:
    • generating harmonic rhythm,
    • generating chord progression.
  2. Generating melody:
    • generating motif rhythm,
    • generating motif sounds,
    • creating motif variations,
    • creating motifs and variations “until it’s done”, that is, until they match the generated harmony

All that is left is to make sure that the generated pieces are of the given difficulty, i.e. matching the skills of the performer.

Controlling the difficulty

One of our assumptions was the ability to control the piece difficulty. This can be achieved via two approaches:

  1. generating pieces “one after another” and checking their difficulty levels (using the methods described earlier), thereby preparing a large database of pieces from which random pieces of the given difficulty will then be selected,
  2. controlling the parameters for creating the harmonies, motifs and variations in such a way that they generate musical elements of the given difficulty with increased frequency

The two approaches are not mutually exclusive and can be successfully used together. First, a number of pieces (e.g. 1000) should be generated randomly, and then the parameters should be controlled to generate further pieces (but only those of the difficulty levels that are still missing). With respect to parameter control, it is worth noting that the probability of motif repetition can be changed. For pieces with low difficulty, the assigned probability will be higher (repetitions are easier to play). Difficult pieces, on the other hand, will be assigned a lower repetition probability and rarer harmonies (which will also force rarer motifs and variations).
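As a rough illustration of parameter control, the sketch below maps a difficulty level to a motif repetition probability. The linear mapping, the bounds 0.8 and 0.2, and the representation of difficulty as a number between 0 and 1 are all hypothetical choices made for the example; the article does not prescribe specific values.

```python
import random

# Hypothetical bounds: easy pieces repeat the motif often, hard pieces rarely.
EASY_REPEAT_PROB = 0.8
HARD_REPEAT_PROB = 0.2


def repetition_probability(difficulty):
    """Linearly interpolate the repetition probability for difficulty in [0, 1]."""
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be between 0 and 1")
    return EASY_REPEAT_PROB + (HARD_REPEAT_PROB - EASY_REPEAT_PROB) * difficulty


def plan_bars(n_bars, difficulty, rng):
    """Decide for each bar after the first whether to repeat or vary the motif."""
    p_repeat = repetition_probability(difficulty)
    plan = ["motif"]
    for _ in range(n_bars - 1):
        plan.append("repeat" if rng.random() < p_repeat else "variation")
    return plan


print(plan_bars(8, 0.1, random.Random(0)))  # easy piece: mostly repetitions expected
print(plan_bars(8, 0.9, random.Random(0)))  # hard piece: mostly variations expected
```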

Sight-playing – part 2

In the first part of the article, we have learned about many musical and technical concepts. Now it is time to use them to build an automatic composer.  Before doing so, however, we must make certain assumptions (or rather simplifications):

  • the pieces will consist of 8 bars in periodic structure (antecedent 4 bars, consequent 4 bars)
  • the metre will be 4/4 (i.e. 4 quarter notes to each bar, accent on the first and third beats of the bar)
  • the length of each motif is 1 bar (although this requirement appears restrictive, many popular pieces are built precisely from motifs that last 1 bar)
  • only the key of C major will be used (if necessary, we can always transpose the piece to any key after it is generated)
  • we will limit ourselves to about 25 most common varieties of harmonic degrees (there are 7 degrees, but some of them have several versions, with additional sounds which change the chord colour).

What is needed to create a musical piece?

In order to automatically create a simple musical piece, we need to:

  • generate the harmony of a piece – chords and their rhythm
  • create motifs – their sounds (pitches) and rhythm
  • create variations of these motifs – as above
  • combine the motifs and variations into a melody, matching them with the harmony

Having mastered the basics, we can move on to the first part of automatic composing – generating a harmony. Let’s start by creating a rhythm of the harmony.

Slow rhythm

Although one might be tempted to create a statistical model of the harmonic rhythm, unfortunately (at least at the time of writing this article) no database is available that would make this possible. Given the above, we must handle this in a different way: let’s come up with such a model ourselves. For this purpose, let’s choose a few “sensible” harmonic rhythms and give them some “sensible” probabilities.

rhythm    probability    rhythm          probability
[8]       0.2            [2, 2]          0.1
[6, 2]    0.1            [2, 1, 1]       0.02
[2, 6]    0.1            [3, 1]          0.02
[7, 1]    0.02           [1, 1, 1, 1]    0.02
[4]       0.4            [1, 1, 2]       0.02
Table 1. Harmonic rhythms, values expressed in quarter notes – [6, 2] denotes a rhythm in which there are two chords, the first one lasts 6 quarter notes, the second 2 quarter notes.

The rhythms in the table are presented in terms of chord duration, and the duration is shown in the number of quarter notes. Some rhythms last two bars (e.g. [8], [6, 2]), and others one bar ([4], [1, 1, 2] etc.).

Generating a rhythm of the harmony proceeds as follows. We draw new rhythms until we have as many bars as we needed (8 in our case). Sometimes certain complications may arise from the fact that the rhythms have different lengths. For example, there may be a situation where to complete the generation we need the last rhythm that lasts 4 quarter notes, but we draw one that lasts 8 quarter notes. In this case, in order to avoid unnecessary problems, we can force drawing from a subset of 4-quarter-note rhythms.

Then, in line with the above findings, let’s suppose that we drew the following rhythms:

  • antecedent: [4, 4], [2, 2], [3, 1], 
  • consequent: [3, 1], [8], [2, 2]
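The drawing procedure described above can be sketched in Python using the probabilities from Table 1. Generating each four-bar phrase (16 quarter notes) separately and restricting the draw to rhythms that still fit are assumptions of this sketch; the latter corresponds to forcing a draw from the subset of 4-quarter-note rhythms when only one bar is left.

```python
import random

# Harmonic rhythms from Table 1 (durations in quarter notes) and their probabilities.
HARMONIC_RHYTHMS = {
    (8,): 0.2, (6, 2): 0.1, (2, 6): 0.1, (7, 1): 0.02, (4,): 0.4,
    (2, 2): 0.1, (2, 1, 1): 0.02, (3, 1): 0.02, (1, 1, 1, 1): 0.02, (1, 1, 2): 0.02,
}


def generate_phrase_rhythm(rng, phrase_quarters=16):
    """Draw harmonic rhythms until the phrase (4 bars of 4/4 by default) is filled."""
    drawn = []
    remaining = phrase_quarters
    while remaining > 0:
        # Only rhythms that still fit are allowed, e.g. one-bar rhythms for the last bar.
        options = [r for r in HARMONIC_RHYTHMS if sum(r) <= remaining]
        weights = [HARMONIC_RHYTHMS[r] for r in options]
        rhythm = rng.choices(options, weights=weights)[0]
        drawn.append(rhythm)
        remaining -= sum(rhythm)
    return drawn


rng = random.Random(0)
antecedent = generate_phrase_rhythm(rng)
consequent = generate_phrase_rhythm(rng)
print("antecedent:", antecedent)
print("consequent:", consequent)
```

`random.choices` accepts unnormalised weights, so the probabilities of the remaining options do not need to be rescaled by hand after the subset is restricted.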

Likelihood

In the next step, we will be using the concept of likelihood. It is a probability not normalised to one (a so-called pseudo-probability), which helps to assess the relative probability of different events. For example, if the likelihoods of events A and B are 10 and 20 respectively, this means that event B is twice as likely as event A. These likelihoods might as well be 1 and 2, or 0.005 and 0.01. Probabilities can be calculated from the likelihoods. If we assume that only events A and B can occur, then their probabilities are, respectively:

$$p(A) = \frac{\text{likelihood}_A}{\text{likelihood}_A + \text{likelihood}_B} = \frac{10}{10 + 20} = \frac{1}{3}, \qquad p(B) = \frac{\text{likelihood}_B}{\text{likelihood}_A + \text{likelihood}_B} = \frac{20}{10 + 20} = \frac{2}{3}$$
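A minimal Python sketch of this normalisation, assuming the likelihoods are stored in a dictionary keyed by event name (the function name `normalise` is chosen here only for illustration):

```python
def normalise(likelihoods):
    """Convert likelihoods (pseudo-probabilities) into probabilities that sum to one."""
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    return {event: value / total for event, value in likelihoods.items()}


probabilities = normalise({"A": 10, "B": 20})
print(probabilities)
```

Scaling all likelihoods by the same factor, e.g. to 1 and 2 or to 0.005 and 0.01, leaves the resulting probabilities unchanged.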

Chord progressions

In order to generate probable harmonic flows, we will first prepare the N-gram models of harmonic degrees. To this end, we will use the N-gram models available on GitHub (https://github.com/DataStrategist/Musical-chord-progressions).

In our example, we will use 1-, 2-, 3-, 4- and 5-grams.

In the rhythm of the antecedent’s harmony, there are 6 rhythmic values, so we need to prepare a flow of 6 harmonic degrees. We generate the first chord using unigrams (1-grams). We first prepare the likelihood of each possible degree and then draw a degree, taking these likelihoods into account. The formula for the likelihood is quite simple in this case:

$$\text{likelihood}_X = p(X)$$

where

  • X means any harmonic degree
  • p(X) is the probability of the 1-gram of X

In this case, we drew the IV degree (in the key of C major, this is the F major chord).

We generate the second chord using bigrams and unigrams, with a greater weight for bigrams.

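$$\text{likelihood}_X = \text{weight}_{2\text{gram}} \cdot p(X \mid \text{IV}) + \text{weight}_{1\text{gram}} \cdot p(X)$$

where:

  • $p(X \mid \text{IV})$ is the probability of the flow (IV, X), i.e. of degree X following degree IV
  • $\text{weight}_{N\text{gram}}$ is the adopted N-gram weight (the greater the weight, the greater the impact of this N-gram model, and the smaller the impact of other models)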

We can adopt N-gram weights as we wish. For this example, we chose the following:

N-gram    1        2       3      4    5
weight    0.001    0.01    0.1    1    5

The next chord we drew was: vi degree (a minor).

The generation of the third chord is similar, except that we can now use 3-grams:

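$$\text{likelihood}_X = \text{weight}_{3\text{gram}} \cdot p(X \mid \text{IV}, \text{vi}) + \text{weight}_{2\text{gram}} \cdot p(X \mid \text{vi}) + \text{weight}_{1\text{gram}} \cdot p(X)$$

Here $p(X \mid \text{IV}, \text{vi})$ is the probability of the flow (IV, vi, X), and $p(X \mid \text{vi})$ is the probability of the flow (vi, X), since the bigram term always refers to the most recently drawn chord.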

And so we continue until we have generated all the necessary chords. In our case, we drew:

IV, vi, I, iii, IV, vi (in the adopted key of C major these are, respectively, F major, a minor, C major, e minor, F major and a minor chords).

This is not a very common chord progression but, as it turns out, it occurs in 5 popular songs (https://www.hooktheory.com/trends#node=4.6.1.3.4.6&key=rel)
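The whole procedure can be sketched in Python as a weighted mix of N-gram models. The weights are those listed above, while `NGRAM_PROBS` is a small invented table used only to make the example runnable; real values would come from the N-gram models mentioned earlier, whose file format is not assumed here.

```python
import random

# N-gram weights adopted in the example above.
WEIGHTS = {1: 0.001, 2: 0.01, 3: 0.1, 4: 1, 5: 5}

DEGREES = ["I", "iii", "IV", "V", "vi"]

# Hypothetical toy probabilities: flow of degrees -> probability (not real data).
NGRAM_PROBS = {
    ("I",): 0.3, ("iii",): 0.1, ("IV",): 0.2, ("V",): 0.25, ("vi",): 0.15,
    ("IV", "vi"): 0.2, ("IV", "I"): 0.4, ("IV", "V"): 0.4,
    ("vi", "I"): 0.3, ("vi", "IV"): 0.5, ("I", "iii"): 0.1, ("I", "V"): 0.5,
    ("IV", "vi", "I"): 0.4, ("vi", "I", "iii"): 0.2,
}


def likelihood(candidate, history):
    """Weighted sum of N-gram probabilities for `candidate` following `history`."""
    total = 0.0
    for n, weight in WEIGHTS.items():
        if n - 1 > len(history):
            break  # not enough previous chords for this N-gram order
        context = tuple(history[len(history) - (n - 1):])
        total += weight * NGRAM_PROBS.get(context + (candidate,), 0.0)
    return total


def generate_progression(length, rng):
    """Draw degrees one by one, each according to its likelihood."""
    progression = []
    for _ in range(length):
        weights = [likelihood(degree, progression) for degree in DEGREES]
        progression.append(rng.choices(DEGREES, weights=weights)[0])
    return progression


print(generate_progression(6, random.Random(0)))
```

Because every degree has a non-zero unigram probability in this toy table, each candidate always has a positive likelihood, so the draw never fails even when no higher-order N-gram is known for the current context.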

Summary

We were able to generate the rhythms and chords which are the components of the harmony of a piece. However, it should still be noted here that, for the sake of simplicity, we didn’t take into account two important factors:

  • The harmonic flows of the antecedent and consequent are very often linked in some way. The harmony of the consequent may be identical with that of the antecedent or perhaps slightly altered to create the impression that these two sentences are somehow linked.
  • The antecedent and consequent almost always end on specific harmonic degrees. This is not a strict rule, but some harmonic degrees are far more likely than others at the end of musical sentences.

For the purposes of the example, however, the task can be deemed completed. The harmony of the piece is ready; now we only need to create a melody for this harmony. In the next part of our article, you will find out how to compose such a melody.

Cloud computing vs environment

The term “cloud computing” is difficult to define in a clear manner. Companies will approach the cloud differently than individuals. Typically, “cloud computing” is used to mean a network of server resources available on demand – computing power and disk space, but also software – provided by external entities, i.e. the so-called cloud providers. The provided resources are accessible via the Internet and managed by the provider, which eliminates the need for companies to purchase hardware and directly manage physical servers. In addition, the cloud is distributed over multiple data centres located in many different regions of the world, which means that users can count on lower failure rates and higher availability of their services [1].

The basic operation of the cloud

Resources available in the cloud are shared by multiple clients, which makes it possible to make better use of the available computing power and, if utilised properly, can prove to be more cost-effective. Such an approach to resource sharing may raise some concerns, but thanks to virtualisation, the cloud provides higher security than the traditional computing model. Virtualisation makes it possible to create simulated computers, so-called virtual machines, which behave like their physical counterparts, but reside on a single server and are completely isolated from each other. Resource sharing and virtualisation allow for efficient use of hardware and ultimately reduce power consumption by server rooms. Financial savings can be felt thanks to the “pay-as-you-go” business model commonly used by providers, which means that users are billed for actually used resources (e.g. minutes or even seconds of used computing time), as opposed to paying a fixed fee. 

The term “cloud” itself originated as a slang term. In technical diagrams, network and server infrastructure is often represented by a cloud icon [2]. Currently, “cloud computing” is a generally accepted term in IT and a popular computing model. The affordability of the cloud and the fact that users are not required to manage it themselves mean that this computing model is being increasingly preferred by IT companies, which has a positive impact on environmental aspects [3].

Lower power consumption 

The increasing demand for IT solutions leads to increased demand for electricity – a strategic resource in terms of maintaining the cloud. A company maintaining its own server room leads to significant energy expenditure, generated not only by the computer hardware itself but also by the server room cooling system. 

Although it may seem otherwise, larger server rooms which process huge amounts of data at once are more environmentally friendly than local server rooms operated by companies [4]. According to a study carried out by Accenture, migrating a company to the cloud can reduce power consumption by as much as 65%. This stems from the fact that cloud solutions on the largest scale are typically built at dedicated sites, which improves infrastructure organisation and resource management [5]. Providers of large-scale cloud services can design the most effective cooling system in advance. In addition, they make use of modern hardware, which is often much more energy-efficient than the hardware used in an average server room. A study conducted in 2019 revealed that the AWS cloud was 3.6 times more efficient in terms of energy consumption than the median of the surveyed data centres operated by companies in the USA [6].

Moreover, as the cloud is a shared environment, performance can be effectively controlled. The scale of the number of users of a single computing cloud allows for a more prudent distribution of consumed energy between individual cases. Sustainable resource management is also enabled by our Data Engineering product, which collects and analyses data in order to maximise operational efficiency and effectiveness.

Reduction of emissions of harmful substances

Building data processing centres which make use of green energy sources and are based on low-emission solutions makes it possible, among other things, to control emissions of carbon dioxide and other gases which contribute to the greenhouse effect. According to data presented in the “The Green Behind Cloud” report [7], migrating to the public cloud can reduce global carbon dioxide emissions by 59 million tonnes per year, which is equivalent to removing 22 million cars from the roads.

It is also worth considering migration to providers which are mindful of their carbon footprint. For example, the cloud operated by Google is fully carbon-neutral through the use of renewable energy, and the company promises to use only zero-emission energy around the clock in all data centres by 2030 [8]. The Azure cloud operated by Microsoft has been carbon-neutral since 2012, and its customers can track the emissions generated by their services using a special calculator [9].

Reduction of noise related to the use of IT hardware  

Noise is classified as environmental pollution. Though at first glance it may appear quite inconspicuous and harmless, it has a negative impact on human health and the quality of the environment. With respect to humans, it increases the risk of such diseases as cancer, myocardial infarction and arterial hypertension. With respect to the environment, it leads to changes in animal behaviour and affects bird migration and reproduction.

The main source of noise in solutions for storing data on company servers is a special cooling system which maintains the appropriate temperature in the server room. Using cloud solutions makes it possible to reduce the noise emitted by cooling devices at workplaces, which helps limit environmental noise pollution.

If you want to learn more about the available solutions for reducing industrial noise, check our Intelligent Acoustics product.

Waste level reduction 

Making use of cloud computing in business activities, as opposed to having traditional servers as part of company resources, also helps reduce the amount of generated electronic waste. This stems primarily from the fact that cloud computing does not necessitate the purchase of additional equipment or preparation of infrastructure for a server room at the company, which reduces the amount of equipment that needs to be disposed of in the long term.  

In addition, the employed virtualisation mechanisms, which entail the replacement of a larger number of low-performance servers with a smaller number of high-performance servers which are able to use this performance more effectively, optimise and increase server efficiency, and thus reduce the demand for hardware resources.  

Summary 

Sustainability is currently an important factor in determining the choice of technology. Environmental protection is becoming a priority for companies and for manufacturers of network and telecommunications devices, which means that greener solutions are being sought. Cloud computing definitely fits this trend. It not only limits the consumption of hardware and energy resources, but also reduces the emission of harmful substances into the ecosystem as well as noise emissions into the environment.  

References 

[1] https://www.wit.edu.pl/dokumenty/wydawnictwa_naukowe/zeszyty_naukowe_WITZ_06/0006_Joszczuk-Januszewska.pdf 

[2] https://rocznikikae.sgh.waw.pl/p/roczniki_kae_z36_21.pdf 

[3] http://yadda.icm.edu.pl/yadda/element/bwmeta1.element.ekon-element-000171363539  

[4] Paula Bajdor, Damian Dziembek “Środowiskowe i społeczne efekty zastosowania chmury obliczeniowej w przedsiębiorstwach” [“Environmental and Social Effects of the Use of Cloud Computing in Companies”], 2018 

[5] https://www.accenture.com/_acnmedia/PDF-135/Accenture-Strategy-Green-Behind-Cloud-POV.pdf  

[6] “Reducing carbon by moving to AWS” https://www.aboutamazon.com/news/sustainability/reducing-carbon-by-moving-to-aws

[7] https://www.accenture.com/us-en/insights/strategy/green-behind-cloud

[8] “Operating on 24/7 Carbon-Free Energy by 2030.” https://sustainability.google/progress/energy/

[9] https://www.microsoft.com/en-us/sustainability/emissions-impact-dashboard