Google Developed an AI That Creates Music from Text

Google has created an AI model capable of generating music from basic prompts and text descriptions but refuses to release it due to the tool’s risks and limitations.

The generative AI tool, dubbed MusicLM, was revealed in a research paper published on Friday, January 26, and uses data from a vast training database of over 280,000 hours of music to produce music with “high fidelity” and “significant complexity.”

Not only can the AI combine genres and instruments, but it can also write tracks using abstract concepts that are normally difficult for computers to comprehend and replicate.

“MusicLM can be conditioned on both text and a melody in that it can transform whistled and hummed melodies according to the style described in a text caption,” the co-authors wrote in the paper.

Google is not the first to create an AI tool for song generation. Other attempts include Riffusion and OpenAI’s Jukebox. But according to the tool’s developers, MusicLM “outperforms previous systems both in audio quality and adherence to the text description.”

To demonstrate the tool’s potential, Google shared dozens of audio samples on its GitHub page alongside the text prompts used to create them.

It also published long-form music generated by the tool, as well as music created using a series of text prompts, which it calls “story mode.”

Users can enter descriptions like “enchanting jazz song with a memorable saxophone solo and solo singer” or “the main soundtrack of an arcade game,” and the AI engine will generate a song that matches the prompt.

Impressive but terrifying

Although MusicLM’s impressive capabilities demonstrate the potential for AI-powered music generation, the tool is far from flawless.

Some of the samples shared by Google are noticeably distorted, which is one of the major unavoidable side effects of the AI training process. The tool also fails at creating intelligible vocals, instead producing muddled lyrics that make little sense or are outright gibberish.

When it does produce coherent vocals, the tool does so by compiling voices from real-life artists, sparking concern that it could incorporate copyrighted material into its songs.

According to the authors of the study, about 1 per cent of the music generated by the tool was directly sampled from the music it was trained on, an amount high enough to discourage them from releasing the tool at this stage.

“We acknowledge the risk of potential misappropriation of creative content associated with the use case,” the co-authors of the paper explained.

“We strongly emphasise the need for more future work in tackling these risks associated with music generation,” they added.

Generative AI models generate content by scraping thousands of often copyrighted files from the internet and collecting data from each file to create code.

This code can sometimes lead to generated content having attributes that can be traced back to its original creators, sparking major ethical and legal concerns.

Just last week, stock image site Getty Images filed a lawsuit against Stability AI, the company behind AI art generator Stable Diffusion, after it determined that the AI had “unlawfully copied and processed” millions of images from the platform “to the detriment of content creators posting on the platform.”

Getty Sues AI — ***An image was created using Stable Diffusion. It shows a distorted Getty watermark.***

Music generators like Google’s MusicLM have also found themselves in legal hot water. In 2020, Jay-Z’s record label filed copyright lawsuits against YouTube channel Vocal Synthesis for creating covers of songs using an imitation of the rapper’s voice.

‘Code red’

Despite MusicLM’s limitations, the tool’s development is significant in the context of the current excitement and speculation surrounding AI technologies.

In what some are calling the “AI revolution,” experts believe that 2023 will mark the beginning of major advancements in AI, sparked by the launch of OpenAI’s generative AI chatbot ChatGPT.

ChatGPT exploded in popularity when it was launched in November last year for its ability to produce accurate, human-like responses to almost any prompt or query, leading some to suggest that it could one day replace Google’s search engine.

The technology even garnered the interest of Google’s big competitor, Microsoft, which has invested over $10 billion into the chatbot in recent months. It plans to integrate the tool into its search engine, Bing.

This caused Google management to issue a “code red” in December, which directed Google teams to aid in developing and launching AI prototypes and products, the Times reported.

It is thought that MusicLM may be the first project to emerge from this initiative, but 20 other technologies, such as an AI chatbot-enabled version of Google, are also in development and set to be showcased in 2023.