A deep-learning algorithm developed by EPFL scientists can generate melodies that imitate a given style of music. The ‘deep artificial composer’ could one day generate convincing music for multiple instruments in real time, with applications ranging from video games to helping composers in the creative process.
The ‘deep artificial composer’, or ‘DAC’ for short, generates brand-new melodies that imitate traditional folk music of Irish or Klezmer origin. It does so without plagiarising existing ones, since the melodies it writes are as original as those produced by a human composer. The results were presented in April at this year’s edition of the Evostar conference.
The DAC produces musical scores of melodies, that is, symbolic music written using notation, and does not generate audio files. ‘The deep artificial composer can produce complete melodies, with a beginning and an end, that are completely novel and that share features that we relate to style,’ says EPFL scientist Florian Colombo, who developed the artificial intelligence under the guidance of Wulfram Gerstner, director of the Computational Neuroscience Laboratory.
Colombo continues, ‘To my knowledge, this is the first time that an artificial neural network model has produced entire and convincing melodies. We also provide a new tool to evaluate the originality of a piece.’
Algorithmic music composition was first suggested in the literature in the 19th century by English mathematician Ada Lovelace. It required an ‘Analytical Engine’, a machine that could be programmed to solve even the most complex problems, like writing music. The computational power of modern computers and the sheer amount of digitized musical scores are now making automatic music composition a reality.
Artificial Intelligence (AI) is already capable of composing symbolic music and is often based on implementing music theory. What’s new with the DAC is that the AI learns to compose complete melodies without any music theory from start to finish, solely based on a large database of existing music. No human postproduction is necessary.
Extracting music style with probabilities
EPFL’s deep artificial composer avoids traditional music theory altogether. Each style of music has its own set of rules, and existing AI-generated music often relies on the Western musical language of harmony and counterpoint. The EPFL algorithm instead determines its own composition rules by extracting probability distributions from existing melodies using neural networks. This requires only the computing power of graphics cards, which can speed up calculations by a factor of ten compared to standard computers.
The DAC extracts the style of the music by learning how a given piece of music transitions from one note to the next, and calculates the probability of the next note’s pitch and duration. The algorithm then trains on multiple scores of music, of any given style, in order to improve its ability to correctly predict the pitch and duration of the upcoming note.
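To make the idea of note-to-note probabilities concrete, here is a minimal Python sketch. It is not the DAC itself: it swaps the neural network for simple transition counts over a tiny, made-up corpus, and it looks only at the current note rather than the whole melody so far. The `melodies` data and the `transition_probabilities` function are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each melody is a list of (MIDI pitch, duration in beats).
melodies = [
    [(62, 1.0), (64, 0.5), (66, 0.5), (62, 1.0)],
    [(62, 1.0), (64, 0.5), (62, 1.0)],
]


def transition_probabilities(melodies):
    """Estimate P(next pitch | current note) and P(next duration | current note).

    A counting stand-in for the learned model: it only tallies which notes
    follow which in the corpus and normalises the tallies to probabilities.
    """
    pitch_counts = defaultdict(Counter)
    duration_counts = defaultdict(Counter)
    for melody in melodies:
        # Pair every note with the note that follows it.
        for current, (next_pitch, next_duration) in zip(melody, melody[1:]):
            pitch_counts[current][next_pitch] += 1
            duration_counts[current][next_duration] += 1

    def normalise(counts):
        return {
            context: {value: n / sum(c.values()) for value, n in c.items()}
            for context, c in counts.items()
        }

    return normalise(pitch_counts), normalise(duration_counts)


pitch_probs, duration_probs = transition_probabilities(melodies)
print(pitch_probs[(64, 0.5)])     # distribution over the pitch that follows (64, 0.5)
print(duration_probs[(64, 0.5)])  # distribution over the duration that follows it
```

In this toy corpus the note (64, 0.5) is followed once by pitch 66 and once by pitch 62, so each gets probability 0.5. A network with memory, as described in the article, would condition on far more context than a single note.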
Once the training is complete, which means that the predictive performance of the deep artificial composer has reached its target value, set at 50% successful pitches and 80% successful durations, it no longer needs to be trained and can be used to generate new melodies, one note at a time. The deep artificial composer builds a string of notes from beginning to end, including the very first note and the length of the composition, that resembles melodies of the dataset that was used for the training. Listen to a melody composed by the DAC based on Irish and Klezmer melodies and interpreted by Colombo on the cello.
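Generating a melody one note at a time, including its first note and its length, can be sketched as repeated sampling from a next-note distribution until an end marker is drawn. The Python sketch below assumes a hand-written probability table in place of a trained model; the `START` and `END` markers, the `next_note_table` values and the `generate_melody` function are all hypothetical.

```python
import random

START, END = "START", "END"

# Hypothetical table standing in for a trained model's predictions:
# context -> list of (next note, probability). Notes are (MIDI pitch, beats).
next_note_table = {
    START: [((62, 1.0), 0.7), ((69, 0.5), 0.3)],
    (62, 1.0): [((64, 0.5), 0.6), ((69, 0.5), 0.2), (END, 0.2)],
    (64, 0.5): [((62, 1.0), 0.5), ((69, 0.5), 0.3), (END, 0.2)],
    (69, 0.5): [((64, 0.5), 0.5), ((62, 1.0), 0.3), (END, 0.2)],
}


def generate_melody(table, rng, max_notes=64):
    """Sample notes one at a time until END is drawn or max_notes is reached."""
    melody = []
    context = START  # the first note is sampled too, not given by hand
    while len(melody) < max_notes:
        candidates, weights = zip(*table[context])
        note = rng.choices(candidates, weights=weights, k=1)[0]
        if note == END:
            break  # the sampler decides where the melody stops
        melody.append(note)
        context = note
    return melody


melody = generate_melody(next_note_table, random.Random(0))
print(melody)
```

Because the end marker is just another outcome of the sampling step, the length of the piece is not fixed in advance. The `max_notes` cap is only a safeguard added for this sketch.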
Of course, the DAC can compose melodies before the training process is complete, but this leads to melodies that are unconvincing, even to the untrained ear. It can also be trained beyond the target value, but the generated pieces then tend to resemble existing compositions. The DAC can also determine by itself whether a composition is original enough by comparing phrases of notes with existing patterns in its database of melodies. Similarly, the algorithm can determine the musical genre, Irish or Klezmer folk in this case, of the generated scores.
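One simple way to compare phrases of notes with existing patterns is to measure how many short note sequences in a generated melody also occur in the training melodies. The Python sketch below illustrates that idea under stated assumptions; it is not the originality measure presented by the researchers. The function names, the phrase length of three notes and the pitch-only toy data are all hypothetical.

```python
def ngrams(melody, n):
    """Return the set of all phrases of n consecutive notes in a melody."""
    return {tuple(melody[i:i + n]) for i in range(len(melody) - n + 1)}


def copied_fraction(generated, corpus, n=3):
    """Fraction of distinct n-note phrases in `generated` that also occur in `corpus`.

    0.0 means no phrase of length n was seen in the corpus,
    1.0 means every phrase was.
    """
    generated_phrases = ngrams(generated, n)
    if not generated_phrases:
        return 0.0  # melody shorter than n notes: nothing to compare
    known = set()
    for melody in corpus:
        known |= ngrams(melody, n)
    return len(generated_phrases & known) / len(generated_phrases)


# Hypothetical toy data: melodies reduced to MIDI pitches only.
corpus = [[62, 64, 66, 67, 69], [69, 67, 66, 64, 62]]
generated = [62, 64, 66, 69, 67, 66]
print(copied_fraction(generated, corpus))  # 0.5: two of four phrases appear in the corpus
```

A high fraction would suggest the melody borrows heavily from the database, which matches the behaviour the article describes for a model trained beyond its target value.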
The generated music is not limited to Irish or Klezmer traditional folk music: any style of music could be used. It just so happens that many Irish and Klezmer melodies are already digitized and easily accessible.
Insight into the human brain
The artificial intelligence is built on an artificial neural network known as ‘long short-term memory’, invented twenty years ago at the IDSIA in Lugano. It has already proven useful for speech recognition and is widely used by the largest software companies like Google, Apple, and Microsoft.
‘The success of the deep artificial composer provides insight into how the human brain works,’ says Gerstner. ‘Neural networks with memory spanning different time scales are needed to successfully create music, implying that the ability of the human brain to retain information, even after a long period of time, is key to composing music.’
For Colombo and Gerstner, the work is preliminary since the DAC is limited to single-voice compositions. In the long run, Colombo hopes to generate a score for an entire orchestra.