Can AI Truly Capture the Soul of Music?

With artificial intelligence tools becoming more popular, the question arises of AI’s role within the arts and whether it can truly replace musicians, writers, and illustrators. If the next radio hit, catchy jingle, or best-selling album is just a few prompts away, are we wasting our time still making music the old-fashioned way?


To excel as a musician, one must become an exceptional listener. Many musicians practice by learning songs by ear, which demands a high level of attention: hearing how all the instruments harmonize while each maintains its own distinct space. Listening elevates one's musical journey, especially when performing with others, where being present, attentive, and synchronized is crucial.


In my career, I've worked in various genres, collaborated with fellow artists, nurtured emerging talent, created music for TV and film, and produced in studios. I've even adjudicated for music grants and the JUNO Awards, Canada's version of the Grammys. This extensive exposure has deepened my analytical listening, and my conclusion is this: technically perfect performances lack the imperfections that give great music its soul and emotion. It has me pondering music's ability to raise goosebumps, send shivers down the spine, and take our breath away.


I have read that musicians are 85% more prone to getting chills from music. This could be due to the dopamine released when our anticipation matches the musical outcome. When we hear an unfamiliar song, there's often a note that stands out, resonating deeply.


When AI analyzes music, it examines chords, melodies, rhythms, dynamics, and other musical elements to identify patterns. These patterns then guide the AI in producing music that resembles what it has analyzed. While the potential for technical excellence is there, something might be lacking.


Artists like Adele have raw emotional power. It's not about technical perfection; it's the relatability in the imperfections. Some of the world's iconic recording studios don't have perfect sound environments, and some artists choose to use vintage microphones instead of the latest technology. Authenticity connects with us because we are inherently imperfect.


Expressing genuine emotions can often be messy. Similarly, genuine music has elements of unpredictability and imperfection. Take the blues, for instance. B.B. King's carefully chosen notes conveyed profound emotion. In contrast, some highly technical performances can feel soulless. Modern studios employ tools like Auto-Tune. While initially used for minor corrections, it has become an integral sound. I took my son to a concert the other day, and the opening act had Auto-Tune on his voice throughout the entire show, including when he talked to the audience. I could hardly make out what he was saying, and rather than enhancing the sound, it just made it awkward. In my opinion, extreme reliance on Auto-Tune can strip music of its humanity. A recording needs a human element to capture the spontaneous, perfect imperfections of the moment: the dynamics, the timing, and so on.


In a way, it can be compared to using ChatGPT and other AI tools to edit writing. Using them as an aid can be beneficial, but an entirely AI-generated piece of writing tends to feel impersonal. It's all about balance. Introducing organic elements can bring warmth to synthetic sounds. Authenticity is magnetic, whether in branding, communication, or music. When something feels off, it can be jarring. AI is impressive, but there are nuances it might miss, much like Grammarly might suggest changes that don't align with one's intent or a writer's unique voice. As an example, imagine if Grammarly had a chance to edit some of our most loved pop and rock lyrics. We'd be stuck with Mick Jagger singing "I cannot get any satisfaction" or Roger Waters using perfect grammar to sing "We do not need an education." I doubt those would have been hits.


It’s worth repeating: The essence lies in those subtle imperfections.


You’re allowed to do that?


AI, no matter how advanced, is still working off patterns and information from the past. While it can mimic creativity, there is a uniqueness to human spontaneity and unpredictability that's hard to capture. Every mistake, every stray thought, every unintentional note or word adds layers of depth and authenticity. Can AI really reproduce that or even appreciate it?


And it's the same with art or films or any piece of creative work. We cherish the things that resonate with us on a deep, personal level, whether it's because they're perfect or beautifully imperfect. It's that ability to connect, to make us feel seen or understood, that's hard to find. And if we rely too much on data or AI to dictate what will resonate, we may miss out on the raw, unpolished gems that have the most profound impact.


Sometimes a song, a book, or a movie doesn't adhere to the standard formula yet touches us deeply. The ones that make us go, "You're allowed to do that?" Those are the moments we remember. Those are the stories we share. There's an intangible quality that can't be quantified or replicated. Perhaps the unpredictability and the surprises are what make art truly magical. The unexpected in music or any art form grabs our attention, similar to how unforeseen elements while driving or walking might jolt us back to reality. Anticipation and surprise are psychological tools artists can use to engage audiences. Even when we predict an outcome in music or a story, its realization brings satisfaction. On the other hand, when something unexpected happens, the surprise element can be equally rewarding.


In music, even professionals occasionally hit a wrong note on stage. The mark of a seasoned artist, though, is seamlessly blending the mistake in, even repeating it, to create the illusion of intention.


When musicians step off the stage after a performance, they often hear praise for "that amazing thing you did," with the listener pointing to what was actually a mistake. It's illuminating. In your mind, you had the perfect note to hit, and you ended up playing something entirely different. To you, it was an error because it wasn't the planned note. But the audience, oblivious to your intention, heard it in context and believed it was purposeful. So a moment you consider a slip-up becomes a highlight for them. Their lack of context changes their perception. While we can program technology to mimic these "human moments," it wouldn't be the same. Pre-programming a musical hiccup doesn't capture the spontaneity of a live performance. It's often the unexpected, the unpredictable, that leaves the most lasting impression. Whether it's in art, music, or writing, it's those moments that stay with us and shape our experiences. And that, I believe, is what makes human creativity so invaluable.


Imitation vs. Originality


Listening to AI-generated music or reading AI-generated text doesn't just feel different; we also stand to lose a lot by trusting technology unquestioningly. For a long time, I've been deeply interested in data. When big data became mainstream, I was right in the thick of it, exploring tools and discussing insights with people. One website, Next Big Sound, stood out. It aggregated data from various online sources about artists, predicting the next breakout stars, often contrary to popular perception. This piqued my interest, especially when the data sometimes didn't support the hype around certain big artists.


What fascinated me further was the intertwining of data with music. I understand the value of historical data. For instance, if a musical hook arrives at three seconds rather than seven, and the data shows it's more effective, that's worth noting. However, truly memorable art often stems from novelty and differentiation. In my book about the music business, now a textbook in universities, I emphasize that difference is crucial. Simply imitating someone else's success never leads to surpassing them. Consider iconic singers like Axl Rose of Guns N' Roses or Brian Johnson of AC/DC. Their uniqueness made them stand out. If every artist tried to replicate Lady Gaga's meat dress, its originality would be lost. We often anchor our future predictions on past data, but this approach may not always yield genuine innovation.


A well-known Canadian producer once told me he aims to create music that makes people feel something new. That essence may be difficult to capture through AI alone, devoid of human emotion.


Regarding AI and music, I don't think the technology can replicate the human touch. For instance, a song on TikTok claims it's "scientifically programmed" to make listeners cry. While I can only speak for myself, it certainly didn't make me cry; it sounded more like my phone was having a sob. It does make me wonder: can AI really capture the nuances of human emotion? Music, art, and writing are reflections of the soul, our emotions, and our connections. A soulless device, no matter how closely it studies humans, is unlikely to accurately depict what lies beneath the surface.


Making a song too specific, pinpointing a particular time and place, can alienate listeners. A universal theme, on the other hand, is relatable. And isn't that what we're all seeking in music? Connection? Songs that are too melodically intricate to replicate vocally rarely become hits. It's the simplicity, the shared experience, that makes a tune memorable.


If we look into the future, is it likely that we will be telling our friends about the great series of AI books we have just finished reading or sharing our favorite AI songs that make us emotional? I certainly hope not, but only time will tell.