When it comes to art, there is no consensus on what qualifies as art. There is a perpetual, passionate debate about "What is Art?" amongst art scholars and critics. So when it comes to calling something "Art", I stick to the basics: if you think it is Art, then it is Art.
There is an apparent ongoing paradigm shift in the art world with all the craze around NFTs, reviving the school of thought that the spirit of art can accommodate a utilitarian as well as an aesthetic outlook. As for "the purpose of art", it is integral to the human experience. But I will not digress and steer this blog into the never-ending discussion on what qualifies as Art. This blog is about Artificial Intelligence.
Creating a neural network that outputs "art" seems like a paradox to me. What is the epitome of creativity here: the art created, or the code that creates a virtual "brain" that creates the artwork? Settling all the questions on that would take a lot of time and many assumptions. So, I believe, let us leave it to the thinkers as a subjective matter.
So, after "releasing" the album in my last blog post, I thought it was time to hand an easel and brush to another AI model and train it to create another art form. When it comes to painting, I wanted my model to paint what I like to paint the most: oil painting portraits. Well, we can't actually do it on canvas with real oil paints, because let's face it, I have too many limitations on having a real-life robot and training it to do so. So for all intents and purposes, I will stick to a virtual replica of an oil painting, aka an image.
Oil On Canvas Portrait
To do so, I built a Generative Adversarial Network (GAN). Generative modelling is an unsupervised learning task in machine learning that involves automatically discovering and learning the regularities or patterns in input data. GANs work by identifying those patterns, so I trained mine on images of portraits. The orientation and poses in the dataset vary vastly, which makes it difficult for the model to recognise the patterns. Despite knowing that, I was still willing to give it a try, as I basically love oil painted portraits and thought it would be awesome to see what a machine would make of them.
How It Works!
Neural Networks are the basis of deep learning, where the algorithms are inspired by the structure of the human brain. A Neural Network takes in data, trains itself to understand the patterns in that data, and then produces output in the form of new, similar data. Let us learn how a neural network processes data. A Neural Network consists of layers of perceptrons, the fundamental component of a Neural Network. The network consists of three kinds of layers: an input layer, an output layer, and, sandwiched between those, one or more hidden layers.
Say we want to build a neural network that can classify images of cats and dogs. The input layer takes in the data in the form of images of cats and dogs, encoded as the numeric values of the pixels in all three colour channels. The neurons in the input layer are connected to the subsequent layer, and so on, via channels. Each of these channels is allotted a weight: a numeric value that is multiplied with the corresponding input, the pixel value in our example. This product is passed on to the corresponding hidden layer. Each neuron in the hidden layer is associated with a value called a bias, which is added to the neuron's input from the previous layer; the sum is then passed to an activation function. The activation function outputs a result that decides if the corresponding neuron will be activated or not. I like to think of it as the synapse in the human brain. If the corresponding neuron is activated, the data is forwarded to the next layer. This unidirectional flow of data is called forward propagation, and it goes on for as many hidden layers as the said neural network has. At the output layer, the neuron with the highest value fires up, and this determines the network's prediction. The prediction comes in the form of probabilities, and the output class with the highest probability is the final classification of the model. For an untrained neural network, this is absolutely arbitrary.
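To make the flow above concrete, here is a minimal sketch of forward propagation in NumPy. The sizes (4 "pixels", 3 hidden neurons, 2 classes) are made up for illustration, and the weights are random, so the output probabilities are arbitrary, exactly as described for an untrained network.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)          # activation: neuron "fires" only for positive input

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()               # turns raw scores into probabilities

# A toy network: 4 input "pixels" -> 3 hidden neurons -> 2 classes (cat, dog).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)   # weights (channels) and biases, hidden layer
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)   # output layer

def forward(pixels):
    hidden = relu(W1 @ pixels + b1)  # multiply by weights, add bias, activate
    scores = W2 @ hidden + b2
    return softmax(scores)           # probability for each of the two classes

probs = forward(np.array([0.2, 0.7, 0.1, 0.9]))
print(probs)  # untrained weights, so these probabilities are arbitrary
```

The class with the higher probability would be the network's prediction; with random weights it is a coin flip.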
When we train a network, it iterates the values of the weights and biases so that the final values are optimized to predict the right output. This is done through a bidirectional flow of information that includes backpropagation. To train a neural network, along with the training data, the network is also fed the actual class of each image. In this way, with each iteration, the network gets to evaluate its errors: the comparison of the calculated values with the true values indicates how the weights and biases need to change. As this information propagates backwards through the network, the weights are adjusted. This backward flow of information is called backpropagation. During training, this forward and backward flow of information iterates over multiple data points a number of times, until the error is significantly low and the predictions are mostly correct.
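The same loop can be sketched in its smallest possible form: a single neuron learning y = 2x by gradient descent. The data and learning rate are made up; a real network repeats this weight-and-bias update layer by layer via backpropagation, but the forward-compare-adjust cycle is the same.

```python
import numpy as np

# One neuron learning y = 2*x: forward pass, compare with the truth,
# then adjust the weight and bias against the error gradient.
rng = np.random.default_rng(1)
w, b = rng.normal(), 0.0
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = 2.0 * xs                          # the "actual classes" fed alongside the data

lr = 0.05
for epoch in range(500):
    pred = w * xs + b                  # forward propagation
    err = pred - ys                    # evaluate the error
    w -= lr * 2 * (err * xs).mean()    # gradient of the mean squared error ...
    b -= lr * 2 * err.mean()           # ... adjusts weights and biases backwards

print(round(w, 3), round(b, 3))  # w ends up close to 2, b close to 0
```

After a few hundred iterations the error is "significantly low" and the neuron's predictions are essentially correct.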
This approach is good for classifying data. To generate images, I built a Generative Adversarial Network (GAN). It is a clever way of posing the generative problem as a supervised learning problem, and it comprises two models: a Generator and a Discriminator.
The two models are trained simultaneously by an adversarial process. The generator ("the artist") learns to create images that look like the dataset, while the discriminator ("the art critic") learns to tell real images apart from fakes. During training, the generator progressively becomes better at creating images that look real; likewise, the discriminator becomes better at telling them apart. As the process reaches equilibrium, the discriminator can no longer distinguish between real and fake.
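One adversarial training step can be sketched like this. The generator and discriminator here are deliberately trivial stand-ins (a random linear map and a mean-then-sigmoid score), not my actual model; the point is the structure of the step: the critic is trained on real images labelled 1 and fakes labelled 0, and the artist "wins" when the critic scores its fakes as real.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(noise):
    # stand-in "artist": maps 16-dim noise vectors to fake 64-pixel "images"
    return np.tanh(noise @ rng.normal(size=(16, 64)))

def discriminator(images):
    # stand-in "art critic": maps each image to a realness score in (0, 1)
    return 1 / (1 + np.exp(-images.mean(axis=1)))

batch = 8
real_images = rng.uniform(-1, 1, size=(batch, 64))
fake_images = generator(rng.normal(size=(batch, 16)))

# Discriminator step: real -> label 1, fake -> label 0 (binary cross-entropy).
d_inputs = np.concatenate([real_images, fake_images])
d_labels = np.concatenate([np.ones(batch), np.zeros(batch)])
d_scores = discriminator(d_inputs)
d_loss = -np.mean(d_labels * np.log(d_scores) + (1 - d_labels) * np.log(1 - d_scores))

# Generator step: it is rewarded when the critic calls its fakes real (label 1).
g_scores = discriminator(generator(rng.normal(size=(batch, 16))))
g_loss = -np.mean(np.log(g_scores))

print(round(d_loss, 3), round(g_loss, 3))
```

In a real GAN both losses drive gradient updates to deep networks, alternating until the critic's scores hover around chance.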
Glancing over the dataset gave me an idea that it was going to be a long shot: the orientation and poses vary vastly. Keeping that in mind, I was still willing to give it a try, only because portraits are my jam. I basically love oil painted portraits.
The output of the Neural Network picked up some patterns in the portraits, as we can see in the slide below. My GAN worked quite well.
GAN Generated Portraits
Check out my full Kaggle notebook for this project (here).
GANs are notorious for being data-hungry, so I would consider increasing the dataset. There were many inconsistencies in the data, which are rather complicated for the GAN to learn: inconsistencies in the colour palette and in pose orientations. Cleaning the data for the portrait styles would certainly help. Training it longer, i.e. for more epochs, would also help. Lastly, one can always strive for a more robust architecture for the Neural Networks.
Nonetheless, I am pretty pleased with the way it turned out. It has been a few weeks since I published this notebook, so I am delighted to tell you that it has a Kaggle Gold medal now. This being my 15th Gold on Notebooks (and my 15th Notebook), I am now a Kaggle Notebooks Grandmaster!
Thanks for visiting my Blog; it’s always good to see you here!
The uncanny valley is the abrupt dip in human affinity for a non-human creature when we see it approaching human-like characteristics. For instance, the spooky feeling when one looks at Sophia the robot or Lil Miquela the Instagram influencer. Really though, Lil Miquela gives me the creeps when I go through her timeline. It is the eeriness of a realistic face, personalized captions, and her sense of awareness that she is not a real person that is quite unsettling.
Album Cover
Artificial Intelligence amazes me all the time. There is something surreal about it that makes working with it exciting. Of course, we can peel off the layers and see the maths behind it, get to the matrices and tensors to understand how these neurons work, even read off the values of weights and biases and assure ourselves that this is no sorcery. Still, when I see the result play out, it is astonishing. These models are mysterious and quite understandable at the same time. Nonetheless, it is the emotional uneasiness associated with them that is difficult to process.
When I first decided to build an AI to write song lyrics for me, to my surprise, I was able to get to a working model pretty fast. I will spare you a summary of my personal learning curve and my initial skill set, as that is too resume-ish for the blog's content, some may say. I geek out in my blogs; I mean, that's basically what I live for.
Long story short, the initial "working" Recurrent Neural Network did generate some output, but only gibberish. After a few improvements, I reached a model that generated real words. Although, to my dismay, the network appeared to be obsessed with being "born". That was a sneak peek into the uncanny valley of AI. Fascinating!
AI's Obsession On Being Born
This probably was my Victor Frankenstein moment. God complex, or God's conundrum?
After that, refining the model was pretty peachy. The AI wrote an extension of my poems, and I think it was quite thoughtful. Okay, at least it was a bunch of fully-formed, meaningful sentences... stuck together mostly out of any context. And that is actually what's expected if we look into the way Recurrent Neural Networks work.
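Why the loss of context? A vanilla recurrent cell carries everything it has generated so far in a single fixed-size hidden state, so long-range context fades. Here is one step of such a cell in NumPy; the vocabulary and hidden sizes are illustrative, and the weights are random, so the next-token distribution is meaningless, but the mechanics are the real ones.

```python
import numpy as np

# One step of a vanilla RNN cell. The hidden state h is the network's only
# "memory" of everything before the current word, which is why sentences
# can be well-formed locally yet drift out of context globally.
rng = np.random.default_rng(0)
vocab, hidden = 50, 32
Wxh = rng.normal(scale=0.1, size=(hidden, vocab))   # input -> hidden
Whh = rng.normal(scale=0.1, size=(hidden, hidden))  # hidden -> hidden (the memory)
Why = rng.normal(scale=0.1, size=(vocab, hidden))   # hidden -> next-token scores

def step(x_onehot, h):
    h = np.tanh(Wxh @ x_onehot + Whh @ h)           # new state mixes input with old state
    scores = Why @ h
    probs = np.exp(scores) / np.exp(scores).sum()   # distribution over the next token
    return probs, h

h = np.zeros(hidden)
x = np.zeros(vocab); x[3] = 1.0                     # a one-hot "word"
probs, h = step(x, h)
print(probs.argmax())
```

Generation just repeats this step, feeding each sampled token back in as the next input; all context must squeeze through `h` every time.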
My Poem Extended By AI
As long as there were words arranged in a manner that would pass as a sentence at first glance, I was proud of my little AI monster. After all, it is just a child in front of the giants from OpenAI & Hugging Face. Those transformer-based models are trained on a ginormous amount of information; the adjective State-of-the-Art is often associated with such models. They have these delightful website interfaces where you can just go and type one line of text, and in return, they will generate a book for you! I so wish I had that during my academic years. It would have saved me so many of the midnight candles that I burnt on completing assignments. The assignments that, I still believe, no one ever read.
Getting unstructured sentences from the model got me thinking: if only language had a set of rules that were slightly less intricate. Maybe a certain strict pattern... a sort of key, like musical compositions have. Like Mozart's Symphony No. 40, one of his most popular, in the key of G minor; the symphony is literally known as the Great G minor.
Sound is created by vibrations in the air. There can be infinite kinds of soundwaves, as there are infinite combinations of frequencies and amplitudes. In music, we consider pitch and wavelength. As there are infinite soundwaves, there could be infinite musical notes. However, contemporary music uses 12 unique notes, and the 12 notes have a constant frequency ratio with respect to each other.
Twelve Notes on Piano
Though there are far more keys on a piano, all of them are different versions of those 12 notes. Say, in the above image, the key of A has a frequency of 220 Hz. The next assortment of keys is set in a different octave, and that A has a frequency of 440 Hz. Each shift in the octave doubles the frequency.
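The two facts above fit together neatly: in equal temperament, each of the 12 notes is a constant ratio of 2^(1/12) above the previous one, so twelve steps double the frequency. A quick check, starting from the A at 220 Hz (note names here use the common sharps-only spelling):

```python
# Equal temperament: adjacent notes differ by a constant frequency ratio,
# so 12 semitone steps make exactly one octave (a doubling).
ratio = 2 ** (1 / 12)

a3 = 220.0  # the A at 220 Hz from the example
notes = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]
for i, name in enumerate(notes):
    print(f"{name:2s} {a3 * ratio ** i:7.2f} Hz")

a4 = a3 * ratio ** 12
print(round(a4, 2))  # 440.0 -- one octave up, the frequency has doubled
```

This is why the pattern is so regular: unlike language, the "grammar" of pitch is one multiplication, applied over and over.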
Twelve Tone Musical Scale
Not all of the notes sound good together; there is a selected set of notes used in a song. This selected set of notes is indicative of the key. When a song is in the key of C Major or D Minor, this is simply telling you which of the 12 notes are used in the song.
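A key can even be computed: a major key picks 7 of the 12 notes by walking a fixed pattern of semitone steps (whole, whole, half, whole, whole, whole, half) from the root. A small sketch, again using sharps-only note names:

```python
# A major scale is the 12-note chromatic row filtered by the step
# pattern 2-2-1-2-2-2-1 (in semitones), starting from the root.
chromatic = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
major_steps = [2, 2, 1, 2, 2, 2, 1]

def major_scale(root):
    i = chromatic.index(root)
    scale = [root]
    for step in major_steps[:-1]:   # the final step just returns to the root
        i = (i + step) % 12         # wrap around the 12-note cycle
        scale.append(chromatic[i])
    return scale

print(major_scale("C"))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
```

So "this song is in C Major" really is just a compact way of naming those seven notes.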
Back to the initial topic: this sets up an easier pattern for a Recurrent Neural Network to learn than a language with strict syntax and oh so many exceptions. Besides, even if it breaks those rules, I would never know.
I went further down the valley and made my first AI model that generates music. I trained it with Frédéric Chopin's compositions. On second thought, I think Beethoven would have been an ideal candidate. Nonetheless, I skipped over my initial plan to work with Mozart's works... Ugh! What the hell, I didn't want to work on data scraping, there I said it! So I found this big dataset on Kaggle, and Chopin's was the one with the most files.
The first output was basically "something" pounding on the same key with one finger at a constant interval of time. It was still an art form, in my opinion. It made me think deep and hard about existence, and how we should all be annihilated right at the instant the next note is played. The longer I played it, the more I was convinced of it.
After making a few tweaks in the architecture of the network, it worked alright: not as good as Frédéric Chopin, but still nice. See for yourself.
Scrutinising the generated melody, I am quite satisfied, as it has a variety of notes. On the enigma of whether it is a good musical composition, whether it is artsy, whether the AI created a masterpiece: I don't know! I am not a connoisseur of music. I used a Recurrent Neural Network and it worked alright. I decided to let the AI have the fame it deserved.
So, I am releasing the album here on this blog! Yay! Don't forget to get your copy!
Get your copy:
Afternoon On Pluto
Almost A Lovesong (for Zombies)
Children's Rhyme By The Clown
Homealone In Lockdown
Midnight Intruder
Netflix and Videocall
Secret Stash
Up The Infinite Castel
Thanks for visiting my Blog; it’s always good to see you here!