Which Beatle Had The Most Musical Influence Within The Beatles?

Using Convolutional Neural Networks And Scikit-Optimize To Predict Who Had The Largest Influence Within The Beatles

The Beatles’ London rooftop concert, 1969

Personal Background

Data Collection

Mel spectrogram of the song “A Day in the Life” from Sgt. Pepper’s Lonely Hearts Club Band

Data Processing

Short-Time Fourier Transform of “Working Class Hero” by John Lennon
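
For readers unfamiliar with the transform behind this image, here is a minimal NumPy sketch of a magnitude short-time Fourier transform. It is not the processing code used for this project: the function name `stft_magnitude`, the frame length, the hop size, and the synthetic 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np


def stft_magnitude(signal, frame_length=1024, hop_length=256):
    """Return the magnitude STFT, shape (n_frames, frame_length // 2 + 1).

    Assumes len(signal) >= frame_length; trailing samples that do not
    fill a whole frame are dropped.
    """
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Slice overlapping frames and taper each one with the Hann window.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per frame; keep only the magnitudes.
    return np.abs(np.fft.rfft(frames, axis=1))


# Hypothetical test signal: one second of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spectrum = stft_magnitude(tone)
freqs = np.fft.rfftfreq(1024, d=1 / sample_rate)
# Frequency bin with the most energy, averaged over all frames.
peak_hz = freqs[spectrum.mean(axis=0).argmax()]
```

Each row of `spectrum` is the frequency content of one short slice of audio. Plotting the rows side by side over time gives an image like the one above, and for the test tone `peak_hz` lands on the bin closest to 440 Hz.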

Model Training

Interpreting the Results

import keras
import numpy as np
import pandas as pd

from model_staging import fetch_images_dataframe

train_df = pd.read_csv("train_df.csv")
test_df = pd.read_csv("test_df.csv")
train_path = "C://Users//Alec//MyPython//Beatles/train_melspec"
test_path = "C://Users//Alec//MyPython//Beatles/test_melspec"

# load the Beatles data to predict on
test_gen = fetch_images_dataframe(test_df, x_col="song",
                                  y_col="artist",
                                  directory=test_path,
                                  batch_size=16,
                                  target_size=(256, 256),
                                  class_mode="categorical",
                                  shuffle=False, seed=1,
                                  save_format="png")

# load the model
best_model_adam = keras.models.load_model("models/melspec/skopt_best_adamV3.h5")

# predict the probabilities for each song
# (Keras versions that no longer provide predict_generator accept the
# generator through predict instead)
probabilities = best_model_adam.predict_generator(test_gen)

# get the prediction based on largest probability
preds = np.argmax(probabilities, axis=1)

# load in the training data to get the artist indices
train_gen, valid_gen = fetch_images_dataframe(train_df,
                                              x_col="song",
                                              y_col="artist",
                                              directory=train_path,
                                              batch_size=16,
                                              target_size=(256, 256),
                                              class_mode="categorical",
                                              shuffle=True, seed=1,
                                              validation_split=0.2,
                                              save_format="png")
class_map = train_gen.class_indices

# create a dataframe of songs and predictions
pred_df = pd.DataFrame(data={"songs": test_gen.filenames,
                             "predictions": preds})

# now convert the prediction column to the artist name
mapping = {v: k for k, v in class_map.items()}
pred_df["predictions"] = pred_df["predictions"].map(mapping)

# merge the pred_df with the test_df in order to bring the album
# name in for each song
pred_df = pred_df.merge(test_df[["album", "song"]], left_on="songs", right_on="song")
pred_df.drop("song", axis=1, inplace=True)

# join the prediction probabilities with the prediction dataframe
# (rows line up by position only if each song matches exactly one row in test_df)
pred_df = pred_df.join(pd.DataFrame(probabilities))
pred_df.rename(mapping, axis=1, inplace=True)

# round the per-artist probability columns; the names are taken from the
# mapping so they always match the renamed columns
artist_cols = list(mapping.values())
pred_df[artist_cols] = pred_df[artist_cols].round(4)
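
One way to read a prediction table like `pred_df` is to tally, album by album, how many songs were assigned to each artist. The sketch below is an assumption about how that summary could be built, not necessarily the analysis behind the charts in this post. The rows are invented placeholders, and `album_counts` and `album_shares` are hypothetical names.

```python
import pandas as pd

# Placeholder rows standing in for pred_df. The song, album, and
# prediction values are invented and are not real model output.
pred_df = pd.DataFrame({
    "songs": ["song_a.png", "song_b.png", "song_c.png", "song_d.png"],
    "album": ["album_x", "album_x", "album_x", "album_y"],
    "predictions": ["lennon", "lennon", "mccartney", "harrison"],
})

# Count how many songs on each album were assigned to each artist.
album_counts = pd.crosstab(pred_df["album"], pred_df["predictions"])

# Turn counts into per-album shares so albums of different lengths
# can be compared directly.
album_shares = album_counts.div(album_counts.sum(axis=1), axis=0).round(2)

print(album_shares)
```

Averaging the per-artist probability columns within each album would be a softer alternative to counting hard predictions, since it keeps the model's uncertainty about each song.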

Conclusion and Lessons Learned

Applied Data Science master's student. Data science and finance enthusiast.
