layout: true

<div class="my-footer"><span>https://github.com/brunaw/genre_characterization</span></div>

<!-- this adds the link footer to all slides, depends on my-footer class in css-->

---
name: bookdown-title

<br>
<br>

.pull-left[
<div class="column">
<img src="img/MU_logo.png" width="350">
</div>
]

<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>

### .fancy[Feature Engineering for Genre Characterization in Brazilian Music]

.large[Bruna Wundervald | Maynooth University | Sep 18, 2020]

<!-- this ends up being the title slide since seal = FALSE-->

---
class: inverse, middle

### .fancy[Summary]

- Introduction
- Research questions
- Definitions
- Data
- Manually Extracted Features
- Machine Learning Algorithm
- Results
- Conclusions

---
class: inverse, middle, center

### .fancy[Introduction]

---

# Introduction

- Many factors are involved in the configuration of a music genre, such as style, historical context, and harmonic structures (Caldas (2010))
- Defining music genres is not a trivial task, and it is an important problem in many areas of music studies

> This work focuses on **verifying the connection between harmonic information and genre specification in Brazilian music**, by evaluating feature importance in machine learning models

- Mid-level music features such as chords are a rich source of information about genres (Cheng, Yang, Lin, Liao, and Chen (2008), Corrêa and Rodrigues (2016))
- We use symbolic chord data and manually extracted harmonic features for genre classification
- **The features represent the chord structures in different and meaningful ways**

---
class: inverse, middle, center

### .fancy[Definitions]

---

## Data

> Type: **Symbolic chord sequences for each song**

<br>

- The chords are extracted from Cifraclub, an online collaborative music-sharing page,
through the `chorrrds` (Wundervald (2018)) package for `R` (R Core Team (2018))
- Crowd-sourced data is becoming more common in the literature (e.g. Odekerken, Koops, and Volk (2020))
- In total, **8 music genres** were used: Reggae, Pop, Forró, Bossa Nova, Sertanejo, MPB, Rock, and Samba
- **106 different artists** were available on the online platform, for which the chords and keys for **8339 songs** were collected

---

## Manually Extracted Features

- Interpretable summary features from the chords, to make them more informative

.pull-left[

**First set, triads and simple tetrads:**

- percentage of suspended chords (e.g. Gsus),
- of chords with the seventh (e.g. C7),
- of minor chords with the seventh (e.g. Em7, C#m7),
- of minor (e.g. Em, C#m),
- of diminished (e.g. Bdim)
- and of augmented (e.g. Baug) chords

**Second set, tetrads:**

- percentage of chords with the fourth (e.g. D4),
- the sixth (e.g. E6),
- the ninth (e.g. G9),
- with the major seventh (e.g. F7+, Am7+),
- with a diminished fifth (e.g. C5- or C5b)
- and with an augmented fifth (e.g. C5+ or C5#)

]

.pull-right[
<div class="figure" style="text-align: center">
<img src="img/feat_example.png" alt="Feature extraction example" width="100%" height="100%" />
<p class="caption">Feature extraction example</p>
</div>
]

---

## Manually Extracted Features

**Third set, main chord transitions:**

- percentage of the first, second, and third most common chord transitions in the song

**Fourth set, miscellany:**

- popularity,
- total of non-distinct chords,
- year of album release,
- indicator of the key of the song being the same as the most common chord,
- percentage of chords with varying bass (e.g.
C/E, C/G, C/Bb),
- mean distance of the root note to ’C’ in the circle of fifths,
- mean distance of the root note to ’C’ in semitones,
- absolute number of the most common chord

<br>

> Supplementary features about the release year and popularity were obtained with the help of the well-known **Spotify API**

---

## Machine Learning Algorithm

> Popular Random Forest (Breiman (2001)) model

- A tree ensemble in which only a random subset of `\(m\)` features is allowed as candidates for each split, which helps to create uncorrelated trees
- The model equation can be written as

`$$\hat f(\mathbf{x}) = \sum_{n = 1}^{N_{\text{tree}}} \frac{1}{N_{\text{tree}}} \hat f_n(\mathbf{x}),$$`

where `\(\hat f_n\)` corresponds to the `\(n\)`-th estimated tree, out of a total of `\(N_{\text{tree}}\)` trees, and `\(\mathbf{x}\)` is the feature set

> **Advantage**: We can easily access the importance (misclassification reduction) for each feature used in the model

---
class: inverse, middle, center

### .fancy[Results]

---

# Results

- Four models were fitted in a nested fashion, each new model adding one more feature set to the previous one
- Table 1 shows that, for all four models, there is evidence that the accuracy is significantly higher than the no-information rate
- Adding the feature sets progressively increases the accuracy of the models
- This shows that all four sets of features are informative for predicting the music genres
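
Table 1. Accuracy with its lower and upper bounds, Kappa, and p-value for the four nested models.

| Model   | Accuracy | Lower bound | Upper bound | Kappa | p-value |
|---------|----------|-------------|-------------|-------|---------|
| Model 1 | 0.53     | 0.51        | 0.55        | 0.37  | <0.0001 |
| Model 2 | 0.57     | 0.54        | 0.59        | 0.42  | <0.0001 |
| Model 3 | 0.59     | 0.56        | 0.60        | 0.44  | <0.0001 |
| Model 4 | 0.62     | 0.60        | 0.64        | 0.49  | <0.0001 |
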
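---
class: middle

A p-value like the one in Table 1 can come from a one-sided exact binomial test of the accuracy against the no-information rate, that is, the share of the most frequent genre. The sketch below is in Python rather than the R used in this work, and it is only an illustration: the class counts and the number of correct predictions are hypothetical and are not the data of this study.

```python
from math import comb


def no_information_rate(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p), computed exactly."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical test-set composition (NOT the counts from the study).
class_counts = {"Sertanejo": 40, "MPB": 30, "Rock": 20, "Samba": 10}
n_test = sum(class_counts.values())
n_correct = 62  # hypothetical number of correctly classified songs

accuracy = n_correct / n_test
nir = no_information_rate(class_counts)

# One-sided test: how likely is an accuracy this high if the model
# were no better than always guessing the most frequent genre?
p_value = binomial_upper_tail(n_correct, n_test, nir)

print(f"accuracy={accuracy:.2f}  NIR={nir:.2f}  p-value={p_value:.2e}")
```

A small p-value here only says that the model beats the majority-class guess; it does not say which genres are being confused with each other.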
---
class: middle

- Table 2 shows considerable confusion between MPB and Bossa Nova, highlighting their known harmonic similarities
- The same happens with Forró, Sertanejo and Pop, which are music genres with a similar origin and, in general, more elementary harmonic structures
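
Table 2. Confusion matrix between genres, in row percentages.

| Genre      | Bossa Nova | Forró | MPB | Pop | Reggae | Rock | Samba | Sertanejo |
|------------|------------|-------|-----|-----|--------|------|-------|-----------|
| Bossa Nova | 28%        | 0%    | 40% | 0%  | 0%     | 5%   | 16%   | 12%       |
| Forró      | 0%         | 0%    | 12% | 0%  | 0%     | 12%  | 10%   | 65%       |
| MPB        | 1%         | 0%    | 59% | 0%  | 0%     | 11%  | 13%   | 15%       |
| Pop        | 0%         | 0%    | 13% | 0%  | 0%     | 28%  | 15%   | 44%       |
| Reggae     | 0%         | 0%    | 25% | 0%  | 8%     | 46%  | 8%    | 12%       |
| Rock       | 0%         | 0%    | 16% | 0%  | 0%     | 43%  | 5%    | 35%       |
| Samba      | 1%         | 0%    | 20% | 0%  | 0%     | 3%   | 66%   | 10%       |
| Sertanejo  | 0%         | 0%    | 2%  | 0%  | 0%     | 7%   | 2%    | 89%       |
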
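---
class: middle

Row percentages like those in Table 2 can be computed from the true and predicted labels as sketched below, assuming that each row is a true genre and each column a predicted one. This is a minimal Python illustration, not the R code used in this work; the function name and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def row_percent_confusion(true_labels, predicted_labels):
    """Confusion matrix as percentages of each true class (rows sum to ~100)."""
    counts = defaultdict(Counter)
    for truth, prediction in zip(true_labels, predicted_labels):
        counts[truth][prediction] += 1

    labels = sorted(set(true_labels) | set(predicted_labels))
    table = {}
    for truth in labels:
        row_total = sum(counts[truth].values())
        # A class that never occurs as a true label gets a row of zeros.
        table[truth] = {
            prediction: round(100 * counts[truth][prediction] / row_total)
            if row_total
            else 0
            for prediction in labels
        }
    return table


# Toy labels for illustration only (NOT the data from the study).
true_genres = ["MPB", "MPB", "MPB", "MPB", "Bossa Nova", "Bossa Nova"]
predicted_genres = ["MPB", "MPB", "MPB", "Samba", "MPB", "Bossa Nova"]

confusion = row_percent_confusion(true_genres, predicted_genres)
for genre, row in confusion.items():
    print(genre, row)
```

Normalizing by row keeps the large genres from dominating the table, so a column of zeros, as for Forró and Pop in Table 2, reads directly as a genre that the model never predicts.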
---
class: middle

.pull-left[
<div class="figure" style="text-align: center">
<img src="img/imp_m3.png" alt="Figure 1. Importance plot for the fourth model with all the features. The top part of the plot is dominated by the harmonic features." width="100%" height="100%" />
<p class="caption">Figure 1. Importance plot for the fourth model with all the features. The top part of the plot is dominated by the harmonic features.</p>
</div>
]

.pull-right[

- Figure 1 shows that the first set of features is the most informative one
- The basic chord information alone already gives good results in terms of informing the model about the genres
- The external features (year and popularity) rank high in the plot, showing that the Spotify data is also pertinent
- The position of the transition and distance features strengthens the idea that harmonic characteristics are very important to discriminate between music genres

]

---
class: inverse, middle, center

### .fancy[Conclusions]

---

# Conclusions

- Manually engineered harmonic features can be useful to characterize Brazilian music genres
- The **most discriminative** features are:
  - the percentage of chords with the seventh note,
  - of minor chords with the seventh note,
  - of minor chords,
  - the year of release of the songs,
  - the popularity
  - and the behavior of the most common chord transitions
- Our insights can be extended to other music genres that influenced or were influenced by the genres considered here, such as Jazz, Pop, and Rock music
- Next steps of this work include engineering new variables and applying different machine learning algorithms, as well as further exploring the use of crowdsourced chord data

Links:

- [To code and data](https://github.com/brunaw/genre_classification)
- [To presentation repository](https://github.com/brunaw/genre_characterization)

---

# References

<p><cite><a id='bib-Breiman2001'></a><a href="#cite-Breiman2001">Breiman, L.</a> (2001). “Random forests”.
In: <em>Machine Learning</em>. ISSN: 08856125. DOI: <a href="https://doi.org/10.1023/A:1010933404324">10.1023/A:1010933404324</a>.</cite></p>

<p><cite><a id='bib-Caldas2010'></a><a href="#cite-Caldas2010">Caldas, W.</a> (2010). <em>Iniciação à Música Popular Brasileira</em>. Vol. 1.</cite></p>

<p><cite><a id='bib-Cheng2008'></a><a href="#cite-Cheng2008">Cheng, H., Y. Yang, Y. Lin, et al.</a> (2008). “Automatic chord recognition for music classification and retrieval”. In: <em>2008 IEEE International Conference on Multimedia and Expo</em>. IEEE, pp. 1505–1508.</cite></p>

<p><cite><a id='bib-Correa2016'></a><a href="#cite-Correa2016">Corrêa, D. C. and F. A. Rodrigues</a> (2016). <em>A survey on symbolic data-based music genre classification</em>. DOI: <a href="https://doi.org/10.1016/j.eswa.2016.04.008">10.1016/j.eswa.2016.04.008</a>.</cite></p>

<p><cite><a id='bib-odekerken2020decibel'></a><a href="#cite-odekerken2020decibel">Odekerken, D., H. V. Koops, and A. Volk</a> (2020). “DECIBEL: Improving Audio Chord Estimation for Popular Music by Alignment and Integration of Crowd-Sourced Symbolic Representations”. In: <em>arXiv preprint arXiv:2002.09748</em>.</cite></p>

<p><cite><a id='bib-Rsoftware'></a><a href="#cite-Rsoftware">R Core Team</a> (2018). <em>R: A Language and Environment for Statistical Computing</em>. R Foundation for Statistical Computing. Vienna, Austria. URL: <a href="https://www.R-project.org/">https://www.R-project.org/</a>.</cite></p>

<p><cite><a id='bib-chorrrds'></a><a href="#cite-chorrrds">Wundervald, B.</a> (2018). <em>The chorrrds package for extraction of music chords data in R</em>. URL: <a href="https://github.com/r-music/chorrrds">https://github.com/r-music/chorrrds</a>.</cite></p>

---
class: middle, center, inverse

<font size="60">Thanks! </font>