I’m still working my way through the code that will, I hope, make it possible to compare different sentiment analysis modules in Python effectively. While the code is available as a GitHub [gist], I wanted to post some of the early outcomes here, publishing my failure, as it were.
I began with the raw sentiment scores, which are not very interesting to compare directly, since the different modules use different ranges: quite wide for Afinn, -1 to 1 for TextBlob, and between 0 and 1 for Indico.
To make them more comparable, I needed to normalize them, and to make the whole of it more digestible, I needed to average them. I began with normalizing the values — see the [gist] — and you can already see there’s a divergence in the baseline for which I cannot yet account in my code:
To be honest, I didn’t really notice this until I plotted the average, where the divergence becomes much more apparent.
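For readers who want to see the normalize-then-average step without opening the gist, here is a minimal sketch. It is not necessarily what the gist does: it assumes min-max rescaling of each module's scores onto 0 to 1, the function names `min_max_normalize` and `average_series` are my own for illustration, and the three score lists are made up rather than real output from Afinn, TextBlob, or Indico.

```python
def min_max_normalize(scores):
    """Rescale a sequence of numbers linearly onto [0, 1] using its own min and max."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant series has no spread, so map every value to the midpoint.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def average_series(*series):
    """Average several equal-length series position by position."""
    return [sum(values) / len(values) for values in zip(*series)]


# Made-up scores for five sentences, one list per module (illustrative only).
afinn_raw = [-4.0, 0.0, 2.0, 6.0, -1.0]        # wide, unbounded-looking range
textblob_raw = [-0.5, 0.0, 0.25, 0.75, -0.25]  # falls within -1 to 1
indico_raw = [0.2, 0.5, 0.6, 0.9, 0.4]         # falls within 0 to 1

normalized = [min_max_normalize(s) for s in (afinn_raw, textblob_raw, indico_raw)]
average = average_series(*normalized)

for name, series in zip(("afinn", "textblob", "indico"), normalized):
    print(name, [round(v, 3) for v in series])
print("average", [round(v, 3) for v in average])
```

One thing worth keeping in mind with this approach: rescaling by each series' observed minimum and maximum does not guarantee that every module's neutral score lands on the same normalized value, so it is one possible source of a baseline offset between the lines.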