Analyzing the Impact of Political Messages with the Wolfram Language
Much effort and money are spent trying to analyze whether political messages resonate with the electorate. With the UK in its final days before a general election, I thought I would see if I could gain such insight with minimal effort.
My approach is simple: track the sentiment of tweets that mention each party. Since the Wolfram Language has a built-in sentiment classifier and connections to external services, we can analyze these messages with only a few lines of code.
Setting Up the Data
First I need to set up a database to store the accumulated data in the cloud. Our Wolfram Data Drop service takes care of this automatically. All I have to do is execute the following code:
CreateDatabin[Permissions -> "Private"]
Now that I have somewhere to store the data, I want to set up an hourly service in the Wolfram Cloud that will log the sentiment at that moment. The process is to connect to Twitter and fetch up to five hundred tweets about each party. We then clean the tweets up to remove non-words and Twitter syntax, and score them using the off-the-shelf sentiment analyzer. Finally, we write the numbers into the database:
With[{
   databin = Databin["IDVxLzdw"],
   tweetClean = Function[{tweet},
     ToLowerCase[
      ImportString[
       StringReplace[tweet, {
         StartOfString ~~ "RT " -> "", (*remove the retweet label*)
         "@" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ", (*remove user names*)
         "http:" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ", (*delete hyperlinks*)
         "https:" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ",
         "#" -> "", (*turn hashtags into words*)
         "\n" -> " " (*remove line breaks*)
         }, IgnoreCase -> True],
       "HTML"]]]},
 With[{
   meanSentiment = Function[{tweets},
     Mean[
      ReplaceAll[
       Classify["Sentiment",
        Select[
         Normal[tweets[DeleteDuplicates, tweetClean[Slot["Text"]] &]],
         StringQ]],
       {Alternatives["Neutral", Indeterminate] -> 0.,
        "Positive" -> 1.,
        "Negative" -> -1.}]]]},
  CloudDeploy[
   ScheduledTask[
    Block[{twitter, conservativeTweets, labourTweets,
      conservativeSentiment, labourSentiment},
     twitter = ServiceConnect["Twitter"];
     conservativeTweets = twitter["TweetSearch", "Query" -> "conservative", MaxItems -> 500];
     labourTweets = twitter["TweetSearch", "Query" -> "labour", MaxItems -> 500];
     ServiceDisconnect[twitter];
     conservativeSentiment = meanSentiment[conservativeTweets];
     labourSentiment = meanSentiment[labourTweets];
     If[And[NumberQ[conservativeSentiment], NumberQ[labourSentiment]],
      DatabinAdd[databin,
       Association["Conservative" -> conservativeSentiment,
        "Labour" -> labourSentiment]]]],
    "Hourly"]]]]
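To see what the cleaning and scoring steps do in isolation, here is a minimal sketch that needs neither Twitter nor the cloud. It reuses the replacement rules from tweetClean (leaving out the HTML import step for brevity) and the same label-to-number mapping as meanSentiment. The names cleanRules, sample and labels are my own, the tweet is invented, and the list of labels is a hypothetical stand-in for what Classify["Sentiment", ...] might return:

```wolfram
(* Same replacement rules as tweetClean; the ImportString[..., "HTML"] step is omitted *)
cleanRules = {
   StartOfString ~~ "RT " -> "",
   "@" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ",
   "http:" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ",
   "https:" ~~ Shortest[__] ~~ WhitespaceCharacter | EndOfString -> " ",
   "#" -> "",
   "\n" -> " "};

(* Invented tweet, for illustration only *)
sample = "RT @someone: Great #policy announcement https://example.com/abc";

(* Strips the retweet label, the user name and the link; keeps the hashtag word *)
ToLowerCase[StringReplace[sample, cleanRules, IgnoreCase -> True]]

(* Hypothetical classifier output standing in for Classify["Sentiment", ...] *)
labels = {"Positive", "Negative", "Neutral", Indeterminate, "Negative"};

(* Same scoring as meanSentiment: +1 positive, -1 negative, 0 otherwise *)
Mean[labels /. {"Neutral" | Indeterminate -> 0., "Positive" -> 1., "Negative" -> -1.}]
(* -0.2 *)
```

A score of -0.2 means that negative tweets outnumber positive ones by a margin equal to 20% of the sample, so the stored number is a net balance rather than a measure of how strongly anyone feels.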
Having executed that code once, I now wait while the data accumulates. Fortunately, UK elections are brief and fairly intense affairs that are over in a month.
A Brief UK Election Primer
For the non-British, here is a quick primer on UK politics. There are two main parties: the Conservatives (who are more right wing, with a preference for free markets, low taxes and smaller government) and Labour (who are more socialist, with a preference for more government spending and more state control). Americans should probably think of Conservatives as equivalent to Republicans and Labour as Democrats; however, UK politics are generally more socialist than US politics, but more right wing than those of much of Europe. While a third party, the Liberal Democrats, receives significant votes, due to the nature of our voting system it wins few seats in Parliament. There are a number of other parties, mostly focused on regional issues, that take a handful of seats in Parliament. To make my task more complicated, the Conservative Party is also known as the Tory Party (more often by its opponents, who use the term pejoratively). I have ignored all these complexities by just looking for tweets with the two main official party names.
Sentiment Analysis
By now, my code has been in the cloud for a few weeks—let’s see what we collected:
Dataset[Databin["IDVxLzdw"]]
The results are easier to work with if we fetch the data as a set of TimeSeries:
data = TimeSeries[Databin["IDVxLzdw"]]
The first thing we can see is that most of the time, the average sentiment for both parties is negative. I don’t know if that is telling us something about the nature of discourse on Twitter or about the public’s opinion of its politicians:
DateListPlot[data, PlotStyle -> {Red, Blue}]
The data is pretty noisy. If I were actually running a party’s campaign, I would want really up-to-date metrics. But for a high-level overview, I am going to smooth that with a moving average of 12 hours:
DateListPlot[MovingMap[Mean, #, Quantity[12, "Hours"]] & /@ data,
 PlotStyle -> {Red, Blue}, BaseStyle -> 16]
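To make the effect of the 12-hour window concrete, here is a small sketch on invented data rather than on the databin. The series toy is a made-up hourly signal with added noise; only MovingMap and the window length come from the code above, and every other name is an assumption for illustration:

```wolfram
(* Invented hourly series: a slow oscillation plus noise, not real sentiment data *)
SeedRandom[1];
start = DateObject[{2019, 11, 16, 0, 0, 0}];
toy = TimeSeries[
   Table[{DatePlus[start, Quantity[h, "Hours"]],
     Sin[h/6.] + RandomReal[{-0.5, 0.5}]}, {h, 0, 71}]];

(* Same smoothing as above: mean over a sliding 12-hour window *)
smoothed = MovingMap[Mean, toy, Quantity[12, "Hours"]];

DateListPlot[{toy, smoothed}, PlotLegends -> {"raw", "12-hour mean"}]
```

A longer window gives a cleaner curve but reacts more slowly to real changes, which is why someone running a campaign might prefer a shorter one than I used here.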
The first observation is that sentiment is generally slightly more negative about the Conservatives (who are the incumbent government). It is interesting to see that there are times when sentiment becomes more positive for both parties, and other times where the sentiment moves in opposite directions.
If we want to see the relative movement, we can simply find the difference between the sentiments expressed about the two parties:
trendTS = #Conservative - #Labour &[
   MovingMap[Mean, #, Quantity[12, "Hours"]] & /@ data];
trendPlot = DateListPlot[trendTS,
  FrameTicks -> {{Automatic, {{-0.1, "Labour"}, {0.05, "Conservative"}}}, {Automatic, None}},
  BaseStyle -> 16]
So the obvious question is, what happened to cause the peaks and troughs of this curve? It does look a bit like Labour generally does better on weekends.
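The weekend impression could be checked rather than eyeballed. The following is only a sketch of one way to do it, run on an invented stand-in series so that it is self-contained; to test the real data, toy would be replaced by trendTS. The helper weekendQ is my own name, not something used elsewhere in this post:

```wolfram
(* Invented stand-in for trendTS (Conservative minus Labour); two weeks of hourly values *)
start = DateObject[{2019, 11, 16, 0, 0, 0}];
toy = TimeSeries[
   Table[{DatePlus[start, Quantity[h, "Hours"]], Cos[h/9.]}, {h, 0, 24*14 - 1}]];

(* Hypothetical helper: True for Saturdays and Sundays *)
weekendQ[date_] := MemberQ[{Saturday, Sunday}, DayName[date]];

(* Mean difference for weekend (True) and weekday (False) samples *)
GroupBy[Transpose[{toy["Dates"], toy["Values"]}],
 (weekendQ[First[#]] &) -> Last, Mean]
```

Because the series is Conservative minus Labour, a lower mean under the True key than under the False key would be consistent with Labour doing better on weekends. With only a few weekends in the campaign, any such difference should be read cautiously.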
Headlines as Context
In a short election cycle, the narrative moves quickly, but I went through an archive of the newspaper front pages to see what the dominant story was at various times and annotated the plot:
Show[trendPlot,
 DateListPlot[
  Callout[{First[#], trendTS[First[#]]}, Last[#]] & /@ {
    {"07:00, 16 Nov 2019", "Labour to\nnationalise broadband"},
    {"07:00, 17 Nov 2019", "Prince Andrew\nInterview"},
    {"20:00, 19 Nov 2019", "TV Debate"},
    {"07:00, 22 Nov 2019", "Labour to\nspend £80bn"},
    {"07:00, 25 Nov 2019", "Conservatives to\nhire 50k nurses"},
    {"07:00, 28 Nov 2019", "Labour claim\nNHS to be sold"},
    {"17:00, 29 November 2019", "Terrorist attack"},
    {"20:00, 1 Dec 2019", "TV Debate"},
    {"07:00, 3 Dec 2019", "NHS dossier\nRussia link"},
    {"07:00, 4 Dec 2019", "Trump comments\non NHS"},
    {"21:00, 6 December 2019", "TV Debate"},
    {"07:00, 9 December 2019", "Brexit\nleak"},
    {"12:00, 9 December 2019", "NHS photo gaffe"}
    },
  Joined -> False]]
The only general conclusion that I can draw is that when a party announces a policy, Twitter reacts negatively. The depressing conclusion would be that politicians should avoid policies and stick to slogans.
Here are some interpretation notes on the key recent events in the UK:
- November 16: Labour announces a plan to nationalize British Telecom and offer free broadband. Nationalizing private industries hasn’t happened in the UK since the 1970s.
- November 17: The country is captivated for several days by Prince Andrew’s ill-judged interview about his friendship with Jeffrey Epstein. As well as considering the interview to be a distraction from politics, Labour supporters are more likely to be anti-royal and also anti-billionaire.
- November 22: Labour plans to spend an extra £88 billion on all kinds of public services. This is a lot by UK standards, even for Labour governments.
- November 28: Labour claims a leaked document proves that the Conservatives plan to sell the National Health Service (NHS) to America. The NHS is a hot-button issue in the UK and always an important element of political debate.
- November 29: A recently released convicted terrorist kills two people in London. Initially, Labour focuses the blame on the Conservative government having reduced police funding, but gradually the story switches to decisions made by the previous Labour government that caused the earlier release of the culprit.
- December 3: Claims are made that Labour’s document on the NHS was supplied by Russia trying to influence the election.
- December 4: Donald Trump says that he isn’t interested in the NHS.
- December 9: Leaked documents show that the Conservatives’ plan for Brexit will be difficult to implement quickly. Brexit (the departure of the UK from the European Union) has dominated UK politics for several years and is a central question for this election.
- December 9: During an interview, Conservative leader Boris Johnson refuses to look at a photo of a child sleeping on a hospital floor.
While the Wolfram Language makes it easy to collect, score and visualize the data from Twitter, interpretation remains tricky. Some days with interesting sentiment data do not appear to be correlated with strong headline stories in the newspapers, making it difficult to understand what has driven the sentiment. In hindsight, I should have put all of the Twitter data into the databin so that I could have analyzed content as well as sentiment. It does appear that policy announcements drive negative Twitter sentiment against the party making the announcement, which perhaps explains why so much of politics is about slogans rather than policies. One thing we can be certain about, however the election turns out: these peaks and troughs of sentiment will continue well into the future.
Dear Jon,
that is a very nice analysis indeed. I am aiming at annotating similar plots automatically, i.e. by mining news on the internet (mostly via IFTTT or https://www.gdeltproject.org), which I hope to use for annotations. So I am mining tweets for different topics and then hope to identify changes in sentiment or in the frequency of certain terms (all sorts of NLP), and then look those up in my news database to get an idea of what could have caused them.
Thanks a lot for posting.
All the best from Aberdeen,
Marco