Thursday, November 21, 2019

Twitter, Trump & the Economy



Mentioning Twitter and Trump in the same sentence has become a lapalissade over the last few years.

It has reached the point where each new tweet is logged, analyzed, and correlated. For a comprehensive archive, I strongly recommend the Trump Twitter Archive: every tweet is there, available for any analysis you can think of!

I recently stumbled across multiple articles on how the number of tweets from Trump was negatively correlated with stock market performance. A few examples:


The articles don't say it explicitly, but they strongly suggest causality, something we'll get back to later.

Reading all this, I decided to take a quick stab at it and see if:

  1. there were any interesting trends in Trump's Twitter usage that hadn't already been reported
  2. I could replicate some of the stock market findings

Twitter activity

Much has already been said about Trump's Twitter patterns, including using activity as a proxy for determining when he sleeps. Here are heatmaps for original tweets, retweets, total tweets and fraction of retweets:


Very dark from midnight to 6am, as one would expect. We can also spot some sweet spots for original content on Monday mornings, to kick off the week, and that same day right before midnight (lighter zones in the upper-right graph).
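For anyone wanting to rebuild this kind of heatmap, here is a minimal sketch of the counting step behind it. It assumes each tweet is a (timestamp, is_retweet) pair with timestamps already converted to Eastern time; the sample tweets and the function names are made up for illustration and are not the code used for the graphs.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample: (timestamp in Eastern time, is_retweet)
tweets = [
    (datetime(2019, 11, 18, 7, 15), False),   # a Monday morning
    (datetime(2019, 11, 18, 7, 40), False),
    (datetime(2019, 11, 18, 23, 50), True),
    (datetime(2019, 11, 19, 8, 5), True),     # a Tuesday morning
]

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def heatmap_counts(tweets, retweets=None):
    """Count tweets per (weekday, hour) cell.

    retweets=None counts everything, True only retweets, False only originals.
    """
    counts = Counter()
    for ts, is_rt in tweets:
        if retweets is None or is_rt == retweets:
            counts[(DAYS[ts.weekday()], ts.hour)] += 1
    return counts

def retweet_fraction_cells(tweets):
    """Fraction of retweets in every cell that has at least one tweet."""
    total = heatmap_counts(tweets)
    rts = heatmap_counts(tweets, retweets=True)
    return {cell: rts[cell] / n for cell, n in total.items()}

originals = heatmap_counts(tweets, retweets=False)
fractions = retweet_fraction_cells(tweets)
```

Each of the four heatmaps is then just one of these dictionaries laid out on a weekday by hour grid.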

I also wanted to explore the evolution of the fraction of retweets over time (# retweets / # total tweets). The data is very noisy, as one can imagine, so I slapped a smoothed curve on top, and the result is quite striking:


I'm guessing the retweet data wasn't collected prior to 2016, but since the election there has been a clear upward trend.
It could be a sign that less time is available to write a tweet from scratch and that retweeting is a simpler way of staying present, or it could be a way of building support ahead of the upcoming elections. It will definitely be interesting to continue monitoring this metric over the next few months and see if we ever cross the 50% mark.
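The retweet fraction itself is simple to compute. Below is a minimal sketch that aggregates by month and smooths with a trailing moving average. Both the monthly granularity and the moving average are assumptions made for illustration (any smoother would do), and the sample data is invented.

```python
from collections import defaultdict
from datetime import date

def monthly_retweet_fraction(tweets):
    """tweets: iterable of (date, is_retweet).

    Returns a list of (YYYY-MM, # retweets / # total tweets), sorted by month.
    """
    totals = defaultdict(int)
    rts = defaultdict(int)
    for day, is_rt in tweets:
        key = day.strftime("%Y-%m")
        totals[key] += 1
        rts[key] += int(is_rt)
    return [(k, rts[k] / totals[k]) for k in sorted(totals)]

def moving_average(values, window=3):
    """Trailing moving average; early points average over what is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Invented sample: no retweets in August, half in September, all in October
sample = [
    (date(2019, 8, 1), False), (date(2019, 8, 2), False),
    (date(2019, 9, 1), True), (date(2019, 9, 5), False),
    (date(2019, 10, 3), True), (date(2019, 10, 4), True),
]
monthly = monthly_retweet_fraction(sample)
smoothed = moving_average([frac for _, frac in monthly], window=2)
```

A wider window gives a smoother curve at the cost of reacting more slowly to real changes in behavior.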


Stock market performance

The original intent was to re-run the analysis mentioned in various articles about the negative correlation between number of tweets and stock market performance.
The articles were not entirely clear on what data was being used; the best description I could find was:
"since 2016, days with more than 35 tweets (90 percentile) by Trump have seen negative returns (-9bp), whereas days with less than 5 tweets (10 percentile) have seen positive returns (+5bp) — statistically significant."

So the time period will be Jan 1st 2016 until today (a little more data than what was originally used). In terms of tweets, it is not entirely clear whether retweets are included, so I will test both alternatives. As for the returns, I will look at three market indicators: Dow Jones, S&P 500 and Nasdaq. For each, I will measure the daily change in both absolute and relative terms, which gives 3 x 2 x 2 = 12 cases in total.
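To make the setup concrete, here is a minimal sketch of the two ways of measuring a daily change, plus an enumeration of the twelve combinations being tested. The closing values are made up and the function name is hypothetical.

```python
from itertools import product

def daily_changes(closes, relative=True):
    """Day-over-day change of a series of closing values.

    relative=True  -> (today - yesterday) / yesterday
    relative=False -> today - yesterday (in index points)
    """
    changes = []
    for prev, cur in zip(closes, closes[1:]):
        diff = cur - prev
        changes.append(diff / prev if relative else diff)
    return changes

closes = [100.0, 102.0, 99.96]          # made-up index levels
absolute = daily_changes(closes, relative=False)
relative = daily_changes(closes, relative=True)

# Three indices x two tweet definitions x two change definitions
indices = ["Dow Jones", "S&P 500", "Nasdaq"]
tweet_counts = ["with retweets", "without retweets"]
change_types = ["absolute", "relative"]
cases = list(product(indices, tweet_counts, change_types))
```

Relative changes are comparable across indices of very different levels, while absolute changes are what headlines usually quote, which is a reasonable motivation for checking both.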

Similarly to the quote, I identified in each case the 10th and 90th percentiles of daily tweet counts, and ran a simple t-test comparing market performance on days below the 10th percentile with days above the 90th percentile. Unfortunately, in none of the 12 cases was I able to identify a significant relationship between Trump's Twitter activity and market performance:
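For reference, here is a minimal sketch of that comparison using only the standard library. Several things in it are assumptions rather than a description of the exact analysis: the Welch (unequal variance) form of the t-test, the normal approximation used for the p-value (reasonable only when both groups are large), and the synthetic data, which is generated with no relationship at all between tweets and returns.

```python
import math
import random
from statistics import NormalDist, mean, quantiles, variance

def split_by_tweet_volume(days):
    """days: list of (tweet_count, market_change).

    Returns the market changes on days below the 10th percentile of tweet
    counts and on days above the 90th percentile.
    """
    counts = [c for c, _ in days]
    deciles = quantiles(counts, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    low = [r for c, r in days if c < p10]
    high = [r for c, r in days if c > p90]
    return low, high

def welch_t(a, b):
    """Welch t-statistic for the difference in means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Synthetic days: tweet counts and returns drawn independently of each other
rng = random.Random(42)
days = [(rng.randint(0, 50), rng.gauss(0.0, 1.0)) for _ in range(500)]

low, high = split_by_tweet_volume(days)
t = welch_t(low, high)
p = two_sided_p_normal(t)
print(f"low days: {len(low)}, high days: {len(high)}, t = {t:.2f}, p ~ {p:.2f}")
```

With real data, a library routine such as SciPy's `ttest_ind` would give an exact p-value from the t distribution instead of this approximation.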



There could be multiple reasons why I wasn't able to replicate the findings: the data source (a good indication of this is that my 10th and 90th percentiles are lower than the values reported in the articles), the formatting (I used Eastern time as the reference, but using a different time zone could shift tweets to another day), or the technique used for detecting a difference (though I also ran non-parametric tests and the results did not change).
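The time zone point is easy to underestimate, so here is a small sketch of how the same tweet can land on different days. It assumes timestamps are stored in UTC (an assumption about the data, not something I verified) and uses fixed standard-time offsets for simplicity; real code should use `zoneinfo.ZoneInfo("America/New_York")` so that daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Fixed offsets: a simplification that ignores daylight saving time
EASTERN_STANDARD = timezone(timedelta(hours=-5))
PACIFIC_STANDARD = timezone(timedelta(hours=-8))

def tweet_day(utc_timestamp, tz):
    """Calendar day a tweet is attributed to when read in time zone tz."""
    return utc_timestamp.astimezone(tz).date()

# Two made-up tweets, timestamps in UTC
late_evening = datetime(2019, 11, 21, 3, 30, tzinfo=UTC)
after_midnight = datetime(2019, 11, 21, 6, 30, tzinfo=UTC)

for tweet in (late_evening, after_midnight):
    print(
        tweet.isoformat(),
        "UTC day:", tweet_day(tweet, UTC),
        "Eastern day:", tweet_day(tweet, EASTERN_STANDARD),
        "Pacific day:", tweet_day(tweet, PACIFIC_STANDARD),
    )
```

Since daily tweet counts decide which days fall below the 10th or above the 90th percentile, shifting even a few tweets across midnight can change group membership.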

One could be surprised by the lack of significance given that Nasdaq has an average uplift of 0.2% on low-tweet days but only 0.06% on high-tweet days, but this is easily explained when visualizing the actual distributions and their large standard deviations:
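A rough back-of-the-envelope version of the same argument: the two means below are the ones quoted above, but the standard deviation (1% per day) and the group sizes (100 days each) are assumptions chosen only to show the order of magnitude, not figures from the analysis.

```python
import math

def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t-statistic computed from summary statistics only."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se

# Means in percent from the text; sd and n are ASSUMED for illustration
t = welch_t_from_summary(0.20, 1.0, 100, 0.06, 1.0, 100)
print(f"t = {t:.2f}")
```

Under those assumptions the statistic comes out at roughly 1, well short of the value of about 2 usually needed for significance at the 5% level: a gap of 0.14 percentage points is small next to day-to-day swings of around 1%.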



Independently of all these potential causes, though, I feel that the interpretation of the result matters more than the result itself.


Correlation VS Causation

Let us assume we had surfaced an incredibly tight correlation between the two metrics: the number of tweets by Trump and market performance. What would that have meant, if anything?

When correlation between two variables A and B is observed, a lot could be going on behind the scenes:
  • A and B have nothing in common and the observed correlation is purely coincidental. This is called a spurious correlation. One of my favorite examples is the correlation between "the number of people who drowned by falling into a pool" and "the number of movies Nicolas Cage appeared in" (example taken from https://www.tylervigen.com/spurious-correlations, which has lots of other great ones!)



  • It might seem that A causes B but it could actually be the reverse. For instance, it could seem that going to the gym more frequently leads to a lower BMI, but perhaps those with lower BMIs simply tend to go to the gym more frequently...
  • The most vicious situation is that of the confounding, or hidden, variable that has a causal impact on both A and B separately. We might see a link between carrying a lighter and risk of lung cancer, but neither affects the other: the real hidden variable is whether the person is a smoker or not.
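The lighter example can be turned into a tiny simulation. This is only a sketch: every probability in it is invented, and carrying a lighter is deliberately given no effect on cancer. The link between the two appears in the overall numbers anyway, and vanishes once smokers and non-smokers are looked at separately.

```python
import random

def simulate(n=50000, seed=0):
    """Generate (smoker, lighter, cancer) rows; all probabilities are made up.

    Smoking drives both lighter-carrying and cancer risk.
    The lighter itself never enters the cancer probability.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smoker = rng.random() < 0.2
        lighter = rng.random() < (0.8 if smoker else 0.05)
        cancer = rng.random() < (0.15 if smoker else 0.01)
        rows.append((smoker, lighter, cancer))
    return rows

def cancer_rate(rows, lighter, smoker=None):
    """Cancer rate among people with the given lighter (and smoker) status."""
    selected = [c for s, l, c in rows
                if l == lighter and (smoker is None or s == smoker)]
    return sum(selected) / len(selected)

rows = simulate()

# Overall: lighter carriers look far more at risk
overall_gap = cancer_rate(rows, True) - cancer_rate(rows, False)

# Within each smoking group, the lighter makes essentially no difference
gap_smokers = cancer_rate(rows, True, smoker=True) - cancer_rate(rows, False, smoker=True)
gap_nonsmokers = cancer_rate(rows, True, smoker=False) - cancer_rate(rows, False, smoker=False)

print(f"overall gap: {overall_gap:.3f}")
print(f"gap among smokers: {gap_smokers:.3f}, among non-smokers: {gap_nonsmokers:.3f}")
```

Splitting by the suspected hidden variable like this only works when that variable has been measured, which is exactly what makes confounders so vicious in practice.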

Back to the tweets. Even assuming a strong, significant correlation between volume of tweets and market performance, one could argue there are explanations other than sheer volume creating market panic. Perhaps it is the reverse effect: when markets perform poorly, Trump is more likely to tweet in an attempt to reassure. Or there could be hidden variables in other economic events: if there are trade tensions with other countries, Trump could be more likely to tweet about those while, in parallel, the market tends to panic and go down...

Anyway, this was just a little tweet for thought...