StockTrendsBot: Using Python to Create a Stock Performance Bot for Reddit

Link to full code:
https://github.com/ericlighthofmann/StockTrendsBot/blob/master/BotRefactorv2_git.py

Link to StockTrendsBot:
https://www.reddit.com/user/StockTrendsBot/


“The dark side of social media is that, within seconds, anything can be blown out of proportion and taken out of context. And it’s very difficult not to get swept up in it all.”
– Nicola Formichetti (1977 – present)

Purpose of the Bot

Using Python with the praw and pandas libraries, I wanted to create a Reddit bot that crawls new posts in various subreddits and looks for mentions of company names. I hoped the bot would show trends in stock prices over several time periods so that whoever is reading the post gets the bigger picture of overall company and stock performance. It’s always important to put things in context, especially in the age of sensationalist click-bait.

How It Was Made: Getting Lists of Public Companies

First, I needed to find a list of all of the public companies across American stock exchanges (currently I’m only doing the NYSE, NASDAQ and AMEX, but this could be extended to international exchanges in the future). NASDAQ.com kindly provides these lists in a downloadable .CSV format here: http://www.nasdaq.com/screening/company-list.aspx. I then needed to format my lists and remove duplicate offerings. For example, 1347 Capital Corp. has the tickers TFSC, TFSCR, TFSCU, and TFSCW. If there were duplicates, I removed all of them except the highest market cap stock. I also formatted the company names so that they would be similar to colloquial use. For example, no one would ever write “International Business Machines Corporation” on Reddit, so I needed to change this to “IBM”. I also took off mentions of “Inc.,” “PLC,” “Corp.,” etc.
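As a rough illustration of the cleanup described above, here is a minimal sketch. It is not the bot's actual code: it uses plain dictionaries instead of pandas to stay self-contained, and the suffix list, the COLLOQUIAL_NAMES table, and the "name", "ticker" and "market_cap" keys are all assumptions made for the example.

```python
import re

# Hypothetical suffix list; the post only names "Inc.", "PLC" and "Corp." explicitly.
SUFFIX_PATTERN = re.compile(
    r"[,\s]+(?:Inc|Corp|Corporation|PLC|Ltd)\.?$", re.IGNORECASE
)

# Hypothetical hand-maintained overrides for names nobody writes out in full.
COLLOQUIAL_NAMES = {"International Business Machines": "IBM"}


def clean_name(name):
    """Strip one trailing legal suffix, then apply any colloquial override."""
    stripped = SUFFIX_PATTERN.sub("", name.strip())
    return COLLOQUIAL_NAMES.get(stripped, stripped)


def dedupe_by_market_cap(rows):
    """Keep one row per cleaned company name: the one with the largest market cap."""
    best = {}
    for row in rows:
        name = clean_name(row["name"])
        if name not in best or row["market_cap"] > best[name]["market_cap"]:
            best[name] = {**row, "name": name}
    return list(best.values())
```

Given several rows for 1347 Capital Corp. (TFSC, TFSCR, TFSCU, TFSCW), dedupe_by_market_cap would keep only the row with the largest "market_cap" value and store its name as "1347 Capital".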

How It Was Made: Searching Reddit for Mentions of a Company

The bot uses the praw library for Python to loop through various subreddits, scanning the 5 newest posts. Every time it posts a comment, it records the ID of the post it replied to, so it only replies to each post once. I’m mostly using business- and investing-related subreddits to lessen the chance of errors. It also handles the sleep-time exception that comes up when the bot is new and has no karma: Reddit limits how often new, karma-less accounts can post. If you’re interested, I’d suggest checking out the full code at the link above. The bot runs indefinitely by using a “while True: job()” statement.

[Image: login.PNG]
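The reply-once logic can be sketched as follows. This is an illustration under assumptions, not the code from the repository: process_new_posts and build_reply are hypothetical names, and the function accepts any iterable of objects with .id, .title and .reply(), which is what praw yields from a call such as reddit.subreddit(name).new(limit=5).

```python
def process_new_posts(submissions, replied_ids, build_reply):
    """Reply once to each submission for which build_reply returns text.

    submissions: iterable of objects exposing .id, .title and .reply(text)
    replied_ids: set of post IDs already answered; updated in place
    build_reply: hypothetical callback returning the reply text, or None
    """
    posted = 0
    for submission in submissions:
        if submission.id in replied_ids:
            continue  # the bot has already commented on this post
        text = build_reply(submission.title)
        if text is None:
            continue  # no company was mentioned
        submission.reply(text)
        replied_ids.add(submission.id)
        posted += 1
    return posted
```

Keeping replied_ids in a set makes the "have I answered this already?" check cheap. To survive restarts, the set would also need to be saved to disk, which this sketch leaves out.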

In order to search for an exact match of the company, I used regex to put boundaries on the outside of the search terms:

[Image: subreddit for.PNG]
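A minimal sketch of that kind of boundary match, using the standard re module. The function name mentions is hypothetical, and the case-insensitive flag is an assumption; the original may well match case-sensitively.

```python
import re


def mentions(company, text):
    """Return True if company appears in text as a whole word or phrase."""
    # re.escape keeps characters such as "." or "&" in a name from being
    # interpreted as regex syntax; \b anchors the match at word boundaries.
    pattern = r"\b" + re.escape(company) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

With the boundaries in place, "Ford" no longer matches inside "afford", but a whole-word hit such as "Sonic" in "sonic booms" still counts. Note that \b only matches next to a letter, digit or underscore, so names that begin or end with punctuation would need different handling.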

How It Was Made: Getting the Current Price of the Stock

Once a mention of a company is found, the bot looks up the ticker symbol that goes with the company name (remember, both are included in the CSV file we originally used as our source data). The lookup works because the names and tickers are kept in two lists in the same order, and Python’s “zip” function loops through both lists in parallel. The bot then takes the ticker symbol, uses the requests library to query Yahoo Finance, and uses BeautifulSoup to parse the returned text.

[Image: current_price]
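The name-to-ticker pairing can be sketched like this. It assumes two equally ordered lists, as described above; find_ticker is a hypothetical name, and the Yahoo Finance request itself is left out because the page structure it depends on is not shown in the post.

```python
import re


def find_ticker(title, names, tickers):
    """Return (name, ticker) for the first company mentioned in title, else None."""
    # zip pairs the i-th name with the i-th ticker, so both lists must
    # be in the same order (here: the row order of the CSV file).
    for name, ticker in zip(names, tickers):
        if re.search(r"\b" + re.escape(name) + r"\b", title):
            return name, ticker
    return None
```

For example, with names ["IBM", "1347 Capital"] and tickers ["IBM", "TFSC"], a title mentioning "1347 Capital" returns ("1347 Capital", "TFSC").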

How It Was Made: Getting the Historical Prices

In order to get historical prices and changes over time, I used Pandas DataReader. This library automatically queries Yahoo Finance based on the dates you input. So, I had it get the current date and time, then query with (year-1) to get the adjusted close price on this date one year ago. Ditto for weeks and months.

The biggest issue was getting it to keep going backwards if the query date fell on a day when the markets were closed (e.g. a weekend or holiday). I ended up having to use a ton of nested try and except blocks. I also had to make special considerations for the month of January. If it tried to go backwards, it would error out because 1-1 = 0 and there’s no month 0… There are still a few problems with it, but if it cannot find data for a previous year, month or week, it will just return blank and the bot won’t post that data point. I suggest looking at the full code to see the try/except blocks. I’m always open to suggestions on how to improve this syntax. There’s got to be a better way :).

Update: I found out about timedelta (part of the datetime module), which makes the process a whole lot simpler. Subtracting a timedelta from a date correctly rolls backward across month and year boundaries. I still needed to use a for loop to make sure that when I was getting historical pricing, it would query the closest trading day.

[Image: yearly_change]
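The timedelta approach with a backward search for the closest trading day can be sketched as below. This is an illustration, not the refactored bot code: the closes dictionary stands in for the data DataReader would return, and the function names and the 7-day lookback limit are assumptions.

```python
from datetime import date, timedelta


def closest_trading_day(target, closes, max_lookback=7):
    """Walk backward from target until a date with a recorded close is found."""
    for offset in range(max_lookback + 1):
        candidate = target - timedelta(days=offset)
        if candidate in closes:
            return candidate
    return None  # no trading day found within the lookback window


def percent_change(current_price, today, days_back, closes):
    """Percent change versus the close on or just before today - days_back."""
    past_day = closest_trading_day(today - timedelta(days=days_back), closes)
    if past_day is None:
        return None  # the bot would simply skip this data point
    past_price = closes[past_day]
    return (current_price - past_price) / past_price * 100
```

Because timedelta works in days, the January problem disappears: date(2016, 1, 1) - timedelta(days=7) is simply date(2015, 12, 25). timedelta has no months or years argument, so a month or a year has to be approximated as a number of days (for example 30 and 365).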

How It Was Made: Formatting the Output Text

Once the bot has collected all of the relevant data, it formats the comment to look aesthetically pleasing and changes the wording a bit based on the numbers. For the up and down arrows I used HTML character references (&#x25B2; and &#x25BC;), which render as ▲ and ▼.

[Image: output_text]
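A minimal sketch of this formatting step. The wording of the comment and the function names are made up for illustration, and it uses the Unicode characters directly rather than the HTML references mentioned above.

```python
UP_ARROW = "\u25B2"    # black up-pointing triangle
DOWN_ARROW = "\u25BC"  # black down-pointing triangle


def format_change(label, percent):
    """Format one data point, or return None if there is no data for it."""
    if percent is None:
        return None
    arrow = UP_ARROW if percent >= 0 else DOWN_ARROW
    return f"{label}: {arrow} {abs(percent):.2f}%"


def build_comment(company, ticker, price, changes):
    """Join the current price and every available change into one comment."""
    lines = [f"{company} ({ticker}) current price: ${price:.2f}"]
    for label, percent in changes:
        line = format_change(label, percent)
        if line is not None:
            lines.append(line)  # missing data points are skipped entirely
    # Reddit's Markdown needs a blank line between paragraphs.
    return "\n\n".join(lines)
```

Passing None for a period the bot could not find data for leaves that line out of the comment, matching the "return blank and don't post that data point" behaviour described earlier.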

Conclusion

The bot is currently running on 10 different subreddits and I’ve only received one death threat so far (it was a joke death threat, don’t worry!). In the future, it would be great to have it run on r/all and scan the language in articles or text posts to determine whether the poster was indeed talking about the company. If I run it in a non-business-focused subreddit, it often posts false matches. For example, someone was discussing “sonic booms in South Carolina” and the bot posted the stock price for Sonic (the fast food company).

Please send me any feedback, suggestions for improvement, etc., and thanks for reading!

Update 2/2/2016: I’ve completely refactored the code, removing almost all of the “try, except” clauses that were used previously. Instead, I’ve used timedelta, part of the datetime module, to subtract days using Python instead of manually coding for each day possibility. This significantly shortened my code (1,000 lines to about 250 lines) and made it much more readable. The current GitHub link is updated at the top of this page.

