Twitter Data Wrangling

To view the Jupyter Notebook for this project (HTML format), click this link:

Twitter Data Wrangling

The technologies used were Python, pandas, and a variety of plotting packages. The target account was @dog_rates. I used Tweepy (a Python library) to access the Twitter API and retrieve tweet data as JSON. I also had access to a file of machine learning predictions that classified the dog pictures (.tsv), which I downloaded with Python’s requests library. The project covers gathering, cleaning, storing, and analyzing the data. I was most interested in which breeds this population of users likes and how their ratings of those breeds have changed over time.
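To give a feel for the cleaning step, here is a minimal sketch of joining tweet JSON to the image predictions. It is an illustration, not the notebook's actual code: the sample records, the field names (`id`, `favorite_count`, `tweet_id`, `breed`, `is_dog`), and the helper functions are all hypothetical, and it uses only the standard library so it runs without network access or API keys. In pandas, the same join would typically be a merge on the tweet id.

```python
import csv
import io
import json

# Hypothetical sample data. The field names are assumptions made for
# this sketch, not the project's actual schema.
TWEETS_JSON_LINES = """\
{"id": 1, "favorite_count": 120, "retweet_count": 30}
{"id": 2, "favorite_count": 45, "retweet_count": 5}
{"id": 3, "favorite_count": 300, "retweet_count": 80}
"""

PREDICTIONS_TSV = (
    "tweet_id\tbreed\tconfidence\tis_dog\n"
    "1\tgolden_retriever\t0.92\tTrue\n"
    "2\tpaper_towel\t0.61\tFalse\n"
    "3\tpug\t0.88\tTrue\n"
)


def load_tweets(text):
    """Parse one JSON object per line into a dict keyed by tweet id."""
    tweets = {}
    for line in text.splitlines():
        if line.strip():
            record = json.loads(line)
            tweets[record["id"]] = record
    return tweets


def load_predictions(text):
    """Read tab-separated prediction rows into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def join_dog_tweets(tweets, predictions):
    """Keep predictions flagged as dogs and attach each tweet's like count."""
    joined = []
    for row in predictions:
        tweet = tweets.get(int(row["tweet_id"]))
        if tweet is None or row["is_dog"] != "True":
            continue  # skip unmatched tweets and non-dog pictures
        joined.append(
            {
                "tweet_id": tweet["id"],
                "breed": row["breed"],
                "favorite_count": tweet["favorite_count"],
            }
        )
    return joined


dog_tweets = join_dog_tweets(
    load_tweets(TWEETS_JSON_LINES), load_predictions(PREDICTIONS_TSV)
)
for item in dog_tweets:
    print(item)
```

Filtering on the classifier's dog flag before analysis keeps pictures that are not dogs out of any breed-level comparison.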

If this interests you, please contact me. This project was done as part of my coursework at Udacity (www.udacity.com) for the Data Analyst Nanodegree (DAND).

Red Wine (RStudio and ggplot)

To view the R Markdown (RMD) file for this project (HTML format), click this link:

RStudio Analysis of Red Wine Dataset

The technologies used were R, RStudio, and ggplot. I explored a tidy data set that contains 1,599 red wines with 11 variables on the chemical properties of the wine. At least 3 wine experts rated the quality of each wine, providing a rating between 0 (very bad) and 10 (very excellent). I produced a linear model that “reliably” predicted the quality of the wine. The model was somewhat like a friend of mine: neither has ever met a bad bottle.

If you want to talk to me about this, please contact me. This project was done as part of my coursework at Udacity (www.udacity.com) for the Data Analyst Nanodegree (DAND).