Data Wrangling Twitter

Table of Contents

Project Description

We are using the twitter_archive from @dog_rates, provided as a .csv file. Tweepy is used to query the Twitter API and retrieve JSON data. Finally, we have been provided with a file produced at Udacity that classifies pictures of dogs (.tsv), which is downloaded via an HTTP request. This project covers gathering, cleaning, storing, and analyzing the data. We are most interested in which breeds this population of users likes and how their ratings of those breeds have changed over time.




There is also an image file that contains tweet IDs and URLs for images. For example, this URL gives you Stuart's picture (Stuart is the dog). The part after status is the tweet ID for Stuart. Udacity wants us to download this file programmatically via the requests library.
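A minimal sketch of the programmatic download, assuming the file is served at a plain HTTP(S) URL. The helper name `download_file`, the `get` parameter, and the output file name are illustrative and not part of the original project.

```python
from pathlib import Path


def download_file(url, dest, get=None):
    """Download `url` and save the response body to `dest`; return the path."""
    if get is None:
        # Imported lazily so the helper can be exercised without network access.
        import requests
        get = requests.get
    response = get(url, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    dest = Path(dest)
    dest.write_bytes(response.content)  # write raw bytes, no decoding needed
    return dest
```

With the URL supplied by Udacity, a call such as `download_file(url, "image_predictions.tsv")` would save the file locally (the file name is an assumption), after which it can be read with `pd.read_csv(path, sep="\t")`.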


We need to set up access to the Twitter API via Tweepy.
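One way this setup might look, as a sketch rather than the exact code used here. It assumes OAuth 1 credentials from a Twitter developer account. The helper names `build_api` and `save_tweets` are hypothetical. `tweepy.OAuthHandler` is also exposed as `OAuth1UserHandler` in newer Tweepy releases.

```python
import json


def build_api(consumer_key, consumer_secret, access_token, access_secret):
    """Create an authenticated Tweepy API client from OAuth 1 credentials."""
    import tweepy  # third-party; imported here so save_tweets works without it
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    # wait_on_rate_limit makes Tweepy sleep instead of failing when throttled.
    return tweepy.API(auth, wait_on_rate_limit=True)


def save_tweets(api, tweet_ids, path):
    """Write each tweet's JSON on its own line; return the IDs that failed."""
    failed = []
    with open(path, "w", encoding="utf-8") as out:
        for tweet_id in tweet_ids:
            try:
                status = api.get_status(tweet_id, tweet_mode="extended")
            except Exception:
                # Deleted or protected tweets raise; the specific exception
                # class depends on the Tweepy version, so this stays broad.
                failed.append(tweet_id)
                continue
            out.write(json.dumps(status._json) + "\n")
    return failed
```

Storing one JSON object per line keeps the file easy to read back later, line by line, with `json.loads`.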




After visually inspecting the text field, we may be able to extract gender.
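A rough sketch of one possible approach: look for gendered pronouns in the tweet text. The word lists and the function name `infer_gender` are assumptions for illustration. Texts with no clue, or with conflicting clues, are left as `None`.

```python
import re

# Hypothetical word lists; extend them after inspecting more tweets.
MALE_WORDS = {"he", "him", "his", "he's", "himself", "boy"}
FEMALE_WORDS = {"she", "her", "hers", "she's", "herself", "girl"}


def infer_gender(text):
    """Return "male", "female", or None when the text is ambiguous or silent."""
    # Tokenize into lowercase words so "the" never matches "he".
    words = set(re.findall(r"[a-z']+", text.lower()))
    is_male = bool(words & MALE_WORDS)
    is_female = bool(words & FEMALE_WORDS)
    if is_male == is_female:  # neither found, or both found
        return None
    return "male" if is_male else "female"
```

On a DataFrame this could be applied with something like `df["text"].apply(infer_gender)`, where `df` stands for the archive table.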


Time for some fun. Which is the most extreme outlier for numerator?
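One way to answer this, sketched in plain Python: treat the most extreme outlier as the value farthest from the median. This definition and the helper name `most_extreme` are assumptions, not something the original analysis specifies.

```python
from statistics import median


def most_extreme(values):
    """Return (index, value) of the entry farthest from the median."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    center = median(values)
    # On ties, max() keeps the first index with the largest deviation.
    index = max(range(len(values)), key=lambda i: abs(values[i] - center))
    return index, values[index]
```

If the numerators live in a pandas column (for example one named `rating_numerator`, which is an assumption), `df["rating_numerator"].idxmax()` gives the row label of the largest value directly.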

Let's see who Atticus is.