Data Wrangling Twitter

Contact Information

  • Lindsay Moir
  • 778 679 4406
  • tragoes@gmail.com

Project Description

We are using the tweet archive from @dog_rates (twitter_archive), provided as a .csv file. Tweepy is used to query the Twitter API and receive JSON data. Finally, we have been provided with a file produced at Udacity that classifies pictures of dogs (.tsv), which is downloaded via an HTTP request. This project covers gathering, cleaning, storing, and analyzing the data. We are most interested in which breeds this population of users likes and how their ratings of those breeds have changed over time.

Gather

twitter_archive

image_predictions

There is also an image predictions file that contains tweet IDs and image URLs. For example, this URL gives you the picture of Stuart (the dog): https://twitter.com/dog_rates/status/889531135344209921. The part after status is the tweet ID for Stuart's tweet. Udacity wants us to download this file programmatically via the requests library.
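A minimal sketch of both ideas above: pulling the tweet ID out of a status URL, and downloading a file with requests. The helper names tweet_id_from_url and download_file are hypothetical, and the .tsv URL is not given here, so the caller has to supply it.

```python
from urllib.parse import urlparse


def tweet_id_from_url(status_url: str) -> str:
    """Return the tweet ID, i.e. the path segment that follows 'status'."""
    parts = urlparse(status_url).path.strip("/").split("/")
    # Expected path shape: <screen_name>/status/<tweet_id>
    if "status" not in parts or parts.index("status") + 1 >= len(parts):
        raise ValueError(f"not a tweet status URL: {status_url!r}")
    return parts[parts.index("status") + 1]


def download_file(url: str, dest_path: str, timeout: float = 30.0) -> None:
    """Download url with requests and write the response body to dest_path."""
    # Third-party dependency, imported here so the URL helper above
    # only needs the standard library.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    with open(dest_path, "wb") as handle:
        handle.write(response.content)


stuart_url = "https://twitter.com/dog_rates/status/889531135344209921"
print(tweet_id_from_url(stuart_url))  # prints 889531135344209921
```

The ID is kept as a string on purpose: tweet IDs are long enough that treating them as numbers invites rounding or formatting surprises later on.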

twitter_counts

We need to set up access to the Twitter API via Tweepy.
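One way this setup might look, as a sketch only. The environment variable names and the helpers make_api and extract_counts are assumptions for illustration, as is the guess that twitter_counts holds retweet and favorite counts. tweepy.OAuthHandler is the classic OAuth 1 handler; newer Tweepy releases also expose it as OAuth1UserHandler.

```python
import os


def make_api():
    """Build an authenticated Tweepy client from environment variables.

    The variable names below are hypothetical; use whatever your own
    setup defines. Credentials should stay out of the notebook itself.
    """
    # Third-party dependency, imported here so extract_counts below
    # works without Tweepy installed.
    import tweepy

    auth = tweepy.OAuthHandler(
        os.environ["TWITTER_CONSUMER_KEY"],
        os.environ["TWITTER_CONSUMER_SECRET"],
    )
    auth.set_access_token(
        os.environ["TWITTER_ACCESS_TOKEN"],
        os.environ["TWITTER_ACCESS_SECRET"],
    )
    # wait_on_rate_limit makes Tweepy sleep instead of failing
    # when the rate limit is reached.
    return tweepy.API(auth, wait_on_rate_limit=True)


def extract_counts(tweet_json: dict) -> dict:
    """Keep only the ID and the two counts from one tweet's JSON."""
    return {
        "tweet_id": tweet_json["id_str"],
        "retweet_count": tweet_json.get("retweet_count"),
        "favorite_count": tweet_json.get("favorite_count"),
    }
```

With a client from make_api(), a single tweet could then be fetched with something like api.get_status(tweet_id, tweet_mode="extended") and its JSON passed to extract_counts.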

Assess

General

twitter_archive

After visually inspecting the text field, we may be able to extract gender.
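A rough sketch of how gender might be pulled from the text field, assuming pronouns are the signal. The function guess_gender and the pronoun lists are illustrative assumptions, not a method taken from this project, and the approach will mislabel tweets that mention more than one dog or person.

```python
import re
from typing import Optional

# Whole-word, case-insensitive pronoun matches.
MALE = re.compile(r"\b(he|him|his|himself)\b", re.IGNORECASE)
FEMALE = re.compile(r"\b(she|her|hers|herself)\b", re.IGNORECASE)


def guess_gender(text: str) -> Optional[str]:
    """Return 'male', 'female', or None when the pronouns are absent or tied."""
    male = len(MALE.findall(text))
    female = len(FEMALE.findall(text))
    if male > female:
        return "male"
    if female > male:
        return "female"
    return None


# Hypothetical example text, not taken from the archive.
print(guess_gender("This is Rex. He loves his ball."))  # prints male
```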

Atticus

Time for some fun. Which tweet is the most extreme outlier for the rating numerator?
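One way to answer that question, sketched with the standard library only. It assumes the archive rows are dictionaries (for example from csv.DictReader) and that the numerator lives in a column called rating_numerator; both the column name and the helper most_extreme_outlier are assumptions. "Most extreme" is taken here to mean farthest from the median.

```python
from statistics import median


def most_extreme_outlier(rows, column="rating_numerator"):
    """Return the row whose value in column lies farthest from the median."""
    values = [float(row[column]) for row in rows]
    center = median(values)
    return max(rows, key=lambda row: abs(float(row[column]) - center))
```

The median is used rather than the mean so that the outlier itself does not drag the reference point toward it.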

Let's see who Atticus is.