class: center, middle, inverse, title-slide

# Twitter API
## IPSA-Flacso Summer School
### Tiago Ventura

---

# Introduction

Today, we will give a brief overview of how to use the Twitter REST and Streaming APIs.

---

# Accessing Twitter Data

To access the Twitter API, we will use the [rtweet](https://github.com/ropensci/rtweet) package. Our first step is to install the package.

```r
# Install once (uncomment the line you need)
# devtools::install_github("ropensci/rtweet", force = TRUE)  # development version
# install.packages("rtweet")                                  # CRAN release, only once

library(rtweet)
library(tidyverse)
library(broom)
```

---

## Getting your Credentials

As we learned in the last class, some APIs require prior registration before they authorize access to their data. This is the case with Twitter. To request credentials, you need to:

- have a Twitter account,
- create a Twitter developer account.

The `rtweet` package has an excellent tutorial on how to request developer access.

```r
vignette("auth", package = "rtweet")
```

---

After you get your credentials, you have to pass them to R. We do this by storing them in objects (the token functionality itself comes from the rtweet package).

```r
# Replace each "X" with the values from your developer account
app_name <- "Tiago Ventura"
consumer_key <- "X"
consumer_secret <- "X"
access_token <- "X"
access_token_secret <- "X"
```

Create your access token:

```r
create_token(app = app_name,
             consumer_key = consumer_key,
             consumer_secret = consumer_secret,
             access_token = access_token,
             access_secret = access_token_secret)
```

This function saves your credentials in your environment, so you don't need to repeat this step in the future.

---

## REST API: Getting Past Tweets

Twitter has two APIs: the REST API and the Streaming API. Let's start with the REST API. This API allows:

- Access to tweets from the last 6-9 days.
- Up to 18,000 tweets per 15-minute window.

To search with `search_tweets()`, you need to enter a search term. Twitter's advanced search helps you format the appropriate search terms when you are interested in more than a single word.

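
When the search involves more than one word, it can help to assemble the query string in R before sending it. The sketch below is only an illustration: `build_query()` is a hypothetical helper (not part of rtweet) that joins terms with the `OR` operator and wraps multi-word terms in double quotes so they are matched as exact phrases. The terms themselves are made-up examples, and the actual search only runs if you flip `run_search` after creating your token.

```r
# Hypothetical helper (not from rtweet): combine several terms into one query.
# Terms containing a space are wrapped in double quotes (exact phrase match).
build_query <- function(terms, operator = "OR") {
  quoted <- ifelse(grepl(" ", terms, fixed = TRUE),
                   paste0('"', terms, '"'),
                   terms)
  paste(quoted, collapse = paste0(" ", operator, " "))
}

# Example terms are assumptions chosen for illustration
query <- build_query(c("bolsonaro", "lula", "CPI da Covid"))
print(query)

# Set to TRUE only after create_token() has been run with valid credentials
run_search <- FALSE
if (run_search) {
  tweets <- rtweet::search_tweets(query, n = 50, include_rts = FALSE)
}
```

Building the string separately makes it easy to inspect the query before spending any of your rate limit on it.
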
```r
# Search the most recent tweets mentioning "bolsonaro", retweets included
bolsonaro_tweets <- search_tweets("bolsonaro", n = 50, include_rts = TRUE)
bolsonaro_tweets %>% as_tibble()
```

```
## # A tibble: 50 × 35
##    created_at        id id_str  full_text    truncated display_text_ra… entities
##    <chr>          <dbl> <chr>   <chr>        <lgl>                <dbl> <list>
##  1 Tue Aug 31 … 1.43e18 143282… "Na passage… FALSE                   49 <named …
##  2 Wed Sep 01 … 1.43e18 143288… "Bastidores… FALSE                  280 <named …
##  3 Wed Sep 01 … 1.43e18 143316… "Exclusivo … FALSE                  176 <named …
##  4 Thu Sep 02 … 1.43e18 143323… "RT @Guilhe… FALSE                   77 <named …
##  5 Thu Sep 02 … 1.43e18 143323… "RT @Hays88… FALSE                  140 <named …
##  6 Thu Sep 02 … 1.43e18 143323… "RT @Deputa… FALSE                  140 <named …
##  7 Thu Sep 02 … 1.43e18 143323… "RT @tietad… FALSE                  128 <named …
##  8 Thu Sep 02 … 1.43e18 143323… "RT @PATRlO… FALSE                  140 <named …
##  9 Thu Sep 02 … 1.43e18 143323… "Após press… FALSE                  188 <named …
## 10 Thu Sep 02 … 1.43e18 143323… "RT @SigaGa… FALSE                  140 <named …
## # … with 40 more rows, and 28 more variables: metadata <list>, source <chr>,
## #   in_reply_to_status_id <dbl>, in_reply_to_status_id_str <chr>,
## #   in_reply_to_user_id <int>, in_reply_to_user_id_str <chr>,
## #   in_reply_to_screen_name <chr>, geo <lgl>, coordinates <list>, place <list>,
## #   contributors <lgl>, is_quote_status <lgl>, retweet_count <int>,
## #   favorite_count <int>, favorited <lgl>, retweeted <lgl>,
## #   possibly_sensitive <lgl>, lang <chr>, retweeted_status <list>, …
```

---

## Getting timelines

`get_timelines()` returns the most recent tweets posted by the profiles you name.

```r
# Up to 100 tweets from each of the three profiles
timelines_covid <- get_timelines(c("renancalheiros", "ottoalencar", "EduGiraoOficial"), n = 100)
timelines_covid %>% as_tibble()
```

```
## # A tibble: 300 × 35
##    created_at        id id_str  full_text    truncated display_text_ra… entities
##    <chr>          <dbl> <chr>   <chr>        <lgl>                <dbl> <list>
##  1 Wed Sep 01 … 1.43e18 143312… "Há algo de… FALSE                  275 <named …
##  2 Sun Aug 29 … 1.43e18 143202… "A CPI vem … FALSE                  233 <named …
##  3 Sun Aug 29 … 1.43e18 143199… "Lúcido, cl… FALSE                  206 <named …
##  4 Sat Aug 28 … 1.43e18 143162… "Tentaram n… FALSE                  274 <named …
##  5 Thu Aug 26 … 1.43e18 143098… "No começo … FALSE                  257 <named …
##  6 Wed Aug 25 … 1.43e18 143061… "O depoimen… FALSE                  280 <named …
##  7 Tue Aug 24 … 1.43e18 143020… "Os atraves… FALSE                  272 <named …
##  8 Mon Aug 23 … 1.43e18 142981… "Os coices … FALSE                  271 <named …
##  9 Fri Aug 20 … 1.43e18 142883… "Nunca se v… FALSE                  280 <named …
## 10 Thu Aug 19 … 1.43e18 142842… "A CPI tent… FALSE                  280 <named …
## # … with 290 more rows, and 28 more variables: source <chr>,
## #   in_reply_to_status_id <dbl>, in_reply_to_status_id_str <chr>,
## #   in_reply_to_user_id <dbl>, in_reply_to_user_id_str <chr>,
## #   in_reply_to_screen_name <chr>, geo <lgl>, coordinates <list>, place <list>,
## #   contributors <lgl>, is_quote_status <lgl>, retweet_count <int>,
## #   favorite_count <int>, favorited <lgl>, retweeted <lgl>, lang <chr>,
## #   possibly_sensitive <lgl>, quoted_status_id <dbl>, …
```

---

## Information about the users

```r
# Profile-level information for each account
users <- lookup_users(c("renancalheiros", "ottoalencar", "EduGiraoOficial"))
users %>% as_tibble()
```

```
## # A tibble: 3 × 21
##        id id_str  name   screen_name  location  description     url    protected
##     <dbl> <chr>   <chr>  <chr>        <chr>     <chr>           <chr>  <lgl>
## 1 1.65e 9 165033… Renan… renancalhei… ""        Senador da Rep… https… FALSE
## 2 2.38e 9 237750… Otto … ottoalencar  "Bahia"   Perfil Oficial… https… FALSE
## 3 1.02e18 102440… Eduar… EduGiraoOfi… "Fortale… Senador indepe… https… FALSE
## # … with 13 more variables: followers_count <int>, friends_count <int>,
## #   listed_count <int>, created_at <chr>, favourites_count <int>,
## #   verified <lgl>, statuses_count <int>, profile_image_url_https <chr>,
## #   profile_banner_url <chr>, default_profile <lgl>,
## #   default_profile_image <lgl>, withheld_in_countries <list>, entities <list>
```

---

## Recent likes

```r
# Tweets recently liked by each account
favs <- get_favorites(c("renancalheiros", "ottoalencar", "EduGiraoOficial"))
favs %>% as_tibble()
```

```
## # A tibble: 586 × 35
##    created_at        id id_str  full_text    truncated display_text_ra… entities
##    <chr>          <dbl> <chr>   <chr>        <lgl>                <dbl> <list>
##  1 Thu Aug 26 … 1.43e18 143093… "Ao menos u… FALSE                  252 <named …
##  2 Mon Aug 23 … 1.43e18 142982… "Tentam pro… FALSE                  196 <named …
##  3 Mon Aug 23 … 1.43e18 142981… "@renancalh… FALSE                   35 <named …
##  4 Wed Aug 18 … 1.43e18 142803… "@renancalh… FALSE                  290 <named …
##  5 Thu Aug 12 … 1.43e18 142588… "Aos cuidad… FALSE                  166 <named …
##  6 Tue Aug 10 … 1.42e18 142492… "@rodaviva … FALSE                  205 <named …
##  7 Tue Aug 10 … 1.42e18 142491… "Talvez se … FALSE                  142 <named …
##  8 Tue Aug 10 … 1.42e18 142491… "Que imbeci… FALSE                  155 <named …
##  9 Tue Aug 10 … 1.42e18 142489… "🍿 Boa noit… FALSE                   76 <named …
## 10 Fri Aug 06 … 1.42e18 142345… "@renancalh… FALSE                   56 <named …
## # … with 576 more rows, and 28 more variables: source <chr>,
## #   in_reply_to_status_id <dbl>, in_reply_to_status_id_str <chr>,
## #   in_reply_to_user_id <dbl>, in_reply_to_user_id_str <chr>,
## #   in_reply_to_screen_name <chr>, geo <lgl>, coordinates <list>, place <list>,
## #   contributors <lgl>, is_quote_status <lgl>, retweet_count <int>,
## #   favorite_count <int>, favorited <lgl>, retweeted <lgl>, lang <chr>,
## #   possibly_sensitive <lgl>, quoted_status_id <dbl>, …
```

---

## And followers

```r
# Returns the IDs of the account's followers (5,000 by default)
follow_renan <- get_followers("renancalheiros")
follow_renan %>% as_tibble()
```

```
## # A tibble: 5,000 × 1
##    user_id
##    <chr>
##  1 1375834301611249664
##  2 2835848034
##  3 1364542712050053122
##  4 818807714310520832
##  5 1370688111114588162
##  6 1376229663731720192
##  7 1250763710660034561
##  8 1409213818106900483
##  9 944992259761516544
## 10 1426647301137764357
## # … with 4,990 more rows
```

---

# Streaming API

Twitter has a second API that lets you sample the tweets being produced in real time. This API gives you access to more data, so it is the best way to collect tweets: you can leave the stream running in R for a few hours, or even days.

```r
# Sample the live stream for 100 seconds and save the raw tweets to file.json
election <- stream_tweets("", timeout = 100, file_name = "file.json")
```

```
## Error in open.connection(stream, "rb"): HTTP error 401.
```

```r
election
```

```
## Error in eval(expr, envir, enclos): object 'election' not found
```

---

That is it. Next week you will learn many different ways to analyze all this data!

---

## Academic API

Twitter has recently launched a new API, called the Academic API, that gives you access to historical data on Twitter. It is amazing. To use it, you need a different authorization and a different R package. Here are some places where you can find more information:

- [a 101 course](https://github.com/twitterdev/getting-started-with-the-twitter-api-v2-for-academic-research)
- [academictwitteR](https://github.com/cjbarrie/academictwitteR)
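
As a rough sketch of what a historical query might look like with academictwitteR, the code below assumes that the package is installed, that you still have Academic API access, and that your bearer token is stored in the `TWITTER_BEARER` environment variable. `to_iso8601()` is a hypothetical helper written for this example, and the query and dates are made-up values. The request only runs when both the package and the token are present.

```r
# Hypothetical helper: turn a date into the ISO 8601 UTC timestamp
# (midnight) that the start and end arguments expect.
to_iso8601 <- function(date) {
  paste0(format(as.Date(date), "%Y-%m-%d"), "T00:00:00Z")
}

start <- to_iso8601("2021-08-01")  # example dates, chosen for illustration
end   <- to_iso8601("2021-09-01")

# Assumption: the bearer token lives in the TWITTER_BEARER environment variable
has_token   <- nzchar(Sys.getenv("TWITTER_BEARER"))
has_package <- requireNamespace("academictwitteR", quietly = TRUE)

if (has_token && has_package) {
  historical <- academictwitteR::get_all_tweets(
    query        = "bolsonaro",
    start_tweets = start,
    end_tweets   = end,
    bearer_token = Sys.getenv("TWITTER_BEARER"),
    n            = 100
  )
}
```

Keeping the token in an environment variable rather than in the script means the slides can be shared without exposing credentials.
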