class: center, middle, inverse, title-slide .title[ # Survey Professionalism: New Evidence from Web Browsing Data
] .author[ ###
Bernhard von Hohenberg (Gesis),
Tiago Ventura (Georgetown University)
, Jonathan Nagler (NYU), Ericka Menchen-Trevino (Independent Researcher), Magdalena Wojcieszak (UC-Davis)
] .date[ ###
SICSS Rutgers, 06/17/2024
] --- name: about-me layout: false class: about-me-slide, middle, center ## About me <img style="border-radius: 40%;" src="https://www.venturatiago.com/authors/admin/avatar_hu572be96beeaaf625099ef0e95dbea849_663109_250x250_fill_q90_lanczos_center.jpg" width="150px"/> ### Tiago Ventura #### Assistant Professor in Computational Social Science [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496,109.5a201.8,201.8,0,0,1-56.55,15.3,97.51,97.51,0,0,0,43.33-53.6,197.74,197.74,0,0,1-62.56,23.5A99.14,99.14,0,0,0,348.31,64c-54.42,0-98.46,43.4-98.46,96.9a93.21,93.21,0,0,0,2.54,22.1,280.7,280.7,0,0,1-203-101.3A95.69,95.69,0,0,0,36,130.4C36,164,53.53,193.7,80,211.1A97.5,97.5,0,0,1,35.22,199v1.2c0,47,34,86.1,79,95a100.76,100.76,0,0,1-25.94,3.4,94.38,94.38,0,0,1-18.51-1.8c12.51,38.5,48.92,66.5,92.05,67.3A199.59,199.59,0,0,1,39.5,405.6,203,203,0,0,1,16,404.2,278.68,278.68,0,0,0,166.74,448c181.36,0,280.44-147.7,280.44-275.8,0-4.2-.11-8.4-.31-12.5A198.48,198.48,0,0,0,496,109.5Z"></path></svg> @TiagoVentura_](https://twitter.com/_Tiagoventura) [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path 
d="M256,32C132.3,32,32,134.9,32,261.7c0,101.5,64.2,187.5,153.2,217.9a17.56,17.56,0,0,0,3.8.4c8.3,0,11.5-6.1,11.5-11.4,0-5.5-.2-19.9-.3-39.1a102.4,102.4,0,0,1-22.6,2.7c-43.1,0-52.9-33.5-52.9-33.5-10.2-26.5-24.9-33.6-24.9-33.6-19.5-13.7-.1-14.1,1.4-14.1h.1c22.5,2,34.3,23.8,34.3,23.8,11.2,19.6,26.2,25.1,39.6,25.1a63,63,0,0,0,25.6-6c2-14.8,7.8-24.9,14.2-30.7-49.7-5.8-102-25.5-102-113.5,0-25.1,8.7-45.6,23-61.6-2.3-5.8-10-29.2,2.2-60.8a18.64,18.64,0,0,1,5-.5c8.1,0,26.4,3.1,56.6,24.1a208.21,208.21,0,0,1,112.2,0c30.2-21,48.5-24.1,56.6-24.1a18.64,18.64,0,0,1,5,.5c12.2,31.6,4.5,55,2.2,60.8,14.3,16.1,23,36.6,23,61.6,0,88.2-52.4,107.6-102.3,113.3,8,7.1,15.2,21.1,15.2,42.5,0,30.7-.3,55.5-.3,63,0,5.4,3.1,11.5,11.4,11.5a19.35,19.35,0,0,0,4-.4C415.9,449.2,480,363.1,480,261.7,480,134.9,379.7,32,256,32Z"></path></svg> TiagoVentura](https://github.com/TiagoVentura) [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M424,80H88a56.06,56.06,0,0,0-56,56V376a56.06,56.06,0,0,0,56,56H424a56.06,56.06,0,0,0,56-56V136A56.06,56.06,0,0,0,424,80Zm-14.18,92.63-144,112a16,16,0,0,1-19.64,0l-144-112a16,16,0,1,1,19.64-25.26L256,251.73,390.18,147.37a16,16,0,0,1,19.64,25.26Z"></path></svg> tv186@georgetown.edu](tv186@georgetown.edu) [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M208,352H144a96,96,0,0,1,0-192h64" style="fill:none;stroke:#000;stroke-linecap:round;stroke-linejoin:round;stroke-width:36px"></path> <path d="M304,160h64a96,96,0,0,1,0,192H304" style="fill:none;stroke:#000;stroke-linecap:round;stroke-linejoin:round;stroke-width:36px"></path> <line x1="163.29" y1="256" x2="350.71" y2="256" style="fill:none;stroke:#000;stroke-linecap:round;stroke-linejoin:round;stroke-width:36px"></line></svg>https://www.venturatiago.com/](https://www.venturatiago.com/) .fade[SICSS 2019 Alumni and organizer of two editions in 
Brazil]

---
class: middle
layout: true

<div class="my-footer"><span>Tiago Ventura                                               SICSS Rutgers 2024 </span></div>

---

## Digital Information Age

<img src="output/digital.jpeg" width="70%" style="display: block; margin: auto;" />

---

## Online Surveys

<img src="output/survey.png" width="80%" style="display: block; margin: auto;" />

---

## Social Science Research x Online Surveys

The capacity to recruit participants online to answer surveys or complete tasks has profoundly affected the social sciences:

--

- Reducing the costs of running surveys using high-quality survey panels

--

- Recruiting participants from convenience samples ~ Facebook Ads, MTurk, Google Ads, Lucid, etc.

--

- Relying less on student samples for experiments and being more creative with online experiments

--

- Using surveys to collect digital trace data ~ augmenting survey data with social media/online behavioral data

--

---

## Challenges: Professional Survey Takers

<img src="output/survey-ad.png" width="80%" style="display: block; margin: auto;" />

---

## Our contribution

#### Use high-quality digital trace data to identify the prevalence of survey professionalism and its consequences for research:

--

- Previous research has relied entirely on self-reports, asking participants how often they do surveys or how many panels they belong to (e.g., Zhang et al. 2020; Matthijsse, De Leeuw, and Hox 2015)
    - Affected by social desirability bias: participants have incentives to hide their survey-taking

--

- We use three samples, recruited through distinct methods (online panel, social media, and online marketplace), combining:
    - Multiple survey waves in each sample
    - Up to 90 days of browsing data for participants

--

---
class:middle

## Research Questions

- **RQ1** What is the degree of survey professionalism among online panel members?
- **RQ2** Do survey professionals differ from non-professionals sociodemographically and politically?
- **RQ3** Do survey professionals exhibit higher between-wave response instability than non-professionals?
- **RQ4** To what extent do participants take the same questionnaire more than once, and do survey professionals engage in more repeated participation than non-professionals?

---
class:middle, center

# Data, Measurement and Design

---

## Data

We collect web-browsing data (digital trace data), .red[roughly 90 days of data], from participants across three U.S. samples:

- **Facebook**: participants recruited through Meta Ads; they install the web-historian app and decide whether to donate their digital trace data; 707 participants, 16.4 million visits, .red[90 days of data]
- **Lucid**: online marketplace for surveys; participants install the web-historian app and decide whether to donate their digital trace data; 2,222 participants, 73.8 million visits, .red[90 days of data]
- **YouGov**: high-quality survey provider; uses its own data donation system; users decide whether to register for data donation; 957 participants, 6.4 million visits, only .red[up to 60 days]

---

## Defining a survey visit

#### Three steps to define what counts as a survey URL, based on its domain name:

- **Step 1:** Pre-curated list of survey platforms (Bevec et al., 2021). We manually verify all the links and end up with 229 platforms.
- **Step 2:** Classify all hosts that contain the word "survey" as survey hosts; this identifies another 2,714 URL hosts.
- **Step 3:** Manually code the 500 most frequently visited hosts from each of our three datasets; this identifies 291 additional URL hosts.

---

## Survey Professionals

We provide four categories of survey professionalism. All results in this presentation use our first category. Results are largely robust across the different categorizations.

- **Definition 1:** a respondent who has .red[on average more than 100 survey visits] per browsing-active day
- **Definition 2:** a respondent who spends .red[more than 50 percent of all browsing time] on survey sites
- **Definition 3:** a respondent who makes .red[more than 50 percent of all visits] to survey sites
- **Definition 4:** a respondent who meets any of the three definitions above

---
class:middle, center

# Results

---

## RQ1: Time Spent on Survey Platforms

.center[
<img src="output/desc_prop_survey_visits_blues.png" width="80%" />
]

---

## RQ1: Distribution of Survey Professionals

.center[
<img src="output/desc_densities_survey_visits.png" width="80%" />
]

---

## RQ1: Prevalence of Survey Professionals

.center[
<img src="output/desc_survey_profesionals_all4.png" width="90%" />
]

---

## RQ2: Demographic and Political Differences

.center[
<img src="output/tab1.png" width="100%" />
]

---

## RQ3: Quality of Responses

.center[
<img src="output/tab2.png" width="100%" />
]

---

## RQ3: Stability Over Time

.center[
<img src="output/between_waves_density_controls.png" width="100%" />
]

---

## RQ3: Stability Over Time II

.center[
<img src="output/between_waves_effects_no_controls.png" width="100%" />
]

---

## RQ4: Repeated Survey Taking

.center[
<img src="output/tab3.png" width="100%" />
]

---

## Discussion

--

<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M504 256C504 119 393 8 256 8S8 119 8 256s111 248 248 248 248-111 248-248zm-448 0c0-110.5 89.5-200 200-200s200 89.5 200 200-89.5 200-200 200S56 366.5 56 256zm72 20v-40c0-6.6 5.4-12 12-12h116v-67c0-10.7 12.9-16 20.5-8.5l99 99c4.7 4.7 4.7 12.3 0 17l-99 99c-7.6 7.6-20.5 2.2-20.5-8.5v-67H140c-6.6 0-12-5.4-12-12z"></path></svg> **Professional survey taking represents a .red[substantial portion] of the online activity of the analyzed samples**

- 34.3% of Lucid, 7.9% of YouGov, 1.7% of Facebook

--

<svg viewBox="0 0 512 512"
style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M504 256C504 119 393 8 256 8S8 119 8 256s111 248 248 248 248-111 248-248zm-448 0c0-110.5 89.5-200 200-200s200 89.5 200 200-89.5 200-200 200S56 366.5 56 256zm72 20v-40c0-6.6 5.4-12 12-12h116v-67c0-10.7 12.9-16 20.5-8.5l99 99c4.7 4.7 4.7 12.3 0 17l-99 99c-7.6 7.6-20.5 2.2-20.5-8.5v-67H140c-6.6 0-12-5.4-12-12z"></path></svg> **Although prevalent, survey professionals .red[do not introduce substantive inferential problems]**

- The lack of robust cross-sample differences suggests that survey professionalism does not introduce systematic demographic or political bias
- Professionals speed through surveys and are more likely to straightline
    - Observable behaviors: easy to detect and control for
- No evidence of random responses over time

--

<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M504 256C504 119 393 8 256 8S8 119 8 256s111 248 248 248 248-111 248-248zm-448 0c0-110.5 89.5-200 200-200s200 89.5 200 200-89.5 200-200 200S56 366.5 56 256zm72 20v-40c0-6.6 5.4-12 12-12h116v-67c0-10.7 12.9-16 20.5-8.5l99 99c4.7 4.7 4.7 12.3 0 17l-99 99c-7.6 7.6-20.5 2.2-20.5-8.5v-67H140c-6.6 0-12-5.4-12-12z"></path></svg> **One problematic consequence: many participants take one and the .red[same questionnaire repeatedly]**

--

---
class:middle, center

## Thank you!
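
---

## Appendix: Sketch of the survey-professional flags

The three host-classification steps and the four definitions of survey professionalism can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' code: the two host sets are tiny placeholder stand-ins for the curated and manually coded lists, and the subdomain matching rule, the `(url, day, seconds)` visit format, and both function names are hypothetical.

```python
from urllib.parse import urlparse

# Placeholder stand-ins (assumptions) for the curated platform list (Step 1)
# and the manually coded hosts (Step 3); the real lists are not reproduced here.
CURATED_PLATFORMS = {"qualtrics.com", "surveymonkey.com"}
MANUALLY_CODED_HOSTS = {"panel.example.com"}


def is_survey_host(host: str) -> bool:
    """Return True if a URL host counts as a survey host."""
    host = host.lower()
    # Steps 1 and 3: the host is on a list, or is a subdomain of a listed host
    # (the subdomain rule is an assumption of this sketch).
    for known in CURATED_PLATFORMS | MANUALLY_CODED_HOSTS:
        if host == known or host.endswith("." + known):
            return True
    # Step 2: the host contains the word "survey".
    return "survey" in host


def professionalism_flags(visits):
    """Compute the four flags for one respondent.

    `visits` is a list of (url, day, seconds) tuples, where `day` is any
    hashable date label and `seconds` is time spent on that visit
    (an assumed representation of the browsing data).
    """
    survey = [v for v in visits if is_survey_host(urlparse(v[0]).hostname or "")]
    active_days = {day for _, day, _ in visits}  # days with any browsing
    total_time = sum(seconds for _, _, seconds in visits)
    survey_time = sum(seconds for _, _, seconds in survey)

    # Definition 1: more than 100 survey visits per browsing-active day.
    def1 = len(survey) / len(active_days) > 100 if active_days else False
    # Definition 2: more than 50 percent of browsing time on survey sites.
    def2 = survey_time / total_time > 0.5 if total_time else False
    # Definition 3: more than 50 percent of all visits go to survey sites.
    def3 = len(survey) / len(visits) > 0.5 if visits else False
    # Definition 4: any of the three definitions above.
    return {"def1": def1, "def2": def2, "def3": def3, "def4": def1 or def2 or def3}


example_visits = [
    ("https://www.surveymonkey.com/r/abc", "2024-01-01", 60.0),
    ("https://mysurveys.example.org/x", "2024-01-01", 30.0),
    ("https://news.example.org/", "2024-01-02", 20.0),
]
print(professionalism_flags(example_visits))
```

Returning all four flags per respondent makes it straightforward to compare how prevalence changes across definitions, in the spirit of the robustness claim on the Survey Professionals slide.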