Utrecht University

UiL-OTS Colloquium


19 November 2020
15:30 - 17:00
Online in Microsoft Teams

Special session: Data collection outside of the lab

The purpose of this session is to discuss specific issues that come up when collecting data from human participants outside the lab, for example through fieldwork, questionnaires or surveys, or the use of existing data from social media. In these settings, obtaining “informed consent” for collecting data or using existing data from participants may not work the same way as in standard lab settings, with a range of ethical, privacy and practical implications. How can we create the best circumstances for data collection? We will explore two dimensions: (a) eliciting data versus using existing data, and (b) formal versus informal procedures – ‘building trust’ between researchers and participants.

Four UiL OTS researchers (Alberto Frasson, Bregje Holleman, Paola Monachesi, and Olupemi Oludare) will give brief presentations (20 minutes) on their research. The session will be chaired by René Kager.


Alberto Frasson will present two types of data collection: in the field and remote. He will compare two data collections targeting the same isolated community of Friulian heritage speakers in Brazil: the first took place during fieldwork in spring 2019, while the second took place online in spring 2020. The problems posed by the two methods are similar, but they need to be tackled differently. The talk will focus on the following points: finding local contacts, organizing fieldwork in remote locations, interviewing elderly speakers, administering the questionnaire, and obtaining written informed consent. In conclusion, it will be shown that collecting data from isolated speakers online (‘online fieldwork’) is possible; however, some extra measures and arrangements need to be made. Research can benefit from the experience gained with online fieldwork: it can help make data collection more accessible and flexible.

Bregje Holleman will talk about two ways of collecting data outside of the lab. First, she will briefly go into the advantages and disadvantages of data collection through crowdsourcing platforms such as MTurk and Prolific. For a range of linguistic research questions, these platforms offer better and wider sampling than snowball samples through one’s own network, but there are some drawbacks and worries. Second, she will discuss data collection in cooperation with third parties, such as companies. She will sketch her previous experience in the area of politics (on Voting Advice Applications), working together with VAA builder Kieskompas and a large city at election time. Together with Jelle Strikwerda, she will also outline various points of attention based on his recent experience of gathering interview data while working with employees and clients of Dutch pension insurance companies.

Paola Monachesi will talk about data collection from social media (i.e. Twitter). These data have the advantage of being abundant and spontaneous. However, they raise questions about privacy and ethics, as well as about research methodology, which needs to be redefined with respect to both their collection and their analysis. She will present a novel computational approach to collecting data from a specific group of users – skilled migrants – in order to analyse their discourse in relation to sustainability. Identifying these users is not a trivial task, since they don’t characterize themselves as creative migrants on Twitter. She will also discuss the challenges behind the identification of another group of users – Dutch elderly people – and the role that language can play in age detection.

Olupemi Oludare will talk about practical issues of data collection outside the lab, through fieldwork. Drawing on his experience as a Yoruba musicologist and ethnomusicologist carrying out research in Nigeria and other West African countries, he will discuss how to create the best circumstances for collecting data from human participants. With a focus on the three major geo-linguistic landscapes of Nigeria (Yoruba, Hausa, and Igbo), he will explore two dimensions: (a) ‘building trust’ between researchers and participants, and (b) eliciting new data and comparing it with existing data. By explaining the problems and tasks involved, which include identifying authentic fields, finding local informants, and gaining acceptance by the participants, among others, he will attempt to address these issues and offer possible solutions to the challenges faced in such scenarios of data collection outside the lab.

A general discussion will take place at the end of the session.