Natural Language and Text Processing Lab


Discovering Dutchness: Identity Formation in the context of Rising Nativism using Social Data Analysis

The project, situated at the department of Cultural Anthropology and the department of Methodology and Statistics, uses a combination of ethnographic and text mining methods.

Hyperparameter Optimization to Accelerate Active Learning Models

The goal of this project is to increase the performance of active learning for screening large amounts of textual data by optimizing the hyperparameters of the learning algorithms in the open-source ASReview software. Users from the social sciences should be able to select a set of hyperparameters optimized for textual data from their own domain, instead of the currently implemented values, which were obtained from medical datasets.

Project website:

Hiring & Discrimination in the times of AI

Algorithmic screening is increasingly used in talent acquisition across large and small corporations worldwide, with the ambitious promise of eliminating idiosyncratic human biases by applying identical and consistent evaluation criteria. Nonetheless, very little is known about how these algorithms are trained across a myriad of behavioral dimensions, from the text of CVs and interviews to non-verbal features in interviews. This lack of evidence and transparency raises serious concerns among job seekers and policymakers, particularly when it comes to leveling the playing field for ethnic minorities and female job applicants. Importantly, there is virtually no evidence on how job seekers perceive and respond to these automated procedures, or on how professional recruiters, working in combination with algorithmic tools, assess the persuasion strategies of candidates from various backgrounds. This project seeks to address these pressing issues by conducting a series of field and online survey experiments using a combination of causal inference and NLP methods. Answering these questions would enable us to develop effective policies on AI in recruitment processes by opening the black boxes of the application and recruitment strategies that job seekers and HR professionals adopt in the face of AI.

Human-AI Collaboration Based on Explainable NLP (Application in Social Science)

Explanatory learning models in Natural Language Processing (NLP) are models that provide explanations for their predictions or decisions. They help humans understand why a particular decision or prediction was made. This is important because it fosters trust in the system. When humans understand the reasoning behind a decision, they are more likely to trust the system and use it effectively. Additionally, explanatory models can aid in identifying biases in the system and improving its overall performance.

In this project, we aim to develop a method that increases interpretability by drawing on different explanatory learning models in NLP, and to improve how humans interact with, understand, and use these models in collaboration.

Mining the Dutch Disposition towards Animals and Plants

In this project, we address the topic of the human disposition towards animals and plants. Our aim is to establish a better understanding of the history of both the knowledge about animals/plants and the cultural representation of animals and plants in the Netherlands. We study large sets of digitized texts and images produced and circulated in the Netherlands between 1550 and 2000.

Modeling Biases in News and Political Debates

Biases and stereotypes in society can be reflected in different sectors of our everyday life, including the work environment, education, and politics. News and political speeches are only two examples of textual content in which stereotypes are present. In this project, we focus on both gender and racial bias. Using approaches from NLP, we first explore how adjectives are used by female and male politicians (in particular when they refer to the male and female gender). Next, we model the evolution of biased language over the decades using a collection of debates from the UK House of Commons. Finally, we use sentiment analysis to analyze how different races are described in the news. Our results indicate that male and female politicians use adjectives differently and that bias persists across several decades. We also found that articles that discuss ethnic outgroups have a negative tone and emotion.

Social Bias Detection in Text Using NLP and their Associations with People’s Opinions

Although recent works have applied sophisticated NLP methods to quantify social bias in large corpora, there is still limited research on the associations between users’ beliefs and the social bias expressed in online text. In this project, we are interested in filling this gap by exploring whether there are any associations between the implicit social bias that different texts express and people’s beliefs on societal topics. In this way, we will attempt to understand whether, and to what extent, people consume information that confirms their beliefs. (Funded by UU Focus Area Applied Data Science)

Fairness of NLP-supported clinical decision-making process in healthcare

Text mining and natural language processing (NLP) systems are proving very useful for clinical care and research. Several decisions on patient inclusion/exclusion and on the coding of key study variables in clinical studies are taken out of the hands of clinicians and put into the care of NLP systems. However, clinical care is not always equitable; for example, women present very different stroke symptoms from men, leading to underdiagnosis and undertreatment of female stroke victims. These inequities are inevitably encoded in the electronic health record (EHR) corpora mined by clinical data scientists.

The aim of this project is to study the extent to which existing healthcare inequities are propagated by the application of text mining/NLP to EHRs and to develop methods to assess and reduce the impact of the encoded inequities.