Isabelle Augenstein, Dr. Scient., is Professor at the University of Copenhagen, Department of Computer Science, Head of the Copenhagen Natural Language Understanding research group as well as the Natural Language Processing section, and a co-lead of the Pioneer Centre for Artificial Intelligence.
Keynote Title: Quantifying gender biases towards entities
Abstract: Language is known to be influenced by the gender of the speaker and the referent, a phenomenon that has received much attention in sociolinguistics. This can lead to harmful societal biases, such as gender bias, the tendency to make assumptions based on gender rather than objective factors. Moreover, these biases are then picked up on by language models and perpetuated to models for downstream NLP tasks.
Most research on quantifying these biases emerging in text and in language models has used artificial probing templates imposing fixed sentence constructions and been conducted for English. In our work, we by contrast focus on detecting gender biases towards specific entities, namely politicians, and adopt a cross-lingual approach. This allows for studying more complex interdependencies, such as the relationship between the politician’s origin and language of the analysed text. Moreover, we complement the study of language-based cues for detecting gender bias in text with that of extra-linguistic cues, such as visibility. In our research spanning different types of social media as well as cross-lingual language models, we find strong evidence of well-known societal biases, including female politicians often being described with respect to their appearance and social characteristics. Moreover, we find that the way gender bias is expressed is dependent on the text source, and that complementing text-based measures for gender bias with measures focusing on visibility provides a more nuanced picture.
Lucas Beyer, Google Brain, Switzerland
Keynote Title: The convergence of vision and language
Outline: I’ll discuss how Computer Vision (CV) and Natural Language Processing (NLP) have traditionally been separate communities with very different models and approaches. This separation has started to disappear in recent years, in part thanks to the Transformer architecture, and several breakthroughs in scaling up multimodal models. I’ll outline the current trends and various possible paths forward, and hope my talk can contribute to getting both communities closer to each other.
Dr. Eduard Hovy is Executive Director of Melbourne Connect Research and Enterprise, Professor in the School of Computing and Information Sciences of Melbourne University and Research Professor at the Language Technologies Institute of Carnegie Mellon University.
Keynote Title: The complementarity of neural and symbolic approaches to NLP
Dr. Sandra Kübler is Professor at the Department of Linguistics, College of Arts and Sciences, Indiana University Bloomington
Keynote Title: Hate Speech Detection: What have we learned?
Abstract: Hate speech detection is a popular topic with important real-world applications. In this talk, I want to have a closer look at successes and issues in hate speech detection. Using transformers for hate speech detection tends to give good results, but such results are often deceptive. Issues that have been raised include biases in the data, annotation quality, sampling strategies, and domain effects, as well as the definition of what we consider hate speech. All of these issues have a profound effect on results, but do not have simple solutions.
Tharindu Ranasinghe, Aston University, UK
Keynote Title: Responsible Machine Learning: Challenges and Way Forward?
Abstract: Responsible and Trustworthy Machine Learning has emerged as a critical research area in the context of an increasingly data-driven and automated world. Many questions have been asked about the responsibility and trustworthiness of large language models such as GPT. With the widespread adoption of these models in diverse applications, concerns have arisen about the potential biases, fairness, transparency, and ethical implications associated with them. Achieving responsible and trustworthy machine learning requires a multifaceted approach that addresses data quality, transparency, fairness, ethics, robustness and security. How far have we come in achieving this as a community, and what are the challenges ahead?
Efstathios Stamatatos, University of the Aegean, Greece
Keynote Title: Exploring Advances and Challenges in Authorship Verification
Abstract: The identification of the author(s) of a given text has wide-ranging applications in multiple areas including literature studies, historical research, forensic investigations, cybersecurity, and social media analysis.
The research field of authorship attribution has witnessed a continuous and substantial increase in the number of published works. However, differences in experimental setup and utilized datasets pose challenges in drawing reliable conclusions and making fair comparisons between different methods.
Authorship verification can be seen as a fundamental task since it addresses the most basic question: given a certain author and a text of unknown authorship, has that person authored the text in question? Every authorship attribution case, whether involving a closed-class or an open-class set of candidate authors, can be broken down into a series of authorship verification cases. During the last decade, a series of shared tasks in authorship verification have been organized in the framework of PAN (a lab on digital text forensics and stylometry) and several benchmark datasets have been created using diverse text genres, languages and levels of difficulty. This talk will present the lessons learnt from these shared tasks in authorship verification and examine the robustness of approaches in handling very challenging scenarios where the texts of known and unknown authorship belong to different discourse types, including both written and spoken language. In addition, the discussion will address the role of the recently developed generative language models in enhancing authorship verification methods and their evaluation. Lastly, the relationship of authorship verification with machine-generated text detection will be discussed.