On stereotypes...
​
Stereotypes are one of the components that reinforce toxic and hateful speech. Understanding how they emerge and spread is crucial to tackling this issue, since stereotypes are not always expressed explicitly. Their presence on social media, and the need to identify and mitigate them, has led to the development of systems for their automatic detection, especially in news comments. It is therefore a new task that is attracting growing interest from the NLP community.
​
Several works on stereotype detection and classification have focused on specific social groups, e.g., women and immigrants, since they are frequent targets of such messages. For instance, the Automatic Misogyny Identification task (Fersini et al., 2018) includes a classification subtask in which one of the categories of misogyny is Stereotype and Objectification, understood as a fixed and oversimplified image or idea of a woman. The EXIST task at IberLEF 2021 (Rodríguez-Sánchez et al., 2021) tackled sexism in social networks, and, more specifically, the detection of gender stereotypes has also been addressed (Cryan et al., 2020; Chiril et al., 2021). Among the approaches to identifying stereotypes within narratives, there are studies on microportraits of Muslim stereotyping, in which a description of the target group is provided in a single text (Fokkens et al., 2018). Sap et al. (2020) address the issue of social bias frames driven by stereotypes. The HaSpeeDe 2 task at EVALITA 2020 includes a subtask on the identification of stereotypes about immigrants, Muslims and Roma (Sanguinetti et al., 2020). Narrowing down to the topic of immigration, Sánchez-Junquera et al. (2021) put forward a classification of such stereotypes as manifested in political debates.
​
The stereotype classification applied in this task is based on the latter work, but uses a corpus of comments posted by web users on Spanish news articles related to immigration. In these comments, a racial stereotype based on origin, ethnicity, race or religion is generally associated with a target group.
In social psychology, a stereotype is defined as a set of beliefs about others who are perceived as belonging to a different social category. The stereotype oversimplifies the group and generalizes a characteristic, applying it to all its members (Allport, 1954). Stereotypes are a cognitive component and, together with prejudice, their emotional counterpart, they shape behavior towards others. One way stereotypes manifest is through language, in degrees ranging from explicit to implicit, which makes them a complex concept to operationalize for natural language processing. In order to narrow down this concept, we considered some criteria for deciding whether a message contains a stereotype.
​
Since not every linguistic expression about immigration carries a racial, national, or ethnic stereotype, the first criterion to observe is whether the comment homogenizes the target group. Homogenization generalizes a feature to the status of a social category, negating individual diversity. The second criterion concerns how the stereotype is expressed: the communication act can be explicit, that is, transparent and manifest, or implicit, that is, requiring a process of inference for the stereotype to be perceived.
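To make these criteria concrete, below is a minimal sketch of how they could be encoded as an annotation schema and fed to a simple supervised baseline. It is an illustration under assumptions, not the task's official pipeline: the CommentAnnotation class, its field names, the toy example comments, and the TF-IDF plus logistic regression detector are all introduced here for exposition.

```python
# Minimal sketch: encoding the two annotation criteria and training a
# simple stereotype-detection baseline. All names and examples are
# hypothetical placeholders, not the official task setup.
from dataclasses import dataclass
from typing import Optional

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


@dataclass
class CommentAnnotation:
    """Annotation of a single news comment under the two criteria."""
    text: str
    # Criterion 1: does the comment generalize a feature to the whole
    # target group, negating individual diversity?
    homogenizes_group: bool
    # Criterion 2: how the stereotype is expressed, when one is present.
    # "explicit" = transparent and manifest; "implicit" = requires inference.
    expression: Optional[str] = None  # "explicit" | "implicit" | None

    @property
    def contains_stereotype(self) -> bool:
        # Homogenization of the target group is the necessary condition.
        return self.homogenizes_group


# Toy examples, invented purely for illustration.
train = [
    CommentAnnotation("They all come here to live off our benefits.",
                      homogenizes_group=True, expression="explicit"),
    CommentAnnotation("Funny how the neighbourhood changed after they arrived.",
                      homogenizes_group=True, expression="implicit"),
    CommentAnnotation("The new reception centre opened last week.",
                      homogenizes_group=False),
]

# Binary detection baseline: TF-IDF features + logistic regression.
texts = [a.text for a in train]
labels = [a.contains_stereotype for a in train]
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["They never integrate into our society."]))
```

In practice, the two criteria guide manual annotation; a trained model then sees only the comment text and the resulting binary label.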
References
Allport, G. W. (1954). The nature of prejudice. Cambridge, MA: Addison-Wesley.
Chiril, P., Benamara, F., & Moriceau, V. (2021). ‘“Be nice to your wife! The restaurants are closed”: Can Gender Stereotype Detection Improve Sexism Classification?’. In Findings of the Association for Computational Linguistics: EMNLP 2021.
Cryan, J., Tang, S., Zhang, X., Metzger, M., Zheng, H., & Zhao, B. Y. (2020). ‘Detecting Gender Stereotypes: Lexicon vs. Supervised Learning Methods’. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–11). Association for Computing Machinery. https://doi.org/10.1145/3313831.3376488
Fersini, E., Nozza, D., & Rosso, P. (2018). ‘Overview of the Task on Automatic Misogyny Identification (AMI)’. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018). http://personales.upv.es/prosso/resources/FersiniEtAl_IberEval18.pdf
Fokkens, A., Ruigrok, N., Beukeboom, C., Gagestein, S., & Van Atteveldt, W. (2018). ‘Studying Muslim Stereotyping through Microportrait Extraction’. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3734–3741). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2018/pdf/989.pdf
Rodríguez-Sánchez, F., Carrillo-de-Albornoz, J., Plaza, L., Gonzalo, J., Rosso, P., Comet, M., & Donoso, T. (2021). ‘Overview of EXIST 2021: sEXism Identification in Social neTworks’. Procesamiento del Lenguaje Natural, 67.
Sánchez-Junquera, J., Chulvi, B., Rosso, P., & Ponzetto, S. P. (2021). ‘How Do You Speak about Immigrants? Taxonomy and StereoImmigrants Dataset for Identifying Stereotypes about Immigrants’. Applied Sciences, 11(8), 3610. https://doi.org/10.3390/app11083610
Sanguinetti, M., Comandini, G., Di Nuovo, E., Frenda, S., Stranisci, M., Bosco, C., Caselli, T., Patti, V., & Russo, I. (2020). ‘HaSpeeDe 2 @ EVALITA2020: Overview of the EVALITA 2020 Hate Speech Detection Task’. In Proceedings of EVALITA 2020. http://ceur-ws.org/Vol-2765/paper162.pdf
Sap, M., Gabriel, S., Qin, L., Jurafsky, D., Smith, N. A., & Choi, Y. (2020). ‘Social Bias Frames: Reasoning about Social and Power Implications of Language’. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.