Computer systems that use machine learning algorithms detect patterns within vast datasets and can predict complex outcomes that involve a high degree of uncertainty. They are used in fields as varied as medicine, finance and justice, and can exceed human capacities, especially when dealing with large datasets or a large number of factors.
“One example where machine learning algorithms and expert systems can better inform human decisions is in the prediction of criminal recidivism, defined as the act of a person who commits an offence after being convicted of a previous one”, says Carlos Castillo. However, decisions made on the basis of machine learning models can become unfair when the models introduce bias and discriminate against certain minority groups or populations.
This is the field of study of a paper by Carlos Castillo, director of the Web Science and Social Computation research group (WSSC), and Emilia Gómez, a researcher at UPF and at the European Commission’s Joint Research Centre (JRC), together with Songul Tolan and Marius Miron, members of the JRC. The paper is to be presented at the 17th International Conference on Artificial Intelligence and Law (ICAIL ’19), held from 17 to 21 June in Montreal, Canada, where it has won the Best Paper Award, a prize for the best article at the conference, presented each year in honor of the American researcher Carole Hafner.
The authors propose a methodology to assess possible underlying discrimination in machine learning methods used to predict youth recidivism, and point out possible sources of unfairness and discrimination. In general, a process or decision can be considered discriminatory if its outcome depends on a person’s membership of a protected group: sex, race, colour, language, religion, political or other opinion, national or social origin, association with a national minority, property, birth or other status.
Machine learning techniques can discriminate against some people if they are not evaluated exhaustively, so algorithms should be used critically and in collaboration with experts in the area of justice
Risk assessment tools that inform judges about the risk of a person committing another offence are well established worldwide. One of the most widely used instruments is the Structured Assessment of Violence Risk in Youth (SAVRY), which assesses the risk of violence in young people while leaving a high degree of participation to risk assessment professionals.
The authors evaluated a database made available by the Centre for Legal Studies and Specialized Training (CEJFE) of the Generalitat (Government) of Catalonia comprising observations of 4,753 teenagers who committed offences in Catalonia between 2002 and 2010, whose recidivism was recorded in 2013 and 2015. “In this work we study the limitations of machine learning algorithms to predict juvenile recidivism. We assess the impartiality of machine learning models against SAVRY, on a dataset from Catalonia”, say the researchers, who compared machine learning models with this risk assessment tool in terms of predictive performance and fairness.
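A comparison of predictive performance like the one described is typically done with a ranking metric such as ROC AUC. The sketch below is purely illustrative, not the paper’s method or data: the labels and the two sets of risk scores are fabricated toy values, and AUC is computed with the Mann–Whitney rank formulation.

```python
def auc(y_true, scores):
    """ROC AUC via the Mann-Whitney formulation: the probability that a
    randomly chosen positive case is scored above a randomly chosen
    negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative toy labels (1 = reoffended) and two hypothetical sets of
# risk scores -- one expert-style, one model-style. Not study data.
y = [1, 0, 1, 0, 1, 0]
expert_scores = [0.7, 0.4, 0.5, 0.6, 0.8, 0.3]
model_scores = [0.9, 0.2, 0.7, 0.3, 0.8, 0.4]

print("expert AUC:", auc(y, expert_scores))
print("model AUC:", auc(y, model_scores))
```

With these toy values the model scores rank every recidivist above every non-recidivist (AUC 1.0) while the expert-style scores misrank one pair (AUC 8/9), mimicking the “slightly outperforms” pattern reported in the study.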
The results indicate that, in terms of accuracy in predicting recidivism, the machine learning models slightly outperform SAVRY, and the results improve further as more data or features become available for training. However, on three fairness metrics used in other studies, “we find that, in general, SAVRY is fair, while machine learning models could discriminate against males, foreigners, or persons belonging to specific national groups”, the authors note. “For example, in the dataset studied, compared to young Spaniards, young immigrants, even without reoffending, are almost twice as likely to be wrongly classified as high risk by machine learning models”, they add.
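The kind of disparity just described can be quantified by computing the false positive rate separately per demographic group: among people who did not reoffend, what fraction were flagged as high risk? A minimal sketch, using fabricated toy data and hypothetical group labels (not the CEJFE dataset or the paper’s exact metrics):

```python
def false_positive_rate(y_true, y_pred):
    """Among actual non-recidivists (y_true == 0), the fraction
    wrongly predicted as high risk (y_pred == 1)."""
    negatives = [yp for yt, yp in zip(y_true, y_pred) if yt == 0]
    return sum(negatives) / len(negatives)

def fpr_by_group(y_true, y_pred, groups):
    """False positive rate computed separately for each group."""
    return {g: false_positive_rate(
                [yt for yt, gg in zip(y_true, groups) if gg == g],
                [yp for yp, gg in zip(y_pred, groups) if gg == g])
            for g in set(groups)}

# Fabricated toy data with hypothetical group labels "nat"/"for":
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]   # 1 = reoffended
y_pred = [1, 0, 0, 1, 1, 1, 1, 1, 1, 0]   # 1 = predicted high risk
groups = ["nat"] * 5 + ["for"] * 5

# nat: 0.5, for: 1.0 -- the "for" group's non-recidivists are flagged
# twice as often, the kind of gap the fairness metrics are meant to catch.
print(fpr_by_group(y_true, y_pred, groups))
```

Equalizing this rate across groups corresponds to one common group-fairness criterion; the paper evaluates three such metrics, and any real audit would compute them on the actual data rather than toy arrays like these.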
In criminology, the use of algorithms, under certain conditions, can be positive
In summary, the authors believe that the use of algorithms, under certain conditions, can be positive, the main contributions of this work being the following:
First, given that expert assessment is laborious and SAVRY assessments are costly, the authors compared the predictive performance of SAVRY with that of risk assessments generated by machine learning methods based on information about the demographics of the accused and their criminal records.
Secondly, SAVRY and the machine learning models were evaluated with regard to possible discrimination on the grounds of sex or nationality.
Thirdly, by combining interpretability techniques, a comprehensive analysis of the data was performed to discuss and identify the possible causes of the disparities found in the machine learning methods.
“Our findings provide an explanation as to why machine learning techniques may discriminate against some people if they are not comprehensively evaluated. In addition, algorithms should be used critically and in collaboration with experts in the area of justice”, state Gómez and Castillo, both researchers from the Department of Information and Communication Technologies (DTIC) at UPF.
This research is carried out in the framework of a collaboration agreement between UPF and the University of Barcelona (UB), with the special collaboration of Antonio Andrés-Pueyo, a researcher at the Department of Clinical Psychology and Psychobiology at the UB.
Songul Tolan, Marius Miron, Emilia Gómez, Carlos Castillo (2019), "Why Machine Learning May Lead to Unfairness: Evidence from Risk Assessment for Juvenile Justice in Catalonia", in: Proceedings of the 17th International Conference on Artificial Intelligence and Law (ICAIL '19), 17-21 June 2019, Montreal, Canada. Best Paper Award.