But in the largest study of real-world mortgage data ever conducted, economists Laura Blattner of Stanford University and Scott Nelson of the University of Chicago showed that the difference in mortgage approvals between minority and majority groups is not just about prejudice. It also stems from the fact that minority and low-income groups have less data in their credit histories.
This means that when this data is used to calculate a credit score and this credit score is used to predict loan default, the prediction will be less accurate. It is this lack of precision that causes inequality, not just prejudice.
The implication is obvious: fairer algorithms will not solve the problem.
“This is a very surprising result,” said Ashesh Rambachan, who studies machine learning and economics at Harvard University but did not participate in the study. Bias and incomplete credit records have been hot issues for some time, but this is the first large-scale experiment on loan applications from millions of real people.
Credit scores compress a series of socioeconomic data (such as employment history, financial records, and buying habits) into a single number. In addition to deciding on loan applications, credit scores are now used to make many life-changing decisions, including decisions about insurance, hiring, and housing.
To work out why mortgage lenders treat minority and majority groups differently, Blattner and Nelson collected credit reports for 50 million anonymized U.S. consumers and linked each of these consumers to their records in a marketing data set, their property deeds and mortgage transactions, and data about the mortgage lenders who provided loans to them.
This is the first study of its kind. One reason is that these data sets are proprietary and not publicly available to researchers. “We went to the credit bureau and basically had to pay them a lot of money to do this,” Blattner said.
Then, they experimented with different prediction algorithms to show that credit scores are not only biased but also “noisy”, a statistical term for data that cannot be used to make accurate predictions. Take a minority applicant with a credit score of 620 as an example. In a biased system, we might expect this score to always overstate the applicant’s risk, so that a more accurate score would be, say, 625. In theory, this bias could then be corrected by some form of algorithmic affirmative action, such as lowering the approval threshold for minority applications.
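The difference between a biased score and a noisy one can be illustrated with a small simulation. This is only a sketch and not the method used in the study: the approval cutoff of 620, the constant 5-point understatement, the noise level, and the uniform spread of true scores are all assumptions chosen for illustration, and the function name `accuracy` is hypothetical.

```python
import random

CUTOFF = 620  # assumed approval cutoff on the "true" score
OFFSET = 5    # assumed constant understatement in the biased score


def accuracy(observed, true, threshold):
    """Share of applicants whose decision from the observed score matches
    the decision their true score would earn at CUTOFF."""
    hits = sum((o >= threshold) == (t >= CUTOFF) for o, t in zip(observed, true))
    return hits / len(true)


rng = random.Random(0)  # fixed seed so the run is repeatable
true_scores = [rng.randint(550, 700) for _ in range(10_000)]

# Biased: every score is too low by the same amount.
biased = [t - OFFSET for t in true_scores]
# Noisy: each score is off by a random amount centred on zero.
noisy = [t + round(rng.gauss(0, 20)) for t in true_scores]

biased_uncorrected = accuracy(biased, true_scores, CUTOFF)
# Lowering the threshold by the known offset undoes the bias exactly.
biased_corrected = accuracy(biased, true_scores, CUTOFF - OFFSET)
# For the noisy scores, try every threshold in a wide range and keep the best.
best_noisy = max(accuracy(noisy, true_scores, th) for th in range(500, 751))

print("biased, original threshold:", biased_uncorrected)
print("biased, lowered threshold: ", biased_corrected)
print("noisy, best threshold:     ", best_noisy)
```

With the lowered threshold, every decision made from the biased scores matches the decision the true score would earn. For the noisy scores, no choice of threshold recovers all of the correct decisions, because the errors point in different directions for different applicants.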