If data is often the source of bias, can it also rectify the issue?

While data is often the initial source of bias, it is also a core element of tackling the issue. Indeed, the Centre for Data Ethics and Innovation (CDEI) recently released a report which qualifies this claim.

Data as the source of bias

In the wake of the Cambridge Analytica scandal, more users are now beginning to recognise the potential destructiveness of data. As the recent past has demonstrated, data collection can potentially lead to discrimination and self-censorship.

Last year, for example, Amazon implemented an AI tool that discriminated against women. The model screened candidates by analysing patterns in resumes submitted over a ten-year period, but those resumes came overwhelmingly from men.

The CDEI also identified bias in algorithmic decision-making as an immediate concern. However, the independent advisory body added that such systems, "if designed well, can reduce human bias in decision-making processes."

Can data tackle bias?

For organisations seeking to use data in a responsible manner, tackling data bias can be incredibly challenging. Indeed, there is a tension between the need to create algorithms which are blind to certain characteristics and the need to check for bias against those same characteristics.
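The tension described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the applicants, their scores, the "group" field standing in for a protected characteristic, and the threshold of 60 are all hypothetical. The decision rule never reads the protected characteristic, yet the audit cannot be carried out without it.

```python
from collections import defaultdict

# Hypothetical toy applicants; "group" stands in for a protected characteristic.
applicants = [
    {"score": 82, "group": "A"},
    {"score": 74, "group": "A"},
    {"score": 65, "group": "A"},
    {"score": 58, "group": "A"},
    {"score": 79, "group": "B"},
    {"score": 61, "group": "B"},
    {"score": 55, "group": "B"},
    {"score": 50, "group": "B"},
]


def blind_decision(applicant, threshold=60):
    """Decide on the score alone; the protected characteristic is never read."""
    return applicant["score"] >= threshold


def selection_rates(records, decide):
    """Audit step: group the outcomes by the characteristic the rule ignores."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["group"]] += 1
        if decide(record):
            selected[record["group"]] += 1
    return {group: selected[group] / total[group] for group in total}


rates = selection_rates(applicants, blind_decision)
print(rates)
```

On this toy data the blind rule selects three in four applicants from group A but only two in four from group B, a gap that only becomes visible because the audit has access to the very attribute the rule was built to ignore.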

As the volume and variety of data used to inform decisions continue to increase, the algorithms used to interpret that data are becoming more complex. As a result, the CDEI notes that "concerns are growing that without proper oversight, algorithms risk entrenching and potentially worsening bias."

In order to tackle this, companies require new approaches to identifying and mitigating bias. However, the CDEI observes that companies have a limited understanding of the full range of tools and approaches available and what constitutes best practice.

What constitutes best practice?

This lack of clarity makes it difficult for organisations to mitigate bias in their decision-making processes. It is also essential to determine who should be responsible for governing, auditing, and assuring these algorithmic decision-making systems.

For example, decision-makers are likely to face significant trade-offs between different definitions of fairness, and between fairness and accuracy. Moreover, there is a lack of consensus and guidance surrounding how to make these choices.
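A small Python sketch can make these trade-offs concrete. It is an illustration under stated assumptions, not a method taken from the CDEI report: the groups, labels, and the two decision rules named "accurate" and "parity" are hypothetical. The two fairness measures used here, the gap in selection rates and the gap in true positive rates, are two of several in common use.

```python
# Hypothetical toy data: group membership and true outcome (1 = qualified).
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

# Two hypothetical decision rules, written directly as their predictions.
predictions = {
    "accurate": [1, 1, 1, 0, 1, 0, 0, 0],  # matches the labels exactly
    "parity": [1, 1, 0, 0, 1, 1, 0, 0],  # selects half of each group
}


def accuracy(preds):
    """Share of decisions that match the true outcome."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def selection_rate(preds, group):
    """Share of a group that receives a positive decision."""
    chosen = [p for p, g in zip(preds, groups) if g == group]
    return sum(chosen) / len(chosen)


def true_positive_rate(preds, group):
    """Share of a group's qualified members who receive a positive decision."""
    qualified = [
        p for p, g, y in zip(preds, groups, labels) if g == group and y == 1
    ]
    return sum(qualified) / len(qualified)


def summarise(preds):
    """Accuracy plus the gap between groups under each fairness measure."""
    return {
        "accuracy": accuracy(preds),
        "selection_rate_gap": abs(
            selection_rate(preds, "A") - selection_rate(preds, "B")
        ),
        "true_positive_rate_gap": abs(
            true_positive_rate(preds, "A") - true_positive_rate(preds, "B")
        ),
    }


results = {name: summarise(preds) for name, preds in predictions.items()}
for name, metrics in results.items():
    print(name, metrics)
```

In this toy case the "accurate" rule makes no mistakes and treats qualified people in both groups alike, but it selects the two groups at very different rates. The "parity" rule equalises selection rates, yet it loses accuracy and opens a gap in how qualified people are treated. Neither rule satisfies every measure at once, which is the kind of choice decision-makers currently face without clear guidance.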

The CDEI insists that these potential choices are likely to be highly context-specific. As a result, the manner in which a third party governs and audits these systems will differ on a sector-by-sector basis.

At present, it is therefore unclear whether data can rectify the bias inherent to certain datasets. As the CDEI found, however, this could potentially change as more organisations actively seek innovative ways to operate ethically.

Interested in ethical data? Check out our podcast with Kasia Borowska, Managing Director at Brainpool AI, in which she helps to define AI's ethical issues around bias and non-existent guidelines.