Big data ethics - charting a course through your data lake
With big data comes big responsibility, and sometimes unintended or unforeseen risks. Leah Beckbridge, Data Privacy Specialist, shares insights on how organisations can use AI more ethically and avoid some common pitfalls.
Ethics in business do not exist in a vacuum, and range from individual moral judgements up to organisation-wide business decisions. Ethics have always evolved as societal norms have changed over time, as people worked out what is right or wrong for any given situation.
So, what are the societal norms when it comes to the relatively new field of AI-driven data processing? Are we, by accident or by inadvertent design, risking handing over our moral decision-making to technologies we ourselves barely understand?
We recognise this is a challenging area for many organisations, and perhaps one you have not yet really thought about. But this technology is here to stay and its use will only increase, so we wanted to ask for five minutes of your time to share our thoughts and insights into how you might choose to respond.
A problem of scale
Artificial intelligences (huge datasets paired with sophisticated, fast computing) are changing our ability to ‘know’ about each other more rapidly than we can always fully control. Our internal, intuitive human risk calculations are not necessarily sophisticated enough to comprehend all the potential pitfalls, for us or for others, of the data processing currently being carried out with our data.
The nascent intelligences we are developing have imperfect parents; us. And like any new parent we can and will make mistakes as our offspring grow and develop. Some of these mistakes are accidental. Some are deliberate.
As people, we instinctively ‘know’ what is ethical in a personal and business context. This understanding has been built up over thousands of years of living as social animals, and we know that transgressing these norms can result in sanctions for those who choose to operate outside of them.
That inherited knowledge has been passed on from generation to generation, but distilling and transposing that innate understanding outside the human experience to a machine is difficult.
Questions of trust
So what can we, as a responsible business, do to stay on the right side of people's intuitive understanding of an invasion of privacy? Here are some of the questions we ask ourselves at Clifford Chance:
- Are we being accountable enough? Do we live up to the strong privacy compliance expected of us, or can we be more open and transparent about the data we collect, and what we do (and intend to do) with it?
- Do we build systems with privacy-conscious engineering techniques? We have developed a baseline set of requirements for all big data analysis that sets the boundaries on what data we'll use, and what we will (and just as importantly will not) do with it.
- Has a Data Protection Impact Assessment (DPIA) been carried out, and did we act on the findings? We have found these more challenging to do with AI systems, but open and honest consultations have been a great way of understanding some of the concerns people might have.
- Are we doing enough to eliminate bias (conscious or otherwise) from the data? Is the percentage breakdown of the data reflective of the target population? This is more than just ensuring a "50-50" split between male and female. Have we balanced the job roles, or incomes, or hobbies of these people properly, or are there, for example, a disproportionate number of low incomes assigned to one gender? (A simple check of this kind is sketched after this list.)
- Is there enough oversight? Are our internal governance controls strong enough, and are there sufficient checks on which issues to report up to the board if needed?
- Lastly, have we empowered the individuals whose data we process? Do we treat them as people with agency, who are given the tools, both legal and informational, to exercise real control over the use of their data?
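To make the question of representativeness a little more concrete, here is a minimal sketch in Python of the kind of check it implies. The field names, values and reference proportions are entirely hypothetical; the point is simply to compare the make-up of a dataset against the population it is meant to represent, both overall and within sub-groups.

```python
from collections import Counter

# Hypothetical records from a dataset being prepared for analysis.
# Field names and values are illustrative only.
records = [
    {"gender": "female", "income_band": "low"},
    {"gender": "female", "income_band": "low"},
    {"gender": "female", "income_band": "high"},
    {"gender": "male", "income_band": "high"},
    {"gender": "male", "income_band": "high"},
    {"gender": "male", "income_band": "low"},
]

# Assumed reference proportions for the target population (e.g. census figures).
reference = {"female": 0.5, "male": 0.5}

def proportions(rows, field):
    """Return the share of each value of `field` within `rows`."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

# 1. Does the overall gender split mirror the target population?
overall = proportions(records, "gender")
for group, expected in reference.items():
    actual = overall.get(group, 0.0)
    print(f"{group}: {actual:.0%} of dataset vs {expected:.0%} expected")

# 2. Within each gender, how are income bands distributed? A heavy skew here
#    (e.g. low incomes concentrated in one group) is the kind of subtle
#    imbalance the question above is probing.
for group in reference:
    subset = [row for row in records if row["gender"] == group]
    print(group, proportions(subset, "income_band"))
```

In practice the same comparison would be run across every attribute that matters for the analysis, not just the one or two that are easiest to count.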
The pitfalls of getting it wrong
It is easy to make assumptions or to overlook the smallest detail when it comes to big data analysis, which can lead to entirely unanticipated outcomes. There is always the risk of introducing bias, unconscious or otherwise, and this bias can lead to adverse effects on individuals.
AI is being used to make more and more of our decisions for us: recruitment, mortgage approvals, health screening and even prison sentencing. But these decisions are really just predictions based on inferences drawn from very large datasets. And if the dataset is not truly representative of the target population, or is skewed in some more subtle way, then the decisions will be wrong, and unfair.
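One simple way such unfairness can be surfaced, once decisions are being made, is to compare outcome rates across groups. The sketch below (Python, with invented decisions and group labels) computes the approval rate for each group and the ratio between the lowest and highest rates, a figure sometimes compared against the 'four-fifths' rule of thumb used in some fairness reviews.

```python
from collections import defaultdict

# Hypothetical automated decisions (e.g. loan approvals), with the group
# membership of each applicant. Data and group labels are illustrative only.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", False), ("group_b", True), ("group_b", False), ("group_b", False),
]

# Count approvals and totals per group.
totals = defaultdict(int)
approvals = defaultdict(int)
for group, approved in decisions:
    totals[group] += 1
    if approved:
        approvals[group] += 1

# Approval rate per group.
rates = {group: approvals[group] / totals[group] for group in totals}
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} approved")

# Ratio of the lowest to the highest approval rate. A ratio well below ~0.8
# is often treated as a flag that the decisions warrant closer review.
ratio = min(rates.values()) / max(rates.values())
print(f"selection-rate ratio: {ratio:.2f}")
```

A check like this does not explain why the gap exists, but it turns a vague worry about unfairness into a number that can be monitored and escalated.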
What about the increasing use of facial recognition? The technology is now available to identify everyone in a crowd at once. How can we, and the organisations deploying it, ensure our anonymity and privacy when we are part of a crowd?
Different scenarios, with different issues, but all sharing the same conflict at their core: that of finding the correct balance between the power and insight of new technologies, and the potential for negative, sometimes even harmful, effects on people.
Treat people as people
Studies by the UK data protection regulator, the ICO, have shown that if individuals trust the organisation processing their information, they will willingly provide more personal data in return for enhanced services or other benefits.
The balance that needs to be struck is not only about not exploiting people's data, but about being seen not to exploit it; to recognise and treat the individual as an individual.
Organisations need to allow the people whose data they use to be able to say of themselves: "I am more than a collection of datapoints to be mined for profit."