Page 25 - FIGI - Big data, machine learning, consumer protection and privacy
4 THE ENGAGEMENT PHASE: CONSUMER PROTECTION AND PRIVACY IN THE OPERATION OF AI-DRIVEN SERVICES
This section discusses engagement: the consumer’s experience with big data and machine learning, and conversely the collection, use, storage and transfer of the consumer’s data by big data and machine learning firms. Sections 6.1 and 6.2 consider consumer concerns and legal issues that arise from the substantive results of the data processing, in particular responsibility for accuracy and biased decision-making. Section 6.3 considers protections for consumers against the risk of the release of their data through data breach and re-identification, focusing on the techniques of de-identification, pseudonymisation and anonymisation. Section 1.1 turns to the risks to consumers that arise through transfers of data in the vibrant data broker market, and increased regulation of this market segment.

4.1 Accuracy – protecting consumers from erroneous and outdated data

Accuracy of data inputs

The successful functioning of machine learning models and the accuracy of their outputs depend on the accuracy of the input data. Some of the vast volumes of data used to train the system may be “structured” (organized and readily searchable) and some may be “unstructured.”[103] The data may have been obtained in different ways over time from a variety of sources, some more and some less directly.[104] The wider the net of data that is collected, the greater the chances are that data will be out of date and that systematic updating processes are not applied. Historical data may have even been incorrect from the start.

These factors may result in questionable accuracy of data inputs to the algorithms. This may be true both for the personal data about the individual who is the subject of an automated decision (to which the machine learning model is applied), as well as for the wider pool of data used to train the machine. If the training data is inaccurate, the model will not function to produce the intended outputs when applied to the individual’s personal data. All of these problems may give rise to erroneous inferences about the consumer.

Data protection and privacy laws thus increasingly set some form of legal responsibility on firms to ensure the accuracy of the data they hold and process. Mexico’s data protection legislation applies a quality principle requiring data controllers to verify that personal data in their databases is correct and updated for the purposes for which it was gathered.[104] This raises the question of the accuracy of data in the wider data ecosystem, and the extent to which firms should be held responsible for inaccuracy or required to contribute to accurate information more broadly.

Responsibility for data accuracy in financial services

Sector-specific laws governing financial services often emphasize the importance of ensuring accuracy of data used for financial services. Data used for credit scoring is an example.[105] Credit reporting bureaus are typically subject to regulation and strong internal controls to ensure accuracy of the data they hold on individuals. Such credit reporting systems reduce the costs of lending by reducing risk (and thus loan default losses, provisioning for bad debt, and need for collateral) inherent in information asymmetries between lenders and borrowers. They provide lenders with information to evaluate borrowers, allowing greater access to financial services.[106] Because of the importance of their data in credit and other decision-making, credit reference bureaus provide individuals with a means of correcting inaccurate information.

However, this formal information system is now only part of a wider data-rich environment, most of which is not regulated. The advent of big data and machine learning poses a risk that existing legislation and policy guidance do not keep up with the data-rich environment. For instance, the first principle of the World Bank’s General Principles on Credit Reporting (GPCR), published in 2011,[107] is that “credit reporting systems should have relevant, accurate, timely and sufficient data – including positive – collected on a systematic basis from all reliable, appropriate and available sources, and should retain this information for a sufficient amount of time.”

Questions arise about how exactly this sort of policy guidance should apply today – just eight years later – to information about individuals supplied and collected for purposes that may not initially have related to making credit decisions. Big data and machine learning may collect and use data that varies greatly in its relevance, accuracy and timeliness.

These challenges apply also to laws that were written before the advent of big data and machine learning and even the internet itself. Firms that do not consider themselves to be credit reference