
sold at a discounted price to students and old age pensioners. It can, however, also result in perceived unfairness, where some population groups are targeted to pay higher prices based on their profile resulting from geographic location or other attributes.141

In financial services, the focus of differential pricing relates primarily to a consumer's risk profile. Pricing based on risk can improve economic efficiency by discouraging behaviour that is risky, rewarding individuals with no history of engaging in unlawful activities such as traffic accidents. It can improve access to insurance by reducing adverse selection, which occurs when only individuals with a high-risk profile enrol at a uniform price. However, differential pricing of insurance products can result in unfairness where risk factors arise beyond an individual's control, e.g., in health insurance.

Big data may be used to engage in differential pricing by drawing inferences from personal data about an individual's need for the service, and his or her capacity to pay and price sensitivity. The machine may estimate a price as near as possible to the maximum amount the profiled consumer may be willing to pay. Due to an asymmetry of information, the consumer does not know enough about the provider to negotiate the price down to the minimum amount the provider would be willing to accept (e.g., for it to achieve a reasonable return on investment).

In a dynamic market, competition would be expected to impose downward pressure on the provider's price, driving it towards its costs. However, policy concerns arise where differential pricing disadvantages persons who are already disadvantaged. An individual may be more desperate for a financial service, and thus be willing to pay a higher price. A lender may be able to charge a higher price that does not so much reflect the higher risk of default as the borrower's urgency. This may prejudice low-income individuals and families.

Differential pricing can also become discriminatory where prices are set according to criteria that, while seemingly objective, result in adverse treatment of protected groups. For instance, if an algorithm sets higher prices for consumers with a postcode from a neighbourhood that has historically had higher levels of default than other neighbourhoods, individuals who do not themselves have other attributes to suggest a higher risk may face higher prices.

Certain historically disadvantaged population groups share particular attributes (such as a postcode). Individuals with those attributes may thereby suffer discrimination even if the attributes have no bearing on creditworthiness. For example, a person with a healthy salary and little debt may be treated adversely as a result of living in a community (or having social media friends, the same medical doctor, or shopping at discount stores) where people have historically higher debt-to-income ratios. Machine learning models are thus among other trends in the automation of economic processes that may increase inequality over time.142

4.3 Protecting consumers in the event of data breach and re-identification

The vast amounts of data held by and transferred among big data players create risks of data security breach, and thus risk to consumer privacy. Even when the amount of data held on an individual is kept to a minimum, their identity may be uncovered through reverse-engineering from even a small number of data points, risking violation of their privacy.143 The risk of this occurring arises where the data may be obtained by third parties, whether through unauthorised access through a data breach or by transfer of the data to a third party with the agreement of the firm controlling or processing the data. In both cases, measures to protect against the release of data about identifiable individuals include de-identification, pseudonymisation and anonymisation. Such measures, and the challenges that they face in the context of big data, are discussed in this section 6.3. Section 1.1 discusses the role and regulation of third-party intermediaries who acquire data by agreement in the data market.

The limits of de-identification, pseudonymisation and anonymisation

Personal privacy may be protected in varying degrees by using privacy enhancing technologies (PETs)144 such as de-identification, which involves suppressing or adding noise to directly identifying and indirectly identifying information in a dataset, or otherwise introducing barriers (making it statistically unlikely) to identifying a person:145

•  Directly identifying data identifies a person without additional information or by linking to information in the public domain (e.g., a person's name, telephone number, email address, photograph, social security number, or biometric identifiers).
•  Indirectly identifying data includes attributes that can be used to identify a person, such as age, location and unique personal characteristics.
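To make the distinction concrete, the following is a minimal sketch in Python, using hypothetical field names and invented records, of how direct identifiers might be pseudonymised with a keyed hash and indirect identifiers generalised, and of why such measures have limits: a combination of indirect identifiers that is unique in the released data can still single out an individual. It illustrates the general techniques only, not any particular provider's implementation.

import hashlib
import hmac

SECRET_KEY = b"keep-this-key-apart-from-the-data"  # assumption: pseudonymisation key stored separately from the dataset

def pseudonymise(value: str) -> str:
    # Replace a direct identifier with a keyed hash (a pseudonym).
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:12]

def generalise_age(age: int) -> str:
    # Coarsen an indirect identifier (exact age) into a ten-year band.
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def de_identify(record: dict) -> dict:
    return {
        "id": pseudonymise(record["name"]),         # direct identifier replaced by a pseudonym
        "age_band": generalise_age(record["age"]),  # indirect identifier, generalised
        "postcode_area": record["postcode"][:3],    # indirect identifier, truncated
        "balance": record["balance"],               # attribute of interest, kept as-is
    }

raw_records = [  # invented example data
    {"name": "A. Mensah",  "age": 34, "postcode": "GA-184", "balance": 120},
    {"name": "B. Osei",    "age": 36, "postcode": "GA-187", "balance": 450},
    {"name": "C. Boateng", "age": 71, "postcode": "KU-902", "balance": 80},
]

released = [de_identify(r) for r in raw_records]

# The limit: if an outside party already knows someone's approximate age and
# postcode area, and only one released record matches that combination of
# indirect identifiers, that record is effectively re-identified despite the
# suppressed name.
matches = [r for r in released
           if r["age_band"] == "70-79" and r["postcode_area"] == "KU-"]
print(f"Records matching the known quasi-identifiers: {len(matches)}")  # prints 1

Coarsening the indirect identifiers further, or adding noise to them, reduces this re-identification risk but also reduces the usefulness of the released data, which is the basic trade-off such techniques face.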


