Tuesday, December 06, 2016

What are competition authorities talking about when they talk about big data?

OECD Hearing: Providing help to understand big data analytically

Last week, on a sunny and frosty November day (“Cold frost and sunshine: day of wonder! - Мороз и солнце; день чудесный”), competition authorities gathered at the OECD in Paris with the intention of figuring out analytically what big data could mean for competition enforcement.
In essence, discussions mainly focused on the impact on competition policy of the massive processing and exploitation of private data for commercial gain. While it goes without saying that big data is already significantly affecting business strategies, both on and offline, the question competition authorities are ever warming to is how big data might affect their own competition enforcement activities.

The topic "Big data: Bringing competition policy to the digital era" was discussed by invited experts from academia and the private sector, followed by contributions provided by BIAC (with a Microsoft representative as ad hoc speaker), Germany, France, the EU (DG COMP and EDPS), the United States and the United Kingdom.
The Committee Chair, after noting that big data for competition authorities was a bit like sex for teenagers (“everybody talks about it, nobody has experienced it, everybody thinks that the others know more about it than this”), warned the distinguished attendees that views on this subject tended to be quite polarized, partly because of the novelty of the economic and legal questions raised. Despite a few genuinely irreconcilable views, my understanding was that none of the experts and other contributors would have gone so far as to deny that big data had at least some implications for consumers and competition authorities (“too big an issue to be ignored”). The discussions within the OECD Committee were mainly theoretical, however, as the case-related experience of competition authorities is still very limited.

It should also be noted that, while the attention of the Competition Committee and its invited experts was almost entirely focused on the online tracking of activities by users of digitally-enabled services, big data logically also comprises data produced directly by machines (non-personal data). Besides, a series of advances in the area of artificial intelligence and virtual/augmented reality, as well as the increased popularity of the Internet of Things, strongly suggest that the competitive significance of data will grow in the future. Indeed, we might still be at a comparatively early stage in the transition towards an increasingly data-based, and potentially paradigm-shifting, economy.

What follows is a brief account of some of the discussions that took place at the hearing, for personal memory and sharing with my students and other patient @wavesblog readers.

What is data in economic terms?

Basically, data can be an input or an output. As an input, or asset, data in digital markets would seem rather on the cheap side, in the sense that nowadays there are inexpensive ways to generate and collect data produced by consumers while they are connected to the Internet (“web logs, sensors, etc.”). While firms offering digitally-enabled services get a lot of data almost automatically, data becomes really helpful and valuable “if it can be turned into information, knowledge and action”. This type of data analytics requires investment in complementary assets such as hardware, software, and skilled labour. Arguably, improvements in terms of value extraction (“information, knowledge and action”) are directly related to improvements in software and hardware and not only to improvements in terms of the amount and variety of data that is available.

On the demand side, the increasing returns to scale that have attracted much attention in the case of online platforms such as Google and Facebook are the so-called direct and indirect network effects. Here what matters is share: the firm that has the larger share of users has power, as more users automatically make the digitally-enabled service more attractive relative to another online platform. On the supply side, classic returns to scale are often caused by high fixed costs, high entry costs and relatively low marginal costs. In this respect, size is what matters for supply side economies of scale: the cost per unit decreases as the quantity produced increases. That said, the digital economy is characterized nowadays by lower fixed costs (thanks to cloud computing, software as a service, freely available productivity tools, etc.). In this scenario, firms can relatively easily enter new markets and scale up their business.
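To make the supply side point concrete, here is a minimal sketch, assuming a simple linear cost function C(q) = F + c*q (my own illustrative assumption, not something presented at the hearing). The function name and all the figures are hypothetical. It shows the cost per unit falling as quantity grows, and how lower fixed costs shrink the advantage of sheer size.

```python
def average_cost(fixed_cost: float, marginal_cost: float, quantity: int) -> float:
    """Cost per unit under an assumed linear cost function C(q) = F + c*q."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return fixed_cost / quantity + marginal_cost


# Hypothetical figures: a firm with heavy up-front investment versus a firm
# that rents its infrastructure (e.g. cloud computing) and so has low fixed costs.
for label, fixed in (("high fixed costs", 1_000_000), ("low fixed costs", 10_000)):
    for quantity in (1_000, 10_000, 100_000):
        cost = average_cost(fixed, 0.5, quantity)
        print(f"{label:>16} | q={quantity:>7} | cost per unit={cost:.2f}")
```

With high fixed costs the small firm is at a huge per-unit disadvantage; with low fixed costs the gap between small and large output is far narrower, which is the sense in which entry and scaling up become easier.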

The classic demand and supply side returns to scale are produced almost automatically (“positive feedback loop”). A critical aspect in the world of big data, however, is that it is mainly user generated data which directly helps with product improvement. This specific “return to scale” has also been called a “data network effect”. For instance, the consumer uses a digitally-enabled service that is connected to the Internet, like Google Maps. In this case, the user is not only producing the data while using the service so that the supplier of the service can learn, but the user directly produces the data that enables the service and improves it in real time. It could be argued that this is a supply side phenomenon rather than a demand side phenomenon. More specifically, considered from the supply side, it’s not really a network effect but rather a matter of learning by doing, as the cost per unit decreases and quality increases as experience increases. This continuous improvement on the production side would be reminiscent of the Japanese Kaizen, with its focus on constant learning on how to optimise production and how to optimise the characteristics of the product to make it more attractive to users. Experience is particularly relevant for learning by doing: how long firms have been around and what they have learned. As such, this supply side phenomenon known as learning by doing wouldn’t be specific to the data economy but is present in virtually all industries (“companies from all industries have long sought to improve the relevance of their data, in order to inform their decision making, and better serve consumers”). Unlike more traditional returns to scale, this type of positive feedback would require serious investment and commitment. In particular, learning by doing by online platforms requires data, hardware, software, tools, and expertise.
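For readers who like to see learning by doing in numbers, below is a minimal sketch, assuming the textbook experience-curve form in which unit cost falls by a fixed percentage every time cumulative output doubles. This functional form, the 80% progress ratio and the function name are my own illustrative assumptions; nobody at the hearing proposed a specific formula.

```python
import math


def unit_cost(first_unit_cost: float, cumulative_units: int,
              progress_ratio: float = 0.8) -> float:
    """Cost of the n-th unit under an assumed experience curve.

    progress_ratio = 0.8 means each doubling of cumulative output
    brings the unit cost down to 80% of its previous level.
    """
    if cumulative_units < 1:
        raise ValueError("cumulative_units must be at least 1")
    if not 0 < progress_ratio <= 1:
        raise ValueError("progress_ratio must be in (0, 1]")
    exponent = math.log2(progress_ratio)  # zero or negative
    return first_unit_cost * cumulative_units ** exponent


# Hypothetical example: what matters is accumulated experience,
# not the quantity produced in any single period.
for n in (1, 2, 4, 8, 1_024):
    print(f"unit {n:>5}: cost {unit_cost(100.0, n):.2f}")
```

The contrast with plain economies of scale is that the driver here is cumulative experience: a firm that has been around longer sits further down the curve even if its current output is no larger than a newcomer's.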

In this respect, a really intriguing question discussed at the hearing is whether there is something genuinely new going on here. The user is not only producing the data while using the service, from which the online platform can obviously learn, but the user produces the data that directly contributes to the provision of the service and its improvement in real time.
This specific feedback loop might be something peculiar to the data economy, as using the service has an effect not only on the capacity of the firm to improve the service, but on the capacity of the firm to provide it. Thus, using a digitally-enabled service such as Google Maps would have a direct influence on the production of the service, as if by driving a car the user "might make the car production line go faster". Consumption would thus be intertwined with production, from the demand side to the supply side.

Is there a data barrier to entry?

The fact that data is relatively cheap and available, or that some companies are better than others at utilising data to extract value, would not seem to exclude per se that there might be serious data-related impediments or even barriers to entry.
Conversely, there is no automatic barrier to entry because of the increasing importance of data as a parameter of competition.

As seen above, an interesting peculiarity characterizing big data-driven economic sectors is that valuable data is generated by the interaction between the consumer and the digitally-enabled service, that is, when the firm is already in the market. In this respect, a distinction should be made between data needed for entry and data that can be generated only after entry. This raises the question of the minimum quantity and variety of data necessary in order to enter the specific big data-driven market (a sort of "minimum efficient scale").

At any rate, there seemed to be general consensus that the attention of competition authorities should be primarily focused on sectors in which a certain quantity/variety of data is indispensable for firms and alternative data sources (such as data brokers) cannot be tapped. Future examples of this situation could arise in the world of so-called Industry 4.0, i.e. the trend towards the full computerization of manufacturing where machines directly produce and exchange valuable data.

Should privacy be protected as a component of quality competition?

It is clear to anyone who has been around in the “last 45 years” that competition policy has become quite price-centric. Antitrust is calibrated to assess short-term price effects in narrowly defined markets. But, as Hayek warned, “unlike the position that exists in the physical sciences, in economics and other disciplines that deal with essentially complex phenomena, the aspects of the events to be accounted for about which we can get quantitative data are necessarily limited and may not include the important ones”.

With regard to competition enforcement, it is often the case that quality and innovation concerns that are non-price-related (and therefore not easily quantifiable) are what should really matter in terms of this policy’s exclusive goal, i.e. to make competition processes work best in the interest of consumers. In this respect, big data could be like a canary in the coal mine, i.e. it could serve to expose a crucial weakness in the current antitrust framework.

A more specific question discussed during the hearing is whether privacy can be treated as a parameter of quality competition. In this respect, for instance, privacy degradation (i.e., demanding more data than firms could in a competitive market, or using data in ways that are contrary to the consumer’s wishes) could come across as an exercise of market power and, possibly, as an abuse of dominant position. The CJEU in ASNEF-Equifax held that “(a)ny possible issues relating to the sensitivity of personal data are not, as such, a matter for competition law, they may be resolved on the basis of the relevant provisions governing data protection”. Based on the Court’s “as such” language, the European Commission seems perfectly entitled to look “at big data and privacy where an antitrust concern may exist”.

Of course, it is not a matter of incorporating privacy concerns into antitrust analysis as such. Only those product/service characteristics that are important to consumers are likely to be relevant in terms of competition enforcement, as specific practices can only be anticompetitive insofar as consumers are negatively affected by them. Therefore, competition authorities have to decide whether consumers affected by the practice under analysis see privacy as a significant factor of quality, and whether companies compete with each other on this parameter. Thus, with regard to the Google/DoubleClick merger, the US Competition Authority (the Federal Trade Commission) decided that the combination of Google’s user search information with consumer data that DoubleClick was collecting on behalf of the publishers did not enhance Google’s market power in the intermediation market. The Authority also concluded that privacy as an element of quality competition would not have been negatively affected by the merger.

Last March, the German Competition Authority initiated proceedings against Facebook for possible abuse of dominant position. This case attracted much attention during the OECD hearing for its potential significance at the interface between competition policy and data protection. The dominant social network generates data from users on one side of the platform and directly monetizes that data on another side of the platform. The question is whether Facebook is abusing its dominant position by degrading privacy, that is, by lowering the quality of the services it provides to its users. In particular, the German Competition Authority is considering whether the possible breach of data protection by Facebook’s terms and conditions of service may also constitute an abuse of dominance under competition law. It was also recalled that Germany’s Bundesgerichtshof sort of “paved the way” by stating that the “use of illegal general terms and conditions by a dominant company can constitute an abuse under the terms of German competition law”. Excessive “data extraction” from consumers is not that dissimilar from excessive pricing. Of course, this would imply data collection by the dominant firm that goes well beyond what is necessary in order to improve the quality of the service and/or lower its costs. In specific cases, identifying harm and distinguishing it from pro-competitive platform and service design can pose significant challenges.

Understandably, some expressed concerns that, by embracing a less price-centric view, competition authorities and courts would daringly depart from traditional, well-established economics-based theories of harm. But it is also clear that it would make little sense to stick with traditional theories and tools while markets and competition processes undergo technology-driven “revolutions” (yes, someone actually employed the word).

While it is not yet clear to what extent brand new theories of harm might be needed in order to assess anticompetitive practices in (big) data-driven markets, competition authorities are already reflecting on the need to adapt their enforcement tools to some of the peculiarities of digital markets. Thus, the OECD Competition Committee in 2013 discussed the idea of defining a relevant market based on product quality by replacing the customary SSNIP test (small but significant non-transitory increase in price) with an SSNDQ test (small but significant non-transitory decrease in product quality).
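As a reminder of why the SSNIP test is so tied to price, here is a minimal sketch of the standard break-even "critical loss" calculation often used alongside it: a price increase of t is unprofitable for a hypothetical monopolist if it loses more than t / (t + m) of its sales, where m is the percentage margin. The function names and figures are hypothetical and the sketch is mine, not the OECD's. No equally simple yardstick is offered here for the SSNDQ test, precisely because a "5% decrease in quality" is much harder to quantify, least of all where the service is offered at a zero price.

```python
def critical_loss(price_increase: float, margin: float) -> float:
    """Break-even share of sales that can be lost before a price rise stops paying.

    price_increase: the SSNIP as a fraction of price (e.g. 0.05 for 5%).
    margin: the pre-increase price-cost margin as a fraction of price.
    """
    if price_increase <= 0:
        raise ValueError("price_increase must be positive")
    if not 0 <= margin < 1:
        raise ValueError("margin must be in [0, 1)")
    return price_increase / (price_increase + margin)


def price_rise_is_profitable(expected_loss: float, price_increase: float,
                             margin: float) -> bool:
    """True if the expected loss of sales stays below the critical loss."""
    return expected_loss < critical_loss(price_increase, margin)


# Hypothetical figures: a 5% SSNIP with a 45% margin gives a critical loss of 10%.
print(f"critical loss: {critical_loss(0.05, 0.45):.1%}")
print("candidate market holds:", price_rise_is_profitable(0.07, 0.05, 0.45))
```

If the hypothetical monopolist would lose less than the critical loss, the price rise pays and the candidate market is wide enough; otherwise the market has to be drawn more broadly.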

From (big) data to (big) market failures?


It is a well-established fact that many users of online services do not feel in control of their data. If online service providers are not offering the level of data privacy that consumers would appreciate, another way to look at this issue is from a more regulatory perspective. In this regard, it might be helpful to start investigating what the current obstacles to well-functioning data privacy competition could be. Lack of confidence and trust on the part of consumers is a matter of interest also to market regulators, especially if this risks undermining the functioning of digital markets. Thus, for instance, if the maximizing strategy in some digital markets is the creation of confusion, new entrants committed to transparency do not necessarily succeed. In other words, if market incentives for transparency and data privacy are insufficient, a form of procompetitive regulation becomes necessary.

Another type of possible and potentially significant failure of big data-driven markets is that, as seen above, consumers are only partially benefiting from the valuable data that they are producing and that fuels a set of extremely profitable digital services.

Conclusion

Competition authorities' adolescent fascination with big data has quickly matured into full-grown interest and analysis. In any event, new theories of harm and enforcement tools will need to be tested on the basis of real cases. The time might be ripe for more competition enforcement in digital markets.


Fly on the Wall Series, Episode 1