A Committee of Experts led by Infosys co-founder Kris Gopalakrishnan has recommended that the Indian government, through a proposed regulator called the ‘Non Personal Data Authority’, be the final arbiter of whether the government or a private company can take proprietary “non-personal data” from any company in India. This could even be raw data, stripped of any personal identifiers.
Thus, invoices from a toy distributor in Pune could be available to a competitor in Satara, with buyer details removed. Data about consumer demand for Patanjali’s Neem soap in Varanasi and Prayagraj could be available to Dabur. While factual data will be given for free, value-added data, like estimates of regional demand for the next year, could mandatorily be made available, for any company, at a “fair, reasonable and non-discriminatory” price.
The message from the committee to entrepreneurs and businesspersons in India is: the Indian state or a data trustee (which could be a government department) has the rights over the data your company collects or creates. You don’t.
Why is this being done? Data is critical to competition, the development of artificial intelligence, and the economy of the future.
In emails from 2012 made public earlier this week during antitrust hearings in the US, Facebook CEO Mark Zuckerberg explains why he wanted to buy Instagram: it had established networks, was a meaningful brand, and could disrupt Facebook if it grew to a large scale. Buying Instagram would allow Facebook to neutralise a competitor, benefit from integrating its unique “dynamics”, and buy time before a similar competitor emerged.
There is global concern about the dominance that data and network effects create for large Internet companies. Network effects are created by analysing data to enable content and product recommendations, to create new services and to improve products. The Committee says that mandatorily breaking up the data silos that companies hold, and making data available to anyone, will benefit the country.
Where is the market failure in data that warrants such a drastic move? Data in silos did not prevent the rise of Snapchat, Twitter or Sharechat, or even the emergence of TikTok as a global challenger to Facebook. The idea of data in silos as a restriction on competition ignores the non-exclusive nature of data generation: users can be on multiple platforms at the same time.
A legitimate concern is that of vertical integration by platforms. An analysis of 15,000 search results by The Markup has found that 41% of the first page of Google search results is taken up by Google-run products. Both Google and Apple services grow by being the default on their operating systems. Amazon allegedly uses data about products on its platform to decide what to sell under its own brands.
If the idea is to use this data nationalisation model to make India an AI superpower and grow the AI technology sector, which is where India’s China-envy comes in, it won’t work.
Firstly, research shows that AI can be trained on less data, and that processing vast amounts of data doesn’t necessarily produce a better output. Secondly, just sharing data won’t be enough. To build an AI ecosystem, you need more computing power and storage, and more data scientists to do feature engineering and create models for training datasets. Work goes into cleaning datasets to make them useful and usable. Most importantly, you need AI businesses to be economically viable. AI businesses will themselves be disincentivised by the risk of appropriation of their curated data.
The destruction of the commercial viability of India’s AI sector will cause it to stagnate, and will invalidate the business case for creating training data in the future. Lastly, with more anonymous data made freely available, the risks of re-identification, surveillance and group privacy harms increase.
There is no doubt that we need to improve competition law to address the abuse of dominance against suppliers, acquisition of future competitors by dominant players, vertical integration and the lack of neutrality of platforms, and market crowding which could prevent the rise of competitors. The committee’s approach of nationalising data will do more harm than good.
We need to enable open licensing, which the Indian government considered in 2016, to allow sharing of non-competitive data in the public interest. We need data marketplaces but, as the European Union has suggested, these should be voluntary instead of mandatory. Data portability should enable users to choose to take their data to new services.
What we don’t need are competition concerns and AI dreams being used as distracting rhetoric to validate the appropriation of trade secrets and business data by the Indian state. Nationalisation of this data and the risk of mandatory distribution to competitors will destroy value for businesses and investors, and for India.