Data Unions: Questions on data selling


Firstly, consent is a big deal. At the moment, the data broking industry does not work on the basis of informed and overt consent. People are pressured into signing away permissions which are themselves buried on page 34, subsection B of a 62-page contract. It’s fake consent. Just shifting that, to make the whole transaction overt and informed (what we’d call rich consent), massively changes the ethical basis for engagement.

Secondly, not all data monetisation is about individual targeting. I think that’s where most of the ethical issues arise. I sell data about myself, ‘Shiv Malik’, so that advertising companies out there can manipulate me. Well, I agree, that isn’t great. But there are other forms of data that are really incredibly useful.

When it comes to deindividuated or aggregated data, the ethical harms posed are quite different. If I can sell my data, anonymised and as part of a much larger collective, then I’m allowing third-party companies to make decisions based on collective behaviour. Assuming that the data cannot be disaggregated and I cannot be meaningfully identified (granted, that is often hard to achieve), my own privacy is not jeopardised.

Collecting behavioural information about humanity is of course what, say, universities do all the time. In today’s world, we all depend on other people receiving decent information about society: where to put investment, what drugs work, where infrastructure is weak, how to improve sales, where pollution is greatest. The list goes on and on. And the markets for aggregated data, where the data buyer really isn’t interested in targeting the individual but does care about collective behaviour, are pretty vast. If that is a sordid business, then we should shutter modern society now and go back to living in caves.

Let me give some examples:

A hotel developer wants to know where to invest next, so data about travel patterns would be good. That’s not the same as wanting a list of likely holidaymakers to advertise to. Or a city needs to know about road planning. Or Tesco wants to know about footfall. Or I want to know about local pollution. Or I’m a TV producer who wants to know what people are watching on Netflix. None of these data products require that the individuals be identified and targeted as an inherent part of the product. It might happen that they are, or could be. But the product doesn’t need to know who the individuals are and how they can be targeted to be highly useful.

Finally, you should, I’d humbly suggest, really be turning your question on its head. Unfortunately, you are already involved in this “sordid” business whether you like it or not. You are already a data product to Silicon Valley. So the question really should be this: what are you doing about it? If your answer is to try and keep washing your hands of it all, is that really going to work? You can’t Pontius Pilate your way out of this problem.

When people are buying Amazon Alexas in their millions, it’s clear that the privacy movement has failed. Simply standing to one side and calling for more legislation will not in itself improve people’s privacy. And it will not stop Silicon Valley from monopolising the information we all create, because this is not just about an individual’s privacy. It’s also about socio-economic power. Big Tech not only sucks up all the capital and cash, it effectively governs our lives because of the monopolised data it collects.

We hope that Data Unions lead the way in breaking those data monopolies. By creating governance structures and organisations that ensure professional people work on your behalf and in your interest, information should then only be licensed to and used by people with the highest ethical standards. By the way, these ideas are not just Streamr’s: they are supported by thinkers and practitioners like Jaron Lanier, the MyData movement, RadicalxChange, and very soon, we hope, the European Union.

2). Surely user data can still be exploited for unscrupulous means by unethical data buyers? Can users choose who their data is sold to? Why not now? When?

Great question. The short answer is yes, users can choose who their data is sold to. From Streamr’s technological perspective you can do it now. We’ve just implemented buyer whitelisting, so if Data Union admins want to restrict sales of the product to approved buyers, they can enable that. But for end users to have their preferences recognised, application builders must implement the buyer whitelisting mechanism at their end.

For example, you can easily imagine an interface asking whether you’re happy to sell your data to charities only, to charities and government organisations, or to anyone. At the backend, this would mean creating different buckets, or data products, on our Marketplace. It would be down to the Data Union builder to ensure they KYC’d potential buyers and whitelisted each one for only the right buckets of data: those where all the users who make up that real-time data stream were happy to sell to that sort of buyer.
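To make the bucket idea concrete, here is a minimal sketch in Python. It is not Streamr’s API: the tier names, the `Bucket` class and the `build_buckets` and `whitelist_buyer` helpers are all hypothetical, and the KYC result is assumed to arrive as a simple boolean from whatever process the Data Union builder runs.

```python
from dataclasses import dataclass, field

# Hypothetical consent tiers a member could pick in the app's interface.
TIERS = ("charity", "charity_or_government", "anyone")

# Assumption: which buyer types each tier admits.
ALLOWED_BUYER_TYPES = {
    "charity": {"charity"},
    "charity_or_government": {"charity", "government"},
    "anyone": {"charity", "government", "company"},
}


@dataclass
class Bucket:
    """One data product: the streams of members who chose this tier."""
    tier: str
    members: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)  # approved buyer ids


def build_buckets(member_choices):
    """Group members into one bucket per consent tier."""
    buckets = {tier: Bucket(tier) for tier in TIERS}
    for member, tier in member_choices.items():
        buckets[tier].members.add(member)
    return buckets


def whitelist_buyer(buckets, buyer_id, buyer_type, kyc_passed):
    """Whitelist a buyer on every bucket whose members accept that buyer type.

    Returns the list of tiers the buyer was granted access to.
    """
    if not kyc_passed:
        return []  # no KYC, no access to any bucket
    granted = []
    for tier in TIERS:
        if buyer_type in ALLOWED_BUYER_TYPES[tier]:
            buckets[tier].whitelist.add(buyer_id)
            granted.append(tier)
    return granted
```

In this sketch a charity that passes KYC ends up whitelisted on all three buckets, because every member accepted charities, while a commercial buyer only reaches the bucket of members who chose "anyone".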

So it’s easy enough to do for Data Union administrators and I believe it is something Swash is working on. It’s also something which needs to be moulded into Data Union governance. Our hope is that there is supra-national legislation to deal with Data Union governance which ensures these sorts of standards must be implemented and aren’t just a ‘nice to have’.

3). Are there any types of data that are off-limits to collect?

This would be a personal opinion but I think there are real issues with deeply personal data that is unlikely to change over time, for example your genetic code. Of course it turns out this is getting traded all the time anyway. But for Data Unions to get into that would disturb me, because we’re far too early into this game to know the consequences with data that is so high stakes.

Otherwise I’m fairly liberal. This data only gets collected if people want to have it collected. They have to take proactive actions to ensure that’s the case, like downloading an app that specifically tells you it’s there to collect this and that real-time data. (And in return – you get paid). That’s very unlike today, where that information is being taken from you. There are literally hundreds of apps – you’re bound to have one of them on your phone – that hoover up your location data. They literally know where you sleep, eat, walk, and go to the toilet. You think you’re searching for weather or finding out where the cheapest gas is, but in fact you’re supplying deeply private information.

4). Are there any general T&C agreements to consider for integrating, collecting and selling data from third-party devices, such as Fitbits or mobile phones?

If I understand your question right, then yes. There are few companies out there that don’t defend their data silos with a ring of lawyers! Many platforms are happy for third parties to integrate their services. To do that, those third parties often need access to user data. However, how that user data is then licensed for use is often set out in third-party developer T&Cs. (See, for example, subsection h. of Spotify’s T&Cs.)

However, the European Commission has stated that it wants to change this in two ways in the next few months. Firstly, it wants to ensure hardware manufacturers open up their device data. Secondly, it wants to revamp Article 20 of the GDPR to ensure that everyone has real-time programmatic access to their data from any platform. Hopefully that happens sooner rather than later, but when it does, it is going to be a huge revolution for Data Unions and the world.

A revamped Article 20 would allow people to port machine-readable data from their Netflix, LinkedIn, Google or Spotify accounts, for example, and send those real-time streams to a DU. Such data, stripped of personally identifying information, might be bought by production houses looking to create better TV programmes, by recruitment agencies, by developers looking to create better map applications, or by developers and musicians looking to create a cooperative music platform alternative.
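As a rough illustration of the "stripped of personally identifying information" step, here is a minimal Python sketch. The field names and the `strip_pii` and `to_stream_message` helpers are hypothetical, and a real platform export would look different. Note too that dropping named fields does not by itself guarantee anonymity, since the remaining values can sometimes still identify a person.

```python
import json

# Hypothetical direct identifiers; a real export would differ per platform.
PII_FIELDS = {"name", "email", "account_id", "ip_address"}


def strip_pii(record):
    """Return a copy of the record without the direct identifier fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def to_stream_message(record):
    """Serialise the cleaned record as JSON, ready to publish to a stream."""
    return json.dumps(strip_pii(record), sort_keys=True)
```

A Data Union app would run something like `to_stream_message` on each ported event before publishing it, so that identifiers never leave the member’s device.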

5). How are regional differences in data collection laws managed?

To be honest, this is still a bit of an unknown, and one for Data Union admins, rather than Streamr itself. Of course we have US and EU policy/legal experts to draw upon from our Data Union advisory network. And like with other parts of the Data Union building experience we’ll be looking to integrate basic best practice information into the general Data Union builder resource pool pretty shortly.

6). Do you have any guidelines on price setting and market value estimation?

Every data product is going to be worth something completely different, so it’s a bit pointless trying to give guidance based on guesstimates. Of course, that doesn’t mean that those numbers can’t be known. For most Data Union products, there will likely be an already existing (if nascent and under-the-table) market to draw pricing expectations from. We worked like that to help Swash price its data, and we would happily work with other viable projects to find those answers.

7). Can Data Unions themselves be sold (ownership transfer)?

That is a REALLY good question and one that I am concerned about. As one person put it in the market research we conducted at the start of the year, what’s the point in contributing to building a Data Union if it just gets bought out by Google?

So getting this right is going to take a multi-pronged strategy. Firstly, from the ground up, the Data Union builders themselves need to bind themselves into the right structures. Cooperatives (and Data Unions are a sort of platform cooperative) are meant to be owned by users. The issue with them is that it is always hard to raise enough investment from a small bunch of prospective users to compete with larger commercial enterprises. The cooperativist Nathan Schneider believes he has answers to this, which is why we’ve been working with him on those solutions.

Secondly, Data Unions need to be regulated within the next few years. In return for legally being the only type of organisation that should be able to handle the licensing of consumer data, they should be responsive to their users and have cooperative equity structures so companies can’t be easily sold without members saying so.

On the other hand, Data Unions should also be subject to the same provisions as any other platform when it comes to porting data. Since Data Unions can be fairly easy to build, users should be able to port their information and streams to new ones if serious governance issues do arise.

Is there more that tech can do to stop the equity from being captured by adversarial interests? Yes! That’s also why we are working with DAOstack and other DAO builders to get a Data Union DAO off the ground as a PoC. Now THAT is really exciting.

8). How do I prove the authenticity of Data Union data?

As a buyer? As we know better now, data buying does not, and is likely never to, happen at the click of a button. Organisations that purchase data spend tens of thousands of dollars on it and do their due diligence regardless of the tech on offer. It is up to Data Union builders to ensure their data products are secure, refined, and provide clean feeds of information, otherwise everyone loses. Swash, for example, improved on this in their latest release by introducing a Captcha button to deter bots. There are of course tools that we can integrate into the Streamr Data Union framework, but for now we anticipate that the open source community will fill this gap.

9). What level of support can Streamr provide in setting up a Data Union?

Building a Data Union is not an easy process. Partly it’s about the tech and we are absolutely here for that part of the journey. At the ground level, we are always improving our developer tooling, video tutorials and technical documentation. Our Growth team, including our Head of Dev Rel, Matthew Fontana, is also always happy to jump on a video call to guide you through any technical issue you may be having or to just chat through a business idea. Our developer forum is also there as a repository of information for past learnings.

We also have a Community Fund which can provide substantial financial support right the way from idea to the point where you get VC funding.

But as our first Data Union, Swash, has grown, we’ve realised that building up your user base and connecting with potential data buyers are also integral parts of building a Data Union, and ones that we have to support too. That’s why we have extra resources available, including groundbreaking market research and access to our Data Union advisory board, whose members not only believe in Data Unions but can provide advice and mentorship on growing your user base, legal and policy issues, and negotiating data sales.

10). Do Data Unions need to be open source?

Being open source is important to build trust with the users of the Data Union but is not a strict requirement of Data Unions. We’re glad that Swash has done this and we will always encourage other Data Union builders to do the same.
