Twitter shakes up the big data supply chain: What are the privacy law implications?

Author: Jillian Friedman, lawyer and student DRT6929E

Some big news came out of Twitter late Friday night. The social media behemoth announced it will be cutting off “firehose” access to its data and will no longer license its stream of half a billion daily tweets to third-party data resellers. From here on out, Twitter will be using its own in-house data analytics team, a predictable shift considering the company’s 2014 acquisition of Gnip.

The move revolves around the unfiltered, full stream of tweets and their associated metadata, known as “firehose data”. The Gnip firehose product provides access to every tweet containing a certain link or user mention, allowing businesses to capture and analyze every activity and every publicly available tweet ever made through full real-time streams of public data from social networks.

The announcement marks a shift in the supply chain of big data. Many companies use raw Twitter data for commercial purposes such as building products and analyzing business practices. Twitter will now have greater control over this ecosystem by requiring these companies to deal with it directly. Prior to the change, mega data resellers such as DataSift would provide data analytics services to thousands of businesses, who would then serve thousands more. Twitter will be terminating its agreements with third-party resellers of Twitter firehose data. Instead, the company says, its data customers will receive raw data for commercial use directly from Twitter.

Twitter now has one less privacy issue to worry about. By consolidating and controlling its mega data, Twitter removes a link in the data transfer chain and, with it, an additional risk of data breach. By keeping the analysis in house, Twitter both creates a revenue stream from selling data analytics and eliminates one extra party with whom it would otherwise share users’ personal information, and from whom it would have to require the maintenance of certain privacy standards.

Big data is the union of the ever-increasing volumes of information generated through online activities and the ever more sophisticated technologies that enable analysis of that data for commercial purposes. The knowledge acquired from big data analysis can be an extremely powerful tool for business. Companies are investing in big data analytics in the hope of reaping the same advantages that other businesses, such as the company behind the Candy Crush game, have gained from big data insights. Big data also triggers privacy law concerns. As the industry becomes more sophisticated and profitable, businesses are re-orienting their business models, as Twitter did a few days ago.

The other consequence of this growing industry is that data analytics services are finding new types of data to analyze. Without Twitter to rely on, some firehose data resellers will be shifting their focus to other types of data, such as that derived from Facebook. One buzzword that has begun to pop up more and more is “topic data”, a new type of data with an allegedly broad range of applications.

The most important source of topic data is Facebook. Topic data covers what audiences are saying on Facebook about events, brands, subjects and activities. According to Facebook, this data is shown to marketers in a way that keeps personal information private. Topic data analytics can be used to inform marketing decisions on Facebook and other channels. An example given by Facebook is that, with topic data, a business selling hair products can see the demographics of people talking about the effect of the weather on their hair. The availability of this type of information has caused great excitement in the Internet marketing industry, because it means that marketers can start analyzing consumer dynamics better and earlier in the product research phase.

Topic data may be extremely useful for Internet marketers, as long as privacy protections can be properly implemented and maintained. According to the data analytics businesses involved, topic data will maintain the privacy of personal information because the information is anonymized and aggregated. To begin with, Facebook has said that topic data will only be available to partners in the US and the UK. It’s not clear whether the topic data made available to these companies will be derived from Facebook users located in other jurisdictions. Tracking and profiling activities also raise uncertainties, namely around the issue of personally identifying information. Whether or not information can be associated with an identifiable individual is a triggering event for the application of privacy laws, such as PIPEDA. These ambiguities will likely extend to topic data, insofar as “it is not always clear at what point an online profile may actually be associated with an identifiable individual”. We will have to keep a close watch on what type of data “topic data” includes.


This content has been updated on April 30, 2015 at 21 h 49 min.