Social Data: What Defines It? And Who Owns It?

Yesterday’s announcement got me thinking, and it certainly had a lot of us internally thinking.

But, truth be told, the kind of data services story (and the implications of data ownership in a social/cloud world) behind the Jigsaw deal has been on the minds of us social CRM thinking types for some time. In fact, at SugarCon last week I even brought the subject up during my panel after thinking of some open-ended questions with Mitch.

Mitch has distilled a lot of these ideas into an interesting blog post on his own blog – which enlists the ideas and feedback of two of the social CRM panel members: Esteban Kolsky and Sameer Patel. I’d like to personally thank them for their continued great insight, both in the panel and just in general.

I could easily have just re-run Mitch’s post here and been done with it. But I think there are some subtle differences between my ideas and his. Well, in truth I think that my ideas are probably more influenced by the Jigsaw deal than Mitch’s.

To that end – before we really start to discuss the “ownership” issue around social data, I think we must first define what we are talking about. Now, you may use tools like Jigsaw, Hoover’s, CrunchBase, ZoomInfo, InsideView etc. to get updated data – but is all that data really “social data?” I do not think so. And this, my friends, is where an ownership issue really matters most. When we are talking about company data – addresses, personnel info, phone numbers – these data sets are prone to use and, of course, misuse. Thus, ownership of this data, and how it is leveraged, is a big issue.

But is this what we mean by social data? I don’t really think so. I would in fact define social data as any data given up by individuals that is anonymous in nature (meaning – no address, phone, email etc. data involved) and only adds color to a company, a trend, topics, etc.

Adds color? What is this clown talking about?

To define color, let me use an analogy. When you watch baseball, many stats and numbers will be thrown around – either on screen or spoken. That is like the company data (in fact most of those numbers are the intellectual property of the Elias Sports Bureau). Now, all the cute stories and fluff of the “color commentators” (see where I’m going here) are more like the social data that is piling up in databanks around the universe.

Call it conversational, call it social…call it hearsay. The fact is, this type of highly unstructured and highly unqualified data is hard to really “own” – mainly because it only holds value in very specific circumstances. (think: Citing Ty Cobb’s lifetime batting average versus simply saying – Pete Rose was an awesome ball player! – While both are true, which one is an actual qualified fact and an own-able statistic?)

What I mean is that some social data will be highly valued…a lot is useless noise…and some is just, there. Now, the wise souls who sift through the garbage and find the valued gems thrown in with the banana peels and coffee grounds of social interactions – they can take that data and use it to do all the awesome stuff of social CRM: sentiment analysis, feedback management, keyword trending etc.

Am I saying no one owns social data? Not necessarily. I think there are implications around data mining when the origin of the data is a corporate-owned network (especially one that is not your corporation’s) or even a broad-based network like Facebook.

Really – I think we are just at the beginning of this type of discussion, because we are really just at the surface of doing cool stuff with all this information. I welcome any and all comments, your own definitions, name-calling, etc…

  1. Martin,

    Interesting perspective. I agree with parts of it, but there are parts which I think need further exploration. I do not think Social Data is limited to what I share anonymously. I believe Social Data to be what I am comfortable putting out there, knowing it is public. For example, when I write my blog, that is far from anonymous. Knowing that the opinion expressed is mine is far more interesting (not going to say valuable 🙂) than having it be anonymous.

    Take your example – If you say Pete Rose is an awesome ball player, well you are from Philly (enough said). If I say it, being a New York fan growing up, does it have the same meaning? Getting pragmatic, I know full well that what I put on my blog or on LinkedIn has a certain public face to it, and I do consider that social. Does social = public? That is also an interesting question.

    I have another post that has been rumbling around in my head – Is anonymous information really valuable – at all? It is the sister conversation to “Brave behind email”, the conversation of 5 years ago; now it is “Brave behind Social Media”.


  2. GREAT POINTS Mitch – I hadn’t really thought about that.

    I agree in retrospect, perhaps I was a bit off when I used the term anonymous. I think a broader – “non-Quantitative” or something like that would be a better term.

    But you make a very good point, and have my brain juices flowing….how do we create a measurement tool that takes subjective comments/sentiments – and then uses the non-anonymous data surrounding that information (from LinkedIn or Facebook profiles, the “about” section of a blog etc.) to create a relative value?

    Think about it – this is a valuable asset for any sentiment analysis tool – being able to better define the import behind a statement, taking in values such as context, background, influencer/advocate level etc. Think of it as getting NPS right…

