Earlier this month, BrightPlanet’s Vice President of Business Development, Tyson Johnson, was invited to be a panelist at Citibank’s Big Data event in New York. The event brought together 50 of the world’s leading hedge funds with technology and Big Data experts from across a wide array of industries.
In conjunction with the event, Citibank released a white paper titled “Big Data & Investment Management: The Potential to Quantify Traditionally Qualitative Factors”. The white paper digs into Citibank’s newest findings in the Big Data space as they relate to investment managers. In this post, we’re going to explore two of those findings and how they relate to BrightPlanet, Big Data, and open source intelligence (OSINT).
One of the first findings in the white paper is the concept of datafication. Datafication is the idea that nearly everything is being turned into data that can be tracked and stored in a database.
Within the last 15 years, we’ve seen this concept come to life through the use of social media and the Web. Friendships are now stored as data in the form of Facebook connections and Twitter followers, something that couldn’t be tracked at scale until now.
People’s product preferences are turned into data through likes on Facebook pages and reviews on Yelp and Amazon. Everything is turning into data, and the white paper has stats to back it up: in 2000, an estimated 25% of the world’s stored information was digital; now less than 2% of all stored information is non-digital.
N = All
The second major finding is that the concept of sample sizes has been completely turned on its head because of advancements in technology and Big Data. The paper points out that traditional studies and research have used smaller samples of populations or datasets to infer what would happen at the aggregate level. The sample size (N) was just that: a sample. Until now.
Thanks to advances in data warehousing, storage, and processing, sample sizes no longer have to be samples. Whereas a traditional sample was a subset, N can now be all. You don’t have to look at a subset to explore what happens at the aggregate; you can collect and analyze the aggregate itself.
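To make the contrast concrete, here is a minimal Python sketch, not taken from the white paper, that compares estimating an average from a random subset with computing it over every record. The function names and the synthetic "review scores" are assumptions made purely for illustration.

```python
import random
import statistics


def estimate_from_sample(values, sample_size, seed=0):
    """Traditional approach: estimate the mean from a random subset."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return statistics.mean(sample)


def compute_from_all(values):
    """N = all approach: compute the mean over every record."""
    return statistics.mean(values)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical data: 100,000 synthetic review scores from 1 to 5.
    population = [rng.randint(1, 5) for _ in range(100_000)]
    print("estimate from 500 records:", estimate_from_sample(population, 500))
    print("computed from all records:", compute_from_all(population))
```

The sample-based figure is an approximation that shifts with the records drawn, while the full computation describes the collected data exactly. The trade-off is that the full computation requires storing and processing every record.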
Setting the sample size equal to all brings a complication: data is no longer as clean and easy to use as it once was. It no longer comes only from spreadsheets and structured databases with nicely organized columns and rows. Data comes from multiple sources in multiple formats, and data scientists have to be able to embrace all sorts of data regardless of size and structure.
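As a rough illustration of what handling multiple formats can involve, the Python sketch below reads the same kind of record from CSV and from JSON and maps both onto one shared shape. The field names (`author`/`user`, `text`/`body`) and the sample inputs are hypothetical, and this is not a description of how BrightPlanet processes data.

```python
import csv
import io
import json


def records_from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def records_from_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def normalize(record, source):
    """Map a source-specific record onto one shared shape.

    The field names used here are hypothetical examples.
    """
    return {
        "source": source,
        "author": record.get("author") or record.get("user") or "",
        "text": (record.get("text") or record.get("body") or "").strip(),
    }


# Hypothetical inputs: the same kind of information in two formats.
csv_text = "author,text\nalice, Great product \n"
json_text = '[{"user": "bob", "body": "Arrived late"}]'

combined = (
    [normalize(r, "csv") for r in records_from_csv(csv_text)]
    + [normalize(r, "json") for r in records_from_json(json_text)]
)
```

Once every source is mapped onto the same fields, the combined list can be analyzed as a single dataset regardless of where each record came from.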
How does this relate to OSINT and BrightPlanet?
The white paper further notes that these concepts, datafication and changing sample size, encourage the use of external data to help further advance Big Data projects. Adding external datasets from sources like the Web can greatly increase a project’s value, but these datasets are often difficult for companies, even those with data scientists, to manage internally.
That’s where BrightPlanet comes in. We gather (we say harvest) external datasets from all over the Web, convert them into usable data, and provide them to our users in the format they need for efficient analysis. Datafication and changing sample size are concepts that are not going away, and we are excited to be able to help companies leverage them with our data harvesting technology.
Provide us with the sources you want to harvest or the digital data you are looking to gather, and we’ll help you find and gather it.
Request a Demo
Interested in learning more? Sign up for a free demo of our technology and consultation on how you can leverage external datasets.