Big Data, the Deep Web, and Election 2012

In the days leading up to and following the election, there have been many articles about the role of Big Data in the election. Here are just a few:

The candidates used Big Data to target voters, New York Times blogger Nate Silver used Big Data to predict who would win, and the news networks used Big Data to announce result predictions. But what is Big Data, and how is it collected?

In politics, data can be the difference between winning and losing a campaign. Big Data helps candidates know where they stand in the polls and how they compare to their opponents; it helps them manage campaign budgets, harness media attention, and discover which demographics remain undecided.

What is Big Data?

In our What is Big Data whitepaper we go into more detail, but one working definition of Big Data is datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. Many of these datasets are hidden beyond the reach of popular search engine indexes, locked inside the website databases that make up the Deep Web.

How does BrightPlanet collect Big Data?

Big Data is nearly impossible to navigate without the proper tools. Even if you can find the data, harvesting, analyzing, and visualizing it at scale often require additional licenses, subscriptions, and money.

At BrightPlanet we have developed tools to harvest data at scale, curate it, and create understanding from that information within a single platform. We specialize in collecting open-source, unstructured Big Data at the scale of the Internet. Examples of content BrightPlanet can harvest at any scale include:

  • Any HTML websites
  • Deep Web websites (content available through a search box query)
  • Social Media (Facebook, Twitter, LinkedIn, etc.)
  • Blogs/Message Board posts
  • RSS feeds
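To make the last item concrete, here is a minimal sketch of what harvesting an RSS feed involves: pulling structured (title, link) pairs out of feed XML. This is an illustrative example using Python's standard library, not BrightPlanet's actual harvesting code; the feed content here is a hypothetical in-memory string, since a real harvester would fetch it over HTTP first.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(rss_xml: str):
    """Extract (title, link) pairs from an RSS 2.0 feed document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items

# Hypothetical feed content; a real harvester would fetch this over HTTP.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Campaign News</title>
  <item><title>Poll update</title><link>http://example.com/poll</link></item>
  <item><title>Debate recap</title><link>http://example.com/debate</link></item>
</channel></rss>"""

print(parse_rss_items(SAMPLE_FEED))
# → [('Poll update', 'http://example.com/poll'), ('Debate recap', 'http://example.com/debate')]
```

Harvesting at scale means running this kind of extraction continuously across thousands of feeds and pages, then normalizing the results into one searchable repository.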

How have political campaigns used Big Data?

At BrightPlanet we recently worked with a campaign to create a listening platform for political brand management and competitive intelligence. BrightPlanet created a Deep Web Content Silo to harvest, curate and analyze any mention of the political targets in a searchable, topic-specific repository.

BrightPlanet customized a daily report for campaign headquarters and also stored a copy of each webpage every time it was harvested. This snapshot history allowed the campaign to track every change to every page within the websites they had designated.
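One simple way to implement that kind of change tracking is to fingerprint each harvested snapshot and compare it to the previous one. The sketch below is a hypothetical illustration of the idea (the `PageTracker` class and its names are our own, not BrightPlanet's implementation):

```python
import hashlib

def content_fingerprint(html: str) -> str:
    """Hash a harvested page so successive snapshots can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

class PageTracker:
    """Stores one snapshot per harvest and reports whether the page changed.

    Hypothetical sketch: a production system would persist snapshots to
    storage and record harvest timestamps alongside each fingerprint.
    """
    def __init__(self):
        self.snapshots = []  # (fingerprint, html) per harvest

    def record(self, html: str) -> bool:
        """Save this harvest's snapshot; return True if the page changed."""
        fp = content_fingerprint(html)
        changed = not self.snapshots or self.snapshots[-1][0] != fp
        self.snapshots.append((fp, html))
        return changed

tracker = PageTracker()
print(tracker.record("<p>Meet the candidate</p>"))  # first harvest counts as a change → True
print(tracker.record("<p>Meet the candidate</p>"))  # identical snapshot → False
print(tracker.record("<p>Meet our candidate</p>"))  # page was edited → True
```

Because every snapshot is kept, the campaign could not only detect that a page changed but also diff any two harvests to see exactly what the edit was.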

Additional details on the specific campaign and case study can be found in our recent whitepaper on pages 7-9.


Photo: Vectorportal