Stage 3: Extracting Intelligence out of Big Data from the Deep Web

Welcome to the third and final installment of our three-part blog series on how data goes from the Deep Web to actionable intelligence. In the first installment, we learned about harvesting unstructured content/data at large scale from the Deep Web. In the second posting, we then learned about the normalization and enrichment of that data to prepare it for analytics.

Today we are going to learn about the analytics.

The first analytics tool BrightPlanet offers end users is BrightPlanet’s Intel Silo Dashboard. The dashboard offers a number of features that help you find the answers your team is looking for, with a heavy focus on search capabilities.

Advanced Search Capabilities

Faceted searching is the ability to break down the results of a search into specific categories so you can more easily find what you are looking for. More than likely, you use faceted searching on a daily basis. On shopping websites you can sort shoes by their brand, appliances by their color, and cars by their price range. On search engines like Google and Bing, you can refine your results by the date they were indexed, by the language the page is in, and by the type of web page (news, blog, etc.). Imagine being able to do faceted searching on any type of web data set with any type of categories. With BrightPlanet and Rosoka, you can.

The ability to pair Rosoka custom entity tagging technology (check out our previous post to revisit that technology) with faceted search technology significantly increases the value of the extracted data for BrightPlanet users – it gives users the ability to tag documents based on any category they wish.


Pharmaceutical companies want to see which domains are selling a certain drug.

  • Custom Tags: Drugs / Domains / Pricing / Dosage

Health researchers want to see the top companies and people receiving grants (shown right).

  • Custom Tags: Grant Type / Companies / People

Human resources organizations want to see the top companies hiring Java developers.

  • Custom Tags: Hiring Company / Job Qualifications / Job Titles
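
To make the idea concrete, here is a minimal Python sketch of faceted search over documents that have already been tagged. It is an illustration only, not BrightPlanet's or Rosoka's implementation: the document structure, the sample data, and the helper names `refine` and `facet_counts` are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical documents, already tagged by an entity-tagging step.
# The facet names mirror the "Custom Tags" in the hiring example above.
DOCS = [
    {"url": "https://example.com/a",
     "tags": {"Hiring Company": ["Acme"], "Job Titles": ["Java Developer"]}},
    {"url": "https://example.com/b",
     "tags": {"Hiring Company": ["Acme"], "Job Titles": ["Java Developer"]}},
    {"url": "https://example.com/c",
     "tags": {"Hiring Company": ["Globex"], "Job Titles": ["Java Developer"]}},
    {"url": "https://example.com/d",
     "tags": {"Hiring Company": ["Globex"], "Job Titles": ["Data Analyst"]}},
]


def refine(docs, facet, value):
    """Keep only the documents tagged with `value` under `facet`."""
    return [doc for doc in docs if value in doc["tags"].get(facet, [])]


def facet_counts(docs):
    """Count how many documents carry each value, grouped by facet."""
    counts = defaultdict(Counter)
    for doc in docs:
        for facet, values in doc["tags"].items():
            # set() so a value repeated within one document counts once.
            counts[facet].update(set(values))
    return counts


# Which companies appear most often in documents about Java developers?
java_docs = refine(DOCS, "Job Titles", "Java Developer")
top_hiring = facet_counts(java_docs)["Hiring Company"].most_common()
print(top_hiring)  # [('Acme', 2), ('Globex', 1)]
```

Each refinement narrows the result set, and the counts for every other facet are recomputed over what remains. That is the same pattern a shopping site follows when you filter by brand and the price ranges update.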


OpenPlanet Platform – When you need more than just search

For many customers, the faceted search and light analytics features of the Deep Web Intel Silo Dashboard are all they need to work with harvested content. However, some customers need more than advanced search capabilities: they need Big Data analytics. For those customers, BrightPlanet has developed the OpenPlanet Platform.

The OpenPlanet Platform is based on a simple workflow that completely separates the harvesting and analytic components of data collection and analysis. Because the analytic components need no knowledge of where the data came from, BrightPlanet can easily swap in different analytic technologies. The same separation lets customers integrate multiple datasets, not just harvested web data, with multiple analytic technologies in one workflow without significant development.
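
The separation described above can be sketched in a few lines of Python. This is a hedged illustration of the general design idea, not the OpenPlanet Platform's actual code or API: the `run_workflow` function, the two data sources, and the analyzers are hypothetical names invented for this example.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List

Document = Dict[str, str]
Source = Callable[[], Iterable[Document]]
Analyzer = Callable[[List[Document]], object]


def harvested_web() -> List[Document]:
    """Hypothetical stand-in for harvested web data."""
    return [
        {"source": "web", "text": "grant awarded to Acme"},
        {"source": "web", "text": "Globex hiring Java developers"},
    ]


def internal_records() -> List[Document]:
    """Hypothetical stand-in for a customer's own dataset."""
    return [{"source": "internal", "text": "Acme contract renewed"}]


def run_workflow(sources: List[Source],
                 analyzers: Dict[str, Analyzer]) -> Dict[str, object]:
    """Collect documents from every source, then run every analyzer.

    Analyzers only ever see a list of documents; they never call a
    source directly, so either side can be swapped independently.
    """
    docs = [doc for source in sources for doc in source()]
    return {name: analyze(docs) for name, analyze in analyzers.items()}


results = run_workflow(
    sources=[harvested_web, internal_records],
    analyzers={
        "doc_count": len,
        "word_count": lambda docs: sum(len(d["text"].split()) for d in docs),
        "by_source": lambda docs: Counter(d["source"] for d in docs),
    },
)
print(results["doc_count"], results["word_count"])  # 3 11
```

Adding a new dataset means appending one function to `sources`, and adding a new analytic means adding one entry to `analyzers`. Neither change touches the other side of the workflow.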

Discover some of the analytic technology possibilities in the next section.


Looking for a Big Data Solution?

Whether every word in this post made sense to you or you just know you are interested in what Big Data could mean for your organization, sign up to schedule a free demo with one of our Deep Web Investigators to see what might work for you.

Also, download our free whitepapers on Big Data and the Deep Web.



Photo: Search Engine People Blog