Harnessing Deep Web Intelligence for Law Enforcement

The World Wide Web gives law enforcement agencies an incredible opportunity to legally collect information. This information can include content about illegal activities and threats to public safety that is directly relevant to law enforcement and other public safety agencies.

Our BlueJay service is a Twitter crime scanner for law enforcement, but often agencies want to search additional sources online besides Twitter. In this post we’ll talk about:

  • Where these sources can be found
  • How to get the information from these sources
  • A Deep Web intelligence case study from the Detroit Crime Commission

Where can intelligence be found online?

At BrightPlanet, we specialize in harvesting Big Data from the Deep Web to create new intelligence. The Deep Web is the part of the Internet not accessible by link-crawling search engines like Google. The only way to access Deep Web content is to run a search within a particular website. We won’t go in depth on what the Deep Web is in this post, but if you want to learn more, download our Understanding the Deep Web in 10 Minutes whitepaper.

For law enforcement, BrightPlanet can harvest publicly available Deep Web and Surface Web content from thousands of sources, including:

  • Social Media (Twitter, Facebook, Blogs, Message Boards, etc.)
  • Consumer Classified Sites (Craigslist, Backpage, etc.)
  • Open Source Databases containing public records

The content is harvested into a Deep Web Intel Silo that is tailor-made for each law enforcement agency, giving you a completely customized data set.

Creating a Deep Web Intel Silo for Law Enforcement

Creating a Deep Web Intel Silo involves collaboration between the law enforcement agency and BrightPlanet’s Deep Web Investigators. Deep Web Investigators are BrightPlanet’s content managers who work directly with our clients to ensure that the content going into the Intel Silo is timely and relevant.

The process is continuous, but it can typically be broken down into two phases, each framed by a question:

  1. What types of publicly-available sources should be included in the Silo?
  2. What type of analytics, visualizations and entity tagging will help the end-user create the intelligence they need?

Identifying Sources for Harvest and Investigation

BrightPlanet works with subject matter experts from law enforcement agencies to first identify key publicly available sources. Examples include Twitter, Facebook, discussion boards, and forums. BrightPlanet’s Deep Web Investigators then work with agency members to establish the keywords used to search those websites and to filter the results further, ensuring only relevant documents are harvested. Once the initial harvests are set up, collection is scheduled to repeat at intervals ranging from every hour to every month so that the Silo stays up to date.
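As a rough illustration of the keyword filtering step described above, here is a minimal sketch in Python. It is not BrightPlanet's harvester: the `Document` class, the function names, and the sample documents are all hypothetical, and a real harvest would apply far richer matching than a case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical container for one harvested item."""
    source: str  # e.g. "twitter" or "forum"
    text: str


def matches_keywords(doc: Document, keywords: list[str]) -> bool:
    """Return True if any keyword appears in the text (case-insensitive)."""
    lowered = doc.text.lower()
    return any(kw.lower() in lowered for kw in keywords)


def filter_relevant(docs: list[Document], keywords: list[str]) -> list[Document]:
    """Keep only the documents that mention at least one keyword."""
    return [d for d in docs if matches_keywords(d, keywords)]


# Made-up sample documents for illustration only.
docs = [
    Document("twitter", "See you at the St. Mary's Fair tonight"),
    Document("forum", "Selling a used lawn mower"),
]
relevant = filter_relevant(docs, ["st. mary's fair"])
```

In this sketch only the first document survives the filter, which mirrors the goal of harvesting only relevant documents.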

Turning Data into Intelligence

After targeted Deep Web sources are harvested, BrightPlanet runs all the harvested content through an enrichment process. Law enforcement professionals again play a crucial role in identifying key entities to tag within each source. BrightPlanet automatically tags the names of people, companies, and places, but can also write custom rules that allow for the tagging of other entities. Some additional tagged entities could include:

  • Drug slang terms
  • Event names
  • Gang names
  • Department mentions
  • Phone numbers
  • Addresses
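To make the idea of custom tagging rules more concrete, here is a small Python sketch that assumes rules can be expressed as a regular expression (for phone numbers) plus term lists (for slang or group names). The pattern, the labels, and the placeholder terms are illustrative assumptions, not BrightPlanet's actual rule format.

```python
import re

# Assumed pattern for US-style phone numbers such as 313-555-0100 or (313) 555-0100.
PHONE_RE = re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}")

# Hypothetical term lists; a real agency would supply its own vocabulary.
TERM_LISTS = {
    "gang_name": ["example crew"],
    "drug_slang": ["molly"],
}


def tag_entities(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs found in the text."""
    tags = [("phone_number", m.group()) for m in PHONE_RE.finditer(text)]
    lowered = text.lower()
    for label, terms in TERM_LISTS.items():
        for term in terms:
            # Whole-word match so a term does not fire inside longer words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                tags.append((label, term))
    return tags


tags = tag_entities("Call 313-555-0100, Example Crew will be there")
```

Here `tags` would hold one phone number tag and one gang name tag, which an analyst could then use as search facets.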

Tagging documents is crucial when creating intelligence from large datasets. Tags let the end-user narrow the data down by searching on tagged terms. All these custom specifications are set up in the Deep Web Intel Silo Dashboard that the end-user uses to access the harvested information.

In addition to creating intelligence through entity tagging and the Intel Silo Dashboard, BrightPlanet’s technology platform allows plug-and-play integration with nearly any third-party analytics solution. One of those solutions is GeoTime.

GeoTime is a data visualization and analysis software application specializing in the display of events over time and focused on law enforcement. Capabilities include:

  • Instantly seeing movement, speed, meetings and communications
  • Automated pattern detection
  • Automated display of cell towers and sectors

Law Enforcement Deep Web Case Study: Detroit Crime Commission

In May 2013, federal and local law enforcement agencies were concerned about security at a local public fair in a Detroit suburb where violent youth had disrupted the event in previous years.

Numerous fairs in the metropolitan Detroit area had also experienced this problem. Gangs had threatened each other on social media (Twitter, Facebook, etc.) and engaged in physical altercations at these events. Many public event promoters considered ending the fairs out of concern that continued violence would threaten the safety of fairgoers.

By monitoring these conversations on social media, local law enforcement had determined there were credible threats made against the St. Mary’s Fair in Walled Lake, Michigan, a Detroit suburb. While working to identify the subjects threatening the fair, the local law enforcement agency turned to the Detroit Crime Commission (DCC) for assistance in identifying other threats made against the fair.

The first thing DCC did was search its Deep Web Silo for the term “St. Mary’s Fair,” and a viable threat against the fair was identified. DCC prepared an intelligence report and forwarded it to the local law enforcement agency.

The DCC Deep Web Intel Silo contains intelligence from publicly-available social media content specific to Detroit and keywords of interest. Additionally, it collects geo-located, publicly available Twitter content within 25 miles of Detroit.
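A radius rule like "within 25 miles of Detroit" can be sketched with a great-circle distance check. The Python example below uses the haversine formula; the reference coordinates for Detroit, the function names, and the test points are assumptions made for illustration, and the source does not say how the Silo actually performs this filtering.

```python
import math

EARTH_RADIUS_MILES = 3958.8
# Approximate city-center coordinates, assumed as the reference point.
DETROIT = (42.3314, -83.0458)


def haversine_miles(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def within_radius(point: tuple[float, float],
                  center: tuple[float, float] = DETROIT,
                  radius_miles: float = 25.0) -> bool:
    """Return True if the point lies within radius_miles of the center."""
    return haversine_miles(point, center) <= radius_miles


nearby = within_radius((42.3223, -83.1763))   # Dearborn, Michigan (approximate)
far_away = within_radius((41.8781, -87.6298))  # Chicago, Illinois (approximate)
```

A geo-tagged post would be kept when `within_radius` returns True for its coordinates and dropped otherwise.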

As a result of their investigation of the first subject, another juvenile was arrested on state charges of domestic terrorism. The investigation of the first subject identified by the DCC is ongoing and it is anticipated that he will also be charged with domestic terrorism.

A third person of interest was stopped and investigated at the fair after an additional intelligence report, published by DCC from information found in the Deep Web Silo, was disseminated to law enforcement at the event. The third person of interest was released, but the investigation most likely served as a warning. Knowing that law enforcement had positively identified them and was aware of their presence, this person may have changed their plans to disrupt the fair.

Your Deep Web Intel Silo

Interested in more information about a Deep Web Intel Silo for your law enforcement or security agency? Sign up for a free consultation with one of our Deep Web Investigators who specialize in law enforcement here.




Rafael Chacon Photography