Using BrightPlanet’s Compare Function to Analyze Data in Daily News

The fiery 2016 election cycle brought increased wariness and scrutiny of the media — and the news content we digest.

Misleading facts and deceptive memes were disseminated through news articles, blogs, and social media at ever-increasing volumes. In this new era of post-trust politics, it is more critical than ever to research and verify the authenticity of data.

And BrightPlanet’s services allow you to do so.

Monitor Breaking News

As a regular consumer of online news, you probably don’t realize how many edits a typical news story goes through before it reaches its final form.

When breaking news hits, media outlets post a quick blurb to get something online first. As the reporter gathers more information, they backfill the article with additional content.

In addition to the author expanding a story, news editors are typically re-wording and fact-checking simultaneously. It’s not unusual to see a news article go through three or more major edits, and even more minor changes.

The average user won’t notice these edits because they only see the most updated version of each page.

How does BrightPlanet know what’s going on behind the scenes? A powerful, yet seldom-utilized feature of our harvesting platform is the versioning of every harvested document.

A search engine, such as Google, is constantly indexing the most recent version of pages. This is useful to find the latest information, but what if you want to see an early version of a web page? You’re out of luck. Unless you have our help.

Harvest News Data

When BrightPlanet harvests a document for the first time, it is flagged as “new.” The next time BrightPlanet harvests the same URL, it runs a differential analysis to check whether any of the text has changed.

If the text is the same, the document gets an updated tag of “unmodified.” If the text changes, the document is tagged “modified.” All of these changes are visualized in the Dashboard – green text is added text; red text is deleted text. Let’s take a look at BrightPlanet’s Global News Data Feed and the “Compare” feature in action.
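To make the tagging logic concrete, here is a minimal Python sketch of the idea described above. It is not BrightPlanet's actual implementation: the function names `tag_harvest` and `word_diff` are hypothetical, and the word-level comparison uses Python's standard `difflib` module as a stand-in for whatever differential analysis the platform really runs.

```python
import difflib
from typing import List, Optional, Tuple


def tag_harvest(previous_text: Optional[str], current_text: str) -> str:
    """Tag a harvested document as 'new', 'unmodified', or 'modified'.

    Hypothetical helper: previous_text is None when the URL has never
    been harvested before.
    """
    if previous_text is None:
        return "new"
    if previous_text == current_text:
        return "unmodified"
    return "modified"


def word_diff(previous_text: str, current_text: str) -> Tuple[List[str], List[str]]:
    """Return (added_words, deleted_words) between two versions.

    Added words correspond to the green text in a compare view,
    deleted words to the red text.
    """
    old_words = previous_text.split()
    new_words = current_text.split()
    added: List[str] = []
    deleted: List[str] = []
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            deleted.extend(old_words[i1:i2])
        if op in ("replace", "insert"):
            added.extend(new_words[j1:j2])
    return added, deleted


if __name__ == "__main__":
    first = "the panel will question the nominee"
    second = "the panel questioned the nominee"
    print(tag_harvest(None, first))     # first harvest of this URL
    print(tag_harvest(first, second))   # text changed since last harvest
    print(word_diff(first, second))     # words to show in green and red
```

Comparing word by word rather than character by character keeps the output readable for news text, where edits tend to swap whole words or sentences.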

For this analysis, we’ll look at a Yahoo! News article on President Trump’s EPA pick. This article is interesting because it was originally posted as a preview story, but as the day progressed, it switched from future to past tense and was updated to include the events of the Senate confirmation hearings.

The article went through at least three distinct edits after being initially posted. In the sample below, you can see the first version on the left, compared to the fourth version on the right.

[Image: side-by-side Compare view of the article in BrightPlanet’s Global News Data Feed]

The headline started at “U.S. Senate panel to question Trump’s EPA pick over energy ties” and ended at “Trump EPA pick expresses doubts on climate, defends oil industry funding.” Through those changes, you can trace the shift in tone and tense.

The word count also increased by 169 words, from 627 to 796. The biggest increase happened during the second edit, when the site also added a second author to the byline.
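As a quick check on those numbers, the growth between two versions can be expressed as a small calculation. The helper name `word_count_change` is hypothetical; the counts are the ones quoted above.

```python
from typing import Tuple


def word_count_change(old_count: int, new_count: int) -> Tuple[int, float]:
    """Return (words_added, percent_growth) between two versions."""
    words_added = new_count - old_count
    percent_growth = 100 * words_added / old_count
    return words_added, percent_growth


if __name__ == "__main__":
    words_added, percent_growth = word_count_change(627, 796)
    print(f"{words_added} words added ({percent_growth:.1f}% longer)")
    # prints "169 words added (27.0% longer)"
```

In other words, the article grew by roughly a quarter between the two versions being compared.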

Analyze Data for Your Business

As a news reader, it might not benefit you to know about each and every version of a story, but consider the possibilities of tracking text changes in harvested data for your business.

BrightPlanet can help you find and analyze data.

Contact us to meet with a Data Acquisition Engineer or let us know what you are working on so we can show you how to capitalize on the web data available to you.