Over 13 years ago, we released a study revealing that search engines were missing the vast majority of online content, which sits in a portion of the Internet called the Deep Web. The study, titled “The Deep Web: Surfacing Hidden Value,” estimated that the Deep Web was 400-500 times larger than the Surface Web and that only 0.03%, or 1 in 3,000, Web pages were actually being indexed by a traditional search engine.
A lot has changed in the way the Web works in the last 13 years; so much, in fact, that the sheer size of the Web has made it nearly impossible to replicate the 2001 study. In today’s blog post, we look at why it has become impossible to accurately answer the question: ‘How big is the Internet?’
The Infinite Web
We often receive inquiries from customers, media, and curious individuals asking how big the Deep Web and the Internet are. These questions are now impossible to answer, because the Internet has turned into a rapidly expanding database that can only be classified as infinite. So how big is the biggest database in existence? The answer: nobody knows.
Causes of Infinity
Several factors contribute to the rapid growth of the Internet and to its classification as infinite. The two main factors we’ll cover in today’s post are:
- User-created content
- Personalization of the Internet
Social media, and the ability it gives anyone to create content, is the major contributing factor to the huge growth in the Internet’s size. Let’s take a look at one social media site as an example. Twitter, the microblogging site, reports an average of 271 million monthly active users who together send 500 million Tweets each day, or an average of about 5,700 Tweets per second. It’s also worth mentioning that the record for the most Tweets sent within a single second is 143,199.
Each individual Tweet creates a single unique Web page within the Twitter domain like this Tweet from BrightPlanet about our recent partnership: https://twitter.com/brightplanet/status/527114086556110849.
Within the single domain of Twitter, an average of 5,700 individual Web pages are added every second. More remarkable still, 500 million individual Web pages are added to Twitter.com every 24 hours.
By the time you finish reading this blog post, over 1 million unique Web pages will have been added to Twitter’s domain. Keep in mind that Twitter is only 1 of an estimated 500 million Web domains. Estimating the current size of the Web to even the nearest 100 million pages has already become impossible.
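As a rough sanity check on the Twitter figures above, the arithmetic can be sketched in a few lines of Python. This assumes the quoted averages hold at a constant rate and that each Tweet produces exactly one Web page, as described above; the variable names are ours and purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

# Figures quoted in the text (Twitter's reported averages).
tweets_per_day = 500_000_000
tweets_per_second = 5_700

# Daily volume implied by the per-second average.
implied_per_day = tweets_per_second * SECONDS_PER_DAY

# Per-second rate implied by the daily total.
implied_per_second = tweets_per_day / SECONDS_PER_DAY

# Time for one million new Tweet pages at the quoted per-second rate.
seconds_to_million = 1_000_000 / tweets_per_second
minutes_to_million = seconds_to_million / 60

print(f"Implied Tweets per day:    {implied_per_day:,}")
print(f"Implied Tweets per second: {implied_per_second:,.0f}")
print(f"Minutes to 1 million pages: {minutes_to_million:.1f}")
```

The two reported figures agree to within about 2%, and at this rate one million new pages take just under three minutes, which is roughly the time it might take to read a post of this length (an assumption about reading speed, not a measured value).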
Personalization of the Internet
You likely experience another cause of the Internet’s infinite size on a daily basis: the personalization of Web pages based on your viewing history, previous searches, and purchases.
Take a look at how the e-commerce site Amazon uses this method. An individual who recently searched for a Halloween costume sees custom results on Amazon’s website: related Halloween costumes are displayed on the pages they visit afterward.
By viewing this page, the individual is viewing an entirely unique Web page that is custom-made for them based on their previous activity. Each such page counts as its own unique page on the Internet, which adds to the overall size of the Web and makes the Internet truly infinite.
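To get a feel for how quickly personalized pages multiply, here is a minimal sketch. It is not a description of how Amazon actually builds its pages; the catalog size, the number of recommendation slots, and the function name are all hypothetical, chosen only to illustrate the counting argument. It requires Python 3.8 or later for `math.perm`.

```python
import math


def personalized_variants(catalog_size: int, slots: int) -> int:
    """Count the ordered ways to fill `slots` recommendation positions
    from `catalog_size` distinct products, with no product repeated."""
    return math.perm(catalog_size, slots)


# Hypothetical numbers: a small catalog of 1,000 products and a page
# showing 5 recommended items chosen from a visitor's activity.
variants = personalized_variants(catalog_size=1_000, slots=5)

print(f"Possible versions of one page: {variants:,}")
```

Even with these modest made-up numbers, a single page template has more than 10**14 possible versions, which is why counting personalized pages one by one is not practical.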
Even with the infinite size of the Internet, all sorts of industries are capitalizing on publicly available data by using our services. What content is your company missing every second because you are not leveraging Deep Web content?
Our subject matter experts are here to help you find it in the insurance, law enforcement, banking, luxury goods, pharmaceutical and financial service markets.
Request a demo today to speak directly with one of our Data Acquisition Engineers and dig into how these industries are finding, harvesting, and leveraging infinite Web data today, and how you can too.