How Anonymization on the Web Works
Not using anonymization when accessing your competitor’s Web pages can turn you into competitive intelligence. In our last post, we covered how this happens.
The question we didn’t answer is how anonymizing systems work. In the following technical post, we’ll uncover how anonymizing systems, including BrightPlanet’s, keep you anonymous on the Internet.
How Anonymizing Works
Anonymizing services work in two main ways. The primary one is the use of a proxy server. Put simply, a proxy server acts as an intermediary between your machine and the websites you are accessing.
An analogy would be having a friend pass a note to your middle school crush anonymously instead of passing it yourself. The friend is the proxy server in this analogy. This means that webmasters no longer see an IP address or machine affiliated with your company accessing their Web server; they see the proxy server instead.
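To make the intermediary idea concrete, here is a minimal Python sketch that sends a request through a proxy using only the standard library. The proxy address is a placeholder from a reserved documentation range, and the function names are made up for illustration; this is not BrightPlanet's implementation.

```python
import urllib.request

# Hypothetical proxy address (reserved documentation range).
# Replace it with the address of a proxy you actually control.
PROXY_URL = "http://203.0.113.10:8080"


def build_proxied_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url=PROXY_URL, timeout=10):
    """Fetch url through the proxy and return the raw response body.

    The target Web server sees the proxy's IP address as the client,
    not the address of the machine running this script.
    """
    opener = build_proxied_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling fetch_via_proxy("http://example.com/") would only succeed with a real, reachable proxy in place of the placeholder address.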
The second tactic, often deployed alongside a proxy, is cycling IP addresses on a periodic basis. Cycling IP addresses allows for stealth when harvesting data from websites: servers looking for robots do not see the same IP address over and over, so they are less likely to block the traffic.
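Periodic cycling can be sketched as a small rotator that moves to the next address once a fixed interval has elapsed. The pool addresses, the class name, and the interval are all assumptions chosen for illustration; a real system would get its addresses from a cloud provider or a commercial service.

```python
import itertools
import time

# Hypothetical proxy pool (reserved documentation addresses).
PROXY_POOL = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


class ProxyRotator:
    """Hand out one proxy at a time, switching after a fixed interval."""

    def __init__(self, proxies, interval_seconds, clock=time.monotonic):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)  # loops over the pool forever
        self._interval = interval_seconds
        self._clock = clock  # injectable so the schedule can be tested
        self._current = next(self._cycle)
        self._switched_at = clock()

    def current(self):
        """Return the active proxy, advancing first if the interval is up."""
        now = self._clock()
        if now - self._switched_at >= self._interval:
            self._current = next(self._cycle)
            self._switched_at = now
        return self._current
```

A harvester would call rotator.current() before each request and route that request through the returned proxy, for example with ProxyRotator(PROXY_POOL, interval_seconds=300).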
Our Anonymization System
BrightPlanet has built its own anonymizing proxy system that uses both of these methods. Our solution leverages the Amazon Web Services (AWS) cloud to run instances that cycle IP addresses on a scheduled basis. The entire solution relies on features Amazon provides within its cloud platform. Using AWS further separates Web traffic from our data harvest engine, because the IP addresses behind that traffic all resolve to Amazon rather than to us.
Using an Amazon cluster does have some downsides, though. Since many bots operate from Amazon’s servers, some services, notably Craigslist, block all Amazon IP addresses outright, which prevents access to those sites from our cluster.
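One plausible way a site could implement such a block is to check each client address against a list of network ranges. The sketch below is an assumption about the general technique, not a description of what any particular site does, and the ranges are reserved documentation networks rather than real Amazon ranges.

```python
import ipaddress

# Hypothetical blocklist (reserved documentation networks, not real
# cloud provider ranges).
BLOCKED_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_blocked(client_ip, blocked=BLOCKED_RANGES):
    """Return True if client_ip falls inside any blocked network range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in blocked)
```

Because the check works on whole ranges, cycling between addresses inside a blocked range does not help; every address in the range is rejected.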
Building your own anonymization proxy server requires access to a large supply of IP addresses and the ability to programmatically change or cycle them periodically. Both can be very difficult to obtain on premises.
Alternatively, there are commercial services that provide a more thorough, or simply different, anonymization service than BrightPlanet’s AWS cluster. Commercial services often provide access to thousands of IP addresses from around the world, guaranteed IP re-use policies, and stealthier cycling or re-use of IP addresses. These services come at a significant cost.
Figure Out the Best Anonymization Method For You
If you are keeping an eye on your competitors on the Web, you should be anonymizing your searching to prevent yourself from becoming competitive intelligence.
Want to learn more about how anonymization might benefit your competitive intelligence gathering? Request a consultation below.