Data Mining and Tracking 102

Welcome to the Fast-Air Tech Talk newsletter. The Tech Talk newsletter is a free service for all Fast-Air customers. Please suggest newsletter topics.

Being anonymous and protecting privacy are not the same.

Being anonymous means being unrecognizable such that correlating information is impossible or challenging.

In everyday life many people know one another. They are not anonymous. How much each person knows about others is an element of privacy. Most people do not reveal everything about their lives to other people. Each day people decide what personal information to reveal to others. This is the core of privacy.

Being anonymous on the web is possible but is not the same as retaining some degree of privacy. Protecting privacy is much like ogres and onions — there are complicated layers involved.

Some people need anonymity, but most people do not have a great need or desire for it. Many people are instead concerned with retaining some degree of privacy, and dignity.

Without significant effort toward being anonymous, avoiding all data mining and tracking probably is impossible. The goal then is to limit the collection or skew the collected data.

Every computer or router connected directly to the web is assigned a public IP address. Without appropriate precautions, that IP address is an identification tag, much like a house number.

Each connected device also has a private IP address that is unique within its local network. With many ISPs, several devices each have their own private IP address but share the same assigned public address. This sharing is a result of the limited supply of public addresses, which requires using a technology called Network Address Translation (NAT).
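For readers who like to see the idea in code, here is a minimal sketch of address sharing. It is a toy model written for this newsletter, not how any real router is programmed. The class name ToyNat, the port numbers, and the addresses (taken from ranges reserved for documentation and home networks) are all assumptions for illustration.

```python
import ipaddress


class ToyNat:
    """Toy model of a NAT router: many private addresses, one public address."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip   # the single address the outside world sees
        self.next_port = first_port  # next unused port on the public side
        self.table = {}              # (private_ip, private_port) -> public port

    def translate(self, private_ip, private_port):
        """Return the (public_ip, public_port) pair used for an outgoing connection."""
        if not ipaddress.ip_address(private_ip).is_private:
            raise ValueError("expected a private address")
        key = (private_ip, private_port)
        if key not in self.table:
            # First time this device and port are seen: reserve a public port.
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.table[key])


nat = ToyNat("203.0.113.5")
laptop = nat.translate("192.168.1.10", 51000)
phone = nat.translate("192.168.1.11", 51000)
print(laptop)  # ('203.0.113.5', 40000)
print(phone)   # ('203.0.113.5', 40001)
```

Both devices appear to the outside world under the same public address. Only the port number, which the router picks, tells the two connections apart.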

Having a shared public IP address provides a nominal layer of privacy protection.

NAT provides only a thin layer of privacy protection. Personal web usage habits reveal much about users. Common web browser configurations can leak a computer’s private IP address, which helps single out a person despite sharing a common public IP address.

There are three fundamental ways to protect privacy online. Each of these methods includes encryption. Each method provides a different type of protection. Each method has advantages and disadvantages.

Encryption is a healthy step toward protecting privacy. People with direct access to encrypted network traffic can view packet streams, but they are unable to view the actual contents because of the encryption. An encrypted connection limits what is knowable.

An important technical rule about Tor and a VPN is that their encryption always ends when network traffic leaves the final exit node of the Tor system or the VPN. The only way to keep the contents encrypted thereafter is to encrypt the information first at the user’s computer. This important distinction often is forgotten or not understood.

The only way to eliminate tracking through IP addresses is to use anonymity tools such as Tor or a VPN.

To remain anonymous online, users must refrain from using personal accounts and services and must use aliases rather than given names.

Browsing the web while being anonymous requires learning to use Tor. Tor is designed to hide a user’s original IP address.

Tor works by relaying connections through multiple servers. These multiple hops add latency, which means browsing the web with Tor tends to be slow.
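The relaying idea can be sketched in a few lines. This is only a toy model: it uses nested wrappers in place of real encryption, and the relay names and function names are made up for illustration. It is not Tor's actual protocol.

```python
def wrap(payload, relays):
    """Wrap the payload in one layer per relay, innermost layer for the last relay."""
    packet = payload
    for relay in reversed(relays):
        packet = {"for": relay, "inner": packet}
    return packet


def peel(packet, relay):
    """A relay can remove only the layer addressed to it."""
    if not isinstance(packet, dict) or packet.get("for") != relay:
        raise ValueError(f"{relay} cannot open this layer")
    return packet["inner"]


relays = ["entry", "middle", "exit"]
packet = wrap("GET http://example.com/", relays)
for relay in relays:
    packet = peel(packet, relay)  # each hop strips exactly one layer
print(packet)  # GET http://example.com/
```

Each relay sees only its own layer. Note what comes out at the exit: the original request, exactly as the user's computer sent it. If that request was not already encrypted by some other means, the exit relay and everything after it can read it.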

For most users Tor is intended only for encrypting and anonymizing the connection from a web browser. Tor can encrypt other forms of network traffic generated outside the web browser, such as from email, chat, or torrent clients, but requires additional computer skills.

Yet even when limiting usage to the web browser, various features of a web browser can leak information. The most notorious culprits are JavaScript, when enabled, and various browser add-ons or plugins.

The bottom line is that using Tor correctly requires patience and computer savvy.

A VPN encrypts all network traffic and not just web browser connections. With a VPN, end-point computers, servers, and web sites do not see the original IP address of the user’s computer. Instead the end-point computer sees the IP address of the VPN computer or service.

Using a VPN service provides some protection against data collection but often only moves the goal posts. The same game is being played. A fixed IP address is still involved. Remember that people who provide VPN services have to use an ISP too. There is no way to know whether a VPN provider is trustworthy, is tracking users, or is selling data, or whether the provider’s ISP is doing likewise. There is no way of knowing whether VPN providers or associated ISPs are modifying packet streams with special headers or zombie cookies.

Both Tor and a VPN provide a degree of anonymity and privacy. The trap is that usage habits often expose a person’s identity. For example, connecting to a Facebook account through Tor or a VPN immediately identifies the person to Facebook servers because only that user knows the account password. The Facebook servers collecting user data are not fooled. The only thing the Facebook servers do not know is the user’s actual physical location.

The same is true of email accounts, such as those offered through Gmail or Yahoo.

Using Tor or a VPN in this manner prevents an ISP from knowing the connection contents but does not protect identity or fully provide privacy end-to-end. With such use cases the only benefit of using Tor or a VPN is the encryption. The computers involved still reveal who is doing the connecting.

One caveat with using a VPN is that many devices that connect to the web, such as various router models, do not support connecting to a VPN.

A Tor or VPN connection does not prevent an ISP from collecting data through DNS usage when the user still connects to the ISP’s DNS servers. In technical parlance this is called DNS leaking. Anybody using Tor or a VPN needs to learn how to avoid DNS leaking.
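As a rough first check, a user can look at which DNS servers the computer is configured to use. The sketch below parses text in the style of a Linux resolv.conf file and lists any resolver that is not on an expected list. The function names, the sample addresses, and the idea of an "expected" set are assumptions for illustration. This is not a complete leak test, because many systems route lookups through a local helper that hides the real upstream server.

```python
def parse_nameservers(resolv_conf_text):
    """Return the resolver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def unexpected_resolvers(resolv_conf_text, expected):
    """List configured resolvers that are not in the expected set."""
    return [s for s in parse_nameservers(resolv_conf_text) if s not in expected]


# Sample file contents, not read from a real system.
sample = """\
# hypothetical configuration
nameserver 192.0.2.53
nameserver 10.8.0.1
"""
print(unexpected_resolvers(sample, expected={"10.8.0.1"}))  # ['192.0.2.53']
```

In this made-up example the second address stands in for a resolver reached through the VPN. The first address is the one worth investigating, since lookups sent there would bypass the VPN's resolver.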

Another caveat is that using Tor or a VPN service might prevent using online services such as Netflix or Hulu. The servers of these types of services often are designed to block access from Tor or from any known VPN. Other video streaming sites might similarly be affected.

The result is that using Tor or a VPN is challenging for many nontechnical users.

The third method of encryption is HTTPS. This is the easiest encryption to use. Unlike Tor or a VPN, HTTPS is limited to a site-to-site encrypted connection. Only that specific connection is encrypted.

A caveat is not all web sites support HTTPS.

Using HTTPS prevents ISPs from viewing content but does not prevent them from learning which sites a user connects to.

With all three methods the only data an ISP can collect is that the customer is connecting to another computer or server. With HTTPS alone, that data includes which specific sites the customer connects to.
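The difference between hiding content and hiding the destination can be sketched as follows. This is a simplified model and not a packet capture. The function name is made up for illustration, and the model assumes the host name is exposed through the DNS lookup and the start of the encrypted connection, which is typically the case.

```python
from urllib.parse import urlsplit


def visible_to_network_observer(url):
    """Rough model of which parts of a URL someone watching the network can read."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # The host name is still exposed; the page path and query are encrypted.
        return {"host": parts.hostname, "path": None, "query": None}
    # Plain HTTP: everything travels in the clear.
    return {"host": parts.hostname, "path": parts.path, "query": parts.query}


print(visible_to_network_observer("https://example.com/search?q=private"))
print(visible_to_network_observer("http://example.com/search?q=private"))
```

With HTTPS the observer learns only that the user visited example.com. With plain HTTP the observer also learns the page requested and the search terms.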

Although more of a security issue than a privacy issue, closely related to HTTPS is SSL/TLS for email. Users who connect to email accounts using webmail should do so only when the service uses HTTPS. People using email clients need to use an SSL/TLS connection to protect passwords.

Email was never designed to be truly secure or private. Without encryption, emails travel through the Internet in plain text and can be intercepted and viewed at any server relay. With plain text there is no privacy protection. This is true even for people using webmail and HTTPS. While the connection to a mail server is encrypted when using HTTPS, once the email leaves the mail server the messages may be relayed in plain text, and they are stored in readable form on each mail server. Likewise for email clients, the SSL/TLS component encrypts only the connection between the client and the mail server, which protects the password exchange but not the message after it moves on.
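For the curious, here is a minimal sketch of what an SSL/TLS connection from an email client looks like, using Python's standard library. The function names and the choice of IMAP on port 993 are assumptions for illustration. The second function is shown only for reference and is not called, because it needs a real server and account.

```python
import imaplib
import ssl


def make_tls_context():
    """TLS settings that verify the server's certificate and host name."""
    return ssl.create_default_context()


def open_mailbox(host, user, password, port=993):
    """Connect to an IMAP server over TLS, then log in (not called here)."""
    conn = imaplib.IMAP4_SSL(host, port, ssl_context=make_tls_context())
    conn.login(user, password)  # the password travels inside the TLS connection
    return conn


context = make_tls_context()
print(context.check_hostname)  # True
```

The encryption here covers only the hop between the email client and that one mail server. It says nothing about how the message travels afterward.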

Adding full end-to-end encryption with email is possible but is technically challenging for most people.

There is one additional method to protect privacy and reduce tracking. This method does not include encryption. That method is to block access to web sites that are designed to track. Many people block access through ad-blocking software. Another method is using a hosts file to block access to those web sites.
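A hosts file is just a text file with one address and one name per line. The sketch below builds such lines from a list of domains, pointing each at 0.0.0.0, an address that leads nowhere. The function name and the domain names are made up for illustration. Where the hosts file lives, and how to edit it safely, depends on the operating system.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build hosts-file lines that send each listed domain to a dead-end address."""
    cleaned = {d.strip().lower() for d in domains if d.strip()}  # remove duplicates
    return "\n".join(f"{sink} {domain}" for domain in sorted(cleaned))


print(hosts_entries(["tracker.example", "ads.example", "Tracker.example"]))
# 0.0.0.0 ads.example
# 0.0.0.0 tracker.example
```

Once such lines are in the hosts file, the computer never contacts those sites, so they have nothing to collect.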

Technical solutions are not magic. Most people expose who they are through usage habits. Most people expose who they are by freely sharing personal information, such as through Facebook posts and search engine queries. Anyone unwilling to change usage habits should not expect a high degree of privacy while online.

There are ways to protect privacy. There is no simple all-in-one solution. No silver bullets. Protecting privacy online requires sweat equity. Some protections are painless. Other methods require sacrificing convenience to some degree. More to follow.

Technical trivia: In the early days, the World Wide Web was called the World Wide Wait. Although the Internet has existed since the 1960s, usage was mostly limited to private networks, such as those of government and university users. The World Wide Web changed that, but the hardware infrastructure did not yet exist for millions of users. Almost everybody connected using a telephone modem, which was slow. Today many people again call the web the World Wide Wait because of all the useless bloat that is added to web sites: advertisements, videos, scrolling text, etc.

Family time: In the company name “3M,” from what words is the name originally derived? Think you know? Search the web.

Next issue: Web Browser Security and Privacy

