
The Pitfalls and Use of VPN and TOR
AKA Privacy Online

A friend of mine recently asked me about the TOR network because of a PC World article he had read. First, I’d like to state that the article actually has a lot of good general information, covering many common security problems along with time-proven solutions that have been useful to millions of people (VPNs, privacy/incognito mode in browsers, cookie management, bugmenot, etc). However, I think the article does not cover the realities of TOR and VPNs at all, so I figured I’d write up an article on these topics that I could share with my inquisitive friend and anyone else who is interested.


I used TOR back in the early 2000s and it’s not all that the article cracks it up to be. Basically, it securely routes your connection through a few other people’s internet connections (we’ll say 3 for example’s sake). The computers/nodes between you and the “exit node” in the route can’t read your traffic because it’s all encrypted, but the final person/computer (the “exit node”) literally sees, in clear text, 100% of your data as if you were sending/receiving it out of your own machine without the TOR network. So if you are doing anything that isn’t natively encrypted (instant message chatting without OTR, going to a site via http instead of https) the exit node can snoop on everything you do. They can even see the domain (not the entire URL) of WHERE you are going with https [1]. If I recall, you can’t really control the exit node as, I think, it is picked semi-randomly from everyone in the world running a TOR router node.


So all TOR really does for you is keep the servers you connect to from knowing where you are coming from. So one day a server may think you are coming from Michigan, and another day, from Singapore. And honestly, for most people that isn’t even really all that important. Do you really care if servers you go to on the internet know you are coming in from your home town? (They generally can’t pinpoint further than that without getting a warrant and asking the ISP). All that's really done with this data is correlation: seeing that someone from this IP address who went to one website also went to another website.


And even worse, TOR is known for being ungodly slow. Back when I was using it I was LUCKY to get 15KB/s throughput on my connections, and I doubt it has changed much (though you could get lucky too with your “randomly” chosen connection nodes). This means a normal webpage (~1.5MB for argument’s sake) would take ~100 seconds, or a bit under 2 minutes, to download instead of the 1-2 seconds normal broadband users see.
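The arithmetic behind that estimate can be sketched in a few lines of Python. The 1.5MB page and 15KB/s figures come from the paragraph above; the 1000KB/s broadband rate is only an assumed round number for comparison, and the function name is made up for this example.

```python
def download_seconds(page_megabytes: float, throughput_kb_per_s: float) -> float:
    """Seconds needed to move a page at a steady throughput (1 MB = 1000 KB)."""
    return page_megabytes * 1000 / throughput_kb_per_s


# Figures from the post: ~1.5MB page at ~15KB/s over TOR.
tor = download_seconds(1.5, 15)
# Assumed broadband rate of 1000KB/s, purely for comparison.
broadband = download_seconds(1.5, 1000)

print(f"TOR: {tor:.0f} s (~{tor / 60:.1f} min)")
print(f"Broadband: {broadband:.1f} s")
```

Real page loads involve many small requests and round trips, so this is a lower bound rather than a prediction.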


The more important thing (than anonymity) for online security is making sure everything you do is encrypted end point to end point (privacy). That means using securely encrypted (usually SSL) connections (https is http carried over SSL). That makes it so no one can snoop on conversations between your computer and the server you are communicating with. Location anonymity isn’t really that important unless you have something to hide that someone may try to track you down for, though taking appropriate precautions (next few paragraphs) could never hurt. TOR is actually probably more hurtful in the long run since the exit node is an untrusted user who can spy on your unencrypted traffic.


Now, if you really wanted an appropriate solution for privacy (not anonymity), you would only ever let your unencrypted traffic exit out of trusted networks. This generally means your house (and maybe your office), though even from those places the ISPs could easily “spy” on your unencrypted traffic. And technically, any router in between you and the server you are connected to can spy on your unencrypted traffic, though there is too much traffic going on for anyone in between ISPs to really even want to try this sort of thing. So it’s not a bad idea to set up a VPN server at a secure location for yourself so you can connect in and route your traffic through the secure location when you are anywhere on the planet. For this I would recommend OpenVPN, and make sure you configure your client to route all traffic through the VPN tunnel. This approach could severely reduce your connection speed as most broadband connections have a much lower upload speed than download speed (meaning when your VPN server sends data back to you, it’s most likely slower than you would normally get it).

However, the speed issue can be solved by setting up your VPN server at a colocation facility (or on a cloud like Amazon’s), as these colocation ISPs route so much traffic that it would be infeasible for them to snoop, nor would they often have as much inclination to do so. This wouldn’t give great anonymity since only a handful of people would most likely be using these VPNs, and they will generally exit from the same IP address, but it gives a great amount of privacy when on an untrusted (or any) internet connection, and there are no noticeable speed decreases if at a good colocation facility.


The best solution is to use a paid-for VPN service. However, you would of course have to trust this service not to spy on your unencrypted traffic, which they generally wouldn’t do. These services are good because they are (or should be) fast, they are secure exit points, and best of all they can be anonymous to a large degree. Since so many people are coming from the same exit points, and your exit point’s IP could change between connections with these VPNs, there’s no easy way for anyone monitoring from outside the VPN provider to know who the traffic is coming from.


However, there are also downsides to using these VPN services since many providers depend on location data and filter based on it. For example:

  • If you are coming from outside of the country, many services inside the USA may block you.
  • Providers needing your location to provide a service for you would have the wrong location. For example, Google Maps wouldn’t know what area to search around when you asked for “restaurants”. You would have to specify “restaurants around my address”.
  • Some banks and services check to make sure you are always coming in from the same IP addresses. If you aren’t, they make you go through additional, often convoluted, security checks.


Some networks you may connect to (hotels for example) may also block VPNs, which can be a major pain. However, I can usually get through using dynamic SSH tunnels (“ssh -D” for a socks proxy) at the very least.
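As a rough sketch of that fallback, the snippet below only builds the ssh command line for a dynamic SOCKS tunnel; it does not run it. The host name and port 1080 are placeholder assumptions, and -N (don’t run a remote command) is an extra flag I commonly pair with -D rather than something required.

```python
import shlex
from typing import List


def socks_tunnel_command(user_at_host: str, local_port: int = 1080) -> List[str]:
    """Build an ssh invocation that opens a local SOCKS proxy on local_port."""
    # -D: dynamic (SOCKS) port forwarding; -N: do not execute a remote command.
    return ["ssh", "-N", "-D", str(local_port), user_at_host]


# "me@home.example.org" is a hypothetical trusted server you control.
print(shlex.join(socks_tunnel_command("me@home.example.org")))
```

Once the tunnel is up, the browser has to be pointed at the SOCKS proxy on localhost at the chosen port for its traffic to go through it.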


If I were to recommend a paid-for VPN service, it would be The Pirate Bay’s ipredator. This service was set up to help the people of Sweden get around some bad laws passed regarding user privacy. I’m sure they have enough users that you would become one of the crowd, and The Pirate Bay has proven themselves to be trustworthy advocates of internet freedom.


[1] Modern browsers include the domain you are visiting in the https connection packet in plain text, via the Server Name Indication TLS extension. This means if someone is snooping on your packets, they will see the domain you are visiting via https.

Managing Firefox History
Software likes hiding sensitive information and keeping it persistent :-(

Since version 3 of Firefox, the browser has moved over from using flat files for keeping track of browsing history (history.dat) and bookmarks (bookmarks.html) to using SQLite databases (places.sqlite). This changeover was required because the old flat file formats were badly implemented, clunky, and not able to handle the new demands of the location bar and browser history. Using a SQL database was the perfect solution for the complexity brought in with the new location bar and its dynamic searching of previous URLs, as SQL is easy to implement, is mostly compatible across multiple SQL implementations (removing dependency on a single product), and is powerful for cross-referencing lookups. As a matter of fact, most of the data Firefox keeps now is stored in SQLite databases.

SQLite was also a good choice for the SQL solution because it can be implemented minimally straight into a product without needing a large install and a lot of bloat. While I like SQLite for this purpose and its ease of implementation, it lacks a lot of base SQL functionality that would be nice, like TABLE JOINS inside of DELETE statements, among many other language abilities. I wouldn’t suggest using it for large database driven products that require high optimization, which I believe it can’t handle. It’s meant as a simpler SQL implementation.


Anyways, I was very happy to see that when you delete URLs from the history in the newest version of Firefox, it actually deletes them out of the database as opposed to just hiding them, like it used to. The history manager actually seems to do its job quite well now, but I noticed one big problem. After attempting to delete all the URLs from a specific site out of the Firefox history manager, I noticed there were still some entries from that site in the SQLite database, which is a privacy problem.

After some digging, I realized that there are “hidden” entries inside of the history manager. A hidden entry is created when a URL is loaded in a frame or IFrame that you do not directly navigate to. These entries cannot be viewed through the history manager, and because of this, they cannot easily be deleted from the history database short of wiping the whole history.

At this point, I decided to go ahead and look at all the table structures for the history manager and figure out how they interact. Hidden entries are marked in places.sqlite::moz_places.hidden with the value “1”. According to a Firefox wiki, “A hidden URL is one that the user did not specifically navigate to. These are commonly embedded pages, i-frames, RSS bookmarks and javascript calls.” So after figuring all of this out, I came up with some SQL commands to delete all hidden entries, which don’t really do anything inside the database anyways. Do note that Firefox has to be closed while you work on the database so that it is not locked.

sqlite3 places.sqlite
-- First remove the rows in other tables that point at hidden places...
DELETE FROM moz_annos WHERE place_id IN (SELECT id FROM moz_places WHERE hidden=1);
DELETE FROM moz_inputhistory WHERE place_id IN (SELECT id FROM moz_places WHERE hidden=1);
DELETE FROM moz_historyvisits WHERE place_id IN (SELECT id FROM moz_places WHERE hidden=1);
-- ...then remove the hidden places themselves
DELETE FROM moz_places WHERE hidden=1;
.exit

This could all be done in 1 SQL statement in MySQL, but again, SQLite is not as robust :-\. There is also a “Favorite’s Icon” table in the database that might keep an icon stored as long as a hidden entry for the domain still exists, but I didn’t really look into it.
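If you would rather script the cleanup, here is a minimal Python sketch of the same four deletes wrapped in a single transaction, so a failure part way through leaves the database untouched. The table and column names are the ones used in the commands above; the in-memory database built by demo_connection() is only a tiny stand-in I made up for a dry run, NOT the real Firefox schema.

```python
import sqlite3

# Tables that reference moz_places.id through place_id, as in the commands above.
CHILD_TABLES = ("moz_annos", "moz_inputhistory", "moz_historyvisits")


def delete_hidden_places(conn: sqlite3.Connection) -> int:
    """Delete hidden moz_places rows and their dependents; return places removed."""
    with conn:  # commits on success, rolls back if any statement raises
        for table in CHILD_TABLES:
            conn.execute(
                f"DELETE FROM {table} WHERE place_id IN "
                "(SELECT id FROM moz_places WHERE hidden = 1)"
            )
        removed = conn.execute("DELETE FROM moz_places WHERE hidden = 1").rowcount
    return removed


def demo_connection() -> sqlite3.Connection:
    """Build a small in-memory stand-in database (hypothetical, simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, hidden INTEGER);
        CREATE TABLE moz_annos (place_id INTEGER);
        CREATE TABLE moz_inputhistory (place_id INTEGER);
        CREATE TABLE moz_historyvisits (place_id INTEGER);
        INSERT INTO moz_places VALUES (1, 'http://visited.example/', 0);
        INSERT INTO moz_places VALUES (2, 'http://framed.example/', 1);
        INSERT INTO moz_historyvisits VALUES (1), (2);
        """
    )
    return conn


conn = demo_connection()
print(delete_hidden_places(conn), "hidden place(s) removed")
conn.close()
```

To use it on real data, you would pass in a connection opened on a backup copy of places.sqlite (with Firefox closed) instead of the demo connection.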

LinkedIn Policies Part 2
Tech Support Hell

Continued from Part 1. Once again, I received another notification of a friend joining from an email I gave to the LinkedIn system. I contacted LinkedIn before writing the previous post on the topic with the following message:

For reference, your privacy policy states the following
Information about your Contacts
In order to invite others to connect with you directly in LinkedIn, you will enter their names and email addresses. This information will be used by LinkedIn to send your invitation including a message that you write. The names and email addresses of people that you invite will be used only to send your invitation and reminders.
I decided to search for accounts through your "Address Book Contacts" function, and manually entered my email contacts. I only used this function to find existing users, and not invite new ones. I expected the information to be immediately deleted from your servers, as it had no more use for the contacts I gave, but I found out today they were still there when one of said addresses was used to sign up a new account and LinkedIn informed me of such. While this is a nice feature, it would have been appropriate to allow the user to opt out of having LinkedIn keep the emails for further use, and downright shady that the user is not informed at all that given email addresses are kept by LinkedIn on your servers.

And this is the non-auto-generated response I received back 2 days later:

Dear Jeffrey
We are aware of the issue you are currently experiencing and we are working diligently to resolve the issue. We appreciate your patience while this issue is being resolved.

I thought it obvious from this reply that they did not take what I said into consideration, and there is a high probability that they didn’t even read it. I mentioned in the last post that this exact thing happened to my friend, who was trying to communicate with LinkedIn about errors he was getting from their site code. This kind of thing is typical of large corporations that receive a large amount of communications and do not have the staff to handle it. I consider this practice almost as bad as outsourced tech support (usually to India), another pet peeve of mine, as communication is often hard and the tech support agents often don’t know what they are talking about... at least much more so than with other first-tier tech support channels provided in-country ^_^; . I went ahead and contacted eTrust a few days ago in hopes that I get a more personal response from them.

LinkedIn Policies
It's always a bit of a risk giving out email addresses

Since I just added my résumé which mentions my LinkedIn page, I thought I’d mention something else I just discovered about LinkedIn.


I would normally never give out any of my contacts’ email addresses to third parties under any circumstance, but I decided there was very little risk in doing so at LinkedIn because it is a widely used website that is also eTrust certified. Unfortunately, I have also heard that eTrust certification isn’t exactly hard to get and shouldn’t have too much stock put in it, but it is still something.

Anyways, after reading LinkedIn’s privacy policy, I decided it would be ok to list some of my email contacts to discover whether they also used LinkedIn. I, of course, added a dummy email address of mine to the list to watch for spam or advertisements, and it has to date not received anything. Then again, I’m sure any company that illegally released email addresses wouldn’t be stupid enough to let go of newly acquired addresses immediately, though I always assume too much of people/corporations... but I digress. I have discovered that they keep all the emails you give them, because one of the emails I gave was recently used to sign up for a new account and LinkedIn immediately informed me of this.

While this is a nice extension to the "find your contacts through their emails" function, LinkedIn really should have given me an option to opt out of this, or at the very least informed me that it was keeping the emails I gave it on record. Unfortunately, even if they do have a good privacy policy and abide by it, there is still the chance a rogue staff member could harvest the emails and sell them.


Oh, LinkedIn is also a very buggy system in and of itself. I very often get timeout errors and many other errors to the effect of “The server cannot perform this operation at this time, please try again later”. A friend of mine has also been having trouble linking our profiles together for more than a week now, with no response to his email to them… besides a type of auto response he got back that had absolutely nothing to do with the reported problem.