Relays and Relay Operators on the Tor Network

My thoroughly interesting experience contributing to Tor as part of the Outreachy Program: Part 1

Overview

Tor is a decentralized anonymity network made up of over 5000 nodes operated by volunteers (relay operators) all over the world. It is the most widely used free and open-source anonymization tool; it not only provides its users with censorship-resistant access to the internet, but also enables "hidden services", e.g. hosting websites in a secure and anonymous fashion. The Tor Project employs onion routing to ensure its users enjoy private access to an uncensored web, by "creating and deploying free and open-source anonymity and privacy technologies". The Tor Browser is a great example of a privacy-enhancing tool provided by Tor. By enabling anonymous computing, the Tor network contributes significantly to efforts to uphold the human rights to privacy and freedom of expression, free from tracking, surveillance, and censorship by governments or internet service providers (ISPs).

ℹ️ Project Name: Mapping Values and Motivations of the Tor Network's Relay Operators

Project Goal: To create the foundation for Tor's gamification project for relay operators.

The Tor network is a "labor of love produced by an international community of people devoted to human rights". Among these wonderful people are the relay operators who volunteer their time and resources to ensure the stability, robustness and safety of the network. Tor recognizes their endless efforts and immeasurable value, and would like to devise a recognition system to improve motivation amongst existing long-term contributors, whilst encouraging new operators to join the network. This project aims to serve as a foundation for this endeavour by employing a gamification strategy that provides a clear and healthy path for all relay operators to stay engaged with the network. The anticipated results of the techniques implemented include increased consistent participation or contribution, which consequently leads to long-term engagement with the network by relay operators.

🚧 The Incentive Problem: Why Tor wants more relays:

At the moment, there are more Tor clients than relays. Bandwidth capacity remains a limiting factor in the network's growth in size and goodput.

The more relays there are in the Tor network, the more stable, reliable and secure it becomes. More relays distribute trust among a larger, more diverse set of peers, which increases robustness and makes the network harder for adversaries to break. A larger network carrying large quantities of more diverse data also makes it harder for an adversary to determine where traffic is coming from and where it is going, so they cannot easily discern who is communicating with whom.

This project is not the first of Tor's gamification efforts. There have been previous attempts at the strategy, but more on that later.

As an Outreachy applicant in the December 2021 - March 2022 cycle, I, along with other applicants, have been given an opportunity to contribute to this project during the allocated contribution period (October 8th - November 5th). I have been allocated three tasks, which I shall record below as part of my contribution, under the guidance of my mentor, Gustavo Gus, and of course with the help and support of other members of the wider Tor community forum on IRC.

[✅] Task One: Complete Self-guided Education About Tor Relays and Relay Operator Community.

Types of Relays on The Tor Network

A relay is a publicly listed server in the Tor network that forwards traffic on behalf of clients and registers itself with the directory authorities. There are various types of nodes or relays on the Tor network, all of which are important, but they differ in terms of role, technical requirements, legal implications (including risk), and impact on users or clients.

A Tor circuit is made up of a chain of relays: guard and middle relays, exit relays, and bridges. A relay operator needs a sufficient understanding of all these different relays to make an informed decision about which type to run.

Guard and Middle Relays (Non-exit Relays)

The guard and middle relays are both non-exit relays, meaning that their exit policies do not allow clients to exit through them; they cannot send traffic from the network to its destination, so the services that Tor clients connect to cannot see these relays' IP addresses. They are configured in the following way:

ExitPolicy reject *:*

Guard Relay- This is the entry relay. It is the first relay in the chain of three relays building the Tor circuit. It has certain technical requirements:

  • It must be stable.
  • It must be fast (requires at least 2MBps).

If a relay is not stable, or has bandwidth below 2 megabytes per second, it remains a middle relay.

Middle Relay - A middle relay is the second hop after the guard relay. It is positioned between a guard relay and an exit relay.

Tor Relay Interaction Diagram

Exit Relay

This is the final relay in the three-relay Tor circuit. Its function is to send traffic out to its destination. This means that, unlike the guard and middle relays, the exit relay's IP address can be seen by the services that Tor client nodes are connecting to, e.g. websites, chat services, email providers, etc.

The exit policy of an exit relay is permissive; it allows clients to exit, and this is a huge risk factor. Of all the relays in the Tor network, exit relays carry the biggest liability and legal exposure. A relay operator maintaining an exit relay runs the risk of getting in trouble under the Digital Millennium Copyright Act (DMCA) if a user downloads copyrighted material through their exit relay. Because of this legal exposure, it is crucial that relay operators do not run exit relays from their homes. Ideally, they should be affiliated with some kind of institution such as a university, library, hackerspace or privacy-related organization, because such institutions are better positioned to handle DMCA notices and other law enforcement inquiries. They are also able to provide the much-needed bandwidth to facilitate the exit of traffic.

Defining the exit policy of an exit relay is one of the most important steps of the relay configuration by an operator. Here's an example of how the exit policy of such a relay may be configured:

ExitPolicy accept *:22 # SSH

The configuration defines the destination ports a relay operator is willing to forward traffic to. Of course, the more ports an operator allows, the greater the legal exposure and the more abuse complaints the relay is likely to attract. A relay operator therefore has to be cautious about the number of ports they allow, without being too restrictive, since allowing too few ports makes an exit relay less useful.
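To make the "which ports am I willing to forward to" idea concrete, here is a minimal Python sketch of how a list of accept/reject rules can be evaluated, with the first matching rule deciding the outcome. The function names `parse_rule` and `can_exit_to` are my own, the rule format is simplified to IPv4 addresses and plain port ranges, and treating an unmatched destination as rejected is an assumption made for this sketch rather than a statement about Tor's actual defaults.

```python
import ipaddress


def parse_rule(rule):
    """Split a rule like 'accept *:80-81' into (action, network, low, high)."""
    action, target = rule.split()
    addr, _, ports = target.rpartition(":")
    # '*' means any address; otherwise accept a single IP or a CIDR block.
    network = None if addr == "*" else ipaddress.ip_network(addr, strict=False)
    if ports == "*":
        low, high = 1, 65535
    elif "-" in ports:
        low, high = (int(p) for p in ports.split("-"))
    else:
        low = high = int(ports)
    return action, network, low, high


def can_exit_to(rules, address, port):
    """Return True if the first rule matching address:port is an 'accept'."""
    ip = ipaddress.ip_address(address)
    for rule in rules:
        action, network, low, high = parse_rule(rule)
        if (network is None or ip in network) and low <= port <= high:
            return action == "accept"
    return False  # assumption for this sketch: no match means rejected


policy = ["reject 10.0.0.0/8:*", "accept *:80-81", "accept *:443", "reject *:*"]
print(can_exit_to(policy, "93.184.216.34", 443))  # True
print(can_exit_to(policy, "10.1.2.3", 80))        # False (private range is rejected first)
print(can_exit_to(policy, "93.184.216.34", 25))   # False (falls through to reject *:*)
```

Because evaluation stops at the first match, the order of the rules matters: the private-range reject has to come before the broad accepts for it to have any effect.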

A more comprehensive exit policy for an exit relay:

reject 0.0.0.0/8:*
reject 169.254.0.0/16:*
reject 127.0.0.0/8:*
reject 192.168.0.0/16:*
reject 10.0.0.0/8:*
reject 172.16.0.0/12:*
reject 146.59.234.220:*
accept *:20-21
accept *:22
accept *:23
accept *:43
accept *:53
accept *:79
accept *:80-81
accept *:88
accept *:110
accept *:143
accept *:194
accept *:220
accept *:389
accept *:443
accept *:464
accept *:465
accept *:531
accept *:543-544
accept *:554
accept *:563
accept *:587
accept *:636
accept *:706
accept *:749
accept *:853
accept *:873
accept *:902-904
accept *:981
accept *:989-990
accept *:991
accept *:992
accept *:993
accept *:994
accept *:995
accept *:1194
accept *:1220
accept *:1293
accept *:1500
accept *:1533
accept *:1677
accept *:1723
accept *:1755
accept *:1863
accept *:2082
accept *:2083
accept *:2086-2087
accept *:2095-2096
accept *:2102-2104
accept *:3128
accept *:3389
accept *:3690
accept *:4321
accept *:4643
accept *:5050
accept *:5190
accept *:5222-5223
accept *:5228
accept *:5900
accept *:6660-6669
accept *:6679
accept *:6697
accept *:8000
accept *:8008
accept *:8074
accept *:8080
accept *:8082
accept *:8087-8088
accept *:8232-8233
accept *:8332-8333
accept *:8443
accept *:8888
accept *:9418
accept *:9999
accept *:10000
accept *:11371
accept *:19294
accept *:19638
accept *:50002
accept *:64738
reject *:*

ℹ️ Note

To operate a useful exit relay, one must, at the very least, allow ports 80 and 443.

 from stem.exit_policy import ExitPolicy  # assuming Stem, the Python controller library for Tor
 policy = ExitPolicy('accept *:80', 'accept *:443', 'reject *:*')

Comparing Exit to Non-Exit Relays

Similarities
  • Both of them (all relays, in fact) are publicly listed on the Tor network. Unfortunately, this opens them up to the possibility of being blocked by services that do not understand how Tor works, or those that just want to censor Tor users. This can be circumvented by bridges, but more on that later.
Differences
Non-Exit (Guard & Middle) Relays | Exit Relays
1. As their name suggests, their exit policy does not allow them to exit traffic. | 1. Their exit policy allows clients to exit.
2. May be run by independent relay operators from their homes. | 2. Relay operators should be affiliated with an institution due to their high risk level as well as the resources they demand.
3. They usually do not receive abuse complaints as client services cannot see their IP addresses. | 3. They receive varying numbers of abuse complaints depending on the number of ports the relay operator allows traffic to be forwarded to when configuring their exit policy.
4. Require minimum maintenance. | 4. They are higher-maintenance; they require more effort by the relay operator in terms of configuration, resources (e.g. time, bandwidth), and caution against legal exposure.
5. Bandwidth usage is highly customizable in the Tor configuration. | 5. Bandwidth usage is not as customizable as with non-exit relays.
6. Less prone to liability and legal exposure. | 6. Have the greatest legal exposure and liability.

Bridge

Bridges are nodes in the Tor network that work like other relays, except that they are not listed in the public Tor directory (by the directory authorities). Instead, they register themselves with the bridge authorities. As mentioned earlier, Tor users can be blocked or censored by some services. Furthermore, governments, especially oppressive regimes, and ISPs can blacklist the IP addresses of Tor relays, which are in the public domain. Tor bridges come in handy in these cases because they are not publicly listed. They offer an extra layer of security by strengthening anonymity, and let users freely access the internet services they want to connect to.

However, some governments, such as those of China and Iran, have figured out ways to detect and block users' connections to Tor bridges. Enter pluggable transports: circumvention tools that the Tor network uses to disguise the traffic it sends out, thus adding a layer of obfuscation.

Characteristics of Tor Bridges

  • They are relatively easy to operate.
  • They require low bandwidth (at least 1 Mbit/s).
  • Low risk, high impact: since they are not publicly listed, they offer users more privacy; anyone spying on a user would not know that the user is contacting a Tor relay's IP address.
  • They are not likely to receive any abuse complaints.
  • They are unlikely to be blocked by popular internet services, since they are not listed as public relays.

⚠️ Warning

Bridges are a great option for a relay operator if they can run a Tor node from their home network.

It is advisable to do so (run a bridge rather than a publicly listed relay), especially if they have only one static IP.

This will ensure that their non-Tor traffic isn't blocked or censored as though it's coming from Tor. However, if they have multiple static IPs or a dynamic IP, then they should be good to go.

Comparing Bridges to Other (Publicly-listed) Relays

Similarities
  • Both are operated by volunteer relay operators.
  • Bridges are similar to non-exit relays in terms of ease of maintenance and low risk.
  • Bridges and non-exit relays do not usually receive abuse complaints as their IP addresses aren't visible to the services the client nodes connect to.
  • Like non-exit relays, bridges can be conveniently run from home by individual relay operators, as long as they have a dynamic or multiple static IP addresses.
Differences
Bridges | Other (Publicly-listed) Relays
1. Their IP addresses are not publicly listed in the Tor directory. | 1. Their IP addresses are publicly listed in the Tor directory.
2. They register themselves with the Tor bridge authorities. | 2. They register themselves with the Tor directory authorities.
3. They are low-maintenance; easy to operate. | 3. Exit relays are high-maintenance.
4. Require very low bandwidth; suitable for people that don't have a lot of bandwidth to donate. | 4. Guard relays (entry relays) require a relatively significant amount of bandwidth (2 MBps), while exit relays need huge amounts that are more likely to be available on an institutional network.

Bad Relays on the Tor Network

A bad relay is one that either does not work as expected, or tampers with the users' connections. Since Tor is an open network, i.e. anyone can contribute by setting up relays, some of these bad relays are bound to emerge. This can happen in either of two ways:

1. When an operator misconfigures the relay they are running.

This is mostly accidental, and is an easy mistake for new operators with less experience configuring and running Tor relays. An example of misconfiguration is dishonoring the exit policy of an exit relay: for instance, creating a conflict between the announced policy and the destinations that are actually reachable (e.g. by including an invalid port in the exit policy), or rejecting the exit policy in its entirety. Following this technical setup would be useful in helping to avoid the emergence of bad relays due to misconfiguration.

So, what happens when a bad-relay-by-misconfiguration is caught, you ask?

It is important to note that most bad relays are usually caught and reported by the members of the Tor community. Bad relays should be reported by writing to Tor's bad relays mailing list: bad-relays AT lists DOT torproject DOT org, providing a detailed description of the bad relay. The description should include:

  1. The relay's IP address or fingerprint (a 40-character hex string such as 203933ED4E55EF8A3C3518427D1A1ED6A4CC285E).
  2. The bad behavior observed.
  3. Any other information that would be helpful in investigating the issue.
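Before sending a report, it can help to sanity-check that the fingerprint you copied really is a 40-character hex string. Here is a small sketch; the helper name `is_fingerprint` is hypothetical, and ignoring spaces is just a convenience I assumed for fingerprints pasted in spaced groups.

```python
import re

# Exactly 40 hexadecimal characters, upper or lower case.
FINGERPRINT_RE = re.compile(r"[0-9A-Fa-f]{40}")


def is_fingerprint(text):
    """True if text is exactly 40 hexadecimal characters (spaces ignored)."""
    return FINGERPRINT_RE.fullmatch(text.replace(" ", "")) is not None


print(is_fingerprint("203933ED4E55EF8A3C3518427D1A1ED6A4CC285E"))  # True
print(is_fingerprint("203933ED"))                                  # False (too short)
```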

Tor users can find out which exit they are using by visiting the Tor check page. Once a report comes in, the Tor bad-relays team springs into action:

  • Look into the issue by scanning reports to verify the nature of the bad relay.
  • Contact the relay operator in an attempt to get them to fix the misconfiguration. The relay configuration has a ContactInfo field that gives Tor a way to reach the operator.
  • If the operator responsible can't be reached, the team assigns the BadExit flag to the exit node, to alert clients that it should no longer be used in an exit position. Most of the time, the bad-relays team is able to contact the operator; however, where the operator has not kept the ContactInfo field up to date, the team has to flag the relay as a bad exit to protect Tor users. There are other types of flags that are applied to bad relays, depending on the severity of the issue:

  • BadExit- Never to be used as an exit relay (for relays that appear to tamper with exit traffic).

  • Invalid- Never to be used unless AllowInvalidNodes is set in the config.
  • Reject- Dropped from the Tor consensus entirely.

ℹ️ Note

Relays flagged BadExit are nevertheless left in the network; they can still be useful as non-exit (middle or guard) relays.
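The effect of the three flags can be summed up in a few lines of Python. This is only a sketch of the rules as described above: the function `allowed_positions` and its `allow_invalid_nodes` parameter are names I made up for illustration, and real path selection takes many more factors into account (for example, whether a relay qualifies as a guard at all).

```python
def allowed_positions(flags, allow_invalid_nodes=False):
    """Return the circuit positions a relay may fill, given its bad-relay flags."""
    flags = set(flags)
    if "Reject" in flags:
        return set()  # dropped from the consensus entirely
    if "Invalid" in flags and not allow_invalid_nodes:
        return set()  # unusable unless AllowInvalidNodes is set
    positions = {"guard", "middle", "exit"}
    if "BadExit" in flags:
        positions.discard("exit")  # still usable as guard or middle
    return positions


print(sorted(allowed_positions(["BadExit"])))  # ['guard', 'middle']
print(sorted(allowed_positions(["Reject"])))   # []
```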

2. Malicious intent by operator to harm Tor users.

Relays in the Tor network are considered malicious if they exhibit potentially harmful behaviors such as:

  • DNS poisoning.
  • Sniffing of user traffic.
  • Excessive logging during normal operation (above the notice level).
  • Publishing traffic destination/IP information.
  • Tampering with statistics.
  • Flooding the network with new relays (sybil attacks).
  • Re-routing exit traffic back to the network; not actually exiting any traffic.
  • Evading the MyFamily restrictions even though the operator is running more than one relay. Except for bridges, which do not advertise their family members in their configuration, relays must include their family size (either effective or alleged family size, or both; more on that later). In the Tor Metrics Portal, the effective family size is appended in brackets to the nickname, appearing as such:

Nickname†

The number in brackets represents the total effective family size of the relay, including the relay itself. Let's say the nickname is papua and its effective family size is 3:

Nickname†
papua (3)
  • Any other behavior that would put Tor users at risk.
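To give a feel for the difference between effective and alleged family size, here is a rough sketch. It assumes (my reading, not something stated above) that a family member counts as "effective" only when both relays list each other in MyFamily, and as "alleged" when the listing is one-sided. The function name and the one-letter fingerprints are made up for illustration.

```python
def family_sizes(declared, fingerprint):
    """Return (effective_size, alleged_size) for one relay.

    declared maps each relay's fingerprint to the set of fingerprints
    it lists in its MyFamily option.
    """
    listed = declared.get(fingerprint, set()) - {fingerprint}
    # Mutual declarations: the other relay lists this one back.
    mutual = {fp for fp in listed if fingerprint in declared.get(fp, set())}
    effective_size = len(mutual) + 1  # includes the relay itself
    alleged_size = len(listed - mutual)
    return effective_size, alleged_size


declared = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": set(),  # D never lists A back
}
print(family_sizes(declared, "A"))  # (3, 1)
```

With these made-up declarations, relay A would show up with an effective family size of 3, matching the papua (3) example.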

Dealing with Malicious Bad Relays

Unlike in the case of misconfigured relays, the Tor bad-relays team does not contact operators running malicious relays. Instead, they:

  • Reject the relays from the network as fast as possible to ensure the safety of Tor users (flag: BadExit).

ℹ️ Note

Corner cases: sometimes a bad relay is ambiguous, and the bad-relays team cannot tell straight away whether it is the result of misconfiguration or malicious intent. In such cases, the ContactInfo comes in handy for verification, though admittedly this could be risky if the operator is dishonest. Kicking the relay out could be bad for relay diversity, but keeping it could be harmful to users: a classic catch-22. The team often resolves it by applying heuristics based on past experience with different relay groups and operators.

Relay Search: The Tor Metrics Portal

The Tor Metrics portal provides a Relay Search service that displays data about relays, and sometimes bridges, in the Tor network. This service has three tools, namely:

1. Simple Search: this tool displays data about single relays and bridges in the network. To look up these relays, you can use keywords such as:

  • (Partial) nicknames e.g. 'papua'
  • (Partial) IP addresses e.g. '128.31'
  • (Partial) fingerprints e.g. '9695DFC3'

You can also perform combined searches using these keywords, e.g. 'papua 128.31'.

Aside from keywords, you can use qualifiers to perform a simple search for single relays and bridges. These qualifiers include:

  • Country (as a two-letter country code) e.g. 'papua country:ke'
  • Specific contact info e.g. 'contact:arma'
  • Specific flags e.g. 'flag:exit' (More on Tor relay flags in a later section.)

This search provides important information about how the relays are configured, along with their history represented graphically.
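If you build such queries often, the pattern of keywords followed by key:value qualifiers is easy to script. The helper below is purely illustrative (`build_query` is my own name, not part of any Tor tool); it only assembles the text you would type into the search box.

```python
def build_query(*keywords, **qualifiers):
    """Join keywords and key:value qualifiers into one search string."""
    parts = list(keywords)
    parts += [f"{key}:{value}" for key, value in qualifiers.items()]
    return " ".join(parts)


print(build_query("papua", "128.31"))                    # papua 128.31
print(build_query("papua", flag="exit", contact="arma"))  # papua flag:exit contact:arma
```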

ℹ️ Note

If you are searching for a bridge, then you must include a hashed fingerprint in your query to prevent a leakage of the bridge fingerprint during the search. The hashed-fingerprint file is usually located in the Tor data directory, specified as DataDirectory in your config file (torrc).
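If you only have the plain fingerprint at hand, the hashed form can be derived from it. As far as I understand it, the hashed fingerprint is the SHA-1 digest of the fingerprint's 20 raw bytes, written as 40 upper-case hex characters; treat that as an assumption and compare the result with your own hashed-fingerprint file. The function name is hypothetical, and the sample value is the example fingerprint from the bad-relays section, not a real bridge.

```python
import hashlib


def hashed_fingerprint(fingerprint):
    """SHA-1 of the 20 raw fingerprint bytes, as 40 upper-case hex characters."""
    raw = bytes.fromhex(fingerprint)  # decode the hex string into bytes first
    return hashlib.sha1(raw).hexdigest().upper()


example = "203933ED4E55EF8A3C3518427D1A1ED6A4CC285E"  # sample value, not a real bridge
print(hashed_fingerprint(example))
```

Note that the hash is taken over the decoded bytes, not over the 40-character text itself; hashing the text would give a different result.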

Building a query with partial keywords, or using qualifiers entirely, would return a list of entries meeting those criteria. I'll demonstrate: Let's use a partial IP address, 128.199. We get a list of 7 entries with relay IP addresses containing 128.199.

Partial Search

⚠️ Warning

The current version of the Tor Relay Search service does not support more than 2000 results.

In the above example, we were able to see all the returned results because there were only 7 entries. Had there been more than 2000, we would have had to narrow our search further.

Components of a Tor Relay's Configuration

If we click on one of the nicknames, say, anonymous, we get more details about the relay, from their configuration to their properties such as uptime, flags, country, etc.

Relay Details

Relay Details Continued

A bridge has far fewer public details; the only configuration info shown is the Nickname, Onion Router (OR) address and Advertised Bandwidth. The 'properties' section lacks most of the info found in a relay's details, but includes the following extra fields not found for relays:

  • Hashed Fingerprint
  • Transport protocols, as in pluggable transports, e.g. obfs4, meek, Snowflake.
  • Bridge distribution mechanism, e.g. HTTPS, Moat, Email, Reserved, or None. Find out more about these in the Tor Project's BridgeDB.

Bridge Details

ℹ️ Info

Notice, in the initial list of entries returned, that every relay or bridge has either a green or red indicator before the nickname field. Green indicates that the relay is running, while red means that it is offline. There's a third state: overloaded, which is shown by an orange indicator.

If you want to search for currently running relays, add the following to your query:

running:true

This would return all running relays, including overloaded ones.

running:false would return only relays that are offline.

2. Aggregated Search: this tool displays aggregated data about relays in the network, filtered by the search keywords and qualifiers mentioned in the above subsection. An aggregated search provides useful insights into diversity within the network, determined using parameters such as country, IP version and ISP (I will cover diversity in a later section). This search also reveals the probability of using relays in a particular country or Autonomous System (AS) as a guard, middle or exit.

ℹ️ Info

  • The results of an aggregated search are restricted to only currently running relays. This means that no matter what query you send, offline relays will not be displayed.
  • Bridge data will also not be displayed.

An aggregated search of a relay using a full nickname, for example, would display the following details:

Aggregated Search,

while a search using a qualifier such as flag:exit would return the following aggregate results:

Aggregated Search Using Qualifier

3. Advanced Search: this tool enables you to build advanced or more detailed queries to search for data about either single relays and bridges (as in simple search), or aggregated data about currently running relays (as in aggregated search). The following search parameters are used to build an advanced search query:

  • Nickname
  • Fingerprint (with an option to include family members)
  • Contact
  • Flag
  • Autonomous System (AS)
  • Hostname
  • First Seen
  • Last Seen
  • Version
  • Type (relay or bridge)†
  • Running (either running or offline)†

Advanced Search Parameters

ℹ️ Note

When performing an aggregated search, the Type and Running parameters (marked with †) are usually ignored because aggregated searches are restricted to only currently running relays.

A relay search in the Tor Metrics portal also uses graphs to visualize the history of the relay or bridge over up to 5 years (in 1-month, 6-month, 1-year and 5-year views).

Each relay's history is represented by two graphs: the first plots time (X-axis) against written and read bytes per second, and the second plots time against the following parameters:

  • Guard probability
  • Middle probability
  • Exit probability
  • Consensus weight fraction
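Of these four, the consensus weight fraction is the simplest to reason about. As I understand it, it is a relay's consensus weight divided by the sum of all consensus weights in the network. The sketch below uses made-up relay names and weights, and the function name is my own; the three probabilities depend on further weighting that this sketch does not attempt to model.

```python
def consensus_weight_fractions(weights):
    """Map each relay nickname to its share of the total consensus weight."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


weights = {"papua": 3000, "relayB": 5000, "relayC": 2000}  # made-up weights
fractions = consensus_weight_fractions(weights)
print(fractions["papua"])  # 0.3
```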

Relay Graph

The first graph visualizing the history of a bridge in the Tor network is similar to that of a normal relay. The second one is different, though. It maps time versus the average number of connected clients. This is because aggregate searches that produce relay probabilities do not include bridge data.

Bridge Graph


Cheers!