Relays and Relay Operators on the Tor Network
My thoroughly interesting experience contributing to Tor as part of the Outreachy Program: Part 1
Tor is a decentralized anonymity network made up of over 5000 nodes operated by volunteers (relay operators) all over the world. It is the most widely used free and open-source anonymization technology; it not only gives its users censorship-resistant access to the internet, but also enables "hidden services", e.g. hosting websites in a secure and anonymous fashion. The Tor Project employs onion routing to ensure its users enjoy private access to an uncensored web, by "creating and deploying free and open-source anonymity and privacy technologies". The Tor Browser is a great example of a privacy-enhancing tool provided by Tor. By enabling anonymous computing, the Tor network contributes significantly to efforts to uphold the human rights to privacy and freedom of expression, free from tracking, surveillance, and censorship by governments or internet service providers (ISPs).
ℹ️ Project Name: Mapping Values and Motivations of the Tor Network's Relay Operators
Project Goal: To create the foundation for Tor's gamification project for relay operators.
The Tor network is a "labor of love produced by an international community of people devoted to human rights". Among these wonderful people are the relay operators who volunteer their time and resources to ensure the stability, robustness and safety of the network. Tor recognizes their endless efforts and immeasurable value, and would like to devise a recognition system to improve motivation amongst existing long-term contributors, whilst encouraging new operators to join the network. This project aims to serve as a foundation for this endeavour by employing a gamification strategy that provides a clear and healthy path for all relay operators to stay engaged with the network. The anticipated results of the techniques implemented include increased consistent participation or contribution, which consequently leads to long-term engagement with the network by relay operators.
🚧 The Incentive Problem: Why Tor wants more relays:
At the moment, there are more Tor clients than relays. Bandwidth capacity remains a limiting factor in the network's growth in size and goodput.
The more relays there are in the Tor network, the more stable, reliable and secure it will be. More relays distribute trust among a larger, more diverse set of peers, which increases robustness and makes the network harder for adversaries to break. A larger network transferring large quantities of more diverse data also makes it harder for an adversary to determine where traffic is coming from and going to, and therefore harder to discern who is communicating with whom.
This project is not the first in Tor's gamification efforts. There have been previous attempts at the strategy:
- Relay Awards, 2016
- Relay Awards, Web Mirror
- Roster, 2015; GitHub
- Roster, GitLab
- Tor Incentives Research Roundup,
but more on that later.
As an Outreachy applicant in the December 2021 - March 2022 cycle, I, along with other applicants, have been given an opportunity to contribute to this project during the allocated contribution period (October 8th - November 5th). I have been allocated three tasks, which I shall record below as part of my contribution, under the guidance of my mentor, Gustavo Gus, and with the help and support of other members of the wider Tor community on IRC.
[✅] Task One: Complete Self-guided Education About Tor Relays and Relay Operator Community.
Types of Relays on The Tor Network
A relay is a publicly listed server in the Tor network that forwards traffic on behalf of clients and registers itself with the directory authorities. There are various types of nodes or relays on the Tor network, all of which are important, but which differ in role, technical requirements, legal implications (including risk), and impact on users or clients.
A Tor circuit is made up of a chain of relays. The relay types are guard and middle relays, exit relays, and bridges. A relay operator needs a sufficient understanding of all of these to make an informed decision about which type of relay to run.
Guard and Middle Relays (Non-exit Relays)
The guard and middle relays are both non-exit relays, meaning that their exit policies do not allow clients to exit through them; they cannot send traffic from the network to its destination, so the services that Tor clients connect to cannot see these relays' IP addresses. They are configured in the following way:
ExitPolicy reject *:*
Guard Relay- This is the entry relay. It is the first relay in the chain of three relays building the Tor circuit. It has certain technical requirements:
- It must be stable.
- It must be fast (at least 2 MB/s).
If a relay is not stable, or has a bandwidth below 2 megabytes per second, then it remains a middle relay.
Middle Relay - A middle relay is the second hop after the guard relay. It is positioned between a guard relay and an exit relay.
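The guard requirements above can be summarised in a few lines of Python. This is only a sketch of the rule as stated in this post (stable, and at least 2 MB/s); the function name and the threshold constant are my own, and the directory authorities' real flag assignment takes more into account than this.

```python
# Threshold taken from this post: 2 megabytes per second.
GUARD_MIN_BANDWIDTH_MBYTES = 2.0


def likely_position(is_stable: bool, bandwidth_mbytes_per_s: float) -> str:
    """Return 'guard' if both requirements are met, otherwise 'middle'."""
    if is_stable and bandwidth_mbytes_per_s >= GUARD_MIN_BANDWIDTH_MBYTES:
        return "guard"
    return "middle"


print(likely_position(True, 5.0))   # stable and fast
print(likely_position(True, 1.0))   # stable but too slow
print(likely_position(False, 5.0))  # fast but not stable
```

Only the first call satisfies both conditions; the other two fall back to the middle position.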
Exit Relay - This is the final relay in the three-relay Tor circuit. Its function is to send traffic out to its destination. This means that, unlike the guard and middle relays, the exit relay's IP address can be seen by the services that Tor clients connect to, e.g. websites, chat services, email providers, etc.
The exit policy of an exit relay is permissive; it allows clients to exit, and this is a huge risk factor. Of all the relays in the Tor network, these carry the biggest liability and legal exposure. An operator maintaining an exit relay runs the risk of receiving Digital Millennium Copyright Act (DMCA) notices if a user downloads copyrighted material through their exit relay. Because of this legal exposure, it is crucial that relay operators do not run exit relays from their homes. Ideally, they should be affiliated with an institution such as a university, library, hackerspace or privacy-related organization, because such institutions are better positioned to handle DMCA notices and other law enforcement inquiries. They are also able to provide the much-needed bandwidth for exiting traffic.
Defining the exit policy of an exit relay is one of the most important steps of the relay configuration by an operator. Here's an example of how the exit policy of such a relay may be configured:
ExitPolicy accept *:22 # SSH
The configuration defines the destination ports a relay operator is willing to forward traffic to. Of course, the more ports an operator allows, the greater the legal exposure and the more abuse complaints they are likely to attract. A relay operator therefore has to be cautious about the number of ports they allow, without restricting them too much, since allowing too few ports makes an exit relay less useful.
A more comprehensive exit policy for an exit relay:
reject 0.0.0.0/8:* reject 169.254.0.0/16:* reject 127.0.0.0/8:* reject 192.168.0.0/16:* reject 10.0.0.0/8:* reject 172.16.0.0/12:* reject 18.104.22.168:* accept *:20-21 accept *:22 accept *:23 accept *:43 accept *:53 accept *:79 accept *:80-81 accept *:88 accept *:110 accept *:143 accept *:194 accept *:220 accept *:389 accept *:443 accept *:464 accept *:465 accept *:531 accept *:543-544 accept *:554 accept *:563 accept *:587 accept *:636 accept *:706 accept *:749 accept *:853 accept *:873 accept *:902-904 accept *:981 accept *:989-990 accept *:991 accept *:992 accept *:993 accept *:994 accept *:995 accept *:1194 accept *:1220 accept *:1293 accept *:1500 accept *:1533 accept *:1677 accept *:1723 accept *:1755 accept *:1863 accept *:2082 accept *:2083 accept *:2086-2087 accept *:2095-2096 accept *:2102-2104 accept *:3128 accept *:3389 accept *:3690 accept *:4321 accept *:4643 accept *:5050 accept *:5190 accept *:5222-5223 accept *:5228 accept *:5900 accept *:6660-6669 accept *:6679 accept *:6697 accept *:8000 accept *:8008 accept *:8074 accept *:8080 accept *:8082 accept *:8087-8088 accept *:8232-8233 accept *:8332-8333 accept *:8443 accept *:8888 accept *:9418 accept *:9999 accept *:10000 accept *:11371 accept *:19294 accept *:19638 accept *:50002 accept *:64738 reject *:*
To operate a useful exit relay, one must, at the very least, allow ports 80 and 443.
from stem.exit_policy import ExitPolicy  # assuming the Stem library is installed
policy = ExitPolicy('accept *:80', 'accept *:443', 'reject *:*')
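To make the idea of an exit policy more concrete, here is a minimal, dependency-free sketch of how a list of rules could be evaluated: rules are checked in order, and the first one that matches decides. The function names are hypothetical, the sketch only understands rules whose address part is `*`, and treating "no rule matched" as a rejection is an assumption of this example rather than a statement about Tor's own behavior.

```python
def port_matches(port_spec: str, port: int) -> bool:
    """Match '*', a single port ('443'), or a range ('80-81')."""
    if port_spec == "*":
        return True
    if "-" in port_spec:
        low, high = port_spec.split("-")
        return int(low) <= port <= int(high)
    return int(port_spec) == port


def can_exit_to(rules: list[str], port: int) -> bool:
    """Apply rules in order; the first rule whose port matches decides."""
    for rule in rules:
        action, target = rule.split()               # e.g. 'accept', '*:443'
        address, port_spec = target.rsplit(":", 1)
        if address != "*":
            continue  # address-specific rules are out of scope for this sketch
        if port_matches(port_spec, port):
            return action == "accept"
    return False  # assumption: nothing matched means the port is not allowed


rules = ["accept *:80", "accept *:443", "reject *:*"]
print(can_exit_to(rules, 443))  # allowed by the second rule
print(can_exit_to(rules, 22))   # falls through to 'reject *:*'
```

With the minimal policy above, web traffic on ports 80 and 443 can exit, while everything else (SSH on port 22, for instance) is caught by the final catch-all rule.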
Comparing Exit to Non-Exit Relays
- Both of them (all relays, in fact) are publicly listed on the Tor network.
Unfortunately, this opens them up to the possibility of being blocked by services that do not understand how Tor works, or those that just want to censor Tor users. This can be circumvented by bridges, but more on that later.
- Both exit and non-exit relays register themselves with the network's directory authorities.
| Non-Exit (Guard & Middle) Relays | Exit Relays |
|---|---|
| 1. As their name suggests, their exit policy does not allow traffic to exit through them. | 1. Their exit policy allows clients to exit. |
| 2. May be run by independent relay operators from their homes. | 2. Relay operators should be affiliated with an institution, due to the high risk level and the resources these relays demand. |
| 3. They usually do not receive abuse complaints, as the services clients connect to cannot see their IP addresses. | 3. They receive varying numbers of abuse complaints, depending on the number of ports the relay operator allows in the exit policy. |
| 4. Require minimal maintenance. | 4. They are higher-maintenance; they require more effort from the relay operator in terms of configuration, resources (e.g. time, bandwidth), and caution against legal exposure. |
| 5. Bandwidth usage is highly customizable in the Tor configuration. | 5. Bandwidth usage is not as customizable as with non-exit relays. |
| 6. Less prone to liability and legal exposure. | 6. Have the greatest legal exposure and liability. |
Bridges are nodes in the Tor network that work like other relays, except that they are not listed in the public Tor directory (directory authorities). Instead, they register themselves with the bridge authorities. As mentioned earlier, Tor users can be blocked or censored by some services. Furthermore, governments (especially oppressive regimes) and ISPs can blacklist the IP addresses of Tor relays, which are public. Tor bridges come in handy in these cases because they are not publicly listed. They offer an extra layer of security by strengthening anonymity, and let users freely reach the internet services they want to connect to.
However, some governments, such as those of China and Iran, have figured out ways to detect and block users' connections to Tor bridges. Enter pluggable transports: circumvention tools that the Tor network uses to disguise the traffic it sends out, thus adding a layer of obfuscation.
Characteristics of Tor Bridges
- They are relatively easy to operate.
- They require low bandwidth (at least 1 Mbit/s).
- Low risk, high impact: since they are not publicly listed, they offer users more privacy; anyone snooping on a user wouldn't know that the user is contacting a Tor relay IP address.
- They are not likely to receive any abuse complaints.
- They are unlikely to be blocked by popular internet services- since they are not listed as public relays.
Bridges are a great option for a relay operator if they can run a Tor node from their home network.
It is advisable to do so (run a bridge rather than a publicly listed relay), especially if they have only one static IP.
This will ensure that their non-Tor traffic isn't blocked or censored as though it's coming from Tor. However, if they have multiple static IPs or a dynamic IP, then they should be good to go.
Comparing Bridges to Other (Publicly-listed) Relays
- Both are operated by volunteer relay operators.
- Bridges are similar to non-exit relays in terms of ease of maintenance and low risk.
- Bridges and non-exit relays do not usually receive abuse complaints as their IP addresses aren't visible to the services the client nodes connect to.
- Like non-exit relays, bridges can be conveniently run from home by individual relay operators, as long as they have a dynamic or multiple static IP addresses.
| Bridges | Other (Publicly-listed) Relays |
|---|---|
| 1. Their IP addresses are not publicly listed in the Tor directory. | 1. Their IP addresses are publicly listed in the Tor directory. |
| 2. They register themselves with the Tor bridge authorities. | 2. They register themselves with the Tor directory authorities. |
| 3. They are low-maintenance; easy to operate. | 3. Exit relays are high-maintenance. |
| 4. Require very low bandwidth; suitable for people that don't have a lot of bandwidth to donate. | 4. Guard relays (entry relays) require a relatively significant amount of bandwidth (2 MBps), while exit relays need huge amounts that are more likely to be available on an institutional network. |
Bad Relays on the Tor Network
A bad relay is one that either does not work as expected, or tampers with the users' connections. Since Tor is an open network, i.e. anyone can contribute by setting up relays, some of these bad relays are bound to emerge. This can happen in either of two ways:
1. When an operator misconfigures the relay they are running.
This is mostly accidental and can easily be committed by new operators with less experience configuring and running Tor relays. An example of misconfiguration is dishonoring the exit policy of an exit relay- this could be by creating a conflict between the announced policy and the actual possible destination e.g. by including an invalid port in the exit policy; or, rejecting the exit policy in totality. Following this technical setup would be useful in helping to avoid the emergence of bad relays due to misconfiguration.
So, what happens when a bad-relay-by-misconfiguration is caught, you ask?
It is important to note that most bad relays are usually caught and reported by the members of the Tor community.
Bad relays should be reported by writing to Tor's bad relays mailing list:
bad-relays AT lists DOT torproject DOT org, providing a detailed description of the bad relay. The description should include:
- The relay's IP address or fingerprint (a 40-character hex string that looks like this: 203933ED4E55EF8A3C3518427D1A1ED6A4CC285E, for example).
- The bad behavior observed.
- Any other information that would be helpful in investigating the issue.
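Before sending a report, it can help to check that the fingerprint you copied really is a 40-character hex string. The helper below is a small sketch of such a check; the function name is my own and it is not part of any Tor tooling.

```python
import re

# A relay fingerprint is described above as a 40-character hex string.
FINGERPRINT_RE = re.compile(r"[0-9A-Fa-f]{40}")


def is_valid_fingerprint(text: str) -> bool:
    """Return True if text (ignoring surrounding whitespace) is 40 hex characters."""
    return FINGERPRINT_RE.fullmatch(text.strip()) is not None


print(is_valid_fingerprint("203933ED4E55EF8A3C3518427D1A1ED6A4CC285E"))
print(is_valid_fingerprint("papua"))
```

The example fingerprint from the list above passes the check, while a nickname such as papua does not.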
Tor users can find out which exit they are using by visiting the Tor check page. Once a report arrives, the Tor bad-relays team springs into action. They:
- Look into the issue by scanning reports to verify the nature of the bad relay.
- Contact the relay operator in an attempt to get them to fix the misconfiguration issue.
The relay configuration has a ContactInfo field that gives Tor a way to reach the operator.
If the operator responsible can't be reached, the team assigns the BadExit flag to the exit node, to alert clients that it should no longer be used in an exit position. Most of the time, the bad-relays team is able to contact the operators; however, where an operator has not kept the ContactInfo field up to date, the team has to flag the relay as a bad exit to protect Tor users. There are other types of flags that are applied to bad relays, depending on the severity of the issue:
- BadExit - Never to be used as an exit relay (for relays that appear to tamper with exit traffic).
- Invalid - Never to be used unless AllowInvalidNodes is set in the config.
- Reject - Dropped from the Tor consensus entirely.
However, relays that are only flagged as BadExit are left in the network; they can still be useful as non-exit (middle or guard) relays.
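The effect of these three flags can be sketched as a small function. It models only the descriptions given in this post; the function name and the `allow_invalid` parameter (standing in for the AllowInvalidNodes setting) are illustrative, and real path selection in Tor depends on many more factors.

```python
def allowed_positions(flags: set[str], allow_invalid: bool = False) -> set[str]:
    """Return the circuit positions a relay may still fill, given its bad-relay flags."""
    if "Reject" in flags:
        return set()  # dropped from the consensus entirely
    if "Invalid" in flags and not allow_invalid:
        return set()  # never used unless the client opts in
    positions = {"guard", "middle", "exit"}
    if "BadExit" in flags:
        positions.discard("exit")  # still usable as guard or middle
    return positions


print(sorted(allowed_positions({"BadExit"})))
print(sorted(allowed_positions({"Invalid"})))
print(sorted(allowed_positions({"Invalid"}, allow_invalid=True)))
```

A BadExit relay keeps its guard and middle roles, an Invalid relay is skipped unless the client opts in, and a rejected relay is never used.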
2. Malicious intent by operator to harm Tor users.
Relays in the Tor network are considered malicious if they exhibit potentially harmful behaviors such as:
- DNS poisoning.
- Sniffing of user traffic.
- Excessive logging during normal operation (over notice).
- Publishing traffic destination/IP information.
- Tampering with statistics.
- Flooding the network with new relays (sybil attacks).
- Re-routing exit traffic back to the network; not actually exiting any traffic.
- Evading the MyFamily restrictions even though the operator is running more than one relay. Except for bridges, which do not advertise their family members in their configuration, relays must include their family size (either effective or alleged family size, or both; more on that later). In the Tor Metrics Portal, the effective family size would be appended to the Nickname config, appearing as such:
† represents the total effective family size of the relay, including the relay itself.
Let's say the value of Nickname is papua, and that of † is 3:
- Any other behavior that would put Tor users at risk.
Dealing with Malicious Bad Relays
Unlike in the case of misconfigured relays, the Tor bad-relays team does not contact operators running malicious relays. Instead, they:
- Reject the relays from the network as fast as possible to ensure the safety of Tor users.
Corner cases: sometimes a bad relay is ambiguous, and the bad-relays team cannot easily tell whether it is the result of misconfiguration or of malicious intent. In such cases, the ContactInfo field comes in handy for verification. Admittedly, this could be risky if the operator is dishonest. Kicking the relay out could be bad for relay diversity, but keeping it could be harmful to users. It is a classic catch-22 situation, though the team is often helped by heuristics based on past experience with different relay groups and operators.
To learn more about bad relays in the Tor network, check out:
Relay Search: The Tor Metrics Portal
The Tor Metrics portal provides a Relay Search service that displays data about relays, and sometimes bridges, in the Tor network. This service has three tools, namely:
1. Simple Search- that displays data about single relays and bridges in the network. To look up these relays, you can use keywords such as:
- (Partial) nicknames e.g. 'papua'
- (Partial) IP addresses e.g. '128.31'
- (Partial) fingerprints e.g '9695DFC3'
You can also perform combined searches using these keywords e.g 'papua 128.31'.
Aside from keywords, you can use qualifiers to perform a simple search for single relays and bridges. These qualifiers include:
- Country e.g. 'papua country:kenya'
- Specific contact info e.g. 'contact:arma'
- Specific flags e.g. 'flag:exit' (More on Tor relay flags in a later section.)
This search provides important information about how the relays are configured, along with their history represented graphically.
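The query syntax described above (space-separated keywords plus `name:value` qualifiers) is simple enough to assemble programmatically. The helper below is a hypothetical sketch that only builds the search string; it does not contact the Tor Metrics portal, and the qualifier values are the ones used as examples in this post.

```python
def build_query(*keywords: str, **qualifiers: str) -> str:
    """Join keywords and name:value qualifiers into one search string."""
    parts = list(keywords)
    parts += [f"{name}:{value}" for name, value in qualifiers.items()]
    return " ".join(parts)


print(build_query("papua", "128.31"))         # combined keyword search
print(build_query("papua", country="kenya"))  # keyword plus a qualifier
print(build_query(flag="exit"))               # qualifier only
```

The three calls reproduce the example queries from this section: 'papua 128.31', 'papua country:kenya' and 'flag:exit'.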
If you are searching for a bridge, then you must include a hashed fingerprint in your query to prevent leaking the bridge's fingerprint during the search. The hashed-fingerprint file is usually located in the Tor data directory, specified as DataDirectory in your config file.
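If you only have the bridge's plain fingerprint at hand, the hashed form can also be computed. The sketch below assumes that the hashed fingerprint is the SHA-1 digest of the fingerprint's binary (decoded hex) form, written as upper-case hex, which is my understanding of how Tor Metrics defines it; compare the result with your own hashed-fingerprint file before relying on it. The function name is my own.

```python
import hashlib


def hash_fingerprint(fingerprint: str) -> str:
    """Return the SHA-1 of the decoded fingerprint as 40 upper-case hex characters."""
    raw = bytes.fromhex(fingerprint.strip())  # 40 hex characters -> 20 bytes
    return hashlib.sha1(raw).hexdigest().upper()


# The example fingerprint used earlier in this post.
hashed = hash_fingerprint("203933ED4E55EF8A3C3518427D1A1ED6A4CC285E")
print(hashed)
```

Note that the hash is taken over the decoded bytes, not over the hex text itself, so upper-case and lower-case input give the same result.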
Building a query with partial keywords, or using qualifiers entirely, would return a list of entries meeting those criteria. I'll demonstrate: let's use a partial IP address, 128.199. We get a list of 7 entries with relay IP addresses containing 128.199.
The current version of the Tor Relay Search service does not support more than 2000 results.
In the above example, we were able to see all returned results because there were only 7 entries. If there had been more than 2000, we'd have had to narrow our search further.
Components of a Tor Relay's Configuration
If we click on one of the nicknames, say, anonymous, we get more details about the relay, from its configuration to properties such as uptime, flags, country, etc.
A bridge would have far fewer public details, with the only configuration info being the Onion Router (OR) address and Advertised Bandwidth. The 'properties' section, on the other hand, lacks most of the info found in the relay search details, but includes the following extra fields not found in relays:
- Transport protocols - as in pluggable transports, e.g. obfs4, meek, Snowflake.
- Bridge distribution mechanism - e.g. HTTPS, Moat, Email, Reserved, or None. Find out more about these in the Tor Project's BridgeDB.
Notice, in the initial list of entries returned, that every relay or bridge has either a green or red indicator before the nickname field. Green indicates that the relay is running, while red means that it is offline. There's a third state, overloaded, which is shown by an orange indicator.
If you want to search for currently running relays, add the following to your query:
running:true
This would return all running relays, including overloaded ones.
running:false would return only relays flagged as offline.
2. Aggregated Search- this tool displays aggregated data about relays in the network filtered by the search keywords and qualifiers mentioned in the above subsection. An aggregated search provides useful insights into diversity within the network (determined using parameters such as country, IP version, ISPs, etc. I will cover diversity in a later section). This search also reveals the probabilities of using relays in a particular country or Autonomous System (AS) as either guard, middle or exit.
- The results of an aggregated search are restricted to only currently running relays. This means that no matter what query you send, offline relays will not be displayed.
- Bridge data will also not be displayed.
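The two restrictions above can be illustrated with a tiny aggregation over made-up records. The field names and sample values below are hypothetical and are not the portal's actual data format; the sketch simply skips bridges and offline relays and sums a per-relay fraction by country.

```python
from collections import defaultdict


def aggregate_by_country(entries: list[dict]) -> dict[str, float]:
    """Sum the consensus weight fraction per country for running relays only."""
    totals: dict[str, float] = defaultdict(float)
    for entry in entries:
        if entry["type"] != "relay" or not entry["running"]:
            continue  # bridges and offline relays are left out of aggregates
        totals[entry["country"]] += entry["consensus_weight_fraction"]
    return dict(totals)


# Made-up sample data, for illustration only.
sample = [
    {"type": "relay", "running": True, "country": "de", "consensus_weight_fraction": 0.25},
    {"type": "relay", "running": True, "country": "de", "consensus_weight_fraction": 0.5},
    {"type": "relay", "running": False, "country": "us", "consensus_weight_fraction": 0.125},
    {"type": "bridge", "running": True, "country": "us", "consensus_weight_fraction": 0.125},
    {"type": "relay", "running": True, "country": "us", "consensus_weight_fraction": 0.125},
]
print(aggregate_by_country(sample))
```

Only three of the five sample entries contribute to the totals: the offline relay and the bridge are ignored, mirroring the behavior described in the two bullet points above.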
An aggregated search of a relay using a full nickname, for example, would display the following details:
while a search using a qualifier such as flag:exit would return the following aggregate results:
3. Advanced Search- this tool enables you to build more detailed queries to search for data about either single relays and bridges (as in simple search), or aggregated data about currently running relays (as in aggregated search).
The following search parameters are used to build an advanced search query:
- Fingerprint (with an option to include family members)
- Autonomous System (AS)
- First Seen
- Last Seen
- Type (relay or bridge)
- Running (either running or offline)
When performing an aggregated search, the Type and Running parameters are usually ignored, because aggregated searches are restricted to only currently running relays.
A relay search in the Tor Metrics portal also visualizes the history of the relay or bridge over 5 years (in 1 month, 6 month, 1 year and 5 year installments), using graphs.
Each relay history is represented by two graphs; the first one mapping time (X-axis) versus written and read bytes per second, and the second mapping time vs the following parameters:
- Guard Probability
- Middle probability
- Exit probability
- Consensus weight fraction
The first graph visualizing the history of a bridge in the Tor network is similar to that of a normal relay. The second one is different, though. It maps time versus the average number of connected clients. This is because aggregate searches that produce relay probabilities do not include bridge data.
References and other resources you might find helpful: