Building Fault Tolerant Industrial Ring Networks
with Ethernet Ring Protection Switching (ERPS)

In today's interconnected world, industrial networks have become an essential part of modern society. Protecting these systems against network failures and outages is paramount, and adding redundancy is one proven method administrators and system engineers use to keep operations running even when a link fails or part of the network goes down. One redundancy protocol gaining popularity is Ethernet Ring Protection Switching (ERPS).

What is Ethernet Ring Protection Switching?

ITU-T G.8032 Ethernet Ring Protection Switching (ERPS) is an open-standard Layer 2 Ethernet protocol designed to prevent switching loops in ring topologies. Originally developed by the telecom industry for Metro Ethernet topologies, ERPS is today primarily used in industrial networks to create fault-tolerant networks. Like other Layer 2 redundancy protocols such as Spanning Tree and Rapid Spanning Tree, its primary focus is to prevent loops when a node or link fails. Its advantage over those protocols is its ability to reconverge and reroute traffic in under 50 milliseconds once a link fault has been detected. However, because reconvergence is so fast, ERPS-capable hardware is required.

How Ethernet Ring Protection Switching Works

ERPS works by first creating a topology of interconnected nodes forming a ring. Each node has a dedicated set of switch ports known as ring ports programmed with unique attributes used for messaging and link status notifications. Ring ports interconnect adjacent nodes forming the ring.

ERPS rings can be configured as a single or multi-site domain. Each domain consists of a primary ring with the option for sub-ring configurations. Each domain uses a control channel, or control VLAN, which carries status messages throughout the ring domain.

Each ERPS ring can be managed by a single node or a pair of nodes. In a single-node configuration, the managing node is the Owner. In a dual-node configuration, the two management nodes are known as the Owner and the Neighbor. The Owner node decides which ring port will block data traffic. The designated link connecting the Owner to its adjacent node is known as the Ring Protection Link (RPL). Under normal conditions, the RPL is blocked: only control messages may traverse it, and all other traffic is dropped. If a fault is detected elsewhere in the ring, the RPL transitions from blocking to forwarding, creating an alternative path for traffic. This switching mechanism is what allows ERPS to maintain connectivity while preventing switching loops.
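The blocking behavior described above can be sketched in a few lines of Python. This is a toy model for illustration only, not a G.8032 implementation: the Ring class, its method names, and the four-node example are all assumptions made for this sketch. It shows that blocking the RPL leaves a loop-free path in the idle state, and that unblocking it after a failure keeps every node reachable.

```python
from dataclasses import dataclass, field


@dataclass
class Ring:
    """Toy model of one ERPS ring (hypothetical, for illustration only)."""
    nodes: list[str]
    rpl: tuple[str, str]  # link that is blocked while the ring is healthy
    failed: set[frozenset] = field(default_factory=set)

    def links(self):
        # Each node connects to the next one; the last wraps to the first.
        n = len(self.nodes)
        return [frozenset((self.nodes[i], self.nodes[(i + 1) % n]))
                for i in range(n)]

    def fail(self, a, b):
        self.failed.add(frozenset((a, b)))

    def forwarding_links(self):
        # The RPL forwards data only when some other ring link has failed.
        rpl = frozenset(self.rpl)
        protecting = bool(self.failed - {rpl})
        return [link for link in self.links()
                if link not in self.failed and (link != rpl or protecting)]

    def reachable(self, start):
        # Simple graph walk over the links that currently forward data.
        seen, stack = {start}, [start]
        links = self.forwarding_links()
        while stack:
            node = stack.pop()
            for link in links:
                if node in link:
                    (other,) = link - {node}
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        return seen


ring = Ring(nodes=["A", "B", "C", "D"], rpl=("D", "A"))
print(len(ring.forwarding_links()))             # 3 links: no loop in a 4-node ring
ring.fail("B", "C")
print(len(ring.forwarding_links()))             # still 3: the RPL replaced the failed link
print(ring.reachable("A") == set(ring.nodes))   # True
```

In both states the ring forwards over exactly one fewer link than it has nodes, which is why no loop can form.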

ERPS Link Fault Detection

When a link fails, or a Signal Failure (SF) condition is detected between two interconnected nodes, the link between the nodes is blocked and Ring Automatic Protection Switching (R-APS) messages are sent over a dedicated control channel, or "control VLAN", notifying the other switches of the event.

The nodes that detect the failure perform a Forwarding Database (FDB) flush and send an R-APS(SF) message out of both ring ports. The adjacent nodes receiving this message perform an FDB flush and forward the R-APS(SF) message, which carries the node ID (MAC address) and the Blocked Port Reference (BPR, the port ID), out of both ring ports. All nodes receiving the R-APS(SF) messages perform an FDB flush and forward the message. When the RPL Owner and Neighbor receive the R-APS(SF) messages, the Owner performs an FDB flush and unblocks the RPL.
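To make the notification flow concrete, the following sketch counts how many hops each node is from the nearest node that detected the failure, which is the point at which it first sees an R-APS(SF) message and flushes its FDB. The function name sf_hops and the five-node ring are assumptions made for this example, and real timing depends on the hardware rather than on hop count alone.

```python
def sf_hops(nodes, failed_link):
    """Hops from each node to the nearest node that detected the failure.

    `nodes` lists the ring in order; `failed_link` names two adjacent nodes.
    Hypothetical helper for illustration, not part of any ERPS API.
    """
    a, b = failed_link
    n = len(nodes)
    ia, ib = nodes.index(a), nodes.index(b)
    if (ib + 1) % n == ia:
        ia, ib = ib, ia  # normalize so that b follows a in ring order
    if (ia + 1) % n != ib:
        raise ValueError("failed_link must join two adjacent ring nodes")

    hops = {}
    for i, node in enumerate(nodes):
        from_a = (ia - i) % n  # a's message travels away from the failed link
        from_b = (i - ib) % n  # b's message travels the opposite way
        hops[node] = min(from_a, from_b)
    return hops


print(sf_hops(["A", "B", "C", "D", "E"], ("B", "C")))
# {'A': 1, 'B': 0, 'C': 0, 'D': 1, 'E': 2}
```

Because messages leave both ends of the failed link, the farthest node is at most about half the ring away from a detecting node.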

Once the fault has been cleared and the link restored, the nodes adjacent to the recovered link send R-APS(NR) (No Request) messages and start their guard timers, which prevent the ring ports from acting on outdated R-APS messages. When the RPL Owner receives the R-APS(NR) messages, it starts the Wait-To-Restore (WTR) timer. When the WTR timer expires, the Owner blocks its RPL port, sends an R-APS(NR, RB) (RPL Blocked) message, and performs an FDB flush. Transit nodes receive the R-APS(NR, RB) messages and flush their FDBs. When the nodes adjacent to the recovered link receive the R-APS(NR, RB) message, they remove the blocks from their recovered ports, stop transmitting R-APS(NR) messages, and flush their FDBs. Once the process is complete, the ring returns to its normal state.
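The failure and recovery sequence can be summarized as a small state machine for the RPL Owner. The sketch below is a deliberate simplification: the class name RplOwner, the event strings, and the three states shown are assumptions for illustration. A real G.8032 implementation has additional states, such as manual and forced switch, and runs actual timers rather than accepting a "WTR_EXPIRED" event.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"              # ring healthy, RPL blocked
    PROTECTION = "protection"  # a ring link has failed, RPL forwarding
    PENDING = "pending"        # fault cleared, WTR timer running


class RplOwner:
    """Simplified RPL Owner behavior (hypothetical sketch, not G.8032-complete)."""

    def __init__(self):
        self.state = State.IDLE
        self.rpl_blocked = True
        self.fdb_flushes = 0
        self.sent = []  # R-APS messages this node has originated

    def on_event(self, event):
        if event == "R-APS(SF)" and self.state != State.PROTECTION:
            # Failure reported: flush learned addresses and open the RPL.
            self.fdb_flushes += 1
            self.rpl_blocked = False
            self.state = State.PROTECTION
        elif event == "R-APS(NR)" and self.state == State.PROTECTION:
            # Link recovered: keep the RPL open until the WTR timer expires.
            self.state = State.PENDING
        elif event == "WTR_EXPIRED" and self.state == State.PENDING:
            # Revert: block the RPL again and tell the ring about it.
            self.rpl_blocked = True
            self.sent.append("R-APS(NR, RB)")
            self.fdb_flushes += 1
            self.state = State.IDLE
        return self.state


owner = RplOwner()
for event in ["R-APS(SF)", "R-APS(NR)", "WTR_EXPIRED"]:
    print(event, "->", owner.on_event(event).value, "| RPL blocked:", owner.rpl_blocked)
```

Holding the RPL open during the pending state is what keeps a flapping link from repeatedly switching traffic back and forth.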

Independent Paths for Ring Topology

Loop avoidance in an ERPS ring configuration is achieved by connecting each node to its adjacent nodes in a primary ring or sub-ring over two independent links. One link is designated as the primary path and the other as the secondary (protection) path.

This redundancy is what enables fast network recovery when a link fails. When the primary path fails, traffic is automatically switched to the secondary path so that the network remains operational. All nodes also perform a Forwarding Database (FDB) flush, which lets them relearn addresses over the new path so that traffic resumes quickly.

Note that there are two versions of ERPS: G.8032v1 supports a single-ring topology, and G.8032v2 supports integrated multiple-ring and ladder topologies. A single ring can withstand one failure before connectivity is lost between switches and ports. A larger ring, however, is more likely to experience more than one failure at a time. To address this, G.8032v2 enables multiple rings on a single switch.
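The single-failure limit of one ring is easy to demonstrate. The sketch below, with a hypothetical helper named partitions and a made-up six-node ring, returns the groups of nodes that can still reach one another after a set of link failures. It assumes a ring of at least three nodes and that the RPL has already been unblocked.

```python
def partitions(nodes, failed_links):
    """Groups of ring nodes that remain mutually reachable.

    `nodes` lists the ring in order; `failed_links` is a list of
    (node, node) pairs. Hypothetical helper for illustration only.
    """
    n = len(nodes)
    failed = {frozenset(link) for link in failed_links}
    # Index i marks a cut when the link from nodes[i] to the next node is down.
    cuts = [i for i in range(n)
            if frozenset((nodes[i], nodes[(i + 1) % n])) in failed]
    if len(cuts) <= 1:
        return [list(nodes)]  # zero or one failure: the ring stays connected

    groups = []
    for j, cut in enumerate(cuts):
        start = (cut + 1) % n              # first node after this cut
        end = cuts[(j + 1) % len(cuts)]    # last node before the next cut
        size = (end - start) % n + 1
        groups.append([nodes[(start + k) % n] for k in range(size)])
    return groups


ring = ["A", "B", "C", "D", "E", "F"]
print(partitions(ring, [("B", "C")]))
# [['A', 'B', 'C', 'D', 'E', 'F']]
print(partitions(ring, [("B", "C"), ("E", "F")]))
# [['C', 'D', 'E'], ['F', 'A', 'B']]
```

With two simultaneous failures the ring splits into two isolated segments, which is the situation that splitting a large ring into several smaller rings is meant to make less likely.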

G.8032v2 also provides support for subring topologies, that is, partial rings configured in the shape of a "C" that are not fully closed. A subring can be attached to a fully closed ("major") ring or to another subring, with the end nodes of the subring attached to nodes of the ring it joins. Of the two versions, G.8032v2 enables larger volumes of Ethernet traffic to flow to more connection points with a high level of redundancy.

Achieving High Network Availability with ERPS Ring Technology

Industrial Ethernet switches are critical components in network infrastructures, often used in harsh environments and mission-critical applications. The Ethernet Ring Protection Switching (ERPS) protocol is a powerful technology and service that can be used on industrial switches to ensure high network availability and prevent network downtime.

Redundancy and Resilience with ERPS Rings

ERPS Rings offer network redundancy and resilience by providing a redundant path in the network. This is achieved by creating a virtual control channel that allows the nodes in the main ring to communicate and reroute traffic in the event of a fault. ERPS Rings can be used to build highly available networks in industries such as manufacturing, transportation, and energy.

Protecting Network Traffic with ERPS Rings

ERPS Rings protect network traffic by providing sub-50ms protection and recovery in the event of a fault. This means that network traffic can be quickly and seamlessly rerouted, reducing the impact of network downtime. ERPS Rings are ideal for applications where real-time data is critical, such as in control systems or process automation.

Simplifying Network Management with ERPS Rings

ERPS Rings simplify network management by providing a vendor-neutral solution that can be used with industrial switches from multiple vendors. This reduces the complexity of managing networks with many Ethernet switches and ports and helps to ensure interoperability between different network components.

The Cost of Network Downtime

Why is preventing downtime on a network so important? The simple answer is money. According to Gartner, the cost of network downtime to a small business is approximately $137 to $427 per minute, whereas for larger industrial facilities, network downtime can cost over $16,000 per minute, or approximately $1 million per hour. Even a relatively minor issue, such as a loss of signal or wireless connectivity in the office, can waste the time of both staff and IT departments.

Not only is network downtime expensive, in some instances, it can also be dangerous. Consider a utility plant that suffers a network failure that causes a power blackout across a major city. Or a surveillance system unable to view and share real-time security video with police. In a mining operation, for example, a network or signal failure could result in the malfunction of a methane gas ventilation fan, endangering miners.

Any network downtime — even just a few seconds — can cause critical errors and operational issues. That’s why Antaira industrial managed switches come preconfigured for ERPS and other ring redundancy protocols to help reduce the risk of network outages.

Overall, ERPS Ring solutions are a powerful technology that can provide network redundancy, resilience, and protection for mission-critical applications in industrial settings. By using ERPS Ring solutions, organizations can achieve high network availability, protect network traffic, and simplify network management. For more information, contact our technical team at 714-671-9000.
