Byzantine failure localization in Wireless Mesh routing

Networking protocols were not designed with security and resilience in mind. Born in small, reliable environments, networks grew until they became one big network that connects us all: the Internet. As a consequence of this enormous success, and of how essential the Internet has become, we now face the problem of Internet ossification: introducing new features and requirements is very difficult because everything must remain backwards compatible.

In our work we focus on network security and resilience, especially in Wireless Community Networks, in the hope that the proposed solutions can also be applied to other scenarios.

Internet protocols lack built-in mechanisms to measure and monitor the correctness and quality of the path that a traffic flow follows. At the same time, given the decentralized management of a community network and the diversity of its routing protocols, the proposed solution should rely only on data-plane information.

In the literature, there are two main approaches to detecting traffic anomalies: (i) acknowledgment-based and (ii) traffic validation-based. Acknowledgment-based solutions (like [1], [2] or [3]) require the acknowledgment of a portion of the traffic and, either by using onion-encrypted ACKs or by triggering a fault detection mechanism, they are able to locate the faulty link. The main drawback of such solutions is that they require knowledge of the whole path of the traffic flow, and with distance-vector routing protocols (like BMX6 or B.A.T.M.A.N.) this information is not available. Traffic validation solutions, on the other hand, are based on the conservation of flow principle: the incoming and outgoing traffic of a node must be equal once the traffic originated by and destined to that node is discounted. Such solutions (like [4], [5], [6] and [7]) are therefore more suitable for our needs.
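As an illustration of the conservation of flow principle, the following is a minimal sketch of a per-node check. It is not taken from any of the cited solutions: the `NodeCounters` structure, its field names and the `tolerance` threshold are assumptions made for this example. The tolerance is there because wireless links lose some packets even when every node behaves correctly.

```python
from dataclasses import dataclass


@dataclass
class NodeCounters:
    """Hypothetical packet counters for one node over one observation window."""
    received: int         # packets arriving on all links
    sent: int             # packets leaving on all links
    destined_here: int    # received packets addressed to this node
    originated_here: int  # sent packets created by this node


def transit_imbalance(c: NodeCounters) -> int:
    """Transit packets that entered the node minus transit packets that left it."""
    transit_in = c.received - c.destined_here
    transit_out = c.sent - c.originated_here
    return transit_in - transit_out


def violates_conservation(c: NodeCounters, tolerance: float = 0.05) -> bool:
    """Flag the node when the relative imbalance exceeds the assumed tolerance.

    The absolute value is used so that both dropped packets (positive
    imbalance) and unexpected extra packets (negative imbalance) are caught.
    """
    transit_in = c.received - c.destined_here
    if transit_in <= 0:
        return False  # nothing to forward, nothing to check
    return abs(transit_imbalance(c)) / transit_in > tolerance
```

For example, a node that receives 1000 packets, of which 100 are addressed to it, and sends 500 packets, of which 50 are its own, has forwarded only 450 of 900 transit packets and would be flagged.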

Mizrak divides the problem of network fault localization into three different subproblems:

  1. Traffic Validation: defines what information the nodes capture from the incoming and outgoing traffic, and the function that compares it to determine whether a node is faulty.
  2. Detection Protocol: determines how the information gathered by the traffic validation mechanism is shared with other nodes in the network.
  3. Reaction: refers to the action taken by the nodes when another node is classified as faulty.
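To make the three subproblems concrete, the following is a rough sketch of how they could fit together. It is not Mizrak's protocol: the report format, the function names `detect` and `react`, and the `tolerance` threshold are all hypothetical. It assumes that every neighbour of a node reports how many transit packets it sent to and received from that node, and that these reports reach the detecting node intact.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Traffic validation (assumed format): each report maps
# (reporter, neighbour) -> (packets_sent_to, packets_received_from),
# counting only transit packets, i.e. packets that are neither addressed
# to the neighbour nor created by it.
Reports = Dict[Tuple[str, str], Tuple[int, int]]


def detect(reports: Reports, tolerance: float = 0.05) -> Set[str]:
    """Detection: combine the neighbours' reports and flag every node
    whose transit traffic is not conserved within the assumed tolerance."""
    into = defaultdict(int)    # transit packets handed to each node
    out_of = defaultdict(int)  # transit packets received back from each node
    for (_reporter, neighbour), (sent_to, received_from) in reports.items():
        into[neighbour] += sent_to
        out_of[neighbour] += received_from
    return {
        node for node in into
        if into[node] > 0
        and abs(into[node] - out_of[node]) / into[node] > tolerance
    }


def react(next_hops: Set[str], suspects: Set[str]) -> Set[str]:
    """Reaction: stop using suspected nodes as next hops."""
    return next_hops - suspects
```

In this sketch, if node A reports sending 100 transit packets to X and node B reports receiving only 40 from X, then X is flagged and removed from the set of usable next hops.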

We propose to revisit each of these subproblems and study the different possible solutions and their quality for wireless mesh networks.