02. 10. 2020 Damiano Chini Development, NetEye

NetEye Communication over NATS

In a previous post we described how we migrated all Tornado communications in NetEye from direct TCP to NATS. Since then, we’ve extensively and successfully adopted NATS as a communication channel for many of the components in NetEye.

As often happens when approaching new technologies, the initial straightforward approach did not turn out to be the best way to get things working smoothly in all possible scenarios.

In particular, we saw that the approach used for Tornado, where we defined NATS subjects and then user permissions for every subject, did not scale and was hard to maintain in more complex infrastructures.


Distributed Multi-Tenant Scenario

Consider the following scenario where we have a multi-tenant environment composed of:

  • 3 satellite nodes, each of which has 5 NATS publishers that produce events, for a total of 15 publishers
  • 1 master node, on which a single NATS consumer processes all the events coming from the satellites

Each of the 3 satellites is controlled by a different organization, so the main concern is that it must not be possible, on any satellite, to read messages sent from another satellite.

As mentioned above, messages coming from the different satellites are all processed inside the master node, so it must be possible for the consumers residing on the master node to read the messages coming from all the different satellites.


Configuration Based on NATS Subject Permissions

If we defined permissions on subjects for the different publishers and consumers, we would have to establish naming conventions on the master node in advance, for both subject names and publisher names, so that subjects and publishers coming from the different satellites can be told apart.

This is needed in order to give the consumers on the master node the correct permissions to read from the subjects coming from the different satellites.

On the master node we would then also have to configure the permissions of each of the 15 publishers so that they can publish only on their dedicated subject.

This results in a long and error-prone configuration, which is even less maintainable when new satellites, consumers or publishers are added to the infrastructure.
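
To make the problem concrete, here is a minimal sketch of what such a subject-based configuration could look like on the master node. All user names and subject names (acme_publisher_1, acme.events, master_consumer and so on) are hypothetical and only illustrate the naming convention; they are not taken from the actual NetEye configuration. As far as we understand NATS permissions, an operation that is not restricted explicitly is allowed, so the sketch also denies subscriptions to every publisher.

```conf
authorization {
  users = [
    # Hypothetical convention: <organization>_publisher_<n> may publish
    # only on <organization>.events and may not subscribe to anything.
    {user: acme_publisher_1, permissions: {publish: "acme.events", subscribe: {deny: ">"}}}
    {user: acme_publisher_2, permissions: {publish: "acme.events", subscribe: {deny: ">"}}}
    # ... three more entries for acme, then five each for the other two satellites ...
    {user: xyz_publisher_1, permissions: {publish: "xyz.events", subscribe: {deny: ">"}}}

    # The consumer on the master node must be granted every satellite's subjects,
    # and this list has to be extended whenever a satellite is added.
    {user: master_consumer, permissions: {subscribe: ["acme.>", "xyz.>"]}}
  ]
}
```

Every new publisher needs its own entry here, and a single wrong subject in one of them is enough to break the isolation between organizations.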


Configuration Based on NATS Leaf Nodes and NATS Accounts

Given these downsides, we hit on a more structured configuration with the combined use of NATS Leaf Nodes and NATS Accounts.

NATS Leaf Nodes

NATS Leaf Nodes are a type of node in a NATS system which allow you to isolate local traffic from remote traffic. This means that a message will only be sent to a Leaf Node if the Leaf Node really needs to receive that message. Local NATS clients will only authenticate to the Leaf Node, and the Leaf Node will take care of communicating with the NATS cluster.

This means that by instantiating Leaf Nodes on the satellites and configuring the correct permissions for them, you can ensure that no NATS message coming from organization X will pass through a node controlled by organization Y.

Moreover, NATS publishers running on a satellite only need to authenticate locally on the satellite. They are then transparent to the NATS server running on the master node, where we only need to take care of the permissions for the NATS Leaf Node representing the satellite.
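
On the satellite side, this could be sketched roughly as follows. The master hostname and the certificate paths are placeholders chosen for illustration, not values from a real installation; 7422 is the default port on which a NATS server accepts Leaf Node connections.

```conf
# Local NATS clients (the publishers) connect to the satellite's own server.
port: 4222

leafnodes {
  remotes = [
    {
      # Hypothetical address of the NATS server on the master node
      url: "tls://neteye-master.example.com:7422"
      tls: {
        # Placeholder paths: the certificate that identifies this satellite
        # to the master, its private key, and the CA used to verify the master
        cert_file: "/path/to/satellite.crt.pem"
        key_file: "/path/to/satellite.key"
        ca_file: "/path/to/root-ca.crt"
      }
    }
  ]
}
```

With a setup like this, the master only ever sees one connection per satellite, regardless of how many publishers are running behind it.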

NATS Accounts

NATS Accounts are a construct for defining isolated communication contexts. They are very useful in multi-tenant environments, since each message is only visible in its own communication context (unless the configuration explicitly states otherwise).

With accounts it is then possible to easily define communication contexts in such a way that satellites can only write to their own communication context, and that the master node can read from all of the satellites’ communication contexts. This allows us to easily ensure that each Leaf Node’s communication channel is isolated from the other Leaf Nodes’ communication channels.

An Example NATS Configuration

In the following example you can see how, on the master node side, we can configure the NATS Accounts and NATS Leaf Nodes in such a way that:

  • on the master node the NATS consumers are able to read messages coming from the various satellites
  • on satellite nodes it will never be possible to read a message coming from another satellite

In particular, the example represents a situation where on the master node we are running:

  • The Tornado engine (which authenticates to NATS as user tornado)
  • Some Telegraf consumers (which authenticate to NATS as user telegraf_consumer)

On our 2 satellites (which authenticate themselves through the NATS Leaf Nodes as satellite_acme and satellite_xyz) we are instead running:

  • Some Tornado Collectors (which publish on the subject tornado.events)
  • Some Telegraf publishers (which publish on the subject telegraf_<organization_name>)

leafnodes {
  port: 7422

  tls: {
    cert_file: "/neteye/shared/nats-server/conf/certs/nats-server.crt.pem"
    key_file: "/neteye/shared/nats-server/conf/certs/private/nats-server.key"
    # Require a client certificate and map user id from certificate
    verify_and_map: true
  }
}

accounts: {
  # One account per satellite: each is an isolated communication context
  SATELLITE1: {
    users: [
      {user: satellite_acme}
    ]
    # Subjects that this account makes available to other accounts
    exports: [
      {stream: tornado.events}
      {stream: telegraf_acme}
    ]
  },
  SATELLITE2: {
    users: [
      {user: satellite_xyz}
    ]
    exports: [
      {stream: tornado.events}
      {stream: telegraf_xyz}
    ]
  },

  # The master account imports the streams exported by every satellite,
  # so only consumers on the master node can read all of them
  MASTER: {
    users: [
      {user: tornado}
      {user: telegraf_consumer}
    ]
    imports: [
      {stream: {account: SATELLITE1, subject: tornado.events}}
      {stream: {account: SATELLITE1, subject: telegraf_acme}}
      {stream: {account: SATELLITE2, subject: tornado.events}}
      {stream: {account: SATELLITE2, subject: telegraf_xyz}}
    ]
  }
}
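
To give an idea of how this layout grows, the following fragment sketches what adding a third satellite might look like. The organization name foo, the user satellite_foo and the account name SATELLITE3 are hypothetical. The fragment is not a complete configuration: it only shows the parts that would change in the accounts block above.

```conf
accounts: {
  # SATELLITE1 and SATELLITE2 stay exactly as they are

  # New account for the hypothetical organization "foo"
  SATELLITE3: {
    users: [
      {user: satellite_foo}
    ]
    exports: [
      {stream: tornado.events}
      {stream: telegraf_foo}
    ]
  },

  MASTER: {
    # users stay as they are
    imports: [
      # the four existing imports stay as they are, plus:
      {stream: {account: SATELLITE3, subject: tornado.events}}
      {stream: {account: SATELLITE3, subject: telegraf_foo}}
    ]
  }
}
```

The number of publishers running on the new satellite does not appear anywhere in this fragment, since they authenticate only against the Leaf Node on their own satellite.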