nvme-stas 1.1.6

Name

stacd.conf — stacd(8) configuration file

Synopsis

/etc/stas/stacd.conf

Description

When stacd(8) starts up, it reads its configuration from stacd.conf.

Configuration File Format

stacd.conf is a plain text file divided into sections, with configuration entries in the style key=value. A space immediately before or after the "=" is ignored. Empty lines and lines starting with "#" are ignored, which may be used for commenting.
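The format described above can be explored with a short Python sketch that uses the standard configparser module. This is only an illustration, not how stacd itself reads the file; the sample content and the helper name load_global are assumptions made for this example.

```python
import configparser

# Hypothetical sample content, for illustration only.
SAMPLE = """\
# Lines starting with '#' are comments.

[Global]
tron = false
kato=30
ip-family = ipv4+ipv6
"""


def load_global(text):
    """Return the [Global] entries of a stacd.conf-style text as a dict."""
    # Treat only "=" as the key/value delimiter and only "#" as a comment prefix.
    parser = configparser.ConfigParser(delimiters=("=",), comment_prefixes=("#",))
    parser.read_string(text)
    return dict(parser["Global"])


if __name__ == "__main__":
    print(load_global(SAMPLE))
```

Spaces around the "=" are stripped, so "tron = false" and "kato=30" are parsed the same way. Note that a default ConfigParser rejects a key that is repeated within a section, so this sketch is not suitable as-is for options such as "controller=" that may appear more than once.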

Options

[Global] section

The following options are available in the "[Global]" section:

tron=

Trace ON. Takes a boolean argument. If "true", enables full code tracing. The trace will be displayed in the system log such as systemd's journal. Defaults to "false".

hdr-digest=

Enable Protocol Data Unit (PDU) Header Digest. Takes a boolean argument. NVMe/TCP supports an optional PDU Header Digest. Digests are calculated using the CRC32C algorithm. If "true", Header Digests are inserted in PDUs and checked for errors. Defaults to "false".

data-digest=

Enable Protocol Data Unit (PDU) Data Digest. Takes a boolean argument. NVMe/TCP supports an optional PDU Data Digest. Digests are calculated using the CRC32C algorithm. If "true", Data Digests are inserted in PDUs and checked for errors. Defaults to "false".
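For reference, the CRC32C checksum (the Castagnoli CRC) mentioned for both digests can be computed as in the bit-by-bit Python sketch below. It illustrates the checksum algorithm only; how a digest is placed within a PDU is defined by the NVMe/TCP specification and is not shown here. The function name is chosen for this example.

```python
def crc32c(data: bytes) -> int:
    """Compute the CRC32C (Castagnoli) checksum of `data`, one bit at a time."""
    crc = 0xFFFFFFFF  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the reflected form of the Castagnoli polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF  # final inversion


if __name__ == "__main__":
    print(hex(crc32c(b"123456789")))
```

The string "123456789" is the conventional check input for CRC algorithms; its CRC32C value is 0xE3069283.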

kato=

Keep Alive Timeout (KATO) in seconds. Takes an unsigned integer. Defaults to 30 seconds for Discovery Controller connections and 120 seconds for I/O Controller connections.

ip-family=

Takes a string argument. Specifies whether IPv4, IPv6, or both are supported when connecting to a Controller. Connections will not be attempted to IP addresses (whether discovered or manually configured with the "controller=" option) whose address family is disabled by this option. If an invalid value is entered, "ipv4+ipv6" is used.

Choices are "ipv4", "ipv6", or "ipv4+ipv6".

Defaults to "ipv4+ipv6".
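The filtering described for ip-family= can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not stacd's implementation: the function names are hypothetical, and only literal IP addresses are handled (a host name would first have to be resolved).

```python
import ipaddress

# Map each documented choice to the IP versions it allows.
_CHOICES = {"ipv4": {4}, "ipv6": {6}, "ipv4+ipv6": {4, 6}}


def allowed_versions(ip_family):
    """Return the set of IP versions allowed by an ip-family= value."""
    # An invalid value falls back to "ipv4+ipv6", as described above.
    return _CHOICES.get(ip_family, _CHOICES["ipv4+ipv6"])


def connection_allowed(address, ip_family):
    """True if a connection to `address` is permitted under `ip_family`."""
    return ipaddress.ip_address(address).version in allowed_versions(ip_family)


if __name__ == "__main__":
    print(connection_allowed("192.168.1.10", "ipv4"))
    print(connection_allowed("fe80::1", "ipv4"))
```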

ignore-iface=

Takes a boolean argument. This option controls how connections with I/O Controllers (IOC) are made.

There is no guarantee that the routing tables contain a route to a given IOC. However, the socket option SO_BINDTODEVICE can be used to force the connection to be made on a specific interface instead of letting the routing tables decide where to make the connection.

This option determines whether stacd will use SO_BINDTODEVICE to force connections on an interface or just rely on the routing tables. The default is to use SO_BINDTODEVICE, in other words, stacd does not ignore the interface.

BACKGROUND: By default, stacd will connect to IOCs on the same interface that was used to retrieve the discovery log pages. If stafd discovers a DC on an interface using mDNS, and stafd connects to that DC and retrieves the log pages, it is expected that the storage subsystems listed in the log pages are reachable on the same interface where the DC was discovered.

For example, let's say a DC is discovered on interface ens102. Then all the subsystems listed in the log pages retrieved from that DC must be reachable on interface ens102. If this doesn't work, for example you cannot "ping -I ens102 [storage-ip]", then the most likely explanation is that proxy ARP is not enabled on the switch that the host is connected to on interface ens102. Whatever you do, resist the temptation to manually set up the routing tables or to add alternate routes going over a different interface than the one where the DC is located. That simply won't work. Make sure proxy ARP is enabled on the switch first.

Setting routes won't work because, by default, stacd uses the SO_BINDTODEVICE socket option when it connects to IOCs. This option is used to force a socket connection to be made on a specific interface instead of letting the routing tables decide where to connect the socket. Even if you were to manually configure an alternate route on a different interface, the connections (i.e. host to IOC) will still be made on the interface where the DC was discovered by stafd.

Defaults to "false".
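To make the effect of SO_BINDTODEVICE concrete, here is a small Python sketch of the socket option itself. It is not taken from stacd; the function name and its ignore_iface parameter are assumptions that mirror the option described above. SO_BINDTODEVICE is specific to Linux, and setting it may require elevated privileges.

```python
import socket

# socket.SO_BINDTODEVICE is only defined on Linux, where its value is 25.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def make_tcp_socket(iface, ignore_iface=False):
    """Create a TCP socket, bound to `iface` unless the interface is ignored."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if iface and not ignore_iface:
        # Force traffic onto `iface` instead of letting the routing tables decide.
        # This can raise OSError (e.g. unknown interface) or PermissionError.
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, iface.encode() + b"\0")
    return sock


if __name__ == "__main__":
    # With ignore_iface=True the routing tables alone decide the outgoing interface.
    s = make_tcp_socket("ens102", ignore_iface=True)
    s.close()
```

A socket bound this way only sends and receives through the named interface, which is why alternate routes over other interfaces have no effect on it.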

udev-rule=

Takes a string argument "enabled" or "disabled". This option determines whether nvme-cli's udev rule will be executed or ignored.

A udev rule gets installed with nvme-cli that tells the udev daemon (udevd) to look for Asynchronous Event Notifications (AEN) indicating a change of Discovery Log Page Entries (DLPE). The udev rule is installed as: /usr/lib/udev/rules.d/70-nvmf-autoconnect.rules

When an AEN is detected, udevd simply instructs systemd to start a one-shot service that will retrieve the changed DLPEs and connect to all the I/O Controllers (IOC) listed in the DLPEs. This is basically the same as performing nvme-cli's "connect-all" command.

Unfortunately, stafd and stacd also perform the same operations when an AEN is received. This results in a race condition between udevd and stafd/stacd.

This is not really a problem. stafd and stacd are designed to handle this type of race condition and will conclude, eventually, that the connections succeeded. The only downside is that there may be error messages printed to the syslog when the race condition happens. These messages are printed by the kernel because two processes are trying to connect to the same IOC at the same time. One of them will be rejected by the kernel, but the other will succeed.

The udev-rule option allows a user to disable nvme-cli's udev rule so that udevd will not act on received AENs. Instead, only stafd/stacd will be allowed to react to AENs and set up IOC connections.

Defaults to "enabled", which means that udevd and stafd/stacd will react to AENs. It also means that the race condition will happen by default and error messages will be printed to the syslog.

sticky-connections=

Keep existing connections to I/O controllers (IOC). Takes a string argument "enabled" or "disabled".

The parameter sticky-connections determines how stacd reacts to the removal of an IOC's Discovery Log Page Entry (DLPE) or the removal of a "controller=" entry in /etc/stas/stacd.conf. In other words, it determines whether stacd should immediately disconnect from an IOC when its DLPE or "controller=" entry is removed, or whether it should maintain the connection.

Table 1. List of terms used in the following text:

Term              Description
Manual Config     Refers to manually adding entries to stacd.conf
Automatic Config  Refers to receiving configuration from a Discovery Controller (DC) as DLPEs
External Config   Refers to configuration done outside of the nvme-stas framework, for example using nvme-cli commands

IOC connection creation.  There are 3 ways to configure IOC connections on a host:

  1. Manual Config by adding "controller=" entries to the "[Controllers]" section (see below).

  2. Automatic Config received in the form of DLPEs from a remote DC.

  3. External Config using nvme-cli (e.g. "nvme connect").

Zoning and DLPEs.  Zoning configuration is performed at Discovery Controllers (DC). A zone contains a list of hosts and the IOCs that these hosts are allowed to access. Users can add or remove IOCs and/or hosts from zones.

DCs notify hosts of zoning configuration changes by sending Asynchronous Event Notifications (AEN) indicating a "Change of Discovery Log Page (DLP)". The host uses these AENs as a trigger to retrieve the new list of DLPEs by issuing a Get DLP command. This happens in real time, which means that a host that was previously connected to an IOC may suddenly be told that it is no longer allowed to connect to that IOC and should disconnect from it.

IOC connection removal.  There are 3 ways to remove controller connections to an IOC:

  1. Manual Config.

    1. by adding "blacklist=" entries to the "[Controllers]" section (see below).

    2. by removing "controller=" entries from the "[Controllers]" section.

  2. Automatic Config. As explained above, changing zoning at a DC will result in the host getting a new list of DLPEs. On DLPE removal, the host should remove the connection to the IOC matching that DLPE.

  3. External Config using nvme-cli (e.g. "nvme disconnect" or "nvme disconnect-all").

Some users may prefer IOC connections to be "sticky" and only be removed manually (nvme-cli or "blacklist=") or by a system reboot. They do not want IOC connections to be removed unexpectedly on DLPE removal. This is where sticky-connections= comes into play.

sticky-connections= tells stacd whether to keep connections to IOCs even if their DLPEs have been removed or the "controller=" entries in stacd.conf have been removed.

With sticky-connections=disabled (default).  stacd immediately disconnects from a previously connected IOC if the response to a Get DLP command no longer contains a DLPE matching that IOC, or if the matching "controller=" entry is removed from stacd.conf.

Ongoing I/O transactions will be terminated immediately as well. There is no way to tell what happens to the data being exchanged when such an abrupt termination happens. If a host was in the middle of writing to a storage subsystem, there is a good chance that incomplete and potentially corrupt data will be left on the remote storage.

NOTE: This mode implies that nvme-stas will only allow Manually Configured or Automatically Configured IOC connections to exist. Externally Configured connections using nvme-cli that do not match any Manual Config (stacd.conf) or Automatic Config (DLPEs) will get deleted immediately by stacd.

With sticky-connections=enabled.  stacd does not disconnect from IOCs when a DLPE is removed or a "controller=" entry is removed from stacd.conf. Instead, users can issue the nvme-cli command "nvme disconnect", add a "blacklist=" entry to stacd.conf, or wait until the next system reboot, at which time all connections will be removed.
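The difference between the two modes can be summarized with the following Python sketch. It is a simplified model of the behavior described above, not code from stacd: controllers are represented by plain strings, and the function name and parameters are hypothetical.

```python
def connections_to_remove(connected, manual, discovered, blacklisted, sticky):
    """Return the set of connected IOCs that should be disconnected.

    connected:   IOCs the host is currently connected to
    manual:      IOCs listed as "controller=" entries (Manual Config)
    discovered:  IOCs listed in the current DLPEs (Automatic Config)
    blacklisted: IOCs matching a "blacklist=" entry
    sticky:      True for sticky-connections=enabled
    """
    connected = set(connected)
    blacklisted = set(blacklisted)
    if sticky:
        # Existing connections are kept; only blacklisted ones are dropped.
        return connected & blacklisted
    configured = (set(manual) | set(discovered)) - blacklisted
    # Anything no longer configured (including external connections) is dropped.
    return connected - configured


if __name__ == "__main__":
    live = {"ioc-a", "ioc-b", "ioc-c"}
    print(connections_to_remove(live, {"ioc-a"}, {"ioc-b"}, set(), sticky=False))
    print(connections_to_remove(live, {"ioc-a"}, {"ioc-b"}, set(), sticky=True))
```

In this example "ioc-c" stands for a connection that has lost its DLPE or was created externally with nvme-cli: it is removed when sticky connections are disabled and kept when they are enabled.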

[Controllers] section

The following options are available in the "[Controllers]" section:

controller=

Controllers are specified with the "controller" option. This option may be specified more than once to specify more than one controller. The format is one line per Controller composed of a series of fields separated by semi-colons as follows:

controller=transport=[trtype];traddr=[traddr];trsvcid=[trsvcid];host-traddr=[traddr];host-iface=[iface];nqn=[nqn]
                

Fields

transport=

This is a mandatory field that specifies the network fabric being used for an NVMe-over-Fabrics network. Current "trtype" values understood are:

Table 2. Transport type

trtype  Definition
rdma    The network fabric is an rdma network (RoCE, iWARP, InfiniBand, basic rdma, etc.).
fc      The network fabric is a Fibre Channel network.
tcp     The network fabric is a TCP/IP network.
loop    Connect to an NVMe over Fabrics target on the local host.

traddr=

This is a mandatory field that specifies the network address of the Controller. For transports using IP addressing (e.g. rdma) this should be an IP-based address (e.g. IPv4, IPv6). It could also be a resolvable host name (e.g. localhost).

trsvcid=

This is an optional field that specifies the transport service id. For transports using IP addressing (e.g. rdma, tcp) this field is the port number.

Depending on the transport type, this field will default to either 8009 or 4420 as follows.

UDP port 4420 and TCP port 4420 have been assigned by IANA for use by NVMe over Fabrics. NVMe/RoCEv2 controllers use UDP port 4420 by default. NVMe/iWARP controllers use TCP port 4420 by default.

TCP port 4420 has been assigned for use by NVMe over Fabrics and TCP port 8009 has been assigned by IANA for use by NVMe over Fabrics discovery. TCP port 8009 is the default TCP port for NVMe/TCP discovery controllers. There is no default TCP port for NVMe/TCP I/O controllers; the Transport Service Identifier (TRSVCID) field in the Discovery Log Entry indicates the TCP port to use.

The TCP ports that may be used for NVMe/TCP I/O controllers include TCP port 4420, and the Dynamic and/or Private TCP ports (i.e., ports in the TCP port number range from 49152 to 65535). NVMe/TCP I/O controllers should not use TCP port 8009. TCP port 4420 shall not be used for both NVMe/iWARP and NVMe/TCP at the same IP address on the same network.
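The port numbers named above can be captured in a few lines of Python. This is only a sketch of the ranges listed in the text; the constant and function names are made up for this example, and the check covers only the ports explicitly named above.

```python
NVME_OF_PORT = 4420                   # IANA-assigned, NVMe over Fabrics
NVME_TCP_DISCOVERY_PORT = 8009        # IANA-assigned, NVMe over Fabrics discovery
DYNAMIC_PORTS = range(49152, 65536)   # Dynamic and/or Private ports, 49152 to 65535


def is_listed_tcp_ioc_port(port):
    """True if `port` is one of the ports listed above for NVMe/TCP I/O controllers."""
    # Port 8009 is reserved for discovery and is therefore excluded.
    return port == NVME_OF_PORT or port in DYNAMIC_PORTS


if __name__ == "__main__":
    for candidate in (4420, 8009, 49152):
        print(candidate, is_listed_tcp_ioc_port(candidate))
```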

Ref: IANA Service Name and Transport Protocol Port Number Registry

nqn=

This is an optional field that specifies the Discovery Controller's NVMe Qualified Name. If not specified, this will default to the well-known DC NQN: "nqn.2014-08.org.nvmexpress.discovery".

host-traddr=

This is an optional field that specifies the network address used on the host to connect to the Controller. For TCP, this sets the source address on the socket.

host-iface=

This is an optional field that specifies the network interface used on the host to connect to the Controller (e.g. eth1, enp2s0, enx78e7d1ea46da). This forces the connection to be made on a specific interface instead of letting the system decide.

Examples:

controller = transport=tcp;traddr=localhost;trsvcid=8009
controller = transport=tcp;traddr=[2001:db8::370:7334];host-iface=enp0s8
controller = transport=fc;traddr=nn-0x204600a098cbcac6:pn-0x204700a098cbcac6
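The value of a "controller=" entry is a list of key=value fields separated by semi-colons, so it can be split as in the Python sketch below. This is an illustration of the syntax only, not the parser used by stacd; the function name is hypothetical, and no check is made here for mandatory fields.

```python
# Field names documented for the "controller=" option.
FIELDS = {"transport", "traddr", "trsvcid", "host-traddr", "host-iface", "nqn"}


def parse_controller(value):
    """Split 'key=value;key=value' into a dict, rejecting unknown keys."""
    fields = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semi-colon
        # Split on the first "=" only; the value itself may contain ":" or "=".
        key, sep, val = item.partition("=")
        key = key.strip()
        if not sep or key not in FIELDS:
            raise ValueError(f"invalid field: {item!r}")
        fields[key] = val.strip()
    return fields


if __name__ == "__main__":
    print(parse_controller("transport=tcp;traddr=localhost;trsvcid=8009"))
```

Because only the first "=" of each field is used as the separator, Fibre Channel addresses such as "nn-0x...:pn-0x..." pass through unchanged.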
                    

blacklist=

Blacklisted controllers can be specified with the "blacklist" option. Using mDNS to automatically discover and connect to controllers can result in unintentional connections being made. This keyword allows configuring the controllers that should not be connected to (whatever the reason may be).

The syntax is the same as for "controller", except that the key "host-traddr" does not apply. Multiple "blacklist" keywords may appear in the config file to specify more than 1 blacklisted controller.

Note 1: A minimal match approach is used to eliminate unwanted controllers. That is, you do not need to specify all the parameters to identify a controller. Just specifying the "host-iface", for example, can be used to blacklist all controllers on an interface.

Note 2: "blacklist" takes precedence over "controller". A controller specified by the "controller" keyword can be eliminated by the "blacklist" keyword.

Examples:

blacklist = transport=tcp;traddr=fe80::2c6e:dee7:857:26bb # Eliminate a specific address
blacklist = host-iface=enp0s8                             # Eliminate everything on this interface
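The minimal match approach from Note 1 can be modeled in Python as shown below. This is a sketch under stated assumptions rather than stacd's matching code: controllers and blacklist entries are plain dictionaries of already-parsed fields, values are compared as exact strings, and the function name is hypothetical.

```python
def is_blacklisted(controller, blacklist):
    """True if any blacklist entry matches `controller` on every field it specifies."""
    for entry in blacklist:
        # An entry matches when all of its fields equal the controller's fields.
        # Empty entries are skipped so that they never match everything.
        if entry and all(controller.get(key) == val for key, val in entry.items()):
            return True
    return False


if __name__ == "__main__":
    blacklist = [
        {"transport": "tcp", "traddr": "fe80::2c6e:dee7:857:26bb"},
        {"host-iface": "enp0s8"},
    ]
    ctrl = {"transport": "tcp", "traddr": "10.0.0.5", "host-iface": "enp0s8"}
    print(is_blacklisted(ctrl, blacklist))
```

The second entry specifies only "host-iface", so it eliminates every controller reached through enp0s8, whatever its transport or address.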
                    

See Also

stacd(8)