To improve ZFS performance over NFS, here are a few suggestions that have worked in a cluster environment.
On the NFS server:
- Ensure /sys/module/zfs/parameters/zfs_arc_max is set to an appropriate value. On Linux, the ARC maximum defaults to half the available RAM on the server. The value can be set on the fly, e.g. "echo 107374182400 > /sys/module/zfs/parameters/zfs_arc_max" (100GB), followed by a cache reset. The cache can be dropped (with caution!) with "echo 3 | tee /proc/sys/vm/drop_caches", but please research "linux drop_caches" before you do so. To survive a reboot, the setting needs to be in /etc/modprobe.d/zfs.conf (as "options zfs zfs_arc_max=107374182400").
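The half-of-RAM default can be reproduced explicitly when picking a cap; a minimal sketch, assuming a Linux host with /proc/meminfo (apply and persist steps require root):

```shell
# Sketch: compute an ARC cap equal to half of installed RAM and emit
# the persistent option line for /etc/modprobe.d/zfs.conf.
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)   # total RAM in KiB
arc_max=$(( mem_kb * 1024 / 2 ))                      # half of RAM, in bytes
echo "options zfs zfs_arc_max=${arc_max}"
# To apply immediately (root only):
#   echo "${arc_max}" > /sys/module/zfs/parameters/zfs_arc_max
```

Adjust the divisor to taste; on a dedicated storage server a larger fraction than half may be appropriate.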
- For RHEL/CentOS 6 and 7, the RPCNFSDCOUNT variable has to be raised to run more than the default 8 NFS server threads. A reasonable starting point is one thread per NFS client, rounded up to the nearest multiple of 16 or 32; RPCNFSDCOUNT=128 in /etc/sysconfig/nfs worked on Memex.
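The rounding rule above can be sketched in shell; "clients" here is a hypothetical client count, not something read from the system:

```shell
# Round the NFS client count up to the nearest multiple of 32 to pick
# a thread count for RPCNFSDCOUNT.
clients=100
threads=$(( (clients + 31) / 32 * 32 ))
echo "RPCNFSDCOUNT=${threads}"   # line for /etc/sysconfig/nfs
```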
- The /etc/sysctl.conf configuration file can be used for TCP tuning. Here is an example for a 128GB RAM, 32-core server using an IPoIB interface for NFS connections.
net.ipv4.tcp_mem=134217728 134217728 134217728
net.core.rmem_max=268435456
net.core.wmem_max=268435456
# increase Linux autotuning TCP buffer limits
# min, default, and max number of bytes to use
# allow auto-tuning up to 128MB buffers
net.ipv4.tcp_rmem=4096 87380 134217728
net.ipv4.tcp_wmem=4096 65536 134217728
# recommended to increase this for CentOS6 with 10G NICs or higher
net.core.netdev_max_backlog=250000
# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save=1
# Explicitly set htcp as the congestion control: cubic buggy in older 2.6 kernels
net.ipv4.tcp_congestion_control=htcp
# If you are using Jumbo Frames, also set this
#net.ipv4.tcp_mtu_probing=1
# recommended for CentOS7/Debian8 hosts
net.core.default_qdisc=fq
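After editing /etc/sysctl.conf, the settings can be loaded without a reboot; the arithmetic below simply confirms that the limits above correspond to 128 MiB and 256 MiB expressed in bytes:

```shell
# Apply /etc/sysctl.conf and spot-check one value (sysctl -p needs root):
#   sysctl -p
#   sysctl -n net.core.rmem_max   # should print 268435456
# Sanity check on the buffer sizes used in the configuration above:
echo $(( 128 * 1024 * 1024 ))   # prints 134217728 (tcp_rmem/tcp_wmem max)
echo $(( 256 * 1024 * 1024 ))   # prints 268435456 (rmem_max/wmem_max)
```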
On the NFS client:
- While most of the default mount options should be fine, a few can help performance: "remote:/zpool/dir /dir nfs4 rw,intr,noatime,nodiratime 0 0" in /etc/fstab worked well on Memex clients.
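To confirm which options the kernel actually applied, the fstab entry can be cross-checked against /proc/mounts; a minimal sketch, assuming the example mount point /dir from above:

```shell
# Print the effective mount options for the /dir mount point.
# In /proc/mounts, field 2 is the mount point and field 4 the option list.
awk '$2 == "/dir" {print $4}' /proc/mounts
```

Note that the kernel may report additional negotiated options (rsize, wsize, proto, and so on) beyond what was written in /etc/fstab.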
- If TCP tuning is needed, the /etc/sysctl.conf can also be used on the client. Here's an example:
net.ipv4.tcp_mem=4194304 4194304 4194304
net.ipv4.tcp_rmem=4096 87380 4194304
net.ipv4.tcp_wmem=4096 65536 4194304
net.core.rmem_max=4194304
net.core.wmem_max=4194304
net.core.rmem_default=4194304
net.core.wmem_default=4194304
net.core.optmem_max=4194304
net.ipv4.tcp_sack=1
net.ipv4.tcp_timestamps=0
net.core.netdev_max_backlog=250000
net.ipv4.tcp_low_latency=1
net.ipv4.tcp_adv_win_scale=1