Server Fault Asked by Gnosis on November 4, 2021
We recently switched to an NFSv4 share for our web directory (/var/www/sites). Ever since the switch, throughput from the NFS-mounted drive drops off on the client side exactly every 60 seconds. CPU usage drops (Apache/PHP are waiting) and network load dips. Each drop lasts between 500 ms and 1.5 s.
I tested with dd if=/dev/zero of=/mnt/files/samplefile bs=1M count=1024 oflag=direct
and saw the read/write times increase during one of the 60-second drops.
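To see where in each minute the slowdown lands, the single large dd run can be broken into many small timed writes. The following is only a sketch of that idea, not what was actually run: the function name `probe_write_latency` and the probe file name are made up for illustration, it assumes GNU date (for `%N`) and GNU sleep (for fractional seconds), and it uses conv=fsync instead of oflag=direct so it also works on filesystems without O_DIRECT support.

```shell
# probe_write_latency DIR [COUNT] [INTERVAL_S]
# Hypothetical helper: writes a 1 MiB file into DIR COUNT times and prints,
# for each write, the wall-clock time at completion and the elapsed
# milliseconds. The body runs in a subshell so no variables leak out.
probe_write_latency() (
    dir=$1
    count=${2:-120}
    interval=${3:-0.5}
    target="$dir/.latency_probe.$$"
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)    # nanoseconds since the epoch (GNU date)
        # conv=fsync makes dd flush the data before it returns
        dd if=/dev/zero of="$target" bs=1M count=1 conv=fsync 2>/dev/null
        end=$(date +%s%N)
        printf '%s %d ms\n' "$(date +%H:%M:%S)" "$(( (end - start) / 1000000 ))"
        sleep "$interval"
        i=$((i + 1))
    done
    rm -f "$target"
)
```

Running something like `probe_write_latency /mnt/files 240 0.5` for two minutes should make a periodic stall stand out as a few lines with much larger millisecond values, each tagged with the second at which it happened.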
On the NFS mount, I’ve added FS-Cache (fsc), noatime, and nodiratime, none of which made any difference.
/etc/exports
/mnt/files {clientIP} (rw,fsid=0,sync,no_root_squash)
Client mount
mount -v -t nfs4 {server_ip}:/ /mnt/files -o fsc,noatime,nodiratime
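The mount command only shows the options that were requested. The options the kernel actually settled on (protocol version, rsize/wsize, timeouts and so on) are listed in /proc/mounts. A small sketch for pulling them out, one per line, is below. The function name `nfs_mount_options` is hypothetical, and the optional second argument exists only so the function can be pointed at a saved copy of the mounts table.

```shell
# nfs_mount_options MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: prints the effective options of the NFS mount at
# MOUNTPOINT, one option per line. MOUNTS_FILE defaults to /proc/mounts.
nfs_mount_options() (
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk -v mp="$mountpoint" \
        '$2 == mp && $3 ~ /^nfs4?$/ { print $4 }' "$mounts_file" |
        tr ',' '\n'
)
```

For the mount above this would be called as `nfs_mount_options /mnt/files`. `nfsstat -m` on the client reports similar per-mount information.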
Given how exact the timing of the drop-off is, it looks like some sort of setting or misconfiguration.
Any tips would be greatly appreciated.
Server side nfsstat:
Server rpc stats:
calls       badcalls    badfmt      badauth     badclnt
4066505251  262         22          240         0
Server nfs v3:
null            getattr         setattr         lookup          access
8 100%          0 0%            0 0%            0 0%            0 0%
readlink        read            write           create          mkdir
0 0%            0 0%            0 0%            0 0%            0 0%
symlink         mknod           remove          rmdir           rename
0 0%            0 0%            0 0%            0 0%            0 0%
link            readdir         readdirplus     fsstat          fsinfo
0 0%            0 0%            0 0%            0 0%            0 0%
pathconf        commit
0 0%            0 0%
Server nfs v4:
null            compound
72 0%           4066507670 99%
Server nfs v4 operations (centos 8):
op0-unused      op1-unused      op2-future      access          close
0 0%            0 0%            0 0%            187752303 1%    117353691 0%
commit          create          delegpurge      delegreturn     getattr
6175 0%         7467 0%         0 0%            36808013 0%     3988907750 31%
getfh           link            lock            lockt           locku
20592505 0%     0 0%            1988679 0%      0 0%            1978415 0%
lookup          lookup_root     nverify         open            openattr
32913665 0%     0 0%            0 0%            117761749 0%    0 0%
open_conf       open_dgrd       putfh           putpubfh        putrootfh
0 0%            24 0%           4050816618 32%  0 0%            328 0%
read            readdir         readlink        remove          rename
3970684 0%      1199340 0%      480 0%          181949 0%       18432 0%
renew           restorefh       savefh          secinfo         setattr
0 0%            0 0%            18432 0%        0 0%            2287964 0%
setcltid        setcltidconf    verify          write           rellockowner
0 0%            0 0%            0 0%            211708 0%       0 0%
bc_ctl          bind_conn       exchange_id     create_ses      destroy_ses
0 0%            2 0%            40 0%           46 0%           37 0%
free_stateid    getdirdeleg     getdevinfo      getdevlist      layoutcommit
1978400 0%      0 0%            0 0%            0 0%            0 0%
layoutget       layoutreturn    secinfononam    sequence        set_ssv
0 0%            0 0%            37 0%           4066651259 32%  0 0%
test_stateid    want_deleg      destroy_clid    reclaim_comp    allocate
13642707 0%     0 0%            31 0%           37 0%           0 0%
copy            copy_notify     deallocate      ioadvise        layouterror
0 0%            0 0%            0 0%            0 0%            0 0%
layoutstats     offloadcancel   offloadstatus   readplus        seek
0 0%            0 0%            0 0%            0 0%            0 0%
write_same
0 0%
Client side nfsstat (centos 7):
calls       badcalls    badclnt     badauth     xdrcall
0           0           0           0           0
Client rpc stats:
calls       retrans     authrefrsh
4157327074  6           4157501443
Client nfs v4:
null            read            write           commit          open            open_conf
0 0%            12539371 0%     2010537 0%      171586 0%       17387625 0%     19761 0%
open_noat       open_dgrd       close           setattr         fsinfo          renew
117435773 2%    28 0%           134408077 3%    2365580 0%      425 0%          736357 0%
setclntid       confirm         lock            lockt           locku           access
68577 0%        14 0%           1998403 0%      0 0%            1988136 0%      73334903 1%
getattr         lookup          lookup_root     remove          rename          link
3686184054 88%  35401700 0%     149 0%          4909916 0%      378484 0%       0 0%
symlink         create          pathconf        statfs          readlink        readdir
0 0%            15960 0%        276 0%          11593628 0%     490 0%          2002535 0%
server_caps     delegreturn     getacl          setacl          fs_locations    rel_lkowner
931 0%          36853705 0%     0 0%            0 0%            0 0%            0 0%
secinfo         exchange_id     create_ses      destroy_ses     sequence        get_lease_t
0 0%            0 0%            31 0%           37 0%           28 0%           16 0%
reclaim_comp    layoutget       getdevinfo      layoutcommit    layoutreturn    getdevlist
251 0%          28 0%           0 0%            0 0%            0 0%            0 0%
(null)
34 0%
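The nfsstat listings are long, so it can help to rank the operations by call count. The sketch below does that for nfsstat text read from standard input. The function name `top_nfs_ops` is hypothetical, and it assumes the layout shown above: a line of operation names followed by a line of "count percent" pairs.

```shell
# top_nfs_ops [N]
# Hypothetical helper: reads nfsstat output on stdin and prints the N
# operations with the highest call counts as "count name" lines.
top_nfs_ops() (
    n=${1:-5}
    awk '
        /:$/ { next }                    # section titles such as "Client nfs v4:"
        $2 ~ /%$/ {                      # value line: count pct% count pct% ...
            for (i = 1; i <= NF; i += 2)
                print $i, names[(i + 1) / 2]
            next
        }
        {                                # any other line: remember it as op names
            for (i = 1; i <= NF; i++)
                names[i] = $i
        }
    ' | sort -rn | head -n "$n"
)
```

Usage would be along the lines of `nfsstat -c | top_nfs_ops 5` on the client or `nfsstat -s | top_nfs_ops 5` on the server. The same operation name can appear once per protocol section.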
Update: watching htop on the client, I noticed that whenever this happens the top process is
{NFS-IP}-mana
Each time the interruption occurs, this process shows up:
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:00 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:01 [{nfsIP_address}-mana]
48800 R ? 00:00:02 [{nfsIP_address}-mana]
48800 R ? 00:00:02 [{nfsIP_address}-mana]
48800 R ? 00:00:02 [{nfsIP_address}-mana]
48800 R ? 00:00:02 [{nfsIP_address}-mana]
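To check whether this thread becomes runnable on the same 60-second cycle as the throughput drop, its appearances can be logged with timestamps. The following is a rough sketch under a few assumptions: the function names are hypothetical, the thread is matched only by the "-mana" fragment visible in the listing above, `ps` is the procps version (for the `-eo pid=,stat=,comm=` format), and `sleep` accepts fractional seconds.

```shell
# filter_running_manager
# Hypothetical helper: reads "PID STAT COMM" lines on stdin and prints the
# ones that are in state R and whose command name contains "-mana".
filter_running_manager() {
    awk '$2 ~ /^R/ && $3 ~ /-mana/ { print $1, $2, $3 }'
}

# log_manager_activity [SAMPLES] [INTERVAL_S]
# Hypothetical helper: samples the process table SAMPLES times and prints a
# timestamped line for every matching thread found in the running state.
log_manager_activity() (
    samples=${1:-600}
    interval=${2:-0.1}
    i=0
    while [ "$i" -lt "$samples" ]; do
        ps -eo pid=,stat=,comm= | filter_running_manager |
            sed "s/^/$(date +%H:%M:%S) /"
        sleep "$interval"
        i=$((i + 1))
    done
)
```

Left running for a few minutes (for example `log_manager_activity 3000 0.1`), the timestamps of the logged lines can then be compared with the times of the throughput dips.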