Friday, January 23, 2015

Why did "ls -al" command take much time to return output?

Issue/Symptom  : "ls -al" command took much time than expected.

time ls -la /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ 
... (just removing the output of the ls -la command itself)
real 1m26.674s 
user 0m0.285s 
sys 0m1.627s



OS Environment : RHEL 6, storage netapp filer ONTAP 8 
Software/Application : storage - FAS3240 over NFSv3, ls command[coreutils-8.4-16.el6.x86_64]
Investigation : Direcotry "XXX" contains around 22481 files.
Workaround Solution : Time consumption is expected in this case. This may not be same on other environment. Please follow RCA section.

Root Cause Analysis :


Above time is expected as directory has around 22481 files. "ls -al" retrieves more file attributes details than normal "ls" command. 
As per strace analysis, it is not found that any system call which took more than 1 sec. Accumulation of all calls took 1m26s as you see. Compared other user's 
directory[has around ~4000 mails], it was  not found  this much delay. It just took less than 2 sec. Test and details :
time ls -la /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ << no of file 22481
[...]
real 1m26.674s 
user 0m0.285s 
sys 0m1.627s
with out "-al" option :
time ls /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/
[....]
real 0m4.678s
user 0m0.223s
sys 0m1.008s
This user has less mails :
time ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/47/XXX << no of file 4442
real 0m1.010s
user 0m0.067s
sys 0m0.297s
time ls -al /data/XXX2/mboxes/tier1/01/XXX/15/67/47/XXn/
real 0m0.736s
user 0m0.070s
sys 0m0.162s
ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/183/XXX/|wc -l
6468
took time :
real 0m1.406s
user 0m0.090s
sys 0m0.406s

Strace analysis :

#strace -Tttvv ls -al /data/XXX/mboxes/tier1/01/XXX/15/67/178/XXX/ &> /var/log/strace_op
#cat  /var/log/strace_op|awk '{print $NF}'|egrep -v 'msg|dat|new|bin|size|\?'|sed 's/<//'|sed 's/>//'|less 
#cat  /var/log/strace_op|awk '{print $NF}'|egrep -v 'msg|dat|new|bin|size|\?'|sed 's/<//'|sed 's/>//'|awk 'BEGIN {sum=0.0} {sum+=$NF} END {print sum}' 

If you see too much penalty in terms of consumption of retrieve time then please engage NetApp to perf analysis. Download perf tool 
and retrieve data from filers. Handover this data to Netapp to further analysis. 

No comments:

Post a Comment