I work on a Solaris system that processes large numbers of files and stores their information in a database (yes, I know that querying the database is the quickest way to find out how many files we have). I need a fast way to monitor the files as they progress through the system on their way to being stored in the database.
Currently I use a Perl script that reads the directory into an array, takes the size of the array, and sends it to a monitoring script. Unfortunately, as our system grows, this monitor is getting slower and slower.
I am looking for a method that will operate much more quickly instead of pausing and updating every 15-20 seconds after performing the count operation on all the directories involved.
I am relatively certain that my bottleneck is the read directory into array operation.
I don't need any information about the files, I don't need sizes or file names, just the number of files in the directory.
In my code I do not count hidden files or the text files I use to hold configuration information. It would be great if this functionality was preserved but is certainly not mandatory.
I have found some references to counting inodes with C code or something along those lines but I am not very experienced in that area.
I would like to make this monitor as real-time as possible.
The perl code I use looks like this:
opendir (DIR, $currentDir) or die "Cannot open directory: $!";
@files = grep ! m/^\./ && ! /config_file/, readdir DIR; # skip hidden files and config files
closedir(DIR);
$count = @files;
What you do right now reads the whole directory (more or less) into memory only to discard that content for its count. Avoid that by streaming the directory instead:
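my $count = 0;
opendir(my $dh, $curDir) or die "opendir($curDir): $!";
# defined() keeps an entry named "0" from ending the loop early
while (defined(my $de = readdir($dh))) {
    next if $de =~ /^\./ or $de =~ /config_file/;  # skip hidden and config files
    $count++;
}
closedir($dh);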
Importantly, don't use glob() in any of its forms. glob() will expensively stat() every entry, which is not overhead you want.
Now, you might have much more sophisticated and lighter weight ways of doing this depending on OS capabilities or filesystem capabilities (Linux, by way of comparison, offers inotify), but streaming the dir as above is about as good as you'll portably get.
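Since the question mentions counting across several directories, the streaming loop above could be wrapped in a small sub and called once per directory. This is only a sketch: the name count_entries and the idea of passing the directories on the command line are assumptions for illustration, not something from the original script.

```perl
use strict;
use warnings;

# Count the visible entries in one directory without building a list.
# Skips dotfiles and anything matching "config_file", as in the question.
# count_entries is a hypothetical helper name.
sub count_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "opendir($dir): $!";
    my $count = 0;
    while (defined(my $entry = readdir($dh))) {
        next if $entry =~ /^\./ or $entry =~ /config_file/;
        $count++;
    }
    closedir($dh);
    return $count;
}

# Assumption: the directories to monitor are given as command-line arguments.
printf "%s: %d\n", $_, count_entries($_) for @ARGV;
```

Keeping the count in a scalar means memory use stays flat no matter how large a directory gets, and the per-directory result can be handed to the monitoring script as before.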