If I load a kernel module and list the loaded modules with lsmod
, I can get the "use count" of the module (number of other modules with a reference to the module). Is there a way to figure out what is using a module, though?
The issue is that a module I am developing insists its use count is 1 and thus I cannot use rmmod
to unload it, but its "by" column is empty. This means that every time I want to re-compile and re-load the module, I have to reboot the machine (or, at least, I can't figure out any other way to unload it).
Actually, there seems to be a way to list processes that claim a module/driver - however, I haven't seen it advertised (outside of Linux kernel documentation), so I'll jot down my notes here:
First of all, many thanks for @haggai_e's answer; the pointer to the functions try_module_get
and try_module_put
as those responsible for managing the use count (refcount) was the key that allowed me to track down the procedure.
Looking further for this online, I somehow stumbled upon the post Linux-Kernel Archive: [PATCH 1/2] tracing: Reduce overhead of module tracepoints; which finally pointed to a facility present in the kernel, known as (I guess) "tracing"; the documentation for this is in the directory Documentation/trace - Linux kernel source tree. In particular, two files explain the tracing facility, events.txt and ftrace.txt.
But, there is also a short "tracing mini-HOWTO" on a running Linux system in /sys/kernel/debug/tracing/README
(see also I'm really really tired of people saying that there's no documentation…); note that in the kernel source tree, this file is actually generated by the file kernel/trace/trace.c. I've tested this on Ubuntu natty
, and note that since /sys
is owned by root, you have to use sudo
to read this file, as in sudo cat
or
sudo less /sys/kernel/debug/tracing/README
... and that goes for pretty much all other operations under /sys
which will be described here.
First of all, here is a simple minimal module/driver code (which I put together from the referred resources), which simply creates a /proc/testmod-sample
file node, which returns the string "This is testmod." when it is being read; this is testmod.c
:
/*
https://github.com/spotify/linux/blob/master/samples/tracepoints/tracepoint-sample.c
https://www.linux.com/learn/linux-training/37985-the-kernel-newbie-corner-kernel-debugging-using-proc-qsequenceq-files-part-1
*/
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h> // for sequence files
struct proc_dir_entry *pentry_sample;
char *defaultOutput = "This is testmod.";
static int my_show(struct seq_file *m, void *v)
{
seq_printf(m, "%s\n", defaultOutput);
return 0;
}
static int my_open(struct inode *inode, struct file *file)
{
return single_open(file, my_show, NULL);
}
static const struct file_operations mark_ops = {
.owner = THIS_MODULE,
.open = my_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
static int __init sample_init(void)
{
printk(KERN_ALERT "sample init\n");
pentry_sample = proc_create(
"testmod-sample", 0444, NULL, &mark_ops);
if (!pentry_sample)
return -EPERM;
return 0;
}
static void __exit sample_exit(void)
{
printk(KERN_ALERT "sample exit\n");
remove_proc_entry("testmod-sample", NULL);
}
module_init(sample_init);
module_exit(sample_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Mathieu Desnoyers et al.");
MODULE_DESCRIPTION("based on Tracepoint sample");
This module can be built with the following Makefile
(just have it placed in the same directory as testmod.c
, and then run make
in that same directory):
CONFIG_MODULE_FORCE_UNLOAD=y
# for oprofile
DEBUG_INFO=y
EXTRA_CFLAGS=-g -O0
obj-m += testmod.o
# mind the tab characters needed at start here:
all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
When this module/driver is built, the output is a kernel object file, testmod.ko
.
At this point, we can prepare the event tracing related to try_module_get
and try_module_put
; those are in /sys/kernel/debug/tracing/events/module
:
$ sudo ls /sys/kernel/debug/tracing/events/module
enable filter module_free module_get module_load module_put module_request
Note that on my system, tracing is by default enabled:
$ sudo cat /sys/kernel/debug/tracing/tracing_enabled
1
... however, the module tracing (specifically) is not:
$ sudo cat /sys/kernel/debug/tracing/events/module/enable
0
Now, we should first make a filter, that will react on the module_get
, module_put
etc events, but only for the testmod
module. To do that, we should first check the format of the event:
$ sudo cat /sys/kernel/debug/tracing/events/module/module_put/format
name: module_put
ID: 312
format:
...
field:__data_loc char[] name; offset:20; size:4; signed:1;
print fmt: "%s call_site=%pf refcnt=%d", __get_str(name), (void *)REC->ip, REC->refcnt
Here we can see that there is a field called name
, which holds the driver name, which we can filter against. To create a filter, we simply echo
the filter string into the corresponding file:
sudo bash -c "echo name == testmod > /sys/kernel/debug/tracing/events/module/filter"
Here, first note that since we have to call sudo
, we have to wrap the whole echo
redirection as an argument command of a sudo
-ed bash
. Second, note that since we wrote to the "parent" module/filter
, not the specific events (which would be module/module_put/filter
etc), this filter will be applied to all events listed as "children" of module
directory.
Finally, we enable tracing for module:
sudo bash -c "echo 1 > /sys/kernel/debug/tracing/events/module/enable"
From this point on, we can read the trace log file; for me, reading the blocking, "piped" version of the trace file worked - like this:
sudo cat /sys/kernel/debug/tracing/trace_pipe | tee tracelog.txt
At this point, we will not see anything in the log - so it is time to load (and utilize, and remove) the driver (in a different terminal from where trace_pipe
is being read):
$ sudo insmod ./testmod.ko
$ cat /proc/testmod-sample
This is testmod.
$ sudo rmmod testmod
If we go back to the terminal where trace_pipe
is being read, we should see something like:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
insmod-21137 [001] 28038.101509: module_load: testmod
insmod-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
That is pretty much all we will obtain for our testmod
driver - the refcount changes only when the driver is loaded (insmod
) or unloaded (rmmod
), not when we do a read through cat
. So we can simply interrupt the read from trace_pipe
with CTRL+C in that terminal; and to stop the tracing altogether:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/tracing_enabled"
Here, note that most examples refer to reading the file /sys/kernel/debug/tracing/trace
instead of trace_pipe
as here. However, one problem is that this file is not meant to be "piped" (so you shouldn't run a tail -f
on this trace
file); but instead you should re-read the trace
after each operation. After the first insmod
, we would obtain the same output from cat
-ing both trace
and trace_pipe
; however, after the rmmod
, reading the trace
file would give:
<...>-21137 [001] 28038.101509: module_load: testmod
<...>-21137 [001] 28038.103904: module_put: testmod call_site=sys_init_module refcnt=2
rmmod-21354 [000] 28080.244448: module_free: testmod
... that is: at this point, the insmod
had already been exited for long, and so it doesn't exist anymore in the process list - and therefore cannot be found via the recorded process ID (PID) at the time - thus we get a blank <...>
as process name. Therefore, it is better to log (via tee
) a running output from trace_pipe
in this case. Also, note that in order to clear/reset/erase the trace
file, one simply writes a 0 to it:
sudo bash -c "echo 0 > /sys/kernel/debug/tracing/trace"
If this seems counterintuitive, note that trace
is a special file, and will always report a file size of zero anyways:
$ sudo ls -la /sys/kernel/debug/tracing/trace
-rw-r--r-- 1 root root 0 2013-03-19 06:39 /sys/kernel/debug/tracing/trace
... even if it is "full".
Finally, note that if we didn't implement a filter, we would have obtained a log of all module calls on the running system - which would log any call (also background) to grep
and such, as those use the binfmt_misc
module:
...
tr-6232 [001] 25149.815373: module_put: binfmt_misc call_site=search_binary_handler refcnt=133194
..
grep-6231 [001] 25149.816923: module_put: binfmt_misc call_site=search_binary_handler refcnt=133196
..
cut-6233 [000] 25149.817842: module_put: binfmt_misc call_site=search_binary_handler refcnt=129669
..
sudo-6234 [001] 25150.289519: module_put: binfmt_misc call_site=search_binary_handler refcnt=133198
..
tail-6235 [000] 25150.316002: module_put: binfmt_misc call_site=search_binary_handler refcnt=129671
... which adds quite a bit of overhead (in both log data ammount, and processing time required to generate it).
While looking this up, I stumbled upon Debugging Linux Kernel by Ftrace PDF, which refers to a tool trace-cmd, which pretty much does the similar as above - but through an easier command line interface. There is also a "front-end reader" GUI for trace-cmd
called KernelShark; both of these are also in Debian/Ubuntu repositories via sudo apt-get install trace-cmd kernelshark
. These tools could be an alternative to the procedure described above.
Finally, I'd just note that, while the above testmod
example doesn't really show use in context of multiple claims, I have used the same tracing procedure to discover that an USB module I'm coding, was repeatedly claimed by pulseaudio
as soon as the USB device was plugged in - so the procedure seems to work for such use cases.