Newer Linux kernels have a sysctl tunable /proc/sys/kernel/perf_event_paranoid
which allows the user to adjust the available functionality of perf_events
for non-root users, with higher numbers being more secure (offering correspondingly less functionality).
From the kernel documentation, we have the following behavior for the various values:
perf_event_paranoid:
Controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The default value is 2.
-1: Allow use of (almost) all events by all users
    Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>=0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
     Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
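Because the levels are cumulative, it can help to spell out what a given value rules out. The following is only a minimal Python sketch of the table quoted above: the names `RESTRICTIONS`, `disallowed` and `read_level` are made up for illustration, and the thresholds are copied from the quoted documentation, so they may not match every kernel version.

```python
from pathlib import Path

# Location of the tunable discussed above.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")

# (threshold, restriction) pairs transcribed from the quoted kernel
# documentation. This is an illustration, not the kernel's own logic.
RESTRICTIONS = [
    (0, "ftrace function tracepoint"),
    (0, "raw tracepoint access"),
    (1, "CPU event access"),
    (2, "kernel profiling"),
]


def disallowed(level):
    """Return what users without CAP_SYS_ADMIN may not do at this level."""
    return [name for threshold, name in RESTRICTIONS if level >= threshold]


def read_level(path=PARANOID_PATH):
    """Read the current level, or return None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    level = read_level()
    if level is None:
        print("perf_event_paranoid is not readable on this system")
    else:
        print(f"level {level} disallows: {disallowed(level)}")
```

With a value of 1, this lists the two tracepoint restrictions plus "CPU event access", but not "kernel profiling".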
I have 1 in my perf_event_paranoid file, which should "Disallow CPU event access" - but what does that mean exactly?
A plain reading would imply no access to CPU performance counter events (such as Intel PMU events), but it seems I can access those just fine. For example:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.408734 task-clock (msec) # 0.000 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
57 page-faults # 0.139 M/sec
1,050,362 cycles # 2.570 GHz
769,135 instructions # 0.73 insn per cycle
152,661 branches # 373.497 M/sec
6,942 branch-misses # 4.55% of all branches
1.000830821 seconds time elapsed
Here, many of the events are CPU PMU events (cycles, instructions, branches, branch-misses, cache-misses).
If these aren't the CPU events being referred to, what are they?
In this case, "CPU event" refers to monitoring events per CPU rather than per task. For the perf tools, this restricts the usage of the following options:
-C, --cpu=
Count only on the list of CPUs provided. Multiple CPUs can be provided as a comma-separated list with no space: 0,1.
Ranges of CPUs are specified with -: 0-2. In per-thread mode, this option is ignored. The -a option is still necessary
to activate system-wide monitoring. Default is to count on all CPUs.
-a, --all-cpus
system-wide collection from all CPUs (default if no target is specified)
For perf_event_open, this concerns the following case:
pid == -1 and cpu >= 0
This measures all processes/threads on the specified CPU. This requires CAP_SYS_ADMIN capability or a
/proc/sys/kernel/perf_event_paranoid value of less than 1.
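To make the per-task versus per-CPU distinction concrete, here is a rough Python sketch of just the rule quoted above. The function names `scope` and `needs_cap_sys_admin` are hypothetical. It models only the pid == -1 and cpu >= 0 case and ignores every other check the kernel performs, such as the kernel profiling restriction at level 2.

```python
def scope(pid, cpu):
    """Classify a perf_event_open(pid, cpu) pair as per-task or per-CPU."""
    if pid == -1 and cpu == -1:
        # The man page documents this combination as invalid.
        raise ValueError("pid == -1 and cpu == -1 is invalid")
    if pid == -1:
        # All processes/threads on one CPU: the "CPU event" case.
        return "per-cpu"
    # A specific task (pid == 0 means the calling task), on any or one CPU.
    return "per-task"


def needs_cap_sys_admin(pid, cpu, paranoid):
    """Apply only the rule quoted above: per-CPU needs paranoid < 1."""
    return scope(pid, cpu) == "per-cpu" and paranoid >= 1


# perf stat sleep 1 attaches to a task, so level 1 does not block it.
print(needs_cap_sys_admin(pid=0, cpu=-1, paranoid=1))
# perf stat -a or -C counts per CPU, which level 1 does block.
print(needs_cap_sys_admin(pid=-1, cpu=0, paranoid=1))
```

This matches the behaviour in the question: with a value of 1, plain `perf stat sleep 1` can still count cycles and instructions for the task it launched, while system-wide counting is refused without CAP_SYS_ADMIN.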
This may be version-specific; the cited documentation is from 4.17. This is another related question.