I've been tasked with generating a certain number of data-cache misses and instruction-cache misses. I've been able to handle the data-cache portion without issue.
So I'm left with generating the instruction-cache misses. I do not have any idea what causes these. Can someone suggest a method of generating them?
I'm using GCC in Linux.
As people have explained, an instruction-cache miss is conceptually the same as a data-cache miss: the instructions being fetched are not in the cache. This happens when the processor's program counter (PC) jumps to code that has not yet been loaded into the cache, or that has been evicted because the cache filled up and that cache line was the one chosen for eviction (often the least recently used).
It is a bit harder to generate enough code by hand to force an instruction miss than it is to force a data cache miss.
One way to get lots of code, for little effort, is to write a program which generates source code.
For example, write a program that generates a function containing a huge switch statement (in C) [Warning, untested]:
#include <stdio.h>

int main(void) {
    printf("int bigswitch(int n) {\n switch (n) {\n");
    for (int i = 1; i < 100000; ++i) {
        printf("  case %d: n += %d;\n", i, i / 2); /* no break: cases fall through */
    }
    printf(" }\n return n;\n}\n");
    return 0;
}
You can then call this from another function, and by choosing the argument you control where in the generated code execution starts, and so which cache lines are touched.
A useful property of a switch statement is that successive calls can be made to visit the code in reverse order, or in any other pattern, just by choosing the parameter. So you can work with the prefetching and prediction mechanisms, or try to work against them.
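As a rough sketch of what such a caller might look like (also untested on real hardware): `run_pattern` and `NUM_CASES` are names I made up, and the eight-case `bigswitch` here is only a hand-written stand-in for the generated one so the snippet compiles on its own. I'm assuming an unoptimized build (`-O0`), since an optimizer may simplify a switch like this.

```c
#define NUM_CASES 8 /* hypothetical stand-in; the generated switch has far more */

/* Small hand-written stand-in for the generated bigswitch().
   There is no break, so entering at case n also runs every later case. */
int bigswitch(int n) {
    switch (n) {
    case 1: n += 0; /* fall through */
    case 2: n += 1; /* fall through */
    case 3: n += 1; /* fall through */
    case 4: n += 2; /* fall through */
    case 5: n += 2; /* fall through */
    case 6: n += 3; /* fall through */
    case 7: n += 3; /* fall through */
    case 8: n += 4;
    }
    return n;
}

/* Call bigswitch() `count` times, moving the entry point by `stride`
   after each call and wrapping within 1..NUM_CASES.
   A negative stride walks the entry points backwards. */
long run_pattern(int start, int stride, int count) {
    long sum = 0; /* accumulate results so the calls are not dead code */
    int entry = start;
    for (int c = 0; c < count; ++c) {
        sum += bigswitch(entry);
        entry = ((entry - 1 + stride) % NUM_CASES + NUM_CASES) % NUM_CASES + 1;
    }
    return sum;
}
```

With the real generated function you would raise `NUM_CASES` to match it. Something like `run_pattern(NUM_CASES, -1, NUM_CASES)` then enters one case earlier on each call, while a large stride makes the entry point jump well past the previous one.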
The same technique could be applied to generate lots of functions too, to ensure the cache can be 'busted' at will. So you may have bigswitch001, bigswitch002, etc. You might call this using a switch which you also generate.
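A generator along those lines might look roughly like this (again untested against a real cache). `emit_functions` and the dispatcher name `call_bigswitch` are hypothetical. I also add the function number to each constant, on the assumption that this keeps the toolchain from treating the functions as identical.

```c
#include <stdio.h>

/* Emit num_funcs functions named bigswitch001, bigswitch002, ...,
   each with cases_per_func fall-through cases, followed by a
   dispatcher (hypothetical name: call_bigswitch) that selects one. */
void emit_functions(FILE *out, int num_funcs, int cases_per_func) {
    for (int f = 1; f <= num_funcs; ++f) {
        fprintf(out, "int bigswitch%03d(int n) {\n switch (n) {\n", f);
        for (int i = 1; i <= cases_per_func; ++i) {
            /* "+ f" varies the constants from one function to the next */
            fprintf(out, "  case %d: n += %d;\n", i, i / 2 + f);
        }
        fprintf(out, " }\n return n;\n}\n\n");
    }

    /* The dispatcher is itself a generated switch over the function number. */
    fprintf(out, "int call_bigswitch(int which, int n) {\n switch (which) {\n");
    for (int f = 1; f <= num_funcs; ++f) {
        fprintf(out, "  case %d: return bigswitch%03d(n);\n", f, f);
    }
    fprintf(out, " }\n return n;\n}\n");
}
```

Calling `emit_functions(stdout, 64, 256)` from a small `main` and redirecting the output to a .c file would give you 64 functions plus the dispatcher to compile alongside your test driver.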
If you can make each function (approximately) some number of i-cache lines in size, and also generate more functions than will fit in cache, then the problem of generating instruction cache-misses becomes easier to control.
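To put numbers on "more functions than will fit in cache", something like the following could help. It is only a sketch: `icache_size_or` and `functions_needed` are names I invented, and the `_SC_LEVEL1_ICACHE_SIZE` query is a glibc extension that may be missing or return 0 on some systems, hence the fallback argument.

```c
#include <unistd.h>

/* Ask the C library for the L1 instruction-cache size in bytes.
   Returns `fallback` if the query is unavailable or reports nothing useful. */
long icache_size_or(long fallback) {
#ifdef _SC_LEVEL1_ICACHE_SIZE
    long size = sysconf(_SC_LEVEL1_ICACHE_SIZE);
    if (size > 0) {
        return size;
    }
#endif
    return fallback;
}

/* Number of functions of func_bytes each needed to cover `factor` times
   the cache size, rounded up. */
long functions_needed(long cache_bytes, long func_bytes, long factor) {
    return (cache_bytes * factor + func_bytes - 1) / func_bytes;
}
```

For example, if the i-cache were 32 KiB and each generated function came out at about 1 KiB, `functions_needed(32768, 1024, 2)` gives 64 functions to cover twice the cache size. The 1 KiB figure is just an illustration; measure your own functions as described below.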
You can see exactly how big a function, an entire switch statement, or each leg of a switch statement is by dumping the assembler output (using gcc -S), or by disassembling the .o file with objdump -d. So you could 'tune' the size of a function by adjusting the number of case: statements. You could also choose how many cache lines are hit, by judicious choice of the parameter to bigswitchNNN().