Adding Metadata to Instructions in LLVM IR

ash · Nov 17, 2012

First up, I am a newbie to LLVM passes.

I am trying to add metadata to instructions in LLVM after a transformation pass (with the C++ API). I intend to store this information for use by another tool in a tool chain. I have two questions regarding this.

  1. I expect the information I store as metadata to feed into another tool which works on the LLVM IR. So is metadata a good idea? I intend to store strings as metadata with some instructions.

  2. If metadata is the right way to go here, I need some help creating a metadata node. I plan to use the setMetadata() function to attach it to an instruction. Which variant of setMetadata() is the right one to use? I am not sure which MDKind my data should be. I want to create an MDString, attach it to my MDNode and then call setMetadata() on an instruction. Which context should I use in setMetadata() if I want to attach the metadata to an instruction inside a function? What is the relevance of the context to metadata?

I read a lot of forum discussions and the LLVM doxygen docs, but I did not get a clear and complete answer to all my questions. I would appreciate your help or any material that could help me understand this.

Answer

Oak · Dec 11, 2012

In my opinion:

1. Is metadata the right mechanism to use?

If your "other tool" is not a pass in itself, then yes, I think metadata is the best approach - keeps everything in the IR, easy to identify by eye, simple to manually add for testing, and - perhaps most importantly - does not collide with anything else, as long as you don't reuse existing metadata kinds.

However, if your "other tool" is a pass by itself, there's an alternative: you can make one pass dependent on the other, and then use the information from the earlier pass directly in the later pass. The advantage is that you don't have to modify the IR.

2. How to use a custom metadata node?

Use the variant of setMetadata that takes the kind name as a string (char* at the time of writing, StringRef in current LLVM), like so:

LLVMContext& C = Inst->getContext();
MDNode* N = MDNode::get(C, MDString::get(C, "my md string content"));
Inst->setMetadata("my.md.name", N);

If this is the first time the name is passed to setMetadata, my.md.name is automatically registered as a new metadata kind. Kinds are registered in the LLVMContext rather than in a single module, so the name maps to the same kind in every module that shares that context. You can later retrieve the string with:

cast<MDString>(Inst->getMetadata("my.md.name")->getOperand(0))->getString();

If you want to invoke getMetadata or setMetadata repeatedly from the same scope, though, you can also use Module::getMDKindID to just get the actual kind used, and use the variations of these methods that use the kind value.

Finally, be aware that you can limit the metadata node scope to be inside a function - use the MDNode::get(..., ..., true) variant for that - though I never used it myself.