getline() vs. fgets(): Control memory allocation

edavid · May 3, 2019 · Viewed 11.5k times

To read lines from a file there are getline() (POSIX) and fgets() (standard C), ignoring the dreaded gets(). The conventional wisdom is that getline() is preferred over fgets() because it allocates the line buffer as needed.

My question is: Isn’t that dangerous? What if by accident or malicious intent someone creates a 100GB file with no '\n' byte in it – won’t that make my getline() call allocate an insane amount of memory?

Answer

John Bollinger · May 3, 2019

Yes, what you describe is a plausible risk. However,

  • if the program requires loading an entire line into memory at once, then allowing getline() to attempt to do that is not inherently more risky than writing your own code to do it with fgets(); and
  • if you have a program that has such a vulnerability, then you can mitigate the risk by using setrlimit() to limit the total amount of (virtual) memory it can reserve, for example through the RLIMIT_AS resource. With such a limit in place, an oversized allocation fails instead of succeeding and interfering with the rest of the system.
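
As a rough illustration of the setrlimit() point, here is a minimal sketch that lowers the soft RLIMIT_AS limit. The helper name limit_address_space() is made up for this example, and the right cap depends entirely on the program. The sketch assumes a system such as Linux where RLIMIT_AS is enforced.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not a library function): lower the soft limit on
 * the process's address space to at most max_bytes. The hard limit is
 * left untouched so a later call can still raise the soft limit again.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int limit_address_space(rlim_t max_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;

    /* The soft limit may not exceed the hard limit, so clamp the request. */
    if (rl.rlim_max != RLIM_INFINITY && max_bytes > rl.rlim_max)
        max_bytes = rl.rlim_max;

    rl.rlim_cur = max_bytes;
    return setrlimit(RLIMIT_AS, &rl);
}
```

A program would call something like limit_address_space((rlim_t)1 << 30) early in main(), before reading untrusted input. Once the limit is reached, getline() should return -1 (POSIX lists ENOMEM as an error for it) rather than continuing to grow the buffer, so the caller still has to check the return value and errno.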

Best overall, I'd argue, is to write code that does not require input in units of full lines (all at once) in the first place, but such an approach has its own complexities.
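
To make that last point concrete, here is one possible sketch of processing input without ever holding a full line in memory. It reads fixed-size chunks with fread() and carries only a running counter across chunk boundaries. The function name scan_lines(), the CHUNK_SIZE value, and the task itself (counting lines and finding the longest one) are assumptions chosen for illustration, not something taken from the question.

```c
#include <stdio.h>
#include <string.h>

enum { CHUNK_SIZE = 4096 };  /* arbitrary buffer size chosen for this sketch */

/* Hypothetical helper: scan `in` in fixed-size chunks and report the number
 * of lines and the length of the longest line (excluding the '\n').
 * Memory use is bounded by CHUNK_SIZE no matter how long a line is.
 * Returns 0 on success, -1 on a read error. */
int scan_lines(FILE *in, size_t *line_count, size_t *longest)
{
    char buf[CHUNK_SIZE];
    size_t n;
    size_t current = 0;  /* length so far of the line being scanned */

    *line_count = 0;
    *longest = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        size_t start = 0;
        while (start < n) {
            char *nl = memchr(buf + start, '\n', n - start);
            if (nl == NULL) {
                /* No newline in the rest of this chunk: the line
                 * continues into the next chunk. */
                current += n - start;
                break;
            }
            current += (size_t)(nl - (buf + start));
            if (current > *longest)
                *longest = current;
            ++*line_count;
            current = 0;
            start = (size_t)(nl - buf) + 1;
        }
    }

    if (ferror(in))
        return -1;

    /* Count a final line that has no terminating newline. */
    if (current > 0) {
        if (current > *longest)
            *longest = current;
        ++*line_count;
    }
    return 0;
}
```

The extra complexity mentioned above shows up in the state carried between chunks (here only `current`). Any task that needs more context than a counter, such as parsing fields, has to carry correspondingly more state across chunk boundaries.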