Read and remove first (or last) line from txt file without copying

user1587451 picture user1587451 · Mar 1, 2016 · Viewed 7.6k times · Source

I wanna read and remove the first line from a txt file (without copying, it's a huge file).
I've read the net but everybody just copies the desired content to a new file. I can't do that.

Below a first attempt. This code will be stucked in a loop as no lines are removed. If the code would remove the first line of file at each opening, the code would reach the end.

#include <iostream>
#include <string>
#include <fstream>
#include <boost/interprocess/sync/file_lock.hpp>

int main() {
    std::string line;
    std::fstream file;
    boost::interprocess::file_lock lock("test.lock");
    while (true) {
        std::cout << "locking\n";
        lock.lock();
        file.open("test.txt", std::fstream::in|std::fstream::out);
        if (!file.is_open()) {
            std::cout << "can't open file\n";
            file.close();
            lock.unlock();
            break;
        }
        else if (!std::getline(file,line)) {
            std::cout << "empty file\n"; //
            file.close();                // never
            lock.unlock();               // reached
            break;                       //
        }
        else {
            // remove first line
            file.close();
            lock.unlock();
            // do something with line
        }
    }
}

Answer

Colin Pitrat picture Colin Pitrat · Mar 1, 2016

What you want to do, indeed, is not easy.

If you open the same file for reading and writing in it without being careful, you will end up reading what you just wrote and the result will not be what you want.

Modifying the file in place is doable: just open it, seek in it, modify and close. However, you want to copy all the content of the file except K bytes at the beginning of the file. It means you will have to iteratively read and write the whole file by chunks of N bytes.

Now once done, K bytes will remain at the end that would need to be removed. I don't think there's a way to do it with streams. You can use ftruncate or truncate functions from unistd.h or use Boost.Interprocess truncate for this.

Here is an example (without any error checking, I let you add it):

#include <iostream>
#include <fstream>
#include <unistd.h>

int main()
{
  std::fstream file;
  file.open("test.txt", std::fstream::in | std::fstream::out);

  // First retrieve size of the file
  file.seekg(0, file.end);
  std::streampos endPos = file.tellg();
  file.seekg(0, file.beg);

  // Then retrieve size of the first line (a.k.a bufferSize)
  std::string firstLine;
  std::getline(file, firstLine);

  // We need two streampos: the read one and the write one
  std::streampos readPos = firstLine.size() + 1;
  std::streampos writePos = 0;

  // Read the whole file starting at readPos by chunks of size bufferSize
  std::size_t bufferSize = 256;
  char buffer[bufferSize];
  bool finished = false;
  while(!finished)
  {
    file.seekg(readPos);
    if(readPos + static_cast<std::streampos>(bufferSize) >= endPos)
    {
      bufferSize = endPos - readPos;
      finished = true;
    }
    file.read(buffer, bufferSize);
    file.seekg(writePos);
    file.write(buffer, bufferSize);
    readPos += bufferSize;
    writePos += bufferSize;
  }
  file.close();

  // No clean way to truncate streams, use function from unistd.h
  truncate("test.txt", writePos);
  return 0;
}

I'd really like to be able to provide a cleaner solution for in-place modification of the file, but I'm not sure there's one.