After writing to a file, why does os.path.getsize still return the previous size?

Question 1

After writing to a file, why does os.path.getsize still return the previous size?

python filesize

Maulin · Jun 18, 2009 · Viewed 25.8k times

Answer

File size is not the same thing as file position. For example,

os.path.getsize('sample.txt')

returns the size of the file as it currently exists on disk, in bytes.

By contrast,

f = open('sample.txt')
print(f.readline())
print(f.tell())

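Here f.tell() returns the current position of the file object, i.e. where the next read or write will happen. Python buffers output, so data passed to write() may not reach the disk until the buffer fills or you call flush() or close(); until then os.path.getsize() keeps reporting the old size. Because tell() takes the buffered data into account, it should be accurate as long as you are simply appending to the output file.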
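To make the difference concrete, here is a minimal sketch, assuming a small binary write to a temporary file. The helper name sizes_while_writing is made up for illustration; it compares f.tell() with os.path.getsize() before and after an explicit flush().

```python
import os
import tempfile


def sizes_while_writing(payload=b"x" * 100):
    """Return (tell position, on-disk size before flush, on-disk size after flush)."""
    # Hypothetical helper: writes to a throwaway temporary file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(payload)                 # a small write typically stays in the buffer
            position = f.tell()              # includes bytes that are still buffered
            before = os.path.getsize(path)   # only sees what has reached the disk
            f.flush()                        # push the buffered bytes to the OS
            after = os.path.getsize(path)
        return position, before, after
    finally:
        os.remove(path)


position, before, after = sizes_while_writing()
print("tell:", position, "getsize before flush:", before, "after flush:", after)
```

With the default buffer size, the size reported before the flush is usually smaller than the position reported by tell(). For a size threshold check while appending, either call flush() before getsize() or simply compare against tell().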

Question 2

I am trying to split up a large XML file into smaller chunks. I write to the output file and then check its size to see if it has passed a threshold, but I don't think the getsize() method is working as expected.

What would be a good way to get the size of a file that is still being written to?

I've done something like this...

import os

f1 = open('VSERVICE.xml', 'r')
f2 = open('split.xml', 'w')

for line in f1:
  if line == '</Service>\n':
    break
  else:
    f2.write(line)
    size = os.path.getsize('split.xml')
    print('size = ' + str(size))

Running this prints 0 as the file size for about 80 iterations and then 4176. Does Python store the output in a buffer before actually writing it out?
