I am trying to split up a large XML file into smaller chunks. I write to the output file and then check its size to see if it has passed a threshold, but I don't think the getsize() method is working as expected.
What would be a good way to get the size of a file that is still growing?
I've done something like this...
import string
import os

f1 = open('VSERVICE.xml', 'r')
f2 = open('split.xml', 'w')

for line in f1:
    if str(line) == '</Service>\n':
        break
    else:
        f2.write(line)
    size = os.path.getsize('split.xml')
    print('size = ' + str(size))
Running this prints 0 as the file size for about 80 iterations and then 4176. Does Python store the output in a buffer before actually writing it to the file?
File size is different from file position. For example,
os.path.getsize('sample.txt')
This returns the file's current size on disk, in bytes.
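That on-disk size is exactly what lags behind in the question: while write() output is sitting in Python's internal buffer, getsize() does not see it yet. A minimal sketch of the effect (the file name is just a placeholder):

import os

f = open('sample.txt', 'w')
f.write('some data\n')

# The write can still be sitting in Python's buffer,
# so the size on disk may still show as 0 here.
print(os.path.getsize('sample.txt'))

f.flush()  # push the buffered data out to the OS
print(os.path.getsize('sample.txt'))  # now reflects the written bytes

f.close()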
But
f = open('sample.txt')
print(f.readline())
f.tell()
Here f.tell() returns the file object's current position - i.e. where the next read or write will happen. Because it accounts for Python's buffering, it should be accurate as long as you are simply appending to the output file.
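Applied to the splitting loop from the question, a rough sketch along those lines would check f2.tell() against a size threshold instead of calling getsize() (the 4 KB limit here is just an example value):

CHUNK_LIMIT = 4096  # example threshold in bytes

f1 = open('VSERVICE.xml', 'r')
f2 = open('split.xml', 'w')

for line in f1:
    if line == '</Service>\n':
        break
    f2.write(line)
    # tell() reflects everything written so far, including data
    # still in the buffer, so it does not lag like getsize() does.
    if f2.tell() >= CHUNK_LIMIT:
        print('split.xml has reached ' + str(f2.tell()) + ' bytes')
        break

f2.close()
f1.close()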