I was recently asked to help a fellow postgrad fix his network simulator tracefile analyser to run concurrently with the simulator. The problem was that the analyser was reading the tracefile faster than the simulator could fill it, and thus stopping its analysis when it reached the end of the file without waiting for more data.
The first solution was just to sit in an infinite loop waiting for more data, but then you have to kill the analyser to stop reading and you don't get the report at the end. After some consultation with the gurus in #schlock_mercenary on the Nightstar IRC network (thanks Rhamphoryncus (Adam Olsen in meatspace) and Vornicus), the following code was born:
import sys
import time


class logfile(file):
    """A file whose line iterator waits for more data at end-of-file."""

    def __init__(self, *args, **kwargs):
        file.__init__(self, *args, **kwargs)
        self.logTimeout = 5               # Seconds to wait for new data.
        self.logSleepTime = 0.1           # Seconds to sleep between polls.
        self.logOnlyCompleteLines = True  # Hold back lines lacking "\n".

    def next(self):
        if self.logTimeout <= 0 or self.logSleepTime <= 0:
            counter = None                # Wait indefinitely.
        else:
            counter = self.logTimeout / self.logSleepTime
        # Fall back to 0.1s polling if the sleep time is not positive.
        sleepTime = self.logSleepTime if self.logSleepTime > 0 else 0.1
        fpos = self.tell()                # Store current position in file.
        while counter is None or counter > 0:
            line = self.readline()
            if line:
                if line[-1] == "\n" or not self.logOnlyCompleteLines:
                    return line
                self.seek(fpos)           # "Unread" incomplete line.
            time.sleep(sleepTime)
            if counter is not None:
                counter -= 1
        raise StopIteration


if __name__ == "__main__":
    if len(sys.argv) > 1:
        f = logfile(sys.argv[1])          # First argument is the file name.
        for line in f:
            print line,
        f.close()
How to use it
The class is used exactly as if it were a file. The only real difference is that a loop over the file contents (for line in file:) will wait a certain amount of time for more data before it ends. This threshold can be set with file.logTimeout = <number> (in seconds; the default is 5). The length of time it sleeps between looking for new data can be set with file.logSleepTime = <number> (default 0.1). If either of those parameters is zero or negative, it will wait indefinitely for more data. If file.logOnlyCompleteLines is True (the default), it'll wait for a complete line (ending in a newline) before it returns. This is useful if you don't want to (or can't) process partial lines and whatever's writing the data doesn't buffer writes for full lines.
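To see how those three settings interact, here is a rough Python 3 sketch. The listing above depends on Python 2's built-in file type, which Python 3 no longer has, so this wraps an ordinary file object instead of subclassing it. The names LogFollower and read_all are made up for this illustration, the 0.1 second fallback poll interval is an assumption, and the short timeout is only there so the example finishes quickly.

```python
import time


class LogFollower:
    """Hypothetical Python 3 stand-in for the logfile class (illustration only)."""

    def __init__(self, path):
        self._f = open(path, "r")
        self.logTimeout = 5               # seconds to wait for new data
        self.logSleepTime = 0.1           # seconds between polls
        self.logOnlyCompleteLines = True  # hold back lines without "\n"

    def __iter__(self):
        return self

    def __next__(self):
        if self.logTimeout <= 0 or self.logSleepTime <= 0:
            counter = None                # wait indefinitely
        else:
            counter = self.logTimeout / self.logSleepTime
        # Assumed fallback: poll every 0.1s if the sleep time is not positive.
        sleep_time = self.logSleepTime if self.logSleepTime > 0 else 0.1
        fpos = self._f.tell()             # remember where this line starts
        while counter is None or counter > 0:
            line = self._f.readline()
            if line:
                if line.endswith("\n") or not self.logOnlyCompleteLines:
                    return line
                self._f.seek(fpos)        # "unread" the incomplete line
            time.sleep(sleep_time)
            if counter is not None:
                counter -= 1
        raise StopIteration

    def close(self):
        self._f.close()


def read_all(path, timeout=0.2, sleep_time=0.05, only_complete=True):
    """Collect every line the follower yields before it times out."""
    f = LogFollower(path)
    f.logTimeout = timeout
    f.logSleepTime = sleep_time
    f.logOnlyCompleteLines = only_complete
    try:
        return list(f)
    finally:
        f.close()
```

With a file whose last line has no trailing newline yet, read_all(path) should return only the finished lines once the timeout expires, while read_all(path, only_complete=False) should hand back the partial line as well.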
How it happened
I started with everything in an infinite loop running over a normal file object, but that seemed messy. Rhamphoryncus suggested subclassing file, and things snowballed from there.
I tried overriding readline(), but apparently the next() in a normal file doesn't use it. Then I tried using the inherited next(), but its readahead buffer kept causing problems. Since my implementation doesn't use the readahead, it's not as efficient at reading the existing parts of the file, but sometimes you have to make a tradeoff.
This works for everything I've tested it with. If you have any problems or ideas for features, give me a shout.