How to pause a python script running in terminal

jerrymouse picture jerrymouse · Dec 25, 2011 · Viewed 8.8k times · Source

I have a web crawling python script running in terminal for several hours, which is continuously populating my database. It has several nested for loops. For some reasons I need to restart my computer and continue my script from exactly the place where I left. Is it possible to preserve the pointer state and resume the previously running script in terminal?

I am looking for a solution which will work without altering the python script. Modifying the code is a lower priority as that would mean to relaunch the program and reinvest time.

Update: Thanks for the VM suggestion. I'll take that. For the sake of completion, what generic modifications should be made to script to make it pause and resumable?

Update2: Porting on VM works fine. I have also modified script to make it failsafe against network failures. Code written below.

Answer

Abhijit picture Abhijit · Dec 25, 2011

You might try suspending your computer or running in a virtual machine which you can subsequently suspend. But as your script is working with network connections chances are your script won't work from the point you left once you bring up the system. Suspending a computer and restoring it or saving a Virtual M/C and restoring it would mean you need to restablish the network connection. This is true for any elements which are external to your system and network is one of them. And there are high chances that if you are using a dynamic network, the next time you boot chances are you would get a new IP and the network state that you were working previously would be void.

If you are planning to modify the script, few things you need to keep it mind.

  1. Add serializing and Deserializing capabilities. Python has the pickle and the faster cPickle method to do it.
  2. Add Restart points. The best way to do this is to save the state at regular interval and when restarting your script, restart from last saved state after establishing all the transients elements like network.

This would not be an easy task so consider investing a considrable amount of time :-)

Note***

On a second thought. There is one alternative from changing your script. You can try using cloud Virtualization Solutions like Amazon EC2.