Trigger checkpointing from within the application

Matthew Dillon dillon at apollo.backplane.com
Mon Nov 22 20:41:48 PST 2004


    The current checkpoint implementation has two signals, SIGCKPT and
    SIGCKPTEXIT (signals 33 and 34).  SIGCKPT means 'checkpoint and continue'
    and that is the one that the TTY will generate when you hit ^E.

    A process that is checkpoint-aware should be able to set a signal
    handler for SIGCKPT.  This will prevent the automatic checkpoint from
    occurring so your program can then control when the checkpoint is to
    occur.

    Unfortunately that's where the work ended.  Theoretically one can make
    the checkpoint system call but we have not incorporated the system call
    into the master syscall list yet (it only exists as a module and the
    checkpt program is kinda hacked to generate the syscall to restore).

    It seems that there is more than a passing interest in the checkpointing
    code so I will finish up the interface and make the system call available.
    At that point you will be able to set a signal handler to catch the
    request and then call the checkpointing code when convenient.  The
    resume would not be another signal, it would simply resume with a
    different return code from the system call (so you can tell the difference
    between the initial call that dumps the checkpoint and the
    resume-from-checkpoint by looking at the return code of the system call).
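
    As a rough sketch of how a caller might use such a return code: the
    system call interface is not finalized, so the call itself is passed
    in as a function pointer and replaced by stubs here.  The convention
    assumed below (0 after the checkpoint is written, positive on resume,
    negative on error) follows the proposal quoted further down and is
    not a confirmed interface.  Every name in the sketch is hypothetical.

```c
/* Hypothetical stand-in for the checkpoint system call.  Assumed
 * convention: 0 = checkpoint written, > 0 = resumed from a checkpoint,
 * < 0 = error. */
typedef int (*ckpt_call_fn)(const char *path);

/* Counts how often external state was rebuilt; illustration only. */
int reopen_count = 0;

/* Placeholder for whatever the application must redo after a resume,
 * such as reopening sockets. */
void reopen_sockets(void)
{
    reopen_count++;
}

/* Stubs that simulate the three assumed outcomes of the call. */
int fake_ckpt_saved(const char *path)   { (void)path; return 0; }
int fake_ckpt_resumed(const char *path) { (void)path; return 1; }
int fake_ckpt_failed(const char *path)  { (void)path; return -1; }

/* Make the checkpoint call and react to its return code.
 * Returns 0 if the checkpoint was just written, 1 if the process was
 * resumed (reinit has been run), -1 on error. */
int checkpoint_and_classify(ckpt_call_fn do_ckpt, const char *path,
                            void (*reinit)(void))
{
    int rc = do_ckpt(path);

    if (rc < 0)
        return -1;          /* checkpoint failed, nothing else to do */
    if (rc > 0) {
        if (reinit)
            reinit();       /* resumed: rebuild sockets and the like */
        return 1;
    }
    return 0;               /* initial call: keep running as before */
}
```

    With the real call in place of the stubs, the same code path runs
    both when the checkpoint is dumped and when it is restored, and only
    the return code tells the two apart.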

    In any case, that sounds like a bit of fun and now that I've fixed
    chmod/chown I need to have a bit of fun, so I'll get it done tonight.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>

:Michael Neumann wrote:
:
:> Hi,
:> 
:> I'd like to checkpoint my application without the user having to type
:> ^E. It's a server application. What I'm dreaming of is a syscall like this:
:> 
:>   if (checkpt("filename.ckpt")) {
:>     // checkpoint was restored
:>   }
:> 
:> checkpt() will checkpoint the running application as ^E does. The checkpt
:> syscall returns 0 for the running application and != 0 when the
:> checkpointed application is restored.
:> 
:> Or instead of a syscall, how about a special signal that is delivered
:> after the checkpoint was restored?
:> 
:> I have to run some procedures to set up sockets etc. after a checkpoint
:> has been restored.
:> 
:> Regards,
:> 
:>   Michael




