[linux-elitists] Cluster filesystems

Ben Woodard woodard@redhat.com
Thu Jan 1 13:02:31 PST 2004


On Thu, 2004-01-01 at 20:50, Jason Spence wrote:
>  - Or even better yet, why hasn't anyone implemented process
>    migration?  As far as I can tell (but haven't tried (yet)), you
>    just need to do this for the process, its threads, and any
>    dependent processes (like the X server if you're migrating
>    something that's doodling on the X display):
> 

The people at Lawrence Berkeley Labs are working on checkpoint/restart
semantics. I missed the update on this topic a couple of weeks ago, so
I really don't know what state it is in right now. What I do know is
that they have been given money to have four or five programmers
working on this problem for the next five years.

>     1) Have a common filesystem (including devnodes)
> 
>     2) Read /proc/pid/map to get a memory map
> 
>     3) SIGSTOP(pid)
> 
>     4) ptrace(PT_READ_something, pid, someaddr, 0); [2]
>        for all the memory sections (and get the registers too)
> 
>     5) Figure out file descriptors, SysV IPC usage, sockets, etc and
>        write them down somewhere
> 
>     6) Kill everything on the source host
> 
>     7) Move all the paperwork over to the target host over the network
>        and start a dummy program in unused VM space that reallocates
>        all the resources [3], writes the process sections into its VM
>        space, spawns threads, copies all the registers over for each
>        thread, [5] and then jumps to the PC value retrieved after the
>        SIGSTOP for each thread. [6] [7]
> 
>     8) Pray.
> 
>   The idea being that in a well maintained network, controlled
>   shutdowns do not have to impact the use of applications hosted on
>   the machine being shut down (much).
> 
>   Oh nuts, you'd have to intercept all the hardware I/O on the source
>   to reconstruct the state of hardware devices if you're migrating
>   something that talks directly to hardware.  Oh wait, that would be
>   bad because the device might not like being reinitialized.  Hmm.
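
For what it's worth, the bookkeeping half of that list (steps 2, 3 and
5) can be sketched in a few lines. This is only a rough illustration
under some assumptions: it is Linux-only, it reads /proc/<pid>/maps
(the file is called "maps" on Linux, not "map"), and the names Mapping,
parse_maps_line, read_maps, list_fds and snapshot are made up for this
example. It stops the process before reading the map rather than after,
and it does not touch memory contents or registers (step 4), which is
where the real work is.

```python
import os
import signal
from typing import Dict, List, NamedTuple, Tuple


class Mapping(NamedTuple):
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    # Format: start-end perms offset dev inode [pathname]
    # The pathname is optional and may contain spaces, so split at most 5 times.
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid: int) -> List[Mapping]:
    # Step 2: get the memory map of the target process.
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]


def list_fds(pid: int) -> Dict[int, str]:
    # Step 5 (partially): map each open descriptor to what it points at.
    # Sockets and pipes show up as "socket:[inode]" / "pipe:[inode]";
    # SysV IPC is not covered here at all.
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            pass  # descriptor went away between listdir and readlink
    return result


def snapshot(pid: int) -> Tuple[List[Mapping], Dict[int, str]]:
    # Step 3: stop the process so the map does not change under us.
    # kill() only queues the signal; a careful tool would wait until the
    # process is actually in the stopped state before reading anything.
    os.kill(pid, signal.SIGSTOP)
    return read_maps(pid), list_fds(pid)
```

Calling snapshot(pid) on a process you own gives you the list of
mappings and a descriptor table to "write down somewhere"; send SIGCONT
afterwards if you want the process to keep running. Everything after
that (reading the pages, recreating sockets, restoring registers on the
target) is not shown.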



