[linux-elitists] Cluster filesystems

Ben Woodard woodard@redhat.com
Thu Jan 1 13:02:31 PST 2004

On Thu, 2004-01-01 at 20:50, Jason Spence wrote:
>  - Or even better yet, why hasn't anyone implemented process
>    migration?  As far as I can tell (but haven't tried (yet)), you
>    just need to do this for the process, its threads, and any
>    dependent processes (like the X server if you're migrating
>    something that's doodling on the X display):

The people at Lawrence Berkeley Lab are working on checkpoint/restart
semantics. I missed the update on this topic a couple of weeks ago, so
I really don't know what state it is in right now. What I do know is
that they have been given money to have four or five programmers
working on this problem for the next five years.

>     1) Have a common filesystem (including devnodes)
>     2) Read /proc/pid/maps to get a memory map
>     3) SIGSTOP(pid)
>     4) ptrace(PT_READ_something, pid, someaddr, 0); [2]
>        for all the memory sections (and get the registers too)
>     5) Figure out file descriptors, SysV IPC usage, sockets, etc and
>        write them down somewhere
>     6) Kill everything on the source host
>     7) Move all the paperwork over to the target host over the network
>        and start a dummy program in unused VM space that reallocates
>        all the resources [3], writes the process sections into its VM
>        space, spawns threads, copies all the registers over for each
>        thread, [5] and then jumps to the PC value retrieved after the
>        SIGSTOP for each thread. [6] [7]
>     8) Pray.
>   The idea being that in a well maintained network, controlled
>   shutdowns do not have to impact the use of applications hosted on
>   the machine being shut down (much).
>   Oh nuts, you'd have to intercept all the hardware I/O on the source
>   to reconstruct the state of hardware devices if you're migrating
>   something that talks directly to hardware.  Oh wait, that would be
>   bad because the device might not like being reinitialized.  Hmm.
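
For what it's worth, the bookkeeping half of that list (steps 2 and 5)
is the easy part to sketch. The following is a rough illustration only,
assuming a Linux /proc layout; the names Mapping, parse_maps_line and
snapshot are made up for this example and do not come from any existing
checkpoint tool. It does not touch memory contents, registers, SysV IPC
or socket state, which is where the real work is.

```python
import os
from dataclasses import dataclass


@dataclass
class Mapping:
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    # Line format: start-end perms offset dev inode [path]
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    # Anonymous mappings have no path field at all.
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], int(fields[2], 16), path)


def snapshot(pid: int):
    """Collect the memory map and open file descriptors of a process.

    Assumes a Linux-style /proc and permission to inspect the target.
    """
    with open(f"/proc/{pid}/maps") as f:
        mappings = [parse_maps_line(l) for l in f if l.strip()]

    fds = {}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the file, pipe or socket.
            fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            pass
    return mappings, fds
```

Calling snapshot(os.getpid()) on a Linux box returns the list of
mappings plus a dict from descriptor number to link target. Knowing that
descriptor 5 is "socket:[12345]" tells you nothing about the state of
the connection behind it, which is exactly the part that makes step 8
necessary.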
