[linux-elitists] Congruent Infrastructure

Andy Bennett andyjpb at ashurst.eu.org
Sun Sep 8 11:24:38 PDT 2013


> For those who don't know, getupdates is very simple, it really just
> retrieves a shell script and a tar file for each changeset you updated, and
> runs that.

> I know puppet offers extra functionality, but IMO, it is not needed in many
> cases and that adds complexity.
> It may be possible to use puppet like I used getupdates and take advantage
> of whatever collectors they have without having to learn a lot of extra
> stuff, but I just don't know puppet, so I can't say.

Right. It seems like the way to do these things is to stay down at the
file system level, where you don't have to know any specifics of what
you're installing. I've heard of other successful tools that have been
maintained for decades and ported between distros, and all of them seem
to operate well below the package level.
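
For anyone following along, the getupdates flow described above (one
shell script plus one tar file per changeset, run in order) might look
roughly like the sketch below. This is not the real tool: the spool
layout, the file names and every function name are my own assumptions,
and I'm assuming the script is the thing that unpacks the tar file.

```python
import subprocess
from pathlib import Path


def pending_changesets(available, applied):
    """Return the changeset numbers not applied yet, oldest first."""
    done = set(applied)
    return sorted(n for n in available if n not in done)


def apply_changeset(spool: Path, number: int) -> None:
    """Run one changeset's script, handing it the matching tar file.

    Hypothetical layout: <spool>/<number>.sh and <spool>/<number>.tar,
    already fetched from the server by some other means.
    """
    script = spool / f"{number}.sh"
    tarball = spool / f"{number}.tar"
    # check=True raises on a non-zero exit, which stops update() before
    # any later changeset is applied on top of a failed one.
    subprocess.run(["sh", str(script), str(tarball)], check=True)


def update(spool: Path, available, applied):
    """Apply every pending changeset in order; return the new applied list."""
    applied = list(applied)
    for number in pending_changesets(available, applied):
        apply_changeset(spool, number)
        applied.append(number)
    return applied
```

Nothing in there knows what is being installed, which is the appeal:
the only state is the list of changesets already applied.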

> So the last thing I worked on, was production machines at google. Those are
> much easier (and unusual because few people do this): they are all the same
> for the maintained portion.

Wow. I thought that keeping everything the same was just common sense. I
figured it was widespread, and that I'd just always run into amateurs,
or people who thought they were operating on too small a scale to
bother, or on such a large scale that they couldn't avoid the special
cases.

> In other words it's pretty much like rsyncing all the entire filesystem with
> an exclude list of files for stuff like network config.
> I'm not going to give the talk via Email, you can read there
> http://marc.merlins.org/linux/talks/ProdNG-LC2013-JP/
> but basically that 100% takes care of divergence :)
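
As a rough illustration of the "rsync with an exclude list" idea, here
is a sketch that only builds the rsync command line. The paths and the
function name are made up, and I'm not claiming this is how ProdNG does
it. -a, --delete and --exclude are standard rsync options.

```python
def build_sync_command(golden_root, target_root, excludes):
    """Build an rsync command that makes target_root match golden_root.

    Paths matching an exclude pattern are neither copied nor deleted,
    so per-machine files such as network config survive the sync.
    """
    cmd = ["rsync", "-a", "--delete"]
    # A leading "/" anchors the pattern at the root of the transfer.
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    # Trailing slashes mean "the contents of this directory".
    cmd += [golden_root.rstrip("/") + "/", target_root.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical golden image and per-machine files.
    print(" ".join(build_sync_command(
        "/srv/golden", "/", ["/etc/hostname", "/etc/network/"])))
```

Because --delete removes anything not in the golden image, whatever is
not on the exclude list cannot drift.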

Thanks for this, it's really interesting. I once had an infrastructure
project I called ProdNG, which was about how we'd move from a bunch of
machines managed in an ad hoc way to something more central. I left
before it really got started, and when I recently spoke to someone there
it sounded like they're running exactly the same machines in the same
configs 8 years later.

Why was it important to be able to upgrade without rebooting? Surely you
have so many machines, and are so prepared for them to fail, that you
can just rotate out a few at a time?

Also, I see you mention "Start with easy packages like 'ed' and 'bc'".
Don't these come under the heading of "things that made sense to ship as
part of RH 7.1, but were useless to us", and so were already removed?

> (however, yes it means you cannot install your custom software on the
> managed root filesystem, it needs to be outside of that, meaning that you
> cannot use rpm/dpkg to install a package off the net and need to compile
> your software to run with different pathnames if applicable, which is what
> we do at google regardless)

It seems sensible to manage different layers of the infrastructure in
different ways. I have long been a fan of epkg (no, not the Gentoo
thing, this: http://encap.org/ ).
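
For those who haven't seen it, the Encap idea is that each package
lives in its own directory and gets symlinked into the shared tree. A
minimal sketch of that follows; the function name and example paths are
mine, and the real epkg does a good deal more than this.

```python
from pathlib import PurePosixPath


def symlink_plan(package_dir, files, target="/usr/local"):
    """Return (link, destination) pairs for one Encap-style package.

    package_dir is the package's private directory; files are paths
    relative to it. Each file gets a link at the same relative path
    under target.
    """
    package = PurePosixPath(package_dir)
    prefix = PurePosixPath(target)
    return [(str(prefix / name), str(package / name)) for name in files]


if __name__ == "__main__":
    # Hypothetical package "foo-1.0" with two files.
    for link, dest in symlink_plan("/usr/local/encap/foo-1.0",
                                   ["bin/foo", "man/man1/foo.1"]):
        print(link, "->", dest)
```

Removing a package is then just deleting the links that point into its
directory, which again needs no knowledge of what the package is.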

What do you think of technology like NixOS? It strikes me that it solves
similar problems to encap, but in a more complicated way, and one that
forcibly subverts some of the benefits of shared libraries.

Are shared library benefits, such as the ability to replace a buggy /
insecure libz, important in large-scale infrastructures, or do they end
up being so dwarfed by other operational concerns as to be irrelevant?

Thanks for the tips!


andyjpb at ashurst.eu.org