please prefer base 64 over base 32 (was: Re: [p2p-hackers] Bitzi (was Various identifier choices))

Gordon Mohr gojomo at usa.net
Wed Sep 19 10:39:01 UTC 2001


Zooko writes:
> Having URLs which are short enough to cut and paste is important. 

I agree.

> Encoding six
> bits per character (base 64) is that much better than encoding five bits per
> character.

Yes, Base64 is 17% more compact than Base32, which is 
20% more compact than Hexadecimal.
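
The arithmetic behind those percentages, as a quick sketch in Python. The
helper name encoded_len is mine, and '=' padding is ignored:

```python
import math

# Bits carried by one character in each encoding.
BITS_PER_CHAR = {"hex": 4, "base32": 5, "base64": 6}

def encoded_len(n_bits, encoding):
    """Characters needed to carry n_bits, ignoring any '=' padding."""
    return math.ceil(n_bits / BITS_PER_CHAR[encoding])

# A 160-bit value such as a SHA1 digest.
for name in BITS_PER_CHAR:
    print(name, encoded_len(160, name))

# Relative savings follow from the bits-per-character ratios.
base64_vs_base32 = 1 - 5 / 6   # about 0.167
base32_vs_hex = 1 - 4 / 5      # about 0.20
```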

But Base64 introduces case-sensitivity. Especially if
you ever use identifier fragments as a shorthand, this
introduces situations where they "bleed together" --
in human perception, in filesystems, in search routines.

Also, Base64 introduces 2 characters that can present
problems in URLs and filenames: '/' and '+'. These
also serve as 'break' characters to many text-index
and text-search routines. 

You could use Freenet's patched Base64, which uses the
characters '~' and '-' instead, but then you've deviated
slightly from a long-standing standard, and still have
the 'break' characters problem.
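
A small Python sketch of both points. Two things here are my assumptions:
which of '~' and '-' replaces which standard character, and the use of a
simple regex split as a stand-in for a real text indexer. The identifier
fragments are made up for illustration.

```python
import base64
import re

# Three bytes chosen so that every 6-bit group is 62 or 63.
raw = bytes([0xFB, 0xEF, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
# Assumption: '~' stands in for '+' and '-' for '/'.
patched = base64.b64encode(raw, altchars=b"~-").decode("ascii")
print(standard, patched)

def tokens(text):
    """Crude indexer stand-in: split on anything that is not a word character."""
    return re.findall(r"\w+", text)

# Hypothetical identifier fragments; both alphabets get broken into pieces.
print(tokens("GxeG+uts/ClM"))
print(tokens("GxeG~uts-ClM"))
```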

In contrast, Base32 is robust across case isomorphisms,
safe for URLs and filesystems, and results in full-length
and fragment identifiers which are typically recognized
as unbroken units by legacy text-search mechanisms.
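
To illustrate the case point, here is a sketch using Python's standard
base64 module on the 20-byte value quoted later in this thread. It assumes
the usual A-Z plus 2-7 Base32 alphabet:

```python
import base64

raw = bytes.fromhex("1b17864eeb6c68294c9b2db0324a2b773401f0da")

b32 = base64.b32encode(raw).decode("ascii")
b64 = base64.b64encode(raw).decode("ascii")

# Base32 survives a trip through a case-folding system...
b32_roundtrip = base64.b32decode(b32.lower(), casefold=True)

# ...while lowercased Base64 still decodes, but to different bytes.
b64_roundtrip = base64.b64decode(b64.lower())

print(b32_roundtrip == raw)
print(b64_roundtrip == raw)
```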

> A mojoid in base-32 would look like this:
> 
> http://localhost:4004/id/1b17864eeb6c68294c9b2db0324a2b773401f0da0537d82626c24a7850e15ef2d6c4265dcd5e85f1

That looks like Hexadecimal to me; the chance that an 80-digit Base32
number would contain no letters G-Z is infinitesimal.
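
A back-of-the-envelope check, assuming the usual Base32 alphabet (A-Z plus
2-7) and uniformly random symbols. The function name is mine:

```python
quoted = ("1b17864eeb6c68294c9b2db0324a2b773401f0da"
          "0537d82626c24a7850e15ef2d6c4265dcd5e85f1")

def p_no_g_to_z(n_chars):
    """Chance that n random Base32 symbols all avoid the 20 letters G-Z."""
    return (12 / 32) ** n_chars

p = p_no_g_to_z(len(quoted))
print(len(quoted), p)

# Under that alphabet the digits 0, 1, 8 and 9 are not valid symbols at all.
has_non_base32_digits = any(c in "0189" for c in quoted)
print(has_non_base32_digits)
```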

> The same mojoid in base-64 would look like this:
> 
> http://localhost:4004/id/GxeGTutsaClMmy2wMkordzQB8NoFN9gmJsJKeFDhXvLWxCZdzV6F8Q

If you're lucky enough not to get any '/' or '+' characters!
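
How lucky? A rough estimate, treating every one of the 54 characters as
uniformly random over the 64-symbol alphabet (an approximation, since the
final character carries fewer bits):

```python
def p_url_clean(n_chars):
    """Chance that n random Base64 characters contain neither '+' nor '/'."""
    return (62 / 64) ** n_chars

p = p_url_clean(54)
print(round(p, 2))
```

By this estimate, fewer than one in five identifiers of that length would
come out free of both characters.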

> That can make a significant difference in terms of usability, due to
> line-wrapping in SMTP gateways and in GUIs, the awkwardness of layout when
> representing this mojoid e.g. in HTML, and the general user experience.  The
> bigger and uglier the URL, the less a user likes to deal with it.

I again agree. However, for the foreseeable future, SHA1 
will be a sufficient casual "mailable" key into Bitzi, 
and SHA1 in Base32 is already a manageable 32 characters. 
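
For the record, a minimal Python sketch of that encoding. The input bytes
are arbitrary, and this is only an illustration, not Bitzi's code:

```python
import base64
import hashlib

digest = hashlib.sha1(b"hello world").digest()   # 20 bytes, 160 bits

as_hex = digest.hex()
as_b32 = base64.b32encode(digest).decode("ascii")

# 160 bits divides evenly into 5-bit groups, so no '=' padding appears.
print(len(as_hex), len(as_b32))
print(as_b32)
```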

I can see with your longer MojoIDs you have a problem; 
there is no need for all identifiers to use the same 
ASCII-compatible-encoding, so perhaps Base64 is the right 
choice for MojoNation. 

If Bitzi were to track and display MojoIDs associated 
with Bitprints, we would display the MojoIDs in whatever 
fashion is typical for MojoNation users.

> By the way, we might try to squeeze mojoids.  I think we can get down to 30
> bytes from 40 (by convincing ourselves that an 80-bit symmetric key has the
> same attack work factor as a 160-bit hash id), so then it would look like:
> 
> http://localhost:4004/id/1b17864eeb6c68294c9b2db0324a2b773401f0da0537d82626c24a7850e1
> 
> or
> 
> http://localhost:4004/id/GxeGTutsaClMmy2wMkordzQB8NoFN9gmJsJKeFDh
> 
> We might also have an unencrypted mojoid, which would be 20 bytes, like this:
> 
> http://localhost:4004/id/1b17864eeb6c68294c9b2db0324a2b773401f0da
> 
> or
> 
> http://localhost:4004/id/GxeGTutsaClMmy2wMkordzQB8No

Sure, knock yourselves out. My only request would be that 
you document how they are created somewhere (besides the 
code itself) and freeze the definition at some point. :)

- Gojomo






