[p2p-hackers] SHA1 broken?

Gordon Mohr ( at Bitzi) gojomo at bitzi.com
Thu Feb 17 18:23:51 UTC 2005


Serguei Osokine wrote:
> On Wednesday, February 16, 2005 Gordon Mohr wrote:
> 
>>Dan Kaminsky runs over a number of potential attacks that
>>are relevant to P2P -- see:
>>
>>  http://paketto.doxpara.com
>>...
>>Here's another example from the cryptography list that convinced
>>a doubter...
> 
> 
> 	Certainly looks cute. Now correct me if I'm not getting something
> here - but isn't it true that in order to mount an attack one has to
> replace the "good" code (content, whatever) by the "bad" code, and the
> absolutely necessary condition is that the "good" code also has to be
> created by an attacker? So an attacker creates "good" code, gives it
> to security experts for verification, and then after they are done,
> replaces it with "bad code", right? 

Yes.

> 	Isn't it a bit far-fetched? Do we have a somewhat more realistic
> attack scenario? I just cannot imagine all this happening in real 
> life. Real-life breakdowns always tend to be way simpler than their
> theoretical scenarios (and totally unexpected, too).

It's possible. It's not that hard. It would offer rewards to an attacker
that are different and possibly larger than those offered by the simple
tricks that reel in easy marks.

So it doesn't seem that far-fetched to me.

>>But are there more simple ways to trick conscientious, hash-checking
>>users into running malware?
> 
> 
> 	Users typically don't give a damn about hash-checking; they
> expect the system to do that for them. And a few users that do give
> a damn typically can defend themselves from pretty much anything no
> matter what you throw at them. So the fate of this "expert" group 
> (consisting of about ten people for any given P2P system) does not
> really worry me, whereas for the rest of the user population there 
> are plenty of ways to trick them into running the malware - *all* 
> the current ways of doing  so are simpler than fiddling with hashes. 

If your attack is just to get someone, somewhere to run your malware,
sure. But the average/mass user is not the only interesting case.

If you want to get onto other, higher-valued machines, you have to get
around the real practices of many users, of various sophistication,
who do care about hashes of received content matching expected values.

For such people, to get them to settle for MD5, you either convince
them not to worry about the potential attack -- making them potential
victims -- or you lose them as users, because they realize that the
hashes used for content-identification on your network do not offer
the guarantee they seek.
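
To make concrete what "hashes of received content matching expected
values" means in practice, here is a minimal sketch in Python. The names
digest_of_file and matches_expected are invented for illustration, and
SHA-1 as the default is only an assumption, not a recommendation:

```python
import hashlib
import hmac

def digest_of_file(path, algorithm="sha1", chunk_size=64 * 1024):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_expected(path, expected_hex, algorithm="sha1"):
    """True only if the file's digest equals the expected hex value."""
    actual = digest_of_file(path, algorithm)
    # Hex digests are lowercase; normalize the expected value before comparing.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A check like this is only as strong as the hash's collision resistance:
with algorithm="md5", two files prepared together by the same author can
both pass against the same expected value.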

That's not a good result. I want P2P+CDN that delivers content that I
and other sophisticated users can trust, and I want the unsophisticated
users on the same network, too:  I gain from their presence as peers/
seeds, and they can gain from my insistence on rigorous content
identification.

> 	Which brings me back to my question above: do we have a
> realistic scenario where a network like Gnutella would be harmed by
> using MD5?

Having installers like the fire.exe/ice.exe described by Kaminsky,
which have the same MD5 but install different software, could
quickly undermine confidence in an MD5-only P2P network for most kinds
of content delivery. Telling average users (or businesses considering
P2P delivery),  "but that's only when the attacker gets to create both
files", is noise to them.

(And for pro users, telling them that they have to trust the original
creator of the file not to have created twins is tantamount to requiring
the content to be separately digitally signed to prove origination --
an additional step rendering the plain standalone MD5 for content
identification superfluous.)

> (Not that I give a damn about MD5, and no one in Gnutella probably 
> uses it anyway; my interest is largely theoretical here, and the same
> issues might be relevant for the other hashes, too.)
> 
> 
>>And since when did the ease of other attacks become an excuse
>>for ignoring more complicated and subtle (and thus perhaps
>>more valuable) attacks?
> 
> 
> 	Why, every time you do not have infinite development resources,
> of course. You always have to juggle priorities, and subtle attacks 
> typically are not anywhere close to the head of the development 
> priority list for P2P networks...

Of course work has to be prioritized in context. But the priority list
is not a single-file line, where a few frontmost entries prevent
consideration of everything else.

In particular, I would guess the "head of the development priority list"
for most commercial P2P networks is dominated by user satisfaction
issues. But these are only remedied incrementally, with research and
trial and error. The risk of delay is incremental competitive decay,
and the work is never really "done".

At the same time, developers can be addressing other specific flaws --
failures of the software and chosen algorithms to deliver the
functionality intended. Such flaws can't be ignored forever. They
may be easy to fix with a discrete amount of effort. And since
transitioning hash functions requires lead time, the groundwork
should be laid before any change is urgent.
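
One way to lay that groundwork is to tag every content identifier with
the name of its algorithm, so a network can publish several digests side
by side and retire a weak one without a flag day. The sketch below is
only an illustration under stated assumptions: the ACCEPTED list, the
"algorithm:hexdigest" format, and the function names are hypothetical,
not any existing network's scheme.

```python
import hashlib

# Assumption for illustration: the algorithms this client still accepts,
# strongest first. "md5" is deliberately absent.
ACCEPTED = ("sha256", "sha1")

def make_identifiers(data, algorithms=ACCEPTED):
    """Publish one 'algorithm:hexdigest' identifier per algorithm."""
    return [f"{name}:{hashlib.new(name, data).hexdigest()}"
            for name in algorithms]

def verify(data, identifiers, accepted=ACCEPTED):
    """Check data against the strongest accepted identifier on offer.

    Returns False if no offered identifier uses an accepted algorithm.
    """
    offered = dict(i.split(":", 1) for i in identifiers)
    for name in accepted:
        if name in offered:
            return hashlib.new(name, data).hexdigest() == offered[name].lower()
    return False
```

Dropping an algorithm then means removing one entry from ACCEPTED;
content that was published with only the retired hash simply fails to
verify rather than being trusted silently.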

>>Or better yet: design with the idea in mind that no hash function
>>lives forever.
> 
> 
> 	Sure; but that's orthogonal:
> 
> 
>>If you're stuck with a legacy hash, fine, analyze the situation
>>and if you're confident the weakness has no effect on current
>>usage, rationalize using it a while longer.
> 
> 
> 	My point exactly. The issue is whether one should consider the
> deployed legacy codebase insecure after every new discovery is made 
> in the hash collision research or not. My personal approach would be 
> to disregard the possible collision issues until there is a problem
> serious enough to be noticed by CNN. (So far I still cannot see any
> *realistic* attack scenario; maybe your next letter will convince me
> that I'm wrong :-) But everyone has a personal "worry threshold",
> I guess. Mine is pretty low...

I suppose it depends on how high your ambitions for P2P are. Clearly,
you can have a very popular network with a very weak hash for quite
a while -- witness ED2K, using MD4, a hash "broken" for over a decade.

But over time, users have become more aware of the importance of
hash-based content-verification, and users have generally migrated in
the direction of more-rigorous hash-using networks -- though not to
the *most* rigorous networks.

If P2P is just a leisure-time lark for credulous, casual users who have
many other unhygienic computing practices, then you can be lackadaisical
in your use of hash algorithms. If you want it to also be a platform
stable for long-term use by more discriminating users and commercial
endeavors, you should take the strength of your hashes seriously. If
you wait until someone is hurt enough that the damage is reported on
CNN, that's too long.

- Gordon @ Bitzi

> 	Best wishes -
> 	S.Osokine.
> 	16 Feb 2005.
> 
> 
> -----Original Message-----
> From: p2p-hackers-bounces at zgp.org [mailto:p2p-hackers-bounces at zgp.org]On
> Behalf Of Gordon Mohr (@ Bitzi)
> Sent: Wednesday, February 16, 2005 8:12 PM
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] SHA1 broken?
> 
> 
> Serguei Osokine wrote:
> 
>>On Wednesday, February 16, 2005 Gordon Mohr wrote:
>>
>>
>>>MD5 should not be used for content identification, given the 
>>>ability to create content pairs with the same MD5, with one 
>>>version being (and appearing and acquiring a reputation for 
>>>being) innocuous, and the other version malicious.
>>
>>
>>	Right. So let's go and try to find something with the same
>>MD5 as this letter of mine, shall we? :-)
> 
> 
> I can't -- but you could have made a collision, very easily, if
> you composed your initial message with the intent of also composing
> an MD5 twin at the same time.
> 
> That means for content identification MD5 is fatally flawed. For
> any file whose contents I think I know and trust, perhaps based
> on analysis and history of the file, there could be another
> dangerous file with the same MD5. MD5 cannot be used to distinguish
> between the two, but that's the whole point of using a secure
> hash for content identification.
> 
> Dan Kaminsky runs over a number of potential attacks that
> are relevant to P2P -- see:
> 
>    http://paketto.doxpara.com
> 
> Don't be fooled by the title of his analysis, "MD5 To Be Considered
> Harmful Someday" -- the attacks mentioned are possible now, and
> could trick people and software in subtle ways different from
> other threats to P2P nets.
> 
> Here's another example from the cryptography list that convinced
> a  doubter that the attacks on MD5 were of more than purely
> theoretical interest: two long binary strings, one a prime number,
> one not:
> 
>    http://lists.virus.org/cryptography-0412/msg00102.html
> 
> Consider source code or executables which work fine with the
> primes, s-boxes, and other initialization vectors initially
> examined -- but have exploitable flaws when those values are
> perturbed in a manner that leaves the MD5 the same. You need
> to use a different, stronger content check to prevent such
> mischief -- making the use of MD5 redundant and even dangerous
> for the false sense of security it gives.
> 
> 
>>	For any practical purpose that I can imagine in a content
>>identification field, MD5 is just fine. And SHA-1 is even more
>>fine. 
> 
> 
> If you can't imagine exploits, perhaps it's just a failure of
> your imagination. Prudent engineering would assume some attackers
> have better imaginations than you, when it comes to exploiting
> hashes that don't work as originally intended.
> 
> 
>>There are plenty more simple ways to attack the CDN nets
>>than MD5 collisions. Way more simple. And abandoning MD5 for
>>SHA1, then SHA1 for Tiger, and then abandoning Tiger for some
>>newer hash when some researcher finds that it is really twenty
>>bits weaker than you thought - it is all just a huge waste of
>>development effort, as far as I'm concerned.
> 
> 
> Depends on the kinds of attacks you're worried about. There
> are more simple ways to disrupt P2P nets, sure. But are there
> more simple ways to trick conscientious, hash-checking users
> into running malware?
> 
> And since when did the ease of other attacks become an excuse
> for ignoring more complicated and subtle (and thus perhaps
> more valuable) attacks?
> 
> If you need a secure hash's properties in your software, you
> should use an uncompromised secure hash. (Results as early as
> 1996 suggested MD5 should not be used in applications where
> collision-resistance is important.)
> 
> If you're stuck with a legacy hash, fine, analyze the situation
> and if you're confident the weakness has no effect on current
> usage, rationalize using it a while longer. But get ready for
> the potential need to switch hashes quickly in the presence of
> further discoveries. Or better yet: design with the idea in mind
> that no hash function lives forever.
> 
> - Gordon @ Bitzi
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers at zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> 




