[p2p-hackers] substring matching with Bloom filters

gbildson at limepeer.com gbildson at limepeer.com
Mon Jul 4 19:25:43 UTC 2005


We route queries on the two last hops based on a bloom filter bitvector
representing keywords in this way.  For routing purposes, it is a true or a
false so extra trues don't matter.  We avoid encoding such short words in
shorter forms so "h" would not be in the encoding.

I'm not entirely clear on what you are replicating so perhaps that has some
significance.

Thanks
-greg
Quoting Hailong Cai <hcai at cse.unl.edu>:
>
> I'm interested in more details. Do you know where I can find some documents
> on this?  By "length 1", do you mean that "h" can match "how".  If so, it
> seems we have too many matches for every query.
>
> Thanks
> hailong
>
> -----Original Message-----
> From: gbildson at limepeer.com [mailto:gbildson at limepeer.com]
> Sent: Monday, July 04, 2005 2:04 PM
> To: Peer-to-peer development.; Hailong Cai
> Cc: 'Peer-to-peer development.'
> Subject: Re: [p2p-hackers] substring matching with Bloom filters
>
> Gnutella encodes length, length-1, length-2 and I believe length-3 to catch
> the
> standard suffixes and other common shortenings.  Degenerate words and cases
> are
> avoided.  Not optimal but better than nothing.
>
> Thanks
> -greg
>
> Quoting Hailong Cai <hcai at cse.unl.edu>:
>
> > Hi there,
> >
> > I know that some P2P systems as well as research prototypes use Bloom
> > filters as content replications.  However, using Bloom filters does not
> > support substring matching such as "how" matches "however", and wildcard
> > matching.  Is there any solution for this already?
> >
> > Thanks
> >
> > Hailong
> >
> >
> >
> > _______________________________________________
> > p2p-hackers mailing list
> > p2p-hackers at zgp.org
> > http://zgp.org/mailman/listinfo/p2p-hackers
> > _______________________________________________
> > Here is a web page listing P2P Conferences:
> > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> >
>
>
>






More information about the P2p-hackers mailing list