[p2p-hackers] substring matching with Bloom filters

gbildson at limepeer.com gbildson at limepeer.com
Mon Jul 4 19:31:01 UTC 2005


I would look at these two papers for insight.

http://www.limewire.com/developer/query_routing/keyword%20routing.htm

http://aeolusres.homestead.com/files/index.html

We don't actually use the full depth routing mentioned in these papers but we
did experiment with it.  We use a simplified depth for the last two hops.

Thanks
-greg

Quoting gbildson at limepeer.com:

> We route queries on the two last hops based on a bloom filter bitvector
> representing keywords in this way.  For routing purposes, it is a true or a
> false so extra trues don't matter.  We avoid encoding such short words in
> shorter forms so "h" would not be in the encoding.
>
> I'm not entirely clear on what you are replicating so perhaps that has some
> significance.
>
> Thanks
> -greg
> Quoting Hailong Cai <hcai at cse.unl.edu>:
> >
> > I'm interested in more details. Do you know where I can find some documents
> > on this?  By "length 1", do you mean that "h" can match "how".  If so, it
> > seems we have too many matches for every query.
> >
> > Thanks
> > hailong
> >
> > -----Original Message-----
> > From: gbildson at limepeer.com [mailto:gbildson at limepeer.com]
> > Sent: Monday, July 04, 2005 2:04 PM
> > To: Peer-to-peer development.; Hailong Cai
> > Cc: 'Peer-to-peer development.'
> > Subject: Re: [p2p-hackers] substring matching with Bloom filters
> >
> > Gnutella encodes length, length-1, length-2 and I believe length-3 to catch
> > the
> > standard suffixes and other common shortenings.  Degenerate words and cases
> > are
> > avoided.  Not optimal but better than nothing.
> >
> > Thanks
> > -greg
> >
> > Quoting Hailong Cai <hcai at cse.unl.edu>:
> >
> > > Hi there,
> > >
> > > I know that some P2P systems as well as research prototypes use Bloom
> > > filters as content replications.  However, using Bloom filters does not
> > > support substring matching such as "how" matches "however", and wildcard
> > > matching.  Is there any solution for this already?
> > >
> > > Thanks
> > >
> > > Hailong
> > >
> > >
> > >
> > > _______________________________________________
> > > p2p-hackers mailing list
> > > p2p-hackers at zgp.org
> > > http://zgp.org/mailman/listinfo/p2p-hackers
> > > _______________________________________________
> > > Here is a web page listing P2P Conferences:
> > > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> > >
> >
> >
> >
>
>
>
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers at zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>






More information about the P2p-hackers mailing list