[p2p-hackers] Bitzi (was Various identifier choices)

Brandon Wiley cyb at azrael.dyn.cheapnet.net
Fri Sep 7 02:06:02 UTC 2001


> The dump will grow quite large; how often would you expect to 
> schedule full-fetches (or delta-fetches)?

It would be best if Bitzi implemented the searching API directly so that
clients could talk to Bitzi without having to download the entire dump.
If not, then it would probably be best to have a centralized service
which occasionally fetches a dump from Bitzi and implements the searching
API itself, freeing normal nodes from having to fetch anything massive.
So if I end up implementing a Bitzi searching service, fetches will be
scheduled whenever it is convenient for Bitzi.

> Do you have any example RDF dumps which would demonstrate the
> fields and format conventions you'd find most useful? 

Yes I do. My search engine can be configured to handle any schema. However,
I've been using Dublin Core because it's a standard schema for describing
files, and generally people want to search for files. I've attached
an example database. It doesn't use all of the Dublin Core fields, just
the ones that I felt like filling in. On a side note, I replaced the DC
schema one day with one I made up, turned my search engine into a
personal contact information database, and let my friends add
themselves. So it's not limited to file searching.

> (We won't invent if there's already good precedents to mimic,
> and we could crank out an initial dump in very short order if
> it'd help give you something better to demo at O'R-P2P.)

That would be great! I could give a great demo with a fat database. If you
decide to include fields that aren't in Dublin Core, then just give me a
list of the names of the fields and I'll configure it to use that schema
instead.

> What is your dominant search model? Free text across all credible 
> metadata? Field-specific with things like scalar value comparisons 
> (e.g. "128 <= bitrate <= 196")? Both?

Currently the API only supports substring matches on a field-by-field
basis for a set of fields defined by a particular schema. So if you're
using DC, for instance, you can search for "ala" in the "Creator" field
and "Wo" in the "Title" field, things like that. I'd like to add more
complex searching to the API, but I think that some discussion needs to
occur regarding a good API for searching metadata before the API can be
extended past its most basic and obvious initial form.
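
To make the current search model concrete, here is a rough sketch of
field-by-field substring matching against a configurable schema. It is
not the actual API: the function name search(), the record layout (plain
dicts), the sample records, and the choice of case-insensitive matching
are all assumptions made for illustration. Only the Dublin Core element
names are taken from the standard.

```python
# Hypothetical sketch, not the real search API.
# The schema is just a tuple of field names; these are the 15 Dublin Core elements.
DUBLIN_CORE_FIELDS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def search(records, schema, query):
    """Return the records in which every queried field contains its substring.

    records: list of dicts mapping field name -> string value
    schema:  iterable of field names the engine is configured for
    query:   dict mapping field name -> substring to look for
    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    unknown = set(query) - set(schema)
    if unknown:
        raise ValueError(f"fields not in schema: {sorted(unknown)}")
    return [
        rec for rec in records
        if all(
            needle.lower() in rec.get(field, "").lower()
            for field, needle in query.items()
        )
    ]


# Made-up sample records; fields a record does not fill in are simply absent.
records = [
    {"Title": "Working Title", "Creator": "Palador Records"},
    {"Title": "Another Song", "Creator": "Galaxy Band"},
    {"Title": "World Tour", "Creator": "Someone Else"},
]

# "ala" in the Creator field and "Wo" in the Title field.
matches = search(records, DUBLIN_CORE_FIELDS, {"Creator": "ala", "Title": "Wo"})
```

Switching to a different schema (a contact database, say) would only mean
passing a different tuple of field names and records that use them.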

> And that's exactly the role we'd like to play -- being the 
> steward for cataloguing tasks which are easiest to do with a 
> shared, central reference point, while letting the metadata
> itself travel whatever chaotic paths make the most sense to
> system developers and users.

Whee! This sounds like fun.




