[linux-elitists] Tuesday 29 March 2005 NYLUG Meeting: Craig Nevill-Manning of Google on Finding Needles in a 20 Terabyte Haystack

jays@panix.com jays@panix.com
Sun Mar 27 23:20:21 PST 2005


Unofficial Paragraphs from Jay Sulzberger:

Note that you may not be able to get into this meeting, due to obscure
difficulties.  See second announcement below for partial explanation.

We the Free Software Movement have built GNU, and the free BSDs, and we've
got the Linux kernel, started by a student in Finland at the beginning of
the Nineties of the last century.  We've got Perl and Apache and Mozilla,
SCM, CMUCL, CLISP, PHP, Haskell, Python, KDE, X, ssh, and much much more.
And we built them all.

The Free Software Tribes of New York should be able to find a large enough
meeting place for NYLUG and twenty other organizations, sodalities,
political arms, clubs, marching societies, and so on and on and on.  If
necessary, let us build our own place.

Though the formal meeting may be truncated, there will be neither let nor
hindrance to the eating and drinking afterwards.  Here is a quote from the
official NYLUG announcement:

  Stammtisch
       After the meeting ... Join us around 8:30pm or so at TGI Friday's,
       located at 677 Lexington Avenue and 56th Street, second floor.
       Northeast corner.

Jay Sulzberger <secretary@lxny.org>
Corresponding Secretary LXNY
LXNY is New York's Free Computing Organization.
http://www.lxny.org


Postscript on texts below: Below are copies of two different notices, taken
from the nylug-announce list.  Each is bracketed with its own
angle-bracketed quote marks, XML style.


<blockquote
  what="first official NYLUG announcement of 29 March 2005 meeting">

   From: John Bacall <john@unixen.org>
   To: NYLUG Announcements <nylug-announce@nylug.org>
   Date: Wed, 16 Mar 2005 09:38:26 -0500 (EST)

   March 29th, 2005
   Tuesday
   6:30PM-8:00PM
   IBM Headquarters Building
   590 Madison Avenue at 57th Street
   12th Floor, home to the IBM Linux Center of Competency

   ** RSVP Instructions **
       NEW POLICY: You must R.S.V.P. for *EVERY* meeting.
       Register at http://rsvp.nylug.org/
       Check in with photo ID at the lobby for badge and room number.


                           Craig Nevill-Manning (Google)
                                        -on-
                     Finding Needles in a 20 Terabyte Haystack


      Due to scheduling and venue problems, this month's meeting will be on
      Tuesday, 29 March. Please mark your calendars. If you can help with
      a modern (projector, connectivity), large, regular space, we would
      like to hear from you.

      What to think when a company's name becomes a verb, when through word
      of mouth and no paid advertising it becomes commonplace? We are no
      doubt witnessing something special, a rare bird. We are speaking of
      Google, Inc., of course, the preeminent global entity in Net search.

      On Tuesday, March 29, Craig Nevill-Manning of Google will give a
      presentation to the New York Linux Users Group entitled "Finding
      Needles in a 20 Terabyte Haystack: 200 million times per day."

      In Craig's own words: ``Google faces two large technical challenges:
      Ensuring that our search results are as relevant as possible, and
      serving hundreds of millions of queries in a fraction of a second each
      at a reasonable cost. To solve the first problem we perform an offline
      matrix computation to produce PageRank, a query independent measure of
      page reputation, and combine it with more traditional query-specific
      scoring. To solve the distributed computing problem, we use tens of
      thousands of commodity PCs and highly fault-tolerant software. I will
      discuss some details of these solutions, and also share some interesting
      statistical tidbits about search and the web.''
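
The "offline matrix computation" mentioned in the abstract can be
illustrated with a minimal power-iteration sketch. This is a toy under
stated assumptions, not Google's implementation: the damping value 0.85,
the three-page link graph, and the combined_score helper (a hypothetical
linear blend of a query-specific score with the query-independent rank)
are all illustrative choices that do not come from the announcement.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration over a dict {page: [pages it links to]}.

    Assumption: damping=0.85 and a fixed iteration count are illustrative
    defaults, not values taken from the talk.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


def combined_score(query_score, page_rank, weight=0.5):
    """Hypothetical blend of a query-specific score with PageRank."""
    return weight * query_score + (1.0 - weight) * page_rank


# Toy link graph: a links to b and c, b links to c, c links back to a.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_web)
```

The ranks are computed once, ahead of time, and do not depend on any
query; only the blend in combined_score would run at query time.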

      Google has taken an unorthodox approach to its mission, and it has paid
      off handsomely. To excerpt a passage from a developerpipeline.com
      article:

        To search the [Google] index quickly, Google breaks it "into pieces
        called shards," scattered across servers so they may be searched in
        parallel, each server coming up with part of the answer to a question
        and feeding it back for aggregated results.

        Google's file system, indexing technology, and grid of commodity
        servers allow it to achieve search times of a quarter of a second on
        a typical query. The replication and constant heartbeat messaging
        built into the file system gives it high reliability and
        availability, he noted.

        In addition, as Google servers parse queries, they break them down
        into smaller tasks and make one trip to the database for a result
        that may satisfy many users. The process is called "map reduction."
        Hoelzle said Google once "lost 1,800 of 2,000 map-reduction machines
        in a large-scale maintenance incident." Because of the load balancing
        built into the system, Google still completed all queries by steering
        uncompleted tasks to the machines that showed they had processing
        power.
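
The shard idea in the excerpt above (split the index, search the pieces
in parallel, aggregate the partial answers) can be sketched as follows.
This is only an illustration under stated assumptions: the modulo
sharding rule, the term-count score, and the names build_shards,
search_shard, and search are hypothetical, and threads stand in for the
separate servers the article describes.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def build_shards(docs, num_shards):
    """Split {doc_id: text} into num_shards pieces (assumes integer doc ids)."""
    shards = [{} for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[doc_id % num_shards][doc_id] = text.lower().split()
    return shards


def search_shard(shard, term, k):
    """Return up to k (score, doc_id) hits from one shard, best first.

    The score is just the number of times the term occurs (a toy measure).
    """
    hits = ((words.count(term), doc_id) for doc_id, words in shard.items())
    return heapq.nlargest(k, (hit for hit in hits if hit[0] > 0))


def search(shards, term, k=10):
    """Query every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, term, k), shards)
        return heapq.nlargest(k, (hit for part in partials for hit in part))


docs = {
    0: "linux kernel linux",
    1: "google search",
    2: "linux search engine",
    3: "free software",
}
shards = build_shards(docs, num_shards=2)
top = search(shards, "linux", k=2)
```

Each shard returns only its own top k hits, so the merge step handles at
most k results per shard no matter how large the shards are.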

      This will be a highly attended meeting, and space is limited.

   For More Information Visit:

        * developerpipeline.com article
           http://developerpipeline.com/showArticle.jhtml?articleId=60404907
        * Interesting projects coming out of Google Labs
           http://labs.google.com/
        * A paper on the Google File System
           http://www.cs.rochester.edu/sosp2003/papers/p125-ghemawat.pdf
        * A paper on the Google MapReduce system
           http://labs.google.com/papers/mapreduce.html

   About Craig Nevill-Manning:

      Dr. Craig Nevill-Manning is a Senior Staff Research Scientist and New
      York Engineering Director at Google. While at Google, he has led the
      development team for Froogle, a product search engine. Prior to his four
      years at Google, Dr. Nevill-Manning was an assistant professor in the
      Computer Science Department at Rutgers University and a postdoctoral
      fellow at Stanford University.

   Swag (Give Away) - During the meeting... unusually terrific swag of
      non-predetermined origin will be given out to all attendees at the
      regular meeting for free as usual.

   Stammtisch
       After the meeting ... Join us around 8:30pm or so at TGI Friday's,
       located at 677 Lexington Avenue and 56th Street, second floor.
       Northeast corner.

   Please see our home page at http://www.nylug.org for the HTMLized
   version of this announcement, our archives, and a lot of other good
   stuff.

   Monthly Reminder!
       Please read the NYLUG-Talk Posting Guidelines at:
       http://www.nylug.org/mlistguide/

   ________________________________________________________________________
   March 2005 - The New York Linux Users Group, NYLUG.org
   ______________________________________________________________________
   Hire expert Linux talent by posting jobs here :: http://jobs.nylug.org
   nylug-announce mailing list
   nylug-announce@nylug.org
   http://nylug.org/mailman/listinfo/nylug-announce

</blockquote>


<blockquote
  what="official NYLUG announcement of difficulties in getting into 29 March 2005 meeting">

   Date: Thu, 24 Mar 2005 10:46:37 -0500
   From: Ron Guerin <ron@vnetworx.net>
   To: nylug-talk@nylug.org, nylug-announce@nylug.org

   Reminder:

   This month's meeting is on Tuesday, March 29 at 6:30pm.
   Registration is no longer possible for this meeting as it is
   filled to capacity.

   If you're already confirmed for the meeting, we'll see you
   there on Tuesday.  If you're already confirmed for the meeting
   but can no longer attend, please send me a note so I can make
   your spot available to someone else.  Thanks.

   - Ron
      
</blockquote>


Distributed poC TINC:

Jay Sulzberger <secretary@lxny.org>
Corresponding Secretary LXNY
LXNY is New York's Free Computing Organization.
http://www.lxny.org


