[linux-elitists] Re: SCO Teleconference transcript (Friday, May 30, 2003)

Brian McGroarty brian@mcgroarty.net
Wed Jun 4 05:42:47 PDT 2003

I've been thinking about this.

One could run C source through the C preprocessor and reduce the
output to a stream of C tokens, replacing every specific symbol
(identifier) with a single generic symbol and discarding everything
else. With the resulting streams in hand, it should be possible to
compare large codebases and quickly find expanses of 'common code.'
This would work even when the source formatting and naming
conventions are completely different.
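
A rough sketch of that idea in Python, for illustration only. It
assumes the input has already been through the preprocessor (e.g.
the output of cc -E), so comments and macros are gone. The names
normalize and shared_runs, the placeholder tokens ID/NUM/STR/CHR, the
shortened keyword list, and the choice to fold literals into generic
symbols too are all my own assumptions, not anything an existing tool
does.

```python
import re

# Deliberately incomplete keyword list (assumption); a real tool would
# carry the full set of C keywords.
C_KEYWORDS = {
    "if", "else", "for", "while", "do", "return", "break", "continue",
    "switch", "case", "default", "goto", "sizeof", "struct", "union",
    "enum", "typedef", "static", "extern", "const", "void", "char",
    "short", "int", "long", "float", "double", "signed", "unsigned",
}

# Crude tokenizer: words, numbers, string and char literals, then any
# single non-blank character. Multi-character operators such as "++"
# come out as two tokens, which is harmless as long as both sides of
# the comparison are tokenized the same way.
TOKEN_RE = re.compile(r"""
    [A-Za-z_]\w*            # identifiers and keywords
  | \d\w*                   # numeric literals (roughly)
  | "(?:\\.|[^"\\])*"       # string literals
  | '(?:\\.|[^'\\])*'       # character literals
  | \S                      # any other single character
""", re.VERBOSE)


def normalize(source):
    """Turn preprocessed C text into a name-independent token list."""
    tokens = []
    for line in source.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip linemarkers left behind by the preprocessor
        for tok in TOKEN_RE.findall(line):
            first = tok[0]
            if first.isalpha() or first == "_":
                # Keywords are kept; every other name becomes "ID".
                tokens.append(tok if tok in C_KEYWORDS else "ID")
            elif first.isdigit():
                tokens.append("NUM")
            elif first == '"':
                tokens.append("STR")
            elif first == "'":
                tokens.append("CHR")
            else:
                tokens.append(tok)
    return tokens


def shared_runs(a, b, n=8):
    """Return (i, j) pairs where a[i:i+n] and b[j:j+n] are identical."""
    index = {}
    for i in range(len(a) - n + 1):
        index.setdefault(tuple(a[i:i + n]), []).append(i)
    hits = []
    for j in range(len(b) - n + 1):
        for i in index.get(tuple(b[j:j + n]), []):
            hits.append((i, j))
    return hits
```

Feed normalize() the preprocessed text of each codebase and pass the
two token lists to shared_runs(). The window size n is a free
parameter; the smaller it is, the more unrelated code will "match."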

If anyone's got their hands on SCO code, I bet we could "prove" it's
been derived from nethack, emacs -and- vi, or something similarly

Damning chunks from five to ten or fifteen lines, they say?

