NOTE: this "archaeological artifact" paper is a true time-capsule from 1995-1996. The very early days of the web. I recently found it among some old backups and dusted it off. It won the "best paper award" in an internal Silicon Graphics Worldwide conference back in 1996. Unfortunately, I had to disable most of the links because they used to point at various locations on the old Silicon Graphics intranet. Still, it remains a fun read today -- ariel.
Lessons I learned from Sniff
A case study of the human factors
in software development at SGI
By: Ariel Faigon
[ Abstract | History | Evolution | Future | Summary | References ]
Sniff is the SGI internal Web search engine. It is one of the most popular internal web pages within SGI. This paper describes Sniff as a case study: a bit of history, what I learned from the project, and some future plans.
This is not a technical paper. Like Tom DeMarco and Timothy Lister (Peopleware) and, more recently, Steve Maguire (Debugging the Development Process), I believe that the human factors of engineering are grossly understated in the literature, so this is my humble attempt at improving the situation.
Sniff is a newcomer to the SGI Web. As of Dec 1995, it is less than 6 months old. I'm pretty much a newbie as well: I've been with SGI a little over six months.
Sniff was born out of the need to better share information among the engineering community. As it turned out, and true to the SGI spirit, it expanded into something much more useful and encompassing. When I came to SGI, and after talking to many people, I was looking for a really quick way to index the full internal web. The idea was to employ the 90%-10% rule to achieve 90% of the results with 10% of the effort: take something that already works from the net and deploy it. I looked at many of the search engines available on the net and zeroed in on the Lycos beta that was available at that time (since Lycos became a commercial enterprise, the source is no longer available). The main reasons for choosing Lycos were scalability, speed, and a promising scoring function (note that Alta Vista, Excite, HotBot and other search engines didn't exist yet).
At about the same time, Silicon Junction (version 1) was looking for a search engine. Brian McClendon pointed them to me, and Sameer Singh and I began to work together. I was pleasantly surprised by how easy it is to create virtual teams that span divisional boundaries at SGI, and how supportive management is of such cooperation. That was my first Sniff lesson, and it was a very positive one. It quickly became clear that Sniff was going to transcend the engineering community and become an SGI-wide resource used by everyone: manufacturing, marketing, administration, sales, etc.
A little while before Sniff was operational, Terry Weissman (who is now with Netscape) was voluntarily running a robot on campus. I managed to catch him for a short interview a day before he left. His robot was dying mysteriously and had some scalability problems. I learned that there were about 25,000 web pages at SGI at that point. Terry's prototype indexed the titles only rather than the full text, was limited to HTML docs, and its scoring was inferior to that of Lycos. Despite all these deficiencies, that search engine seemed quite useful to me: as a newcomer to SGI, I found many of the things I was looking for with it. It also taught me my second Sniff lesson: at SGI everything depends on individuals' initiative. Time to take responsibility.
Sameer did most of the CGI programming for Sniff, and came up with many ideas that helped later generations of Sniff get better. The first Sniff version was announced on the Silicon Junction home page on June 20, 1995. It was almost a bare Lycos with a few bug fixes, plus Sameer's URL administration (search a URL, add a URL) add-ons.
At this point Sniff wasn't too popular. Based on early feedback, its main flaws were:
- Ugly output, unneeded details clutter the screen real estate
- Too much junk was indexed, can't see the trees for the forest
- Scalability struck again: EFS file systems were filling up because EFS didn't support the holes that dbm leaves in its files; also, hashing of integer keys performed abysmally
It was time to sit down and write a TODO list.
During subsequent months, I was working on other stuff, but from time to time I felt the itch to go back to Sniff and try to improve it. The best improvements came when something really bothered me and I spent a late night or a weekend working on it. During that time I basically rewrote the whole Sniff robot, so it now contains almost no original Lycos code, unlike the indexer and search engine, which were still Lycos based at the time of this writing. [Ed: this has changed; as of June 1996, there's no Lycos code in the indexer and the search engine either.] The main improvements were:
- Get it to work on EFS, without filling the disk too quickly
- Filter a lot of junk files
- A big speed improvement
- Many bug fixes, mainly in converting URL references to absolute form
- More robust detection of unindexable files
- Improving maintainability: base parts of the code on the libwww-perl (0.40) library
- Adding functionality:
  - Save each URL's metadata (last modified date, size, type, etc.)
  - Keep track of all the URL references as a basis for the later bad links report.
  - Ability to do incremental exploration based on the If-Modified-Since capability and other user selectable criteria.
  - Provide a statistical view of the growing SGI Web.
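To make two of the robot tasks above concrete (converting URL references to absolute form, and incremental exploration based on If-Modified-Since), here is a minimal sketch in Python. It is an illustration only, not Sniff's code (the Sniff robot was written in Perl): the function names and the example.com URLs are hypothetical, and the sketch only builds the request rather than fetching anything.

```python
from email.utils import formatdate
from urllib.parse import urldefrag, urljoin
from urllib.request import Request


def absolutize(base_url, hrefs):
    """Resolve links found on a page against that page's own URL.

    Fragments (#anchors) are dropped and duplicates are skipped, so the
    robot would queue each target document only once.
    """
    absolute = []
    for href in hrefs:
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url not in absolute:
            absolute.append(url)
    return absolute


def conditional_request(url, last_fetch_epoch=None):
    """Build a GET request that lets the server skip unchanged documents.

    last_fetch_epoch is the (hypothetical) stored time of the previous
    visit, in seconds since the epoch; None means "never fetched".
    """
    request = Request(url)
    if last_fetch_epoch is not None:
        # HTTP dates are always expressed in GMT.
        request.add_header(
            "If-Modified-Since", formatdate(last_fetch_epoch, usegmt=True)
        )
    return request


def needs_reindex(http_status):
    """A 304 (Not Modified) reply means the stored copy is still current."""
    return http_status != 304
```

With something like this in place, an incremental run only re-indexes the documents for which the server answers with a full response; everything that comes back as 304 keeps its stored metadata.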
Throughout this incremental improvement process, Sniff taught me my third lesson.
My involvement with the Silicon Junction team has brought a lot of good things to Sniff. I was invited to the Silicon Junction redesign off-site, met with many other webmasters, and got excellent feedback from the SJ team: Mark Brown, Frank Dietrich, Sameer Singh and Russell Whitaker. It was then that Shiraaz Bhabha was brought in by Sameer, and her graphical talents made SJ the great web that it is today. She also gave Sniff its cute logo.
Sniff continued to improve. One night I hacked a CGI script to report bad links from/to your server; on another I cleaned up the output so it would look much nicer and show titles rather than URLs. I also ported the robot to perl5 and said goodbye to all the code that tried to reinvent common HTTP/Web tasks such as making a URL reference absolute, fetching a URL, etc. Such code was bound to break with the rapid changes in the Web standards, so it was prudent to rely on free (and rapidly evolving) code from the net. The Sniff robot went from using Lycos proprietary code, to using libwww-perl-0.40 (the perl4 based Web/HTTP library), to LWP5b6 (the most recent perl5 based Web/HTTP library). After all these small changes, the robot code was nothing like what I started with: it was much cleaner, easier to read, and more reliable. Around that time, Sniff was also endowed with a sniffity voice based on Anati's suggestion. Interestingly, almost all these changes were done in a kind of unplanned burst mode, after I felt an itch to fix or improve something. That was my 4th Sniff lesson.
As Sniff was improving, people started to use it more and I was getting more and more email with feedback. Some of it was difficult to take since it contained explicit & offensive language about how bad Sniff was :-) However, almost invariably, when I was not quick to react and took the time to think about the problem from an end-user point of view, I learned that those who bothered to take the time to write to me were right. They complained because Sniff didn't fulfill their needs. Of course, I had to write a FAQ about Sniff to save time in repeating answers, but more importantly, that's how I got my 5th Sniff lesson.
One interesting anecdote about Sniff was that suddenly, it became really easy to find stuff on the Web. I got a few requests from managers to block access to certain classified keywords (which I quickly obliged). I also had to write a guide about How to restrict access to the data on your Web server. Nevertheless, based on extensive experience with information-restrictive environments in my previous place of work, I'd like to point out the following: I'm 100% convinced that an open environment like that of SGI tends to make people much more productive and motivated than classified, multi-layered cultures that spread information on a "need-to-know" basis. Thus Sniff taught me a 6th lesson.
Looking back at the Sniff project, its popularity among users, and the relatively small amount of time I worked on it led me to an additional important lesson. I used to work for another company in the valley (name withheld to protect the innocent). When I tried to compare my overall productivity in my former place with my productivity at SGI, I found the improvement factor hard to believe. In about three months at SGI I managed to have an impact several times bigger (based on the usage of Sniff, on feedback I get from all over the company, and on personal gut feeling) than what I was able to achieve in the last two years with my previous employer. The striking fact was that here I was, the same guy, with the same experience, qualifications, and training, and just by moving across the street into a more supportive and inspiring environment, I felt like I was literally transformed into someone else.
Just comparing what I heard from my previous manager with what I was hearing while working at SGI was a lesson in itself: the most common question there was "what have you done", while the most common question here was "what do you need." Also, with my previous employer, I was spending most of my time in frustrating "evangelizing" activities, trying to get buy-in and support from others. At SGI the support and props were almost a given, so naturally I spent most of my time doing simple productive work. With this flashback, I was no longer wondering why the revenue per employee and yearly revenue growth at SGI are about 4 times higher than those of my previous employer. This is, in a nutshell, what two respected Stanford professors tried to convey in their book Built to Last, and what Tom DeMarco and Timothy Lister describe in the Peopleware chapter on their "Coding War Games" study: the effect of the team's standards and quality on the individual team member. I consider this to be the most important lesson I learned from the Sniff project.
The future of Sniff
Where will Sniff go from here?
It partially depends on your further feedback. But the most important changes I've been contemplating are:
- Making the index more flexible and general: allowing numerical queries and scrapping the 26x26x26 Lycos lookup table. Basically, this means a full rewrite of the indexer and search engine.
- Improving the scoring, both functionally (give bonus weight to keyword proximity) and algorithmically: avoid a full sorting of thousands of hits when only the top N are needed.
The first improvement will be based on a hash table or a PATRICIA tree, to allow arbitrary characters in words. The second will be based on either a priority queue (heap sort) or maybe even on a linear time binsort (also called bucketsort), relying on the fact that the scores can be "quantized" into a finite number of values rather than spread over the full range of a floating point number.
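The two top-N ideas can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the planned Sniff implementation: hits are assumed to be (score, url) pairs, and for the bucket variant the scores are assumed to be already quantized to integers in the range 0 to levels - 1. The function names are hypothetical.

```python
import heapq


def top_n_heap(hits, n):
    """Return the n best (score, url) hits without sorting the full list.

    heapq.nlargest keeps only n candidates at a time, so the cost is
    roughly O(M log n) for M hits instead of O(M log M) for a full sort.
    """
    return heapq.nlargest(n, hits, key=lambda hit: hit[0])


def top_n_buckets(hits, n, levels):
    """Return the n best hits using a bucket sort over quantized scores.

    Assumes every score is an integer in range(levels). One pass drops
    each hit into its bucket; a second pass walks the buckets from the
    highest score down and stops as soon as n hits are collected.
    """
    buckets = [[] for _ in range(levels)]
    for score, url in hits:
        buckets[score].append((score, url))
    best = []
    for bucket in reversed(buckets):
        for hit in bucket:
            if len(best) == n:
                return best
            best.append(hit)
    return best
```

The heap version works for any comparable score, including floating point. The bucket version runs in time linear in the number of hits plus the number of score levels, which only pays off when the number of levels is small compared with the number of hits.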
Another interesting development is that SGI is now in the process of complementing the WebFORCE offering with a more comprehensive, integrated solution that will include Web management and administration. Naturally, one of the components of such a solution is a site robot and a search engine. I hope to be able to convey what I learned to the team that is working on that project, and that SGI will come out with a great organizational Web search product that will help our Web related sales grow even further.
[June 1996: Good happens: Check out the new Sniff. Ed.]
Summary: Lessons I learned while working on Sniff
- Lesson #1: Get out of your cubicle, meet and talk with other people. People who share a passion work much more productively together than those just assigned to a task.
- Lesson #2: Just do it! Volunteer, don't wait for others to do it. If you don't like something - change it. The spirit of "just do it" or "if you don't like it, no one is stopping you from improving it" flies high at SGI. You are empowered to do what you want. Improvement depends on you.
- Lesson #3: This one is the obvious one: "Plan to throw one away - you will anyhow." But then, even Fred Brooks (The Mythical Man-Month), as early as that was, knew that.
- Lesson #4: A programmer's most productive moments are those which are born out of a passion or itch to do something better, rather than out of carefully planned or scheduled activities. If you want productive people, let them follow their passions.
- Lesson #5: Users who send you feedback are your most valuable clients. Even though criticism may be hard to take, try to concentrate on the reason that led to the complaint or the suggestion, rather than on the complaint/suggestion itself or its style. Listen, and most importantly, try your best never to say "no".
- Lesson #6: The most efficient organizations are those within which information flows easily.
- Lesson #7: The same people, in different environments, can perform amazingly differently. Inspiring teams and environments, and the spirit of an enterprise, make a huge difference in the productivity of the individual.
References
- Debugging the Development Process,
Practical Strategies for Staying Focused, Hitting Ship Dates, and Building Solid Teams
by Steve Maguire,
Microsoft Press. ISBN 1-55615-650-2
- Peopleware,
Productive Projects and Teams
by Tom DeMarco & Timothy Lister,
Dorset House Publishing Co., 1987. ISBN ...
- Built to Last,
Successful Habits of Visionary Companies
by James C. Collins and Jerry I. Porras,
Harper Business. ISBN 0-88730-671-3
Feedback: Ariel Faigon email@example.com
(Ed note: this email address is long obsolete)