Making a Web Search Feel Like a Stroll in the Library
By ANNE EISENBERG
A VISIT to the school library was once a necessity before writing term
papers or reports. But nowadays many students use the Internet as their
However convenient it may be to search the Web from home
or a dorm room, the Internet cannot replace many of the built-in
benefits of the library, like browsing the stacks for related
information that could add spark and depth to an essay or a report.
But researchers are working on more flexible approaches to searching
for digital information not only on the Web, but on one's own hard
drive, where elusive details may be scattered through photos, e-mail
and other files.
At the University of California, Berkeley, a
professor and her students have created a search program called
Flamenco that lets users browse a digitized collection in ways that are
similar to a stroll among the shelves of a library.
when you are not quite sure what you want," said Marti Hearst, an
associate professor at the School of Information Management and
Systems, who led the research. "It's meant to help people find things,
in part, by serendipity."
To create Flamenco, Dr. Hearst
started with one archived collection of art at the Fine Arts Museums of
San Francisco, which included 35,000 images that were identified by
written descriptions. She used the descriptions to classify the items
in a variety of ways, including the medium, the date, the artist and
the content of the image.
The categories were then cross-linked
so that when people clicked on a category, they immediately saw not
only the images within it - say, of landscapes - but those in related
categories, like other artists working on landscapes at the same time
in the Netherlands.
The effect, she said, is very much like walking down a library aisle and finding related books on a subject.
The search program is also intended to let people look at multiple
subcategories at once, she said. For example, a student doing research
for an essay on the depiction of flowers in the 18th century can click
on the "flowers" category. The system can immediately group the flowers
in the collection by subcategories like the kind of flower and show
thumbnail-size images of them.
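The kind of grouping described here can be sketched in a few lines of Python. This is only an illustration of faceted browsing in general, not Flamenco's actual code: the record fields, the sample items and the helper names select and group_by are all invented for the example.

```python
from collections import defaultdict

# Hypothetical catalog records; every field name and value is invented.
ITEMS = [
    {"title": "Iris study", "subject": "flowers", "flower": "iris",
     "medium": "print", "year": 1742},
    {"title": "Chrysanthemum vase", "subject": "flowers", "flower": "chrysanthemum",
     "medium": "ceramic", "year": 1785},
    {"title": "Iris bowl", "subject": "flowers", "flower": "iris",
     "medium": "ceramic", "year": 1781},
    {"title": "Dutch canal", "subject": "landscapes", "flower": None,
     "medium": "drawing", "year": 1744},
]

def select(items, **facets):
    """Keep only the items whose fields match every requested facet value."""
    return [it for it in items if all(it.get(k) == v for k, v in facets.items())]

def group_by(items, key):
    """Group item titles under each value returned by the key function."""
    groups = defaultdict(list)
    for it in items:
        groups[key(it)].append(it["title"])
    return dict(groups)

# Click on "flowers", then group the results by kind of flower.
flowers = select(ITEMS, subject="flowers")
by_kind = group_by(flowers, lambda it: it["flower"])

# The same helper regroups a narrower selection by any other field.
by_medium = group_by(select(flowers, flower="iris"), lambda it: it["medium"])
by_decade = group_by(flowers, lambda it: it["year"] // 10 * 10)
```

Because each item carries several independent labels, a single selection can be regrouped along any of them without a new search, which is what makes this style of browsing different from a keyword query.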
It can then group the irises or
chrysanthemums by medium, for instance, listing all the ceramics pieces
showing these flowers or all of the prints or drawings that include
them. It can group the images by decade - showing, for example, how
flowers were portrayed in 1740 compared to 1780. "This way," Dr. Hearst
said, "people can compare and contrast, discovering new categories and
Dr. Hearst has been working for 10 years on ways
to browse digital collections, inspired in part by her own frustration
in searching the Web. Flamenco, financed in part by the National
Science Foundation, is still a prototype; she will be testing it this
month with students.
The Web is not the only place where searches
are made. Often, necessary details are scattered across a computer hard
drive, making them hard to find. To address this problem, Bruce Horn,
the founder of Ingenuity Software in Mammoth Lakes, Calif., has created
an information management system, now being tested, that lets people
individually tailor and cross-index all kinds of files.
Dr. Horn, one of the members of the original Macintosh team at Apple Computer,
has added another layer of organization beyond folders to his desktop
system. The layer is called "collections" because the system collects
and cross-links all references to any subject that the user specifies.
For example, someone researching John Adams and his presidency could
make a collection by telling the program to find any mention of him and
related historical events.
While some current software uses a
"collection" system to keep track of one kind of file - digital photos,
for instance - Dr. Horn's software can handle many kinds of files.
The collection does not copy the actual items, a move that could
multiply storage demands and possibly lead to changes in original
documents. "The items remain in their original folders," he said, "and
are referenced by the collection."
There are many ways to put
objects into collections. "People can drag and drop them in," he said,
"or use an annotation to classify items one by one, for instance, in a
group photo." Items can also be put into collections automatically by
using key phrases.
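A collection that points to files rather than copying them might look something like the following Python sketch. It is not Dr. Horn's software; the Collection class, its methods and the sample file paths are hypothetical, and the key-phrase matching is a plain substring test chosen for brevity.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

@dataclass
class Collection:
    """A named set of references to files; the files are never copied."""
    name: str
    key_phrases: list = field(default_factory=list)
    refs: set = field(default_factory=set)

    def add(self, path):
        """File one item by hand, the equivalent of drag and drop."""
        self.refs.add(PurePosixPath(path))

    def auto_file(self, documents):
        """Reference every document whose text mentions a key phrase."""
        for path, text in documents.items():
            lowered = text.lower()
            if any(phrase.lower() in lowered for phrase in self.key_phrases):
                self.add(path)

# Hypothetical files of several kinds, keyed by their original location.
DOCUMENTS = {
    "mail/inbox/0142.eml": "Notes on John Adams and his presidency",
    "photos/trip/statue.jpg": "caption: statue of John Adams",
    "essays/draft.txt": "Outline for a paper on Jefferson",
}

adams = Collection("John Adams", key_phrases=["john adams"])
adams.auto_file(DOCUMENTS)     # automatic filing by key phrase
adams.add("essays/draft.txt")  # manual filing
```

Since the collection stores only paths, the same file can belong to any number of collections while staying in its original folder.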
Dr. Horn and Dr. Hearst both presented their work at a conference at the I.B.M.
Almaden Research Center in California, organized by Daniel Russell,
senior scientist there, to discuss new approaches to dealing with the
ever-increasing mass of the Web. "Too much information was our topic
this year," Dr. Russell said. "Way too much information."
New types of information are constantly evolving, he added, citing moblogs
- Web pages filled with photos from cellphones - as one of the latest
examples. Video, too, is being stored at a ferocious rate, he said, as
are radio shows.
And all of it has to be made searchable, he said.