The New York Times Technology

Mary Ann Smith


Making a Web Search Feel Like a Stroll in the Library


Published: August 19, 2004

A VISIT to the school library was once a necessity before writing term papers or reports. But nowadays many students use the Internet as their library.

However convenient it may be to search the Web from home or a dorm room, the Internet cannot replace many of the built-in benefits of the library, like browsing the stacks for related information that could add spark and depth to an essay or a report.

But researchers are working on more flexible approaches to searching for digital information not only on the Web, but on one's own hard drive, where elusive details may be scattered through photos, e-mail and other files.

At the University of California, Berkeley, a professor and her students have created a search program called Flamenco that lets users browse a digitized collection in ways that are similar to a stroll among the shelves of a library.

"It's for when you are not quite sure what you want," said Marti Hearst, an associate professor at the School of Information Management and Systems, who led the research. "It's meant to help people find things, in part, by serendipity."

To create Flamenco, Dr. Hearst started with one archived collection of art at the Fine Arts Museums of San Francisco, which included 35,000 images that were identified by written descriptions. She used the descriptions to classify the items in a variety of ways, including the medium, the date, the artist and the content of the image.

The categories were then cross-linked so that when people clicked on a category, they immediately saw not only the images within it - say, of landscapes - but those in related categories, like other artists working on landscapes at the same time in the Netherlands.

The effect, she said, is very much like walking down a library aisle and finding related books on a subject.

The search program is also intended to let people look at multiple subcategories at once, she said. For example, a student doing research for an essay on the depiction of flowers in the 18th century can click on the "flowers" category. The system can immediately group the flowers in the collection by subcategories like the kind of flower and show thumbnail-size images of them.

It can then group the irises or chrysanthemums by medium, for instance, listing all the ceramics pieces showing these flowers or all of the prints or drawings that include them. It can group the images by decade - showing, for example, how flowers were portrayed in 1740 compared to 1780. "This way," Dr. Hearst said, "people can compare and contrast, discovering new categories and relationships."
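The grouping described above can be sketched in a few lines of Python. This is only an illustration of browsing by category, not Flamenco's actual code: the records, the field names (subject, medium, year) and the helper functions group_by, refine and decade are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records; the fields mirror the categories named in the article.
ARTWORKS = [
    {"title": "Iris study", "subject": "iris", "medium": "print", "year": 1743},
    {"title": "Iris vase", "subject": "iris", "medium": "ceramic", "year": 1781},
    {"title": "Chrysanthemum plate", "subject": "chrysanthemum", "medium": "ceramic", "year": 1745},
    {"title": "Chrysanthemum sketch", "subject": "chrysanthemum", "medium": "drawing", "year": 1788},
]


def decade(item):
    """Map a record to the first year of its decade, e.g. 1743 -> 1740."""
    return item["year"] // 10 * 10


def group_by(items, key):
    """Group titles under each value of a category.

    `key` is either a field name or a function that derives a value
    (such as the decade) from a record.
    """
    groups = defaultdict(list)
    for item in items:
        value = key(item) if callable(key) else item[key]
        groups[value].append(item["title"])
    return dict(groups)


def refine(items, **selected):
    """Keep only the records that match every selected category value."""
    return [
        item for item in items
        if all(item[name] == value for name, value in selected.items())
    ]


# Click "flowers", then see the subcategories by kind of flower.
by_subject = group_by(ARTWORKS, "subject")
# Narrow to irises, then regroup the same items by medium.
irises_by_medium = group_by(refine(ARTWORKS, subject="iris"), "medium")
# Regroup the whole set by decade to compare 1740 with 1780.
by_decade = group_by(ARTWORKS, decade)
```

Because each regrouping starts from the same records, a reader can switch from flower type to medium to decade without beginning a new search, which is the compare-and-contrast effect Dr. Hearst describes.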

Dr. Hearst has been working for 10 years on ways to browse digital collections, inspired in part by her own frustration in searching the Web. Flamenco, financed in part by the National Science Foundation, is still a prototype; she will be testing it this month with students.

The Web is not the only place where searches are made. Often, necessary details are scattered across a computer hard drive, making them hard to find. To address this problem, Bruce Horn, the founder of Ingenuity Software in Mammoth Lakes, Calif., has created an information management system, now being tested, that lets people individually tailor and cross-index all kinds of files.

Dr. Horn, one of the members of the original Macintosh team at Apple Computer, has added another layer of organization beyond folders to his desktop system. The layer is called "collections" because the system collects and cross-links all references to any subject that the user specifies. For example, someone researching John Adams and his presidency could make a collection by telling the program to find any mention of him and related historical events.

While some current software uses a "collection" system to keep track of one kind of file - digital photos, for instance - Dr. Horn's software can handle many kinds of files.

The collection does not copy the actual items, a move that could multiply storage demands and possibly lead to changes in original documents. "The items remain in their original folders," he said, "and are referenced by the collection."

There are many ways to put objects into collections. "People can drag and drop them in," he said, "or use an annotation to classify items one by one, for instance, in a group photo." Items can also be put into collections automatically by using key phrases.
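A collection that references items rather than copying them, and that can fill itself from key phrases, might look like the following sketch. It is not Dr. Horn's software: the Collection class, its method names, the file paths and the file contents are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A named set of references to files; the files themselves stay put."""
    name: str
    key_phrases: list
    members: set = field(default_factory=set)

    def add(self, path):
        """Manual route: reference one item, as with drag and drop."""
        self.members.add(path)

    def auto_collect(self, files):
        """Automatic route: reference every file whose text mentions a key phrase."""
        for path, text in files.items():
            lowered = text.lower()
            if any(phrase.lower() in lowered for phrase in self.key_phrases):
                self.add(path)


# Hypothetical files, keyed by path. Only the paths enter the collection.
FILES = {
    "mail/inbox/0412.txt": "Notes on John Adams and the Alien and Sedition Acts.",
    "papers/draft.txt": "The XYZ Affair shaped the Adams presidency.",
    "photos/trip.txt": "Caption: Mammoth Lakes at sunset.",
}

adams = Collection("John Adams", ["John Adams", "XYZ Affair"])
adams.auto_collect(FILES)          # picks up the e-mail and the draft
adams.add("photos/portrait.jpg")   # added by hand
```

Storing only paths keeps storage flat and leaves the original documents untouched in their folders, the two concerns raised above about copying items.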

Dr. Horn and Dr. Hearst both presented their work at a conference at the I.B.M. Almaden Research Center in California, organized by Daniel Russell, senior scientist there, to discuss new approaches to dealing with the ever-increasing mass of the Web. "Too much information was our topic this year," Dr. Russell said. "Way too much information."

New types of information are constantly evolving, he added, citing moblogs - Web pages filled with photos from cellphones - as one of the latest examples. Video, too, is being stored at a ferocious rate, he said, as are radio shows.

And all of it has to be made searchable, he said.
