The Flamenco Project

Demos of Castanet: (Semi)Automatic Creation of Facet Hierarchies

Castanet is an algorithm for (semi)automatically generating hierarchical faceted metadata from textual descriptions of items. The resulting metadata can be incorporated into the browsing and navigation interfaces of large information collections, in systems such as Flamenco.

Castanet carves out a structure from the WordNet lexical hierarchy, with the goal of building facets that reflect the contents of the target information collection; moderate manual modifications improve the outcome.
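As a rough illustration of what "carving out" a structure from a lexical hierarchy could look like, here is a minimal sketch. It is not the published Castanet implementation. The hypernym paths are hand-written stand-ins for WordNet lookups, and the names `HYPERNYM_PATHS`, `build_tree`, and `collapse` are hypothetical. The sketch merges root-first paths for a few terms into one tree, then skips interior nodes that have only a single child, which is one plausible way to keep the hierarchy compact.

```python
from collections import defaultdict

# Hypothetical, hand-written hypernym paths (root first). A real system would
# look these up in WordNet; they are hard-coded so the sketch is self-contained.
HYPERNYM_PATHS = {
    "sundae": ["food", "dessert", "frozen dessert", "sundae"],
    "sherbet": ["food", "dessert", "frozen dessert", "sherbet"],
    "fudge": ["food", "sweet", "candy", "fudge"],
}


def build_tree(paths):
    """Merge root-first paths into a parent -> set-of-children mapping."""
    children = defaultdict(set)
    for path in paths:
        for parent, child in zip(path, path[1:]):
            children[parent].add(child)
    return children


def collapse(node, children):
    """Return the subtree under `node` as nested dicts.

    Interior nodes that have exactly one child are skipped, so their
    descendants are attached directly to the nearest kept ancestor.
    """
    kids = children.get(node, set())
    while len(kids) == 1:
        (only,) = kids
        if only not in children:  # the single child is a leaf: keep it
            break
        kids = children[only]  # skip the single interior child
    return {kid: collapse(kid, children) for kid in sorted(kids)}


tree = build_tree(HYPERNYM_PATHS.values())
facets = {"food": collapse("food", tree)}
print(facets)
```

In this toy input, "frozen dessert" and "candy" are each the only child of their parent, so they are dropped and their terms move up one level. Whether such nodes should be dropped is exactly the kind of decision the moderate manual modifications mentioned above could revisit.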

The algorithm is simple yet effective: a study conducted with 34 information architects found that Castanet achieves higher-quality results than other automated category creation algorithms, and 85% of the study participants said they would like to use the system for their work.

For more information, see this paper: "Automating Creation of Hierarchical Faceted Metadata Structures," by Emilia Stoica, Marti Hearst, and Megan Richardson, in the Proceedings of NAACL-HLT, Rochester, NY, April 2007.