Version 2.0 April 2008 (NEW agglomerative clustering)
Example 3: Spam vs. NOT Spam: Same as Example 2, except that we have added a 7th document also describing investment opportunities in Nigeria. The settings are brought back to "normal", employing idf and words as terms. The clustering isolates the 7th document from the real spam emails. Press "Measure similarity" at the bottom of the page, or select "Clear URIs" and input your own settings.
An experimental document classifier based on the vector space model and agglomerative clustering. Input is a number of links to documents to be analyzed. Output is a distance matrix depicting the similarities of the documents and how they cluster.
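The pipeline described above (term vectors, a distance matrix, then agglomerative clustering) can be sketched roughly as follows. This is only an illustration, not the tool's actual code: the page does not state the exact weighting, distance measure, or linkage rule, so the sketch assumes tf × log(N/df) weights, cosine distance, and average linkage. It also takes plain text strings rather than links, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens ("words as terms").
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(docs):
    # Assumed weighting: term frequency times log(N / document frequency).
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    df = Counter(term for c in counts for term in c)  # one count per document
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]


def cosine_distance(a, b):
    # 1 - cosine similarity; vectors with no weight are treated as maximally distant.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def distance_matrix(vectors):
    n = len(vectors)
    return [
        [0.0 if i == j else cosine_distance(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]


def agglomerate(dist, k):
    # Assumed average linkage: merge the two closest clusters until k remain.
    clusters = [[i] for i in range(len(dist))]

    def linkage(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = (
            (x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))
        )
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Made-up sample texts, not the documents used in the examples on this page.
    docs = [
        "cheap pills buy cheap pills online now",
        "buy cheap pills online discount pills",
        "investment opportunities in Nigeria economic report",
        "Nigeria economic growth investment report",
    ]
    matrix = distance_matrix(tfidf_vectors(docs))
    for row in matrix:
        print(" ".join(f"{d:.2f}" for d in row))
    print(agglomerate(matrix, 2))
```

Because idf gives zero weight to terms that occur in every document, documents that share only such common words end up far apart, which is one plausible reason a legitimate text on the same topic can separate from the spam messages.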