Example 3: Spam vs. NOT Spam: Same as Example 2, except that a 7th document describing "serious" investment opportunities in Nigeria has been added. Using n=1, the clustering C1(1,2,4,3,5,6) C2(7) isolates the 7th document from the Nigerian spam emails. Press "Measure similarity" at the bottom of the page, or select "Clear URIs" and enter your own settings.
An experimental document classifier based on the vector space model and agglomerative clustering. The input is a set of links to the documents to be analyzed. The output is a distance matrix showing the pairwise similarity of the documents, together with how they cluster.
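The pipeline described above can be illustrated with a minimal Python sketch. It is not this tool's actual implementation: the tokenizer, the raw term-frequency weighting, the cosine distance, the single-linkage merge rule, and the function names (`term_vector`, `cosine_distance`, `distance_matrix`, `agglomerate`) are all assumptions chosen for illustration, and the three sample texts are invented. It also takes plain strings rather than links, so no fetching is involved.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: lowercased word -> raw count (assumed weighting)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty document is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(texts):
    """Pairwise distances between all documents."""
    vecs = [term_vector(t) for t in texts]
    return [[cosine_distance(u, v) for v in vecs] for u in vecs]


def agglomerate(dist, k):
    """Single-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented sample texts (hypothetical, not the documents used on this page).
docs = [
    "urgent transfer of funds from nigeria bank account",
    "nigeria bank funds transfer urgent reply needed",
    "serious investment opportunities in agriculture and energy",
]
dist = distance_matrix(docs)
clusters = agglomerate(dist, 2)
print(clusters)  # [[0, 1], [2]] (0-based document indices)
```

The first two texts share most of their vocabulary, so they merge first, while the third shares no words with either and stays in its own cluster. A different linkage rule or term weighting (for example TF-IDF) could produce different clusterings on real documents.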