Of course a lot depends on what your objects/cases are, what your
variables are, and what kind of similarity measure you have.
Since you already have some kind of similarity measure, that decision is
already made for you.
Since you only have 1000 cases/objects, you should be able to use one
of the hierarchical methods.
Several packages have methods for clustering.
For example, SPSS has several methods that can also work with a matrix of
similarities:
BAVERAGE  Average linkage between groups (UPGMA). BAVERAGE is the default
          and can also be requested with keyword DEFAULT.
WAVERAGE  Average linkage within groups.
SINGLE    Single linkage or nearest neighbor.
COMPLETE  Complete linkage or furthest neighbor.
CENTROID  Centroid clustering (UPGMC). Squared Euclidean distances are
          commonly used with this method.
MEDIAN    Median clustering (WPGMC). Squared Euclidean distances are
          commonly used with this method.
WARD      Ward’s method. Squared Euclidean distances are commonly used with
          this method.
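
To make the idea concrete outside SPSS, here is a minimal sketch of
between-groups average linkage working directly on a similarity matrix.
It is plain Python written for illustration, not SPSS's algorithm; the
function name average_linkage, the toy matrix, and the choice to stop at a
fixed number of clusters are all assumptions of mine.

```python
def average_linkage(sim, n_clusters):
    """Naive between-groups average linkage (UPGMA) on a similarity matrix.

    sim is a symmetric list of lists; larger values mean "more similar".
    Returns a list of clusters, each a sorted list of object indices.
    """
    # Start with every object in its own cluster.
    clusters = [[i] for i in range(len(sim))]

    def between(a, b):
        # Mean similarity over all pairs with one member in each cluster.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the most similar pair of clusters and merge it.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = between(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


# Hypothetical toy data: objects 0-2 resemble each other, as do 3-4.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.3],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.2, 0.1, 0.3, 0.9, 1.0],
]
groups = average_linkage(sim, 2)
print(groups)  # [[0, 1, 2], [3, 4]]
```

This naive version recomputes every pairwise average at each step, so it
is only meant to show the logic. For 1000 objects a library routine is the
practical choice; note that some of them (e.g., SciPy's
scipy.cluster.hierarchy.linkage) expect distances, so the similarities
would have to be converted first.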
No matter which package you end up using, the chapter "Choosing a
Clustering Procedure" in the SPSS Base User's Guide should be of help.
In version 16 that is chapter 32.
I have been doing clustering since 1971. I seldom rely on the results
of a single clustering. However, the interpretation of the tree and
where to cut it to decide how many clusters to retain relies heavily on
the profile of the cluster. You don't have that. You will need to rely
on whatever info you have about the clusters to make these decisions.
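
One kind of information that is available from the similarities alone is
how tight each cluster is and how well separated the clusters are. The
sketch below is only my illustration of that idea, not a standard cutting
rule: the function name cluster_summary and the toy data are hypothetical.
It reports the mean similarity within each cluster and between each pair
of clusters for a candidate solution.

```python
def cluster_summary(sim, clusters):
    """Mean within-cluster and between-cluster similarity.

    sim is a symmetric similarity matrix (list of lists); clusters is a
    list of lists of object indices. Singleton clusters get None for the
    within-cluster value because they contain no pairs.
    """
    within = {}
    for c, members in enumerate(clusters):
        pairs = [(i, j) for i in members for j in members if i < j]
        if pairs:
            within[c] = sum(sim[i][j] for i, j in pairs) / len(pairs)
        else:
            within[c] = None

    between = {}
    for a in range(len(clusters)):
        for b in range(a + 1, len(clusters)):
            vals = [sim[i][j] for i in clusters[a] for j in clusters[b]]
            between[(a, b)] = sum(vals) / len(vals)
    return within, between


# Hypothetical toy data and a candidate two-cluster solution.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.3],
    [0.1, 0.2, 0.1, 1.0, 0.9],
    [0.2, 0.1, 0.3, 0.9, 1.0],
]
within, between = cluster_summary(sim, [[0, 1, 2], [3, 4]])
print(within)
print(between)
```

Comparing these numbers across a few different cuts of the tree gives at
least a rough basis for choosing among them: a reasonable cut should show
within-cluster similarities clearly higher than the between-cluster ones.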
The SPSS documentation, including the algorithms and the command syntax,
comes with the software. If you are not at a university or other place
where SPSS is readily available, send me an email and I'll send you the
.pdf chapter from the documentation.
Good luck.
Art Kendall
Social Research Consultants
Sengly wrote:
> Dear all,
>
> I would like you to share with me your experience on how should I
> handle my data. I have 1000 objects and I have a list of pair
> similarity of them. I would like to know how to cluster them into
> different groups according to their similarity?
>
> I have browse through various methods such as hierarchy, k-means,
> scaling dimension, etc. I really like k-means method but the problem
> is that I don't have points (and their coordinates) in space but
> rather their similarity.
>
> Any suggestion is appreciated.
>
> Kindest regards,
>
> Sengly

