Talk About Network


clustering and data variance question

by "ozgun.harmanci" <ozgun.harmanci@[EMAIL PROTECTED] > May 11, 2008 at 08:19 AM

Hello,
We have been clustering data to compare samples generated by two
different methods. Each method generates a sample (x_1, say), which we
then cluster using diana from the R cluster package; we choose the
optimal clustering by maximizing the Calinski-Harabasz index (as
calculated by R). diana (DIvisive ANAlysis) is a hierarchical divisive
clustering method that computes a tree, or dendrogram.
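
For readers unfamiliar with the index: the Calinski-Harabasz index for n
points in k clusters is [B / (k - 1)] / [W / (n - k)], where B is the
between-cluster sum of squares and W is the within-cluster sum of
squares. The following is a minimal pure-Python sketch of that formula,
not the R code used in the post; the function name `calinski_harabasz`
and the toy data are illustrative assumptions.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for points (sequences of floats) and labels.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    n = len(points)
    dim = len(points[0])

    # Group the points by their cluster label.
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        groups[lab].append(p)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("the index needs 2 <= k < n")

    def centroid(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0  # B: size-weighted squared distances of centroids to the overall mean
    within = 0.0   # W: squared distances of points to their own centroid
    for rows in groups.values():
        c = centroid(rows)
        between += len(rows) * sqdist(c, overall)
        within += sum(sqdist(r, c) for r in rows)

    if within == 0.0:
        return float("inf")  # every cluster is a single repeated point
    return (between / (k - 1)) / (within / (n - k))


# Toy data (made up): two tight, well-separated groups.
toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = calinski_harabasz(toy, [0, 0, 1, 1])
bad = calinski_harabasz(toy, [0, 1, 0, 1])
```

A higher value means the clusters are compact relative to their
separation, which is why the clustering that maximizes the index is the
one selected.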

Our hypothesis is that one method should generate data that is less
scattered, meaning that cluster analysis should yield fewer
clusters.

However, when we ran the cluster analysis on the generated samples,
we saw no clear difference in the number of clusters between the two
methods. But if I look at the trees generated by diana, it is obvious
to me that the method we expect to yield fewer clusters has less
spread in the tree.

I am thinking that, in addition to the number of clusters, we should
also use the variance of the data within the clusters to compare the
sampling methods. However, I could not find a theoretically grounded
way to do that. Could you suggest ideas, papers, or books for
following up on this problem?
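
To make the idea concrete, here is a rough sketch of two descriptive
scatter measures that could be computed for each sample: the total
variance (the trace of the sample covariance, which does not depend on
any clustering) and the pooled within-cluster variance W / (n - k).
This is only one possible way to quantify "spread", not an established
test; the function names and the toy data are assumptions made for
illustration.

```python
from collections import defaultdict


def _centroid(rows):
    dim = len(rows[0])
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]


def _sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def total_variance(points):
    """Sum of per-coordinate sample variances (trace of the covariance)."""
    mean = _centroid(points)
    return sum(_sqdist(p, mean) for p in points) / (len(points) - 1)


def pooled_within_variance(points, labels):
    """Within-cluster sum of squares divided by (n - k)."""
    groups = defaultdict(list)
    for p, lab in zip(points, labels):
        groups[lab].append(p)
    n, k = len(points), len(groups)
    if k >= n:
        raise ValueError("need fewer clusters than points")
    within = 0.0
    for rows in groups.values():
        c = _centroid(rows)
        within += sum(_sqdist(r, c) for r in rows)
    return within / (n - k)


# Toy sample (made up) with a hypothetical two-cluster labelling.
sample = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
tv = total_variance(sample)
pwv = pooled_within_variance(sample, [0, 0, 1, 1])
```

Computing both quantities for the sample from each method, with the
trees cut at the same number of clusters, would give numbers that can
be compared directly even when the selected number of clusters does not
differ.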

I hope this makes sense.
Arif.
 



