On Mar 31, 12:03 pm, Paul Rubin <ru...@[EMAIL PROTECTED]> wrote:
> jenmoocat wrote:
>
> > They say that in testing, there is an implied cost function overlaid
> > on the test --- where each of those quadrants has a cost value of 1 --
> > we want to evaluate them all at the same level. But what if
> > minimizing Type II error is more important than minimizing Type I
> > error? Then how should the test be constructed?
>
> I don't think what they are saying is accurate, but I think I understand
> the motivation. Consider an example from statistical quality control,
> in which you are inspecting a batch of a product to decide if it is good
> enough to ship. If not, you will scrap it. The null hypothesis,
> broadly speaking, is "good to go", so a Type I error is that you scrap a
> batch that actually met quality standards. That incurs some very real
> costs -- loss of the capital invested in materials and labor, cost of
> disposal, cost of shipping delays or expediting a new batch, ... On the
> other hand, a Type II error is that you ship a bad batch, which has its
> own set of costs (contract penalties, customer returns, lost good
> will/lost business, lawsuits, ...). Those costs are usually asymmetric.
> If you manufacture heart medication, say, where dosage errors can be
> lethal, you probably err on the side of minimizing Type II errors. If
> you are Microsoft, and quality is measured by bugs, you err on the side
> of minimizing Type I errors and let the customer download patches later on.
>
> I don't think you control for this asymmetry by monkeying with the test;
> I think you deal with it by setting the significance level. I may be
> wrong (in which case we'll find out quickly :-)), but I think that in
> most hypothesis tests the Type II probability is basically one minus the
> Type I probability, at least in a worst case bound sense (meaning if the
> null is false but the true parameter is close to the set described by
> the null). If you can establish a prior probability for the null being
> true -- which assumes that the null is a statement about a random event,
> not a statement about a deterministic parameter -- then I think you can
> take a Bayesian approach and set the significance level so as to
> minimize the expected error cost.
>
> /Paul
Hi Jennifer:
I am not sure how much help this will be but it's worth a shot.
This really seems like a problem typically encountered in data mining,
where essentially all differences are significant by the usual
criteria. Have you checked the data mining techniques, literature and groups?
How about discriminant analysis for this problem?
Marc
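
P.S. To make Paul's suggestion about picking the significance level by expected error cost a bit more concrete, here is a rough Python sketch. Everything specific in it is my own assumption, not something from the thread: a one-sided z-test with known unit variance, a single assumed effect size under the alternative, made-up cost figures, and hypothetical function names (type2_prob, expected_cost, best_alpha). Treat it as an illustration of the idea, not a recipe.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def type2_prob(alpha, effect, n):
    """Type II error probability of a one-sided z-test.

    Assumed setup: H0 is mu <= 0, sigma is known and equal to 1,
    `effect` is the true mean when H0 is false, `n` is the sample size.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(z_crit - effect * n ** 0.5)


def expected_cost(alpha, prior_null, cost_type1, cost_type2, effect, n):
    """Expected error cost given a prior probability that H0 is true."""
    beta = type2_prob(alpha, effect, n)
    return (prior_null * cost_type1 * alpha
            + (1.0 - prior_null) * cost_type2 * beta)


def best_alpha(prior_null, cost_type1, cost_type2, effect, n, grid_size=999):
    """Grid search over alpha in (0, 1) for the lowest expected cost."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(
        grid,
        key=lambda a: expected_cost(
            a, prior_null, cost_type1, cost_type2, effect, n
        ),
    )


if __name__ == "__main__":
    # Illustrative numbers only: shipping a bad batch costs 10x scrapping a good one.
    costly_type2 = best_alpha(0.5, 1.0, 10.0, effect=0.5, n=25)
    # The reverse: scrapping a good batch costs 10x shipping a bad one.
    costly_type1 = best_alpha(0.5, 10.0, 1.0, effect=0.5, n=25)
    print("alpha when Type II is costly:", costly_type2)
    print("alpha when Type I is costly: ", costly_type1)
    # Worst case near the null boundary: beta approaches 1 - alpha.
    print("beta at zero effect, alpha=0.05:", type2_prob(0.05, 0.0, 25))
```

With these assumptions, making Type II errors more expensive pushes the chosen alpha up (you reject "good to go" more readily), and making Type I errors more expensive pushes it down. The last line shows the worst-case relationship Paul mentioned: as the true parameter approaches the null, the Type II probability approaches one minus the Type I probability.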

