Talk About Network

Google


Register and Login
Nick
Password
Register create new account Sign up is FREE and you can post replies, new topics, bookmark posts and more!
Recover lost password


Education > Artificial intelligence > Re: automatic p...
Latest [ Topics | Posts ] Archive Post A New Topic Post a Reply
<< Topic < Post Post 3 of 8 Topic 551 of 688
Post > Topic >>

Re: automatic parametric space exploration and boundary detection.

by "Paul A. Rubin" <rubin@[EMAIL PROTECTED] > Jun 13, 2007 at 02:17 PM

Vista wrote:

> 
> Let me describe my problem with an simpler example.
> 
> Let's say I have a 2D plane.
> 
> I draw a straight line to split the plane to two parts: Left and Right.
The 
> equation for the straightline is y=a*x+b.
> 
> In the left part the values of F(x, y) are 1s, while in the right part
the 
> values of F(x, y) are 0s.
> 
> You can query the value of F for any x and y. That's to say, you give me
a 
> location (x, y), I will tell you the function value F(x, y).
> 
> What's the most efficient way to discover the value of "a" and "b"?
> 

First off, the original message was cross-posted all over the place, so 
I'm replying to all the same groups.  Apologies to anyone who finds what 
follows exceptionally boring.

In the absence of a closed form for F, or some strong known properties 
of F (and I'm not sure what those properties might be, given that the 
range of F is discrete), I don't know that you'll be able to *prove* 
that the line (more generally, hyperplane) that you end up with 
separates the zeros from the ones.  I suspect that the most you could do 
would be to randomly sample the domain of F, compute values, verify 
they're all correctly separated by the hyperplane, then give some 
probability that the hyperplane is universally accurate.

As to computing the hyperplane, a relatively straightforward approach 
would be to compute F for a random sample of points from its domain, 
then solve the LP

min \sum_{n \in N} d_n
s.t. a'x_n + b + d_n - e_n \ge 1  (n \in O)
      a'x_n + b - d_n + e_n \le -1 (n \in Z)
      a, b free
      d, e \ge 0

where N is the set of indices of your sample points x_n, O and Z are the 
sets of indices where F is one respectively zero, a is vector of 
variables of the same dimension as x (which is the vector of arguments 
to F), b is scalar, and d_n and e_n are nonnegative variables for n \in 
N.  The variables for the LP model are a (the hyperplane coefficients), 
b (the hyperplane offset from the origin) and d and e (deviations from 
the boundary hyperplane in the wrong and right directions, respectively).

There's a literature on classification using math programming, in which 
this model is usually called the MSD (minimize sum of deviations) model. 
  There are various other LP classification models, but the key is that 
when the training data can be strongly separated via a linear model 
(meaning all ones are in the *interior* of one half space and all zeros 
in the *interior* of the other half space), all LP models produce valid 
separating hyperplanes (valid in the sense of separating the training 
data -- no guarantee what happens with non-training data).

If you're not thrilled about solving an LP, there's also a fairly old 
algorithm called the "perceptron algorithm" that will find a valid 
separating hyperplane (when one exists) by iterative means, with the 
number of iterations bounded.  It's known to fail when the data cannot 
be separated linearly, but again I believe you are assuming it can be.

Two caveats apply.  First, different samples will likely produce 
different hyperplanes.  Second, let's assume that there really is a 
hyperplane separating the ones from the zeros.  Unless (a) that 
hyperplane contains both ones and zeros and (b) your sample includes at 
least one point of each type from the hyperplane, the odds of any 
solution (LP or otherwise) finding the exact hyperplane are minimal.  On 
the other hand, if your sample does happen to contain points of both 
flavor from the boundary hyperplane, then the training data cannot be 
strongly separated, in which case different LP models actually have 
different performances (some may be more prone to missing widely than 
others).  I don't want to recap all the literature here, particularly as 
I don't remember it all :-).

To end on a more positive note, Leo Breiman did some interesting work on 
"bagging" classifiers that would apply here.  A short and possibly 
inaccurate summary:  generate a random sample and construct a classifier 
(separating hyperplane), say by some LP model; generate a new sample and 
repeat (getting a somewhat different hyperplane); repeat ad nauseum; 
average the coefficients from the hyperplanes you got and hope for the
best.

/Paul
 




 8 Posts in Topic:
automatic parametric space exploration and boundary detection.
"Vista" <abc  2007-06-12 21:55:53 
Re: automatic parametric space exploration and boundary detectio
"Vista" <abc  2007-06-13 10:29:56 
Re: automatic parametric space exploration and boundary detectio
"Paul A. Rubin"  2007-06-13 14:17:12 
Re: automatic parametric space exploration and boundary detectio
"Graham Jones"   2007-06-13 22:09:38 
Re: automatic parametric space exploration and boundary detectio
"Vista" <abc  2007-06-13 10:37:12 
Re: automatic parametric space exploration and boundary detectio
"Paul A. Rubin"  2007-06-13 14:20:53 
Re: automatic parametric space exploration and boundary detectio
Toby Kelsey <toby.kels  2007-06-13 21:04:14 
Re: automatic parametric space exploration and boundary detectio
Fred Krogh <fkrogh@[EM  2007-06-13 14:24:43 

Post A Reply:
  Go here to Signup

AddThis Feed Button


About - Advertising - Contact - Frequently Asked Questions - Privacy Policy - Terms of Use - Signup

Contact
tan12V112 Sat Nov 22 17:10:31 CST 2008.