Collaborate

Introducing Collaborate

Collaborate is the latest OS X software from TheSpicyChicken. This real-time collaborative document editor lets you share and edit documents with your friends over any Jabber server. Just sign in with your Google Talk or Jabber username and start editing.

R Tip: Fitting Sigmoidal Data

Sigmoid functions are our friends, and sometimes you have data that you would like to fit with one. We can use R to find such a fit. First, let us look at a sigmoid function:

y = 1 / (1 + exp( a*x + b) )
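To get a feel for the shape, here is a minimal sketch of the function in R, written in the exp(a*x + b) form that the derivation further down works with. The name sigmoid and the values of a and b are made up for illustration; they are not the fitted values.

```r
# Sigmoid in the exp(a*x + b) form used for the fit later in this post.
# The name `sigmoid` and the values of a and b are illustrative assumptions.
sigmoid <- function(x, a, b) {
  1 / (1 + exp(a * x + b))
}

a <- -10  # a negative a gives a curve that rises with x
b <- 2

sigmoid(0.2, a, b)   # 0.5, since a*x + b = 0 at x = 0.2

# Draw the curve over the same x range as the data below
curve(sigmoid(x, a, b), from = 0, to = 1, ylab = "y")
```

With this sign convention the midpoint of the curve sits at x = -b/a, and a larger magnitude of a makes the transition steeper.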

Now let’s say you are given two vectors, x and y:

x = c(0.00,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.20,0.24,0.26,
      0.28,0.30,0.34,0.40,0.42,0.48,0.54,0.56,0.64,1.00)

y = c(0.409742,0.319277,0.530120,0.377778,0.357143,0.608696,0.315789,
      0.692308,0.642857,0.636364,0.750000,0.000000,0.833333,1.000000,
      0.000000,1.000000,1.000000,1.000000,1.000000,1.000000,1.000000,
      1.000000,1.000000)

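In this case the y values represent probabilities, and one thing you’ll notice is that some of them are exactly 1 or 0. Both are bad, because the log transform we use further down is infinite at those values. So we apply a little smoothing (loosely in the spirit of “Laplace smoothing”) and nudge them just inside the boundaries: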

y[y==0] = 0.001

y[y==1] = 0.999
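If you want to reuse this step, one option is a small clamping helper. This is only a sketch: clamp_probs and its eps argument are hypothetical names, and unlike the two lines above it also moves any value that lies strictly between 0 and eps (or between 1 - eps and 1), not just exact zeros and ones.

```r
# Hypothetical helper (not from the original post): clamp probabilities
# into [eps, 1 - eps] so that log(1/p - 1) stays finite.
clamp_probs <- function(p, eps = 0.001) {
  pmin(pmax(p, eps), 1 - eps)
}

p <- c(0, 0.25, 1)
clamp_probs(p)   # 0.001 0.250 0.999
```

For the data in this post, which has no values that close to the boundaries other than exact 0 and 1, this gives the same result as the two assignments above.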

Now let’s look at what the data looks like.

plot(x, y)

Well, it may be sigmoidal, maybe not. For now let’s assume we think it is. Which we do for the most part.

Okay, now let’s rearrange our sigmoid function so that the line a*x + b stands alone on one side:

y = 1 / (1 + exp( a*x + b) )

1 + exp( a*x + b) = 1/ y

a*x + b = log( (1/ y) - 1 )
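A quick way to convince yourself of this algebra is a numeric round trip: generate y from known a and b, apply the transform, and compare it with the line. The values below are arbitrary choices for the check, and x_check, y_check, lhs and rhs are just illustrative names.

```r
# Round-trip check: build y from known a and b, then apply the transform.
# a, b and x_check are arbitrary values chosen for illustration.
a <- -11.647
b <- 1.122
x_check <- seq(0, 1, by = 0.1)

y_check <- 1 / (1 + exp(a * x_check + b))

lhs <- a * x_check + b        # the line
rhs <- log(1 / y_check - 1)   # the transformed y values

all.equal(lhs, rhs)   # TRUE (up to floating-point tolerance)
```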

Now the left-hand side of the equation is a line and the right-hand side is a log transform of the y data. We can plot x versus this right-hand side:

new_y = log( 1 / y - 1 )

plot(x, new_y)

Looks pretty interesting and hopefully at this point it also looks kinda linear, which it kinda does.

Now let’s fit it with a line:

lm.res <- lm( new_y ~ x )

lm.res

Which produces this output:

Coefficients:
(Intercept)            x
      1.122      -11.647
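If you would rather not copy the numbers by hand, coef() pulls them straight out of the fitted object. The sketch below repeats the data and the steps above so that it runs on its own; in the session you already have, only the last three lines are new.

```r
# Data and transform repeated from above so this block is self-contained
x <- c(0.00,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.20,0.24,0.26,
       0.28,0.30,0.34,0.40,0.42,0.48,0.54,0.56,0.64,1.00)
y <- c(0.409742,0.319277,0.530120,0.377778,0.357143,0.608696,0.315789,
       0.692308,0.642857,0.636364,0.750000,0.000000,0.833333,1.000000,
       0.000000,1.000000,1.000000,1.000000,1.000000,1.000000,1.000000,
       1.000000,1.000000)
y[y == 0] <- 0.001
y[y == 1] <- 0.999
new_y <- log(1 / y - 1)

lm.res <- lm(new_y ~ x)

# coef() returns a named vector: "(Intercept)" is b, "x" is the slope a
b <- coef(lm.res)[["(Intercept)"]]
a <- coef(lm.res)[["x"]]
c(a = a, b = b)
```

This keeps the full precision of the estimates instead of the three decimals shown in the printed output.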

We can also test the significance of the fit with an ANOVA.

anova(lm.res)

Which produces this output:

Analysis of Variance Table

Response: new_y
  Df Sum Sq Mean Sq F value    Pr(>F)
x  1 172.80 172.802  14.641 0.0009834 ***

And we can plot the resulting fit in linear space:
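One way to draw that plot, as a sketch, is to overlay the fitted line on the transformed data with abline(), which accepts a fitted lm object directly. The data and earlier steps are repeated here only so the block runs on its own.

```r
# Data and transform repeated from above so this block is self-contained
x <- c(0.00,0.02,0.04,0.06,0.08,0.10,0.12,0.14,0.16,0.18,0.20,0.24,0.26,
       0.28,0.30,0.34,0.40,0.42,0.48,0.54,0.56,0.64,1.00)
y <- c(0.409742,0.319277,0.530120,0.377778,0.357143,0.608696,0.315789,
       0.692308,0.642857,0.636364,0.750000,0.000000,0.833333,1.000000,
       0.000000,1.000000,1.000000,1.000000,1.000000,1.000000,1.000000,
       1.000000,1.000000)
y[y == 0] <- 0.001
y[y == 1] <- 0.999
new_y <- log(1 / y - 1)

lm.res <- lm(new_y ~ x)

# Scatter of the transformed data with the fitted line drawn on top
plot(x, new_y)
abline(lm.res)
```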

Now let’s see how our fit looks back in normal space using our formula with our derived a and b values.

a = -11.647

b = 1.122

plot(x, y)

sim_x = (0:100)/100

points(sim_x, 1/(1+exp(a*sim_x+b)), type="l")

Voila! We have fit a sigmoid function to our data.

