Algorithm for fitting a logistic curve? - HP Forums (https://archived.hpcalc.org/museumforum), HP Museum Forums, Old HP Forum Archives
Algorithm for fitting a logistic curve? - Tim Wessman - 11-11-2011 Hello, I am working on the math library here, and I have had an immensely difficult time finding out how to implement a logistic curve fit efficiently. Note that this isn't a full-fledged binary logistic regression (on which I can find lots of information), but rather the fitting of a curve of the form L/(1+a*e^(-b*x)) to a set of data. The fitting method currently in the math library linearizes the equation and doesn't give a very good fit at all, so I am trying to replace it. Does anyone have any helpful pointers to algorithms for this type of problem? I have posted Charlie Patton's code comments from the 48 math library below for those interested (he wrote this originally). I am not 100% certain whether the issue is the linearization, in which case a completely different method is needed, or whether just the L estimator routine needs replacing or improving.
TW -- Although I work for the HP calculator department, the comments and opinions I express here are my own.
Edited: 11 Nov 2011, 12:18 p.m.
Re: Algorithm for fitting a logistic curve? - Eric Smith - 11-11-2011 In my experience, fitting either the logistic function or the tanh function tends to get poor results. I suspect that this is due to how rapidly they go asymptotic, but that's really only a guess on my part. Hopefully someone knowledgeable about numerical analysis can explain how to do it properly.
Re: Algorithm for fitting a logistic curve? - Dieter - 11-11-2011 Quote: Linearizing the equation and then running a simple linear regression is a classic method that usually gives decent results. However, it does not minimize the sum of the squared residuals. How did you determine the quality of the fit here? Quote: As far as I can see, the comments simply refer to the common linearization, which here leads to the transformation ln(A)+B*x = ln(L/y-1). The goal of the algorithm, however, seems to be a different one: to implement a method for estimating the saturation parameter L: "This utility attempts to estimate the saturation value for a logistic equation from sorted statistical samples".
A true least-squares regression, i.e. one that exactly minimizes the sum of the squared residuals, is not trivial. I came across the following document and think it's an interesting read on this subject:
Dieter
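The linearization discussed above can be sketched in a few lines. This is only an illustration, assuming the model y = L/(1+a*exp(-b*x)) from the opening post and assuming that a value for L has already been chosen or estimated by some other means; the function name `linearized_logistic_fit` is made up for this example. Note that it minimizes the squared residuals of the transformed quantity ln(L/y - 1), not of y itself, which is exactly the limitation pointed out in the post.

```python
import math

def linearized_logistic_fit(xs, ys, L):
    """Fit y = L / (1 + a*exp(-b*x)) for a fixed, assumed saturation value L.

    Uses the transformation ln(L/y - 1) = ln(a) - b*x and an ordinary
    linear regression on the transformed points. Requires 0 < y < L.
    """
    zs = [math.log(L / y - 1.0) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # intercept = ln(a), slope = -b
    return math.exp(intercept), -slope
```

Any data point with y >= L makes the logarithm undefined, so the quality of the L estimate directly limits what this approach can do.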
Re: Algorithm for fitting a logistic curve? - MacDonald Phillips - 11-11-2011 Tim,
Don
Re: Algorithm for fitting a logistic curve? - Crawl - 11-11-2011 I can't believe I'm saying this (being a big fan of CAS calculators), but you don't NEED to use a CAS. I use Excel's Solver routine all the time to do least squares fitting to arbitrary function forms.
Re: Algorithm for fitting a logistic curve? - Wes Loewer - 11-13-2011 Tim,
Quote: How critical is speed? Perhaps you've already been down this road, but using a brute-force approach I took the equivalent equation

y = L/(1+a*exp(-k*x))

and applied least-squares principles. I let

E = sum for i=1 to n of (L/(1+a*exp(-k*x_i)) - y_i)^2

and then minimized E by taking the partial derivatives of E with respect to L, a, and k and setting them to zero:

E 'L' DERIV 0 = (and likewise for a and k)

This gives three non-linear equations, which can then be solved numerically for L, a, and k. I tried this with a few sample data points on the 50g (using the SOLVESYS lib to solve) and in Maxima (using MNEWTON to solve) and got matching results, which also matched the FitLogistic command in the computer software GeoGebra. I don't know if you're allowed to use GPL code for your project, but GeoGebra is a GPL program with source code available from http://www.geogebra.org/source/program/. Perhaps you could look and see how they handle it. You don't need the CAS, since the derivatives can be hard-coded, but numerically solving the equations is of course the bottleneck. It might even be faster to use the linearized results as the initial values in the iterative solving process.
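For reference, the three equations obtained by setting the partial derivatives of E to zero can be written out explicitly. The shorthand u_i, D_i, and r_i is introduced here only for readability and does not appear in the post.

```latex
\begin{aligned}
u_i &= e^{-k x_i}, \qquad D_i = 1 + a\,u_i, \qquad r_i = \frac{L}{D_i} - y_i, \\
E &= \sum_{i=1}^{n} r_i^{2}, \\
\frac{\partial E}{\partial L} &= 2\sum_{i=1}^{n} \frac{r_i}{D_i} = 0, \\
\frac{\partial E}{\partial a} &= -2L\sum_{i=1}^{n} \frac{r_i\,u_i}{D_i^{2}} = 0, \\
\frac{\partial E}{\partial k} &= 2La\sum_{i=1}^{n} \frac{r_i\,x_i\,u_i}{D_i^{2}} = 0.
\end{aligned}
```

The first equation is linear in L: for given a and k it yields L = (sum of y_i/D_i) / (sum of 1/D_i^2), so only a and k have to be found iteratively if one chooses to exploit this.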
~wes
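A rough sketch of the iterative least-squares idea follows. It is not the SOLVESYS, MNEWTON, or GeoGebra code mentioned above: instead of applying Newton's method to the three gradient equations, it uses a damped Gauss-Newton (Levenberg-Marquardt style) iteration, which drives the same gradient to zero using only hard-coded first derivatives. The names `model`, `solve3`, and `fit_logistic` are made up for this example, and `start` is assumed to be a reasonable initial guess for (L, a, k), for instance taken from a linearized fit.

```python
import math

def model(x, L, a, k):
    return L / (1.0 + a * math.exp(-k * x))

def sum_sq(xs, ys, p):
    """E = sum of squared residuals for parameters p = (L, a, k)."""
    return sum((model(x, *p) - y) ** 2 for x, y in zip(xs, ys))

def solve3(M, v):
    """Solve the 3x3 system M*d = v by Gaussian elimination with pivoting."""
    A = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, 3):
            f = A[r][c] / A[c][c]
            for j in range(c, 4):
                A[r][j] -= f * A[c][j]
    d = [0.0] * 3
    for r in (2, 1, 0):
        s = A[r][3] - sum(A[r][j] * d[j] for j in range(r + 1, 3))
        d[r] = s / A[r][r]
    return d

def fit_logistic(xs, ys, start, max_iter=200, tol=1e-12):
    """Damped Gauss-Newton iteration minimizing E(L, a, k)."""
    p = list(start)
    lam = 1e-3                      # damping factor
    err = sum_sq(xs, ys, p)
    for _ in range(max_iter):
        L, a, k = p
        JtJ = [[0.0] * 3 for _ in range(3)]
        Jtr = [0.0] * 3
        for x, y in zip(xs, ys):
            u = math.exp(-k * x)
            D = 1.0 + a * u
            r = L / D - y
            # Partial derivatives of the model with respect to L, a, k
            g = (1.0 / D, -L * u / D ** 2, L * a * x * u / D ** 2)
            for i in range(3):
                Jtr[i] += g[i] * r
                for j in range(3):
                    JtJ[i][j] += g[i] * g[j]
        M = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0)
              for j in range(3)] for i in range(3)]
        step = solve3(M, [-v for v in Jtr])
        if max(abs(s) for s in step) < tol * (1.0 + max(abs(v) for v in p)):
            break                   # step is negligible: converged
        trial = [pi + si for pi, si in zip(p, step)]
        trial_err = sum_sq(xs, ys, trial)
        if trial_err < err:         # accept the step, relax the damping
            p, err, lam = trial, trial_err, lam / 10.0
        else:                       # reject the step, increase the damping
            lam *= 10.0
    return tuple(p)
```

Called as `fit_logistic(xs, ys, (L0, a0, k0))`, it returns the refined (L, a, k). The sketch has no safeguards against overflow in `exp` or against a singular system, so a poor starting guess can still make it fail.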