Statistical Question

Namir (Posting Freak, Posts: 2,247, Threads: 200, Joined: Jun 2005)
12-18-2012, 02:09 PM

Hi All,

Does anyone know of a statistic that can assess the goodness of out-of-sample predictions? For example, I have N data points which I use to fit a model with p parameters. I then use that model to predict the values of M further data points and compare these predictions with the actual values for those M points. Aside from chi-square, do you know of any other statistic that can be used to measure the goodness of out-of-sample predictions?

Namir

Walter B (Posting Freak, Posts: 4,587, Threads: 105, Joined: Jul 2005)
12-18-2012, 03:34 PM

Namir,

IIRC, usually the model predicts certain values and the data points scatter around these predicted ('expected') values. If the scatter follows a normal distribution (or something which can be transformed into such a distribution), then confidence intervals can be calculated around the model curve. Typically these intervals are not rectangular, even in a coordinate system where the model corresponds to a straight line. Within these intervals the data points shall be found with said confidence. If that's what you're looking for, then I'll have to dig in my old files to find the exact way that's done - it was definitely not chi-square. I don't remember anything else alike.

d:-)

Edited: 18 Dec 2012, 3:35 p.m.

Nick_S (Member, Posts: 125, Threads: 9, Joined: Oct 2011)
12-19-2012, 03:40 AM

If you are looking for an overall summary of how well your model fits the observed data, then an alternative is to work directly with the likelihood ratio statistic, sometimes expressed on an additive scale as -2 ln likelihood. For categorical data this can be simply calculated as the G^2 statistic.

However, if your interest is more in identifying outlying individual observations, then a calculation of residual values for each can be useful (e.g., Pearson residuals, or deviance residuals), particularly if calibrated as studentized values.
Pearson residuals are the components that make up the Pearson X^2 statistic, while deviance residuals combine to form -2 log likelihood, known as the deviance.

Finally, one can reduce over-fitting of a model by using a training sample of observations to estimate the model and then a separate testing set to evaluate its fit (which seems to be along the lines of what you have described). The Prediction Error Sum of Squares (PRESS) is a summary measure of the fit of a regression model to a set of observations that were not themselves used in estimating the model: it is the sum of the squared prediction residuals for those observations.

Nick

Edited: 23 Dec 2012, 6:42 a.m.

Bruce Larrabee (Member, Posts: 80, Threads: 14, Joined: Sep 2010)
12-22-2012, 03:39 AM

I'm curious, Namir: do you want to predict values outside of your sample data set, as opposed to inside it?
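The train/test scheme Nick describes can be sketched in a few lines. The data, the straight-line model, and the N = 20 / M = 10 split below are illustrative assumptions, not anything from the thread; the point is just how the prediction residuals on the held-out points combine into the PRESS statistic and a root-mean-squared prediction error.

```python
# Sketch of out-of-sample fit assessment via the Prediction Error
# Sum of Squares (PRESS). Simulated data and a linear model are
# assumed purely for illustration.
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: y = 2 + 3x + Gaussian noise
x = rng.uniform(0, 10, size=30)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=30)

# Split into N = 20 training points and M = 10 testing points
x_train, y_train = x[:20], y[:20]
x_test, y_test = x[20:], y[20:]

# Fit a straight line (p = 2 parameters) by least squares
coeffs = np.polyfit(x_train, y_train, deg=1)

# Prediction residuals on the held-out observations
y_pred = np.polyval(coeffs, x_test)
resid = y_test - y_pred

press = np.sum(resid ** 2)             # Prediction Error Sum of Squares
rmsep = np.sqrt(press / len(resid))    # RMS error of prediction

print(f"PRESS = {press:.3f}")
print(f"RMSEP = {rmsep:.3f}")
```

Dividing PRESS by M and taking the square root gives an error on the same scale as y, which is often easier to interpret than the raw sum of squares when comparing candidate models.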

