Bug 453765 - Adding Kaplan Meier test to the statistical tests list
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: Analytics
Version: 1.7.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Andreas J. Guelzow
QA Contact: Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2007-07-04 19:07 UTC by santam chakraborty
Modified: 2008-10-30 19:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
HTML output of SPSS output (12.09 KB, text/html)
2008-10-30 18:56 UTC, santam chakraborty
SPSS output chart (31.93 KB, image/jpeg)
2008-10-30 18:57 UTC, santam chakraborty
Gnumeric sheet with the data and the curve (5.61 KB, application/x-gnumeric)
2008-10-30 18:58 UTC, santam chakraborty

Description santam chakraborty 2007-07-04 19:07:52 UTC
Introduction: The Kaplan-Meier method (or the Kaplan-Meier product-limit estimator) is a nonparametric (actuarial) technique for estimating time-related events (the survivorship function). Ordinarily it is used to analyze death as an outcome. It may also be used effectively to analyze time to an endpoint, such as remission.

It is a univariate analysis and is an appropriate starting technique. It estimates the proportion of individuals in remission at a particular time, starting from the initiation of active date (time zero). It is especially applicable when length of follow-up varies from customer to customer, and it takes into account those customers lost to follow-up or not yet in remission at the end of the study (censored customers, assuming the censoring is non-informative). It is therefore the instrument of choice in evaluating remissions following losing a customer. Since the estimated survival distribution for the cohort has some degree of uncertainty, 95% confidence intervals may be calculated for each survival probability on the “estimated” curve.
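
To make the calculation concrete, here is a minimal Python sketch of the product-limit estimate with 95% confidence intervals. It is an illustration only, not how Gnumeric or SPSS computes it: the function name `kaplan_meier`, the use of Greenwood's formula for the standard error, and the plain normal-approximation interval clipped to [0, 1] are all assumptions made for this example.

```python
import math


def kaplan_meier(times, events, z=1.96):
    """Product-limit estimate (hypothetical helper, for illustration).

    times  -- follow-up time of each subject
    events -- 1 if the event occurred at that time, 0 if censored
    z      -- normal quantile for the interval (1.96 gives roughly 95%)

    Returns one row per distinct event time:
    (time, number at risk, events, survival, ci_low, ci_high).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0  # running sum used by Greenwood's variance formula
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # events observed exactly at time t
        c = 0  # subjects censored exactly at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                d += 1
            else:
                c += 1
            i += 1
        if d > 0:
            survival *= 1.0 - d / at_risk
            if at_risk > d:  # avoid dividing by zero when everyone fails
                greenwood_sum += d / (at_risk * (at_risk - d))
            se = survival * math.sqrt(greenwood_sum)
            low = max(0.0, survival - z * se)
            high = min(1.0, survival + z * se)
            rows.append((t, at_risk, d, survival, low, high))
        # Censored subjects leave the risk set without changing the estimate.
        at_risk -= d + c
    return rows


if __name__ == "__main__":
    for row in kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]):
        print(row)
```

Censored subjects only shrink the number at risk; the estimate itself changes only at times where an event is observed, which is why the curve is a step function.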

A variety of tests (log-rank, Wilcoxon and Gehan) may be used to compare two or more Kaplan-Meier “curves” under certain well-defined circumstances. Median remission time (the time when 50% of the cohort has reached remission), as well as quantities such as three-, five-, and ten-year probability of remission, can also be generated from the Kaplan-Meier analysis, provided there has been sufficient follow-up of subjects.
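
As a rough illustration of the comparison and the median mentioned above, the following Python sketch computes a two-group log-rank statistic and reads a median off an already-computed curve. The function names `logrank_two_groups` and `median_time` are hypothetical, only the two-group case is covered, and the p-value assumes the usual chi-square approximation with one degree of freedom.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (illustrative sketch, not a library API).

    Returns (chi_square, p_value), with the p-value taken from a
    chi-square distribution with 1 degree of freedom.
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n = n_a + n_b
        d = d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


def median_time(curve):
    """First time at which survival drops to 0.5 or below.

    curve -- list of (time, survival) pairs in increasing time order.
    Returns None when the curve never reaches 0.5, i.e. when follow-up
    was not long enough to estimate the median.
    """
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return None
```

The `None` return of `median_time` reflects the caveat about sufficient follow-up: if fewer than half of the cohort has reached the endpoint, the median cannot be read from the curve.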

Need: The K-M test is one of the most commonly used statistical tests in the medical field, in almost all clinical branches. It allows calculation of survival estimates. At present several commercial software packages offer this test, but they are expensive. The free and open-source alternative is R; sadly, it is CLI based and difficult to use for medical professionals who have limited computing knowledge. To my current knowledge there are no other free alternatives for this test. Gnumeric could provide this test in its list of statistical plugins, making it easy for the medical profession to use. I am sure the acceptance of this software will increase remarkably if this statistical test is implemented.

What the test does:
Please see a nice demo on the following website:
http://www.medcalc.be/manual/kaplan-meier.php

The data is presented in a columnar format. The first column holds the time for the event to occur (e.g. disease recurrence). The second column contains the descriptor for the event (e.g. disease recurrence = 1, disease free = 0), and an optional third column contains the grouping variable (e.g. male = 1, female = 2, and so on). The test consists of two parts: a mathematical equation which derives the survival times based upon the time of "censoring", and a graphical component which shows the curves, typically in a step-ladder pattern.
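
The layout described above could be turned into step-ladder curves roughly as in the Python sketch below. This is only an assumed illustration: the function name `km_step_points`, the `"all"` label used when no grouping column is given, and the choice to return plain (x, y) points are not taken from Gnumeric, SPSS, or MedCalc.

```python
from collections import defaultdict


def km_step_points(rows):
    """Build step-curve coordinates from columnar data (illustrative).

    rows -- iterable of (time, status) or (time, status, group), where
            status is 1 for the event and 0 for a censored subject.

    Returns {group: [(x, y), ...]} where consecutive points trace the
    horizontal and vertical segments of the survival step curve.
    """
    groups = defaultdict(list)
    for row in rows:
        time, status = row[0], row[1]
        group = row[2] if len(row) > 2 else "all"  # hypothetical default label
        groups[group].append((time, status))

    curves = {}
    for group, subjects in groups.items():
        subjects.sort()
        at_risk = len(subjects)
        survival = 1.0
        points = [(0, 1.0)]  # every curve starts at time zero with S = 1
        i = 0
        while i < len(subjects):
            t = subjects[i][0]
            events = 0
            removed = 0
            while i < len(subjects) and subjects[i][0] == t:
                events += subjects[i][1]
                removed += 1
                i += 1
            if events:
                points.append((t, survival))  # end of the horizontal run
                survival *= 1 - events / at_risk
                points.append((t, survival))  # vertical drop at the event
            at_risk -= removed
        curves[group] = points
    return curves


if __name__ == "__main__":
    data = [(2, 1, 1), (4, 0, 1), (6, 1, 1), (3, 1, 2), (5, 1, 2)]
    for group, points in km_step_points(data).items():
        print(group, points)
```

Each event time contributes two points with the same x value, so drawing straight lines between consecutive points gives the step-ladder shape without needing a dedicated step chart type.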

How I can help:
I can't help with the coding part, but I can help with the equations, and I have several examples of SPSS output for various scenarios that I can share, which include the calculations and the graph.

I can be contacted at drsantam (at the rate of) gmail dot com
Comment 1 Andreas J. Guelzow 2007-07-05 06:43:36 UTC
Is there an online description of this statistical analysis somewhere, preferably a description written for mathematicians?
Comment 2 santam chakraborty 2007-07-05 15:55:02 UTC
Here are some additional online resources that may be of help:
http://en.wikipedia.org/wiki/Kaplan-Meier_estimator
http://support.sas.com/rnd/app/da/new/802ce/stat/chap6/sect10.htm
http://www.turkishrespiratoryjournal.com/pdf.php3?id=281
http://www.weibull.com/LifeDataWeb/nonparametric_analysis.htm
http://www.graphpad.com/www/book/survive.htm
http://www.cs.huji.ac.il/~exp/ex5.html

These are some of the online links that I found after some googling. I don't know whether they will be adequate for a mathematician, as I am not one, but I would be willing to search more if asked.
Thanks for the interest Andreas
Comment 3 Andreas J. Guelzow 2008-10-11 20:41:02 UTC
Creating Kaplan-Meier curves should be easy since we already have a graph type that matches. Calculating the probabilities also seems to be straightforward.
Comment 4 Andreas J. Guelzow 2008-10-16 06:26:21 UTC
I have added a Kaplan-Meier tool. Please try it out and file new enhancement requests if there is anything important that I missed. Since I do not use Kaplan-Meier estimates myself, it is a little tricky to figure out what the important features are.

This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.

This will be in the 1.9.3 release (possibly only 1.9.4 on Windows)
Comment 5 santam chakraborty 2008-10-30 18:56:44 UTC
Created attachment 121676
HTML output of SPSS output

HTML output of the SPSS output for the Kaplan Meier test
Comment 6 santam chakraborty 2008-10-30 18:57:11 UTC
Created attachment 121677
SPSS output chart
Comment 7 santam chakraborty 2008-10-30 18:58:05 UTC
Created attachment 121678
Gnumeric sheet with the data and the curve
Comment 8 santam chakraborty 2008-10-30 19:00:41 UTC
Thanks for the test. It works perfectly. I have tested it in Gnumeric 1.9.3 on Mandriva. Sorry for the delay; I had to install the cooker version, and that required a bit of googling around.
Anyway, the test works perfectly, and I have tested it against test data in SPSS. I have attached the SPSS output for the survival curves and, for comparison, the Gnumeric results, which are mathematically and graphically correct.
Comment 9 santam chakraborty 2008-10-30 19:38:06 UTC
Bug 558582 – Additional Enhancements in the KM plugin filed.
http://bugzilla.gnome.org/show_bug.cgi?id=558582