Bug 558582 - Confidence Intervals in Kaplan Meier Tool
Status: RESOLVED OBSOLETE
Product: Gnumeric
Classification: Applications
Component: Statistics Tools
Version: 1.9.x
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Andreas J. Guelzow
Andreas J. Guelzow
Reported: 2008-10-30 19:28 UTC by santam chakraborty
Modified: 2018-05-22 13:30 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
HTML output from SPSS (12.09 KB, text/html)
2008-10-30 19:31 UTC, santam chakraborty
Hazards plot (30.49 KB, image/jpeg)
2008-10-30 19:31 UTC, santam chakraborty
KM curves of three categories superimposed (39.78 KB, image/jpeg)
2008-10-30 19:32 UTC, santam chakraborty
Demo of Mean and Median Survival (4.61 KB, application/x-gnumeric)
2008-11-16 10:35 UTC, santam chakraborty

Description santam chakraborty 2008-10-30 19:28:40 UTC
The recent inclusion of the Kaplan Meier plugin in the Gnumeric 1.93 release is a big step forward for the routine use of Gnumeric by medical professionals looking for an easy way to do survival analysis. Here are some suggestions to improve the analysis.
1. Include other numbers that can be used as censors in addition to 0 and 1.
2. Comparison of two Kaplan Meier curves. Example scenario: I want to analyse the difference in survival between two groups of patients, one treated with medicine A and one treated with medicine B, and I want to know whether there are any significant differences between the two. Using the log rank test we can find out the difference in the mean survival estimates and its statistical significance. The Wikipedia article on the log rank test is here: http://en.wikipedia.org/wiki/Logrank_test
3. Another feature I would like to see is the mean and median survivals with the 95% confidence intervals.
Here is an article that describes the KM test procedure in SPSS in detail:
http://faculty.chass.ncsu.edu/garson/PA765/kaplanmeier.htm
Comment 1 santam chakraborty 2008-10-30 19:30:20 UTC
I am attaching an example HTML output for a set of data analyzed in SPSS which shows a comparative analysis for 3 categories.
Comment 2 santam chakraborty 2008-10-30 19:31:18 UTC
Created attachment 121684
HTML output from SPSS

HTML output from the SPSS program showing the important additions to the test and results
Comment 3 santam chakraborty 2008-10-30 19:31:37 UTC
Created attachment 121685
Hazards plot
Comment 4 santam chakraborty 2008-10-30 19:32:18 UTC
Created attachment 121686
KM curves of three categories superimposed
Comment 5 Andreas J. Guelzow 2008-10-30 19:50:37 UTC
Thanks for the suggestions.

I hope that when we implement your suggestion we can avoid the colour problems in the survival function you attached, where the 3-censored marks have the colour of the 2-curve!
Comment 6 Andreas J. Guelzow 2008-10-30 23:24:54 UTC
Just a correction in the title since this is not a plugin.
Comment 7 Andreas J. Guelzow 2008-10-31 16:24:11 UTC
Santam, regarding your item (1) of including other numbers to be used as censors: which numbers did you have in mind?
Comment 8 santam chakraborty 2008-10-31 17:45:27 UTC
By other numbers I mean that the present dialog box only offers 0 or 1 as censored values. But sometimes a person may code the status (e.g. surviving or not) using values other than 0 or 1; say I use 2 for not surviving and 3 for surviving. SPSS also allows the use of a range of numbers, which is useful when you have outcomes that are not binary: for example, a symptom may be absent, mild, moderate or severe. So you can find the actuarial duration of persistence of moderate to severe symptoms if you can use a range of numbers as the censoring variable.
Comment 9 Andreas J. Guelzow 2008-11-11 23:02:30 UTC
I have just committed changes that allow the censor marks to be a consecutive range of integers. So if absent, mild, moderate or severe is coded as 0,1,2,3 you could use the range 0 to 1 as censor marks and the remainder as deaths.

This should handle request (1).
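The censor-range behaviour described in this comment could be sketched as follows. This is only an illustration in Python, not Gnumeric's actual code; the function name `is_censored`, its parameters and the sample data are hypothetical.

```python
def is_censored(status_codes, censor_low, censor_high):
    """Flag each observation as censored (True) or as a death (False).

    Codes inside the inclusive range [censor_low, censor_high] count as
    censor marks; every other code counts as a death.
    """
    return [censor_low <= code <= censor_high for code in status_codes]


# absent, mild, moderate, severe coded as 0, 1, 2, 3 (made-up sample data)
codes = [0, 1, 2, 3, 2, 0]
censored = is_censored(codes, 0, 1)          # 0 and 1 are censor marks
deaths = [not flag for flag in censored]     # 2 and 3 are treated as deaths
```

With the range 0 to 1 as censor marks, the three observations coded 2 or 3 are the ones treated as deaths.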
Comment 10 Andreas J. Guelzow 2008-11-14 07:38:31 UTC
I have just committed changes that allow multiple groups to be handled simultaneously. This is part of request (2).

From here the log-rank test should be straightforward.
Comment 11 Andreas J. Guelzow 2008-11-14 07:43:01 UTC
Regarding request (3): the given reference describes the SPSS output without explaining the meaning of those terms. So for the implementation of the mean and median survival times it is not useful.
Comment 12 Andreas J. Guelzow 2008-11-15 23:05:16 UTC
I have committed an optional log-rank test to the Kaplan Meier tool. This completes request (2).

This leaves output of the mean and of confidence intervals for both the mean and median.
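For readers unfamiliar with the log-rank test mentioned in this comment, here is a minimal two-group sketch in Python. It follows the textbook form of the test (observed versus expected deaths at each death time, with a hypergeometric variance) and is not taken from the Gnumeric source; the function name and argument layout are assumptions made for illustration.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-sample log-rank test.

    times_*  : survival or follow-up times
    events_* : True for a death, False for a censored observation
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    death_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in death_times:
        # subjects still at risk just before time t, per group
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # deaths occurring exactly at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0  # no information to separate the groups
    chi_square = (observed_a - expected_a) ** 2 / variance
    # upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value
```

A small p-value suggests that the two survival curves differ; identical groups give a statistic of 0 and a p-value of 1.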
Comment 13 santam chakraborty 2008-11-16 10:34:05 UTC
Sorry for the delay in getting back.
This pertains to the feature request for the mean and median survival times.
I am taking this from a statistics book, "A Foundation for Analysis in the Health Sciences" by Wayne W. Daniel.
Median Survival: It is the time at which the survival probability is equal to 0.5. If the output does not contain the value 0.5 exactly, then the median survival is the time at which the survival first drops below 0.5.
Example: If the survival probability is 0.61 at 13 months and 0.35 at 14 months, then the median survival is 14 months.
N.B. If the survival does not drop below 0.5, then the median survival is not reached for the population.
Mean Survival: This is calculated by finding the mean of the survival times in months.
I am attaching a Gnumeric sheet which demonstrates this.
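The median rule described in this comment can be sketched in Python as below. The Kaplan-Meier product-limit estimate is the standard one; the function names and the sample numbers are made up for illustration and are not taken from the attached sheet.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    events[i] is True for a death and False for a censored observation.
    """
    curve = []
    survival = 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival is at or below 0.5, else None.

    None corresponds to "median survival not reached".
    """
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None


# the example from the comment: 0.61 at 13 months, 0.35 at 14 months
example_median = median_survival([(13, 0.61), (14, 0.35)])
```

For the example in the comment this returns 14, and it returns `None` when the curve never drops to 0.5.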
Comment 14 santam chakraborty 2008-11-16 10:35:56 UTC
Created attachment 122778
Demo of Mean and Median Survival

Here is a demo Gnumeric sheet showing the median and the mean survival estimates calculated with the KM plugin.
Comment 15 santam chakraborty 2008-11-16 10:39:19 UTC
In addition, another enhancement that could be added is the hazard rate, which is simply the number of deaths divided by the total survival time. The total survival time is obtained by adding up the individual survival times, as is done when calculating the mean.
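As a rough Python illustration of the hazard rate as described in this comment (deaths divided by the summed survival times); the function name and the sample data are hypothetical.

```python
def hazard_rate(times, events):
    """Number of deaths divided by the total survival time.

    times  : survival or follow-up times, all in the same unit
    events : True for a death, False for a censored observation
    """
    total_time = sum(times)
    deaths = sum(1 for e in events if e)
    return deaths / total_time


# made-up data: three patients followed for 10, 20 and 30 months, two deaths
rate = hazard_rate([10, 20, 30], [True, False, True])
```

The result is a rate per unit of time (here, deaths per patient-month).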
Comment 16 santam chakraborty 2008-11-16 11:15:53 UTC
I found a webpage that gives a formula for the 95% confidence interval of the KM curve:
http://www.hutchon.net/Kaplan-Meier.htm
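One commonly used way to put a 95% confidence interval on a Kaplan-Meier curve is Greenwood's variance formula. The Python sketch below uses that formula as an assumption; it may or may not be the one given on the linked page, and it is not Gnumeric's code. The function name and the `z` parameter are hypothetical.

```python
import math


def km_with_greenwood_ci(times, events, z=1.96):
    """Return [(time, survival, lower, upper)] at each distinct death time.

    Uses Greenwood's formula:
        Var[S(t)] ~ S(t)^2 * sum over death times <= t of d / (n * (n - d))
    where n is the number at risk and d the number of deaths at that time.
    z = 1.96 gives an approximate 95% interval.
    """
    rows = []
    survival = 1.0
    greenwood_sum = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - d / n
        if n > d:  # when everyone at risk dies, survival is 0 and so is the SE
            greenwood_sum += d / (n * (n - d))
        std_error = survival * math.sqrt(greenwood_sum)
        lower = max(0.0, survival - z * std_error)  # clip to [0, 1]
        upper = min(1.0, survival + z * std_error)
        rows.append((t, survival, lower, upper))
    return rows
```

The plain interval is clipped to the range 0 to 1 here; other variants (for example log or log-log transformed intervals) avoid the clipping but are not shown.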
Comment 17 Andreas J. Guelzow 2008-11-16 16:03:44 UTC
Thanks for the information. The median is already being calculated. Most sites I have found differentiate between the mean of the survival times (which you mention above) and the mean survival time. Whenever they give examples, those two values differ.
Comment 18 Andreas J. Guelzow 2008-11-16 16:07:32 UTC
In your example you are also including censored events in your mean survival time. This does not make sense since those events haven't happened.
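To illustrate how the two quantities can differ: a common textbook definition of the mean survival time is the area under the Kaplan-Meier curve up to the largest observed time (a restricted mean). The Python sketch below assumes that definition, which is not stated in this thread, and compares it with the plain mean of the recorded times; all names and data are hypothetical.

```python
def mean_of_times(times):
    """Plain average of all recorded times, censored or not."""
    return sum(times) / len(times)


def restricted_mean_survival(times, events):
    """Area under the Kaplan-Meier step curve from 0 to max(times)."""
    horizon = max(times)
    area = 0.0
    survival = 1.0
    previous = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        area += survival * (t - previous)   # rectangle up to this death time
        n = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - d / n
        previous = t
    area += survival * (horizon - previous)  # tail after the last death
    return area


# made-up data: the second observation is censored at time 2
times = [1, 2, 3, 4]
events = [True, False, True, True]
naive = mean_of_times(times)
restricted = restricted_mean_survival(times, events)
```

With no censoring the two values agree; once censored observations are present they generally differ, which matches the remark in comment 17.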
Comment 19 santam chakraborty 2008-11-16 16:28:12 UTC
Yes, that is true about the mean. The example of the mean that I have given is from the stats book I mentioned. As far as the meaning is concerned, the mean is a very bad measure for survival as it is influenced by the extremes. But sometimes, when the median is not reached, it is the only measure available.
Comment 20 Andreas J. Guelzow 2008-11-16 17:25:02 UTC
I think I will have to dig into some journal articles. There has to be some better way. (And it may become useful that I am a mathematician.)
Comment 21 santam chakraborty 2008-11-16 18:05:02 UTC
I have a couple of PDFs you may want to look into; books, rather.
Comment 22 Andreas J. Guelzow 2008-11-16 19:18:03 UTC
Perhaps you can send the PDFs to me: aguelzow@pyrshep.ca
Comment 23 santam chakraborty 2008-11-16 19:46:51 UTC
Sent. Please check spam if not found in inbox.
Comment 24 Andreas J. Guelzow 2008-11-16 20:43:22 UTC
Thanks, I have received them. I will have a look at them.
Comment 25 GNOME Infrastructure Team 2018-05-22 13:30:22 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gnumeric/issues/109.