Survival Analysis
Time-to-event analysis
What is Survival Analysis?
Survival analysis is a family of statistical methods for analyzing the time until a specific event occurs. Despite the name, it is not limited to medical studies; it is also widely used in business.
You provide data on when events occurred (or whether they have not occurred yet), and the system calculates survival probabilities, estimates the median time to event, and compares different groups.
Usage examples:
- Predict when customers will cancel their subscription (churn)
- Analyze time until equipment failure in industrial settings
- Estimate customer lifetime value
- Compare the effectiveness of medical treatments or interventions
Quick Start
1. Prepare your data in CSV format with time, event status, and (optionally) group columns
2. Upload the file on the upload page
3. Set the parameters (model, variables)
4. Wait for processing to finish (typically 2-4 minutes)
5. Review the survival curves and statistics
How to organize your data
Organize your data in a CSV spreadsheet with at least two columns (time and event status), plus any optional group or covariate columns:
Column 1: Time (Duration)
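How long each subject was observed: the time until the event or, if the event has not occurred, until the observation cutoff. For example: 120 days, 8 months, 3 years. Use the same unit for every row in the column.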
Column 2: Event Occurred (Event)
1 if the event occurred (e.g., the customer canceled), 0 if it has not occurred yet (censored)
Column 3+: Group/Covariates (optional)
Characteristics used to compare groups. For example: plan (basic/premium), region, age
Example of customer churn spreadsheet:
| duration_days | churned | plan |
|---|---|---|
| 365 | 1 | basic |
| 180 | 0 | premium |
| 90 | 1 | basic |
| 540 | 0 | premium |
💡 Censoring: When churned=0, the customer HAS NOT canceled as of the last observation. These rows still carry information (the customer stayed at least that long), and survival analysis is designed to use it rather than discard it.
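If you want to check a file before uploading it, the layout described above can be validated with a short script. This is only a minimal sketch using Python's standard library: the function name `load_survival_rows` is hypothetical and not part of the service, and the column names are the ones from the sample spreadsheet.

```python
import csv
import io

# Sample rows copied from the churn spreadsheet above.
# In practice you would read your own CSV file instead.
SAMPLE_CSV = """duration_days,churned,plan
365,1,basic
180,0,premium
90,1,basic
540,0,premium
"""

def load_survival_rows(text, time_col="duration_days", event_col="churned"):
    """Parse CSV text and check the time and event columns row by row.

    Hypothetical helper for illustration only.
    """
    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        duration = float(row[time_col])
        if duration < 0:
            raise ValueError(f"line {line_no}: negative duration {duration}")
        event = row[event_col].strip()
        if event not in ("0", "1"):
            raise ValueError(f"line {line_no}: event must be 0 or 1, got {event!r}")
        rows.append({**row, time_col: duration, event_col: int(event)})
    return rows

rows = load_survival_rows(SAMPLE_CSV)
censored = sum(1 for r in rows if r["churned"] == 0)
print(f"{len(rows)} rows, {censored} censored")  # prints "4 rows, 2 censored"
```

Censored rows (churned=0) are kept, not filtered out: they are valid observations.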
Analysis settings
Survival Model
Choose the analysis method:
Time Variable
Column name that contains the observed time.
Example:
duration_days, time_to_event, months
Event Variable
Column name that indicates whether the event occurred (1) or not (0).
Example:
churned, event_occurred, died
Group Variable (optional)
Column name to compare survival curves between groups.
Example:
plan, treatment, region
Understanding the results
The analysis returns survival curves and statistics that show the probability of 'surviving' (not having the event) over time.
Survival Curve (Kaplan-Meier)
Y Axis: Probability of Survival
Ranges from 0 to 1 (0% to 100%). Indicates the proportion of subjects that have not yet experienced the event.
Example: 0.7 at 180 days = 70% of customers still active
X Axis: Time
Timeline in days, months, or years.
The curve drops in a step each time an event occurs; censored observations do not lower it
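To show how such a step curve is built, here is a minimal sketch of the standard Kaplan-Meier product-limit estimator in plain Python. It is an illustration of the textbook method, not the service's actual implementation, and the function name `kaplan_meier` is an assumption. The input is the four-row sample spreadsheet.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    Illustrative sketch of the Kaplan-Meier product-limit estimator.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)   # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        event_count = 0   # events observed exactly at time t
        removed = 0       # events + censored subjects leaving at time t
        while i < len(data) and data[i][0] == t:
            event_count += data[i][1]
            removed += 1
            i += 1
        if event_count > 0:
            # Multiply by the fraction of at-risk subjects that got past t.
            survival *= 1 - event_count / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= removed
    return curve

durations = [365, 180, 90, 540]
events = [1, 0, 1, 0]
curve = kaplan_meier(durations, events)
print(curve)  # prints [(90, 0.75), (365, 0.375)]
```

Reading the result: after day 90 the estimated survival is 0.75 (1 event among 4 at risk). The customer censored at day 180 leaves the risk set, so the event at day 365 counts as 1 out of 2 and the curve drops to 0.375.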
Median Survival Time
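The time at which the survival probability first drops to 50% or below, i.e., the point by which half of the subjects have experienced the event. If the curve never reaches 50%, the median cannot be estimated from the data.
Example: Median of 240 days = half of the customers cancel within about 8 months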
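As a sketch of how the median is read off a survival curve, the hypothetical helper below scans the curve for the first time at which survival is 0.5 or lower. The curve values are hand-computed Kaplan-Meier estimates for the four-row sample spreadsheet; the function name `median_survival` is an assumption for illustration.

```python
def median_survival(curve):
    """Return the first time at which survival is <= 0.5, or None.

    `curve` is a list of (time, survival_probability) pairs sorted by time.
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    # The curve never reached 50%: the median is not estimable.
    return None

# Kaplan-Meier estimate for the sample churn spreadsheet.
sample_curve = [(90, 0.75), (365, 0.375)]
print(median_survival(sample_curve))  # prints 365

# A curve that stays above 50% has no estimable median.
print(median_survival([(30, 0.9), (60, 0.8)]))  # prints None
```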
Log-Rank Test (group comparison)
Tests whether the survival curves of different groups differ by more than chance alone would explain.
p < 0.05 = The difference in survival between groups is statistically significant at the conventional 5% level
⚠️ Practical interpretation: If the 'premium' group has a curve always above the 'basic', it means that premium customers survive longer (lower churn). Use this to make strategic decisions!
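For readers curious about what the log-rank comparison computes, here is a minimal two-group sketch in plain Python. It follows the textbook form of the test (observed versus expected events, chi-square with 1 degree of freedom) and is not necessarily how the service implements it. The function name `logrank_test` and the data are made up for illustration.

```python
import math

def logrank_test(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value)."""
    a = list(zip(durations_a, events_a))
    b = list(zip(durations_b, events_b))
    event_times = sorted({t for t, e in a + b if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for d, _ in a if d >= t)   # at risk in group A
        n_b = sum(1 for d, _ in b if d >= t)   # at risk in group B
        d_a = sum(1 for d, e in a if d == t and e == 1)
        d_b = sum(1 for d, e in b if d == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n              # expected under "no difference"
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Hypothetical data: basic customers cancel early, premium customers stay longer.
basic_days, basic_churned = [30, 60, 90, 120, 150], [1, 1, 1, 1, 1]
premium_days, premium_churned = [200, 250, 300, 350, 400], [1, 0, 1, 0, 1]

chi_square, p_value = logrank_test(basic_days, basic_churned,
                                   premium_days, premium_churned)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

With this made-up data the p-value falls below 0.05, which matches the visual impression that the premium curve sits above the basic one.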
Need help? Contact us: contato@grabatus.com