Sensitivity
Sensitivity, also widely known in machine learning as Recall or the True Positive Rate (TPR), is a critical metric used to evaluate a classifier's ability to identify all relevant instances within a specific class. It specifically measures the proportion of actual positive instances that were correctly identified as positive by the model.
1. Calculating Sensitivity
Sensitivity is derived from the confusion matrix. It is built by taking the number of correct positive predictions and dividing it by the total number of actual positive cases in the dataset.
- Formula: Sensitivity (Recall) = TP / (TP + FN).
- Components:
- True Positives (TP): Instances correctly predicted as positive.
- False Negatives (FN): Actual positive instances that the model incorrectly predicted as negative (also known as a Type II error or a "miss").
- Multiclass Extension: For multiclass problems, sensitivity is calculated for a specific class by treating that class as "positive" and all others as "negative".
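The calculation above can be sketched in plain Python. This is a minimal illustration, not a library implementation; the label lists and the `positive` parameter are hypothetical, and the multiclass case follows the one-vs-rest convention described above.

```python
# Sensitivity (recall): TP / (TP + FN), counted directly from label pairs.
def sensitivity(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Binary case: 3 actual positives, the model finds 2 of them.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
print(sensitivity(y_true, y_pred))  # 2/3: one positive was missed (a false negative)

# Multiclass, one-vs-rest: treat class "b" as positive, all other classes as negative.
y_true_mc = ["a", "b", "b", "c", "b"]
y_pred_mc = ["a", "b", "c", "c", "b"]
print(sensitivity(y_true_mc, y_pred_mc, positive="b"))  # also 2/3
```

Note that the false positives on the negative rows (the `0` predicted as `1`) do not affect sensitivity at all; they would show up in precision or specificity instead.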
2. How to Interpret Sensitivity
Sensitivity answers the fundamental question: "Of all the actual positive cases, how many did the model find?"
- In Medical Diagnosis: If a cancer test has high sensitivity, it means the test is very good at correctly identifying people who actually have the disease. A sensitivity of 1.0 (100%) means the model identified all sick individuals, resulting in zero false negatives.
- High Sensitivity vs. Low Sensitivity:
- High Sensitivity: Indicates that the model is "inclusive" and rarely misses positive cases. This is vital when the cost of a false negative is high (e.g., missing a fatal disease or a security threat).
- Low Sensitivity: Suggests the model is failing to detect many positive instances, which can be dangerous in critical applications.
3. Key Trade-offs and Relationships
Sensitivity cannot be viewed in isolation, as it is inextricably linked to other performance metrics:
- The Precision/Recall Trade-off: Increasing sensitivity (recall) usually leads to a decrease in precision (the proportion of positive calls that are actually correct). A model that predicts "positive" for every single case will have a perfect sensitivity of 100%, but its precision will be extremely low because it will also produce many false positives.
- Sensitivity vs. Specificity: While sensitivity measures how well the model detects actual positives, specificity (also called selectivity, or the True Negative Rate) measures how well it detects actual negatives. Often, as you tune a model's threshold to be more sensitive to disease, you inevitably decrease its specificity, meaning you will falsely diagnose more healthy people.
- The ROC Curve: The Receiver Operating Characteristic curve is a standard tool for assessing this trade-off. It plots Sensitivity (TPR) on the y-axis against the False Positive Rate (1 - Specificity) on the x-axis across various decision thresholds.
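The trade-offs above can be made concrete by sweeping a decision threshold over hypothetical classifier scores. The scores and labels below are invented for illustration; the point is only that lowering the threshold raises sensitivity (TPR) while also raising the False Positive Rate and, typically, lowering precision.

```python
# Hypothetical probability scores and ground-truth labels for 8 cases.
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

def rates(threshold):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 1)
    fp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 0)
    tpr = tp / (tp + fn)                          # sensitivity (y-axis of ROC)
    fpr = fp / (fp + tn)                          # 1 - specificity (x-axis of ROC)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return tpr, fpr, precision

for thr in (0.90, 0.50, 0.05):
    tpr, fpr, prec = rates(thr)
    print(f"threshold={thr}: TPR={tpr:.2f}, FPR={fpr:.2f}, precision={prec:.2f}")
```

At the lowest threshold every case is called positive, giving the perfect sensitivity of 1.0 mentioned above, but the FPR also hits 1.0 and precision drops to the base rate. Plotting the (FPR, TPR) pairs across all thresholds traces out the ROC curve.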
4. Critical Importance in Imbalanced Datasets
Sensitivity is often far more informative than Accuracy when dealing with skewed or imbalanced datasets.
- The "Accuracy Trap": In a scenario where only 1 in 1,000 people has a disease, a "dummy" model that always predicts "healthy" will achieve 99.9% accuracy. However, its sensitivity will be 0%, as it fails to detect the single actual positive case.
- Cost-Sensitive Learning: In many real-world applications, false negatives (missing a case) are much costlier than false positives (a false alarm). In such cases, analysts deliberately prioritize high sensitivity, even at the expense of precision or overall accuracy, sometimes by adjusting the decision threshold or using a cost matrix.
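The "accuracy trap" is easy to demonstrate numerically. This sketch uses the 1-in-1,000 disease scenario from above with a dummy model that always predicts "healthy" (class 0):

```python
# Imbalanced dataset: 1 sick patient (1) among 1,000 people.
y_true = [0] * 999 + [1]
y_dummy = [0] * 1000  # dummy model: always predicts "healthy"

accuracy = sum(1 for t, p in zip(y_true, y_dummy) if t == p) / len(y_true)
tp = sum(1 for t, p in zip(y_true, y_dummy) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_dummy) if t == 1 and p == 0)
sensitivity = tp / (tp + fn)

print(accuracy)     # 0.999 -- looks excellent
print(sensitivity)  # 0.0   -- the one sick patient is missed
```

Despite 99.9% accuracy, the model's sensitivity is zero: every actual positive is a false negative, which is exactly the failure mode cost-sensitive learning is designed to penalize.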