Specificity (Selectivity)
Specificity, also known as the True Negative Rate (TNR), is a performance metric that measures a classifier's ability to correctly identify the negative class. It specifically quantifies the proportion of actual negative instances that were predicted as negative by the model.
- Calculating Specificity
Specificity is derived from the confusion matrix. It is calculated by taking the number of correctly identified negative cases and dividing them by the total number of actual negative cases in the dataset.
- Formula: Specificity (TNR) = TN / (TN + FP).
- Components:
- True Negatives (TN): Instances correctly identified by the model as belonging to the negative class.
- False Positives (FP): Actual negative instances that the model incorrectly predicted as positive (also known as Type I errors or "false alarms").
- Multiclass Extension: In multiclass problems, specificity is calculated for a single class by treating it as the "positive" class and all others combined as the "negative" class.
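The formula and its multiclass one-vs-rest extension can be sketched in plain Python (the function names and the small example arrays are illustrative, not from any particular library):

```python
def specificity(y_true, y_pred, negative=0):
    """Binary specificity: TN / (TN + FP), with `negative` as the negative label."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p == negative)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p != negative)
    return tn / (tn + fp)

def specificity_ovr(y_true, y_pred, positive_class):
    """Multiclass specificity for one class: treat `positive_class` as positive
    and every other class combined as negative (one-vs-rest)."""
    tn = sum(1 for t, p in zip(y_true, y_pred)
             if t != positive_class and p != positive_class)
    fp = sum(1 for t, p in zip(y_true, y_pred)
             if t != positive_class and p == positive_class)
    return tn / (tn + fp)

# Toy data: 6 actual negatives, of which 5 are predicted negative (TN)
# and 1 is predicted positive (FP).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0]
print(specificity(y_true, y_pred))  # 5 / (5 + 1) ≈ 0.833
```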
- How to Interpret Specificity
Specificity answers the question: "Of all the instances that are actually negative, how many did the model correctly identify?"
- In Medical Testing: If a diagnostic test has high specificity, it is very good at identifying people who do not have the disease. For example, a cancer test with 97.5% specificity means that 97.5% of healthy people will correctly test negative.
- High vs. Low Specificity:
- High Specificity: Indicates the model is conservative and rarely generates "false alarms".
- Low Specificity: Suggests the model frequently misclassifies negative instances as positive, which can lead to unnecessary costs or distress.
- Key Relationships and Trade-offs
Specificity is mathematically and conceptually linked to other metrics:
- Relationship to False Positive Rate (FPR): The False Positive Rate is defined as 1−Specificity. As specificity increases, the FPR decreases.
- The Sensitivity/Specificity Trade-off: There is often an inverse relationship between sensitivity (recall) and specificity. By adjusting the decision threshold to be more sensitive (to catch more positives), you typically decrease specificity because the model begins to misclassify more negatives as positives.
- The ROC Curve: The Receiver Operating Characteristic curve assesses this trade-off by plotting Sensitivity (True Positive Rate) on the y-axis against 1 - Specificity (False Positive Rate) on the x-axis. A perfect classifier reaches the top-left corner, where both sensitivity and specificity are 1.0.
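A small threshold sweep makes the trade-off and the FPR relationship concrete. This sketch uses made-up classifier scores; each threshold yields one (FPR, sensitivity) point on the ROC curve:

```python
# Hypothetical classifier scores and true labels (1 = positive).
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.8, 0.9]
labels = [0,   0,   0,    1,   0,    1,   1,   1  ]

def sens_spec(threshold):
    """Return (sensitivity, specificity) when predicting positive for score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, labels))
    tn = sum(p == 0 and t == 0 for p, t in zip(preds, labels))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, labels))
    return tp / (tp + fn), tn / (tn + fp)

for thr in (0.3, 0.5, 0.7):
    sensitivity, specificity = sens_spec(thr)
    fpr = 1 - specificity  # the ROC x-axis
    print(f"threshold={thr}: sensitivity={sensitivity}, specificity={specificity}, FPR={fpr}")
# Lowering the threshold (0.7 → 0.3) raises sensitivity from 0.5 to 1.0
# but drops specificity from 1.0 to 0.5 — the inverse relationship in action.
```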
- Importance in Specific Scenarios
Specificity is the primary focus when the cost of a false positive is high.
- Spam Detection: High specificity is vital to ensure that legitimate, important emails (negatives) are not incorrectly filtered into the spam folder (false positives).
- Medical Diagnosis: While sensitivity is often prioritized to avoid missing a disease, high specificity is necessary to prevent healthy patients from undergoing invasive, expensive, or distressing follow-up procedures due to a false positive result.
- Imbalanced Datasets: In scenarios where the negative class is the vast majority (e.g., 99% of transactions are not fraudulent), a model could achieve high Accuracy simply by predicting "negative" for everything. In such cases, specificity remains high, but it must be balanced against sensitivity to ensure the model is actually performing a useful task.
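The imbalanced-dataset pitfall can be demonstrated directly. This sketch builds a hypothetical 99:1 dataset and a degenerate model that always predicts "negative": accuracy and specificity look excellent while sensitivity collapses to zero:

```python
# Hypothetical imbalanced dataset: 99 legitimate transactions (0), 1 fraud (1).
labels = [0] * 99 + [1]
preds  = [0] * 100  # degenerate model: predicts "negative" for everything

tn = sum(p == 0 and t == 0 for p, t in zip(preds, labels))
fp = sum(p == 1 and t == 0 for p, t in zip(preds, labels))
tp = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
fn = sum(p == 0 and t == 1 for p, t in zip(preds, labels))

accuracy    = (tp + tn) / len(labels)  # 0.99 — misleadingly high
specificity = tn / (tn + fp)           # 1.0  — no false alarms, trivially
sensitivity = tp / (tp + fn)           # 0.0  — catches no fraud at all
print(accuracy, specificity, sensitivity)  # 0.99 1.0 0.0
```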