Guide
Data types, test selection, p values, and 95% confidence intervals.
This page explains the concepts used by the dashboard so the results are easier to interpret.
Download companion book
BASIC BIOSTATISTICS is a colorful PDF handbook with index pages and short chapters on descriptive statistics, analytic statistics, and sample size calculations. Author: Manjunath Kulkarni, Nephrologist, Mangaluru, India.
Types of data
Choosing the test starts with understanding the variable type.
- Nominal data are categories without numeric order, such as blood group, dialysis modality, or outcome status.
- Ordinal data have order but not equal spacing, such as symptom grades or staged severity labels.
- Continuous data are numeric measurements such as hemoglobin, creatinine, age, BMI, or blood pressure.
Nominal tests
Use contingency tables when comparing frequencies across categories.
- The chi-square test assesses whether two categorical variables are independent.
- For a 2x2 table with small expected counts (a common rule of thumb is any expected cell count below 5), Fisher's exact test is preferable or should be reviewed alongside chi-square.
- For 2x2 tables, the odds ratio is a clinically useful point estimate of the association, and its 95% confidence interval shows how precisely it is estimated.
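The three ideas above can be sketched for a single 2x2 table using only the Python standard library. This is a minimal illustration, not the dashboard's own code: the function name `analyze_2x2`, the cell labels `a`, `b`, `c`, `d`, and the example counts are all hypothetical. The chi-square value is computed without a continuity correction, and the interval uses the log-scale (Woolf) approximation, which requires every cell to be greater than zero.

```python
import math

def analyze_2x2(a, b, c, d):
    """Summarize the 2x2 table [[a, b], [c, d]] (rows: groups, columns: outcome yes/no)."""
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    observed = ((a, b), (c, d))
    expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

    # Pearson chi-square, no continuity correction, 1 degree of freedom.
    chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    # With 1 df, chi2 is a squared standard normal, so p = erfc(sqrt(chi2 / 2)).
    p_chi2 = math.erfc(math.sqrt(chi2 / 2))

    # Fisher exact (two-sided): add up the hypergeometric probabilities of all
    # tables with the same margins that are no more likely than the observed one.
    def table_prob(x):
        return (math.comb(row[0], x) * math.comb(row[1], col[0] - x)
                / math.comb(n, col[0]))

    p_obs = table_prob(a)
    lowest, highest = max(0, col[0] - row[1]), min(row[0], col[0])
    p_fisher = sum(table_prob(x) for x in range(lowest, highest + 1)
                   if table_prob(x) <= p_obs * (1 + 1e-9))

    # Odds ratio with a log-scale (Woolf) 95% confidence interval.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci95 = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
            math.exp(math.log(odds_ratio) + 1.96 * se_log_or))

    return {
        "min_expected": min(min(r) for r in expected),
        "chi2": chi2,
        "p_chi2": p_chi2,
        "p_fisher": p_fisher,
        "odds_ratio": odds_ratio,
        "ci95": ci95,
    }

# Hypothetical counts, for illustration only: 12/20 events in one group, 5/20 in the other.
result_2x2 = analyze_2x2(12, 8, 5, 15)
print(result_2x2)
```

Checking `min_expected` first is one way to decide whether the Fisher exact p value deserves more weight than the chi-square p value.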
Continuous tests
Whether each group is approximately normal helps determine whether a parametric or non-parametric test is more suitable.
- Shapiro-style testing and QQ plots help judge whether a group is approximately normal.
- Student's t test compares the means of two approximately normal groups and yields a mean difference with a 95% confidence interval.
- The Mann-Whitney U test compares two groups when a rank-based non-parametric approach is more suitable, for example when the data are clearly skewed.
- ANOVA compares three or more approximately normal groups, while Kruskal-Wallis is the non-parametric alternative.
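To show how a Student's t test produces both a p value and a 95% confidence interval for the mean difference, here is a standard-library sketch. It assumes two independent groups with equal variances (the pooled form of the test). The helper names and the sample values are hypothetical, and the t distribution is handled by simple numerical integration rather than a statistics library, so treat it as an illustration only.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p value for Student's t, using Simpson's rule on the density."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # probability mass between 0 and |t|
    return min(1.0, max(0.0, 1 - 2 * central))

def t_critical(df, alpha=0.05):
    """Find the value whose two-sided tail probability equals alpha (bisection)."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_two_sided_p(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2

def student_t_test(a, b):
    """Pooled-variance t test for two independent groups."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    diff = mean_a - mean_b
    t = diff / se
    margin = t_critical(df) * se
    return {"mean_difference": diff, "t": t, "df": df,
            "p": t_two_sided_p(t, df), "ci95": (diff - margin, diff + margin)}

# Hypothetical hemoglobin values (g/dL) for two groups, for illustration only.
group_a = [11.2, 12.1, 10.8, 11.9, 12.4, 11.5]
group_b = [9.8, 10.5, 10.1, 11.0, 9.6, 10.3]
t_result = student_t_test(group_a, group_b)
print(t_result)
```

In routine work a tested library routine is the safer choice; SciPy's `scipy.stats.ttest_ind`, `mannwhitneyu`, `f_oneway`, and `kruskal` cover the tests listed above.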
Correlation and regression
Use paired numeric data when asking whether two variables move together.
- Pearson correlation measures linear association.
- Spearman correlation measures rank-based monotonic association.
- Simple linear regression estimates the expected change in the outcome for a one-unit change in the predictor.
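The three measures above can be computed by hand for a small set of paired values. The sketch below uses only the standard library. The function names and the data are hypothetical, and Spearman correlation is obtained as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
import math

def pearson(x, y):
    """Pearson correlation: linear association between paired values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def simple_linear_regression(x, y):
    """Least-squares slope and intercept for outcome y on predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# Hypothetical paired measurements, for illustration only.
predictor = [1, 2, 3, 4, 5]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]
r_pearson = pearson(predictor, outcome)
r_spearman = spearman(predictor, outcome)
slope, intercept = simple_linear_regression(predictor, outcome)
print(r_pearson, r_spearman, slope, intercept)
```

Here the slope is the expected change in the outcome for each one-unit increase in the predictor.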
What a p value means
Interpret p values carefully and in context.
- A p value is the probability of seeing data this extreme, or more extreme, if the null hypothesis were true. It is not the probability that the null hypothesis is true.
- A small p value suggests the observed data are less compatible with the null hypothesis.
- A large p value does not prove there is no effect; it may reflect limited sample size or variability.
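One way to make the definition of a p value concrete is an exact permutation test. If the null hypothesis of no group difference were true, the group labels would be interchangeable, so every possible relabelling of the data is equally likely. The p value is then the share of relabellings whose mean difference is at least as extreme as the one observed. This is a sketch with hypothetical function names and made-up values, and it is not necessarily the method the dashboard uses.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(a, b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of assigning len(a) of the pooled values to group A.
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        difference = abs(mean(group_a) - mean(group_b))
        # Small tolerance so that exact ties count as "this extreme".
        if difference >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical measurements for two small groups, for illustration only.
sample_a = [11.2, 12.1, 10.8, 11.9, 12.4]
sample_b = [9.8, 10.5, 10.1, 11.0, 9.6]
p_value = permutation_p_value(sample_a, sample_b)
print(p_value)
```

Full enumeration is only practical for small groups, because the number of relabellings grows very quickly with sample size.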
What a 95% confidence interval means
Confidence intervals provide more context than p values alone.
- A 95% confidence interval gives a range of plausible values for the population effect under the model used.
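- For ratios such as odds ratios, an interval that includes 1 means the data are compatible with no association, because a ratio of 1 corresponds to equal odds in both groups.
- For differences such as a mean difference, an interval that includes 0 means the data are compatible with no difference.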
- Narrower confidence intervals suggest greater precision.
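The "95%" refers to the long-run behaviour of the method: if the same study were repeated many times, about 95% of the intervals built this way would contain the true value. The simulation below sketches this for the mean of a normal variable. To keep the example short it assumes the population standard deviation is known, so the interval uses the fixed multiplier 1.96. The function names and all parameter values are hypothetical.

```python
import math
import random

def ci_for_mean(sample, sigma, z=1.96):
    """95% confidence interval for a mean when the population SD is known."""
    n = len(sample)
    centre = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return centre - half_width, centre + half_width

def coverage(true_mean, sigma, n, repeats, seed=0):
    """Fraction of simulated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = ci_for_mean(sample, sigma)
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats

def ci_width(sigma, n, z=1.96):
    """Full width of the interval; it shrinks as the sample size grows."""
    return 2 * z * sigma / math.sqrt(n)

# Hypothetical population: mean 12.0, SD 1.5, studies of 25 subjects each.
observed_coverage = coverage(true_mean=12.0, sigma=1.5, n=25, repeats=4000)
width_n25 = ci_width(sigma=1.5, n=25)
width_n100 = ci_width(sigma=1.5, n=100)
print(observed_coverage, width_n25, width_n100)
```

Quadrupling the sample size halves the interval width in this setting, which is why larger studies give more precise estimates.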