How to Choose an Appropriate Statistical Test
Inferential statistics involves using statistical theory to test hypotheses. Knowing when to use what statistical test can be a frustrating part of the dissertation journey. This post highlights important considerations when developing a statistical analysis plan for your dissertation or thesis.
Before we dive into the details, let’s have a short recap of what a research hypothesis is.
With inferential statistics, you can test hypotheses about the population under study by:
- Examining relationships
- Comparing groups
- Building predictive or explanatory models
- And more…
Every statistical test evaluates a hypothesis. The main purpose of hypothesis testing is to choose between two competing hypotheses (i.e., null and alternative) about the value of a population parameter. Statistical tests assume a null hypothesis of no relationship or no difference between groups before determining whether the sample (observed) data fall outside of the range of values predicted by the null hypothesis. If the sample data fall outside of that range, the null hypothesis is rejected in favor of the alternative hypothesis. Note that this does not prove the alternative hypothesis; it only means the data are unlikely under the null hypothesis.
Null hypothesis (Ho) | Alternative hypothesis (Ha) |
A statement that maintains that there is no difference or relationship between groups or variables. | A statement that maintains that there are differences between groups or that a relationship is present between variables. |
No difference/no effect/no relationship. | A difference/effect/relationship is present. |
How do the research questions relate to the hypotheses?
A research question is a question that the researcher is gathering evidence to try to answer. The research question is then rephrased into two mutually exclusive statements that can be empirically tested (i.e., hypotheses).
Below are some examples of how to rephrase your research questions into null and alternative hypotheses to perform statistical tests.
Research Question | Null hypothesis (Ho) | Alternative hypothesis (Ha) |
Do men and women perform the same in facial recognition tasks? | Men and women perform the same in facial recognition tasks. | Men and women perform differently in facial recognition tasks. |
Are people suffering from PTSD more violent than people from the general population? | People suffering from PTSD and people from the general population display similar levels of violence. | People suffering from PTSD display different levels of violence than people from the general population. |
How do statistical tests work?
There are five main steps to consider when performing statistical tests. Luckily, statistical software will perform steps 3 and 4 for you, and it usually applies the conventional significance level from step 2 by default. All you need to focus on is ensuring that your hypotheses are testable and that your data have been collected and cleaned correctly.
Step 1: State the null (Ho) and alternative (Ha) hypotheses
Step 2: Determine the significance level, alpha (usually 0.05)
Step 3: Calculate the test statistic
Step 4: Calculate the p-value
Step 5: Use the p-value to make a decision and draw conclusions
The p-value is the probability of observing a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. This is the value that is compared with the significance level (alpha) to decide whether to reject or fail to reject the null hypothesis.
How do you use the p-value to make a decision?
The general rule of thumb is:
- You will reject the null hypothesis (Ho) when the p-value is below the significance level, i.e., p < 0.05 (a statistically significant difference/effect/relationship was found)
- You will fail to reject the null hypothesis when the p-value is at or above the significance level, i.e., p ≥ 0.05 (a statistically significant difference/effect/relationship was not found)
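The five steps and the decision rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are hypothetical, the function names `permutation_p_value` and `decide` are made up for this example, and the test used is an exact permutation test on the difference in means (chosen because it needs nothing beyond the standard library), not the t-test that statistical software would typically run for this kind of comparison.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    # Step 3: the test statistic is the absolute difference in group means.
    observed = abs(mean(group_a) - mean(group_b))

    # Step 4: relabel the pooled scores in every possible way and count how
    # often the relabelled difference is at least as extreme as the observed one.
    extreme = 0
    arrangements = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        arrangements += 1
    return extreme / arrangements


def decide(p_value, alpha=0.05):
    """Step 5: compare the p-value with the significance level."""
    return "reject Ho" if p_value < alpha else "fail to reject Ho"


# Step 1: Ho = the two groups perform the same; Ha = they perform differently.
# Step 2: significance level.
alpha = 0.05

# Hypothetical facial recognition scores (illustrative data only).
men = [12, 15, 11, 14, 13]
women = [18, 17, 16, 19, 20]

p = permutation_p_value(men, women)
print(f"p-value = {p:.4f}: {decide(p, alpha)}")
```

With only five scores per group there are 252 possible relabellings, so the exact p-value can be enumerated directly; for larger samples this approach becomes slow and a software-provided test is the practical choice.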
How do you know which statistical test to use?
All statistical tests require that data meet certain requirements (assumptions) for the test to be considered appropriate and the results reliable.
Parametric statistical methods usually have stricter requirements than their nonparametric counterparts. However, when those requirements are met, they generally have more statistical power than the equivalent nonparametric statistical tests and can make stronger inferences from the data.
Parametric Statistical Methods
Parametric statistical tests usually focus on the mean and require the data to meet the following conditions:
- Data is normally distributed (bell-shaped curve)
- Groups being compared have equal variances (homogeneity of variances)
- Outcome or dependent variable is measured as a continuous variable
Nonparametric Statistical Methods
Nonparametric statistical tests are usually considered in cases where the data fails to meet the requirements for parametric statistical tests. Generally speaking, nonparametric statistical tests are more robust than the equivalent parametric statistical tests, are valid for a broader range of situations, and can be used for small samples.
Here are some additional considerations to keep in mind regarding nonparametric statistical tests:
- These tests are appropriate to use when the data is ordinal and not strictly numerical.
- They typically work with the medians or ranks of the groups, or with the shape of the distribution, rather than with the mean.
- They can be used as an alternative when dealing with small samples, because small sample sizes make it difficult to assess whether the data are normally distributed (a parametric assumption).
So then, why not just use non-parametric statistical tests?
For one, nonparametric statistical tests are less powerful when the parametric assumptions do hold. This means that larger effects/differences would need to be present in your sample for the test to detect them. In addition, the results of nonparametric tests can be more challenging to interpret.
The table below displays the nonparametric alternatives to a few commonly used parametric statistical tests.
Parametric statistical test | Nonparametric alternative |
One sample t-test | One sample Wilcoxon Signed Rank test |
Paired samples t-test | Wilcoxon Signed Rank test |
Independent samples t-test | Mann-Whitney U test |
ANOVA | Kruskal-Wallis test |
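To make one of these alternatives concrete, here is a minimal sketch of the Mann-Whitney U test written with the Python standard library only. Everything in it is an assumption made for illustration: the function name `mann_whitney_u` is hypothetical, the ratings are invented, and the p-value uses the large-sample normal approximation without a tie correction, which is rough for small samples or heavily tied data. In practice you would rely on your statistical software (for example, `scipy.stats.mannwhitneyu` in Python).

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using a normal approximation.

    Assumption: no correction for ties is applied to the variance,
    so treat the p-value as approximate.
    """
    n1, n2 = len(x), len(y)

    # U for group x: count the pairs in which x beats y; ties count one half.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = n1 * n2 - u_x
    u = min(u_x, u_y)

    # Mean and standard deviation of U under the null hypothesis.
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

    # u is the smaller of the two U values, so z is never positive.
    z = (u - mu) / sigma
    p_value = min(1.0, 2 * NormalDist().cdf(z))
    return u, p_value


# Hypothetical ordinal ratings (1 = lowest, 7 = highest) from two independent groups.
group_a = [3, 4, 2, 5, 4, 3, 1, 2]
group_b = [5, 6, 4, 7, 6, 5, 7, 6]

u_stat, p = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, approximate p-value = {p:.4f}")
```

Because the test only compares how the scores rank against each other, it does not need the ratings to be normally distributed, which is why it pairs with the independent samples t-test in the table above.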
In addition to checking the parametric assumptions, understanding your data types is crucial. Which statistical methods are appropriate for testing your hypotheses depends on the type of data that you’re working with. The different types of data are briefly discussed below.
Data types
Quantitative variables represent numerical measurements and are broken down into two types:
- Discrete – a quantitative variable that can take only distinct, countable values, e.g., number of children in a class and number of cars owned.
- Continuous – a quantitative variable that can take any value within a range, e.g., weight and height.
Categorical variables represent groupings and are also broken down into two types:
- Nominal data – categories are of equal importance. This type of data set cannot be ranked, and order is not important. Scores may represent names but not differences in the amount, e.g., eye color, clothing brands, country of residence, and gender.
- Ordinal data – categories can be ranked in a meaningful order. Scores indicate rank order but not the size of the difference between ranks, e.g., management level: senior, junior, and intern.
Consult the flowcharts below to determine which of these commonly used statistical tests will be best suited to answer your research questions.
Statistical analysis for associations between categorical variables
Use a chi-square test if you would like to know whether two categorical variables are associated with one another. For example, you could use a chi-square test for association to determine whether there is an association between whether a person smokes and the presence of heart disease.
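As a rough illustration of the smoking and heart disease example, the sketch below computes a chi-square test of association for a 2x2 table using only the Python standard library. The counts are hypothetical, the function name `chi_square_2x2` is made up for this example, and no continuity (Yates) correction is applied. The p-value formula relies on the fact that a 2x2 table has one degree of freedom, so it does not generalize to larger tables.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Return (chi-square statistic, p-value) for a 2x2 table of counts.

    Assumptions: every expected count is greater than zero and no
    continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated (Ho).
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = smoker / non-smoker,
# columns = heart disease present / absent.
observed = [[30, 70],
            [15, 85]]

chi2, p = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p-value = {p:.4f}")
```

The null hypothesis here is that smoking status and heart disease are not associated; a p-value below the chosen significance level would lead you to reject it.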
Statistical analysis for assessing relationships
Are you trying to establish whether a relationship exists between variables and not sure which statistical test to use? This flowchart will help you decide among the commonly used statistical methods when assessing relationships.
Statistical analysis for group comparisons
Are you making group comparisons? This flowchart will help you decide among the commonly used statistical methods when determining whether differences exist between groups.
If you are looking for dissertation help with your statistical analysis and guidance on which statistical tests are most appropriate to address your research questions, contact us to book your complimentary 30-minute consultation to find out how we can assist you.
This blog post was written by Kirstie Eastwood, lead statistician at Dissertation by Design.