
DS Skills Lab 04

Show me the receipts!

Dr Danielle Evans

17 Oct 2024

1 / 20

Overview

  • Making Claims & Providing Evidence

    • Reporting Statistical Values
    • Reporting Model Fit

  • Justified Decisions

    • Reporting Assumption Checks
    • Reporting Model Parameters

  • KahootR

2 / 20

Background

  • There are multiple researcher degrees of freedom when it comes to conducting research and statistical analyses

  • Essentially, this means that the choice of the analysis, interpretation, and final model is up to you as the analyst!

  • So, you must decide which is the final/best model for your data for each analysis and argue/defend your case in your results section

3 / 20

Making Claims

  • What does each of these statements need?

    • "People who have synaesthesia are more creative than people who don't have synaesthesia."

    • "There are only two types of people in this world: those who can extrapolate from incomplete data."

    • "The second, more complex model was a better model than the first model with only one predictor."

4 / 20

Making Claims

  • What does each of these statements need? EVIDENCE!

    • "People who have synaesthesia are more creative than people who don't have synaesthesia."

    • "There are only two types of people in this world: those who can extrapolate from incomplete data."

    • "The second, more complex model was a better model than the first model with only one predictor."

5 / 20

But How?

6 / 20

Forms of Evidence

  • Use your in-text statistical results as if they were citations!

  • The statistics you report in each sentence should provide evidence for the claim you make in that sentence, and the sentence should be a complete thought without the statistics:

    • NO: "The means were (mean = 135.30…"

    • YES: "The means indicated that the synaesthete group (M = 135.30, SD = 12.06) had a higher score for creativity than the non-synaesthete group (M = 105.78, SD = 32.90)."

7 / 20
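
A minimal sketch of where the descriptive statistics in a sentence like the YES example above might come from. The data frame and column names (synaesthesia_data, creativity, group) are illustrative assumptions, not from the slides:

    # Group means and standard deviations for creativity, split by group
    aggregate(creativity ~ group, data = synaesthesia_data,
              FUN = function(x) c(M = mean(x), SD = sd(x)))

These are the M and SD values you would then report in the text, rounded to two decimal places.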

APA Reporting

  • When reporting numbers to provide evidence for a claim you're making, keep in mind:

    • Statistical symbols like t, F, and p are italicised

    • Always round to two decimal places, except for p-values, which should be rounded to three

    • Look up the correct reporting format if you're not sure!

8 / 20
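
A minimal sketch of how these rounding rules could be applied in R; the helper names report_num and report_p are made up for illustration, not a standard API:

    # Round most statistics to two decimal places
    report_num <- function(x, digits = 2) {
      sprintf(paste0("%.", digits, "f"), x)
    }

    # Round p-values to three decimal places, drop the leading zero,
    # and report very small values as "< .001"
    report_p <- function(p) {
      if (p < .001) return("p < .001")
      paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
    }

    report_num(135.2986)  # "135.30"
    report_p(0.00165)     # "p = .002"
    report_p(3.13e-05)    # "p < .001"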

Reporting Results

  • When reporting our overall model, we need to build a narrative clearly describing each step (in full sentences), with evidence to back up any claims, and justified decisions throughout:

  • Specifically, we need to include:

    • What models we've fit to our data (& how well they fit)

    • What model was better

    • The assumptions we checked, and the outcome of those checks

    • The main results from the better model

9 / 20

What Models*

  • First we need to describe what models we've fit to our data:

  • NO: "The model being constructed is a dual-predictor multiple regression with OLS estimation. Predictors will be considered to be significant if the probability p of finding the b values by chance is less than 0.05. The two first predictor was beauty and had p = 3.13e-05. The second p-value was 0.00165 for nativeno."

  • YES: "The first model investigated instructor beauty ratings as a predictor of teaching evaluations, and showed satisfactory model fit; (R2 = .04, F(1, 461) = 17.08, p < .001). The second model investigated instructor beauty ratings and native speakers of English as predictors of teaching evaluations (R2 = .06, F(2, 460) = 13.72, p < .001), showing significantly better model fit compared to the first model (R2change = .02, F(1, 460) = 10.02, p = .002)."


Don't directly copy this in your assessments, otherwise you'll be flagged for academic misconduct!! Use it to guide you...

10 / 20
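
A sketch of the model fitting and comparison behind a write-up like the YES example above, assuming a data frame called evals with columns eval (teaching evaluation), beauty, and native; your own object and variable names will differ:

    # Model 1: beauty ratings only; Model 2: beauty ratings + native English speaker
    model1 <- lm(eval ~ beauty, data = evals)
    model2 <- lm(eval ~ beauty + native, data = evals)

    # R-squared, F, and p for each model's overall fit
    summary(model1)
    summary(model2)

    # F-test for the change in R-squared between the two nested models
    anova(model1, model2)
    summary(model2)$r.squared - summary(model1)$r.squared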

Making Justified Decisions*

  • After fitting our models, we want to check our assumptions to see if our model was biased, and decide on which final model to report

  • But there are fewer standardised formats for reporting assumption checks

  • You should aim to explain the process of your decision-making clearly, step by step

  • So let's look at the assumptions first, then think about what we did to check our model, and then add in the evidence...

11 / 20

First, What Steps Did We Take?

  1. We checked diagnostic plots for linearity, heteroscedasticity, normality of residuals, & influential cases (see the R sketch after this slide)

  2. We checked outliers using standardised residuals

  3. We fit robust models as a sensitivity check to examine the pattern of results compared to our original model

  4. We decided which model to report

12 / 20
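
A sketch of the four steps above in R, continuing with the assumed evals data and two-predictor model from the earlier sketch; robustbase::lmrob is used here as one option among several for the robust fit:

    # Two-predictor model from the earlier sketch
    model2 <- lm(eval ~ beauty + native, data = evals)

    # 1. Diagnostic plots: residuals vs fitted (linearity & heteroscedasticity),
    #    Q-Q plot of the residuals (normality), and Cook's distance (influence)
    plot(model2, which = 1)
    plot(model2, which = 2)
    plot(model2, which = 4)

    # 2. Outliers via standardised residuals (e.g. flag any |z| > 3)
    z <- rstandard(model2)
    sum(abs(z) > 3)

    # 3. Robust model as a sensitivity check
    library(robustbase)
    robust_model <- lmrob(eval ~ beauty + native, data = evals)
    summary(robust_model)

    # 4. Compare the pattern of estimates before deciding which model to report
    coef(model2)
    coef(robust_model)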

Now Show The Receipts!

13 / 20

Providing Evidence

  • "Non-linearity and heteroscedasticity were checked using a scatterplot of the predicted values against the residuals. The plot showed no issues with non-linearity or heteroscedasticity in the data."

  • "A Q-Q plot of the standardized residuals indicated the residuals were fairly normally distributed, with deviations at the tails."

  • "Influential cases were checked using Cook’s distance with all values below 0.05 suggesting there were no influential cases in our data. The standardised residuals were inspected for outliers; all cases fell within the expected ranges."

  • "Robust models were fit to the data as a sensitivity check, and showed the same pattern of results as the original model."

  • "Therefore, the final model reported is the unadjusted model predicting teaching evaluation scores from instructor beauty ratings and whether the instructor is a native english speaker."

Again, don't directly copy this in your assessments, otherwise you'll be flagged for academic misconduct!! Use it to guide you...

14 / 20
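
If you need the specific numbers behind sentences like these (e.g. the largest Cook's distance, the range of the standardised residuals), a short sketch using the same assumed model:

    model2 <- lm(eval ~ beauty + native, data = evals)

    # Quote these as evidence for the influence and outlier checks
    max(cooks.distance(model2))
    range(rstandard(model2))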

That's BetteR!

15 / 20

Reporting our Model*

  • We should report the effect of each predictor in full, with statistics and a plain-language summary, and compare the standardised betas to show which predictor had the stronger relationship with our outcome, e.g.,

    • "In the final model, instructors' beauty ratings significantly predicted their teaching evaluation scores (b = 0.13, SE(b) = 0.03, t = 4.21, p < .001, 95% CI [0.07, 0.20])"

    • "The findings suggest that as instructors’ beauty scores increase by one point on a scale from one to five, their teaching evaluations increase by 0.13 points (scale: 1-10)."

    • "The standardised estimates for instructor beauty and native English speaker suggest that whether the instructor is a native English speaker is a stronger predictor (B = -0.60, SE(B) = 0.19, 95% CI [-0.97, -0.23]) of teaching evaluations compared to instructor beauty (B = 0.19, SE(B) = 0.05, 95% CI [ 0.10, 0.28])"

  • We can also include an APA-style table of our results (the R sketch after this slide shows one way to extract these values)...

16 / 20
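
A sketch of how the unstandardised and standardised estimates reported above could be obtained, again using the assumed evals data; the standardised model here simply refits with the outcome and the continuous predictor z-scored, which is one reasonable approach among several:

    model2 <- lm(eval ~ beauty + native, data = evals)

    # Unstandardised estimates, SEs, t, p, and 95% confidence intervals
    summary(model2)
    confint(model2)

    # Standardised estimates: z-score the outcome and the continuous predictor,
    # then refit; the coefficient for native is then in SD units of the outcome
    model2_std <- lm(scale(eval) ~ scale(beauty) + native, data = evals)
    summary(model2_std)
    confint(model2_std)
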
  • Don't define any statistical concepts you would find in a statistics/psychology textbook

    • NO: "The p-value is the probability of finding this result…"
  • You should definitely explain the decisions & results specific to your study

    • YES: "Using an alpha level of .05, there was a significant relationship between…"
  • You can assume that your audience are interested in & somewhat familiar with the field of research & know analysis techniques/stats terms

  • But they do not know any of the details of the study you have conducted and analysed & have no idea what your data look like:

    • Don't EVER refer to measures by the variable name as it appears in R!!!!
  • If you're not sure whether you've given enough evidence or clearly justified a decision, for each claim you make just ask yourself...

17 / 20

The GuRu

18 / 20

Final RemindRs!

19 / 20

KahootR!

20 / 20
