Lab 7

Overview

In this assignment you will:

  • Distinguish between z- and t-tests
  • Use hypothesis testing
  • Calculate a test statistic
  • Make decisions based on test results
  • Utilize real data to connect statistics with geography

Statistical questions about spatial data

You are working for a consulting company that has asked you to address the following question:

  • Are 2015 average home values in block groups in Eau Claire County greater than the average for the entire country?

Due to data collection constraints, you will be working with a random sample of homes within the county. According to U.S. News, the average home price in the United States was $176,500 in 2015.

First, complete the data preparation steps below. Then, answer the questions that follow.

# load packages: sf, BSDA, and haffutils

# read data
bg <- _____("https://gitlab.com/mhaffner/data/-/raw/master/ec_county_bg_sample.geojson")

# save data as a shapefile (only needs to be completed once)
_____(bg, _____)

  1. List the assumptions of a one-group z-test and a one-group t-test (1 pt.).

  2. Evaluate the assumptions and state the appropriate test. Include any graphical measures used or test outputs here (1 pt.).

  3. Report the mean, median, and standard deviation. Additionally, report the minimum and maximum values in the dataset (1 pt.).

  4. State the null and alternative hypotheses. Be sure to use mathematical symbols rather than words (1 pt.).

  5. Select a significance level and justify your choice (1 pt.).

  6. State the critical value(s) (1 pt.).

  7. Report the test statistic (1 pt.).

  8. State your conclusion (1 pt.).

  9. Create a map that helps explain your results (2 pts.).
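As a point of reference for questions 3 through 8, the following is a minimal sketch of a one-group, one-tailed t-test in base R. It uses simulated values (not the Eau Claire County sample) and an illustrative significance level of 0.05. Both are assumptions made only for this example, as is the use of a t-test, which does not imply that it is the appropriate test for the lab data. The variable names are hypothetical.

```r
# Simulated home values for illustration only (NOT the lab dataset)
set.seed(42)
home_values <- rnorm(30, mean = 185000, sd = 40000)

mu0   <- 176500  # national average quoted in the lab text
alpha <- 0.05    # illustrative significance level; choose and justify your own

# Descriptive statistics
desc <- c(mean   = mean(home_values),
          median = median(home_values),
          sd     = sd(home_values),
          min    = min(home_values),
          max    = max(home_values))
print(desc)

# One way to examine the normality assumption: Shapiro-Wilk test
normality <- shapiro.test(home_values)
print(normality)

# Test statistic computed by hand: t = (xbar - mu0) / (s / sqrt(n))
n        <- length(home_values)
t_manual <- (mean(home_values) - mu0) / (sd(home_values) / sqrt(n))

# Critical value for an upper-tailed test with n - 1 degrees of freedom
t_crit <- qt(1 - alpha, df = n - 1)

# Decision rule: reject H0 when the statistic exceeds the critical value
reject <- t_manual > t_crit

# The same test using the built-in function
result <- t.test(home_values, mu = mu0, alternative = "greater")
print(result)
```

Computing the statistic by hand and then comparing it with the output of t.test() is a quick way to confirm that the hypotheses and the direction of the test were specified as intended.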

Bonus (optional section)

Recall the question posed in Part 2 of Lab 4 about teachers and Eau Claire North vs. Eau Claire Memorial. Conduct a two-sample unpaired t-test to compare the means of the two datasets. Use the t.test() function as demonstrated below. You may want to refer to this resource to understand how a two-sample t-test works.

t.test(x, y, alternative = "less", var.equal = FALSE)

In the code above, x should be a vector of the scores for students at Eau Claire North and y should be a vector of the scores for students at Eau Claire Memorial. What is the conclusion of the test? Are scores at Eau Claire North significantly lower than those at Eau Claire Memorial? How should the code be altered if it were a two-tailed test? (1 pt.)
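To show what t.test() computes when var.equal = FALSE, the sketch below reproduces the unequal-variance (Welch) statistic, degrees of freedom, and lower-tailed p-value by hand. The two score vectors are simulated and hypothetical; they are not the Lab 4 data, and the group means and sample sizes are arbitrary assumptions.

```r
# Simulated scores for illustration only (NOT the Lab 4 data)
set.seed(7)
x <- rnorm(25, mean = 78, sd = 8)   # hypothetical scores, first school
y <- rnorm(28, mean = 82, sd = 10)  # hypothetical scores, second school

# Per-group variance of the sample mean
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch test statistic: difference in means over its standard error
t_welch <- (mean(x) - mean(y)) / sqrt(vx + vy)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

# Lower-tailed p-value, matching alternative = "less"
p_less <- pt(t_welch, df = df_welch)

# The same test using the built-in function
result <- t.test(x, y, alternative = "less", var.equal = FALSE)
print(result)
```

With alternative = "less", the alternative hypothesis is that the mean of x is less than the mean of y, so the order of the two arguments matters.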