5 Test Yourselves 2
Please read the Using tidyverse section before working on this exercise, especially if you have never used tidyverse before.
As before, suggested answers are at the bottom of this page, but please do try the exercise before looking at them. :)
5.1 Instructions
- Download the data file here: SWB.csv.
- Start an R Project in the folder where you saved the data file.
- Open up a new script file in RStudio.
- Load `tidyverse` and `psych` packages.
- Read in the data file `SWB.csv`. Give this dataset the name `dataset`.
- Examine `dataset`. Minimally you should be able to tell there are 343 rows and 23 columns. You should also be able to tell the variable names and the data type for each variable.
- Convert `marital_status` and `have_children` from integer to factor. Add value labels to the factors. See the legend below to see how each variable is coded.
- Check that the two variables are indeed converted to factors.
- Create a subset of the dataset of only married parents (i.e., married people who have children). Further subset the dataset such that only the variables `swls2021_1`, `swls2021_2`, `swls2021_3`, `swls2021_4`, and `swls2021_5` are in the dataset. You should have 68 rows and 5 columns in this subset. Give this subset the name `subset`.
- Using `subset`, compute the average of `swls2021_1`, `swls2021_2`, `swls2021_3`, `swls2021_4`, and `swls2021_5` for each person. Call this variable `swls2021_avg`.
- Create a histogram for `swls2021_avg`. Although I’d be okay with just the basic histogram, I’d like you to play around with the different customisations! Change the bar fill color, change the range of the x-axis, make the background white, etc. (I just want you to explore and have fun here!)
- Using `subset`, get the following summary statistics for `swls2021_avg`: maximum, minimum, mean, and sd. Specifically, use the `summarise()` function from the `dplyr` package. Then confirm the results using the `describe()` function from the `psych` package. Check that the maximum is 5.4, the minimum is 2.6, the mean is 4.18, and the sd is 0.707.
- Combine different functions from the `dplyr` package to find out the number of married parents who have `swls2021_avg` greater than or equal to 4.18. Check that the number is 36.
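The later steps all follow the same pattern: filter rows, select columns, compute a per-row average, then summarise or count. As a warm-up that does not give away the answers, here is a minimal sketch of that pattern on made-up data. The tibble `toy`, its columns (`group`, `q1`, `q2`, `q3`), and the cut-off of 4 are hypothetical and chosen only for illustration; they are not part of `SWB.csv`.

```r
library(dplyr)

# Hypothetical toy data (not from SWB.csv): one row per person
toy <- tibble(
  group = c("a", "a", "b", "a"),
  q1    = c(2, 4, 5, 3),
  q2    = c(3, 5, 1, 3),
  q3    = c(4, 3, 3, 6)
)

# Keep only group "a", keep only the item columns, then average across items
toy_a <- toy %>%
  filter(group == "a") %>%
  select(q1:q3) %>%
  mutate(q_avg = rowMeans(across(q1:q3)))

# Summary statistics of the row averages
toy_summary <- toy_a %>%
  summarise(mean_q = mean(q_avg), sd_q = sd(q_avg))

# Number of people at or above a cut-off (4 is an arbitrary choice)
n_high <- toy_a %>%
  filter(q_avg >= 4) %>%
  nrow()
```

Each `%>%` passes the result of one step on to the next, so the pipeline reads top to bottom in the same order as the instructions.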
5.2 Legend
Variable Name | Variable Label | Value Label |
---|---|---|
pin | participant identification number | |
gender | gender | 0 = male, 1 = female |
marital_status | marital status | 1 = married, 2 = divorced, 3 = widowed |
have_children | parental status | 0 = no children, 1 = have children |
mvs_1 | My life would be better if I own certain things I don’t have. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_2 | The things I own say a lot about how well I’m doing. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_3 | I’d be happier if I could afford to buy more things. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_4 | It bothers me that I can’t afford to buy things I’d like. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_5 | Buying things gives me a lot of pleasure. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_6 | I admire people who own expensive homes, cars, clothes. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_7 | I like to own things that impress people. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_8 | I like a lot of luxury in my life. | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
mvs_9 | I try to keep my life simple, as far as possessions are concerned. (R) | 1 = strongly disagree, 2 = disagree, 3 = neither disagree nor agree, 4 = agree, 5 = strongly agree |
swls2019_1 | In most ways my life is close to my ideal. (2019) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2019_2 | The conditions of my life are excellent. (2019) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2019_3 | I am satisfied with my life. (2019) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2019_4 | So far I have gotten the important things I want in life. (2019) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2019_5 | If I could live my life over, I would change almost nothing. (2019) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2021_1 | In most ways my life is close to my ideal. (2021) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2021_2 | The conditions of my life are excellent. (2021) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2021_3 | I am satisfied with my life. (2021) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2021_4 | So far I have gotten the important things I want in life. (2021) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
swls2021_5 | If I could live my life over, I would change almost nothing. (2021) | 1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither disagree nor agree, 5 = slightly agree, 6 = agree, 7 = strongly agree |
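To see how a value-label entry in the legend translates into an R factor, here is a minimal sketch using the `gender` coding (0 = male, 1 = female), which the exercise itself does not ask you to convert. The vector `gender_codes` is made-up example data, not values from `SWB.csv`.

```r
# Made-up codes following the legend: 0 = male, 1 = female
gender_codes <- c(0, 1, 1, 0, 1)

# levels = the codes stored in the data; labels = the text shown for each code
gender_factor <- factor(gender_codes,
                        levels = c(0, 1),
                        labels = c("male", "female"))

is.factor(gender_factor)  # check the type
levels(gender_factor)     # check the labels
table(gender_factor)      # count how many of each
```

The order of `labels` must match the order of `levels`, and any code that is not listed in `levels` becomes `NA`.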
5.3 Suggested Answers
# Skipped Steps 1 to 3.
# 4. Load packages
library(tidyverse)
library(psych)
# 5. Read .csv file
dataset <- read_csv("SWB.csv")
# read.csv("SWB.csv") also okay
# 6. Examine the data
# Take a look at the variable names and data types
glimpse(dataset) # str(dataset) or View(dataset) also okay
# 7. Convert integer to factor and add value labels to the factor.
dataset$marital_status <- factor(dataset$marital_status,
                                 levels = c(1, 2, 3),
                                 labels = c("married", "divorced", "widowed"))
dataset$have_children <- factor(dataset$have_children,
                                levels = c(0, 1),
                                labels = c("no children", "have children"))
# 8. Check variables are now factors
glimpse(dataset$marital_status) # str(dataset$marital_status) also okay
glimpse(dataset$have_children) # str(dataset$have_children) also okay
# 9. Create subset
subset <- dataset %>%
  filter(marital_status == "married" & have_children == "have children") %>%
  select(swls2021_1:swls2021_5)
# 10. Compute swls2021_avg
subset <- subset %>%
  mutate(swls2021_avg = rowMeans(across(c(swls2021_1:swls2021_5))))
# 11. Create a histogram for swls2021_avg
ggplot(data = subset) +
  geom_histogram(aes(x = swls2021_avg)) +
  scale_x_continuous(expand = c(0, 0),
                     limits = c(0, 8),
                     # Every 0.5 units, a tick will appear on x-axis
                     breaks = seq(0, 8, 0.5)) +
  scale_y_continuous(expand = c(0, 0),
                     # Every 1 unit, a tick will appear on y-axis
                     breaks = seq(0, 20, 1)) +
  labs(x = "Average 2021 Satisfaction With Life Scores",
       y = "Frequency",
       title = "Histogram of Average 2021 Satisfaction With Life Scores") +
  # Choose the theme (there are different themes to choose, just type theme and some suggestions will pop up)
  theme_classic()
# 12A. Get summary statistics using the summarise() function
# Maximum is 5.4, minimum is 2.6, mean is 4.18, and sd is 0.707.
subset %>%
  summarise(avg_swls = mean(swls2021_avg),
            min_swls = min(swls2021_avg),
            max_swls = max(swls2021_avg),
            sd_swls = sd(swls2021_avg))
# 12B. Get summary statistics using the describe() function
# Maximum is 5.4, minimum is 2.6, mean is 4.18, and sd is 0.707.
describe(subset$swls2021_avg)
# 13A. Create a subset that only contains people with an average SWLS of 4.18 and above
# Then, count the number of people remaining
subset %>%
  filter(swls2021_avg >= 4.18) %>%
  summarise(total = n())
# 13B. Create a subset that only contains people with an average SWLS of 4.18 and above
# Then, count the number of people remaining
subset %>%
  filter(swls2021_avg >= 4.18) %>%
  nrow() # count the number of rows remaining
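
Step 11 also invites you to change the bar fill colour and the background, which the suggested answer for that step does not show. Here is a minimal sketch of those customisations, assuming made-up scores in a hypothetical data frame `toy_scores` (not the real `subset`). The colours, `binwidth`, and axis range are arbitrary choices.

```r
library(ggplot2)

# Made-up averages on the 1-7 SWLS scale (not from SWB.csv)
toy_scores <- data.frame(avg = c(2.6, 3.4, 3.8, 4.0, 4.2, 4.2, 4.6, 5.0, 5.4))

p <- ggplot(data = toy_scores, aes(x = avg)) +
  # fill = bar colour, colour = bar outline, binwidth = width of each bar
  geom_histogram(binwidth = 0.4, fill = "steelblue", colour = "white") +
  # Zoom the x-axis to the scale range without dropping any data
  coord_cartesian(xlim = c(1, 7)) +
  labs(x = "Average score", y = "Frequency") +
  # White background with grid lines
  theme_bw()

p  # print the plot
```

Setting `binwidth` explicitly also avoids the message that `geom_histogram()` prints when it falls back to its default of 30 bins.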