Installation
You can install the development version of tidyusoc from GitHub with:
# install.packages("devtools")
devtools::install_github("wklimowicz/tidyusoc")
Setup
After downloading the Understanding Society Data from the UK Data Service website, run:
library(tidyusoc)
# Convert to Rds to save space and time
usoc_convert(usoc_directory = "spss/spss25",
             new_directory = "rds",
             filter_files = "indresp")
# Compile the INDRESP files in the rds folder
usoc <- usoc_compile(directory = "rds")
A small set of core variables is included by default:
| Variable | Description |
|---|---|
| pidp | Personal ID |
| age | Age |
| race | Race/Ethnicity |
| sex | Sex |
| country | Country |
| gor_dv | Government Office Region |
| qfhigh_dv | Highest Qualification (Detailed, UKHLS Only) |
| hiqual_dv | Highest Qualification |
| jbstat | Job Status |
| jbsoc00_cc | Occupation |
| fimnlabgrs_dv | Labour Income |
To add more variables, pass a function to the extra_mappings argument of usoc_compile. For variables whose names change across surveys or waves, use pick_var. For example, life satisfaction is:
- lfsato in waves 6-10 and 12-18 of the BHPS.
- sclfsato in waves 1-11 of the UKHLS.
The BMI variable (bmi_dv) keeps the same name over time, so it’s passed as a string.
custom_mappings <- function(cols) {
  life_sat <- pick_var(c("sclfsato", "lfsato"), cols)
  custom_variables <- tibble::tribble(
    ~usoc_name, ~new_name,           ~type,
    life_sat,   "life_satisfaction", "factor",
    "bmi_dv",   "bmi_dv",            "numeric"
  )
  return(custom_variables)
}
usoc <- usoc_compile("rds", extra_mappings = custom_mappings)
Good places to start looking for variables are the Understanding Society Variable Search and the Documentation.
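The idea behind pick_var can be sketched in base R. The helper below, pick_first_present, is hypothetical and is not the tidyusoc implementation. It assumes that pick_var returns the first candidate name that appears among the available columns, and the column vectors are made up for illustration.

```r
# Hypothetical helper illustrating the idea behind pick_var();
# this is NOT the tidyusoc implementation.
pick_first_present <- function(candidates, cols) {
  # Keep only the candidate names that exist in this file's columns
  present <- candidates[candidates %in% cols]
  if (length(present) == 0) {
    # Assumption: signal "not available" with NA rather than an error
    return(NA_character_)
  }
  present[[1]]
}

# Made-up column sets standing in for a BHPS-style and a UKHLS-style file
bhps_cols  <- c("pidp", "age", "lfsato")
ukhls_cols <- c("pidp", "age", "sclfsato", "bmi_dv")

pick_first_present(c("sclfsato", "lfsato"), bhps_cols)   # "lfsato"
pick_first_present(c("sclfsato", "lfsato"), ukhls_cols)  # "sclfsato"
```

Under this assumption, a single mapping row can cover both surveys, because the name is resolved separately for each file's columns.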
Compiling different surveys.
By default, these functions compile the indresp files. You can change this with the file argument. For example, to compile the youth response files:
usoc_convert(usoc_directory = "spss/spss25",
             new_directory = "rds",
             filter_files = "youth")
usoc <- usoc_compile(directory = "rds", file = "youth")
Using the data across multiple projects.
If you set an environment variable called DATA_DIRECTORY that points at a folder and pass save_to_folder = TRUE, usoc_compile saves an fst file of the final dataset, so it can be loaded from any project.
library(tidyusoc)
usoc_compile(directory = "rds", save_to_folder = TRUE)
usoc <- usoc_load()
To read any of the other compiled files change the file argument:
usoc <- usoc_load(file = "youth")
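The DATA_DIRECTORY environment variable mentioned above can be set from within R. The sketch below uses only base R and a hypothetical folder under tempdir(); replace it with a real location. It assumes tidyusoc reads the variable through the usual environment lookup, which this README does not confirm.

```r
# Hypothetical folder; replace with a real location on your machine.
data_dir <- file.path(tempdir(), "usoc_data")
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)

# Set the variable for the current R session only.
Sys.setenv(DATA_DIRECTORY = data_dir)

# Confirm the value and check that the folder exists.
Sys.getenv("DATA_DIRECTORY")
dir.exists(Sys.getenv("DATA_DIRECTORY"))
```

Sys.setenv only lasts for the current session. To make the setting persistent, add a line of the form DATA_DIRECTORY=/path/to/folder to your .Renviron file and restart R.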