Installation
You can install the development version of tidylfs from GitHub with:
# install.packages("devtools")
devtools::install_github("wklimowicz/tidylfs")
Quickstart
Copy the LFS `.sav` files into a directory (`lfs_data_folder` in this example).
You can then run:
library(tidylfs)
# Converts `.sav` files to `.Rds` files, to save
# space and for quicker loading
lfs_convert(lfs_directory = "lfs_data_folder/",
            output_directory = "lfs_rds_folder/")
# Compiles into one file.
lfs <- lfs_compile(lfs_directory = "lfs_rds_folder/")
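Before running the conversion above, it can help to confirm that the input folder actually contains the files you expect. The following is a minimal sketch using base R only; it is not a tidylfs function, and `lfs_data_folder` is simply the example directory name used in this README.

```r
# List the .sav files in the example input folder (base R, not part of tidylfs).
# list.files() returns an empty character vector if nothing matches.
sav_files <- list.files("lfs_data_folder", pattern = "\\.sav$", ignore.case = TRUE)

if (length(sav_files) == 0) {
  message("No .sav files found in lfs_data_folder/")
} else {
  print(sav_files)
}
```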
To reproduce official ONS publications, such as
- Unemployment UNEM01 NSA
lfs %>%
  lfs_summarise_unemployment(QUARTER)
- Salary by occupation EARN06
lfs %>%
  dplyr::filter(
    !is.na(OCCUPATION_MAJOR), # Filter out NA's
    FTPTWK == "Full-time"     # Take only full-time employees
  ) %>%
  lfs_summarise_salary(QUARTER, OCCUPATION_MAJOR)
These summaries can be extended by passing additional grouping variables:
- Unemployment by quarter, sex, and age category
lfs %>%
  lfs_summarise_unemployment(QUARTER, SEX, AGES)
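To give a feel for what a grouped summary of this kind computes, here is a sketch on made-up data using base R. It is not the package's implementation: the `unemployed` and `weight` columns are hypothetical, each row is assumed to be an economically active respondent, and the rate is taken as the weighted share of unemployed rows within each group.

```r
# Made-up rows: one per (hypothetical) economically active respondent.
toy <- data.frame(
  QUARTER    = rep(c("2020 Q1", "2020 Q2"), each = 4),
  SEX        = rep(c("Male", "Male", "Female", "Female"), times = 2),
  unemployed = c(1, 0, 0, 0, 1, 0, 1, 0),  # 1 = unemployed, 0 = employed
  weight     = c(100, 300, 200, 200, 150, 350, 100, 400)
)

# Weighted count of unemployed people contributed by each row.
toy$weighted_unemployed <- toy$unemployed * toy$weight

# Sum the weighted counts and the weights within each QUARTER x SEX group.
rates <- aggregate(
  cbind(weighted_unemployed, weight) ~ QUARTER + SEX,
  data = toy,
  FUN = sum
)

# Weighted unemployment rate per group.
rates$unemployment_rate <- rates$weighted_unemployed / rates$weight
print(rates)
```

Adding another grouping variable to the formula splits each group further, which is the same idea as passing more columns to the summary functions above.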
For more information on the ONS variables, see the LFS Guidance.
Some of the variables included by default are:
Variable Name | Definition |
---|---|
YEAR | Year |
QUARTER | Quarter |
SEX | Sex |
GOVTOR | Government Office Region |
AGE | Age |
FTPTWK | Part-time/full-time status |
EDAGE | Age when completed full-time education |
TTACHR | Actual hours worked |
UNION | In union? |
HIQUALD | Highest Qualification |
DEGREE_SUBJECT | Degree Subject |
OCCUPATION | Occupation in main job |
PARENTAL_OCCUPATION | Parental Occupation at 14 |
ETHNICITY | Ethnicity |
Using the data across multiple projects
To avoid storing multiple copies of the compiled dataset, you can set the `DATA_DIRECTORY` environment variable before compiling. This can be done at the system level or in `.Rprofile`:
Sys.setenv(DATA_DIRECTORY = "path/to/folder")
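To confirm the variable is visible to the current R session, you can read it back with base R. This is a small sketch, not part of tidylfs; the path is the same placeholder as above and should be replaced with a real folder.

```r
# Set the variable for the current session (the path is a placeholder).
Sys.setenv(DATA_DIRECTORY = "path/to/folder")

# Read it back; Sys.getenv() returns "" when a variable is unset.
data_dir <- Sys.getenv("DATA_DIRECTORY")

if (!nzchar(data_dir)) {
  stop("DATA_DIRECTORY is not set")
}
print(data_dir)
```

A value set with `Sys.setenv()` lasts only for the current session, which is why placing the call in `.Rprofile` or setting it on the system is useful when several projects share the data.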
When compiling the dataset, set the `save_to_folder` option to `TRUE`:
lfs_compile(lfs_directory = "lfs_rds_folder/", save_to_folder = TRUE)
The compiled dataset will be saved to that directory as an `fst` file, and you can then load it from any project with:
lfs <- lfs_load(data.table = FALSE)