This tutorial describes how to use the Pro-Laboratory module to process laboratory data (see Choi et al.\(^{1}\) for details).
To begin, we load the EHR package.
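The call below is the usual way to attach an installed package in R. It assumes that EHR has already been installed (for example from CRAN); the installation line is left commented out.

```r
# Install once if the package is not yet available (assumes CRAN access):
# install.packages("EHR")

# Attach the package so that functions such as dataTransformation and run_Labs can be called directly.
library(EHR)
```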
We will use example creatinine lab data to demonstrate the Pro-Laboratory module. The raw data is shown below.
creat.in <- read.csv(system.file("examples", "str_ex2","Creatinine_DATA.csv", package="EHR"))
head(creat.in)
Subject.uniq date time creat
1 28579217 02/05/17 4:00 0.52
2 28579217 02/06/17 5:00 0.53
3 28579217 10/03/11 4:28 0.42
4 28579217 10/04/11 4:15 0.35
5 28579217 10/06/11 4:25 0.29
6 28579217 10/09/11 4:45 0.28
This data consists of a patient ID, date, time, and the creatinine level.
The patient ID may need to be renamed so that all input datasets use the same name for the patient ID. This is necessary when combining the datasets to create a crosswalk between the original ID variables and the new ID variables used in the Pro-Laboratory module. See “2. EHR Vignette for Structured Data” of the EHR package for more information. We demonstrate how to rename the patient ID variable below.
creat.new <- dataTransformation(creat.in, rename = c('Subject.uniq' = 'subject_uid'))
In practice, the laboratory data will need to be combined with other input datasets. This process involves creating a crosswalk between the original ID variables and the new ID variables. The new ID variables that must be the same across all datasets are mod_id, mod_visit, and mod_id_visit. See “2. EHR Vignette for Structured Data” of the EHR package and “Build-PK-IV - Comprehensive Workshop” for examples of this process.
For simplicity, we will skip this step in this tutorial.
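To give a rough idea of what such a crosswalk contains, here is a minimal base R sketch. It is an illustration only and not the EHR package's own crosswalk procedure: the visit column, the second patient ID, and the "id.visit" format chosen for mod_id_visit are all assumptions made for this example.

```r
# Toy input: one row per patient visit.
# The second patient ID and the visit numbers are hypothetical values.
raw <- data.frame(
  subject_uid = c(28579217, 28579217, 30112345),
  visit       = c(1, 2, 1)
)

# One crosswalk row per unique patient/visit pair.
xwalk <- unique(raw)

# mod_id: consecutive integer per patient, in order of first appearance.
xwalk$mod_id <- match(xwalk$subject_uid, unique(xwalk$subject_uid))

# mod_visit: visit counter within each patient.
xwalk$mod_visit <- ave(xwalk$visit, xwalk$mod_id,
                       FUN = function(v) rank(v, ties.method = "first"))

# mod_id_visit: combined patient/visit key (format assumed for illustration).
xwalk$mod_id_visit <- paste(xwalk$mod_id, xwalk$mod_visit, sep = ".")

xwalk
```

Each input dataset could then be joined to a table like this on its original ID so that all datasets share the same three mod_ variables.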
We need to save our dataset as an RDS file using saveRDS. Here, we create a temporary directory to store the file using tempdir; however, dataDir can be a specific directory on your computer.
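A minimal sketch of this step follows. The file name creat_new.rds matches the path passed to run_Labs later in this tutorial. The one-row stand-in data frame is an assumption added only so that the snippet runs on its own; it is skipped when creat.new already exists.

```r
# Stand-in so this sketch runs on its own; skipped if creat.new was created above.
if (!exists("creat.new")) {
  creat.new <- data.frame(subject_uid = 28579217, date = "02/05/17",
                          time = "4:00", creat = 0.52)
}

# Temporary directory for the example; replace with a real path in practice.
dataDir <- tempdir()

# Write the laboratory data as an RDS file.
saveRDS(creat.new, file = file.path(dataDir, "creat_new.rds"))
```

Reading the file back with readRDS(file.path(dataDir, "creat_new.rds")) is a quick way to confirm that it was written as expected.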
The next section demonstrates how to use the run_Labs function to run the Pro-Laboratory module.
run_Labs
The following arguments must be specified:
lab.path: The file path where the laboratory data exist. It must be an RDS file.
lab.select: The list of variables in the laboratory data to be retained.
lab.mod.list: A list containing modifications to variables in the laboratory data.
Below we show how we would run run_Labs using the example laboratory data from above.
creat.out <- run_Labs(lab.path = file.path(dataDir, "creat_new.rds"),
                      lab.select = c('subject_uid', 'date.time', 'creat'),
                      lab.mod.list = list(date.time = expression(parse_dates(fixDates(paste(date, time))))))
In the above code, the lab.mod.list argument specifies a modification to our dataset to include a date.time variable, which is created by combining the original date and time variables. The lab.select argument specifies that we want to keep the subject_uid, date.time, and creat variables.
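To make the date.time modification concrete, the sketch below reproduces the same idea in base R on two rows of the example data. It is not how run_Labs works internally: as.POSIXct stands in for the package's parse_dates and fixDates helpers, and the month/day/two-digit-year format and the UTC time zone are assumptions chosen to match the example data.

```r
# Two rows taken from the example creatinine data.
lab <- data.frame(
  subject_uid = c(28579217, 28579217),
  date        = c("02/05/17", "10/03/11"),
  time        = c("4:00", "4:28"),
  creat       = c(0.52, 0.42)
)

# Combine date and time into one string, then parse it as a date-time.
# Assumed input format: month/day/two-digit year, 24-hour clock.
lab$date.time <- as.POSIXct(paste(lab$date, lab$time),
                            format = "%m/%d/%y %H:%M", tz = "UTC")

# Keep only the selected variables, mirroring lab.select.
lab <- lab[, c("subject_uid", "date.time", "creat")]

format(lab$date.time, "%Y-%m-%d %H:%M:%S")
# "2017-02-05 04:00:00" "2011-10-03 04:28:00"
```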
The first rows of the run_Labs output are shown below.
head(creat.out)
subject_uid date.time creat
1 28579217 2017-02-05 04:00:00 0.52
2 28579217 2017-02-06 05:00:00 0.53
3 28579217 2011-10-03 04:28:00 0.42
4 28579217 2011-10-04 04:15:00 0.35
5 28579217 2011-10-06 04:25:00 0.29
6 28579217 2011-10-09 04:45:00 0.28
This data can be merged with data from other modules. See the “Build-PK-IV - Comprehensive Workshop” for an example.