Build-PK-IV - Comprehensive

This tutorial describes a comprehensive procedure for building PK data for intravenously administered medications. There are two phases: data processing, which standardizes and combines the input data (Pro-Demographic, Pro-Med-Str, Pro-Drug Level, Pro-Laboratory), and data building, which creates the final PK data (Build-PK-IV).

Nathan T. James


This tutorial describes four modules for processing data (Pro-Demographic, Pro-Med-Str, Pro-Drug Level, Pro-Laboratory) and one module for PK data building (Build-PK-IV) using data extracted from a structured database.

To begin we load the EHR package, the pkdata package, and the lubridate package.

# load EHR package and dependencies
library(EHR)
library(pkdata)
library(lubridate)
# define 3 directories
rawDataDir <- system.file("examples", "str_ex2", package="EHR") # directory for raw data

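td <- tempdir()
checkDir <- file.path(td, 'checks') # directory for interactive checking
dir.create(checkDir, showWarnings = FALSE) # create it so check files can be written

dataDir <- file.path(td, 'data') # directory for processed data
dir.create(dataDir, showWarnings = FALSE) # create it so the saveRDS() calls below succeed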

# examine raw data files in rawDataDir
list.files(rawDataDir)
[1] "Albumin_DATA.csv"             "Creatinine_DATA.csv"          "Demographics_DATA.csv"        "e-rx_DATA.csv"               
[5] "FLOW_DATA.csv"                "MAR_DATA.csv"                 "medChecked-fent.csv"          "SampleConcentration_DATA.csv"
[9] "SampleTimes_DATA.csv"        

Pre-Processing for Raw Extracted Data

The raw datasets must go through a pre-processing stage which creates new ID variables and datasets that can be used by the data processing modules. There are three pre-processing steps:

  1. read and clean raw data
  2. merge raw data to create new ID variables
  3. make new data for use with modules.

Each raw dataset should contain a subject unique ID, a subject visit ID, or both. In this example the subject unique ID is called subject_uid and the subject visit ID is called subject_id. The subject visit ID combines subject and visit/course: for example, subject_id 14.0 is the first course for subject 14, subject_id 14.1 is the second course for subject 14, and so on. subject_uid is a unique ID that is the same across all of a subject's records. The integer part of subject_id has a 1-to-1 correspondence with subject_uid; in this example, subject_uid 62734832 is associated with both subject_id 14.0 and subject_id 14.1. If there is only a single visit/course per subject, only the subject unique ID is needed.
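The relationship between the two IDs can be checked before any merging is done. The sketch below uses base R only on a small made-up table (the rows for subject 15 and its subject_uid are hypothetical) and verifies that the integer part of subject_id maps 1-to-1 to subject_uid.

```r
# Toy ID table; the row for subject 15 is a hypothetical illustration value
ids <- data.frame(
  subject_id  = c(14.0, 14.1, 15.0),
  subject_uid = c(62734832, 62734832, 71234567)
)

# the integer part of subject_id identifies the subject
ids$subject <- floor(ids$subject_id)

# each subject should map to exactly one subject_uid, and vice versa
uid_per_subject <- tapply(ids$subject_uid, ids$subject, function(u) length(unique(u)))
subject_per_uid <- tapply(ids$subject, ids$subject_uid, function(s) length(unique(s)))
one_to_one <- all(uid_per_subject == 1) && all(subject_per_uid == 1)

# visit/course order within each subject (1 = first course)
ids$visit <- ave(ids$subject_id, ids$subject_uid, FUN = rank)

print(ids)
print(one_to_one)
```

A failed check here usually points to a data-entry problem that is easier to fix in the raw files than after the crosswalk has been built.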

(1) Read and clean raw data

# demographics data
<- readTransform(file.path(rawDataDir, "Demographics_DATA.csv"))
  subject_id subject_uid gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1       1106    34364670      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2       1444    36792472      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3       1465    36292449      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4       1520    34161967      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5       1524    37857374      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6       1550    37826262      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor
1        1657
2        1325
3          NA
4        1745
5        1847
6        1210
# read SampleTimes_DATA.csv
<- readTransform(file.path(rawDataDir, "SampleTimes_DATA.csv"),
    rename = c('Study.ID' = 'subject_id'),
    modify = list(samp = expression(as.numeric(sub('Sample ', '', Event.Name)))))
  subject_id Event.Name Sample.Collection.Date.and.Time samp
1      466.1   Sample 1                  2/3/2017 10:46    1
2      466.1   Sample 2                  2/4/2017 20:30    2
3     1106.0   Sample 1                 6/28/2014 13:40    1
4     1106.0   Sample 2                 6/29/2014 03:10    2
5     1106.0   Sample 3                 6/30/2014 03:35    3
6     1106.0   Sample 4                  7/1/2014 03:45    4
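To make the effect of the rename and modify arguments concrete, here is a base R sketch that produces the same columns from two rows of the output above. It is not how readTransform is implemented; it only mirrors the visible result, and the inline CSV stands in for the real file.

```r
# Two rows taken from the sample-times output, supplied as inline CSV
raw <- read.csv(text = "Study.ID,Event.Name,Sample.Collection.Date.and.Time
466.1,Sample 1,2/3/2017 10:46
466.1,Sample 2,2/4/2017 20:30", stringsAsFactors = FALSE)

# rename: the old name 'Study.ID' becomes 'subject_id'
names(raw)[names(raw) == "Study.ID"] <- "subject_id"

# modify: derive the numeric sample index from 'Event.Name'
raw$samp <- as.numeric(sub("Sample ", "", raw$Event.Name))

print(raw)
```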
# helper function used to make subject_id
sampId <- function(x) {
  # remove leading zeroes or trailing periods
  subid <- gsub('(^0*|\\.$)', '', x)
  # change _ to .
  gsub('_([0-9]+[_].*)$', '.\\1', subid)
}

# read SampleConcentration_DATA.csv
<- readTransform(file.path(rawDataDir, "SampleConcentration_DATA.csv"),
  modify = list(
    subid = expression(sampId(name)),
    subject_id = expression(as.numeric(sub('[_].*', '', subid))),
    samp = expression(sub('[^_]*[_]', '', subid)),
    name = NULL,
    data_file = NULL,
    subid = NULL
  ))
  record_id fentanyl_calc_conc subject_id samp
1         1         0.01413622      466.1    1
2         2         0.27982075      466.1    2
3         3         6.11873679     1106.0    1
4         4         0.59161716     1106.0    2
5         5         0.11280471     1106.0    3
6         6         0.02112153     1106.0    4
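A short worked example may help show what the sampId helper and the three modify expressions do to a sample name. The helper is reproduced so the block runs on its own; the input names are hypothetical and assume the patterns subject_visit_sample and subject_sample.

```r
# Helper from the tutorial, reproduced here so the example is self-contained
sampId <- function(x) {
  # remove leading zeroes or trailing periods
  subid <- gsub('(^0*|\\.$)', '', x)
  # change the first "_" to "." when a visit number and another "_" follow
  gsub('_([0-9]+[_].*)$', '.\\1', subid)
}

# Hypothetical sample names (not taken from the real data file)
name <- c("0466_1_1", "0466_1_2", "1106_1")

subid      <- sampId(name)                          # "466.1_1" "466.1_2" "1106_1"
subject_id <- as.numeric(sub('[_].*', '', subid))   # part before the remaining "_"
samp       <- sub('[^_]*[_]', '', subid)            # part after the remaining "_"

print(data.frame(name, subid, subject_id, samp))
```

Names with only one underscore are left without a visit suffix, so their subject_id is a whole number.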
# FLOW dosing data
<- readTransform(file.path(rawDataDir, "FLOW_DATA.csv"),
                         rename = c('Subject.Id' = 'subject_id',
                                    'Subject.Uniq.Id' = 'subject_uid')) 
# pre-process the flow data 
# date.time variable should be in an appropriate form
[,'date.time'] <- pkdata::parse_dates(EHR:::fixDates([,'Perform.Date']))
# unit and rate are required: separate unit and rate from 'Final.Rate..NFR.units.' if needed
[,'unit'] <- sub('.*[ ]', '',[,'Final.Rate..NFR.units.'])
[,'rate'] <- as.numeric(sub('([0-9.]+).*', '\\1',[,'Final.Rate..NFR.units.']))
  subject_id subject_uid     Perform.Date FOCUS_MEDNAME Final.Rate..NFR.units. Final.Units Flow           date.time      unit
1       1596    38340814   12/4/2016 5:30      Fentanyl          6.75            1 mcg/kg/hr       3.375   NA 2016-12-04 05:30:00 mcg/kg/hr
2       1596    38340814   12/4/2016 6:00      Fentanyl          6.75            1 mcg/kg/hr       6.750  0.1 2016-12-04 06:00:00 mcg/kg/hr
3       1596    38340814   12/4/2016 7:00      Fentanyl          6.75            1 mcg/kg/hr       4.500  0.1 2016-12-04 07:00:00 mcg/kg/hr
4       1596    38340814   12/4/2016 7:40      Fentanyl          6.75            0 mcg/kg/hr       0.000   NA 2016-12-04 07:40:00 mcg/kg/hr
5       1607    38551767 12/24/2016 19:30      Fentanyl          2.60            2 mcg/kg/hr       2.600   NA 2016-12-24 19:30:00 mcg/kg/hr
6       1607    38551767 12/24/2016 20:00      Fentanyl          2.60            2 mcg/kg/hr       5.200  0.2 2016-12-24 20:00:00 mcg/kg/hr
  rate
1    1
2    1
3    1
4    0
5    2
6    2
# MAR dosing data
mar.in0 <- read.csv(file.path(rawDataDir, "MAR_DATA.csv"), check.names = FALSE)
<- dataTransformation(mar.in0, rename = c('Uniq.Id' = 'subject_uid'))
  subject_uid       Date  Time                 med:mDrug   med:dosage med:route med:freq med:given
1    28579217 2017-02-04 19:15               Nicardipine 3 mcg/kg/min        IV     <NA>     Given
2    28579217 2011-10-02 22:11                Famotidine       4.5 mg        IV   q12hrs     Given
3    28579217 2011-10-02 20:17          Morphine sulfate         1 mg        IV  q2h prn     Given
4    28579217 2011-10-03 02:28 Diphenhydramine injection        12 mg        IV      now     Given
5    28579217 2011-10-02 22:11                 Cefazolin       225 mg        IV    q8hrs     Given
6    28579217 2011-10-02 23:30          Morphine sulfate         1 mg        IV  q2h prn     Given
# Serum creatinine lab data
<- readTransform(file.path(rawDataDir, "Creatinine_DATA.csv"),
    rename = c('Subject.uniq' = 'subject_uid'))
  subject_uid     date time creat
1    28579217 02/05/17 4:00  0.52
2    28579217 02/06/17 5:00  0.53
3    28579217 10/03/11 4:28  0.42
4    28579217 10/04/11 4:15  0.35
5    28579217 10/06/11 4:25  0.29
6    28579217 10/09/11 4:45  0.28
# Albumin lab data
<- readTransform(file.path(rawDataDir, "Albumin_DATA.csv"),
    rename = c('Subject.uniq' = 'subject_uid'))

(2) Merge data to create new ID variables

# define list of input datasets
data <-  list(,

# define list of vectors or character strings that identify the ID variables
idcols <-  list(c('subject_id', 'subject_uid'), # id vars in
                'subject_id', # id var in
                'subject_id', # id var in
                c('subject_id', 'subject_uid'), # id vars in
                'subject_uid', # id var in
                'subject_uid', # id var in
                'subject_uid') # id var in

# merge all IDs from cleaned datasets and create new ID variables
id.xwalk <- idCrosswalk(data, idcols,"subject_id","subject_uid")
saveRDS(id.xwalk, file=file.path(dataDir,"module_id_xwalk.rds"))
  subject_id subject_uid mod_visit mod_id mod_id_visit
1      466.0    28579217         1      1          1.1
2      466.1    28579217         2      1          1.2
3     1106.0    34364670         1      2          2.1
4     1444.0    36792472         1      3          3.1
5     1465.0    36292449         1      4          4.1
6     1520.0    34161967         1      5          5.1
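The new ID variables in the crosswalk follow a simple pattern: mod_id numbers the subjects, mod_visit numbers the visits within a subject, and mod_id_visit joins the two. The base R sketch below reproduces the first four rows of the output above. It is an illustration of the pattern only, not the idCrosswalk implementation, and it assumes subjects are numbered in order of subject_id.

```r
# First four ID pairs from the crosswalk output above
ids <- data.frame(
  subject_id  = c(466.0, 466.1, 1106.0, 1444.0),
  subject_uid = c(28579217, 28579217, 34364670, 36792472)
)

# order by visit ID so visits are numbered in sequence (assumption)
ids <- ids[order(ids$subject_id), ]

# visit number within each subject
ids$mod_visit <- ave(ids$subject_id, ids$subject_uid, FUN = seq_along)

# sequential subject number, in order of first appearance
ids$mod_id <- match(ids$subject_uid, unique(ids$subject_uid))

# combined subject.visit identifier
ids$mod_id_visit <- paste(ids$mod_id, ids$mod_visit, sep = ".")

print(ids)
```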

(3) Make new data for use with modules

pullFakeId(data, id.xwalk, firstCols = NULL, orderBy = NULL)
## demographics data
demo.cln <- pullFakeId(, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit'), = 'subject_uid')
  mod_id mod_visit mod_id_visit gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1      2         1          2.1      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2      3         1          3.1      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3      4         1          4.1      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4      5         1          5.1      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5      6         1          6.1      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6      7         1          7.1      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor
1        1657
2        1325
3          NA
4        1745
5        1847
6        1210
saveRDS(demo.cln, file=file.path(dataDir,"demo_mod_id.rds"))

## drug level data
# sampling times
samp.cln <- pullFakeId(, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit', 'samp'), 
    orderBy = c('mod_id_visit','samp'), = 'subject_uid')
  mod_id mod_visit mod_id_visit samp Event.Name Sample.Collection.Date.and.Time
1      1         2          1.2    1   Sample 1                  2/3/2017 10:46
2      1         2          1.2    2   Sample 2                  2/4/2017 20:30
3     10         1         10.1    1   Sample 1                12/23/2016 05:15
4     10         1         10.1    2   Sample 2                12/24/2016 18:00
5     10         1         10.1    3   Sample 3                12/25/2016 03:00
6     10         1         10.1    4   Sample 4                12/26/2016 04:00
saveRDS(samp.cln, file=file.path(dataDir,"samp_mod_id.rds"))

# drug concentration measurements
conc.cln <- pullFakeId(, id.xwalk,
    firstCols = c('record_id', 'mod_id', 'mod_visit', 'mod_id_visit', 'samp'),
    orderBy = 'record_id', = 'subject_uid')
  record_id mod_id mod_visit mod_id_visit samp fentanyl_calc_conc
1         1      1         2          1.2    1         0.01413622
2         2      1         2          1.2    2         0.27982075
3         3      2         1          2.1    1         6.11873679
4         4      2         1          2.1    2         0.59161716
5         5      2         1          2.1    3         0.11280471
6         6      2         1          2.1    4         0.02112153
saveRDS(conc.cln, file=file.path(dataDir,"conc_mod_id.rds"))
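Conceptually, this step joins a cleaned dataset to the crosswalk, drops the original identifiers, and puts the chosen columns first. The sketch below shows that idea with base R merge on a few rows from the outputs above. It is an assumption about the visible behavior, not the pullFakeId source, and first_cols is a local stand-in for the firstCols argument.

```r
# Crosswalk rows and concentration rows copied from the outputs above
xwalk <- data.frame(
  subject_id   = c(466.0, 466.1, 1106.0),
  subject_uid  = c(28579217, 28579217, 34364670),
  mod_visit    = c(1, 2, 1),
  mod_id       = c(1, 1, 2),
  mod_id_visit = c("1.1", "1.2", "2.1"),
  stringsAsFactors = FALSE
)
conc <- data.frame(
  record_id          = 1:4,
  fentanyl_calc_conc = c(0.01413622, 0.27982075, 6.11873679, 0.59161716),
  subject_id         = c(466.1, 466.1, 1106.0, 1106.0),
  samp               = c("1", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# attach the module IDs through the shared visit ID
out <- merge(conc, xwalk, by = "subject_id")

# leading columns first, then the rest, with the original IDs dropped
first_cols <- c("record_id", "mod_id", "mod_visit", "mod_id_visit", "samp")
keep <- c(first_cols, setdiff(names(out), c(first_cols, "subject_id", "subject_uid")))
out <- out[order(out$record_id), keep]
rownames(out) <- NULL

print(out)
```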

## dosing data
# flow
flow.cln <- pullFakeId(, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit'), = 'subject_uid')
  mod_id mod_visit mod_id_visit     Perform.Date FOCUS_MEDNAME Final.Rate..NFR.units. Final.Units Flow           date.time
1      9         1          9.1   12/4/2016 5:30      Fentanyl          6.75            1 mcg/kg/hr       3.375   NA 2016-12-04 05:30:00
2      9         1          9.1   12/4/2016 6:00      Fentanyl          6.75            1 mcg/kg/hr       6.750  0.1 2016-12-04 06:00:00
3      9         1          9.1   12/4/2016 7:00      Fentanyl          6.75            1 mcg/kg/hr       4.500  0.1 2016-12-04 07:00:00
4      9         1          9.1   12/4/2016 7:40      Fentanyl          6.75            0 mcg/kg/hr       0.000   NA 2016-12-04 07:40:00
5     10         1         10.1 12/24/2016 19:30      Fentanyl          2.60            2 mcg/kg/hr       2.600   NA 2016-12-24 19:30:00
6     10         1         10.1 12/24/2016 20:00      Fentanyl          2.60            2 mcg/kg/hr       5.200  0.2 2016-12-24 20:00:00
       unit rate
1 mcg/kg/hr    1
2 mcg/kg/hr    1
3 mcg/kg/hr    1
4 mcg/kg/hr    0
5 mcg/kg/hr    2
6 mcg/kg/hr    2
saveRDS(flow.cln, file=file.path(dataDir,"flow_mod_id.rds"))

# mar
mar.cln <- pullFakeId(, id.xwalk, firstCols = 'mod_id', = 'subject_uid')
  mod_id       Date  Time                 med:mDrug   med:dosage med:route med:freq med:given
1      1 2017-02-04 19:15               Nicardipine 3 mcg/kg/min        IV     <NA>     Given
2      1 2011-10-02 22:11                Famotidine       4.5 mg        IV   q12hrs     Given
3      1 2011-10-02 20:17          Morphine sulfate         1 mg        IV  q2h prn     Given
4      1 2011-10-03 02:28 Diphenhydramine injection        12 mg        IV      now     Given
5      1 2011-10-02 22:11                 Cefazolin       225 mg        IV    q8hrs     Given
6      1 2011-10-02 23:30          Morphine sulfate         1 mg        IV  q2h prn     Given
saveRDS(mar.cln, file=file.path(dataDir,"mar_mod_id.rds"))

## laboratory data
# creatinine
creat.cln <- pullFakeId(, id.xwalk, 'mod_id', = 'subject_uid')
  mod_id     date time creat
1      1 02/05/17 4:00  0.52
2      1 02/06/17 5:00  0.53
3      1 10/03/11 4:28  0.42
4      1 10/04/11 4:15  0.35
5      1 10/06/11 4:25  0.29
6      1 10/09/11 4:45  0.28
saveRDS(creat.cln, file=file.path(dataDir,"creat_mod_id.rds"))

# albumin
alb.cln <- pullFakeId(, id.xwalk, 'mod_id', = 'subject_uid')
  mod_id     date  time alb
1      8 07/30/20  5:23 2.9
2      8 07/28/20  3:12 2.0
3      8 07/29/20  1:39 2.7
4      8 08/21/20 10:35 4.1
5      4 06/13/15 17:20 4.1
6      6 07/25/16  8:35 2.3
saveRDS(alb.cln, file=file.path(dataDir,"alb_mod_id.rds"))
# set crosswalk option 
xwalk <- readRDS(file.path(dataDir, "module_id_xwalk.rds"))
options(pkxwalk = 'xwalk')

# define parameters
drugname <- 'fent'
LLOQ <- 0.05


# helper function
exclude_val <- function(x, val=1) { !is.na(x) & x == val }

demo.out <- run_Demo(demo.path = file.path(dataDir, "demo_mod_id.rds"),
    demo.columns = list(id = 'mod_id_visit'),
    toexclude = expression(exclude_val(in_hospital_mortality) | exclude_val(add_ecmo)),
    demo.mod.list = list(length_of_icu_stay = 
                        expression(daysDiff(surgery_date, date_icu_dc))))
The number of subjects in the demographic data, who meet the exclusion criteria: 2
  mod_id mod_visit mod_id_visit gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1      2         1          2.1      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2      3         1          3.1      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3      4         1          4.1      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4      5         1          5.1      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5      6         1          6.1      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6      7         1          7.1      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor length_of_icu_stay
1        1657                  4
2        1325                  2
3          NA                  1
4        1745                  1
5        1847                  7
6        1210                  7
[1] "6.1"  "13.1"
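The exclusion rule and the derived length_of_icu_stay can be reproduced by hand for two of the rows shown above. The sketch below assumes the helper is meant to be NA-safe (TRUE only when the value is present and equal to val) and uses base R date arithmetic in place of daysDiff; flag_val and to_date are local illustrative names.

```r
# NA-safe flag: TRUE only when x is present and equal to val (assumed intent)
flag_val <- function(x, val = 1) { !is.na(x) & x == val }

# Two rows from the demographics output above
demo <- data.frame(
  mod_id_visit          = c("2.1", "6.1"),
  surgery_date          = c("6/28/2014", "7/23/2016"),
  date_icu_dc           = c("7/2/2014", "7/30/2016"),
  in_hospital_mortality = c(0, 1),
  add_ecmo              = c(0, 0),
  stringsAsFactors = FALSE
)

# length of ICU stay in days, from surgery date to ICU discharge date
to_date <- function(x) as.Date(x, format = "%m/%d/%Y")
demo$length_of_icu_stay <- as.numeric(to_date(demo$date_icu_dc) - to_date(demo$surgery_date))

# a visit is excluded if either criterion is flagged
demo$exclude <- flag_val(demo$in_hospital_mortality) | flag_val(demo$add_ecmo)

print(demo[, c("mod_id_visit", "length_of_icu_stay", "exclude")])
```

Without the NA check, a missing value in either criterion would propagate NA into the exclusion flag instead of FALSE.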

Pro-Med-Str Part I: IV dose data

ivdose.out <- run_MedStrI(
    mar.columns = list(id='mod_id', datetime=c('Date','Time'), dose='med:dosage', drug='med:mDrug', given='med:given'),
    medGivenReq = TRUE,
    flow.columns = list(id = 'mod_id', datetime = 'date.time', finalunits = 'Final.Units', 
                        unit = 'unit', rate = 'rate', weight = ''),
    medchk.path=file.path(system.file("examples", "str_ex2", package="EHR"), sprintf('medChecked-%s.csv', drugname)),
    demo.list = NULL,
    demo.columns = list(),
    missing.wgt.path = NULL,
    wgt.columns = list(),
    check.path = checkDir,
    failflow_fn = 'FailFlow',
    failunit_fn = 'Unit',
    failnowgt_fn = 'NoWgt',
    infusion.unit = 'mcg/kg/hr',
    bolus.unit = 'mcg',
    bol.rate.thresh = Inf,
    rateunit = 'mcg/hr',
    ratewgtunit = 'mcg/kg/hr',
    weightunit = 'kg',
    drugname = drugname)
The number of rows in the original data                124
The number of rows after removing the duplicates       124
no units other than mcg/kg/hr or mcg, file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failUnit-fent.csv not created
33 rows from 1 subjects with "kg" in infusion unit but missing weight, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failNoWgt-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixNoWgt-fent.csv
censor dates created, please see /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/CensorTime-fent.csv

  mod_id  date.dose infuse.time.real infuse.time infuse.dose          bolus.time bolus.dose given.dose maxint weight
1      1 2011-10-02             <NA>        <NA>          NA 2011-10-02 15:35:00         25         NA      0     NA
2      1 2011-10-02             <NA>        <NA>          NA 2011-10-02 17:26:00         25         NA      0     NA
3      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:15:00         50         NA      0     NA
4      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:30:00         20         NA      0     NA
5      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 20:57:00         20         NA      0     NA
6      2 2014-06-28             <NA>        <NA>          NA 2014-06-28 08:15:00         20         NA      0     NA
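The call above sets the infusion unit to mcg/kg/hr and the rate unit to mcg/hr, and the log reports rows with "kg" in the infusion unit but no weight. The sketch below illustrates the arithmetic this implies: a weight-based rate needs a weight to become an absolute rate. The weights are hypothetical and the code is not taken from run_MedStrI.

```r
# Toy flow records; weights are hypothetical, NA marks a missing weight
flow <- data.frame(
  mod_id = c(9, 9, 10),
  unit   = c("mcg/kg/hr", "mcg/kg/hr", "mcg/kg/hr"),
  rate   = c(1, 1, 2),
  weight = c(6.75, 6.75, NA),
  stringsAsFactors = FALSE
)

# rows whose unit is weight-based
per_kg <- grepl("/kg/", flow$unit, fixed = TRUE)

# convert mcg/kg/hr to mcg/hr by multiplying by weight in kg
flow$rate_mcg_hr <- ifelse(per_kg, flow$rate * flow$weight, flow$rate)

# weight-based rows that cannot be converted because weight is missing
no_wgt <- flow[per_kg & is.na(flow$weight), ]

print(flow)
print(no_wgt)
```

Rows like the one in no_wgt are the kind that would need a weight supplied before an absolute rate can be computed.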

Pro-Drug Level

conc.out <- run_DrugLevel(conc.path=file.path(dataDir,"conc_mod_id.rds"),
    conc.columns = list(id = 'mod_id', conc = 'conc.level', idvisit = 'mod_id_visit', samplinkid = 'mod_id_event'),'mod_id','mod_id_visit','samp','fentanyl_calc_conc'),
    conc.rename=c(fentanyl_calc_conc = 'conc.level', samp= 'event'),
    conc.mod.list=list(mod_id_event = expression(paste(mod_id_visit, event, sep = '_'))),
    samp.columns = list(conclinkid = 'mod_id_event', datetime = 'Sample.Collection.Date.and.Time'),
    samp.mod.list=list(mod_id_event = expression(paste(mod_id_visit, samp, sep = '_'))),
    failmiss_fn = 'MissingConcDate-',
    multsets_fn = 'multipleSetsConc-',
    faildup_fn = 'DuplicateConc-', 
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'))
3 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failMissingConcDate-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv
subjects with concentration missing from sample file
 mod_id mod_id_event
      8        8.1_1
      8        8.1_2
      8        8.1_3
1 subjects have multiple sets of concentration data
16 total unique subjects ids (including multiple visits) currently in the concentration data
15 total unique subjects in the concentration data
15 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/multipleSetsConc-fent2023-11-02.csv
15 total unique subjects ids (after excluding multiple visits) in the concentration data
15 total unique subjects in the concentration data
   mod_id mod_id_visit event  conc.level mod_id_event           date.time eid
1       1          1.2     1 0.014136220        1.2_1 2017-02-03 10:46:00   1
2       1          1.2     2 0.279820752        1.2_2 2017-02-04 20:30:00   1
55     10         10.1     2 3.136047304       10.1_2 2016-12-24 18:00:00   1
56     10         10.1     9 0.004720171       10.1_9 2017-01-01 04:20:00   1
57     10         10.1    10 0.017136367      10.1_10 2017-01-02 04:42:00   1
58     10         10.1    12 0.006335571      10.1_12 2017-01-04 03:40:00   1
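The link between concentrations and sampling times rests on the mod_id_event key built in both mod.list arguments. The base R sketch below builds that key on toy rows, joins the two tables, and lists concentrations with no sampling time, which is the situation reported for mod_id 8. The concentration for visit 8.1 is a hypothetical value, and the code is an illustration rather than the run_DrugLevel internals.

```r
# Concentrations: two real rows from the output plus one hypothetical row for 8.1
conc <- data.frame(
  mod_id_visit = c("1.2", "1.2", "8.1"),
  event        = c(1, 2, 1),
  conc.level   = c(0.014136220, 0.279820752, 0.5),
  stringsAsFactors = FALSE
)

# Sampling times: nothing recorded for visit 8.1
samp <- data.frame(
  mod_id_visit = c("1.2", "1.2"),
  samp         = c(1, 2),
  datetime     = c("2/3/2017 10:46", "2/4/2017 20:30"),
  stringsAsFactors = FALSE
)

# build the same link key on both sides
conc$mod_id_event <- paste(conc$mod_id_visit, conc$event, sep = "_")
samp$mod_id_event <- paste(samp$mod_id_visit, samp$samp, sep = "_")

# left join keeps every concentration, with NA where no time was found
linked <- merge(conc, samp[, c("mod_id_event", "datetime")],
                by = "mod_id_event", all.x = TRUE)
linked$date.time <- as.POSIXct(linked$datetime, format = "%m/%d/%Y %H:%M", tz = "UTC")

# concentrations that still need a sampling time
missing_time <- linked[is.na(linked$date.time), c("mod_id_visit", "mod_id_event")]

print(linked)
print(missing_time)
```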
( <- read.csv(file.path(checkDir,"failMissingConcDate-fent.csv")) )
  subject_id subject_uid mod_id_event datetime
1       1566    35885929        8.1_1       NA
2       1566    35885929        8.1_2       NA
3       1566    35885929        8.1_3       NA
[,"datetime"] <- c("9/30/2016 09:32","10/1/2016 19:20","10/2/2016 02:04")
  subject_id subject_uid mod_id_event        datetime
1       1566    35885929        8.1_1 9/30/2016 09:32
2       1566    35885929        8.1_2 10/1/2016 19:20
3       1566    35885929        8.1_3 10/2/2016 02:04
write.csv(, file.path(checkDir,"fixMissingConcDate-fent.csv"))
conc.out <- run_DrugLevel(conc.path=file.path(dataDir,"conc_mod_id.rds"),
    conc.columns = list(id = 'mod_id', conc = 'conc.level', idvisit = 'mod_id_visit', samplinkid = 'mod_id_event'),'mod_id','mod_id_visit','samp','fentanyl_calc_conc'),
    conc.rename=c(fentanyl_calc_conc = 'conc.level', samp= 'event'),
    conc.mod.list=list(mod_id_event = expression(paste(mod_id_visit, event, sep = '_'))),
    samp.columns = list(conclinkid = 'mod_id_event', datetime = 'Sample.Collection.Date.and.Time'),
    samp.mod.list=list(mod_id_event = expression(paste(mod_id_visit, samp, sep = '_'))),
    failmiss_fn = 'MissingConcDate-',
    multsets_fn = 'multipleSetsConc-',
    faildup_fn = 'DuplicateConc-',
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'))
3 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failMissingConcDate-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv
file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv read with failures replaced
1 subjects have multiple sets of concentration data
16 total unique subjects ids (including multiple visits) currently in the concentration data
15 total unique subjects in the concentration data
15 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/multipleSetsConc-fent2023-11-02.csv
15 total unique subjects ids (after excluding multiple visits) in the concentration data
15 total unique subjects in the concentration data


creat.out <- run_Labs(lab.path=file.path(dataDir,"creat_mod_id.rds"), = c('mod_id','date.time','creat'),
    lab.mod.list = list(date.time = expression(parse_dates(fixDates(paste(date, time))))))

alb.out <- run_Labs(lab.path=file.path(dataDir,"alb_mod_id.rds"), = c('mod_id','date.time','alb'),
    lab.mod.list = list(date.time = expression(parse_dates(fixDates(paste(date, time))))))

lab.out <- list(creat.out, alb.out)

List of 2
 $ :'data.frame':   266 obs. of  3 variables:
  ..$ mod_id   : int [1:266] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ date.time: POSIXct[1:266], format: "2017-02-05 04:00:00" "2017-02-06 05:00:00" "2011-10-03 04:28:00" "2011-10-04 04:15:00" ...
  ..$ creat    : num [1:266] 0.52 0.53 0.42 0.35 0.29 0.28 0.34 0.59 0.54 0.26 ...
 $ :'data.frame':   44 obs. of  3 variables:
  ..$ mod_id   : int [1:44] 8 8 8 8 4 6 6 9 10 10 ...
  ..$ date.time: POSIXct[1:44], format: "2020-07-30 05:23:00" "2020-07-28 03:12:00" "2020-07-29 01:39:00" "2020-08-21 10:35:00" ...
  ..$ alb      : num [1:44] 2.9 2 2.7 4.1 4.1 2.3 2.6 3 3.1 4.2 ...
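The lab step combines the separate date and time columns into a single date.time and keeps three columns. The sketch below shows the same transformation with base R as.POSIXct on two creatinine rows from earlier in the tutorial. It stands in for parse_dates and fixDates, whose exact handling of formats is not reproduced here, and it uses UTC only to keep the example independent of the local time zone.

```r
# Two creatinine rows from the raw-data output
creat <- data.frame(
  mod_id = c(1, 1),
  date   = c("02/05/17", "10/03/11"),
  time   = c("4:00", "4:28"),
  creat  = c(0.52, 0.42),
  stringsAsFactors = FALSE
)

# paste date and time, then parse with an explicit two-digit-year format
creat$date.time <- as.POSIXct(paste(creat$date, creat$time),
                              format = "%m/%d/%y %H:%M", tz = "UTC")

# keep only the columns the lab module selects
creat <- creat[, c("mod_id", "date.time", "creat")]

print(creat)
```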


pk_dat <- run_Build_PK_IV(
    conc.columns = list(id = 'mod_id', datetime = 'date.time', druglevel = 'conc.level', 
                        idvisit = 'mod_id_visit'),
    dose.columns = list(id = 'mod_id', date = 'date.dose', infuseDatetime = 'infuse.time', 
                        infuseDose = 'infuse.dose', infuseTimeExact= 'infuse.time.real',
                        bolusDatetime = 'bolus.time', bolusDose = 'bolus.dose', 
                        gap = 'maxint', weight = 'weight'),
    demo.list = demo.out,
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'),
    lab.list = lab.out,
    lab.columns = list(id = 'mod_id', datetime = 'date.time'),
    date.format="%m/%d/%y %H:%M:%S","America/Chicago")
0 duplicated rows
The dimension of the PK data before merging with demographics: 234 x 9
The number of subjects in the PK data before merging with demographics: 15
The number of subjects in the demographic file, who meet the exclusion criteria: 2
check NA frequency in demographics, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fent-missing-demo.csv
Some demographic variables are missing and will be excluded: 
The list of final demographic variables: mod_visit
Checked: there are no missing creat
List of IDs missing at least 1 alb: 1.2
The dimension of the final PK data exported with the key demographics: 197 x 24 with 13 distinct subjects (mod_id)
# convert id back to original IDs
pk_dat <- pullRealId(pk_dat,

     subject_id subject_uid time   amt        dv rate mdv evid              date gender weight height surgery_date ageatsurgery stat_sts
2         466.1    28579217 0.00  50.0        NA  0.0   1    1 02/04/17 16:15:00      0  21.99 116.90     2/4/2017         2451        1
2.1       466.1    28579217 0.25  20.0        NA  0.0   1    1 02/04/17 16:30:00      0  21.99 116.90     2/4/2017         2451        1
2.2       466.1    28579217 4.25    NA 0.2798208   NA   0    0 02/04/17 20:30:00      0  21.99 116.90     2/4/2017         2451        1
12       1607.0    38551767 0.00 109.2        NA 10.4   1    1 12/24/16 07:15:00      0   2.60  45.94   12/24/2016           23        3
12.1     1607.0    38551767 0.00  10.0        NA  0.0   1    1 12/24/16 07:15:00      0   2.60  45.94   12/24/2016           23        3
12.2     1607.0    38551767 1.25  15.0        NA  0.0   1    1 12/24/16 08:30:00      0   2.60  45.94   12/24/2016           23        3
     cpb_sts in_hospital_mortality add_ecmo date_icu_dc time_fromor length_of_icu_stay weight_demo creat alb
2        107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
2.1      107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
2.2      107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
12       110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
12.1     110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
12.2     110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
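The time, evid, and mdv columns in the final data can be checked by hand for subject_id 466.1. The sketch below assumes that time is hours since the subject's first dose and that dose rows carry evid = 1 and mdv = 1, which is consistent with the three rows shown above. It is a reading of the output, not the run_Build_PK_IV source, and UTC is used only so the example does not depend on the local time zone.

```r
# The three records for subject_id 466.1 from the output above
ev <- data.frame(
  subject_id = 466.1,
  datetime   = as.POSIXct(c("2017-02-04 16:15:00",
                            "2017-02-04 16:30:00",
                            "2017-02-04 20:30:00"), tz = "UTC"),
  amt        = c(50, 20, NA),
  dv         = c(NA, NA, 0.2798208)
)

# elapsed hours since the first record (here, the first dose)
ev$time <- as.numeric(difftime(ev$datetime, min(ev$datetime), units = "hours"))

# evid: 1 for dose records, 0 for observations
ev$evid <- as.integer(!is.na(ev$amt))

# mdv: 1 when the dependent variable (concentration) is missing
ev$mdv <- as.integer(is.na(ev$dv))

print(ev)
```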


  1. Choi L, Beck C, McNeer E, Weeks HL, Williams ML, James NT, Niu X, Abou-Khalil BW, Birdwell KA, Roden DM, Stein CM. Development of a System for Post-marketing Population Pharmacokinetic and Pharmacodynamic Studies using Real-World Data from Electronic Health Records. Clinical Pharmacology & Therapeutics. 2020 Apr;107(4):934-43. doi: 10.1002/cpt.1787.

