Build-PK-IV - Comprehensive

This tutorial describes a comprehensive PK data building procedure for medications that are intravenously administered. There are two phases: data processing, which standardizes and combines the input data (Pro-Demographic, Pro-Med-Str, Pro-Drug Level, Pro-Laboratory), and data building, which creates the final PK data (Build-PK-IV).

Nathan T. James

Introduction

This tutorial describes four modules for processing data (Pro-Demographic, Pro-Med-Str, Pro-Drug Level, Pro-Laboratory) and one module for PK data building (Build-PK-IV) using data extracted from a structured database.

To begin, we load the EHR, pkdata, and lubridate packages.

# load EHR package and dependencies
library(EHR)
library(pkdata)
library(lubridate)
# define 3 directories
rawDataDir <- system.file("examples", "str_ex2", package="EHR") # directory for raw data

td <- tempdir()
checkDir <- file.path(td, 'checks') # directory for interactive checking
dir.create(checkDir)

dataDir <- file.path(td, 'data') # directory for processed data
dir.create(dataDir)

# examine raw data files in rawDataDir
dir(rawDataDir)
[1] "Albumin_DATA.csv"             "Creatinine_DATA.csv"          "Demographics_DATA.csv"        "e-rx_DATA.csv"               
[5] "FLOW_DATA.csv"                "MAR_DATA.csv"                 "medChecked-fent.csv"          "SampleConcentration_DATA.csv"
[9] "SampleTimes_DATA.csv"        

Pre-Processing for Raw Extracted Data

The raw datasets must go through a pre-processing stage which creates new ID variables and datasets that can be used by the data processing modules. There are three pre-processing steps:

  1. read and clean raw data
  2. merge raw data to create new ID variables
  3. make new data for use with modules

Each raw dataset should contain a subject unique ID, a subject visit ID, or both. In this example the subject unique ID is called subject_uid and the subject visit ID is called subject_id. The subject visit ID combines subject and visit/course: subject_id 14.0 is the first course for subject 14, subject_id 14.1 is the second course for subject 14, and so on. subject_uid is a unique ID that is the same across all of a subject's records. The integer part of subject_id has a 1-to-1 correspondence with subject_uid; in this example, subject_uid 62734832 is associated with both subject_id 14.0 and subject_id 14.1. If there is only a single visit/course per subject, only the subject unique ID is needed.
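
As a small illustration of the ID convention described above, the following base R sketch splits a visit ID into its subject part and a course number. The values and the names subject_part and course are made up for illustration and are not part of the EHR package; the arithmetic assumes the decimal part is a single digit (at most ten courses per subject).

```r
# toy visit IDs following the convention above (hypothetical values)
subject_id <- c(14.0, 14.1, 15.0)

# the integer part identifies the subject
subject_part <- floor(subject_id)

# the first decimal digit is a zero-based course index; add 1 for a 1-based course number
course <- round((subject_id - subject_part) * 10) + 1

data.frame(subject_id, subject_part, course)
```

Rounding before adding 1 guards against floating-point noise in the subtraction.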

(1) Read and clean raw data

# demographics data
demo.in <- readTransform(file.path(rawDataDir, "Demographics_DATA.csv"))
head(demo.in)
  subject_id subject_uid gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1       1106    34364670      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2       1444    36792472      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3       1465    36292449      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4       1520    34161967      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5       1524    37857374      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6       1550    37826262      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor
1        1657
2        1325
3          NA
4        1745
5        1847
6        1210
# read SampleTimes_DATA.csv
samp.in <- readTransform(file.path(rawDataDir, "SampleTimes_DATA.csv"),
    rename = c('Study.ID' = 'subject_id'),
    modify = list(samp = expression(as.numeric(sub('Sample ', '', Event.Name)))))
head(samp.in)
  subject_id Event.Name Sample.Collection.Date.and.Time samp
1      466.1   Sample 1                  2/3/2017 10:46    1
2      466.1   Sample 2                  2/4/2017 20:30    2
3     1106.0   Sample 1                 6/28/2014 13:40    1
4     1106.0   Sample 2                 6/29/2014 03:10    2
5     1106.0   Sample 3                 6/30/2014 03:35    3
6     1106.0   Sample 4                  7/1/2014 03:45    4
# helper function used to make subject_id
sampId <- function(x) {
  # remove leading zeroes or trailing periods
  subid <- gsub('(^0*|\\.$)', '', x)
  # change the '_' that precedes a course number (digits followed by another '_') to '.'
  gsub('_([0-9]+[_].*)$', '.\\1', subid)
}

# read SampleConcentration_DATA.csv
conc.in <- readTransform(file.path(rawDataDir, "SampleConcentration_DATA.csv"),
  modify = list(
    subid = expression(sampId(name)),
    subject_id = expression(as.numeric(sub('[_].*', '', subid))),
    samp = expression(sub('[^_]*[_]', '', subid)),
    name = NULL,
    data_file = NULL,
    subid = NULL
    )
  )
head(conc.in)
  record_id fentanyl_calc_conc subject_id samp
1         1         0.01413622      466.1    1
2         2         0.27982075      466.1    2
3         3         6.11873679     1106.0    1
4         4         0.59161716     1106.0    2
5         5         0.11280471     1106.0    3
6         6         0.02112153     1106.0    4
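
To make the effect of the sampId helper easier to see, here is a minimal sketch that applies the same regular expressions to two made-up sample names. The name format (subject, optional course, sample number, separated by underscores) is an assumption for illustration; the raw file may use a different layout.

```r
# same helper as above, repeated so this block runs on its own
sampId <- function(x) {
  subid <- gsub('(^0*|\\.$)', '', x)
  gsub('_([0-9]+[_].*)$', '.\\1', subid)
}

# hypothetical raw names: one with a leading zero and a course number, one without
name <- c("0466_1_2", "1106_3")

subid <- sampId(name)                              # "466.1_2" "1106_3"
subject_id <- as.numeric(sub('[_].*', '', subid))  # part before the first '_'
samp <- sub('[^_]*[_]', '', subid)                 # part after the first '_'

data.frame(name, subid, subject_id, samp)
```

Note that samp is still a character vector at this point; convert it with as.numeric if a numeric sample index is needed.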
# FLOW dosing data
flow.in <- readTransform(file.path(rawDataDir, "FLOW_DATA.csv"),
                         rename = c('Subject.Id' = 'subject_id',
                                    'Subject.Uniq.Id' = 'subject_uid')) 
# pre-process the flow data 
# date.time variable should be in an appropriate form
flow.in[,'date.time'] <- pkdata::parse_dates(EHR:::fixDates(flow.in[,'Perform.Date']))
# unit and rate are required: separate unit and rate from 'Final.Rate..NFR.units.' if needed
flow.in[,'unit'] <- sub('.*[ ]', '', flow.in[,'Final.Rate..NFR.units.'])
flow.in[,'rate'] <- as.numeric(sub('([0-9.]+).*', '\\1', flow.in[,'Final.Rate..NFR.units.']))
head(flow.in)
  subject_id subject_uid     Perform.Date FOCUS_MEDNAME Final.Wt..kg. Final.Rate..NFR.units. Final.Units Flow           date.time      unit
1       1596    38340814   12/4/2016 5:30      Fentanyl          6.75            1 mcg/kg/hr       3.375   NA 2016-12-04 05:30:00 mcg/kg/hr
2       1596    38340814   12/4/2016 6:00      Fentanyl          6.75            1 mcg/kg/hr       6.750  0.1 2016-12-04 06:00:00 mcg/kg/hr
3       1596    38340814   12/4/2016 7:00      Fentanyl          6.75            1 mcg/kg/hr       4.500  0.1 2016-12-04 07:00:00 mcg/kg/hr
4       1596    38340814   12/4/2016 7:40      Fentanyl          6.75            0 mcg/kg/hr       0.000   NA 2016-12-04 07:40:00 mcg/kg/hr
5       1607    38551767 12/24/2016 19:30      Fentanyl          2.60            2 mcg/kg/hr       2.600   NA 2016-12-24 19:30:00 mcg/kg/hr
6       1607    38551767 12/24/2016 20:00      Fentanyl          2.60            2 mcg/kg/hr       5.200  0.2 2016-12-24 20:00:00 mcg/kg/hr
  rate
1    1
2    1
3    1
4    0
5    2
6    2
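
The unit and rate extraction above relies on two sub() patterns. The sketch below runs them on a few made-up rate strings, and also parses one timestamp with base R's as.POSIXct as a stand-in for pkdata::parse_dates (the two are not claimed to be equivalent). The vector names and the UTC time zone are assumptions for illustration.

```r
# hypothetical rate strings in the style of 'Final.Rate..NFR.units.'
final_rate <- c("1 mcg/kg/hr", "0.5 mcg/kg/hr", "0 mcg/kg/hr")

# everything after the last space is the unit
unit <- sub('.*[ ]', '', final_rate)

# the leading run of digits and dots is the numeric rate
rate <- as.numeric(sub('([0-9.]+).*', '\\1', final_rate))

# base R parsing of a timestamp written like 'Perform.Date'
perform_date <- as.POSIXct("12/4/2016 5:30", format = "%m/%d/%Y %H:%M", tz = "UTC")

data.frame(final_rate, unit, rate)
format(perform_date, "%Y-%m-%d %H:%M:%S")
```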
# MAR dosing data
mar.in0 <- read.csv(file.path(rawDataDir, "MAR_DATA.csv"), check.names = FALSE)
mar.in <- dataTransformation(mar.in0, rename = c('Uniq.Id' = 'subject_uid'))
head(mar.in)
  subject_uid       Date  Time                 med:mDrug   med:dosage med:route med:freq med:given
1    28579217 2017-02-04 19:15               Nicardipine 3 mcg/kg/min        IV     <NA>     Given
2    28579217 2011-10-02 22:11                Famotidine       4.5 mg        IV   q12hrs     Given
3    28579217 2011-10-02 20:17          Morphine sulfate         1 mg        IV  q2h prn     Given
4    28579217 2011-10-03 02:28 Diphenhydramine injection        12 mg        IV      now     Given
5    28579217 2011-10-02 22:11                 Cefazolin       225 mg        IV    q8hrs     Given
6    28579217 2011-10-02 23:30          Morphine sulfate         1 mg        IV  q2h prn     Given
# Serum creatinine lab data
creat.in <- readTransform(file.path(rawDataDir, "Creatinine_DATA.csv"),
    rename = c('Subject.uniq' = 'subject_uid'))
head(creat.in)
  subject_uid     date time creat
1    28579217 02/05/17 4:00  0.52
2    28579217 02/06/17 5:00  0.53
3    28579217 10/03/11 4:28  0.42
4    28579217 10/04/11 4:15  0.35
5    28579217 10/06/11 4:25  0.29
6    28579217 10/09/11 4:45  0.28
# Albumin lab data
alb.in <- readTransform(file.path(rawDataDir, "Albumin_DATA.csv"),
    rename = c('Subject.uniq' = 'subject_uid'))
head(alb.in)

(2) Merge data to create new ID variables

# define list of input datasets
data <-  list(demo.in,
              samp.in,
              conc.in,
              flow.in,
              mar.in,
              creat.in,
              alb.in)

# define list of vectors or character strings that identify the ID variables
idcols <-  list(c('subject_id', 'subject_uid'), # id vars in demo.in
                'subject_id', # id var in samp.in
                'subject_id', # id var in conc.in
                c('subject_id', 'subject_uid'), # id vars in flow.in
                'subject_uid', # id var in mar.in
                'subject_uid', # id var in creat.in
                'subject_uid') # id var in alb.in

# merge all IDs from cleaned datasets and create new ID variables
id.xwalk <- idCrosswalk(data, idcols, visit.id="subject_id", uniq.id="subject_uid")
saveRDS(id.xwalk, file=file.path(dataDir,"module_id_xwalk.rds"))
head(id.xwalk)
  subject_id subject_uid mod_visit mod_id mod_id_visit
1      466.0    28579217         1      1          1.1
2      466.1    28579217         2      1          1.2
3     1106.0    34364670         1      2          2.1
4     1444.0    36792472         1      3          3.1
5     1465.0    36292449         1      4          4.1
6     1520.0    34161967         1      5          5.1
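
The new ID variables in the crosswalk can be understood with a small base R sketch. This is not the idCrosswalk implementation, only one way to arrive at IDs of the same shape: a sequential subject number, a visit counter within subject, and their combination. The three rows reuse the first IDs shown above, and numbering subjects by first appearance is an assumption.

```r
# first three ID pairs from the crosswalk output above
ids <- data.frame(
  subject_id  = c(466.0, 466.1, 1106.0),
  subject_uid = c(28579217, 28579217, 34364670)
)

# one sequential ID per unique subject, in order of first appearance
ids$mod_id <- match(ids$subject_uid, unique(ids$subject_uid))

# visit counter within each subject
ids$mod_visit <- ave(ids$subject_id, ids$subject_uid, FUN = seq_along)

# combined subject.visit ID
ids$mod_id_visit <- paste(ids$mod_id, ids$mod_visit, sep = ".")

ids
```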

(3) Make new data for use with modules

# general form of the call (not run), applied to one cleaned dataset at a time:
# pullFakeId(dat, id.xwalk, firstCols = NULL, orderBy = NULL)
## demographics data
demo.cln <- pullFakeId(demo.in, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit'),
    uniq.id = 'subject_uid')
head(demo.cln)
  mod_id mod_visit mod_id_visit gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1      2         1          2.1      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2      3         1          3.1      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3      4         1          4.1      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4      5         1          5.1      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5      6         1          6.1      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6      7         1          7.1      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor
1        1657
2        1325
3          NA
4        1745
5        1847
6        1210
saveRDS(demo.cln, file=file.path(dataDir,"demo_mod_id.rds"))
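
Conceptually, replacing the original IDs with module IDs is a lookup against the crosswalk. The sketch below shows that idea in base R with a hypothetical helper, swap_ids, and toy lab values; it is not the pullFakeId implementation and handles only the single-key case.

```r
# toy crosswalk and lab data (values are illustrative)
xwalk <- data.frame(subject_uid = c(28579217, 34364670), mod_id = c(1, 2))
labs <- data.frame(
  subject_uid = c(34364670, 28579217, 28579217),
  value = c(0.7, 0.5, 0.6)
)

# hypothetical helper: look up the new ID, drop the original key, put the new ID first
swap_ids <- function(dat, xwalk, key = "subject_uid", new = "mod_id") {
  idx <- match(dat[[key]], xwalk[[key]])
  out <- cbind(xwalk[idx, new, drop = FALSE], dat[setdiff(names(dat), key)])
  rownames(out) <- NULL
  out
}

labs_mod <- swap_ids(labs, xwalk)
labs_mod
```

Row order is preserved, so any sorting (as with orderBy above) would be a separate step.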

## drug level data
# sampling times
samp.cln <- pullFakeId(samp.in, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit', 'samp'), 
    orderBy = c('mod_id_visit','samp'),
    uniq.id = 'subject_uid')
head(samp.cln)
  mod_id mod_visit mod_id_visit samp Event.Name Sample.Collection.Date.and.Time
1      1         2          1.2    1   Sample 1                  2/3/2017 10:46
2      1         2          1.2    2   Sample 2                  2/4/2017 20:30
3     10         1         10.1    1   Sample 1                12/23/2016 05:15
4     10         1         10.1    2   Sample 2                12/24/2016 18:00
5     10         1         10.1    3   Sample 3                12/25/2016 03:00
6     10         1         10.1    4   Sample 4                12/26/2016 04:00
saveRDS(samp.cln, file=file.path(dataDir,"samp_mod_id.rds"))

# drug concentration measurements
conc.cln <- pullFakeId(conc.in, id.xwalk,
    firstCols = c('record_id', 'mod_id', 'mod_visit', 'mod_id_visit', 'samp'),
    orderBy = 'record_id',
    uniq.id = 'subject_uid')
head(conc.cln)
  record_id mod_id mod_visit mod_id_visit samp fentanyl_calc_conc
1         1      1         2          1.2    1         0.01413622
2         2      1         2          1.2    2         0.27982075
3         3      2         1          2.1    1         6.11873679
4         4      2         1          2.1    2         0.59161716
5         5      2         1          2.1    3         0.11280471
6         6      2         1          2.1    4         0.02112153
saveRDS(conc.cln, file=file.path(dataDir,"conc_mod_id.rds"))

## dosing data
# flow
flow.cln <- pullFakeId(flow.in, id.xwalk,
    firstCols = c('mod_id', 'mod_visit', 'mod_id_visit'),
    uniq.id = 'subject_uid')
head(flow.cln)
  mod_id mod_visit mod_id_visit     Perform.Date FOCUS_MEDNAME Final.Wt..kg. Final.Rate..NFR.units. Final.Units Flow           date.time
1      9         1          9.1   12/4/2016 5:30      Fentanyl          6.75            1 mcg/kg/hr       3.375   NA 2016-12-04 05:30:00
2      9         1          9.1   12/4/2016 6:00      Fentanyl          6.75            1 mcg/kg/hr       6.750  0.1 2016-12-04 06:00:00
3      9         1          9.1   12/4/2016 7:00      Fentanyl          6.75            1 mcg/kg/hr       4.500  0.1 2016-12-04 07:00:00
4      9         1          9.1   12/4/2016 7:40      Fentanyl          6.75            0 mcg/kg/hr       0.000   NA 2016-12-04 07:40:00
5     10         1         10.1 12/24/2016 19:30      Fentanyl          2.60            2 mcg/kg/hr       2.600   NA 2016-12-24 19:30:00
6     10         1         10.1 12/24/2016 20:00      Fentanyl          2.60            2 mcg/kg/hr       5.200  0.2 2016-12-24 20:00:00
       unit rate
1 mcg/kg/hr    1
2 mcg/kg/hr    1
3 mcg/kg/hr    1
4 mcg/kg/hr    0
5 mcg/kg/hr    2
6 mcg/kg/hr    2
saveRDS(flow.cln, file=file.path(dataDir,"flow_mod_id.rds"))

# mar
mar.cln <- pullFakeId(mar.in, id.xwalk, firstCols = 'mod_id', uniq.id = 'subject_uid')
head(mar.cln)
  mod_id       Date  Time                 med:mDrug   med:dosage med:route med:freq med:given
1      1 2017-02-04 19:15               Nicardipine 3 mcg/kg/min        IV     <NA>     Given
2      1 2011-10-02 22:11                Famotidine       4.5 mg        IV   q12hrs     Given
3      1 2011-10-02 20:17          Morphine sulfate         1 mg        IV  q2h prn     Given
4      1 2011-10-03 02:28 Diphenhydramine injection        12 mg        IV      now     Given
5      1 2011-10-02 22:11                 Cefazolin       225 mg        IV    q8hrs     Given
6      1 2011-10-02 23:30          Morphine sulfate         1 mg        IV  q2h prn     Given
saveRDS(mar.cln, file=file.path(dataDir,"mar_mod_id.rds"))

## laboratory data
# creatinine
creat.cln <- pullFakeId(creat.in, id.xwalk, firstCols = 'mod_id', uniq.id = 'subject_uid')
head(creat.cln)
  mod_id     date time creat
1      1 02/05/17 4:00  0.52
2      1 02/06/17 5:00  0.53
3      1 10/03/11 4:28  0.42
4      1 10/04/11 4:15  0.35
5      1 10/06/11 4:25  0.29
6      1 10/09/11 4:45  0.28
saveRDS(creat.cln, file=file.path(dataDir,"creat_mod_id.rds"))

# albumin
alb.cln <- pullFakeId(alb.in, id.xwalk, firstCols = 'mod_id', uniq.id = 'subject_uid')
head(alb.cln)
  mod_id     date  time alb
1      8 07/30/20  5:23 2.9
2      8 07/28/20  3:12 2.0
3      8 07/29/20  1:39 2.7
4      8 08/21/20 10:35 4.1
5      4 06/13/15 17:20 4.1
6      6 07/25/16  8:35 2.3
saveRDS(alb.cln, file=file.path(dataDir,"alb_mod_id.rds"))
# set crosswalk option: 'pkxwalk' holds the name of the crosswalk object
xwalk <- readRDS(file.path(dataDir, "module_id_xwalk.rds"))
options(pkxwalk = 'xwalk')

# define parameters
drugname <- 'fent'
LLOQ <- 0.05

Pro-Demographic

# helper function
exclude_val <- function(x, val=1) { !is.na(x) & x == val }

demo.out <- run_Demo(demo.path = file.path(dataDir, "demo_mod_id.rds"),
    demo.columns = list(id = 'mod_id_visit'),
    toexclude = expression(exclude_val(in_hospital_mortality) | exclude_val(add_ecmo)),
    demo.mod.list = list(length_of_icu_stay = 
                        expression(daysDiff(surgery_date, date_icu_dc))))
The number of subjects in the demographic data, who meet the exclusion criteria: 2
head(demo.out$demo)
  mod_id mod_visit mod_id_visit gender weight height surgery_date ageatsurgery stat_sts cpb_sts in_hospital_mortality add_ecmo date_icu_dc
1      2         1          2.1      0   5.14  59.18    6/28/2014          141        3     133                     0        0    7/2/2014
2      3         1          3.1      1   5.67  62.90    1/10/2016          292        1      65                     0        0   1/12/2016
3      4         1          4.1      0  23.67 118.02    3/19/2016         2591        2     357                     0        0   3/20/2016
4      5         1          5.1      0  14.07  97.04    7/18/2016         1320        5      93                     0        0   7/19/2016
5      6         1          6.1      1  23.40 102.80    7/23/2016         1561        3      87                     1        0   7/30/2016
6      7         1          7.1      1   6.21  62.03     9/4/2016          208        1     203                     0        0   9/11/2016
  time_fromor length_of_icu_stay
1        1657                  4
2        1325                  2
3          NA                  1
4        1745                  1
5        1847                  7
6        1210                  7
demo.out$exclude
[1] "6.1"  "13.1"
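
The exclusion rule and the derived length_of_icu_stay variable can be reproduced on a few rows in base R. In this sketch, days_between is a hypothetical stand-in for daysDiff that assumes month/day/year dates; the three rows are copied from the demographics output above.

```r
# helper from the tutorial: TRUE only where x is non-missing and equals val
exclude_val <- function(x, val = 1) { !is.na(x) & x == val }

# hypothetical stand-in for daysDiff, assuming m/d/Y character dates
days_between <- function(start, end, fmt = "%m/%d/%Y") {
  as.numeric(as.Date(end, format = fmt) - as.Date(start, format = fmt))
}

demo <- data.frame(
  mod_id_visit = c("2.1", "3.1", "6.1"),
  surgery_date = c("6/28/2014", "1/10/2016", "7/23/2016"),
  date_icu_dc  = c("7/2/2014", "1/12/2016", "7/30/2016"),
  in_hospital_mortality = c(0, 0, 1),
  add_ecmo = c(0, 0, 0),
  stringsAsFactors = FALSE
)

demo$length_of_icu_stay <- days_between(demo$surgery_date, demo$date_icu_dc)
to_exclude <- demo$mod_id_visit[exclude_val(demo$in_hospital_mortality) |
                                exclude_val(demo$add_ecmo)]

demo$length_of_icu_stay
to_exclude
```

Because exclude_val treats NA as "do not exclude", subjects with missing mortality or ECMO flags are kept.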

Pro-Med-Str Part I: IV dose data

ivdose.out <- run_MedStrI(
    mar.path=file.path(dataDir,"mar_mod_id.rds"),
    mar.columns = list(id='mod_id', datetime=c('Date','Time'), dose='med:dosage', drug='med:mDrug', given='med:given'),
    medGivenReq = TRUE,
    flow.path=file.path(dataDir,"flow_mod_id.rds"),
    flow.columns = list(id = 'mod_id', datetime = 'date.time', finalunits = 'Final.Units', 
                        unit = 'unit', rate = 'rate', weight = 'Final.Wt..kg.'),
    medchk.path=file.path(system.file("examples", "str_ex2", package="EHR"), sprintf('medChecked-%s.csv', drugname)),
    demo.list = NULL,
    demo.columns = list(),
    missing.wgt.path = NULL,
    wgt.columns = list(),
    check.path = checkDir,
    failflow_fn = 'FailFlow',
    failunit_fn = 'Unit',
    failnowgt_fn = 'NoWgt',
    infusion.unit = 'mcg/kg/hr',
    bolus.unit = 'mcg',
    bol.rate.thresh = Inf,
    rateunit = 'mcg/hr',
    ratewgtunit = 'mcg/kg/hr',
    weightunit = 'kg',
    drugname = drugname)
The number of rows in the original data                124
The number of rows after removing the duplicates       124
no units other than mcg/kg/hr or mcg, file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failUnit-fent.csv not created
#########################
33 rows from 1 subjects with "kg" in infusion unit but missing weight, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failNoWgt-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixNoWgt-fent.csv
#########################
#########################
censor dates created, please see /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/CensorTime-fent.csv
#########################

head(ivdose.out)
  mod_id  date.dose infuse.time.real infuse.time infuse.dose          bolus.time bolus.dose given.dose maxint weight
1      1 2011-10-02             <NA>        <NA>          NA 2011-10-02 15:35:00         25         NA      0     NA
2      1 2011-10-02             <NA>        <NA>          NA 2011-10-02 17:26:00         25         NA      0     NA
3      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:15:00         50         NA      0     NA
4      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:30:00         20         NA      0     NA
5      1 2017-02-04             <NA>        <NA>          NA 2017-02-04 20:57:00         20         NA      0     NA
6      2 2014-06-28             <NA>        <NA>          NA 2014-06-28 08:15:00         20         NA      0     NA

Pro-Drug Level

conc.out <- run_DrugLevel(conc.path=file.path(dataDir,"conc_mod_id.rds"),
    conc.columns = list(id = 'mod_id', conc = 'conc.level', idvisit = 'mod_id_visit', samplinkid = 'mod_id_event'),
    conc.select=c('mod_id','mod_id_visit','samp','fentanyl_calc_conc'),
    conc.rename=c(fentanyl_calc_conc = 'conc.level', samp= 'event'),
    conc.mod.list=list(mod_id_event = expression(paste(mod_id_visit, event, sep = '_'))),
    samp.path=file.path(dataDir,"samp_mod_id.rds"),
    samp.columns = list(conclinkid = 'mod_id_event', datetime = 'Sample.Collection.Date.and.Time'),
    samp.mod.list=list(mod_id_event = expression(paste(mod_id_visit, samp, sep = '_'))),
    check.path=checkDir,
    failmiss_fn = 'MissingConcDate-',
    multsets_fn = 'multipleSetsConc-',
    faildup_fn = 'DuplicateConc-', 
    drugname=drugname,
    LLOQ=LLOQ,
    demo.list=demo.out,
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'))
#########################
3 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failMissingConcDate-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv
#########################
subjects with concentration missing from sample file
 mod_id mod_id_event
      8        8.1_1
      8        8.1_2
      8        8.1_3
1 subjects have multiple sets of concentration data
16 total unique subjects ids (including multiple visits) currently in the concentration data
15 total unique subjects in the concentration data
#########################
15 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/multipleSetsConc-fent2023-11-02.csv
#########################
15 total unique subjects ids (after excluding multiple visits) in the concentration data
15 total unique subjects in the concentration data
head(conc.out)
   mod_id mod_id_visit event  conc.level mod_id_event           date.time eid
1       1          1.2     1 0.014136220        1.2_1 2017-02-03 10:46:00   1
2       1          1.2     2 0.279820752        1.2_2 2017-02-04 20:30:00   1
55     10         10.1     2 3.136047304       10.1_2 2016-12-24 18:00:00   1
56     10         10.1     9 0.004720171       10.1_9 2017-01-01 04:20:00   1
57     10         10.1    10 0.017136367      10.1_10 2017-01-02 04:42:00   1
58     10         10.1    12 0.006335571      10.1_12 2017-01-04 03:40:00   1
( fail.miss.conc.date <- read.csv(file.path(checkDir,"failMissingConcDate-fent.csv")) )
  subject_id subject_uid mod_id_event datetime
1       1566    35885929        8.1_1       NA
2       1566    35885929        8.1_2       NA
3       1566    35885929        8.1_3       NA
fail.miss.conc.date[,"datetime"] <- c("9/30/2016 09:32","10/1/2016 19:20","10/2/2016 02:04")
fail.miss.conc.date
  subject_id subject_uid mod_id_event        datetime
1       1566    35885929        8.1_1 9/30/2016 09:32
2       1566    35885929        8.1_2 10/1/2016 19:20
3       1566    35885929        8.1_3 10/2/2016 02:04
write.csv(fail.miss.conc.date, file.path(checkDir,"fixMissingConcDate-fent.csv"))
conc.out <- run_DrugLevel(conc.path=file.path(dataDir,"conc_mod_id.rds"),
    conc.columns = list(id = 'mod_id', conc = 'conc.level', idvisit = 'mod_id_visit', samplinkid = 'mod_id_event'),
    conc.select=c('mod_id','mod_id_visit','samp','fentanyl_calc_conc'),
    conc.rename=c(fentanyl_calc_conc = 'conc.level', samp= 'event'),
    conc.mod.list=list(mod_id_event = expression(paste(mod_id_visit, event, sep = '_'))),
    samp.path=file.path(dataDir,"samp_mod_id.rds"),
    samp.columns = list(conclinkid = 'mod_id_event', datetime = 'Sample.Collection.Date.and.Time'),
    samp.mod.list=list(mod_id_event = expression(paste(mod_id_visit, samp, sep = '_'))),
    check.path=checkDir,
    failmiss_fn = 'MissingConcDate-',
    multsets_fn = 'multipleSetsConc-',
    faildup_fn = 'DuplicateConc-',
    drugname=drugname,
    LLOQ=LLOQ,
    demo.list=demo.out,
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'))
#########################
3 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/failMissingConcDate-fent.csv AND create /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv
#########################
file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fixMissingConcDate-fent.csv read with failures replaced
1 subjects have multiple sets of concentration data
16 total unique subjects ids (including multiple visits) currently in the concentration data
15 total unique subjects in the concentration data
#########################
15 rows need review, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/multipleSetsConc-fent2023-11-02.csv
#########################
15 total unique subjects ids (after excluding multiple visits) in the concentration data
15 total unique subjects in the concentration data
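
The linking key used above, mod_id_event, is the visit ID and the sample number pasted together with an underscore. This base R sketch shows how such a key joins concentrations to sampling times and how rows without a sampling time surface as NA. The data are toy values and the merge is an illustration, not the run_DrugLevel implementation.

```r
# toy concentration and sampling-time data
conc <- data.frame(
  mod_id_visit = c("1.2", "1.2", "8.1"),
  event = c(1, 2, 1),
  conc.level = c(0.014, 0.280, 0.500),
  stringsAsFactors = FALSE
)
samp <- data.frame(
  mod_id_visit = c("1.2", "1.2"),
  samp = c(1, 2),
  datetime = c("2/3/2017 10:46", "2/4/2017 20:30"),
  stringsAsFactors = FALSE
)

# build the same key in both datasets
conc$mod_id_event <- paste(conc$mod_id_visit, conc$event, sep = "_")
samp$mod_id_event <- paste(samp$mod_id_visit, samp$samp, sep = "_")

# left join: keep every concentration, attach its sampling time if one exists
linked <- merge(conc, samp[c("mod_id_event", "datetime")],
                by = "mod_id_event", all.x = TRUE)

# concentrations with no matching sampling time need review
needs_review <- linked$mod_id_event[is.na(linked$datetime)]
needs_review
```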

Pro-Laboratory

creat.out <- run_Labs(lab.path=file.path(dataDir,"creat_mod_id.rds"),
    lab.select = c('mod_id','date.time','creat'),
    lab.mod.list = list(date.time = expression(parse_dates(fixDates(paste(date, time))))))

alb.out <- run_Labs(lab.path=file.path(dataDir,"alb_mod_id.rds"),
    lab.select = c('mod_id','date.time','alb'),
    lab.mod.list = list(date.time = expression(parse_dates(fixDates(paste(date, time))))))

lab.out <- list(creat.out, alb.out)

str(lab.out)
List of 2
 $ :'data.frame':   266 obs. of  3 variables:
  ..$ mod_id   : int [1:266] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ date.time: POSIXct[1:266], format: "2017-02-05 04:00:00" "2017-02-06 05:00:00" "2011-10-03 04:28:00" "2011-10-04 04:15:00" ...
  ..$ creat    : num [1:266] 0.52 0.53 0.42 0.35 0.29 0.28 0.34 0.59 0.54 0.26 ...
 $ :'data.frame':   44 obs. of  3 variables:
  ..$ mod_id   : int [1:44] 8 8 8 8 4 6 6 9 10 10 ...
  ..$ date.time: POSIXct[1:44], format: "2020-07-30 05:23:00" "2020-07-28 03:12:00" "2020-07-29 01:39:00" "2020-08-21 10:35:00" ...
  ..$ alb      : num [1:44] 2.9 2 2.7 4.1 4.1 2.3 2.6 3 3.1 4.2 ...
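
The date.time variable in the lab data is built by pasting the date and time columns and parsing the result. A base R approximation is sketched below, using as.POSIXct in place of parse_dates and fixDates and assuming two-digit years and a UTC time zone; the two rows are taken from the creatinine data shown earlier.

```r
# two creatinine records in the raw layout
lab <- data.frame(
  date  = c("02/05/17", "02/06/17"),
  time  = c("4:00", "5:00"),
  creat = c(0.52, 0.53),
  stringsAsFactors = FALSE
)

# combine date and time, then parse with an explicit format
lab$date.time <- as.POSIXct(paste(lab$date, lab$time),
                            format = "%m/%d/%y %H:%M", tz = "UTC")

# keep only the parsed timestamp and the lab value
lab <- lab[c("date.time", "creat")]
format(lab$date.time, "%Y-%m-%d %H:%M:%S")
```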

Build-PK-IV

pk_dat <- run_Build_PK_IV(
    conc=conc.out,
    conc.columns = list(id = 'mod_id', datetime = 'date.time', druglevel = 'conc.level', 
                        idvisit = 'mod_id_visit'),
    dose=ivdose.out,
    dose.columns = list(id = 'mod_id', date = 'date.dose', infuseDatetime = 'infuse.time', 
                        infuseDose = 'infuse.dose', infuseTimeExact= 'infuse.time.real',
                        bolusDatetime = 'bolus.time', bolusDose = 'bolus.dose', 
                        gap = 'maxint', weight = 'weight'),
    demo.list = demo.out,
    demo.columns = list(id = 'mod_id', idvisit = 'mod_id_visit'),
    lab.list = lab.out,
    lab.columns = list(id = 'mod_id', datetime = 'date.time'),
    pk.vars=c('date'),
    drugname=drugname,
    check.path=checkDir,
    missdemo_fn='-missing-demo',
    faildupbol_fn='DuplicateBolus-',
    date.format="%m/%d/%y %H:%M:%S",
    date.tz="America/Chicago")
0 duplicated rows
The dimension of the PK data before merging with demographics: 234 x 9
The number of subjects in the PK data before merging with demographics: 15
The number of subjects in the demographic file, who meet the exclusion criteria: 2
check NA frequency in demographics, see file /var/folders/06/0qv1dr5508j_tbzqdjfqjf680000gn/T//RtmpqLJ9qE/checks/fent-missing-demo.csv
Some demographic variables are missing and will be excluded: 
The list of final demographic variables: mod_visit
gender
weight
height
surgery_date
ageatsurgery
stat_sts
cpb_sts
in_hospital_mortality
add_ecmo
date_icu_dc
time_fromor
length_of_icu_stay
weight_demo
Checked: there are no missing creat
List of IDs missing at least 1 alb: 1.2
11.1
15.1
2.1
3.1
4.1
5.1
7.1
8.1
The dimension of the final PK data exported with the key demographics: 197 x 24 with 13 distinct subjects (mod_id)
# convert id back to original IDs
pk_dat <- pullRealId(pk_dat, remove.mod.id=TRUE)

head(pk_dat)
     subject_id subject_uid time   amt        dv rate mdv evid              date gender weight height surgery_date ageatsurgery stat_sts
2         466.1    28579217 0.00  50.0        NA  0.0   1    1 02/04/17 16:15:00      0  21.99 116.90     2/4/2017         2451        1
2.1       466.1    28579217 0.25  20.0        NA  0.0   1    1 02/04/17 16:30:00      0  21.99 116.90     2/4/2017         2451        1
2.2       466.1    28579217 4.25    NA 0.2798208   NA   0    0 02/04/17 20:30:00      0  21.99 116.90     2/4/2017         2451        1
12       1607.0    38551767 0.00 109.2        NA 10.4   1    1 12/24/16 07:15:00      0   2.60  45.94   12/24/2016           23        3
12.1     1607.0    38551767 0.00  10.0        NA  0.0   1    1 12/24/16 07:15:00      0   2.60  45.94   12/24/2016           23        3
12.2     1607.0    38551767 1.25  15.0        NA  0.0   1    1 12/24/16 08:30:00      0   2.60  45.94   12/24/2016           23        3
     cpb_sts in_hospital_mortality add_ecmo date_icu_dc time_fromor length_of_icu_stay weight_demo creat alb
2        107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
2.1      107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
2.2      107                     0        0    2/5/2017        1322                  1       21.99  0.54  NA
12       110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
12.1     110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
12.2     110                     0        0    1/5/2017          NA                 12        2.76  0.66 1.6
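
In the final PK data, time is the elapsed time in hours since the subject's first record, and evid and mdv distinguish dose rows from concentration rows. The sketch below recomputes these columns for the first three rows shown above. It is a reading of the output, not the run_Build_PK_IV implementation, and it uses UTC rather than America/Chicago for portability.

```r
# first three records for subject_id 466.1 from the output above
events <- data.frame(
  date = c("02/04/17 16:15:00", "02/04/17 16:30:00", "02/04/17 20:30:00"),
  amt  = c(50, 20, NA),
  dv   = c(NA, NA, 0.2798208),
  stringsAsFactors = FALSE
)

dt <- as.POSIXct(events$date, format = "%m/%d/%y %H:%M:%S", tz = "UTC")

# hours elapsed since the first record
events$time <- as.numeric(difftime(dt, dt[1], units = "hours"))

# evid = 1 for dosing rows, 0 for observations; mdv = 1 where dv is missing
events$evid <- as.integer(!is.na(events$amt))
events$mdv  <- as.integer(is.na(events$dv))

events
```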
