# Pro-Med-Str - Part I

This tutorial describes how to process structured medication data, especially focusing on intravenously given dose data.

# Introduction

This tutorial describes how to use the Pro-Med-Str - Part I module in EHR2PKPD system to process intravenous (IV) dose data (see Choi et al.$$^{1}$$ for details).

To begin we load the EHR package,

# load EHR package
library(EHR)


# Input Raw Dose Data

We will use example dose data to demonstrate the Pro-Med-Str Part I module. The raw data is shown below.

mar.in0 <- read.csv(system.file("examples", "str_ex2","MAR_DATA.csv", package="EHR"), check.names = FALSE)

   Uniq.Id       Date  Time                 med:mDrug   med:dosage med:route med:freq med:given
1 28579217 2017-02-04 19:15               Nicardipine 3 mcg/kg/min        IV     <NA>     Given
2 28579217 2011-10-02 22:11                Famotidine       4.5 mg        IV   q12hrs     Given
3 28579217 2011-10-02 20:17          Morphine sulfate         1 mg        IV  q2h prn     Given
4 28579217 2011-10-03 02:28 Diphenhydramine injection        12 mg        IV      now     Given
5 28579217 2011-10-02 22:11                 Cefazolin       225 mg        IV    q8hrs     Given
6 28579217 2011-10-02 23:30          Morphine sulfate         1 mg        IV  q2h prn     Given

This data consists of a patient ID, date, time, medication name, dosage, route, frequency, and medication given.

The patient ID may need to be renamed so that all input datasets have the same name for the patient ID. This is necessary when combining the datasets to create a crosswalk between the original ID variables and the new ID variables used in the Pro-Med-Str Part I module. We demonstrate how to rename the patient ID variable below.

mar.new <- dataTransformation(mar.in0, rename = c('Uniq.Id' = 'subject_uid'))


# Preparing Raw Dose Data

In practice, the dose data will need to be combined with other input datasets. This process involves creating a crosswalk between original ID variables and new ID variables. The new ID variables that are required to be the same across all datasets are mod_id, mod_visit, and mod_id_visit. See “2. EHR Vignette for Structured Data” of EHR package and “Build-PK-IV - Comprehensive Workshop” for examples of this process.

For simplicity, we will skip this step in this tutorial.

We save our dataset as an RDS file using saveRDS as shown below (CSV or RData is also acceptable data form). Here, we create a temporary directory to store the files using tempdir. We define two directories, dataDir for processed data and checkDir containing files used for interactive checking (optional). Instead of using a temporary directory, dataDir and checkDir can be specific directories on your computer.

# define 2 directories
td <- tempdir()
dataDir <- file.path(td, 'data') # directory for processed data
checkDir <- file.path(td, 'checks') # directory for interactive checking
dir.create(checkDir)


The next section demonstrates how to use the run_MedStrI function to run the Pro-Med-Str module.

# Running run_MedStrI()

• IV dose data can be in various forms, which may need to be pre-processed. IV dose data can be obtained from different data sources when using electronic health records (EHRs), but we illustrate this module using an example dataset (called “MAR”) that comes from one data source for simplicity (see “Build-PK-IV - Comprehensive Workshop” for a case when dose data are obtained from two data sources).

• run_MedStrI() is the main function to process IV dose data in Pro-Med-Str Part I module.

• The module can be semi-interactive for data checking (although it is not required, we recommend using this feature). If check.path is provided (the default is NULL), it generates several files to check potential data errors and get feedback from an investigator; otherwise, the interactive checking will not be performed. If corrected information (‘fix’ files) are provided, the module should be re-run to incorporate the corrections.

• run_MedStrI() can take two types of raw IV dose data (e.g., flow data and MAR data). In this tutorial, we describe arguments only relevant to processing one of these datasets, MAR data. A detailed description of all arguments can be found in the EHR package manual of run_MedStrI().

• mar.path: file name of MAR data (CSV, RData, RDS), or data.frame
• mar.columns: a named list that should specify columns in MAR data; ‘id’, ‘datetime’ and ‘dose’ are required. ‘drug’, ‘weight’, ‘given’ may also be specified. ‘datetime’ is date and time for data measurement, which can refer to a single date-time variable (datetime = ‘date_time’) or two variables holding date and time separately (e.g., datetime = c(‘Date’, ‘Time’)). ‘dose’ can also be given as a single variable or two variables. If given as a single column, the column’s values should contain dose and units such as ‘25 mcg’. If given as two column names, the dose column should come before the unit column (e.g., dose = c(‘doseamt’, ‘unit’)). If ‘drug’ is present, the ‘medchk.path’ argument should also be provided, which contain a csv file with a list of drug names that should be used to subset the relevant drugs. The ‘given’ variable should be used in conjunction with the ‘medGivenReq’ argument.
• medGivenReq: indicator if values in the MAR given column should equal “Given”; if this is FALSE (the default), NA values are also acceptable. This is a variable that flags whether the medication for inpatients is given as sometimes other dose instead of actual dose given can be specified for communications (e.g., a scheduled dose change). Depending on users’ EHR system and data types, we recommend confirming actual dose given and this argument can be useful for that purpose.
• medchk.path: (optional) file name containing data set (CSV, RData, RDS), or data.frame; should have the column ‘medname’ with list of acceptable drug names used to filter MAR data.
• demo.list: (optional) demographic information; if available, the output from ‘run_Demo’ or a correctly formatted data.frame; if provided, ‘weight’ is required in demo.columns as it is used to impute weight when missing.
• demo.columns: a named list that should specify columns in demographic data; ‘id’, ‘datetime’, and ‘weight’ are required.
• check.path: (optional) file path where the generated files for data checking are stored, and the corresponding data files with fixed data exist. The default (NULL) will not produce any check files.
• All the following arguments are optional (i.e., default values can be used), but if the required data items for the dose data such as ‘unit’ are not provided, this function does not work.

• failunit_fn: filename stub for records with units other than those specified with infusion.unit and bolus.unit (default: ‘Unit’)
• infusion.unit: string specifying units for infusion doses (default: ‘mcg/kg/hr’)
• bolus.unit: string specifying units for bolus doses (default: ‘mcg’)
• bol.rate.thresh: upper bound for retaining bolus doses. Bolus units with a rate above the threshold are dropped (default: Inf; i.e., keep all bolus doses)
• rateunit: string specifying units for hourly rate (default: ‘mcg/hr’)
• ratewgtunit: string specifying units for hourly rate by weight (default: ‘mcg/kg/hr’)
• weightunit: string specifying units for weight (default: ‘kg’)
• drugname: drug name of interest (e.g., dex, fent)

Below we show how we would run run_MedStrI() using the cleaned IV dose dataset, “mar_new.rds”, as input.

# define parameters
drugname <- 'fent'

ivdose.out <- run_MedStrI(
mar.columns = list(id='subject_uid', datetime=c('Date','Time'), dose='med:dosage', drug='med:mDrug', given='med:given'),
medGivenReq = TRUE,
medchk.path = file.path(system.file("examples", "str_ex2", package="EHR"), sprintf('medChecked-%s.csv', drugname)),
demo.list = NULL,
demo.columns = list(),
check.path=checkDir,
failunit_fn = 'Unit',
infusion.unit = 'mcg/kg/hr',
bolus.unit = 'mcg',
bol.rate.thresh = Inf,
rateunit = "mcg/hr",
ratewgtunit = "mcg/kg/hr",
weightunit = "kg",
drugname = drugname
)

no units other than mcg/kg/hr or mcg, file /tmp/Rtmp3dfjBP/checks/failUnit-fent.csv not created
#########################
75 rows from 2 subjects with "kg" in infusion unit but missing weight, see file /tmp/Rtmp3dfjBP/checks/failNoWgt-fent.csv AND create /tmp/Rtmp3dfjBP/checks/fixNoWgt-fent.csv
#########################
#########################
censor dates created, please see /tmp/Rtmp3dfjBP/checks/CensorTime-fent.csv
#########################

After run_MedStrI() is executed, it provides messages such as those shown above, which contain some information about data checking.

# Output of run_MedStrI()

head(ivdose.out)

  subject_uid  date.dose infuse.time.real infuse.time infuse.dose          bolus.time bolus.dose given.dose maxint weight
1    28579217 2011-10-02             <NA>        <NA>          NA 2011-10-02 15:35:00         25         NA      0     NA
2    28579217 2011-10-02             <NA>        <NA>          NA 2011-10-02 17:26:00         25         NA      0     NA
3    28579217 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:15:00         50         NA      0     NA
4    28579217 2017-02-04             <NA>        <NA>          NA 2017-02-04 16:30:00         20         NA      0     NA
5    28579217 2017-02-04             <NA>        <NA>          NA 2017-02-04 20:57:00         20         NA      0     NA
6    34161967 2016-07-18             <NA>        <NA>          NA 2016-07-18 14:56:00        100         NA      0     NA
• The output from the Pro-Med-Str - Part I module is a data.frame with the following columns:
• subject_uid: the new id that will be used to merge datasets.
• date.dose: dose given date.
• infuse.time.real: infusion dose time recorded in the raw data.
• infuse.time: infusion dose time processed (rounded time) based on continuous infusion time.
• infuse.dose: infusion dose amount (e.g., calculated by rate*weight if the unit includes weight).
• bolus.time: bolus dose time.
• bolus.dose: bolus dose.
• given.dose: the raw dose given is preserved from the flow sheet data when available (identified as “finalunits” in flow.columns).
• maxint: infusion recording interval (e.g., default: maxint = 15 min for MAR data; maxint = 60 min for flow data). In a typical setting this variable should be unused but it may be specified as the ‘gap’ variable in the dose.columns argument of the run_Build_PK_IV function. The user may also modify this variable to set an appropriate recording interval. For more details review the run_Build_PK_IV section of 2. EHR Vignette for Structured Data of the EHR package.
• weight: subject body weight recorded near infusion time, which will be used in dose calculation.
• Note:

This output is the input of run_Build_PK_IV() function in Build-PK-IV module. In the above output, all variables except given.dose are the necessary variables that should be reserved to use run_Build_PK_IV().

# References

1. Choi L, Beck C, McNeer E, Weeks HL, Williams ML, James NT, Niu X, Abou-Khalil BW, Birdwell KA, Roden DM, Stein CM. Development of a System for Post-marketing Population Pharmacokinetic and Pharmacodynamic Studies using Real-World Data from Electronic Health Records. Clinical Pharmacology & Therapeutics. 2020 Apr;107(4):934-43. doi: 10.1002/cpt.1787.

### Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.