Build-PK-Oral

This tutorial describes the PK data building module in the system for medications that are typically orally administrated. It demonstrates how to quickly build PK data using Build-PK-Oral when drug dose data that can be provided by users or generated from unstructured clinical notes using extracted dosing information with the Extract-Med module and processed with the Pro-Med-NLP module in the system.

Michael L. Williams

Introduction

We describe oral dosing data commonly obtained from electronic health records (EHRs) in the presence and absence of last dose time information. Finally, we explain how to run the Build-PK-Oral module.

EHR Sourced Oral Dosing Data

Drug dose information can be obtained from structured e-prescription databases, or extracted from clinical notes using the module Extract-Med (see Extract-Med to understand that process). The obtained dose information can be processing using Part II of Pro-Med-Str for the one from e-prescription databases and Pro-Med-NLP for the one extracted from clinical notes. Other data can be also processed using relevant modules (e.g., Pro-Drug Level, Pro-Laboratory). Our interest is in the intermediate dataset generated by these modules but not yet in a suitable format for PK analysis (see PK-Data-Oral-Dosing for an appropriate PK data form). Here is an example of such data.

library(EHR)
# Data generating function for examples
mkdat <- function() {
  npat=3
  visits <- floor(runif(npat, min=2, max=6))
  id <- rep(1:npat, visits)
  dt <- as.POSIXct(paste(as.Date(sort(sample(700, sum(visits))), 
                                 origin = '2019-01-01'), '10:00:00'), tz = 'UTC') 
  + rnorm(sum(visits), 0, 1*60*60)
  dose_morn <- sample(c(2.5,5,7.5,10), sum(visits), replace = TRUE)
  conc <- round(rnorm(sum(visits), 1.5*dose_morn, 1),1)
  ld <- dt - sample(10:16, sum(visits), replace = TRUE) * 3600
  ld[rnorm(sum(visits)) < .3] <- NA
  age <- rep(sample(40:75, npat), visits)
  weight <- rep(round(rnorm(npat, 180, 20)),visits)
  hgb <- round(rep(rnorm(npat, 10, 2), visits),1)
  data.frame(id, dt, dose_morn, conc, age, weight, hgb, ld)
}

# Make example data
set.seed(20)
dat <- mkdat()
ex <- dat
dat2 <- dat[,-8]
ex
   id                  dt dose_morn conc age weight  hgb                  ld
1   1 2019-02-27 10:00:00      10.0 16.1  46    154 12.0 2019-02-26 22:00:00
2   1 2019-05-08 10:00:00       5.0  6.4  46    154 12.0 2019-05-07 18:00:00
3   1 2019-06-30 10:00:00       5.0  5.6  46    154 12.0                <NA>
4   1 2019-11-20 10:00:00       5.0  6.6  46    154 12.0 2019-11-19 18:00:00
5   1 2020-04-01 10:00:00       2.5  5.0  46    154 12.0                <NA>
6   2 2020-04-10 10:00:00      10.0 15.1  72    193  8.6                <NA>
7   2 2020-05-13 10:00:00       7.5 11.7  72    193  8.6 2020-05-12 21:00:00
8   2 2020-05-26 10:00:00       2.5  2.9  72    193  8.6                <NA>
9   2 2020-06-10 10:00:00       2.5  2.2  72    193  8.6                <NA>
10  2 2020-06-25 10:00:00       2.5  4.3  72    193  8.6                <NA>
11  3 2020-07-07 10:00:00       2.5  3.4  56    167 12.2                <NA>
12  3 2020-09-25 10:00:00       7.5 10.2  56    167 12.2 2020-09-24 22:00:00
13  3 2020-10-04 10:00:00       5.0  7.5  56    167 12.2                <NA>

The ID, time, and covariate data in this dataframe is self explanatory. The difficult data to deal with can be found in dose_morn, conc, and ld. conc is a measured blood concentration which is recorded at the time indicated in time. dose_morn is the dose which was taken in the morning of that measured concentration, as extracted by Extract-Med. ld is the last-dose time, the extracted time of the dose which precedes the measured concentration, if present in the EHR. This information can be used to construct a course of drug dosing but some assumptions need to be made. Consider the two following assumptions:

  1. Dosing only occurs on the morning of the recorded dose.
  2. Dosing occurs twice daily at 8am and 8pm.

Obviously, the correct assumption will depend on the drug of interest – some drugs are taken on a regular schedule, while others are as needed. Drugs undergoing routine therapeutic drug monitoring tend to be the latter so assumption 2 will likely be the better assumption.

If a twice-daily dosing assumption is appropriate, then we can begin to think about how to best choose the timing of each dose. Can we exploit some information in our dataset to determine dose timings which are more realistic than, say every day 8am and 8pm. In doing so we need to remember that the timing of the dose before the measured concentration is most important for estimating the PK profile. Some other things to consider:

  1. The timing of the concentration measurement itself may impact dosing.
  2. There may or may not be an extracted last-dose time to work with.

For this reason, we will break up our algorithm into with and without last-dose times and discuss how we arrive at a final dose-building algorithm.

Without Last-Dose Times

When we do not have an extracted last-dose time, we must work off of only our measured concentration timing and assumptions about how routine therapeutic drug monitoring is typically performed. The majority of these labs are conducted in the morning, and patients will typically hold off on taking their medications until after the blood draw. For that reason, we assume that a dose is taken 30 minutes after a measured blood concentration, then proceeds with a dose every 12 hours. The final dose in the sequence will occur 6-18 hours before the next measured concentration, the timing of which will determine the next sequence of doses, and so on until the final measured concentration. The first measured concentration requires special attention. Typical therapeutic drug monitoring procedure suggests it is reasonable to assume that the drug has been taken regularly before this first concentration. The safest assumptiom, assuming no further information is available, is that the drug has been taken every 12 hours for long enough to reach a steady state of trough concentrations. In our PK dose building algorithm, we use a default of 336 hours (i.e., 14 days), ending 12 hours before the first measured concentration.

With Last-Dose times

In the case where extracted last-dose times are available, we can begin by building the same dataset as without last-dose times. We then add a row for a dose corresponding to the extracted last dose time and eliminate an appropriate number of doses from the preceding dose sequence to avoid incorrect double-dosing; this is done by removing doses until the last dose in the sequence is 6-18 hours before the extracted last-dose time. This allows for last-dose times which are more than 12 hours before the extracted concentration to be appropriately accounted for.

Build-PK-Oral

We now describe how to build the PK data without last-dose time and with last-dose time, both of which can be built using run_Build_PK_Oral() by defining the argument ldCol differently.

To begin we load the EHR package, the pkdata package, and the lubridate package.

# load EHR package and dependencies
library(EHR)
library(pkdata)
library(lubridate)

Let’s have another look at the example data we want to process:

ex
   id                  dt dose_morn conc age weight  hgb                  ld
1   1 2019-02-27 10:00:00      10.0 16.1  46    154 12.0 2019-02-26 22:00:00
2   1 2019-05-08 10:00:00       5.0  6.4  46    154 12.0 2019-05-07 18:00:00
3   1 2019-06-30 10:00:00       5.0  5.6  46    154 12.0                <NA>
4   1 2019-11-20 10:00:00       5.0  6.6  46    154 12.0 2019-11-19 18:00:00
5   1 2020-04-01 10:00:00       2.5  5.0  46    154 12.0                <NA>
6   2 2020-04-10 10:00:00      10.0 15.1  72    193  8.6                <NA>
7   2 2020-05-13 10:00:00       7.5 11.7  72    193  8.6 2020-05-12 21:00:00
8   2 2020-05-26 10:00:00       2.5  2.9  72    193  8.6                <NA>
9   2 2020-06-10 10:00:00       2.5  2.2  72    193  8.6                <NA>
10  2 2020-06-25 10:00:00       2.5  4.3  72    193  8.6                <NA>
11  3 2020-07-07 10:00:00       2.5  3.4  56    167 12.2                <NA>
12  3 2020-09-25 10:00:00       7.5 10.2  56    167 12.2 2020-09-24 22:00:00
13  3 2020-10-04 10:00:00       5.0  7.5  56    167 12.2                <NA>

There are 3 individuals in the dataset. Each has a set of EHR-extracted dose and blood concentrations data along with demographic data and information commonly found with laboratory data:

All concentrations are being taken in the morning. Given that this is a drug which should be taken orally every 12 hours, we can construct a reasonable dosing schedule which details the amount and timing of each dose.

run_Build_PK_Oral() will build an appropriate dataset for population PK analysis for drugs orally administered, given specification of appropriate columns:

(1) Build the PK data without last-dose time

Suppose we do not have the last-dose time information. For illustrative purpose, we remove this information (i.e., column 8 is omitted in this example data).

# Build PK data without last-dose times
run_Build_PK_Oral(x = dat[,-8],
                  idCol = "id",
                  dtCol = "dt",
                  doseCol = "dose_morn",
                  concCol = "conc",
                  ldCol = NULL,
                  first_interval_hours = 336,
                  imputeClosest = NULL)
   id   time  amt   dv mdv evid addl II                date age weight  hgb
1   1    0.0 10.0   NA   1    1   27 12 2019-02-13 10:00:00  46    154 12.0
2   1  336.0   NA 16.1   0    0   NA NA 2019-02-27 10:00:00  46    154 12.0
3   1  336.5 10.0   NA   1    1  139 12 2019-02-27 10:30:00  46    154 12.0
4   1 2015.0   NA  6.4   0    0   NA NA 2019-05-08 10:00:00  46    154 12.0
5   1 2015.5  5.0   NA   1    1  105 12 2019-05-08 10:30:00  46    154 12.0
6   1 3287.0   NA  5.6   0    0   NA NA 2019-06-30 10:00:00  46    154 12.0
7   1 3287.5  5.0   NA   1    1  285 12 2019-06-30 10:30:00  46    154 12.0
8   1 6720.0   NA  6.6   0    0   NA NA 2019-11-20 10:00:00  46    154 12.0
9   1 6720.5  5.0   NA   1    1  265 12 2019-11-20 10:30:00  46    154 12.0
10  1 9911.0   NA  5.0   0    0   NA NA 2020-04-01 10:00:00  46    154 12.0
11  2    0.0 10.0   NA   1    1   27 12 2020-03-27 10:00:00  72    193  8.6
12  2  336.0   NA 15.1   0    0   NA NA 2020-04-10 10:00:00  72    193  8.6
13  2  336.5 10.0   NA   1    1   65 12 2020-04-10 10:30:00  72    193  8.6
14  2 1128.0   NA 11.7   0    0   NA NA 2020-05-13 10:00:00  72    193  8.6
15  2 1128.5  7.5   NA   1    1   25 12 2020-05-13 10:30:00  72    193  8.6
16  2 1440.0   NA  2.9   0    0   NA NA 2020-05-26 10:00:00  72    193  8.6
17  2 1440.5  2.5   NA   1    1   29 12 2020-05-26 10:30:00  72    193  8.6
18  2 1800.0   NA  2.2   0    0   NA NA 2020-06-10 10:00:00  72    193  8.6
19  2 1800.5  2.5   NA   1    1   29 12 2020-06-10 10:30:00  72    193  8.6
20  2 2160.0   NA  4.3   0    0   NA NA 2020-06-25 10:00:00  72    193  8.6
21  3    0.0  2.5   NA   1    1   27 12 2020-06-23 10:00:00  56    167 12.2
22  3  336.0   NA  3.4   0    0   NA NA 2020-07-07 10:00:00  56    167 12.2
23  3  336.5  2.5   NA   1    1  159 12 2020-07-07 10:30:00  56    167 12.2
24  3 2256.0   NA 10.2   0    0   NA NA 2020-09-25 10:00:00  56    167 12.2
25  3 2256.5  7.5   NA   1    1   17 12 2020-09-25 10:30:00  56    167 12.2
26  3 2472.0   NA  7.5   0    0   NA NA 2020-10-04 10:00:00  56    167 12.2

Note that addl and II dictate an every-twelve-hour dosing schedule which leads up to the proceeding concentration. Covariates are preserved and a time variable which represents hours since first dose is generated. This data is now in an appropriate format for PK analysis but makes no use of the last-dose times.

(2) Build the PK data with last-dose time

Suppose we do now have the last-dose time information although they are extracted along with some (but not all) concentrations. When last-dose times are avaiable, they can be specified in the argument ldCol in the input data (e.g., ldCol = "ld". Then, the sequence of doses leading up to the extracted dose is reduced and a new row is inserted which accurately describes the timing of the dose which precedes the relevant concentration.

# Build PK data with last-dose times
run_Build_PK_Oral(x = dat,
                  idCol = "id",
                  dtCol = "dt",
                  doseCol = "dose_morn",
                  concCol = "conc",
                  ldCol = "ld",
                  first_interval_hours = 336,
                  imputeClosest = NULL)
   id   time  amt   dv mdv evid addl II                date age weight  hgb
1   1    0.0 10.0   NA   1    1   26 12 2019-02-13 10:00:00  46    154 12.0
2   1  324.0 10.0   NA   1    1    0 NA 2019-02-26 22:00:00  46    154 12.0
3   1  336.0   NA 16.1   0    0   NA NA 2019-02-27 10:00:00  46    154 12.0
4   1  336.5 10.0   NA   1    1  138 12 2019-02-27 10:30:00  46    154 12.0
5   1 1999.0 10.0   NA   1    1    0 NA 2019-05-07 18:00:00  46    154 12.0
6   1 2015.0   NA  6.4   0    0   NA NA 2019-05-08 10:00:00  46    154 12.0
7   1 2015.5  5.0   NA   1    1  105 12 2019-05-08 10:30:00  46    154 12.0
8   1 3287.0   NA  5.6   0    0   NA NA 2019-06-30 10:00:00  46    154 12.0
9   1 3287.5  5.0   NA   1    1  284 12 2019-06-30 10:30:00  46    154 12.0
10  1 6704.0  5.0   NA   1    1    0 NA 2019-11-19 18:00:00  46    154 12.0
11  1 6720.0   NA  6.6   0    0   NA NA 2019-11-20 10:00:00  46    154 12.0
12  1 6720.5  5.0   NA   1    1  265 12 2019-11-20 10:30:00  46    154 12.0
13  1 9911.0   NA  5.0   0    0   NA NA 2020-04-01 10:00:00  46    154 12.0
14  2    0.0 10.0   NA   1    1   27 12 2020-03-27 10:00:00  72    193  8.6
15  2  336.0   NA 15.1   0    0   NA NA 2020-04-10 10:00:00  72    193  8.6
16  2  336.5 10.0   NA   1    1   64 12 2020-04-10 10:30:00  72    193  8.6
17  2 1115.0 10.0   NA   1    1    0 NA 2020-05-12 21:00:00  72    193  8.6
18  2 1128.0   NA 11.7   0    0   NA NA 2020-05-13 10:00:00  72    193  8.6
19  2 1128.5  7.5   NA   1    1   25 12 2020-05-13 10:30:00  72    193  8.6
20  2 1440.0   NA  2.9   0    0   NA NA 2020-05-26 10:00:00  72    193  8.6
21  2 1440.5  2.5   NA   1    1   29 12 2020-05-26 10:30:00  72    193  8.6
22  2 1800.0   NA  2.2   0    0   NA NA 2020-06-10 10:00:00  72    193  8.6
23  2 1800.5  2.5   NA   1    1   29 12 2020-06-10 10:30:00  72    193  8.6
24  2 2160.0   NA  4.3   0    0   NA NA 2020-06-25 10:00:00  72    193  8.6
25  3    0.0  2.5   NA   1    1   27 12 2020-06-23 10:00:00  56    167 12.2
26  3  336.0   NA  3.4   0    0   NA NA 2020-07-07 10:00:00  56    167 12.2
27  3  336.5  2.5   NA   1    1  158 12 2020-07-07 10:30:00  56    167 12.2
28  3 2244.0  2.5   NA   1    1    0 NA 2020-09-24 22:00:00  56    167 12.2
29  3 2256.0   NA 10.2   0    0   NA NA 2020-09-25 10:00:00  56    167 12.2
30  3 2256.5  7.5   NA   1    1   17 12 2020-09-25 10:30:00  56    167 12.2
31  3 2472.0   NA  7.5   0    0   NA NA 2020-10-04 10:00:00  56    167 12.2

Individual 1 has no extracted last-dose times so their data is unchanged from before. Compare, however, rows 7-9 to rows 7-8 of the previous dataset constructed without last-dose times. The measured concentration of 14.1 on date 2019-11-01 is associated with a last-dose time. addl drops from 69 to 68 and the extracted last-dose is added in row 8 with additional date 2019-10-31 20:58:36 which is the last-dose time extracted from clinical notes. Notice that the number of doses leading up to the concentration is unchanged and the timing of the final dose has been adjusted to reflect information in the EHR (i.e., the calculated time of 1162.70 for time). This dataset still relies on assumptions about dosing, but should reflect the actual dosing schedule better by incorporating last-dose times from the EHR.

Note that this dataset includes standard NONMEM formatted variables. For details of data items, see PK-Data-Oral-Dosing.

References

  1. Choi L, Beck C, McNeer E, Weeks HL, Williams ML, James NT, Niu X, Abou-Khalil BW, Birdwell KA, Roden DM, Stein CM. Development of a System for Post-marketing Population Pharmacokinetic and Pharmacodynamic Studies using Real-World Data from Electronic Health Records. Clinical Pharmacology & Therapeutics. 2020 Apr;107(4):934-43. doi: 10.1002/cpt.1787.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.