Task 3: Respirator Allocation on MIMIC-IV

Published

February 16, 2024

Inspecting the Respiration Data from MIMIC-IV dataset

We begin by loading and inspecting the data from MIMIC-IV:

src <- "miiv"
data <- as.data.frame(load_resp_data(src))
knitr::kable(head(data), caption = "MIMIC-IV Respiration data.")
MIMIC-IV Respiration data.
stay_id o2prior sofa resp po2 sex age o2post respirator
30000646 96.16667 2 17.66667 71 1 43 96.50000 0
30001336 97.00000 1 26.66667 71 1 77 96.00000 0
30001396 87.66667 2 25.66667 76 1 40 80.66667 0
30001446 99.33333 11 19.66667 98 1 56 99.00000 0
30001471 94.00000 1 15.66667 98 1 86 92.16667 0
30001555 94.33333 9 16.00000 98 0 33 96.66667 0

We consider the cohort of all patients in the database admitted to the ICU. Patients who are mechanically ventilated immediately upon entering the ICU are subsequently removed. By focusing on the time window of the first 48 hours from admission to ICU, for each patient we determine the earliest time of mechanical ventilation, labeled \(t_{MV}\). Since mechanical ventilation is used to stabilize the respiratory profile of patients, for each patient we determine the average oxygen saturation in the three-hour period \([t_{MV}-3, t_{MV}]\) prior to mechanical ventilation, labeled O\(_2\)-pre. We also determine the oxygen saturation in the three-hour period following ventilation \([t_{MV}, t_{MV}+3]\), labeled O\(_2\)-post. For controls (patient not ventilated at any point in the first 48 hours), we take the reference point as 12 hours after ICU admission, and calculate the values in three hours before and after this time. Patients’ respiratory stability, which represents the outcome of interest \(Y\), is measured as follows:

\[\begin{align} Y := \begin{cases} 0 \text{ if O}_2\text{-post} \geq 97, \\ -(\text{O}_2\text{-post}-97)^2 \text{ otherwise}. \end{cases} \end{align}\]

Values of oxygen saturation above 97 are considered stable, and the larger the distance from this stability value, the higher the risk for the patient. We also collect other important patient characteristics before intervention that are the key predictors of outcome, including the SOFA score, respiratory rate, and partial oxygen pressure (PaO\(_2\)).

Constructing the SFM

We next construct the Standard Fairness Model, with also a decision \(D\):

# constructing the SFM
X <- "sex"
Z <- "age"
W <- c("sofa", "po2", "o2prior")
D <- "respirator"
Y <- "o2post"

Obtaining Fair Decisions

Fair Decisions can be obtained by passing the data, SFM, and any transform functions of the potential outcome \(Y_{d}\) (po_transform argument) to the fair_decisions() function:

# fit the outcome control model
resp_oc <- fair_decisions(
  data, X = X, Z = Z, W = W, Y = Y, D = D, x0 = 0, x1 = 1,
  po_transform = function(x) ifelse(x < 97, -(x-97)^2, 0), po_diff_sign = 1
)

Note that in faircause 0.2.0 the columns of the input data to functions fair_decisions() and fair_predictions() need to be either numeric or integer. Use one-hot encoding wherever appropriate.

When constructing fair decisions, xgboost is first used to estimate the conditional average treatment effect (CATE) of \(D\) on \(Y\), labeled \(\Delta\), also referred to as benefit:

\[\begin{align} \Delta = E[Y_{d_1} - Y_{d_0} \mid x, z, w]. \end{align}\]

In the above, resp_oc is S3-class object of type fair_decision. We can use the autoplot() generic to analyze various important aspects of the decision making process:

# inspect the decomposition of D
autoplot(resp_oc, type = "decision")

# inspect the decomposition of Delta
autoplot(resp_oc, type = "delta")

# inspect benefit fairness
autoplot(resp_oc, type = "benefit_fairness")

Further generics can be applied to fair_decision objects, such as predict(), which allows us to get estimates of the benefit \(\Delta\) on some test data, and also find the optimal policy that satisfies benefit fairness:

# make predictions on test
test_data <- data
fair_dec <- predict(resp_oc, newdata = test_data, budget = 0.2)

# inspect the estimate benefit \Delta
head(fair_dec$delta, n = 100L)
  [1] 8.017566e-01 3.724638e-01 5.191853e+01 0.000000e+00 6.212983e+00
  [6] 4.203570e+00 1.008966e+01 1.269065e-01 0.000000e+00 3.286853e-01
 [11] 1.251144e+00 5.440242e+00 2.653410e+00 0.000000e+00 3.287728e-01
 [16] 6.071516e-01 4.757139e-01 0.000000e+00 1.529186e+00 2.971644e-01
 [21] 8.352732e-01 3.049420e-01 0.000000e+00 1.316464e-01 9.029436e-01
 [26] 2.170649e-02 2.222031e+00 6.038357e+00 1.162410e+00 9.170604e+00
 [31] 1.005439e+01 1.413266e+01 3.665277e-01 0.000000e+00 0.000000e+00
 [36] 3.129165e+00 4.450426e+00 1.166478e-01 0.000000e+00 2.907858e-01
 [41] 7.480980e-04 0.000000e+00 1.167838e+00 1.495915e+00 0.000000e+00
 [46] 0.000000e+00 0.000000e+00 8.373803e-01 1.013286e+01 0.000000e+00
 [51] 6.888062e-06 9.781933e-01 5.623761e+00 4.914779e+00 1.095073e+01
 [56] 2.950802e-05 1.074799e+01 2.525700e-01 4.529369e+00 2.937705e+01
 [61] 5.786662e-04 0.000000e+00 0.000000e+00 4.605885e+00 4.780608e+00
 [66] 6.791864e+00 1.930003e+01 0.000000e+00 5.194264e+00 2.610412e+00
 [71] 0.000000e+00 0.000000e+00 4.935268e-03 1.658654e+00 8.200450e-01
 [76] 9.657683e-02 6.761817e+00 1.841890e+00 2.859673e+00 0.000000e+00
 [81] 9.752755e-02 0.000000e+00 4.548055e-01 1.569106e+00 1.290448e+00
 [86] 2.937638e-01 1.533631e+01 5.638602e+00 1.379101e+01 0.000000e+00
 [91] 0.000000e+00 0.000000e+00 6.255346e+00 9.792346e-02 0.000000e+00
 [96] 1.116764e+00 9.761857e+00 2.054724e+00 3.707430e-01 4.080177e+00
# inspect the allocated decisions
head(fair_dec$decision, n = 100L)
  [1] 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0
 [38] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0
 [75] 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0