Task 3: Respirator Allocation on MIMIC-IV

Published

February 16, 2024

Inspecting the Respiration Data from MIMIC-IV dataset

We begin by loading and inspecting the data from MIMIC-IV:

src <- "miiv"
data <- as.data.frame(load_resp_data(src))
knitr::kable(head(data), caption = "MIMIC-IV Respiration data.")

MIMIC-IV Respiration data.
stay_id	o2prior	sofa	resp	po2	sex	age	o2post
30000646	96.16667	2	17.66667	71	1	43	96.50000
30001336	97.00000	1	26.66667	71	1	77	96.00000
30001396	87.66667	2	25.66667	76	1	40	80.66667
30001446	99.33333	11	19.66667	98	1	56	99.00000
30001471	94.00000	1	15.66667	98	1	86	92.16667
30001555	94.33333	9	16.00000	98	0	33	96.66667

We consider the cohort of all patients in the database admitted to the ICU. Patients who are mechanically ventilated immediately upon entering the ICU are subsequently removed. By focusing on the time window of the first 48 hours from admission to ICU, for each patient we determine the earliest time of mechanical ventilation, labeled \(t_{MV}\). Since mechanical ventilation is used to stabilize the respiratory profile of patients, for each patient we determine the average oxygen saturation in the three-hour period \([t_{MV}-3, t_{MV}]\) prior to mechanical ventilation, labeled O\(_2\)-pre. We also determine the oxygen saturation in the three-hour period following ventilation \([t_{MV}, t_{MV}+3]\), labeled O\(_2\)-post. For controls (patient not ventilated at any point in the first 48 hours), we take the reference point as 12 hours after ICU admission, and calculate the values in three hours before and after this time. Patients’ respiratory stability, which represents the outcome of interest \(Y\), is measured as follows:

\[\begin{align} Y := \begin{cases} 0 \text{ if O}_2\text{-post} \geq 97, \\ -(\text{O}_2\text{-post}-97)^2 \text{ otherwise}. \end{cases} \end{align}\]

Values of oxygen saturation above 97 are considered stable, and the larger the distance from this stability value, the higher the risk for the patient. We also collect other important patient characteristics before intervention that are the key predictors of outcome, including the SOFA score, respiratory rate, and partial oxygen pressure (PaO\(_2\)).

Constructing the SFM

We next construct the Standard Fairness Model, with also a decision \(D\):

# constructing the SFM
X <- "sex"
Z <- "age"
W <- c("sofa", "po2", "o2prior")
D <- "respirator"
Y <- "o2post"

Obtaining Fair Decisions

Fair Decisions can be obtained by passing the data, SFM, and any transform functions of the potential outcome \(Y_{d}\) (po_transform argument) to the fair_decisions() function:

# fit the outcome control model
resp_oc <- fair_decisions(
  data, X = X, Z = Z, W = W, Y = Y, D = D, x0 = 0, x1 = 1,
  po_transform = function(x) ifelse(x < 97, -(x-97)^2, 0), po_diff_sign = 1
)

Note that in faircause 0.2.0 the columns of the input data to functions fair_decisions() and fair_predictions() need to be either numeric or integer. Use one-hot encoding wherever appropriate.

When constructing fair decisions, xgboost is first used to estimate the conditional average treatment effect (CATE) of \(D\) on \(Y\), labeled \(\Delta\), also referred to as benefit:

\[\begin{align} \Delta = E[Y_{d_1} - Y_{d_0} \mid x, z, w]. \end{align}\]

In the above, resp_oc is S3-class object of type fair_decision. We can use the autoplot() generic to analyze various important aspects of the decision making process:

# inspect the decomposition of D
autoplot(resp_oc, type = "decision")

# inspect the decomposition of Delta
autoplot(resp_oc, type = "delta")

# inspect benefit fairness
autoplot(resp_oc, type = "benefit_fairness")

Further generics can be applied to fair_decision objects, such as predict(), which allows us to get estimates of the benefit \(\Delta\) on some test data, and also find the optimal policy that satisfies benefit fairness:

# make predictions on test
test_data <- data
fair_dec <- predict(resp_oc, newdata = test_data, budget = 0.2)

# inspect the estimate benefit \Delta
head(fair_dec$delta, n = 100L)

  [1] 8.017566e-01 3.724638e-01 5.191853e+01 0.000000e+00 6.212983e+00
  [6] 4.203570e+00 1.008966e+01 1.269065e-01 0.000000e+00 3.286853e-01
 [11] 1.251144e+00 5.440242e+00 2.653410e+00 0.000000e+00 3.287728e-01
 [16] 6.071516e-01 4.757139e-01 0.000000e+00 1.529186e+00 2.971644e-01
 [21] 8.352732e-01 3.049420e-01 0.000000e+00 1.316464e-01 9.029436e-01
 [26] 2.170649e-02 2.222031e+00 6.038357e+00 1.162410e+00 9.170604e+00
 [31] 1.005439e+01 1.413266e+01 3.665277e-01 0.000000e+00 0.000000e+00
 [36] 3.129165e+00 4.450426e+00 1.166478e-01 0.000000e+00 2.907858e-01
 [41] 7.480980e-04 0.000000e+00 1.167838e+00 1.495915e+00 0.000000e+00
 [46] 0.000000e+00 0.000000e+00 8.373803e-01 1.013286e+01 0.000000e+00
 [51] 6.888062e-06 9.781933e-01 5.623761e+00 4.914779e+00 1.095073e+01
 [56] 2.950802e-05 1.074799e+01 2.525700e-01 4.529369e+00 2.937705e+01
 [61] 5.786662e-04 0.000000e+00 0.000000e+00 4.605885e+00 4.780608e+00
 [66] 6.791864e+00 1.930003e+01 0.000000e+00 5.194264e+00 2.610412e+00
 [71] 0.000000e+00 0.000000e+00 4.935268e-03 1.658654e+00 8.200450e-01
 [76] 9.657683e-02 6.761817e+00 1.841890e+00 2.859673e+00 0.000000e+00
 [81] 9.752755e-02 0.000000e+00 4.548055e-01 1.569106e+00 1.290448e+00
 [86] 2.937638e-01 1.533631e+01 5.638602e+00 1.379101e+01 0.000000e+00
 [91] 0.000000e+00 0.000000e+00 6.255346e+00 9.792346e-02 0.000000e+00
 [96] 1.116764e+00 9.761857e+00 2.054724e+00 3.707430e-01 4.080177e+00

# inspect the allocated decisions
head(fair_dec$decision, n = 100L)

  [1] 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0
 [38] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0
 [75] 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0