This function takes the input data and creates the data products necessary for the SCUL procedure. Its output is used by all later programs.
OrganizeDataAndSetup(
time,
y,
TreatmentBeginsAt,
x.DonorPool,
CohensDThreshold = 0.25,
NumberInitialTimePeriods = nrow(y) - TreatmentBeginsAt + 1,
x.PlaceboPool = x.DonorPool,
TrainingPostPeriodLength = nrow(y) - TreatmentBeginsAt + 1,
OutputFilePath = getwd()
)
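As a small illustration of the two computed defaults in the usage above, the sketch below evaluates `nrow(y) - TreatmentBeginsAt + 1` on made-up data. The 30-period series and the treatment start at period 19 are assumptions chosen only to show the arithmetic.

```r
# Hypothetical target series with 30 time periods (made-up data)
y <- data.frame(y = seq_len(30))
TreatmentBeginsAt <- 19

# Same expression as the defaults for NumberInitialTimePeriods and
# TrainingPostPeriodLength: periods from treatment start through the end
PostPeriodLength <- nrow(y) - TreatmentBeginsAt + 1
PostPeriodLength  # 12 (periods 19 through 30)
```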
Argument | Description |
---|---|
time | A dataframe that is a column vector (T by 1) with the running time variable. Must be sorted by time (oldest to most recent). |
y | A dataframe that is a column vector (T by 1) containing the target variable of interest. Must be sorted by time (oldest to most recent). |
TreatmentBeginsAt | An integer indicating the time period (<T) in which treatment begins. |
x.DonorPool | A (T by K) data frame containing the full set of donor pool candidates that will be used to construct synthetic groups. Must be sorted by time (oldest to most recent). |
CohensDThreshold | A real number greater than 0, indicating the Cohen's D threshold at which fit is determined to be "poor". The difference is in standard deviation units. Default is 0.25. |
NumberInitialTimePeriods | An integer indicating the minimum number of pre-treatment time periods to be included in the training data for the first cross-validation run. Default is the length of the post-treatment time period. |
x.PlaceboPool | A (T by J) data frame containing all products that you wish to include in the placebo distribution. Must be sorted by time (oldest to most recent). Default is x.DonorPool. |
TrainingPostPeriodLength | The number of post-treatment time periods to use for training data. Defaults to all time periods since treatment begins. |
OutputFilePath | A file path to store output. Default is the current working directory. |
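To make the `CohensDThreshold` argument more concrete, here is a rough sketch of a Cohen's D style fit check: the average absolute gap between an actual and a predicted pre-treatment series, scaled by the standard deviation of the actual series. The exact statistic used inside the package is not shown on this page, so both the formula and the helper name `FitIsPoor` are assumptions for illustration only.

```r
# Hypothetical helper (not part of the package): flags poor fit when the
# standardized mean absolute difference exceeds the threshold
FitIsPoor <- function(actual, predicted, CohensDThreshold = 0.25) {
  d <- mean(abs(actual - predicted)) / sd(actual)
  d > CohensDThreshold
}

actual <- c(10, 12, 14, 16, 18)   # made-up pre-treatment values
close_fit <- actual + 0.1         # small, constant miss
far_fit <- actual + 5             # large, constant miss

FitIsPoor(actual, close_fit)  # FALSE
FitIsPoor(actual, far_fit)    # TRUE
```

Because the difference is expressed in standard deviation units, the same threshold can be applied to series measured on different scales.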
Returns InputDataForSCUL, a list of items that will be called on when running the SCUL procedure.
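The argument shapes described above can be sketched with simulated data. Everything in this example is made up: the 30 periods, the 4 donor series, and the treatment start at period 19 are assumptions, and the call assumes the function is exported from an installed package named `scul`, so it is guarded and skipped otherwise.

```r
set.seed(1)
T_periods <- 30   # total number of time periods (assumption)
K <- 4            # number of donor series (assumption)

# time and y are one-column data frames, sorted oldest to most recent
time <- data.frame(time = seq_len(T_periods))
y <- data.frame(y = cumsum(rnorm(T_periods)))

# T by K data frame of donor pool candidates
x.DonorPool <- as.data.frame(matrix(rnorm(T_periods * K), nrow = T_periods))
names(x.DonorPool) <- paste0("donor", seq_len(K))

TreatmentBeginsAt <- 19   # must be less than the number of periods

# Basic shape checks implied by the argument descriptions
stopifnot(nrow(time) == nrow(y), nrow(y) == nrow(x.DonorPool),
          TreatmentBeginsAt < nrow(y), !is.unsorted(time[[1]]))

# Only call the package function if scul is installed (assumed package name)
if (requireNamespace("scul", quietly = TRUE)) {
  InputDataForSCUL <- scul::OrganizeDataAndSetup(
    time = time,
    y = y,
    TreatmentBeginsAt = TreatmentBeginsAt,
    x.DonorPool = x.DonorPool,
    OutputFilePath = tempdir()   # avoid writing into the working directory
  )
}
```

Leaving `x.PlaceboPool`, `NumberInitialTimePeriods`, and `TrainingPostPeriodLength` unset lets them fall back to the defaults listed in the usage.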