This function takes the input data and creates the data products necessary for the SCUL procedure. The resulting list is used by all later functions.

OrganizeDataAndSetup(
  time,
  y,
  TreatmentBeginsAt,
  x.DonorPool,
  CohensDThreshold = 0.25,
  NumberInitialTimePeriods = nrow(y) - TreatmentBeginsAt + 1,
  x.PlaceboPool = x.DonorPool,
  TrainingPostPeriodLength = nrow(y) - TreatmentBeginsAt + 1,
  OutputFilePath = getwd()
)

Arguments

time

A data frame with a single column (T by 1) containing the running time variable. Must be sorted by time (oldest to most recent).

y

A data frame with a single column (T by 1) containing the target variable of interest. Must be sorted by time (oldest to most recent).

TreatmentBeginsAt

An integer indicating the time period (<T) in which treatment begins.

x.DonorPool

A (T by K) data frame containing the full set of donor pool candidates that will be used to construct synthetic groups. Must be sorted by time (oldest to most recent).

CohensDThreshold

A real number greater than 0, indicating the Cohen's D threshold above which fit is determined to be "poor". The difference is in standard deviation units. Default is 0.25.
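To make the threshold concrete, the following minimal sketch computes a fit statistic in standard deviation units and compares it with the default threshold. The formula used here (mean absolute pre-treatment gap divided by the standard deviation of the target) and the names y.actual, y.synthetic, cohens.d, and fit.is.poor are assumptions for illustration; the package's internal calculation may differ.

```r
# Hypothetical pre-treatment series: the actual target and a synthetic prediction.
y.actual    <- c(10, 12, 11, 13, 14, 12, 15, 16)
y.synthetic <- c(10.2, 11.7, 11.4, 12.8, 13.9, 12.3, 14.6, 16.1)

# Assumed definition: mean absolute gap, scaled by the standard deviation
# of the actual series, so the result is in standard deviation units.
cohens.d <- mean(abs(y.actual - y.synthetic)) / sd(y.actual)

# Compare against the documented default threshold.
CohensDThreshold <- 0.25
fit.is.poor <- cohens.d > CohensDThreshold
```

Under this assumed definition, a larger CohensDThreshold tolerates a looser pre-treatment fit before it is flagged as poor.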

NumberInitialTimePeriods

An integer indicating the minimum number of pre-treatment time periods to be included in the training data for the first cross-validation run. Default is the length of the post-treatment time period.

x.PlaceboPool

A (T by J) data frame containing all products that you wish to include in the placebo distribution. Must be sorted by time (oldest to most recent). Default is x.DonorPool.

TrainingPostPeriodLength

The number of post-treatment time periods to use for the training data. Default is the full post-treatment period, from the period in which treatment begins through the last period.

OutputFilePath

A file path in which to store output. Default is the current working directory.
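The defaults for NumberInitialTimePeriods and TrainingPostPeriodLength are both nrow(y) - TreatmentBeginsAt + 1. As a small worked example of that arithmetic, the sketch below uses hypothetical values (20 time periods, treatment beginning in period 15); the names n.periods, PostPeriodLength, and PrePeriodLength are illustrative only.

```r
# Hypothetical setup: 20 time periods, treatment begins in period 15.
n.periods <- 20
TreatmentBeginsAt <- 15

# Post-treatment length counts the treatment period itself (periods 15 to 20).
PostPeriodLength <- n.periods - TreatmentBeginsAt + 1

# Pre-treatment length is every period before treatment (periods 1 to 14).
PrePeriodLength <- TreatmentBeginsAt - 1

# Both arguments default to the post-treatment length.
NumberInitialTimePeriods <- PostPeriodLength
TrainingPostPeriodLength <- PostPeriodLength
```

The "+ 1" is there because the period in which treatment begins is itself part of the post-treatment window.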

Value

InputDataForSCUL A list of items that will be called on for running the SCUL procedure.
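
As an illustration of how the arguments fit together, the sketch below builds simulated inputs, checks the requirements stated above, and passes them to the function. The simulated data, the choice of treatment period, and the object name SCUL.input are assumptions made for this example. The call runs only if the scul package is installed.

```r
# Simulated inputs: 30 time periods and 5 donor series (hypothetical data).
set.seed(1)
n.periods <- 30
time <- data.frame(time = 1:n.periods)
x.DonorPool <- data.frame(matrix(rnorm(n.periods * 5), ncol = 5))
y <- data.frame(y = rowMeans(x.DonorPool) + rnorm(n.periods, sd = 0.1))
TreatmentBeginsAt <- 25

# Requirements from the argument descriptions: matching row counts,
# time sorted from oldest to most recent, and treatment starting before T.
stopifnot(
  nrow(time) == nrow(y),
  nrow(x.DonorPool) == nrow(y),
  !is.unsorted(time[[1]]),
  TreatmentBeginsAt < nrow(y)
)

# Call the function only when the package is available.
if (requireNamespace("scul", quietly = TRUE)) {
  SCUL.input <- scul::OrganizeDataAndSetup(
    time = time,
    y = y,
    TreatmentBeginsAt = TreatmentBeginsAt,
    x.DonorPool = x.DonorPool,
    CohensDThreshold = 0.25,
    OutputFilePath = tempdir()  # keep output out of the working directory
  )
  str(SCUL.input, max.level = 1)
}
```

The remaining arguments are left at their defaults here, so x.PlaceboPool is the donor pool and both length arguments equal the number of post-treatment periods.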