General description
===================


The paper deals with four optimization models:
1. Time-dependent and stochastic (DD-TD-OP-STW)
2. Time-dependent (TD-OP-STW)
3. Stochastic (DD-OP-STW)
4. Deterministic and time-independent (OP-STW) 

The following document explains how to generate our problem instances for each optimization model. 

In the paper, we present instances of 3 sizes: 24, 30 and 54. 
Each problem instance is defined by four types of input data:
1. Time windows
2. Service times
3. Travel times
4. Prizes

The time windows, prizes and service times for the instances of size <customer_number> are stored in a folder named "Instances" + <customer_number>. E.g., the Instances24 folder stores the time windows, prizes and service times of the instances of size 24.

Each "Instances" + <customer_number> folder contains three sub-folders.

The first folder is named "ServiceTimes". Each ServiceTimes folder contains 20 csv files, one for each instance. Each file contains <customer_number> rows and 40 columns. The item in the i'th row and j'th column represents the service time of customer i in scenario j. 

The service times for the DD-TD-OP-STW model and for the DD-OP-STW model are used exactly as given, one column per scenario. The service times for the TD-OP-STW model and the OP-STW model are calculated, for each customer, as the average of the 40 scenarios rounded to the nearest integer.
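The averaging step for the TD-OP-STW and OP-STW models can be sketched as follows. This is only an illustration, not the code used in the paper: the function names are hypothetical, the files are assumed to have no header row, and halves are assumed to round up (the text only says "nearest integer"). The sample uses a small in-memory table in place of a real ServiceTimes file.

```python
import csv
import io


def load_service_times(csv_file):
    """Read a ServiceTimes file: one row per customer, one column per scenario.

    Assumption: the file has no header row.
    """
    return [[float(x) for x in row] for row in csv.reader(csv_file) if row]


def deterministic_service_times(scenario_times):
    """Average each customer's service time over all scenarios and round it.

    Assumption: halves are rounded up; service times are non-negative.
    """
    return [int(sum(row) / len(row) + 0.5) for row in scenario_times]


# Small stand-in for a real file: 3 customers, 4 scenarios instead of 40.
sample = io.StringIO("10,12,11,13\n5,5,6,6\n20,22,21,21\n")
scenario_times = load_service_times(sample)
mean_times = deterministic_service_times(scenario_times)
print(mean_times)  # [12, 6, 21]
```

For a real instance, pass an open file object for the relevant csv file to load_service_times instead of the in-memory sample.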

The second folder is named "TimeWindows". Each TimeWindows folder contains 20 csv files, one for each instance. Each file contains <customer_number> rows and 2 columns. The item in the i'th row and first column represents the opening of customer i's time window (a[i]), while the item in the same row and second column represents the end of customer i's time window (b[i]). Recall that the time windows are the same for all optimization models.

The third folder is named "Prizes". Each Prizes folder contains 20 csv files, one for each instance. Each file contains 1 row and <customer_number> columns. The item in the j'th column represents the prize of the j'th customer. Recall that the prizes are the same for all optimization models.
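The TimeWindows and Prizes layouts described above could be read along the following lines. This is a minimal sketch: the function names are hypothetical, the files are assumed to have no header row, and small in-memory tables stand in for real files.

```python
import csv
import io


def load_time_windows(csv_file):
    """Return (a, b): opening and closing times, one entry per customer."""
    rows = [row for row in csv.reader(csv_file) if row]
    a = [float(row[0]) for row in rows]  # first column: opening a[i]
    b = [float(row[1]) for row in rows]  # second column: closing b[i]
    return a, b


def load_prizes(csv_file):
    """Return the prizes stored in the single row of a Prizes file."""
    rows = [row for row in csv.reader(csv_file) if row]
    return [float(x) for x in rows[0]]


# Stand-ins for real files of an instance with 3 customers.
a, b = load_time_windows(io.StringIO("0,120\n30,200\n60,180\n"))
prizes = load_prizes(io.StringIO("5,8,3\n"))
print(a, b, prizes)
```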

Next, we explain how to generate travel times data.
Because the final travel times data used in our experiments is relatively large, this kit does not store it directly. Instead, it contains a pickle file with the raw data needed to generate the travel times (TravelTimesAllDays60Nodes.pickle) and a Python function that uses the raw data to generate the travel times data used in the experiments for each optimization model (the function is called CreateMatrixesAndScenarios and is located in the CreateTravelTimeMatrix.py file).


As we have explained in the paper, each problem instance of each type contains a subset of the 60 locations. The file "Combinations" + <customer_number> + ".csv" contains the locations used for the instances of size <customer_number>. For example, Combinations24.csv contains 20 rows, one for each instance. Each row contains <customer_number>+1 items. The first item in every row is the depot, which is fixed in our experiments. The remaining <customer_number> items were chosen randomly (from the other 59 locations) to create instances with one depot and <customer_number> other locations.
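Picking the location subset of one instance out of a Combinations file might look like this. The function name is hypothetical, the file is assumed to have no header row, and instances are assumed to be numbered from 1 so that instance k sits in row k. A two-row in-memory table stands in for the real file.

```python
import csv
import io


def load_combination(csv_file, instance_number):
    """Return the location list of the given instance (1-based row number)."""
    rows = [[int(x) for x in row] for row in csv.reader(csv_file) if row]
    return rows[instance_number - 1]


# Stand-in for a Combinations file: each row is the depot plus 3 customers.
sample = io.StringIO("0,13,35,11\n0,7,42,19\n")
combination = load_combination(sample, 2)

# The first item is the depot; the rest are the customer locations.
depot, customers = combination[0], combination[1:]
print(depot, customers)  # 0 [7, 42, 19]
```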

The "CreateTravelTimeMatrix.py" file contains a Python function called "CreateMatrixesAndScenarios" that receives a subset of locations and an optimization model (the options are: 'DD-TD-OP-STW', 'TD-OP-STW', 'DD-OP-STW', 'OP-STW') and returns the full travel time data used in our experiments for this subset and optimization model.
 
This function reads the raw data stored in TravelTimesAllDays60Nodes.pickle and performs the required processing to generate the travel time data. Short documentation on the function is given below.
The function receives two parameters:
Combination - a list with the subset of locations for which the data is required.
OptType - the optimization model. One of the following: 'DD-TD-OP-STW', 'TD-OP-STW', 'DD-OP-STW', 'OP-STW'.

It returns a 4-dimensional list. The first dimension represents the scenarios (of length 40). The second and third dimensions represent the origin and destination locations (each of length equal to the number of locations) and the fourth dimension represents the time. The time dimension covers 4 * 540 = 2,160 periods, which is a safe upper bound for the length of the day in all problem instances. This data structure stores the travel times in all relevant scenarios, between all locations in the given combination.
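Indexing into a list with this nesting can be sketched as below. The helper travel_time is hypothetical, and it assumes that positions along the origin and destination dimensions follow the order of the locations in the combination list, which the description above does not state explicitly. A small dummy list with 2 scenarios, 3 locations and 4 periods stands in for the real output.

```python
def travel_time(data, combination, scenario, origin, destination, period):
    """Look up one travel time by location IDs rather than by position.

    Assumption: the origin/destination axes are ordered like `combination`.
    """
    i = combination.index(origin)
    j = combination.index(destination)
    return data[scenario][i][j][period]


# Dummy data with the same nesting: data[scenario][origin][destination][period].
combination = [0, 13, 35]
data = [[[[1000 * s + 100 * i + 10 * j + t for t in range(4)]
          for j in range(3)]
         for i in range(3)]
        for s in range(2)]

# Sizes of the four dimensions: scenarios, origins, destinations, periods.
shape = (len(data), len(data[0]), len(data[0][0]), len(data[0][0][0]))
value = travel_time(data, combination, scenario=1, origin=13,
                    destination=35, period=2)
print(shape, value)  # (2, 3, 3, 4) 1122
```

With the real output, the same shape tuple would be expected to read (40, number of locations, number of locations, 2160).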

For the 'TD-OP-STW' type, all travel times between the same pair of locations (i,j) at time t are the same among all scenarios.     

For the 'DD-OP-STW' type, all travel times between the same pair of locations (i,j) at scenario k are the same among all periods. 

For the 'OP-STW' type, all travel times between the same pair of locations (i,j) are the same among all periods and all scenarios. 
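The three properties above can be checked on a returned list with two small helpers. The function names are hypothetical and this is only a sanity-check sketch, shown on small dummy lists rather than on the real data.

```python
def constant_across_scenarios(data):
    """True if every scenario holds identical travel times ('TD-OP-STW')."""
    return all(scenario == data[0] for scenario in data)


def constant_across_periods(data):
    """True if each (scenario, i, j) series is flat over time ('DD-OP-STW')."""
    return all(len(set(series)) == 1
               for scenario in data
               for row in scenario
               for series in row)


# Dummy lists: 2 scenarios, 2 locations, 3 periods.
# td varies over time only, dd varies over scenarios only, op varies over neither.
td = [[[[t + i + j for t in range(3)] for j in range(2)] for i in range(2)]
      for _ in range(2)]
dd = [[[[s + i + j] * 3 for j in range(2)] for i in range(2)]
      for s in range(2)]
op = [[[[i + j] * 3 for j in range(2)] for i in range(2)]
      for _ in range(2)]

print(constant_across_scenarios(td), constant_across_periods(td))  # True False
print(constant_across_scenarios(dd), constant_across_periods(dd))  # False True
print(constant_across_scenarios(op), constant_across_periods(op))  # True True
```

For 'OP-STW' both checks should hold, while for 'DD-TD-OP-STW' neither is required to.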

If the optimization type or the dataset type is invalid, a string indicating the error is returned. Similarly, if the list of combinations is empty, an error string is returned. No other validity checks on the content of the combination are done. Recall that the locations are indexed in the range 0 to 59.
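Since errors are reported as a returned string rather than as an exception, a caller may want to wrap the call and to check the combination beforehand. The following is a sketch with hypothetical helper names. The duplicate check is an extra assumption that is not stated in this document.

```python
def validate_combination(combination, n_locations=60):
    """Run checks that the generator is documented not to perform itself."""
    if not combination:
        raise ValueError("combination is empty")
    bad = [loc for loc in combination if not 0 <= loc < n_locations]
    if bad:
        raise ValueError(f"locations outside 0..{n_locations - 1}: {bad}")
    # Assumption: a location should not appear twice in one instance.
    if len(set(combination)) != len(combination):
        raise ValueError("combination contains duplicate locations")


def unwrap_travel_times(result):
    """Raise if the generator returned an error string, else return the data."""
    if isinstance(result, str):
        raise ValueError(f"travel time generation failed: {result}")
    return result


validate_combination([0, 13, 35, 11])       # passes silently
checked = unwrap_travel_times([[[[1.0]]]])  # a list is passed through unchanged
```

A call would then read Data = unwrap_travel_times(CreateMatrixesAndScenarios(Combination, OptType)).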


Example
=======
To generate the scenario-based, time-dependent travel time data for the 4th instance of size 24:
1. Go to the file "Combinations24.csv". The relevant subset of locations of the 4th instance is in the 4th row, that is 0,13,35,11,37,10,18,32,49,52,8,33,41,50,38,47,45,57,17,25,23,53,58,28,29
2. Call the function "CreateMatrixesAndScenarios" in the file "CreateTravelTimeMatrix.py" with this subset of locations and the optimization type 'DD-TD-OP-STW', that is:

from CreateTravelTimeMatrix import CreateMatrixesAndScenarios
Data = CreateMatrixesAndScenarios([0,13,35,11,37,10,18,32,49,52,8,33,41,50,38,47,45,57,17,25,23,53,58,28,29], 'DD-TD-OP-STW')


  