================================================================
Instances for the DD-TD-TSP-STW based on Arigliano et al (2015)
================================================================

General description
===================

As  described in Avraham and Raviv (2020), each dataset is divided to:
1. Training data that contains the first 40 scenarios
2. Test data that contains the last 20 scenarios

Moreover, we dealt with 4 optimization models in the paper.
1. Time-dependent and stochastic (DD-TD-TSP-STW)
2. Time-dependent (TD-TSP-STW)
3. Stochastic (DD-TSP-STW)
4. Deterministic and time-independent (TSP-STW) 

The following document explains how to generate the problem instances for each optimization model. Note that the test set is only used for the evaluation of the performance of all models in the stochastic and time-dependent reality. 

In the paper, we deal with instances of 4 sizes: 15, 20, 30 and 40. 
In addition, there are two types of problem input that constitute the problem:
1. Time windows
2. Travel times

As noted in the paper, the service times in these instances is 0.

Both the time windows and the travel time matrix are generated by applying the function CreateDataFiles that is in the python file CreateDataFiles.py. 



The CreateDataFiles function - input and output
=================================================
The function has 6 arguments:
OptType - optimization model. One of: 'DD-TD-TSP-STW' , 'TD-TSP-STW', 'DD-TSP-STW', 'TSP-STW'

DataSetType - One of the following: 'Training' , 'Test'

N - number of customers. Four possible options: 15, 20, 30, 40

Width - width of time windows. It has two possible options: 100, 50

Delta - measure for the effect of time dependency. It has three options: 70, 90, 98

pattern - describes the pattern of congestion. It has two options - 'A' , 'B'. 
 
Each unique combination of N, Width, Delta and pattern is related to a single instance that we solved. for more information related to the N, Width, Delta and pattern arguments, the interested reader may see 
the original paper of Arigliano et al (2015). 

Any activation of the function with values that are invalid may yield an error message that terminates the operation of the function. 

The function returns 3 data structures:

a - a list of tuples. Each tuple has two items. The first item is the node (0 represents the depot) and the second item is the beginning of its time window 

b - a list of tuples. Each tuple has two items. The first item is the node (0 represents the depot) and the second item is the end of its time window 

TravelTimes - The related travel times. The data structure related to the travel times changes according to the OptType argument 

For the 'DD-TD-TSP-STW' type, a 4-dimensional list is returned. The first dimension represents the scenarios (of length 40 for training set and 20 for the test set). The second and third dimensions represent the origin and destination locations (of length equal to the number of locations) and the fourth dimension represents the time. In particular, the number of time periods that are considered is equal to the value of the end of the depot's time window. A time granulation of one minute is considered. This data structure stores the travel time in all relevant (training/test) scenarios, between all locations in the given combination, in all periods. It is assumed that journeys that begin after the end of the depot's time window are characterized by the same travel time as those that begin at the end of the depot's time window. 

For the 'TD-TSP-STW' type, a 3-dimensional list is returned. The first and second dimensions represent the origin and destination locations (of length equal to the number of locations) and the third dimension represents the time. In particular, the number of time periods that are considered is equal to the value of the end of the depot's time window. A time granulation of one minute is considered. This data structure stores the travel time between all locations in the given combination, in all periods.  
It is assumed that journeys that begin after the end of the depot's time window are characterized by the same travel time as those that begin at the end of the depot's time window. 

For the 'DD-TSP-STW' type, a 3-dimensional list is returned. The first dimension represents the scenarios (of length 40 for the training set and 20 for the test set). The second and third dimensions represent the origin and destination locations (of length equal to the number of locations). This data structure stores the travel time in all relevant (training/test) scenarios, between all locations in the given combination.

For the 'TSP-STW' type, a 2-dimensional list is returned. The first and second dimensions represent the origin and destination  locations (of length equal to the number of locations)  



The CreateDataFiles function - additional details
=======================================================================================

Comment
=======
You do not need to go in depth into this part for a proper use of the function. This part is mainly helpful for understanding the relation between our test instances and their origins in Arigliano et al (2015). 

Arigliano et al (2015) instances 
==================================
Each of the 48 instances that we generate here is based on an instance presented in Arigliano et al (2015). In Arigliano et al (2015), 30 instances exist for each category of N, Width, Delta, and pattern. These 30 instances are comprised of 3 sets, denoted by A,B and C. Each set contains 10 instances. Recall that we solved only one instance of each category of N, Width, Delta, and pattern. Next, we map each instance we solved to its origin in Arigliano et al (2015):

15,100,70,A: B4
15,100,70,B: C6
15,100,90,A: C7
15,100,90,B: A3
15,100,98,A: B8
15,100,98,B: A2
15,50,70,A:  C9
15,50,70,B:  A5
15,50,90,A:  B9
15,50,90,B:  B6
15,50,98,A:  B5
15,50,98,B:  A1
20,100,70,A: B9
20,100,70,B: B10
20,100,90,A: A5
20,100,90,B: C7
20,100,98,A: A7
20,100,98,B: C8
20,50,70,A:  C6
20,50,70,B:  A2
20,50,90,A:  A9
20,50,90,B:  B4
20,50,98,A;  A8
20,50,98,B:  C9
30,100,70,A: C7
30,100,70,B: A9
30,100,90,A: A5
30,100,90,B: A10
30,100,98,A: C5
30,100,98,B: A1
30,50,70,A:  A7
30,50,70,B:  C2
30,50,90,A:  C4
30,50,90,B:  B2
30,50,98,A:  C9
30,50,98,B:  B5
40,100,70,A: C1
40,100,70,B: B3
40,100,90,A: B2
40,100,90,B: A3
40,100,98,A: A5
40,100,98,B: B9
40,50,70,A:  B10
40,50,70,B;  C5
40,50,90,A:  A6
40,50,90,B:  C7
40,50,98,A:  A8
40,50,98,B:  C9   

Each instance in Arigliano et al (2015) is comprised of two data files. 
The first data file describes the congestion patterns of the city and is called "new_Jams_" + <Delta> + "_C_" + <pattern>. All instances that are identical in their associated values for delta and pattern relate to the same file. That is, in our test instances, 6 (2 patterns multiplied by 3 delta's) files of this sort are required and therefore, kept, in their origianl name:
new_Jams_70_C_A
new_Jams_90_C_B
new_Jams_98_C_A
new_Jams_70_C_B
new_Jams_90_C_A
new_Jams_98_C_B

The second data file contains time windows, distance matrixes, base speeds and other related data. Each problem instance is associated with a different file, and therefore, 48 files exist. their names are as follows:
 
<N> + "_" + <delta> + "_" + <pattern> + "_" + <width> + "_" + <X>

N, delta, pattern and width are as presented earlier. X is obtained from the following mapping that is kept in the file DictInstances.csv :
15,70,A,100:14
15,90,A,100:27
15,98,A,100:18
15,70,B,100:26
15,90,B,100:3
15,98,B,100:2
15,70,A,50: 29
15,90,A,50: 19
15,98,A,50: 15
15,70,B,50: 5
15,90,B,50: 16
15,98,B,50: 1
20,70,A,100:19
20,90,A,100:5
20,98,A,100:7
20,70,B,100:20
20,90,B,100:27
20,98,B,100:28
20,70,A,50: 26
20,90,A,50: 9
20,98,A,50: 8
20,70,B,50: 2
20,90,B,50: 14
20,98,B,50: 29
30,70,A,100:27
30,90,A,100:5
30,98,A,100:25
30,70,B,100:9
30,90,B,100:10
30,98,B,100:1
30,70,A,50: 7
30,90,A,50: 24
30,98,A,50: 29
30,70,B,50: 22
30,90,B,50: 12
30,98,B,50: 15
40,70,A,100:21
40,90,A,100:12
40,98,A,100:5
40,70,B,100:13
40,90,B,100:3
40,98,B,100:19
40,70,A,50: 20
40,90,A,50: 6
40,98,A,50: 8
40,70,B,50: 25
40,90,B,50: 27
40,98,B,50: 29

Stochasticity
=============
Stochasticity is added to the Arigliano et al (2015) instances by creating 60 scenarios for each instance. In each scenario, a factor for each cluster and period (1..73 periods in Arigliano et al (2015)) is created. Next, the speed at each scenario, period and cluster is obtained after multiplying the original speed of Arigliano et al (2015) with the factor. The multipliers of all instances are kept as a 4 dimensional list in the pickle file "MultipliersArgiliano.pickle". 
The first dimension in the list relates to the instance. This dimension is indexed as 0..47 and the mapping of each instance, according to the N, Width, Delta, and pattern values, to its index is given below:   

15,100,70,A: 0
15,100,70,B: 1
15,100,90,A: 2
15,100,90,B: 3
15,100,98,A: 4
15,100,98,B: 5
15,50,70,A:  6
15,50,70,B:  7
15,50,90,A:  8
15,50,90,B:  9
15,50,98,A:  10
15,50,98,B:  11
20,100,70,A: 12
20,100,70,B: 13
20,100,90,A: 14
20,100,90,B: 15
20,100,98,A: 16
20,100,98,B: 17
20,50,70,A:  18
20,50,70,B:  19
20,50,90,A:  20
20,50,90,B:  21
20,50,98,A;  22
20,50,98,B:  23
30,100,70,A: 24
30,100,70,B: 25
30,100,90,A: 26
30,100,90,B: 27
30,100,98,A: 28
30,100,98,B: 29
30,50,70,A:  30
30,50,70,B:  31
30,50,90,A:  32
30,50,90,B:  33
30,50,98,A:  34
30,50,98,B:  35
40,100,70,A: 36
40,100,70,B: 37
40,100,90,A: 38
40,100,90,B: 39
40,100,98,A: 40
40,100,98,B: 41
40,50,70,A:  42
40,50,70,B;  43
40,50,90,A:  44
40,50,90,B:  45
40,50,98,A:  46
40,50,98,B:  47

The second dimension is the scenario, indexed by 0..59. 
The third dimension is the time period, indexed by 0..72.
The fourth dimension is the cluster, indexed by 0..2. 

The source code for creating "MultipliersArgiliano.pickle" is given in the python file CreateMultipliers.py. In this script the random seed is intitalized so the process is replicable.
If you want to generate different scenario, change this seed.


Inner functions
===============

checkTriangolarInEqulaity(TravelTimes) - recieves a time dependent travel time matrix, that possibly does not maintain the traingular inequality in time dependent settings, and returns a time dependent travel time matrix in which the traingular inequality in time dependent settings is maintained

checkTriangolarInEqulaityBasic(TravelTimesBasic) - recieves a travel time matrix, that possibly does not maintain the traingular inequality, and returns a travel time matrix in which the traingular inequality is maintained

checkTriangolarInEqulaityStochastic(TravelTimesScenarios) - recieves a stochastic travel time matrix, that possibly does not maintain the traingular inequality, and returns a stochastic travel time matrix in which the traingular inequality is maintained

returnTravelTime(i,j,t) - calculates the travel time between locations i and j in departure time t. It implements the algorithm of Cordeau et al (2014) and uses the speeds, that were calculated by the main function.

returnTravelTimeScenario (i,j,t,s) -  calculates the travel time between locations i and j in departure time t and scenario s. It implements the algorithm of Cordeau et al (2014) and uses adapted speeds, that are obtained when multiplying the speeds in the main function with the factors kept in the "MultipliersArgiliano.pickle" file. 
 