Clarify better_than_baseline in challenge descriptions, and add extra precision for knapsack.

FiveMovesAhead 2025-04-25 18:05:22 +01:00
parent f4c233aeb4
commit dbb6927ffa
3 changed files with 11 additions and 7 deletions

@@ -54,12 +54,14 @@ When evaluating this selection, we can confirm that the total weight is less tha
This selection is 27% better than the baseline:
```
-better_than_baseline = (total_value - baseline_value) / baseline_value
-= (127 - 100) / 100
+better_than_baseline = total_value / baseline_value - 1
+= 127 / 100 - 1
= 0.27
```
# Our Challenge
In TIG, the baseline value is determined by a two-stage approach. First, items are selected based on their value-to-weight ratio, including interaction values, until the capacity is reached. Then, a tabu-based local search refines the solution by swapping items to improve value while avoiding reversals, with early termination for unpromising swaps.
-Each instance of TIG's knapsack problem contains 16 random sub-instances with their own baseline selection & baseline value. For each sub-instance, the total value of your selection is used to calculate a `better_than_baseline`. Your "average" `better_than_baseline` over the sub-instances must be greater than or equal to the specified difficulty `better_than_baseline`, where the average uses root mean square. Please see the challenge code for a precise specification.
+Each instance of TIG's knapsack problem contains 16 random sub-instances, each with its own baseline selection and baseline value. For each sub-instance, we calculate how much your selection's total value exceeds the baseline value, expressed as a percentage improvement. This improvement percentage is called `better_than_baseline`. Your overall performance is measured by taking the root mean square of these 16 `better_than_baseline` percentages. To pass a difficulty level, this overall score must meet or exceed the specified difficulty target.
+For precision, `better_than_baseline` is stored as an integer where each unit represents 0.01%. For example, a `better_than_baseline` value of 150 corresponds to 150/10000 = 1.5%.
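
The knapsack pass check described above can be sketched roughly as follows. This is a minimal illustration that follows the prose literally (a plain root mean square over the per-sub-instance values); the function names are hypothetical, and the challenge code, whose fragment in this commit subtracts 1.0 after the square root, remains the authoritative specification.

```rust
/// Hypothetical helper: relative improvement of a selection over the baseline,
/// using the formula from the description (total_value / baseline_value - 1).
fn better_than_baseline(total_value: f64, baseline_value: f64) -> f64 {
    total_value / baseline_value - 1.0
}

/// Hypothetical helper: plain root mean square of the per-sub-instance values.
/// Assumption: the real challenge code may aggregate differently.
fn root_mean_square(values: &[f64]) -> f64 {
    let sum_of_squares: f64 = values.iter().map(|v| v * v).sum();
    (sum_of_squares / values.len() as f64).sqrt()
}

/// Hypothetical pass check. The difficulty target is an integer in units of
/// 0.01%, so it is divided by 10000 to obtain a fraction.
fn passes(values: &[f64], difficulty_better_than_baseline: u32) -> bool {
    let threshold = difficulty_better_than_baseline as f64 / 10000.0;
    root_mean_square(values) >= threshold
}
```

For instance, under these assumptions 16 sub-instances that each improve on the baseline by 2% would pass a difficulty target of 150 (1.5%), while 16 sub-instances at 1% would not.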

@@ -75,15 +75,17 @@ When evaluating these routes, each route has demand less than 200, the number of
These routes are 20.6% better than the baseline:
```
-better_than_baseline = (baseline_total_distance - total_distance) / baseline_total_distance
-= (3875 - 3074) / 3875
+better_than_baseline = 1 - total_distance / baseline_total_distance
+= 1 - 3074 / 3875
= 0.206
```
## Our Challenge
In TIG, the baseline route is determined by using Solomon's I1 insertion heuristic that iteratively inserts customers into routes based on a cost function that balances distance and time constraints. The routes are built one by one until all customers are served.
-Each instance of TIG's vehicle routing problem contains 16 random sub-instances with their own baseline routes & baseline distance. For each sub-instance, the total distance of your routes is used to calculate a `better_than_baseline`. Your "average" `better_than_baseline` over the sub-instances must be greater than the specified difficulty `better_than_baseline`, where the average uses root mean square. Please see the challenge code for a precise specification.
+Each instance of TIG's vehicle routing problem contains 16 random sub-instances, each with its own baseline routes and baseline distance. For each sub-instance, we calculate how much your routes' total distance is shorter than the baseline distance, expressed as a percentage improvement. This improvement percentage is called `better_than_baseline`. Your overall performance is measured by taking the root mean square of these 16 `better_than_baseline` percentages. To pass a difficulty level, this overall score must meet or exceed the specified difficulty target.
+For precision, `better_than_baseline` is stored as an integer where each unit represents 0.1%. For example, a `better_than_baseline` value of 22 corresponds to 22/1000 = 2.2%.
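
For vehicle routing, shorter is better, so the ratio is subtracted from 1, and the integer difficulty uses coarser units than knapsack. A small sketch of both conversions, with hypothetical function names that are not taken from the challenge code:

```rust
/// Hypothetical helper: relative improvement of the routes over the baseline,
/// using the formula from the description (1 - total / baseline).
fn routing_better_than_baseline(total_distance: f64, baseline_total_distance: f64) -> f64 {
    1.0 - total_distance / baseline_total_distance
}

/// Hypothetical helper: converts the integer difficulty, stored in units of
/// 0.1%, into a fraction (for example 22 becomes 0.022, i.e. 2.2%).
fn routing_difficulty_to_fraction(units: u32) -> f64 {
    units as f64 / 1000.0
}
```

With the numbers from the example above, `routing_better_than_baseline(3074.0, 3875.0)` gives roughly 0.206.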
## Applications
* **Logistics & Delivery Services:** Optimizes parcel and ship routing by ensuring vehicles meet customer and operational time constraints, reducing operational costs and environmental impact [^1].

@@ -121,7 +121,7 @@ impl crate::ChallengeTrait<Solution, Difficulty, 2> for Challenge {
             / better_than_baselines.len() as f64)
             .sqrt()
             - 1.0;
-        let threshold = self.difficulty.better_than_baseline as f64 / 1000.0;
+        let threshold = self.difficulty.better_than_baseline as f64 / 10000.0;
         if average >= threshold {
             Ok(())
         } else {