score

Functions for scoring predicted ARC grids against ground truth output grids

We need a way of evaluating predicted ARC grids against the true output grids.

A simple approach is to calculate the proportion of correct cells.

def score(
    truth: ArcGrid,  # True ARC grid
    pred: ArcGrid    # Predicted ARC grid
) -> float:
    "Score a predicted grid against the true grid"
    if pred == truth: return 1.0

    return float(np.mean(pred.data == truth.data))

pair = ArcPair(input_grid = np.array([[1,2,3],[4,5,6]]),
               output_grid = np.array([[3,2,1],[4,5,6]]))

pair.plot(titles=['Truth', 'Prediction'])

score(*pair)

0.6666666666666666

Often the predicted grid has the wrong shape. We could simply return 0.0. Instead, let’s pad the grids to be equal shape and assign partial credit to correctly predicted cells in the overlapping region

source

score

 score (truth:arcsolver.task.ArcGrid, pred:arcsolver.task.ArcGrid|None)

Score a predicted grid against the true grid

	Type	Details
truth	ArcGrid	True ARC grid
pred	arcsolver.task.ArcGrid \| None	Predicted ARC grid
Returns	float

Exported source

def score(
    truth: ArcGrid,         # True ARC grid
    pred: ArcGrid | None    # Predicted ARC grid
) -> float:
    "Score a predicted grid against the true grid"
    if pred is None: return 0.0
    if pred == truth: return 1.0
    
    # Calculate shape penalty
    rows_ratio = min(truth.shape[0], pred.shape[0]) / max(truth.shape[0], pred.shape[0])
    cols_ratio = min(truth.shape[1], pred.shape[1]) / max(truth.shape[1], pred.shape[1])
    shape_penalty = rows_ratio * cols_ratio

    # Get overlapping region dimensions
    overlap_rows = min(truth.shape[0], pred.shape[0])
    overlap_cols = min(truth.shape[1], pred.shape[1])

    # Calculate color accuracy in overlapping region
    true_overlap = truth.data[:overlap_rows, :overlap_cols]
    pred_overlap = pred.data[:overlap_rows, :overlap_cols]
    color_accuracy = np.mean(true_overlap == pred_overlap)

    return float(shape_penalty * color_accuracy)

score(*pair)

0.6666666666666666

pair = ArcPair(input_grid = np.array([[1,2,3],[4,5,6]]),
               output_grid = np.array([[3,2,1],[4,5,6], [7,8,9]]))
pair.plot(titles=['Truth', 'Prediction'])

score(*pair)

0.4444444444444444

pair = ArcPair(input_grid = np.array([[1,2,3],[4,5,6],[7,8,9]]),
               output_grid = np.array([[3,2,1],[4,5,6]]))
pair.plot(titles=['Truth', 'Prediction'])

score(*pair)

0.4444444444444444