Process-level evaluation scores an agent's trajectory one step at a time, using step-level rewards or judgments to assess whether each action moves toward the goal.
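As a rough illustration, step-level scoring might be sketched as follows. This assumes a reference sequence of expected steps is available, which is not always the case; in practice the per-step judgment could instead come from a learned reward model or an LLM judge. The names `Step`, `judge_step`, and `evaluate_process`, the scoring values, and the example tools are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One agent action: a tool call and its arguments (hypothetical schema)."""
    tool: str
    args: dict = field(default_factory=dict)


def judge_step(actual: Step, expected: Step) -> float:
    """Toy step-level judgment (assumed scale):
    1.0 = correct, 0.5 = right tool but wrong arguments, 0.0 = wrong tool."""
    if actual.tool != expected.tool:
        return 0.0
    return 1.0 if actual.args == expected.args else 0.5


def evaluate_process(trajectory: list[Step], reference: list[Step]) -> dict:
    """Score each step against the reference.

    Reference steps the agent never reached score 0.0. Steps taken beyond
    the end of the reference are ignored in this simplified sketch.
    """
    scores = [judge_step(a, e) for a, e in zip(trajectory, reference)]
    missing = max(0, len(reference) - len(trajectory))
    scores.extend([0.0] * missing)
    return {
        "step_scores": scores,
        "process_score": sum(scores) / len(scores) if scores else 0.0,
        "stopped_early": missing > 0,
    }


# Made-up example: the agent picks a wrong parameter, then stops too soon.
reference = [
    Step("search", {"query": "flights to Paris"}),
    Step("book", {"flight_id": "AF123"}),
    Step("confirm"),
]
trajectory = [
    Step("search", {"query": "flights to Paris"}),
    Step("book", {"flight_id": "AF999"}),  # wrong parameter
]  # no "confirm" step: premature stopping

result = evaluate_process(trajectory, reference)
print(result)
```

The per-step scores localize where the run went wrong, which a single pass/fail on the final result would not do.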
Why It Matters
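Many agent failures are process failures: choosing the wrong tool, passing the wrong parameter, or stopping prematurely. Outcome-only evaluation misses these because it scores only the final result: a run can reach the right answer despite a faulty step, and a failed run gives no indication of which step went wrong.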