Episodic Sparse Cost Evaluation for Policy Analysis in Stochastic Shortest Path Problems
DOI:
https://doi.org/10.29103/techsi.v16i2.25800
Abstract
Conventional evaluations of stochastic shortest path policies typically rely on dense reward or cost signals, which often obscure rare but behaviorally critical interactions. This paper introduces an episodic sparse-cost evaluation framework that assigns costs only to a small subset of state-action pairs identified through a short probing phase, thereby decoupling cost accumulation from trajectory length. The objective of this study is to assess whether episodic sparse costs can provide a more interpretable and behavior-focused evaluation of policy execution than dense formulations. The framework is empirically validated through controlled navigation experiments under a fixed policy in a grid-based stochastic shortest path setting. In a representative episode, the agent reached the terminal state in 95 steps while incurring only two cost-triggering events drawn from a sparse support set of size five. This resulted in a total episodic cost of 2.0 and a hit rate of 0.021 (2 of 95 steps), indicating that more than 97% of agent-environment interactions were cost-free. The costs appeared over time as isolated impulses rather than continuous signals, enabling precise localization of critical decision points along the trajectory. These findings indicate that episodic sparse-cost evaluation yields bounded, event-driven cost behavior that remains stable even for long trajectories. The proposed framework offers a transparent and scalable alternative for analyzing policy behavior in stochastic environments, particularly in scenarios where rare violations, constraints, or risk-sensitive interactions are of primary concern. Future research will extend this evaluation paradigm to multi-episode analysis, adaptive policies, and integration with constraint-aware learning objectives.
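The arithmetic behind the reported episode can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `sparse_episode_cost`, the unit cost of 1.0 per hit (chosen because two events yielding a total cost of 2.0 is consistent with it), and the placeholder trajectory and support set are all assumptions made for illustration. Only the counts (95 steps, support size five, two hits) come from the abstract.

```python
from typing import Hashable, List, Sequence, Set, Tuple

StateAction = Tuple[Hashable, Hashable]


def sparse_episode_cost(
    trajectory: Sequence[StateAction],
    support: Set[StateAction],
    unit_cost: float = 1.0,  # assumption: every hit costs the same amount
) -> Tuple[float, float, List[int]]:
    """Return (total cost, hit rate, time steps of the hits) for one episode."""
    # A step incurs cost only if its state-action pair is in the sparse support.
    hit_steps = [t for t, pair in enumerate(trajectory) if pair in support]
    total_cost = unit_cost * len(hit_steps)
    # Hit rate is the fraction of steps that triggered a cost.
    hit_rate = len(hit_steps) / len(trajectory) if trajectory else 0.0
    return total_cost, hit_rate, hit_steps


# Hypothetical 95-step episode: states are labeled 0..94, with one dummy action.
trajectory = [(t, "move") for t in range(95)]

# Hypothetical support set of size five; only two of its pairs are visited.
support = {(10, "move"), (60, "move"), (200, "move"), (201, "move"), (202, "move")}

total, rate, hits = sparse_episode_cost(trajectory, support)
print(total, round(rate, 3), hits)  # 2.0 0.021 [10, 60]
```

Because cost is accrued only on membership in the support set, the total is bounded by the number of hits rather than by episode length, and the returned step indices mark where along the trajectory the isolated cost impulses occur.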
License
Copyright (c) 2026 Fahmi Izhari, R. S. Putra, H. F. S. Simbolon, Ade Linhar

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors retain copyright and grant the journal the right of first publication, with the work licensed under a Creative Commons Attribution-ShareAlike 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
All articles in this journal may be disseminated provided that valid sources are listed and the title of the article is not omitted. The author is responsible for the content of the article.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
When disseminating an article, the author must declare the TECHSI Journal as the first party to publish the article.
