Aryaman's research archive

Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers

Aryaman Gupta, Kaustav Chakraborty, Somil Bansal — 2023 · arXiv

Added May 28, 2026
reachability-analysis anomaly-detection runtime-monitoring fallback-controller out-of-distribution safety

TL;DR

Hamilton-Jacobi (HJ) reachability stress-tests a vision-based controller offline to mine system-level failures; the mined failures then train a runtime anomaly classifier that triggers a fallback controller.

Summary

Vision-based controllers can fail catastrophically on out-of-distribution (OOD) inputs, and component-level anomaly detectors miss failures that cascade through the closed loop. Offline HJ reachability automatically labels unsafe images, an EfficientNet-B0 classifier detects anomalies at runtime, and a fallback controller slows the aircraft. On the TaxiNet taxiing task this reduces the backward reachable tube (BRT) volume from 25.75% to 18.24%.
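For reference, the quoted reduction works out as follows. The relative figure is derived here from the two reported percentages; it is not quoted from the paper.

```latex
\Delta = 25.75\% - 18.24\% = 7.51\ \text{pp},
\qquad
\frac{7.51}{25.75} \approx 29.2\%\ \text{(relative reduction)}
```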

Key contributions

  • HJ reachability-based automatic labeling of system-level anomalous images across diverse simulator conditions, with no manual annotation.
  • Runtime anomaly classifier plus velocity-reduction fallback that shrinks the closed-loop failure set (BRT volume) by ~7.5 pp, from 25.75% to 18.24%, on aircraft taxiing.
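A minimal sketch of how a runtime classifier and a velocity-reduction fallback might be wired into a control loop. The threshold, both speeds, the choice to keep the nominal steering unchanged, and the per-frame scores are hypothetical; in the paper the score would come from the trained image classifier.

```python
from dataclasses import dataclass

NOMINAL_SPEED = 5.0    # hypothetical nominal taxi speed (m/s)
FALLBACK_SPEED = 1.0   # hypothetical reduced speed under fallback (m/s)
THRESHOLD = 0.5        # hypothetical decision threshold on the anomaly score


@dataclass
class Command:
    speed: float
    steering: float
    fallback_active: bool


def select_command(anomaly_score: float, nominal_steering: float) -> Command:
    """Keep the nominal steering, but slow down when the monitor flags the image."""
    flagged = anomaly_score >= THRESHOLD
    speed = FALLBACK_SPEED if flagged else NOMINAL_SPEED
    return Command(speed=speed, steering=nominal_steering, fallback_active=flagged)


# Hypothetical (anomaly score, nominal steering) pairs, one per camera frame.
frames = [(0.05, 0.10), (0.91, -0.30), (0.40, 0.00)]
for score, steering in frames:
    print(select_command(score, steering))
```

Only the second frame crosses the threshold, so only that frame is commanded at the reduced speed.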

Novelty

Extends Chakraborty & Bansal (RA-L 2023) from offline failure discovery to online detection and mitigation, without privileged environment information at test time.

Methods

  • Hamilton-Jacobi BRT computation with the Level Set Toolbox over a 101³ grid, using the X-Plane simulator
  • EfficientNet-B0 binary classifier trained on a BRT-labeled dataset of 240K images
  • Velocity-reduction fallback controller triggered at runtime by the anomaly classifier
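The labeling step above can be illustrated with a toy. This sketch does not solve the HJ PDE and does not use the Level Set Toolbox; it substitutes brute-force closed-loop rollouts on a coarse grid, which for a deterministic closed loop marks the same kind of set (states from which the system leaves the safe region within the horizon). The 2-state dynamics, gains, runway half-width, `perceive` stub, and grid size are all hypothetical stand-ins, not values from the paper.

```python
import math

RUNWAY_HALF_WIDTH = 10.0   # hypothetical safe bound on crosstrack error (m)
SPEED = 5.0                # hypothetical constant taxi speed (m/s)
DT, HORIZON = 0.1, 8.0     # hypothetical Euler step and time horizon (s)


def perceive(px, theta):
    """Stand-in for the vision pipeline: returns the estimated (px, theta).

    Hypothetical failure mode: the crosstrack estimate saturates at 4 m,
    mimicking a network that misreads the runway edge.
    """
    return min(px, 4.0), theta


def controller(px_hat, theta_hat):
    """Hypothetical proportional steering law, clipped to [-1, 1] rad/s."""
    return max(-1.0, min(1.0, -0.1 * px_hat - 1.0 * theta_hat))


def leaves_runway(px, theta):
    """Roll out the closed loop; True if |px| exceeds the bound within HORIZON."""
    t = 0.0
    while t < HORIZON:
        if abs(px) > RUNWAY_HALF_WIDTH:
            return True
        u = controller(*perceive(px, theta))
        px += DT * SPEED * math.sin(theta)   # crosstrack dynamics
        theta += DT * u                      # heading dynamics
        t += DT
    return abs(px) > RUNWAY_HALF_WIDTH


def linspace(lo, hi, n):
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def label_grid(n_px=21, n_theta=11):
    """Label every grid state; an image rendered at that state inherits the label."""
    labels = {}
    for px in linspace(-RUNWAY_HALF_WIDTH, RUNWAY_HALF_WIDTH, n_px):
        for theta in linspace(-0.5, 0.5, n_theta):
            labels[(px, theta)] = leaves_runway(px, theta)
    return labels


labels = label_grid()
unsafe_fraction = sum(labels.values()) / len(labels)
print(f"unsafe fraction of grid: {unsafe_fraction:.2%}")
```

Rollouts only stand in for a reachability solver here because the toy closed loop is deterministic and has no disturbance input.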

Strengths

  • Outperforms prediction-error and ensemble baselines; an explicit state-space visualization shows why component-level labels mislead.
  • Emergent semantic understanding (runway markings, night lights) is learned without heuristics and validated against ground-truth BRTs on unseen airports.

Weaknesses

  • Single domain (aircraft taxiing) and a 3-state system; the curse of dimensionality limits scalability; no real-world validation.
  • The fallback controller (velocity reduction) is trivial; there is no comparison against more capable alternatives (SLAM, MPC) and no formal safety guarantee.

Future work

  • Replace grid-based HJ with DeepReach for high-dimensional systems.
  • Build an anomaly detector that uses temporal context or visual history; retrain TaxiNet incrementally on mined failures.

Key insights

  • Component-level prediction error is pessimistic in some state regions and optimistic in others, which makes it a poor proxy for system-level safety.
  • BRT-derived labels let a classifier learn semantic failure modes (runway markings, lighting) without manual heuristics.
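The first insight can be made concrete by comparing two labelings of the same states: one from a threshold on perception error (component-level) and one from BRT membership (system-level). The four states, the error values, the threshold, and the BRT flags below are made-up illustrative numbers, not data from the paper.

```python
ERROR_THRESHOLD = 1.0  # hypothetical bound on the perception error (m)

# (name, perception error in m, inside BRT?) -- all values are illustrative.
states = [
    ("centerline, small error", 0.2, False),
    ("centerline, large error", 2.5, False),  # controller recovers anyway
    ("near edge, small error", 0.6, True),    # small error still ends off the runway
    ("near edge, large error", 3.0, True),
]


def compare(states, threshold):
    """Split states by how the component-level flag relates to the BRT label."""
    pessimistic, optimistic, agree = [], [], []
    for name, error, in_brt in states:
        component_flag = error > threshold
        if component_flag and not in_brt:
            pessimistic.append(name)   # flagged, yet the closed loop stays safe
        elif in_brt and not component_flag:
            optimistic.append(name)    # not flagged, yet the closed loop fails
        else:
            agree.append(name)
    return pessimistic, optimistic, agree


pessimistic, optimistic, agree = compare(states, ERROR_THRESHOLD)
print("pessimistic:", pessimistic)
print("optimistic:", optimistic)
```

Both mismatch lists are non-empty in this toy, which is the pattern the insight describes: no single error threshold lines up with the system-level label.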

My thoughts

At its core this is a supervised runtime monitor: a binary classifier trained on BRT-derived labels.


Extracted by claude-sonnet-4-6.