Exp14 Step 1: BBox Feature MLP

A small MLP is trained to predict an 8-class action from (cx, cy, area, has_bbox) × history=3 window features obtained from pure HF Kosmos-2 grounding.
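The input described above is 4 features per frame over 3 frames, i.e. a 12-dimensional vector mapped to 8 action logits. The following is a minimal sketch of that shape, not the experiment's actual code. The box format (normalized `x1, y1, x2, y2`), the zero-padding for missing frames, the hidden width of 32, and all function names are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

# Assumed box format: (x1, y1, x2, y2) normalized to [0, 1]; None means no detection.
BBox = Optional[Tuple[float, float, float, float]]

HISTORY = 3          # from the text: history=3 window
FEATS_PER_FRAME = 4  # from the text: (cx, cy, area, has_bbox)
NUM_CLASSES = 8      # from the text: 8-class action


def frame_features(box: BBox) -> List[float]:
    """Return (cx, cy, area, has_bbox) for one frame; zeros if nothing was grounded."""
    if box is None:
        return [0.0, 0.0, 0.0, 0.0]
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, (x2 - x1) * (y2 - y1), 1.0]


def window_features(boxes: List[BBox]) -> List[float]:
    """Flatten the most recent HISTORY frames, left-padding with 'no bbox' (assumed)."""
    recent = boxes[-HISTORY:]
    padded = [None] * (HISTORY - len(recent)) + list(recent)
    return [v for box in padded for v in frame_features(box)]


def init_params(hidden: int = 32, seed: int = 0):
    """Random weights for a 12 -> hidden -> 8 MLP (hidden width is an assumption)."""
    rng = random.Random(seed)

    def layer(n_out: int, n_in: int):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        return w, [0.0] * n_out

    return layer(hidden, HISTORY * FEATS_PER_FRAME), layer(NUM_CLASSES, hidden)


def linear(x, w, b):
    """Dense layer: one dot product per output unit, plus bias."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def mlp_logits(x, params):
    """Forward pass: linear -> ReLU -> linear, returning NUM_CLASSES logits."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


if __name__ == "__main__":
    x = window_features([None, (0.2, 0.2, 0.6, 0.6)])
    logits = mlp_logits(x, init_params())
    print(len(x), len(logits))
```

Keeping `has_bbox` as an explicit feature lets the model distinguish "no detection" from a box that happens to sit at the origin, which zero-padding alone would not.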

Rule-based: 41.1% (baseline)
MLP (learned): 74.5% (391/525)

PM per Path Type (test split)

Path Type        Correct/Total  PM
center_straight  56/56          100.0%
center_left      24/54          44.4%
center_right     24/54          44.4%
left_straight    68/72          94.4%
left_left        40/55          72.7%
left_right       39/57          68.4%
right_straight   68/72          94.4%
right_left       39/57          68.4%
right_right      33/48          68.8%
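As a consistency check, the per-path-type counts can be re-aggregated into the overall PM. The sketch below copies the counts from the table. Grouping by the suffix after the last underscore is an assumption about how the path-type names are structured.

```python
# Correct/Total counts copied from the per-path-type table (test split).
rows = {
    "center_straight": (56, 56),
    "center_left": (24, 54),
    "center_right": (24, 54),
    "left_straight": (68, 72),
    "left_left": (40, 55),
    "left_right": (39, 57),
    "right_straight": (68, 72),
    "right_left": (39, 57),
    "right_right": (33, 48),
}


def pm(correct: int, total: int) -> float:
    """PM as a percentage."""
    return 100.0 * correct / total


total_correct = sum(c for c, _ in rows.values())
total = sum(t for _, t in rows.values())
overall = pm(total_correct, total)

# Aggregate by the last name component (assumed to be the path direction).
by_suffix = {}
for name, (c, t) in rows.items():
    suffix = name.rsplit("_", 1)[1]
    prev_c, prev_t = by_suffix.get(suffix, (0, 0))
    by_suffix[suffix] = (prev_c + c, prev_t + t)

if __name__ == "__main__":
    print(f"overall: {total_correct}/{total} = {overall:.1f}%")
    for suffix, (c, t) in by_suffix.items():
        print(f"{suffix}: {c}/{t} = {pm(c, t):.1f}%")
```

The rows sum to 391 correct out of 525, which rounds to 74.5%. The `*_straight` rows together give 192/200 (96.0%), so most of the remaining errors fall on the left and right path types.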
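Note: The train/test split is an episode-level stratified 80/20 split. The MLP input is the BBox features of the 3 most recent frames. This step checks whether the MLP is meaningfully better than the rule-based baseline, which decides whether to proceed to Step 2 (combining image features).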
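An episode-level stratified split keeps all frames of an episode on the same side of the split and holds out roughly the same fraction of episodes from each path type. The sketch below shows one way to do this. The function name, the `(episode_id, path_type)` input format, and the use of path type as the stratification key are assumptions, not details confirmed by the experiment.

```python
import random
from collections import defaultdict
from typing import List, Tuple


def episode_stratified_split(
    episodes: List[Tuple[str, str]], test_frac: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Split episode ids into (train, test), stratified by path type.

    episodes: list of (episode_id, path_type) pairs (hypothetical format).
    """
    by_type = defaultdict(list)
    for ep_id, path_type in episodes:
        by_type[path_type].append(ep_id)

    rng = random.Random(seed)
    train, test = [], []
    for path_type in sorted(by_type):
        ids = sorted(by_type[path_type])  # sort first so the shuffle is reproducible
        rng.shuffle(ids)
        # Hold out about test_frac of each stratum, and at least one episode.
        n_test = max(1, round(len(ids) * test_frac))
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test


if __name__ == "__main__":
    path_types = ["center_straight", "center_left", "left_right"]
    episodes = [(f"{p}_{i:02d}", p) for p in path_types for i in range(10)]
    train_ids, test_ids = episode_stratified_split(episodes)
    print(len(train_ids), len(test_ids))
```

Frames are then assigned to train or test by looking up their episode id, so the 3-frame history windows never straddle the split.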