Exp14 Step 3: Full Dataset + WINDOW=8 + 32×32

Changes from Step 2: data 45→150 episodes, WINDOW 3→8, image 16²→32² grayscale.
Input dim: 268 → 1056 floats. MLP: 512→256→128→8.
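
The input dimensions follow from the window length and image size: each frame contributes 4 bbox features (cx, cy, area, has_bbox) and the image is flattened. The sketch below reproduces that arithmetic and counts MLP parameters. It assumes plain fully connected layers with biases, with 512/256/128 as hidden widths and 8 as the output width; the helper names `input_dim` and `mlp_param_count` are illustrative and not taken from the experiment code.

```python
BBOX_FEATURES = 4  # cx, cy, area, has_bbox


def input_dim(window: int, image_side: int) -> int:
    """Flattened input size: per-frame bbox features plus one grayscale image."""
    return window * BBOX_FEATURES + image_side * image_side


def mlp_param_count(widths: list[int]) -> int:
    """Weights plus biases for dense layers (assumption: every layer has a bias)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


step2_dim = input_dim(window=3, image_side=16)   # 3*4 + 256  = 268
step3_dim = input_dim(window=8, image_side=32)   # 8*4 + 1024 = 1056

# Assumed layer layout for Step 3: 1056 -> 512 -> 256 -> 128 -> 8
step3_params = mlp_param_count([step3_dim, 512, 256, 128, 8])
print(step2_dim, step3_dim, step3_params)
```

Under these assumptions most of the parameters sit in the first layer, so the jump from 268 to 1056 inputs accounts for most of the growth in model size.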

Step 2 (baseline)

75.9%

Step 2 reference (45 ep, W=3)

Step 3 PM

77.0%

(404/525)

Data

2626

windows (150 ep)

PM per Path Type (test split)

Path Type	Correct/Total	PM
center_straight	47/56	83.9%
center_left	33/54	61.1%
center_right	25/54	46.3%
left_straight	68/72	94.4%
left_left	39/55	70.9%
left_right	42/57	73.7%
right_straight	71/72	98.6%
right_left	40/57	70.2%
right_right	39/48	81.2%
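
As a quick consistency check, the per-path-type counts above can be aggregated into an overall PM. This is a minimal sketch; the counts are copied from the table, while the variable and function names are illustrative.

```python
# (correct, total) per path type, copied from the table above.
RESULTS = {
    "center_straight": (47, 56),
    "center_left": (33, 54),
    "center_right": (25, 54),
    "left_straight": (68, 72),
    "left_left": (39, 55),
    "left_right": (42, 57),
    "right_straight": (71, 72),
    "right_left": (40, 57),
    "right_right": (39, 48),
}


def overall_pm(results):
    """Return (correct, total, PM in percent) pooled over all path types."""
    correct = sum(c for c, _ in results.values())
    total = sum(t for _, t in results.values())
    return correct, total, 100.0 * correct / total


def weakest_path(results):
    """Path type with the lowest per-type PM."""
    return min(results, key=lambda name: results[name][0] / results[name][1])


correct, total, pm = overall_pm(RESULTS)
print(f"{correct}/{total} = {pm:.2f}%")  # 404/525 = 76.95%
print(weakest_path(RESULTS))             # center_right
```

The pooled figure rounds to 77.0%. The straight paths are the strongest in each start position, while center_right is the clear outlier at 46.3%.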

End-to-End Gate Check (Exp17 / Exp18)

Model	PM	Closed-loop	FPE	Interpretation
Exp14 Step2	75.9%	66.7%	0.55m	Strongest practical baseline
Exp17	76.95%	11.1%	1.04m	PM only improved, rollout failed
Exp18	27.62%	11.1%	1.04m	Text-fusion gate failed

Settings
Input: WINDOW=8 × (cx, cy, area, has_bbox) + 32×32 grayscale image
Output: 8-class discrete action | epochs=300 | AdamW lr=2e-3
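
One way the input described above could be assembled is sketched below. The concatenation order (bbox features first, then the row-major flattened image) and the function name `build_input` are assumptions for illustration; the actual preprocessing is not shown in this report.

```python
WINDOW = 8
IMAGE_SIDE = 32


def build_input(bbox_window, image):
    """Flatten WINDOW bbox tuples and one grayscale image into a single float list.

    bbox_window: sequence of WINDOW tuples (cx, cy, area, has_bbox)
    image: IMAGE_SIDE rows of IMAGE_SIDE grayscale values
    """
    if len(bbox_window) != WINDOW:
        raise ValueError(f"expected {WINDOW} frames, got {len(bbox_window)}")
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError(f"expected a {IMAGE_SIDE}x{IMAGE_SIDE} image")

    features = []
    for cx, cy, area, has_bbox in bbox_window:
        features.extend([float(cx), float(cy), float(area), float(has_bbox)])
    for row in image:  # row-major flattening (assumed order)
        features.extend(float(pixel) for pixel in row)
    return features


# Dummy data only, to check the resulting shape.
frames = [(0.5, 0.5, 0.1, 1)] * WINDOW
blank = [[0.0] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]
vector = build_input(frames, blank)
print(len(vector))  # 1056
```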

Exp18 Failure Pattern Summary
The best checkpoint reached val_loss 1.325, but actual PM was only 27.62% and closed-loop success only 11.1%.
In the PM confusion matrix, all 95 FORWARD samples were misclassified as FWD+R, and in closed-loop only center_right succeeded.
In other words, text embedding fusion did not resolve the current end-to-end instability, and the strongest baseline is still Exp14 Step2.
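
A collapse like the FORWARD → FWD+R case can be flagged automatically from a confusion matrix. The sketch below is a hypothetical helper, not part of the experiment code. Only the FORWARD row is filled in, because that is the only row reported here.

```python
def collapsed_classes(confusion):
    """Find true labels whose samples were all assigned to a single wrong label.

    confusion[true_label][predicted_label] = count
    Returns a list of (true_label, predicted_label, count).
    """
    collapsed = []
    for true_label, row in confusion.items():
        total = sum(row.values())
        if total == 0:
            continue  # no samples for this class, nothing to judge
        for predicted_label, count in row.items():
            if predicted_label != true_label and count == total:
                collapsed.append((true_label, predicted_label, count))
    return collapsed


# Only the row reported in the summary; other rows are unknown and omitted.
confusion = {"FORWARD": {"FORWARD": 0, "FWD+R": 95}}
print(collapsed_classes(confusion))  # [('FORWARD', 'FWD+R', 95)]
```

Running a check of this kind alongside val_loss during checkpoint selection would surface cases where the loss looks acceptable but an entire class is never predicted.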