
Exp19: Step2 + Goal-Near Proxy Features

This experiment extends the Exp14 Step2 features (bbox history plus a 16x16 grayscale image feature) with four non-leaky proxy signals: area, center_error_x, abs_delta_cx, and recent_bbox_consistency.

Step 2 Reference: 75.9% (bbox + image baseline)
Exp19 Proxy: 76.6% (121/158)
Delta vs Step 2: +0.6% (same split protocol)

PM per Path Type (test split)

Path Type         Correct/Total   PM
center_straight   11/14           78.6%
center_left       12/18           66.7%
center_right      10/18           55.6%
left_straight     17/18           94.4%
left_left         15/19           78.9%
left_right        13/19           68.4%
right_straight    17/18           94.4%
right_left        16/18           88.9%
right_right       10/16           62.5%
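
As a quick consistency check, the per-path counts in the table can be summed to recover the overall Exp19 figure. The sketch below only re-does the arithmetic from the numbers reported here; the helper name `pm` is an illustrative assumption, not part of the project code.

```python
# (correct, total) per path type, copied from the table above.
counts = {
    "center_straight": (11, 14),
    "center_left": (12, 18),
    "center_right": (10, 18),
    "left_straight": (17, 18),
    "left_left": (15, 19),
    "left_right": (13, 19),
    "right_straight": (17, 18),
    "right_left": (16, 18),
    "right_right": (10, 16),
}


def pm(correct: int, total: int) -> float:
    """Return PM as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


per_path = {name: pm(c, t) for name, (c, t) in counts.items()}
total_correct = sum(c for c, _ in counts.values())
total = sum(t for _, t in counts.values())
overall = pm(total_correct, total)

print(f"overall: {total_correct}/{total} = {overall}%")
```

The per-path counts add up to the 121/158 reported for Exp19 Proxy, so the table and the headline number agree.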
Proxy pack
area, center_error_x, abs_delta_cx, recent_bbox_consistency.
The split protocol and backbone capacity were kept identical to Step 2.
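
The report names the four proxy signals but does not define them, so the following is only a minimal sketch under stated assumptions: boxes are `(x1, y1, x2, y2)` normalized to `[0, 1]`, `center_error_x` is the signed offset of the box center from the image center, `abs_delta_cx` is the frame-to-frame change in center x, and `recent_bbox_consistency` is the mean IoU of consecutive boxes in a short window. The function names, the window size, and the IoU-based definition are hypothetical. The sketch uses only the current and past boxes, which is one reading of "non-leaky".

```python
from typing import Dict, List, Tuple

# Assumed box format: (x1, y1, x2, y2), normalized to [0, 1].
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def proxy_features(history: List[Box], window: int = 4) -> Dict[str, float]:
    """Compute the four proxy signals from a non-empty bbox history (oldest first)."""
    x1, y1, x2, y2 = history[-1]
    cx = (x1 + x2) / 2.0
    if len(history) >= 2:
        px1, _, px2, _ = history[-2]
        abs_delta_cx = abs(cx - (px1 + px2) / 2.0)
    else:
        abs_delta_cx = 0.0  # assumption: no motion signal on the first frame
    recent = history[-window:]
    pairs = list(zip(recent[:-1], recent[1:]))
    # Assumption: consistency is the mean IoU of consecutive recent boxes.
    consistency = sum(iou(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0
    return {
        "area": (x2 - x1) * (y2 - y1),
        "center_error_x": cx - 0.5,
        "abs_delta_cx": abs_delta_cx,
        "recent_bbox_consistency": consistency,
    }
```

For example, `proxy_features([(0.4, 0.4, 0.6, 0.6), (0.5, 0.4, 0.7, 0.6)])` describes a box that has shifted right by 0.1 and overlaps its predecessor with an IoU of 1/3.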

Server Deployment (2026-05-01)

Working around the end-to-end collapse: Exp19 Proxy is deployed as a separate FastAPI server.
After the Exp35/36/38 evaluations confirmed that the end-to-end model collapses to 100% FORWARD, the decomposition baseline was promoted to the main production line.
Pipeline: Pure HF Kosmos-2 grounding → bbox+image MLP (Exp19 proxy features) → 8-class action.
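
The three-stage pipeline above can be pictured as a thin wrapper that feeds a grounding step into a bbox history and then into a classifier. This is a structural sketch only: the class name `DecomposedPolicy`, the callable signatures, the history length, and the choice to skip frames without a box are all assumptions, and the real Kosmos-2 and MLP calls are replaced by injected callables.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple

# Assumed box format: (x1, y1, x2, y2), normalized to [0, 1].
Box = Tuple[float, float, float, float]


class DecomposedPolicy:
    """Hypothetical wrapper: grounding -> bbox history -> classifier -> class id."""

    def __init__(
        self,
        ground: Callable[[bytes, str], Optional[Box]],
        classify: Callable[[List[Box], bytes], int],
        history_len: int = 8,  # assumption: the real window size is not stated
    ) -> None:
        self.ground = ground
        self.classify = classify
        self.history: Deque[Box] = deque(maxlen=history_len)

    def reset(self) -> None:
        """Clear the bbox history, e.g. at the start of an episode."""
        self.history.clear()

    def step(self, image: bytes, instruction: str) -> int:
        """Ground the target, update the history, and return a class id."""
        box = self.ground(image, instruction)
        if box is not None:  # assumption: frames without a box are skipped
            self.history.append(box)
        return self.classify(list(self.history), image)
```

Keeping the grounding and classification stages as separate callables mirrors the decomposition idea: each stage can be swapped or tested on its own, which is not possible with a single end-to-end model.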

Weights (full 150 episodes, 2026-05-01)

Item                   Value
Dataset                docs/v5/bbox_nav_step1/bbox_dataset_full.json (150 episodes, 2626 frames)
Train / Test windows   2101 / 525
Best test_acc          76.4% (220-epoch best)
Weights                docs/v5/bbox_nav_exp19_proxy/exp19_proxy_mlp.pt (450KB)

Running (minum server)

source .venv/bin/activate
export VLA_API_KEY=<your-key>
export VLA_PROXY_DATASET_FILE=$PWD/docs/v5/bbox_nav_step1/bbox_dataset_full.json
export VLA_PROXY_DEVICE=cuda          # cpu is also sufficient for the MLP (~1 ms)
export VLA_PROXY_GROUNDING_DEVICE=cuda  # GPU recommended (CPU takes ~21 s per frame)
python3 robovlm_nav/serve/proxy_inference_server.py --port 8001

API

Endpoint        Description
GET /health     Model load status, test_acc, and device status
POST /predict   Requires the X-API-Key header. Input {image: b64, instruction: str} → output {action, action_3d, predicted_class, predicted_label, bbox, grounding_caption, ...}
POST /reset     Clears the history (call at the start of an episode)
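
A client call to POST /predict might be assembled as follows. This is a sketch using only the Python standard library; it assumes the server accepts a JSON body, that `image` is the base64 encoding of the raw image file bytes, and that the server is reachable at `http://localhost:8001` (the port comes from the run command, the host is an assumption). The helper name `build_predict_request` is hypothetical.

```python
import base64
import json
import urllib.request


def build_predict_request(
    image_bytes: bytes,
    instruction: str,
    api_key: str,
    base_url: str = "http://localhost:8001",  # assumed host; port from the run command
) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request with the X-API-Key header."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "instruction": instruction,
    }
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(req)` and decode the response with `json.load`. Calling POST /reset before the first frame of each episode keeps the bbox history from leaking across episodes.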

Smoke Test Results (frames 0/9/17, center_left episode)

Frame       Predicted   BBox entity                                         Caption
0 (start)   RIGHT       caption:center fallback                             "the center of the image, with the white wall..."
9 (mid)     FORWARD     "the gray air conditioner" (basket misrecognized)   "the end of the room, and the gray air conditioner..."
17 (end)    FORWARD     large white wall (is_basket: false)                 "the end of a hallway, with a white wall..."

Class diversity confirmed: the predictions span {FORWARD, RIGHT}, unlike the 100% FORWARD collapse seen in end-to-end Exp35/36/38. Pure HF grounding labeling the gray basket as "air conditioner" is the known 33% recognition issue (RECOGNITION_PROOF_RESULT_20260428); the directional signals (cx, cy, area) are normal.

Compatibility Across the Two Servers

An automatic path resolver was added for billy (/home/billy/25-1kp/MoNaVLA/ROS_action/...) ↔ minum (/home/minum/minum/26CS/MoNa-pi/...). The data directory can be forced with the VLA_PROXY_DATA_DIR environment variable.
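
One plausible shape for such a resolver is "environment override first, then the first candidate directory that exists". The sketch below illustrates that idea only; the function name `resolve_data_dir` and its signature are assumptions, and the candidate paths are passed in by the caller because the actual directories are elided in this report.

```python
import os
from pathlib import Path
from typing import Iterable, Optional


def resolve_data_dir(
    candidates: Iterable[str],
    env_var: str = "VLA_PROXY_DATA_DIR",
) -> Optional[Path]:
    """Return the forced directory if env_var is set, else the first existing candidate."""
    forced = os.environ.get(env_var)
    if forced:
        # The override wins even if the path does not exist yet (assumption).
        return Path(forced)
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None  # caller decides how to report a missing data directory
```

Checking the environment variable before probing the filesystem keeps the override predictable on a machine where both candidate directories happen to exist.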