
Action-Token Attention Analysis

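This analysis measures whether the action token attends to the image region (64 tokens) or to the text region.
Each model receives the same single image while the instruction is switched between left, right, and forward. After a forward pass, the summed attention over the image region and over the text region is compared in the action row of the last layer.
Hypothesis: if image_ratio far exceeds text_ratio, this confirms a structural cause for the model ignoring the instruction.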
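The measurement described above can be sketched as follows. This is a minimal illustration, not the script that produced the tables: the token layout (64 image tokens first, then `text_len` text tokens, then the action token), the averaging over heads, and the names `region_ratios` and `summarize_layers` are all assumptions made for the example. Only the 64-token image region and the last-layer / mean-over-layers summary come from the report.

```python
import numpy as np

NUM_IMAGE_TOKENS = 64  # from the report: image(64)


def region_ratios(attn, action_idx, text_len):
    """Return (image_ratio, text_ratio) for one layer.

    attn: softmax attention weights of shape (num_heads, seq_len, seq_len).
    Assumed layout (hypothetical): [image tokens | text tokens | action token].
    """
    # Action row, averaged over heads (averaging is an assumption).
    row = attn[:, action_idx, :].mean(axis=0)
    image_ratio = row[:NUM_IMAGE_TOKENS].sum()
    text_ratio = row[NUM_IMAGE_TOKENS:NUM_IMAGE_TOKENS + text_len].sum()
    return float(image_ratio), float(text_ratio)


def summarize_layers(per_layer_attn, action_idx, text_len):
    """Compute last-layer and mean-over-layers ratios from a list of layers."""
    ratios = np.array(
        [region_ratios(a, action_idx, text_len) for a in per_layer_attn]
    )
    mean_img, mean_txt = ratios.mean(axis=0)
    return {
        "last_img": float(ratios[-1, 0]),
        "last_txt": float(ratios[-1, 1]),
        "mean_img": float(mean_img),
        "mean_txt": float(mean_txt),
    }


# Toy input: uniform attention over 64 image + 12 text + 1 action tokens.
text_len = 12
seq_len = NUM_IMAGE_TOKENS + text_len + 1
uniform = np.full((4, seq_len, seq_len), 1.0 / seq_len)
summary = summarize_layers([uniform] * 24, action_idx=seq_len - 1, text_len=text_len)
print(summary)
```

With uniform attention the image share is simply 64/77 and the text share 12/77, which makes a convenient sanity check before running the same functions on real attention maps.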

Summary (Last Layer / Mean over Layers)

| Model | Instruction | text_len | Last Img% | Last Text% | Mean Img% | Mean Text% |
|---|---|---|---|---|---|---|
| exp41b_resume_exp40_pta | left | 12 | 100.0% | 0.0% | 83.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 12 | 100.0% | 0.0% | 83.6% | 0.0% |
| exp41b_resume_exp40_pta | forward | 11 | 100.0% | 0.0% | 83.5% | 0.0% |
| exp41c_scratch_pta | left | 12 | 98.9% | 0.0% | 61.3% | 0.0% |
| exp41c_scratch_pta | right | 12 | 98.9% | 0.0% | 61.3% | 0.0% |
| exp41c_scratch_pta | forward | 11 | 98.6% | 0.0% | 61.2% | 0.0% |
| exp42_counterfactual_pta | left | 12 | 99.0% | 0.0% | 63.4% | 0.0% |
| exp42_counterfactual_pta | right | 12 | 99.0% | 0.0% | 63.4% | 0.0% |
| exp42_counterfactual_pta | forward | 11 | 99.0% | 0.0% | 63.1% | 0.0% |
| exp43_cross_attn_text | left | 12 | 99.7% | 0.0% | 67.0% | 0.0% |
| exp43_cross_attn_text | right | 12 | 99.7% | 0.0% | 67.0% | 0.0% |
| exp43_cross_attn_text | forward | 11 | 99.7% | 0.0% | 67.0% | 0.0% |

Per Layer

| Model | Instruction | Layer | Image% | Text% |
|---|---|---|---|---|
| exp41b_resume_exp40_pta | left | 0 | 46.4% | 0.0% |
| exp41b_resume_exp40_pta | left | 1 | 22.3% | 0.0% |
| exp41b_resume_exp40_pta | left | 2 | 12.1% | 0.0% |
| exp41b_resume_exp40_pta | left | 3 | 32.6% | 0.0% |
| exp41b_resume_exp40_pta | left | 4 | 63.7% | 0.0% |
| exp41b_resume_exp40_pta | left | 5 | 71.7% | 0.0% |
| exp41b_resume_exp40_pta | left | 6 | 86.3% | 0.0% |
| exp41b_resume_exp40_pta | left | 7 | 89.5% | 0.0% |
| exp41b_resume_exp40_pta | left | 8 | 95.0% | 0.0% |
| exp41b_resume_exp40_pta | left | 9 | 98.6% | 0.0% |
| exp41b_resume_exp40_pta | left | 10 | 97.3% | 0.0% |
| exp41b_resume_exp40_pta | left | 11 | 95.9% | 0.0% |
| exp41b_resume_exp40_pta | left | 12 | 98.8% | 0.0% |
| exp41b_resume_exp40_pta | left | 13 | 99.4% | 0.0% |
| exp41b_resume_exp40_pta | left | 14 | 99.2% | 0.0% |
| exp41b_resume_exp40_pta | left | 15 | 99.0% | 0.0% |
| exp41b_resume_exp40_pta | left | 16 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | left | 17 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | left | 18 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | left | 19 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | left | 20 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | left | 21 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | left | 22 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | left | 23 | 100.0% | 0.0% |
| exp41b_resume_exp40_pta | right | 0 | 46.4% | 0.0% |
| exp41b_resume_exp40_pta | right | 1 | 22.3% | 0.0% |
| exp41b_resume_exp40_pta | right | 2 | 12.1% | 0.0% |
| exp41b_resume_exp40_pta | right | 3 | 32.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 4 | 63.7% | 0.0% |
| exp41b_resume_exp40_pta | right | 5 | 71.7% | 0.0% |
| exp41b_resume_exp40_pta | right | 6 | 86.3% | 0.0% |
| exp41b_resume_exp40_pta | right | 7 | 89.5% | 0.0% |
| exp41b_resume_exp40_pta | right | 8 | 95.0% | 0.0% |
| exp41b_resume_exp40_pta | right | 9 | 98.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 10 | 97.3% | 0.0% |
| exp41b_resume_exp40_pta | right | 11 | 95.9% | 0.0% |
| exp41b_resume_exp40_pta | right | 12 | 98.8% | 0.0% |
| exp41b_resume_exp40_pta | right | 13 | 99.4% | 0.0% |
| exp41b_resume_exp40_pta | right | 14 | 99.2% | 0.0% |
| exp41b_resume_exp40_pta | right | 15 | 99.0% | 0.0% |
| exp41b_resume_exp40_pta | right | 16 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | right | 17 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 18 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 19 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | right | 20 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | right | 21 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | right | 22 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | right | 23 | 100.0% | 0.0% |
| exp41b_resume_exp40_pta | forward | 0 | 46.2% | 0.0% |
| exp41b_resume_exp40_pta | forward | 1 | 21.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 2 | 11.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 3 | 32.8% | 0.0% |
| exp41b_resume_exp40_pta | forward | 4 | 63.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 5 | 72.4% | 0.0% |
| exp41b_resume_exp40_pta | forward | 6 | 86.8% | 0.0% |
| exp41b_resume_exp40_pta | forward | 7 | 89.2% | 0.0% |
| exp41b_resume_exp40_pta | forward | 8 | 95.2% | 0.0% |
| exp41b_resume_exp40_pta | forward | 9 | 98.3% | 0.0% |
| exp41b_resume_exp40_pta | forward | 10 | 97.1% | 0.0% |
| exp41b_resume_exp40_pta | forward | 11 | 95.4% | 0.0% |
| exp41b_resume_exp40_pta | forward | 12 | 98.8% | 0.0% |
| exp41b_resume_exp40_pta | forward | 13 | 99.3% | 0.0% |
| exp41b_resume_exp40_pta | forward | 14 | 99.1% | 0.0% |
| exp41b_resume_exp40_pta | forward | 15 | 98.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 16 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | forward | 17 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | forward | 18 | 99.5% | 0.0% |
| exp41b_resume_exp40_pta | forward | 19 | 99.6% | 0.0% |
| exp41b_resume_exp40_pta | forward | 20 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 21 | 99.9% | 0.0% |
| exp41b_resume_exp40_pta | forward | 22 | 99.7% | 0.0% |
| exp41b_resume_exp40_pta | forward | 23 | 100.0% | 0.0% |
| exp41c_scratch_pta | left | 0 | 31.6% | 0.0% |
| exp41c_scratch_pta | left | 1 | 38.9% | 0.0% |
| exp41c_scratch_pta | left | 2 | 5.2% | 0.0% |
| exp41c_scratch_pta | left | 3 | 16.3% | 0.0% |
| exp41c_scratch_pta | left | 4 | 27.0% | 0.0% |
| exp41c_scratch_pta | left | 5 | 21.2% | 0.0% |
| exp41c_scratch_pta | left | 6 | 26.1% | 0.0% |
| exp41c_scratch_pta | left | 7 | 27.4% | 0.0% |
| exp41c_scratch_pta | left | 8 | 30.1% | 0.0% |
| exp41c_scratch_pta | left | 9 | 38.5% | 0.0% |
| exp41c_scratch_pta | left | 10 | 61.1% | 0.0% |
| exp41c_scratch_pta | left | 11 | 74.7% | 0.0% |
| exp41c_scratch_pta | left | 12 | 67.5% | 0.0% |
| exp41c_scratch_pta | left | 13 | 84.9% | 0.0% |
| exp41c_scratch_pta | left | 14 | 89.0% | 0.0% |
| exp41c_scratch_pta | left | 15 | 83.8% | 0.0% |
| exp41c_scratch_pta | left | 16 | 91.0% | 0.0% |
| exp41c_scratch_pta | left | 17 | 96.5% | 0.0% |
| exp41c_scratch_pta | left | 18 | 90.6% | 0.0% |
| exp41c_scratch_pta | left | 19 | 90.7% | 0.0% |
| exp41c_scratch_pta | left | 20 | 91.4% | 0.0% |
| exp41c_scratch_pta | left | 21 | 89.8% | 0.0% |
| exp41c_scratch_pta | left | 22 | 99.9% | 0.0% |
| exp41c_scratch_pta | left | 23 | 98.9% | 0.0% |
| exp41c_scratch_pta | right | 0 | 31.6% | 0.0% |
| exp41c_scratch_pta | right | 1 | 38.9% | 0.0% |
| exp41c_scratch_pta | right | 2 | 5.2% | 0.0% |
| exp41c_scratch_pta | right | 3 | 16.3% | 0.0% |
| exp41c_scratch_pta | right | 4 | 27.0% | 0.0% |
| exp41c_scratch_pta | right | 5 | 21.2% | 0.0% |
| exp41c_scratch_pta | right | 6 | 26.1% | 0.0% |
| exp41c_scratch_pta | right | 7 | 27.4% | 0.0% |
| exp41c_scratch_pta | right | 8 | 30.1% | 0.0% |
| exp41c_scratch_pta | right | 9 | 38.5% | 0.0% |
| exp41c_scratch_pta | right | 10 | 61.1% | 0.0% |
| exp41c_scratch_pta | right | 11 | 74.7% | 0.0% |
| exp41c_scratch_pta | right | 12 | 67.5% | 0.0% |
| exp41c_scratch_pta | right | 13 | 84.9% | 0.0% |
| exp41c_scratch_pta | right | 14 | 89.0% | 0.0% |
| exp41c_scratch_pta | right | 15 | 83.8% | 0.0% |
| exp41c_scratch_pta | right | 16 | 91.0% | 0.0% |
| exp41c_scratch_pta | right | 17 | 96.5% | 0.0% |
| exp41c_scratch_pta | right | 18 | 90.6% | 0.0% |
| exp41c_scratch_pta | right | 19 | 90.7% | 0.0% |
| exp41c_scratch_pta | right | 20 | 91.4% | 0.0% |
| exp41c_scratch_pta | right | 21 | 89.8% | 0.0% |
| exp41c_scratch_pta | right | 22 | 99.9% | 0.0% |
| exp41c_scratch_pta | right | 23 | 98.9% | 0.0% |
| exp41c_scratch_pta | forward | 0 | 30.9% | 0.0% |
| exp41c_scratch_pta | forward | 1 | 36.6% | 0.0% |
| exp41c_scratch_pta | forward | 2 | 5.0% | 0.0% |
| exp41c_scratch_pta | forward | 3 | 16.4% | 0.0% |
| exp41c_scratch_pta | forward | 4 | 27.0% | 0.0% |
| exp41c_scratch_pta | forward | 5 | 21.0% | 0.0% |
| exp41c_scratch_pta | forward | 6 | 26.4% | 0.0% |
| exp41c_scratch_pta | forward | 7 | 28.2% | 0.0% |
| exp41c_scratch_pta | forward | 8 | 30.4% | 0.0% |
| exp41c_scratch_pta | forward | 9 | 39.6% | 0.0% |
| exp41c_scratch_pta | forward | 10 | 61.2% | 0.0% |
| exp41c_scratch_pta | forward | 11 | 74.8% | 0.0% |
| exp41c_scratch_pta | forward | 12 | 67.3% | 0.0% |
| exp41c_scratch_pta | forward | 13 | 84.6% | 0.0% |
| exp41c_scratch_pta | forward | 14 | 88.2% | 0.0% |
| exp41c_scratch_pta | forward | 15 | 83.4% | 0.0% |
| exp41c_scratch_pta | forward | 16 | 90.3% | 0.0% |
| exp41c_scratch_pta | forward | 17 | 96.4% | 0.0% |
| exp41c_scratch_pta | forward | 18 | 90.6% | 0.0% |
| exp41c_scratch_pta | forward | 19 | 90.7% | 0.0% |
| exp41c_scratch_pta | forward | 20 | 91.4% | 0.0% |
| exp41c_scratch_pta | forward | 21 | 89.6% | 0.0% |
| exp41c_scratch_pta | forward | 22 | 99.9% | 0.0% |
| exp41c_scratch_pta | forward | 23 | 98.6% | 0.0% |
| exp42_counterfactual_pta | left | 0 | 35.0% | 0.0% |
| exp42_counterfactual_pta | left | 1 | 53.2% | 0.0% |
| exp42_counterfactual_pta | left | 2 | 9.0% | 0.0% |
| exp42_counterfactual_pta | left | 3 | 10.6% | 0.0% |
| exp42_counterfactual_pta | left | 4 | 22.8% | 0.0% |
| exp42_counterfactual_pta | left | 5 | 22.6% | 0.0% |
| exp42_counterfactual_pta | left | 6 | 30.5% | 0.0% |
| exp42_counterfactual_pta | left | 7 | 34.3% | 0.0% |
| exp42_counterfactual_pta | left | 8 | 32.0% | 0.0% |
| exp42_counterfactual_pta | left | 9 | 46.2% | 0.0% |
| exp42_counterfactual_pta | left | 10 | 63.0% | 0.0% |
| exp42_counterfactual_pta | left | 11 | 82.7% | 0.0% |
| exp42_counterfactual_pta | left | 12 | 83.6% | 0.0% |
| exp42_counterfactual_pta | left | 13 | 90.5% | 0.0% |
| exp42_counterfactual_pta | left | 14 | 93.0% | 0.0% |
| exp42_counterfactual_pta | left | 15 | 87.0% | 0.0% |
| exp42_counterfactual_pta | left | 16 | 91.9% | 0.0% |
| exp42_counterfactual_pta | left | 17 | 92.6% | 0.0% |
| exp42_counterfactual_pta | left | 18 | 81.0% | 0.0% |
| exp42_counterfactual_pta | left | 19 | 87.6% | 0.0% |
| exp42_counterfactual_pta | left | 20 | 85.9% | 0.0% |
| exp42_counterfactual_pta | left | 21 | 91.5% | 0.0% |
| exp42_counterfactual_pta | left | 22 | 96.7% | 0.0% |
| exp42_counterfactual_pta | left | 23 | 99.0% | 0.0% |
| exp42_counterfactual_pta | right | 0 | 35.0% | 0.0% |
| exp42_counterfactual_pta | right | 1 | 53.2% | 0.0% |
| exp42_counterfactual_pta | right | 2 | 9.0% | 0.0% |
| exp42_counterfactual_pta | right | 3 | 10.6% | 0.0% |
| exp42_counterfactual_pta | right | 4 | 22.8% | 0.0% |
| exp42_counterfactual_pta | right | 5 | 22.6% | 0.0% |
| exp42_counterfactual_pta | right | 6 | 30.5% | 0.0% |
| exp42_counterfactual_pta | right | 7 | 34.3% | 0.0% |
| exp42_counterfactual_pta | right | 8 | 32.0% | 0.0% |
| exp42_counterfactual_pta | right | 9 | 46.2% | 0.0% |
| exp42_counterfactual_pta | right | 10 | 63.0% | 0.0% |
| exp42_counterfactual_pta | right | 11 | 82.7% | 0.0% |
| exp42_counterfactual_pta | right | 12 | 83.6% | 0.0% |
| exp42_counterfactual_pta | right | 13 | 90.5% | 0.0% |
| exp42_counterfactual_pta | right | 14 | 93.0% | 0.0% |
| exp42_counterfactual_pta | right | 15 | 87.0% | 0.0% |
| exp42_counterfactual_pta | right | 16 | 91.9% | 0.0% |
| exp42_counterfactual_pta | right | 17 | 92.6% | 0.0% |
| exp42_counterfactual_pta | right | 18 | 81.0% | 0.0% |
| exp42_counterfactual_pta | right | 19 | 87.6% | 0.0% |
| exp42_counterfactual_pta | right | 20 | 85.9% | 0.0% |
| exp42_counterfactual_pta | right | 21 | 91.5% | 0.0% |
| exp42_counterfactual_pta | right | 22 | 96.7% | 0.0% |
| exp42_counterfactual_pta | right | 23 | 99.0% | 0.0% |
| exp42_counterfactual_pta | forward | 0 | 33.8% | 0.0% |
| exp42_counterfactual_pta | forward | 1 | 44.7% | 0.0% |
| exp42_counterfactual_pta | forward | 2 | 7.9% | 0.0% |
| exp42_counterfactual_pta | forward | 3 | 10.4% | 0.0% |
| exp42_counterfactual_pta | forward | 4 | 22.5% | 0.0% |
| exp42_counterfactual_pta | forward | 5 | 22.7% | 0.0% |
| exp42_counterfactual_pta | forward | 6 | 31.1% | 0.0% |
| exp42_counterfactual_pta | forward | 7 | 34.8% | 0.0% |
| exp42_counterfactual_pta | forward | 8 | 32.6% | 0.0% |
| exp42_counterfactual_pta | forward | 9 | 46.3% | 0.0% |
| exp42_counterfactual_pta | forward | 10 | 63.8% | 0.0% |
| exp42_counterfactual_pta | forward | 11 | 83.3% | 0.0% |
| exp42_counterfactual_pta | forward | 12 | 84.1% | 0.0% |
| exp42_counterfactual_pta | forward | 13 | 90.8% | 0.0% |
| exp42_counterfactual_pta | forward | 14 | 93.0% | 0.0% |
| exp42_counterfactual_pta | forward | 15 | 86.7% | 0.0% |
| exp42_counterfactual_pta | forward | 16 | 91.6% | 0.0% |
| exp42_counterfactual_pta | forward | 17 | 91.9% | 0.0% |
| exp42_counterfactual_pta | forward | 18 | 80.1% | 0.0% |
| exp42_counterfactual_pta | forward | 19 | 87.9% | 0.0% |
| exp42_counterfactual_pta | forward | 20 | 86.2% | 0.0% |
| exp42_counterfactual_pta | forward | 21 | 91.5% | 0.0% |
| exp42_counterfactual_pta | forward | 22 | 96.7% | 0.0% |
| exp42_counterfactual_pta | forward | 23 | 99.0% | 0.0% |
| exp43_cross_attn_text | left | 0 | 38.8% | 0.0% |
| exp43_cross_attn_text | left | 1 | 11.0% | 0.0% |
| exp43_cross_attn_text | left | 2 | 6.5% | 0.0% |
| exp43_cross_attn_text | left | 3 | 12.5% | 0.0% |
| exp43_cross_attn_text | left | 4 | 18.7% | 0.0% |
| exp43_cross_attn_text | left | 5 | 21.5% | 0.0% |
| exp43_cross_attn_text | left | 6 | 38.0% | 0.0% |
| exp43_cross_attn_text | left | 7 | 40.4% | 0.0% |
| exp43_cross_attn_text | left | 8 | 44.7% | 0.0% |
| exp43_cross_attn_text | left | 9 | 63.0% | 0.0% |
| exp43_cross_attn_text | left | 10 | 85.4% | 0.0% |
| exp43_cross_attn_text | left | 11 | 90.5% | 0.0% |
| exp43_cross_attn_text | left | 12 | 84.8% | 0.0% |
| exp43_cross_attn_text | left | 13 | 93.2% | 0.0% |
| exp43_cross_attn_text | left | 14 | 94.0% | 0.0% |
| exp43_cross_attn_text | left | 15 | 89.4% | 0.0% |
| exp43_cross_attn_text | left | 16 | 95.2% | 0.0% |
| exp43_cross_attn_text | left | 17 | 97.4% | 0.0% |
| exp43_cross_attn_text | left | 18 | 92.1% | 0.0% |
| exp43_cross_attn_text | left | 19 | 95.0% | 0.0% |
| exp43_cross_attn_text | left | 20 | 98.0% | 0.0% |
| exp43_cross_attn_text | left | 21 | 99.3% | 0.0% |
| exp43_cross_attn_text | left | 22 | 99.9% | 0.0% |
| exp43_cross_attn_text | left | 23 | 99.7% | 0.0% |
| exp43_cross_attn_text | right | 0 | 38.8% | 0.0% |
| exp43_cross_attn_text | right | 1 | 11.0% | 0.0% |
| exp43_cross_attn_text | right | 2 | 6.5% | 0.0% |
| exp43_cross_attn_text | right | 3 | 12.5% | 0.0% |
| exp43_cross_attn_text | right | 4 | 18.7% | 0.0% |
| exp43_cross_attn_text | right | 5 | 21.5% | 0.0% |
| exp43_cross_attn_text | right | 6 | 38.0% | 0.0% |
| exp43_cross_attn_text | right | 7 | 40.4% | 0.0% |
| exp43_cross_attn_text | right | 8 | 44.7% | 0.0% |
| exp43_cross_attn_text | right | 9 | 63.0% | 0.0% |
| exp43_cross_attn_text | right | 10 | 85.4% | 0.0% |
| exp43_cross_attn_text | right | 11 | 90.5% | 0.0% |
| exp43_cross_attn_text | right | 12 | 84.8% | 0.0% |
| exp43_cross_attn_text | right | 13 | 93.2% | 0.0% |
| exp43_cross_attn_text | right | 14 | 94.0% | 0.0% |
| exp43_cross_attn_text | right | 15 | 89.4% | 0.0% |
| exp43_cross_attn_text | right | 16 | 95.2% | 0.0% |
| exp43_cross_attn_text | right | 17 | 97.4% | 0.0% |
| exp43_cross_attn_text | right | 18 | 92.1% | 0.0% |
| exp43_cross_attn_text | right | 19 | 95.0% | 0.0% |
| exp43_cross_attn_text | right | 20 | 98.0% | 0.0% |
| exp43_cross_attn_text | right | 21 | 99.3% | 0.0% |
| exp43_cross_attn_text | right | 22 | 99.9% | 0.0% |
| exp43_cross_attn_text | right | 23 | 99.7% | 0.0% |
| exp43_cross_attn_text | forward | 0 | 38.6% | 0.0% |
| exp43_cross_attn_text | forward | 1 | 10.4% | 0.0% |
| exp43_cross_attn_text | forward | 2 | 6.4% | 0.0% |
| exp43_cross_attn_text | forward | 3 | 12.4% | 0.0% |
| exp43_cross_attn_text | forward | 4 | 18.2% | 0.0% |
| exp43_cross_attn_text | forward | 5 | 21.3% | 0.0% |
| exp43_cross_attn_text | forward | 6 | 37.6% | 0.0% |
| exp43_cross_attn_text | forward | 7 | 40.1% | 0.0% |
| exp43_cross_attn_text | forward | 8 | 44.6% | 0.0% |
| exp43_cross_attn_text | forward | 9 | 62.9% | 0.0% |
| exp43_cross_attn_text | forward | 10 | 85.1% | 0.0% |
| exp43_cross_attn_text | forward | 11 | 91.0% | 0.0% |
| exp43_cross_attn_text | forward | 12 | 84.9% | 0.0% |
| exp43_cross_attn_text | forward | 13 | 93.4% | 0.0% |
| exp43_cross_attn_text | forward | 14 | 93.6% | 0.0% |
| exp43_cross_attn_text | forward | 15 | 89.4% | 0.0% |
| exp43_cross_attn_text | forward | 16 | 95.2% | 0.0% |
| exp43_cross_attn_text | forward | 17 | 97.4% | 0.0% |
| exp43_cross_attn_text | forward | 18 | 92.8% | 0.0% |
| exp43_cross_attn_text | forward | 19 | 95.4% | 0.0% |
| exp43_cross_attn_text | forward | 20 | 98.3% | 0.0% |
| exp43_cross_attn_text | forward | 21 | 99.4% | 0.0% |
| exp43_cross_attn_text | forward | 22 | 99.9% | 0.0% |
| exp43_cross_attn_text | forward | 23 | 99.7% | 0.0% |
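
As a consistency check between the two tables, the Mean Img% summary value can be recomputed from the per-layer rows. The sketch below does this for exp41b_resume_exp40_pta with the left instruction, using the 24 Image% values copied from the per-layer table. It works from the rounded, displayed percentages, so for other rows the last digit may differ slightly from a mean taken over unrounded values. The variable names are illustrative only.

```python
# Per-layer Image% for exp41b_resume_exp40_pta / left, layers 0..23,
# copied from the per-layer table (rounded to one decimal there).
exp41b_left_image_pct = [
    46.4, 22.3, 12.1, 32.6, 63.7, 71.7, 86.3, 89.5,
    95.0, 98.6, 97.3, 95.9, 98.8, 99.4, 99.2, 99.0,
    99.7, 99.6, 99.6, 99.6, 99.9, 99.9, 99.7, 100.0,
]

last_pct = exp41b_left_image_pct[-1]
mean_pct = sum(exp41b_left_image_pct) / len(exp41b_left_image_pct)

print(f"Last Img% = {last_pct:.1f}%, Mean Img% = {mean_pct:.1f}%")
# prints: Last Img% = 100.0%, Mean Img% = 83.6%
```

Both figures match the summary row for this model and instruction (100.0% and 83.6%).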