Exp14 Step 0: BBox-based Navigation

Estimate the basket position with pure HF Kosmos-2 grounding, then predict the action with rule-based logic. How far can navigation get on the foundation model's spatial awareness alone, with no training?
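The rule-based step can be pictured with a minimal sketch like the one below. It assumes the grounding output is a bounding box normalized to [0, 1] as (x1, y1, x2, y2). The function name `bbox_to_action`, the thresholds `FAR` and `CENTER_BAND`, and the choice of STOP when no box is available are all hypothetical; the report does not state the actual rules, and this sketch does not cover the ROT_L / ROT_R labels that appear in the ground truth.

```python
from typing import Optional, Tuple

# Hypothetical thresholds: the values used in the experiment are not given.
FAR = 0.20          # box center within this distance of an image edge -> pure turn
CENTER_BAND = 0.15  # box center within this distance of the midline -> go straight

BBox = Tuple[float, float, float, float]  # normalized (x1, y1, x2, y2)


def bbox_to_action(bbox: Optional[BBox]) -> str:
    """Map a normalized bounding box to one of the discrete action labels."""
    if bbox is None:
        # Assumption: with no usable box, the rule stops.
        return "STOP"
    x1, _, x2, _ = bbox
    cx = (x1 + x2) / 2.0   # horizontal center of the box
    offset = cx - 0.5      # signed distance from the image midline
    if abs(offset) <= CENTER_BAND:
        return "FORWARD"
    if offset < 0:
        return "LEFT" if cx < FAR else "FWD+L"
    return "RIGHT" if cx > 1.0 - FAR else "FWD+R"


if __name__ == "__main__":
    for box in [(0.4, 0.5, 0.6, 0.9), (0.0, 0.5, 0.2, 0.9), (0.6, 0.5, 0.8, 0.9), None]:
        print(box, "->", bbox_to_action(box))
```

Using only the horizontal box center keeps the rule easy to inspect, at the cost of ignoring distance cues such as box size or vertical position.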

Overall PM: 31.1% (28/90)

PM per Path Type

Path Type          Correct/Total   PM
center_straight    6/10            60.0%
center_left        4/10            40.0%
center_right       1/10            10.0%
left_straight      4/10            40.0%
left_left          3/10            30.0%
left_right         0/10            0.0%
right_straight     4/10            40.0%
right_left         4/10            40.0%
right_right        2/10            20.0%
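As a quick consistency check, the overall PM can be recomputed from the per-path counts. The sketch below also regroups the counts by the two halves of each path-type name. Reading the first half as the start position and the second as the path direction is an assumption, and the helper `group_pm` is illustrative only.

```python
from collections import defaultdict

# Correct/total counts copied from the per-path table.
COUNTS = {
    "center_straight": (6, 10),
    "center_left": (4, 10),
    "center_right": (1, 10),
    "left_straight": (4, 10),
    "left_left": (3, 10),
    "left_right": (0, 10),
    "right_straight": (4, 10),
    "right_left": (4, 10),
    "right_right": (2, 10),
}

correct = sum(c for c, _ in COUNTS.values())
total = sum(t for _, t in COUNTS.values())
overall_pm = 100.0 * correct / total


def group_pm(index: int) -> dict:
    """Sum counts by one half of the path-type name (0 = first, 1 = second)."""
    grouped = defaultdict(lambda: [0, 0])
    for name, (c, t) in COUNTS.items():
        key = name.split("_")[index]
        grouped[key][0] += c
        grouped[key][1] += t
    return {k: (c, t) for k, (c, t) in grouped.items()}


by_start = group_pm(0)      # assumed meaning: start position
by_direction = group_pm(1)  # assumed meaning: path direction

if __name__ == "__main__":
    print(f"Overall PM: {overall_pm:.1f}% ({correct}/{total})")  # Overall PM: 31.1% (28/90)
    print(by_start)      # {'center': (11, 30), 'left': (7, 30), 'right': (10, 30)}
    print(by_direction)  # {'straight': (14, 30), 'left': (11, 30), 'right': (3, 30)}
```

Grouped this way, the paths whose name ends in `right` account for only 3 of the 28 correct predictions.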

center_straight 6/10 = 60.0%

FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
STOP vs GT FORWARD
the end of the room, and the white wall is behind it.
source: fallback
STOP vs GT FORWARD
the end of a long hallway, with a white wall and a black floor in the background.
source: fallback
STOP vs GT FORWARD
the end of a hallway, and the white wall is behind it.
source: fallback
FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
FORWARD vs GT FORWARD
the center of the image. The floor is tiled and the walls are white. A gray air conditioner is placed in the middle of t
source: entity_match
FORWARD vs GT FORWARD
the bottom of the image. The white wall is behind it. A gray air conditioner is placed on the floor in front of the wall
source: entity_match
FORWARD vs GT FORWARD
the center of the image. The floor is tiled and the walls are white. A power cord is plugged into the wall.
source: caption
STOP vs GT FORWARD
the end of a hallway, with a white wall and a power outlet in the background.
source: fallback
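Each sample in these listings is a three-line record: a `PRED vs GT LABEL` header, the caption text, and a `source:` tag. A small parser like the sketch below can tally matches per source. The function name `tally_by_source` is hypothetical, and the sample input is just the first three center_straight records copied from above.

```python
import re
from collections import defaultdict

# Header line of a record, e.g. "STOP vs GT FORWARD".
HEADER = re.compile(r"^(\S+) vs GT (\S+)$")


def tally_by_source(log: str) -> dict:
    """Return {source: (correct, total)} for a log of three-line records."""
    lines = [ln.strip() for ln in log.strip().splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected records of exactly three lines")
    stats = defaultdict(lambda: [0, 0])
    for i in range(0, len(lines), 3):
        match = HEADER.match(lines[i])
        source_line = lines[i + 2]
        if match is None or not source_line.startswith("source:"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        pred, gt = match.groups()
        source = source_line.split(":", 1)[1].strip()
        stats[source][0] += int(pred == gt)  # exact label match counts as correct
        stats[source][1] += 1
    return {k: (c, t) for k, (c, t) in stats.items()}


SAMPLE = """
FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
STOP vs GT FORWARD
the end of the room, and the white wall is behind it.
source: fallback
"""

if __name__ == "__main__":
    print(tally_by_source(SAMPLE))  # {'caption': (2, 2), 'fallback': (0, 1)}
```

Running the same tally over a full section shows which position source (caption, entity_match, fallback, none) the errors concentrate in.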

center_left 4/10 = 40.0%

FORWARD vs GT LEFT
the center of the image, with the gray air conditioner sitting on the floor in front of it.
source: entity_match
FORWARD vs GT FWD+L
the center of the image. It is a small, square, metal box with a handle on the side. The floor is tiled and the walls ar
source: caption
FORWARD vs GT FORWARD
the end of the room, and the air conditioner is on the floor.
source: fallback
FWD+R vs GT FWD+R
the end of a long, white wall. The wall is white and there is a black line running along the right side of the wall.
source: caption
STOP vs GT FORWARD
the end of a hallway, and the wall is white.
source: fallback
STOP vs GT LEFT
the center of the image, with the gray wall and the white ceiling in the background.
source: entity_match
FWD+R vs GT FWD+L
the center of the image. It is sitting on a tiled floor, with a white wall behind it. There is a window above the basket
source: caption
FWD+R vs GT FORWARD
the center of the image. It is sitting on a tiled floor, with a white wall behind it. A TV is visible on the wall to the
source: caption
FWD+R vs GT FWD+R
the center of the image, with the white wall behind it. A black electrical cord is visible on the right side of the bask
source: caption
FORWARD vs GT FORWARD
the top of the trash can, and the gray lid is on top of it.
source: entity_match

center_right 1/10 = 10.0%

FORWARD vs GT RIGHT
the center of the image, with the white wall and the window in the background.
source: caption
FORWARD vs GT FWD+R
the center of the image, with the white wall and floor in the background.
source: caption
STOP vs GT FORWARD
the end of the hall, and the white wall is behind it.
source: fallback
STOP vs GT FWD+L
the end of a hallway, with a white wall and a black electrical cord running along the wall.
source: fallback
FWD+R vs GT FORWARD
the corner of a white wall. The basket is a metal box with a mesh cover.
source: entity_match
FORWARD vs GT RIGHT
the center of the image, with the white wall and the window in the background.
source: caption
FORWARD vs GT FWD+R
the center of the image, with the white wall and the black floor in the background.
source: caption
FORWARD vs GT FORWARD
the center of the image, with the white wall and floor in the background.
source: caption
FORWARD vs GT FWD+L
the bottom of the image. The basket is a large trash can.
source: entity_match
STOP vs GT FORWARD
the top of the image. The basket is a large gray box with a lid.
source: entity_match

left_straight 4/10 = 40.0%

RIGHT vs GT ROT_R
the end of the room, and the chair is in the middle of the image.
source: fallback
FORWARD vs GT FORWARD
the end of the room, and the trash can is in the middle of the floor.
source: fallback
FORWARD vs GT FORWARD
the center of the image, with the white wall and floor in the background.
source: caption
STOP vs GT FORWARD
the end of the hallway, and the white wall is behind it.
source: fallback
FWD+L vs GT FORWARD
the end of the hallway, and the trash can is on the left side of the image.
source: caption
FWD+L vs GT ROT_R
the bottom of the image, and the white trash can is at the top.
source: fallback
FORWARD vs GT FORWARD
the end of the room, and the white trash can is in the middle of the image.
source: fallback
STOP vs GT FORWARD
the end of the room, and the white wall is behind it.
source: fallback
STOP vs GT FORWARD
the end of the hallway, and the white wall is behind it.
source: fallback
FORWARD vs GT FORWARD
the top of the washer, and the silver basket is at the bottom
source: entity_match

left_left 3/10 = 30.0%

LEFT vs GT FWD+L
the far left of the image. The white wall is the background. A window is above the basket.
source: caption
LEFT vs GT FWD+L
the far left of the image. The floor is tiled and the walls are white. A window is visible above the white wall.
source: caption
FORWARD vs GT FORWARD
the end of the room, and the air conditioner is in the middle of the floor.
source: fallback
STOP vs GT FORWARD
the end of a long hallway, with a white wall and a black floor in the background.
source: fallback
FWD+L vs GT FORWARD
the left side of the image, and the white wall is at right. The basket is positioned on the left, and the wall is on the
source: entity_match
FWD+L vs GT FWD+L
the left side of the image, and the gray trash can is at its right side. The floor is tiled, and there is a window above
source: entity_match
STOP vs GT FWD+L
the bottom of the image. The floor is tiled and the walls are white. There is a window above the basket and a wooden cha
source: fallback
FORWARD vs GT FORWARD
the center of the image, with the white wall and the window in the background.
source: caption
STOP vs GT FORWARD
the end of a hallway, with a white wall and a tiled floor in the background.
source: fallback
STOP vs GT FORWARD
the entrance of the building. The basket is located on the floor.
source: entity_match

left_right 0/10 = 0.0%

FWD+L vs GT FWD+R
the far end of the room, and the gray trash can is in the middle.
source: entity_match
FORWARD vs GT FWD+R
the left side of the image. The white air conditioner is on the right side of it.
source: fallback
FWD+L vs GT FORWARD
the end of the room, and the black box is on the left side of the image.
source: caption
STOP vs GT FORWARD
the end of the hall, and the white wall is behind it.
source: fallback
STOP vs GT FORWARD
the end of the hallway, and the white trash can is in the middle of the room.
source: fallback
STOP vs GT FWD+R
the left side of the image, and the white wall is at right. The floor is tiled, and there is a metal trash can in the mi
source: fallback
FORWARD vs GT FWD+R
the end of the room, and the white trash can is in the middle of the floor.
source: fallback
FWD+L vs GT FORWARD
the end of the hallway, and the air conditioner is on the left side of the room.
source: caption
STOP vs GT FORWARD
the end of the room, and the white wall is behind it.
source: fallback
STOP vs GT FORWARD
the end of the hallway, and the white wall is behind it.
source: fallback

right_straight 4/10 = 40.0%

STOP vs GT ROT_L
the end of the room, and the white wall is behind it.
source: fallback
FORWARD vs GT FORWARD
the center of the image, with a chair and a table in the background.
source: caption
FORWARD vs GT FORWARD
the bottom of the image. The air conditioner is sitting on the floor, and there is a chair nearby.
source: fallback
STOP vs GT FORWARD
the bottom of the image, with the white wall and floor in the background.
source: fallback
STOP vs GT FORWARD
the bottom of the image. The white wall is visible in the background.
source: fallback
STOP vs GT ROT_L
the end of the room, and the white wall is behind it.
source: fallback
FORWARD vs GT FORWARD
the center of the image, with a white wall and a wooden chair in the background.
source: caption
FWD+R vs GT FORWARD
the center of the image. It is sitting on a tiled floor, with a white wall behind it. A wooden chair is placed to the ri
source: caption
FORWARD vs GT FORWARD
the center of the image, with a white wall and a wooden chair in the background.
source: caption
STOP vs GT FORWARD
the end of the room, with the white wall and the floor in the background
source: fallback

right_left 4/10 = 40.0%

FWD+R vs GT FWD+L
the bottom of the image, and the gray trash can is at the top of the frame.
source: entity_match
FORWARD vs GT FWD+L
the bottom of the image. The room is empty except for the gray basket.
source: none
FORWARD vs GT FORWARD
the bottom of the image. The air conditioner is sitting on the floor, with the wires coming out of it.
source: fallback
FORWARD vs GT FORWARD
the center of the image, with a white wall and a wooden chair in the background.
source: caption
STOP vs GT FORWARD
the top of the stairs, and the white basket is at the bottom of the steps.
source: entity_match
STOP vs GT FWD+L
the bottom of the image, and the white wall is behind it. The floor is tiled, and there is a window above the wall.
source: fallback
FWD+R vs GT FWD+L
the end of the room, and the chair is on the right side.
source: caption
FORWARD vs GT FORWARD
the bottom of the image. The air conditioner is sitting on the floor next to it.
source: fallback
FORWARD vs GT FORWARD
the center of the image. It is a large, square, plastic container. The container is sitting on a tiled floor, with a whi
source: entity_match
STOP vs GT FORWARD
the top of the stairs, and the white basket is at the bottom of the steps.
source: entity_match

right_right 2/10 = 20.0%

RIGHT vs GT RIGHT
the far right of the image. The white wall is the background.
source: caption
FORWARD vs GT FWD+R
the bottom of the image. The room is empty except for a chair and a table.
source: none
STOP vs GT FORWARD
the bottom of the image, and the white wall is behind it. A chair is placed to the right of the basket, and a table is p
source: fallback
FORWARD vs GT FORWARD
the bottom of the image. The air conditioner is located on the floor, and there is a chair nearby.
source: fallback
FWD+L vs GT FORWARD
the left side of the room. The white wall is the background.
source: caption
STOP vs GT FWD+R
the bottom of the image, and the white wall is behind it. The floor is tiled, and there is a trash can in the middle of
source: fallback
STOP vs GT FWD+R
the bottom of the image, with the white wall and the wooden chair in the background.
source: fallback
STOP vs GT FORWARD
the bottom of the image, and the white wall is behind it.
source: fallback
STOP vs GT FORWARD
the end of the hallway, with the white wall and floor in the background.
source: fallback
STOP vs GT FORWARD
the end of the room, and the white wall is behind it.
source: fallback