On-site manual handling of objects is performed by combining various actions. In packaging work, for example, precise combinations of actions such as "put down and push" can be executed instantly without bumping into surrounding objects or obstacles. With conventional robot control technology, however, actions such as "pushing" and "pulling" are difficult to execute with high precision compared with "picking up" and "putting down." This is because slight differences in the action or in the object's shape greatly change how the object moves in response. Furthermore, as the number and variety of candidate actions grows, the possible combinations and orderings of actions become more complex, making real-time planning difficult.
This technology uses a "world model" that accurately predicts, from video camera data, how objects of various shapes will move in response to a robot's actions, enabling the robot to execute precise movements such as "pushing" and "pulling." In addition, by generating appropriate action sequences at real-time speed according to the working environment, it allows the robot to instantly and autonomously execute combinations of multiple actions, such as "put down and push" or "pull and pick up."
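To make the planning idea concrete, the sketch below shows one common pattern for planning with a learned world model: sample candidate action sequences, roll each one forward through the model's predictions, and keep the sequence whose predicted outcome is closest to the goal. Everything here is illustrative, not the announced system: the action set, the toy deterministic dynamics standing in for the learned video-based model, and the planner parameters are all assumptions.

```python
import random

# Hypothetical discrete action set with the (dx, dy) displacement each
# action is assumed to cause. In the described technology, the effect of
# an action is predicted by a world model learned from camera data; this
# toy table merely stands in for that model.
ACTIONS = {
    "push":  (1.0, 0.0),   # assumed to move the object +1 along x
    "pull":  (-1.0, 0.0),  # assumed to move the object -1 along x
    "slide": (0.0, 1.0),   # assumed to move the object +1 along y
}

def world_model(state, action):
    """Predict the object's next (x, y) position after an action.

    Placeholder for a learned predictive model: deterministic toy
    dynamics so the planning loop is easy to follow.
    """
    dx, dy = ACTIONS[action]
    return (state[0] + dx, state[1] + dy)

def plan(start, goal, horizon=3, samples=200, seed=0):
    """Sampling-based planner: roll random action sequences forward
    through the world model and keep the one ending closest to the goal."""
    rng = random.Random(seed)
    names = list(ACTIONS)
    best_seq, best_cost = None, float("inf")
    for _ in range(samples):
        seq = [rng.choice(names) for _ in range(horizon)]
        state = start
        for action in seq:
            state = world_model(state, action)  # imagined rollout
        # Manhattan distance between predicted end state and goal
        cost = abs(state[0] - goal[0]) + abs(state[1] - goal[1])
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

seq, cost = plan(start=(0.0, 0.0), goal=(2.0, 1.0))
print(seq, cost)
```

Because the rollouts happen entirely inside the model's predictions, candidate sequences like "push then slide" can be evaluated and discarded without moving the real robot, which is what makes real-time sequencing of multiple actions feasible.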