Abstract: We present ForceSight, a system for text-guided mobile manipulation that predicts visual-force goals using a text-conditioned vision transformer. Given a single RGBD image and a text prompt, ...