Piglet: Pixel-level grounding of language expressions with transformers

Abstract