2022 Selected Class-A Top-Conference Papers from Taobao Technology (大淘宝技术). The Techie's Treasure Black Book, 2022 Edition. Produced by Taobao Technology.

Progressive Language-customized Visual Feature Learning for One-stage Visual Grounding

TIP 2022

Yue Liao, Aixi Zhang, Zhiyuan Chen, Tianrui Hui, Si Liu

Abstract: Visual grounding is a task to localize an object described by a sentence in an image. Conventional visual grounding methods extract visual and linguistic features in isolation and then perform cross-modal interaction in a post-fusion manner. We argue that this post-fusion mechanism does not fully utilize the i