Phrasal Recognition
Dec. 2013 (vol. 35 no. 12)
pp. 2854-2865
Ali Farhadi, Dept. of Comput. Sci. & Eng., Univ. of Washington, Seattle, WA, USA
Mohammad Amin Sadeghi, Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
In this paper, we introduce visual phrases, complex visual composites like "a person riding a horse." Visual phrases often display significantly reduced visual complexity compared to their component objects because the appearance of those objects can change profoundly when they participate in relations. We introduce a dataset suitable for phrasal recognition that uses familiar PASCAL object categories, and demonstrate significant experimental gains resulting from exploiting visual phrases. We show that a visual phrase detector significantly outperforms a baseline which detects component objects and reasons about relations, even though visual phrase training sets tend to be smaller than those for objects. We argue that any multiclass detection system must decode detector outputs to produce final results; this is usually done with nonmaximum suppression. We describe a novel decoding procedure that can account accurately for local context without solving difficult inference problems. We show this decoding procedure outperforms the state of the art. Finally, we show that decoding a combination of phrasal and object detectors produces real improvements in detector results.
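For readers unfamiliar with the decoding step mentioned above, the following is a minimal sketch of standard greedy nonmaximum suppression, the usual baseline the abstract refers to. It is not the novel decoding procedure proposed in the paper. The function names, the (x1, y1, x2, y2) box format, and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the detections kept by greedy NMS.

    Detections are visited in order of decreasing score; one is kept only
    if it does not overlap an already kept detection by more than
    iou_threshold (an assumed, commonly used default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

Greedy suppression of this kind looks only at pairwise overlap and score, which is why a decoding procedure that also accounts for local context, as the abstract describes, can behave differently.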
Index Terms:
object recognition, image coding, inference mechanisms, object detection, object detectors, phrasal recognition, complex visual composites, visual complexity, object appearance, PASCAL object categories, visual phrase detector, visual phrase training sets, multiclass detection system, detector output decoding, nonmaximum suppression, local context, inference problems, phrasal detectors, Data visualization, Detectors, Decoding, Object recognition, Image processing, Complexity theory, object subcategories, Visual phrase, visual composites, object interactions, scene understanding, single image activity recognition
Ali Farhadi, Mohammad Amin Sadeghi, "Phrasal Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 12, pp. 2854-2865, Dec. 2013, doi:10.1109/TPAMI.2013.168