Joseph Peter Robinson, Electrical and Computer Engineering, Northeastern University, 1848 Boston, Massachusetts United States 02115-5005 (e-mail: email@example.com)
Ming Shao, CS, Northeastern University, Chestnut Hill, Massachusetts United States 02467 (e-mail: firstname.lastname@example.org)
Yue Wu, Electrical and Computer Engineering, Northeastern University, 1848 Boston, Massachusetts United States (e-mail: email@example.com)
Hongfu Liu, ECE, Northeastern University, Somerville, Massachusetts United States 02145 (e-mail: firstname.lastname@example.org)
Timothy Gillis, ECE, Northeastern University, Boston, Massachusetts United States (e-mail: email@example.com)
Yun Fu, ECE, Northeastern University, Boston, Massachusetts United States (e-mail: firstname.lastname@example.org)
We present the largest database for visual kinship recognition, Families In the Wild (FIW), with over 13,000 family photos of 1,000 family trees with 4 to 38 members each. A small team was able to build FIW by designing efficient labeling tools and an efficient workflow. To extend FIW, we further improved this process with a novel semi-automatic labeling scheme: annotated faces and unlabeled text metadata were used to discover labels, which were then used, along with existing FIW data, by the proposed clustering algorithm to generate label proposals for all newly added data. Both processes are described and compared in depth, showing large savings in time and human input. The proposed clustering algorithm is semi-supervised: it uses labeled data to produce more accurate clusters. We statistically compare FIW to related datasets and show large gains in overall size and in the amount of information captured by the labels. We benchmark two tasks, kinship verification and family classification, at scales much larger than in previous work. Pre-trained CNN models fine-tuned on FIW outscore conventional methods and achieve state-of-the-art results on the renowned KinWild datasets. We also measure human performance on kinship recognition and compare it to a fine-tuned CNN.
Large-Scale Image Dataset, Kinship Verification, Family Classification, Semi-Supervised Clustering, Deep Learning
J. P. Robinson, M. Shao, Y. Wu, H. Liu, T. Gillis and Y. Fu, "Visual Kinship Recognition of Families in the Wild," in IEEE Transactions on Pattern Analysis & Machine Intelligence.