The dataset has a good distribution of locations, and most of the photos
were captured by DSLR cameras. Tags range from ‘instagram’ to
‘wedding’, which suggests a range of photos from selfies to high-quality
portraits. (A large number of the photos carry the tag ‘2013’,
since the dataset consists of recently uploaded photos.)
More than 50% (514K) of the face photos in MegaFace have an inter-ocular distance larger than 40 pixels, which is the inter-ocular distance of faces in LFW. An inter-ocular distance of 40 pixels is equivalent to a 100-pixel face, as shown in the plot.
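As a rough check of these figures, the short sketch below recomputes the share of large faces and the face size implied by a given eye-to-eye distance. It assumes the total of 1,027,060 detections reported for the dataset, treats "514K" as exactly 514,000 (the exact count is not given), and assumes the 40-to-100 pixel relation scales linearly; the names `share_large` and `approx_face_size` are illustrative only.

```python
# Figures taken from the text; LARGE_FACES is a rounded value ("514K").
TOTAL_FACES = 1_027_060
LARGE_FACES = 514_000

# Share of faces whose eye-to-eye distance exceeds 40 pixels.
share_large = LARGE_FACES / TOTAL_FACES
print(f"share of large faces: {share_large:.2%}")

# 40 px between the eyes corresponds to a 100 px face,
# i.e. a face-to-eye-distance ratio of 2.5 (assumed to scale linearly).
FACE_TO_EYE_RATIO = 100 / 40


def approx_face_size(eye_distance_px: float) -> float:
    """Approximate face size in pixels for a given eye-to-eye distance."""
    return eye_distance_px * FACE_TO_EYE_RATIO


print(approx_face_size(40))
```

With the rounded count, the share comes out just above one half, consistent with the "more than 50%" statement.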
Some of the detections in our dataset were incorrectly identified as
faces.
These non-faces make up about 6.67% of our dataset, or
68,470 of the 1,027,060 detected faces.
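The reported percentage can be verified with a few lines of arithmetic. The sketch below uses only the two counts given above; it assumes the total of 1,027,060 includes the non-faces, so the remainder is an estimate of correctly detected faces rather than a figure stated in the text.

```python
# Counts taken from the text.
NON_FACES = 68_470
TOTAL_DETECTIONS = 1_027_060

# Fraction of detections that are not actually faces.
non_face_share = NON_FACES / TOTAL_DETECTIONS
print(f"non-face share: {non_face_share:.2%}")

# Assumption: the total includes the non-faces, so the remainder
# approximates the number of correctly detected faces.
true_faces = TOTAL_DETECTIONS - NON_FACES
print(f"remaining faces: {true_faces:,}")
```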