Yesterday a Venture Beat article tested a few public image recognition engines on images from the popular, large ImageNet dataset. To keep the tests consistent, they used images from the ImageNet categories only. This is because many research and evaluation image recognition engines out there are trained on the ImageNet dataset only.
Unfortunately, ImageNet does not contain a lot of the categories of objects we use and see in our everyday life, such as “person”, “man”, “woman”, “hand”, etc.
Teradeep used a large private dataset of more than 10 million images to train its image recognition engines.
We ran some of our own tests. The first is on this image:
Here is Teradeep's image recognition result:
It directly recognizes the main subject categories in this picture.
Here is how Metamind did:
Metamind was only trained on ImageNet, so it does not know what to do with this image. Avril's hair looks like a broom, so “broom” is the closest object it knows that resembles the image.
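To see why an engine limited to a fixed label set ends up with an answer like this, here is a minimal sketch of closed-set classification. The labels and scores are made up for illustration and do not come from Metamind or any real model. The only point is that when “person” is not among the known labels, the classifier still has to pick whichever known label scores highest.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 over the known labels."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - m) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def top1(scores):
    """Return the highest-probability label and its probability."""
    probs = softmax(scores)
    label = max(probs, key=probs.get)
    return label, probs[label]

# Hypothetical raw scores for a portrait photo. "person" is not a known
# label, so probability mass is split among unrelated objects only.
closed_set_scores = {"broom": 2.1, "wig": 1.7, "mop": 1.2}

label, prob = top1(closed_set_scores)
print(label, round(prob, 2))
```

The probabilities always sum to 1 over the labels the model knows, so a confident-looking answer does not mean the right category was ever available.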
Here is how Clarifai did:
This is better than Metamind, since Clarifai uses its own dataset. But it tries to classify the image into categories such as “fashion” and “glamour”, rather than telling us what is in this picture first!
ImageNet categories are clearly not suited for use in practical applications. We could spend the day testing on tons of images, but I think this gives a good starting point.