Description
In a world full of user-generated media content, automated image and video processing plays a prominent role, both in social media apps (face, object, and place detection) and in local applications that automatically sort media collections, among other uses. In the last few years, a number of startups such as Clarifai and Indico, and much bigger players such as Google and Microsoft, have launched their own 'machine learning as a service' offerings. These include 'vision' products that recognise and automatically tag objects in images and video footage, provide optical character recognition (OCR), and even offer not-safe-for-work (NSFW) classifiers that can be used to filter content out of internal networks.
This talk will focus on 'food' recognition within images, and will walk through the basics of how to get started with image classification. It starts with the APIs and SDKs of some of the commercial (but free) offerings available, then tries to recreate those services locally. The first approach is a worked example of the 'bag of features' technique, using a mix of supervised and unsupervised methods in scikit-learn and OpenCV. The second is a set of examples based on 'transfer learning', a 'deep learning' technique that reuses pre-trained convolutional neural networks to build custom classifiers from smaller training datasets.
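
As a rough illustration of the 'bag of features' idea, the sketch below clusters local descriptors into a visual vocabulary (the unsupervised step) and then trains a classifier on per-image histograms of visual words (the supervised step). It is not the code from the talk. To stay self-contained it uses synthetic descriptors in place of ones extracted from real images with OpenCV, and the vocabulary size, the linear SVM, and the names `fake_descriptors` and `bof_histogram` are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def fake_descriptors(center, n=50, dim=8):
    """Hypothetical stand-in for the local descriptors of one image.

    In a real pipeline these would come from a keypoint detector and
    descriptor extractor (for example in OpenCV), one row per keypoint.
    """
    return rng.normal(loc=center, scale=0.5, size=(n, dim))

# Two made-up classes whose descriptors cluster around different values.
train_images = ([fake_descriptors(0.0) for _ in range(10)]
                + [fake_descriptors(3.0) for _ in range(10)])
train_labels = np.array([0] * 10 + [1] * 10)

# Unsupervised step: cluster all descriptors into k "visual words".
k = 5  # vocabulary size, chosen arbitrarily for this sketch
vocab = KMeans(n_clusters=k, n_init=10, random_state=0)
vocab.fit(np.vstack(train_images))

def bof_histogram(descriptors):
    """Describe one image as a normalised histogram of visual words."""
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / hist.sum()

# Supervised step: train a classifier on the fixed-length histograms.
X_train = np.array([bof_histogram(d) for d in train_images])
clf = SVC(kernel="linear").fit(X_train, train_labels)

# Classify a new synthetic "image" drawn from the second class.
test_image = fake_descriptors(3.0)
predicted = clf.predict([bof_histogram(test_image)])[0]
print("predicted class:", predicted)
```

The useful property here is that every image, however many keypoints it has, ends up as a vector of length `k`, which is what lets an ordinary scikit-learn classifier be trained on top of it.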