PyData SF 2016 Dani Ushizima, Flavio Araujo, Romuere Silva | Searchable datasets in Python: images across domains, experiments, algorithms and learning
pyCBIR is a new Python tool for content-based image retrieval (CBIR), capable of searching large databases for relevant items given unseen samples. While much work in CBIR has targeted ads and recommendation systems, pyCBIR allows general-purpose investigation across image domains. pyCBIR also contains ten distance metrics and six bags of features, including a Convolutional Neural Network.
Image capture has become a ubiquitous activity in our daily lives, but mechanisms to organize and retrieve images based on their content are available only to a few people or for very specific problems. With the significant improvement in image processing speeds and the availability of large storage systems, methods to query and retrieve images have become fundamental both to simple human activities like cataloguing and to complex research such as synthesizing materials. Content-Based Image Retrieval (CBIR) systems use computer vision techniques to describe images in terms of their properties, so that similar samples can be found using an image itself as the query instead of keywords. For this reason, the system works independently of annotations, which can be time-consuming to produce and, in some scenarios such as high-throughput imaging instruments, impossible. While much work in CBIR has targeted ads and recommendation systems, pyCBIR allows general-purpose investigation across image domains and experiments. pyCBIR also contains different distance metrics and several feature extraction techniques, including a Convolutional Neural Network (CNN).
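To make the query-by-image idea concrete, here is a minimal sketch of a CBIR loop: describe each image with a feature vector, then rank database images by their distance to the query's features. This is not pyCBIR's API. The function names (`histogram_feature`, `euclidean`, `retrieve`), the toy images, and the choice of an intensity histogram with Euclidean distance are all assumptions made for illustration; a real system would swap in richer features (for example CNN activations) and other distance metrics.

```python
import math


def histogram_feature(image, bins=4, max_value=256):
    """Describe a grayscale image (rows of ints in [0, max_value)) as a
    normalized intensity histogram. Hypothetical stand-in for a real
    feature extractor."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
            total += 1
    return [c / total for c in counts]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, top_k=2):
    """Return the names of the top_k database images whose features are
    closest to the query image's features."""
    q = histogram_feature(query)
    scored = [
        (euclidean(q, histogram_feature(img)), name)
        for name, img in database.items()
    ]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:top_k]]


# Toy 2x2 "images" (illustrative data, not from any real dataset).
database = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[220, 230], [240, 250]],
    "mixed": [[10, 240], [20, 250]],
}
query = [[5, 15], [25, 60]]  # an unseen, mostly dark sample

results = retrieve(query, database)
print(results)
```

No keywords or annotations are involved: the query is an image, and the ranking depends only on the chosen feature and distance. Changing either one changes which images count as "similar", which is why a tool that offers several of each is useful across image domains.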