Purpose – To provide a better-informed view of the extent of the semantic gap in image retrieval, and the limited potential for bridging it offered by current semantic image retrieval techniques. Design/methodology/approach – Within an ongoing project, a broad spectrum of operational image retrieval activity has been surveyed, and, from a number of collaborating institutions, a test collection assembled which comprises user requests, the images selected in response to those requests, and their associated metadata. This has provided the evidence base upon which to make informed observations on the efficacy of cutting-edge automatic annotation techniques which seek to integrate the text-based and content-based image retrieval paradigms. Findings – Evidence from the real-world practice of image retrieval highlights the existence of a generic-specific continuum of object identification, and the incidence of temporal, spatial, significance and abstract concept facets, manifest in textual indexing and real-query scenarios but often having no directly visible presence in an image. These factors combine to limit the functionality of current semantic image retrieval techniques, which interpret only visible features at the generic extremity of the generic-specific continuum. Research limitations/implications – The project is concerned with the traditional image retrieval environment in which retrieval transactions are conducted on still images which form part of managed collections. The possibilities offered by ontological support for adding functionality to automatic annotation techniques are considered. Originality/value – The paper offers fresh insights into the challenge of migrating content-based image retrieval from the laboratory to the operational environment, informed by newly-assembled, comprehensive, live data.
- Image retrieval, Semantics