Automatic image annotation

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Msbmsb (talk | contribs) at 23:09, 23 May 2005 (Initial release.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Automatic image annotation is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest in a database.

This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors, together with the training annotation words, is used by machine learning techniques to attempt to apply annotations to new images automatically. The first methods learned the correlations between image features and training annotations; later techniques applied machine translation to try to translate between the textual vocabulary and the 'visual vocabulary' of clustered image regions known as blobs. Work following these efforts has included classification approaches, relevance models and so on.
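As a purely illustrative sketch of the feature-vector approach described above, the snippet below annotates a new image by transferring the most frequent keywords from its nearest annotated neighbours in feature space. The feature values, vocabulary, and the nearest-neighbour transfer rule are all invented for this example and do not correspond to any particular published system:

```python
import math
from collections import Counter

# Hypothetical training set: each image is a pre-extracted feature
# vector (e.g. colour/texture statistics) plus its human-assigned
# annotation words. All numbers and words are toy data.
train_images = [
    ([0.9, 0.1, 0.2], {"sky", "cloud"}),
    ([0.8, 0.2, 0.1], {"sky", "sun"}),
    ([0.1, 0.9, 0.3], {"grass", "field"}),
    ([0.2, 0.8, 0.4], {"grass", "tree"}),
]

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def annotate(features, k=2, n_words=2):
    """Transfer the most frequent annotation words from the k
    training images nearest to the new image's feature vector."""
    neighbours = sorted(train_images,
                        key=lambda t: euclidean(features, t[0]))[:k]
    counts = Counter(w for _, words in neighbours for w in words)
    return [w for w, _ in counts.most_common(n_words)]

# The query vector is closest to the two "sky" images, so "sky"
# (seen twice among the neighbours) is always the top word.
print(annotate([0.85, 0.15, 0.15]))
```

Real systems differ mainly in the features they extract and in how they model the link between features and words, but the shape of the task, from vectors to a ranked word list, is the same.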

The advantage of automatic image annotation over content-based image retrieval (CBIR) is that queries can be specified more naturally by the user. CBIR generally (at present) requires users to search by low-level image concepts such as color and texture, or to provide example images as queries. Certain image features in example images may override the concept that the user is really focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.

Some major work

  • Word co-occurrence model
  • Annotation as machine translation
  • Hierarchical Aspect Cluster Model
  • Latent Dirichlet Allocation model
  • Texture similarity
  • Support Vector Machines
  • Maximum Entropy
  • Relevance models
  • Relevance models using continuous probability density functions
  • Coherent Language Model
  • Inference networks
  • Multiple Bernoulli distribution
  • Multiple design alternatives
  • Natural scene annotation
  • Relevant low-level global filters
  • Global image features and nonparametric density estimation
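As an illustration of the co-occurrence idea underlying the first of the models listed above, the sketch below counts how often each annotation word appears alongside each quantised image region ("blob") in a toy training set, then scores words for a new image by summing P(word | blob) over its blobs. The blob labels, counts, and scoring rule are hypothetical, chosen only to make the co-occurrence mechanism concrete:

```python
from collections import defaultdict

# Toy training data: each image is a list of quantised region
# clusters ("blobs") plus its annotation words. All labels invented.
training = [
    (["blob_blue", "blob_white"], ["sky", "cloud"]),
    (["blob_blue", "blob_yellow"], ["sky", "sun"]),
    (["blob_green", "blob_brown"], ["grass", "tree"]),
]

# Count word/blob co-occurrences across the training set.
counts = defaultdict(lambda: defaultdict(int))
blob_totals = defaultdict(int)
for blobs, words in training:
    for b in blobs:
        for w in words:
            counts[b][w] += 1
            blob_totals[b] += 1

def annotate(blobs, n_words=2):
    """Score each word by summing P(word | blob) over the image's
    blobs, and return the n_words highest-scoring words."""
    scores = defaultdict(float)
    for b in blobs:
        if blob_totals[b]:
            for w, c in counts[b].items():
                scores[w] += c / blob_totals[b]
    return sorted(scores, key=scores.get, reverse=True)[:n_words]

# "sky" co-occurs with both blob_blue and blob_white, so it wins.
print(annotate(["blob_blue", "blob_white"]))  # → ['sky', 'cloud']
```

The translation-model and relevance-model approaches in the list refine this same idea with more principled probabilistic machinery, but they likewise start from learned associations between visual clusters and vocabulary words.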

See also