Cloud-IoT Application for Scene Understanding in Assisted Living: Unleashing the Potential of Image Captioning and Large Language Model (ChatGPT)

Deema Abdal Hafeth, Gokul Lal, Mohammed Al-Khafajiy, Thar Baker, Stefanos Kollias

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN › peer-review

Abstract

Vision is a vital sense that plays a pivotal role in our understanding of the world. Most of our external information is acquired through the visual system, which affects many aspects of our lives, including mobility, cognitive abilities, access to information, and how we interact with our surroundings and with other people. As a result, individuals who need assisted living because of visual impairments are left behind and must rely on human-driven image captioning services to make sense of their surroundings. In response to this challenge, we have developed a proof-of-concept system that integrates a large language model such as ChatGPT with image captioning techniques to assist individuals with visual impairments in their daily lives. The proposed model uses image captioning to describe the user’s environment and combines concepts from Deep Learning and the Internet of Things to provide more informative, enriched captions. In this process, ChatGPT is prompted to generate increasingly detailed and informative descriptions of images, giving users a deeper understanding of their surroundings. Our findings show that the proposed system generates captions that are contextually relevant to the visual content. These captions can assist individuals in various day-to-day activities, contributing to an improved quality of life.
Original language: English
Title of host publication: DeSE 2023 - Proceedings
Subtitle of host publication: 16th International Conference on Developments in eSystems Engineering
Editors: Dhiya Al-Jumeily Obe, Sulaf Assi, Manoj Jayabalan, Jade Hind, Abir Hussain, Hissam Tawfik, Neil Rowe, Jamila Mustafina
Publisher: IEEE
Pages: 150-155
Number of pages: 6
ISBN (Electronic): 9798350381344
ISBN (Print): 9798350381351
Publication status: Published - 21 Mar 2024

Publication series

Name: 16th International Conference on Developments in eSystems Engineering (DeSE)
Publisher: IEEE

Keywords

  • Assisted Living
  • ChatGPT
  • Image Captioning
  • Internet of Things
  • NLP
