Exploration of a ViT-based multimodal approach to Vehicle Accident Detection

Ríos Pérez, Jesús David

dc.contributor.advisor	Sánchez Torres, Germán
dc.contributor.advisor	Henriquez Miranda, Carlos Nelson
dc.contributor.author	Ríos Pérez, Jesús David
dc.contributor.sponsor	Grupo de investigación y Desarrollo en Sistemas y Computación (GIDSYC)	spa
dc.creator.degree	Ingeniero (a) de Sistemas	spa
dc.date.accessioned	2024-07-11T13:43:30Z
dc.date.available	2024-07-11T13:43:30Z
dc.date.issued	2024
dc.date.submitted	2024
dc.description.abstract	Multimodal Deep Learning (MMDL) has emerged as a potent framework for synthesizing information from diverse data sources, enhancing the capability of models to understand and predict complex phenomena. In particular, Vision Transformers (ViT) have shown promising results in processing visual data alongside other modalities for comprehensive analysis. This study investigates the integration of MMDL and ViT in the context of traffic accident detection, addressing the critical need for advanced predictive models in this domain. Through a literature review, we assess the current landscape of MMDL applications and highlight the evolution and challenges of multimodal learning. Building on these insights, we propose a novel MMDL architecture designed to leverage video, audio, and metadata for accurate and timely accident detection. Our methodology combines a structured review of recent MMDL research with a theoretical approach to architecture design, emphasizing the fusion of multimodal data through ViT. The review adheres to established guidelines for systematic reviews, focusing on advancements from 2019 to 2023, while the architecture design is grounded in a thorough analysis of modalities relevant to traffic incidents. The main contributions include a taxonomy of MMDL methods and a ViT-based architecture for enhancing traffic safety systems. Integrating multimodal data through advanced deep learning models can improve the prediction accuracy of traffic accident detection. This research underscores the potential of MMDL and ViT in developing robust, real-time monitoring systems, marking a step forward in the application of artificial intelligence for public safety and smart city initiatives.	spa
dc.description.provenance	Submitted by Jesús Rios Perez (jesusriosdp@unimagdalena.edu.co) on 2024-05-14T16:46:12Z workflow start=Step: reviewstep - action:claimaction No. of bitstreams: 1 Exploration of a ViT-based multimodal approach.pdf: 1872021 bytes, checksum: 6742e99f1bfa1457ec7e607813b4ddee (MD5)	en
dc.description.provenance	Step: reviewstep - action:reviewaction Rejected by Programa de Ingeniería de Sistemas Programa de Ingeniería de Sistemas(ingsistemas@unimagdalena.edu.co), reason: Please add the publication license in PDF format. on 2024-07-02T19:32:46Z (GMT)	en
dc.description.provenance	Submitted by Jesús Rios Perez (jesusriosdp@unimagdalena.edu.co) on 2024-07-02T20:59:41Z workflow start=Step: reviewstep - action:claimaction No. of bitstreams: 1 Exploration of a ViT-based multimodal approach.pdf: 1872021 bytes, checksum: 6742e99f1bfa1457ec7e607813b4ddee (MD5)	en
dc.description.provenance	Step: reviewstep - action:reviewaction Rejected by Programa de Ingeniería de Sistemas Programa de Ingeniería de Sistemas(ingsistemas@unimagdalena.edu.co), reason: Please add the publication license BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado, available at the following link: https://unimagdalena.edu.co/Content/DocumentosSubItems/BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado.docx in PDF format. on 2024-07-02T21:09:22Z (GMT)	en
dc.description.provenance	Submitted by Jesús Rios Perez (jesusriosdp@unimagdalena.edu.co) on 2024-07-03T01:16:37Z workflow start=Step: reviewstep - action:claimaction No. of bitstreams: 2 Exploration of a ViT-based multimodal approach.pdf: 1872021 bytes, checksum: 6742e99f1bfa1457ec7e607813b4ddee (MD5) BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado jesus.pdf: 549657 bytes, checksum: 55a0e8f56af35c7d44385ed7d87efd81 (MD5)	en
dc.description.provenance	Step: reviewstep - action:reviewaction Approved for entry into archive by Programa de Ingeniería de Sistemas Programa de Ingeniería de Sistemas(ingsistemas@unimagdalena.edu.co) on 2024-07-03T14:29:32Z (GMT)	en
dc.description.provenance	Step: editstep - action:editaction Approved for entry into archive by Cristhian Camilo Suarez Ibañez(csuarezi@unimagdalena.edu.co) on 2024-07-11T13:43:30Z (GMT)	en
dc.description.provenance	Made available in DSpace on 2024-07-11T13:43:30Z (GMT). No. of bitstreams: 2 Exploration of a ViT-based multimodal approach.pdf: 1872021 bytes, checksum: 6742e99f1bfa1457ec7e607813b4ddee (MD5) BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado jesus.pdf: 549657 bytes, checksum: 55a0e8f56af35c7d44385ed7d87efd81 (MD5) Previous issue date: 2024	en
dc.format	text
dc.identifier.uri	https://repositorio.unimagdalena.edu.co/handle/123456789/21215
dc.language.iso	en	spa
dc.publisher	Universidad del Magdalena
dc.publisher	Universidad del Magdalena	spa
dc.publisher.department	Facultad de Ingeniería	spa
dc.publisher.place	Santa Marta	spa
dc.publisher.program	Ingeniería de Sistemas	spa
dc.rights	Acceso Abierto
dc.rights	info:eu-repo/semantics/openAccess
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.rights.cc	Acceso Abierto	spa
dc.rights.creativecommons	https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.creativecommons	atribucionnocomercialcompartir	spa
dc.subject.proposal	Multimodal, Machine Learning, Data Fusion, Deep Learning.	spa
dc.subject.proposal	Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo.	spa
dc.title	Exploration of a ViT-based multimodal approach to Vehicle Accident Detection	spa
dc.title.alternative	Exploración de un enfoque multimodal basado en ViT para la Detección de Accidentes Vehiculares
dc.type	bachelorThesis	spa
dc.type.coar	https://vocabularies.coar-repositories.org/resource_types/c_7a1f/
dc.type.driver	info:eu-repo/semantics/bachelorThesis
dc.type.local	Trabajo de Grado de Pregrado	spa
oaire.accessrights	http://purl.org/coar/access_right/c_abf2
thesis.degree.level	Pregrado	spa

Files

Original bundle

Name: Exploration of a ViT-based multimodal approach.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format
Description:

Name: BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado jesus.pdf
Size: 536.77 KB
Format: Adobe Portable Document Format
Description:

License bundle

Name: license.txt
Size: 2.43 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Ingeniería de Sistemas