A comprehensive survey on applications of transformers for deep learning tasks

Küçük Resim Yok



Dergi Başlığı

Dergi ISSN

Cilt Başlığı


Pergamon-Elsevier Science Ltd

Erişim Hakkı



Transformers are Deep Neural Networks (DNN) that utilize a self-attention mechanism to capture contextual relationships within sequential data. Unlike traditional neural networks and variants of Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM), Transformer models excel at managing long dependencies among input sequence elements and facilitate parallel processing. Consequently, Transformer -based models have garnered significant attention from researchers in the field of artificial intelligence. This is due to their tremendous potential and impressive accomplishments, which extend beyond Natural Language Processing (NLP) tasks to encompass various domains, including Computer Vision (CV), audio and speech processing, healthcare, and the Internet of Things (IoT). Although several survey papers have been published, spotlighting the Transformer's contributions in specific fields, architectural disparities, or performance assessments, there remains a notable absence of a comprehensive survey paper that encompasses its major applications across diverse domains. Therefore, this paper addresses this gap by conducting an extensive survey of proposed Transformer models spanning from 2017 to 2022. Our survey encompasses the identification of the top five application domains for Transformer-based models, namely: NLP, CV, multi -modality, audio and speech processing, and signal processing. We analyze the influence of highly impactful Transformer-based models within these domains and subsequently categorize them according to their respective tasks, employing a novel taxonomy. Our primary objective is to illuminate the existing potential and future prospects of Transformers for researchers who are passionate about this area, thereby contributing to a more comprehensive understanding of this groundbreaking technology.


Anahtar Kelimeler

Transformer, Self-Attention, Deep Learning, Natural Language Processing (Nlp), Computer Vision (Cv), Multi-Modality


Expert Systems With Applications

WoS Q Değeri


Scopus Q Değeri