Copyright 2025 IEEE - All rights reserved. A public charity, IEEE is the world’s largest technical professional organization dedicated to advancing technology for the benefit of humanity.

CLOSED Call for Papers: Special Issue on Transformer Models in Vision

Transformer models have recently demonstrated exemplary performance on a broad range of language tasks such as text classification, machine translation, and question answering. These breakthroughs in the natural language processing (NLP) domain have sparked great interest in the computer vision community in investigating these models for vision and multi-modal learning tasks. However, visual data has its own characteristic structure (such as spatial and temporal coherence), which demands novel network designs and training schemes. With such adaptations, transformer models and their variants have been successfully used for image recognition, object detection, segmentation, image super-resolution, video understanding, image generation, text-image synthesis, and visual question answering.

Among their salient benefits, transformers can model long-range dependencies between input sequence elements and, unlike recurrent networks such as long short-term memory (LSTM), support parallel processing of sequences. Unlike convolutional neural networks, transformers require minimal inductive biases in their design and are naturally suited to acting as set functions. Furthermore, the relatively straightforward design of transformers allows multiple modalities (such as images, videos, text, and speech) to be processed with similar building blocks, and it scales well to very large-capacity networks and huge datasets.

This special issue seeks original contributions that advance the theory, architecture, and algorithmic design of transformer models in computer vision, as well as novel applications and use cases. We envision original and well-motivated adaptations of transformer models for vision tasks and efforts towards improving their accuracy, robustness, and efficiency. The special issue will provide a timely collection of recent advances to benefit researchers and practitioners working in the broad research field of computer vision, pattern analysis, and machine intelligence. Topics of interest include (but are not limited to):

  • Theoretical insights into transformer-based models
  • Efficient transformer architectures, including novel mechanisms for self-attention
  • Novel transformer models for spatial (image) and temporal (video) data modeling
  • Visualizing and interpreting transformer networks
  • Generative models for transformer networks
  • Hybrid network designs combining the strengths of transformer models with convolutional and graph-based models
  • Unsupervised, weakly supervised, and semi-supervised learning with transformer models
  • Multi-modal learning combining visual data with text, speech, and knowledge graphs
  • Leveraging multi-spectral data like satellite imagery and infrared images in transformer models for improved semantic understanding of visual content
  • Transformer-based designs for low-level vision problems such as image super-resolution, deblurring, de-raining, and denoising
  • Novel transformer-based methods for high-level vision problems such as object detection, segmentation, activity recognition, and pose estimation
  • Transformer models for volumetric, mesh, and point-cloud data processing in 3D and 4D data regimes

Important Dates

Open for submissions: 15 October 2021

Submissions due: 15 February 2022

Preliminary notification: 15 March 2022

Revisions due: 15 May 2022

Final notification: 30 June 2022

Publication (tentative): November 2022

Submission Guidelines

For author information and guidelines on submission criteria, visit the Author Information page. Please submit papers through the ScholarOne system, and be sure to select the special-issue name. Manuscripts must not have been published elsewhere or be currently under consideration for publication elsewhere. Please submit only full papers intended for review, not abstracts, to the ScholarOne portal.

Questions?

Contact the guest editors:

  • Ashish Vaswani, Research Scientist, Google Brain (USA)
  • Fahad Shahbaz Khan, Associate Professor, Mohamed Bin Zayed University of Artificial Intelligence (UAE); Linköping University (Sweden)
  • Ming-Hsuan Yang, Professor, University of California, Merced (USA); Research Scientist, Google (USA)
  • Mubarak Shah, Trustee Chair Professor, University of Central Florida (USA)
  • Niki Parmar, Research Scientist, Google Research (USA)
  • Salman Khan, Assistant Professor, Mohamed Bin Zayed University of Artificial Intelligence (UAE); Australian National University (Australia)
