June 18-22, 2023, Vancouver, Canada
The Thirty-Fourth IEEE/CVF Conference on Computer Vision and Pattern Recognition
Submission website: https://openreview.net/group?id=thecvf.com/CVPR/2023/Conference
Call for Papers
Papers in the main technical program must describe high-quality, original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to:
Important Dates
Paper Registration Deadline*: Fri, Nov 4, 2022, 11:59 PM Pacific Time
* Date is fixed, no extension will be given
Paper Submission
All submissions will be handled electronically via the OpenReview conference submission website https://openreview.net/group?id=thecvf.com/CVPR/2023/Conference . All authors must agree to the policies stipulated below. The submission deadline is November 11, 2022 and will not be changed. Supplementary materials can be submitted until November 18, 2022.
In submitting a manuscript to CVPR, authors acknowledge that no paper substantially similar in content has been or will be submitted to another conference or workshop during the review period (November 11, 2022 – February 27, 2023). Please refer to the Author Guidelines on the conference web site for additional details on dual submissions and guidelines concerning prior work.
By submitting a paper to CVPR, the authors agree to the review process and understand that papers are processed by OpenReview to match each manuscript to the best possible area chairs and reviewers.
All accepted papers will be made publicly available by the Computer Vision Foundation (CVF) two weeks before the conference. Authors wishing to submit a patent understand that the paper's official public disclosure is two weeks before the conference or whenever the authors make it publicly available, whichever is first. More information about CVF is available at http://www.cv-foundation.org/ .
Tutorials and Workshops
In addition to the main technical program, the conference will include tutorials and workshops. Information about these can be found on tabs on the main CVPR web page:
- Call for Tutorials
- Call for Workshop Proposals
For further information and updates about the conference, visit the main conference website, at http://cvpr2023.thecvf.com .
Organizing Committee
Conference producer, general chair, program chair, tutorial chair, workshop chair, social chair, senior PAMI-TC ombud, conference ombud, virtual platform chair, technical chair, social activities chair, demonstration chair, local chair, publications chair, publicity chair, finance chair, accessibility chair, doctoral consortium chair, program advisory board, web developer.
CVF Sponsored Conferences Errata
It is the policy of the Computer Vision Foundation to maintain PDF copies of conference papers as submitted during the camera-ready paper collection. These papers are considered the final published versions of the work. We recognize the need for minor corrections after publication, and thus provide links to arXiv versions of the papers where available. If a correction must be made, it should be made as an update to the arXiv version of the paper by the authors. The CVF maintainers should then be notified of the update via email ( [email protected] ). The conference open access website will be updated periodically to indicate changes made to an arXiv version since the original conference publication date. The original camera-ready version of the paper will be maintained within the open access archive, and will not be removed or replaced by request.
WACV 2024 Call for Papers
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) provides a forum for computer vision researchers working on practical applications and innovative algorithms to share their latest developments. WACV 2024 solicits high-quality, original submissions describing research in computer vision, with a particular emphasis on systems and applications with significant, interesting vision components.
Application areas include, but are not limited to:
- Agriculture
- Animals and Insects
- Arts, games, and social media
- Autonomous driving
- Biomedical, healthcare, and medicine
- Commercial and retail
- Embedded sensing and real-time techniques
- Environmental monitoring, climate change, and ecology
- Food science and nutrition
- Psychology and cognitive science
- Remote sensing
- Smartphones and end-user devices
- Social good
- Structural engineering and civil engineering
- Virtual and augmented reality
- Visualization
Authors are also encouraged to submit more traditional computer vision algorithms papers. Topics of interest include, but are not limited to:
- 3D computer vision
- Adversarial learning, adversarial attack, and defense methods
- Biometrics, face, gesture, and body pose
- Computational photography
- Generative models for image, video, 3D, etc.
- Datasets and evaluations
- Explainable, fair, accountable, privacy-preserving, and ethical computer vision
- Image recognition and understanding (object detection, categorization, segmentation, scene modeling, visual reasoning)
- Low-level and physics-based vision
- Machine learning architectures, formulations, and algorithms (including transfer, low-shot, semi-, self-, and unsupervised learning)
- Video recognition and understanding (tracking, action recognition, etc.)
- Vision + language and/or other modalities
All submissions will be handled electronically through CMT: https://cmt3.research.microsoft.com/WACV2024
Papers can be submitted to either the applications or the algorithms tracks, which will have different review criteria. Applications papers will be evaluated on systems-level innovation, novelty of the domain and comparative assessment. Algorithms papers will be evaluated according to the standard conference criteria including algorithmic novelty and quantified evaluation against current, alternative approaches.
Deadlines (note that all PT timezone deadlines are AM, not PM):
First round:
- Paper registration: June 21st, 2023 11:59 AM PT (June 21st, 2023 06:59 PM GMT)
- Submission: June 28th, 2023 11:59 AM PT (June 28th, 2023 06:59 PM GMT)
- Supplementary material deadline: June 30th, 2023 11:59 AM PT (June 30th, 2023 06:59 PM GMT)
- Reviews and Decisions released to authors: Aug 11th, 2023 Aug 14th, 2023
- Rebuttal and Revision submission: Aug. 30th, 2023 11:59 AM PT (Aug. 30th, 2023 06:59 PM GMT)
- Final Decisions released to authors: Oct 20th, 2023
- Camera ready deadline: November 7th, 11:59 PM PT
- Author Registration deadline: November 7th, 11:59 PM PT
Second round:
- Paper registration: Aug. 23rd, 2023 11:59 AM PT (Aug. 23rd, 2023 06:59 PM GMT)
- Submission: Aug. 30th, 2023 11:59 AM PT (Aug. 30th, 2023 06:59 PM GMT)
- Supplementary material deadline: September 1st, 2023 11:59 AM PT (September 1st, 2023 06:59 PM GMT)
- Reviews and Final Decisions released to authors: Oct 20th, 2023
Camera ready submission instructions are here: https://docs.google.com/document/d/1UrN19rmmotCk10FajE1y0Zh0W4ruL3uGQHU_oEzK5pY/
As in previous years, WACV 2024 will employ a two-round review process. New papers can be submitted in either the first or the second round. The primary benefit of submitting in Round 1 is that submissions which are not accepted early in the first round can be revised and resubmitted along with a rebuttal, enabling authors to address reviewer concerns. Round 2 submissions will not have a rebuttal.
Please direct any questions to the Program Chairs ( [email protected] ).
Computer vision, semantic segmentation.
Tumor Segmentation
Panoptic Segmentation
3D Semantic Segmentation
Weakly-Supervised Semantic Segmentation
Representation learning.
Disentanglement
Graph representation learning, sentence embeddings.
Network Embedding
Classification.
Text Classification
Graph Classification
Audio Classification
Medical Image Classification
Object detection.
3D Object Detection
Real-Time Object Detection
RGB Salient Object Detection
Few-Shot Object Detection
Image classification.
Out of Distribution (OOD) Detection
Few-Shot Image Classification
Fine-Grained Image Classification
Semi-Supervised Image Classification
2d object detection.
Edge Detection
Thermal image segmentation.
Open Vocabulary Object Detection
Reinforcement learning (RL), off-policy evaluation, multi-objective reinforcement learning, 3D point cloud reinforcement learning, deep hashing, table retrieval, domain adaptation.
Unsupervised Domain Adaptation
Domain Generalization
Test-time Adaptation
Source-free domain adaptation, image generation.
Image-to-Image Translation
Text-to-Image Generation
Image Inpainting
Conditional Image Generation
Data augmentation.
Image Augmentation
Text Augmentation
Autonomous vehicles.
Autonomous Driving
Self-Driving Cars
Simultaneous Localization and Mapping
Autonomous Navigation
Image Denoising
Color Image Denoising
SAR Image Despeckling
Grayscale image denoising, meta-learning.
Few-Shot Learning
Sample Probing
Universal meta-learning, contrastive learning.
Super-Resolution
Image Super-Resolution
Video Super-Resolution
Multi-Frame Super-Resolution
Reference-based Super-Resolution
Pose estimation.
3D Human Pose Estimation
Keypoint Detection
3D Pose Estimation
6D Pose Estimation
Self-supervised learning.
Point Cloud Pre-training
Unsupervised video clustering, 2d semantic segmentation, image segmentation, text style transfer.
Scene Parsing
Reflection Removal
Visual question answering (VQA).
Visual Question Answering
Machine Reading Comprehension
Chart Question Answering
Embodied Question Answering
Depth Estimation
3D Reconstruction
Neural Rendering
3D Face Reconstruction
Anomaly detection.
Unsupervised Anomaly Detection
One-Class Classification
Supervised anomaly detection, anomaly detection in surveillance videos, sentiment analysis.
Aspect-Based Sentiment Analysis (ABSA)
Multimodal Sentiment Analysis
Aspect Sentiment Triplet Extraction
Twitter Sentiment Analysis
Temporal Action Localization
Video Understanding
Video generation.
Video Object Segmentation
Action Classification
3d object super-resolution.
One-Shot Learning
Few-Shot Semantic Segmentation
Cross-domain few-shot.
Unsupervised Few-Shot Learning
Activity recognition.
Action Recognition
Human Activity Recognition
Egocentric activity recognition.
Group Activity Recognition
Medical image segmentation.
Lesion Segmentation
Brain Tumor Segmentation
Cell Segmentation
Skin lesion segmentation, monocular depth estimation.
Stereo Depth Estimation
Depth and camera motion.
3D Depth Estimation
Exposure fairness, optical character recognition (OCR).
Active Learning
Handwriting Recognition
Handwritten digit recognition, irregular text recognition, instance segmentation.
Referring Expression Segmentation
3D Instance Segmentation
Real-time Instance Segmentation
Unsupervised Object Segmentation
Facial recognition and modelling.
Face Recognition
Face Swapping
Face Detection
Facial Expression Recognition (FER)
Face Verification
Object tracking.
Multi-Object Tracking
Visual Object Tracking
Multiple Object Tracking
Cell Tracking
Zero-shot learning.
Generalized Zero-Shot Learning
Compositional Zero-Shot Learning
Multi-label zero-shot learning, quantization, data free quantization, unet quantization, continual learning.
Class Incremental Learning
Continual named entity recognition, unsupervised class-incremental learning.
Action Recognition In Videos
3D Action Recognition
Self-supervised action recognition, few shot action recognition.
Scene Understanding
Scene Text Recognition
Scene Graph Generation
Scene Recognition
Adversarial attack.
Backdoor Attack
Adversarial Text
Adversarial attack detection, real-world adversarial attack, active object detection, image retrieval.
Sketch-Based Image Retrieval
Content-Based Image Retrieval
Composed Image Retrieval (CoIR)
Medical Image Retrieval
Dimensionality reduction.
Supervised dimensionality reduction
Online nonnegative cp decomposition, emotion recognition.
Speech Emotion Recognition
Emotion Recognition in Conversation
Multimodal Emotion Recognition
Emotion-cause pair extraction.
Monocular 3D Object Detection
3D Object Detection From Stereo Images
Multiview Detection
Robust 3d object detection, image reconstruction.
MRI Reconstruction
Film Removal
Style transfer.
Image Stylization
Font style transfer, style generalization, face transfer, optical flow estimation.
Video Stabilization
Action localization.
Action Segmentation
Spatio-temporal action localization, image captioning.
3D dense captioning
Controllable image captioning, aesthetic image captioning.
Relational Captioning
Person re-identification.
Unsupervised Person Re-Identification
Video-based person re-identification, generalizable person re-identification, cloth-changing person re-identification, image restoration.
Demosaicking
Spectral reconstruction, underwater image restoration.
JPEG Artifact Correction
Visual relationship detection, lighting estimation.
3D Room Layouts From A Single RGB Panorama
Road scene understanding, action detection.
Skeleton Based Action Recognition
Online Action Detection
Audio-visual active speaker detection, metric learning.
Object Recognition
3D Object Recognition
Continuous object recognition.
Depiction Invariant Object Recognition
Monocular 3D Human Pose Estimation
Pose prediction.
3D Multi-Person Pose Estimation
3d human pose and shape estimation, image enhancement.
Low-Light Image Enhancement
Image relighting, de-aliasing, multi-label classification.
Missing Labels
Extreme multi-label classification, hierarchical multi-label classification, medical code prediction, continuous control.
Steering Control
Drone controller, 3d face modelling.
Semi-Supervised Video Object Segmentation
Unsupervised Video Object Segmentation
Referring Video Object Segmentation
Video Salient Object Detection
Trajectory prediction.
Trajectory Forecasting
Human motion prediction, out-of-sight trajectory prediction.
Multivariate Time Series Imputation
Image quality assessment, no-reference image quality assessment, blind image quality assessment.
Aesthetics Quality Assessment
Stereoscopic image quality assessment, novel view synthesis.
Novel LiDAR View Synthesis
Ground video synthesis from satellite image
Object localization.
Weakly-Supervised Object Localization
Image-based localization, unsupervised object localization, monocular 3d object localization.
Blind Image Deblurring
Single-image blind deblurring, out-of-distribution detection, video semantic segmentation.
Camera shot segmentation
Cloud removal.
Facial Inpainting
Fine-Grained Image Inpainting
2d classification.
Face Generation
Neural Network Compression
Music Source Separation
Cell detection.
Plant Phenotyping
Instruction following, visual instruction following, change detection.
Semi-supervised Change Detection
Saliency detection.
Saliency Prediction
Co-Salient Object Detection
Video saliency detection, unsupervised saliency detection, prompt engineering.
Visual Prompting
Image compression.
Feature Compression
Jpeg compression artifact reduction.
Lossy-Compression Artifact Reduction
Color image compression artifact reduction, explainable artificial intelligence, explainable models, explanation fidelity evaluation, fad curve analysis, image registration.
Unsupervised Image Registration
Ensemble learning, salient object detection, saliency ranking, visual reasoning.
Visual Commonsense Reasoning
Visual tracking.
Point Tracking
Rgb-t tracking, real-time visual tracking.
RF-based Visual Tracking
3d point cloud classification.
3D Object Classification
Few-Shot 3D Point Cloud Classification
Supervised only 3d point cloud classification, zero-shot transfer 3d point cloud classification, motion estimation, image manipulation detection.
Zero Shot Skeletal Action Recognition
Generalized zero shot skeletal action recognition, whole slide images, visual grounding.
3D visual grounding
Person-centric visual grounding.
Phrase Extraction and Grounding (PEG)
Activity prediction, motion prediction, cyber attack detection, sequential skip prediction, gesture recognition.
Hand Gesture Recognition
Hand-Gesture Recognition
RF-based Gesture Recognition
Video question answering.
Zero-Shot Video Question Answer
Few-shot video question answering, text detection, video captioning.
Dense Video Captioning
Boundary captioning, visual text correction, audio-visual video captioning, point cloud registration.
Image to Point Cloud Registration
Robust 3D Semantic Segmentation
Real-Time 3D Semantic Segmentation
Unsupervised 3D Semantic Segmentation
Furniture segmentation, medical diagnosis.
Alzheimer's Disease Detection
Retinal OCT Disease Classification
Blood cell count, thoracic disease classification, 3d point cloud interpolation, visual odometry.
Face Anti-Spoofing
Monocular visual odometry.
Hand Pose Estimation
Hand Segmentation
Gesture-to-gesture translation, rain removal.
Single Image Deraining
Image clustering.
Online Clustering
Face Clustering
Multi-view subspace clustering, multi-modal subspace clustering.
Image Dehazing
Single Image Dehazing
Colorization.
Line Art Colorization
Point-interactive Image Colorization
Color Mismatch Correction
Robot navigation.
PointGoal Navigation
Social navigation.
Sequential Place Learning
Image manipulation, conformal prediction, image editing, rolling shutter correction, shadow removal, multimodel-guided image editing, joint deblur and frame interpolation, multimodal fashion image editing, visual place recognition.
Indoor Localization
3d place recognition, visual localization.
DeepFake Detection
Synthetic Speech Detection
Human detection of deepfakes, multimodal forgery detection.
Unsupervised Image-To-Image Translation
Synthetic-to-Real Translation
Multimodal Unsupervised Image-To-Image Translation
Cross-View Image-to-Image Translation
Fundus to Angiography Generation
Stereo matching.
Crowd Counting
Visual Crowd Analysis
Group detection in crowds, object reconstruction.
3D Object Reconstruction
Human-object interaction detection.
Affordance Recognition
Earth observation, image deblurring, low-light image deblurring and enhancement, image matching.
Semantic correspondence
Patch matching, set matching.
Matching Disparate Images
Video quality assessment, video alignment, temporal sentence grounding, long-video activity recognition, point cloud classification, jet tagging, few-shot point cloud classification, hyperspectral.
Hyperspectral Image Classification
Hyperspectral unmixing, hyperspectral image segmentation, classification of hyperspectral images, document text classification.
Learning with noisy labels
Multi-label classification of biomedical texts, political salient issue orientation detection, 3d point cloud reconstruction.
Weakly Supervised Action Localization
Weakly-supervised temporal action localization.
Temporal Action Proposal Generation
Activity recognition in videos, scene classification.
2D Human Pose Estimation
Action anticipation.
3D Face Animation
Semi-supervised human pose estimation, point cloud generation, point cloud completion, referring expression, reconstruction, 3d human reconstruction.
Single-View 3D Reconstruction
4d reconstruction, single-image-based hdr reconstruction, compressive sensing, keyword spotting.
Small-Footprint Keyword Spotting
Visual keyword spotting, scene text detection.
Curved Text Detection
Multi-oriented scene text detection, camera calibration, boundary detection.
Junction Detection
Image matting.
Semantic Image Matting
Video retrieval, video-text retrieval, video grounding, video-adverb retrieval, replay grounding, composed video retrieval (covr), cross-modal retrieval, image-text matching, cross-modal retrieval with noisy correspondence, multilingual cross-modal retrieval.
Zero-shot Composed Person Retrieval
Cross-modal retrieval on rsitmd, document ai, document understanding, emotion classification.
Motion Synthesis
Motion Style Transfer
Temporal human motion composition, video summarization.
Unsupervised Video Summarization
Supervised video summarization, point cloud segmentation, superpixels.
Sensor Fusion
Few-Shot Transfer Learning for Saliency Prediction
Aerial Video Saliency Prediction
3d anomaly detection, video anomaly detection, artifact detection, remote sensing.
Remote Sensing Image Classification
Change detection for remote sensing images, building change detection for remote sensing images.
Segmentation Of Remote Sensing Imagery
The Semantic Segmentation Of Remote Sensing Imagery
Document layout analysis.
Talking Head Generation
Talking face generation.
Face Age Editing
Facial expression generation, kinship face generation.
Point cloud reconstruction
3D Semantic Scene Completion
3D Semantic Scene Completion from a single RGB image
Garment reconstruction, human detection.
Video Instance Segmentation
Privacy Preserving Deep Learning
Membership inference attack.
Generalized Few-Shot Semantic Segmentation
Video editing, video temporal consistency, virtual try-on.
Generalized Referring Expression Segmentation
Scene flow estimation.
Self-supervised Scene Flow Estimation
Object discovery, 3d classification, depth completion.
Motion Forecasting
Multi-Person Pose forecasting
Multiple Object Forecasting
Face reconstruction, gaze estimation.
CARLA MAP Leaderboard
Dead-reckoning prediction.
text-guided-image-editing
Text-based image editing, concept alignment.
Zero-Shot Text-to-Image Generation
Conditional text-to-image synthesis, texture synthesis, machine unlearning, continual forgetting, sign language recognition.
MULTI-VIEW LEARNING
Incomplete multi-view clustering, interactive segmentation.
Breast Cancer Detection
Skin cancer classification.
Breast Cancer Histology Image Classification
Lung cancer diagnosis, classification of breast cancer histology images, image recognition, fine-grained image recognition, license plate recognition, material recognition, disease prediction, disease trajectory forecasting, gait recognition.
Multiview Gait Recognition
Gait recognition in the wild, human parsing.
Multi-Human Parsing
Scene generation, weakly supervised segmentation, object counting, training-free object counting, open-vocabulary object counting, pose tracking.
3D Human Pose Tracking
3D Multi-Person Pose Estimation (absolute)
3D Multi-Person Pose Estimation (root-relative)
3D Multi-Person Mesh Recovery
Event-based vision.
Event-based Optical Flow
Event-Based Video Reconstruction
Event-based motion estimation, interest point detection, homography estimation.
3D Hand Pose Estimation
Facial landmark detection.
Unsupervised Facial Landmark Detection
3D Facial Landmark Localization
3d character animation from a single photo, scene segmentation.
Dichotomous Image Segmentation
Activity detection, inverse rendering, temporal localization.
Language-Based Temporal Localization
Temporal defect localization, multi-label image classification.
Multi-label Image Recognition with Partial Labels
Text-to-video generation, text-to-video editing, subject-driven video generation, 3d object tracking.
3D Single Object Tracking
Template matching, camera localization.
Camera Relocalization
Lidar semantic segmentation, motion segmentation, visual dialog.
Relation Network
Text spotting.
Intelligent Surveillance
Vehicle Re-Identification
Few-shot class-incremental learning, class-incremental semantic segmentation, non-exemplar-based class incremental learning, disparity estimation.
Handwritten Text Recognition
Handwritten document recognition, unsupervised text recognition, knowledge distillation.
Data-free Knowledge Distillation
Self-knowledge distillation, moment retrieval.
Zero-shot Moment Retrieval
Text to video retrieval, partially relevant video retrieval, decision making under uncertainty.
Uncertainty Visualization
Person search, semi-supervised object detection.
Shadow Detection
Shadow Detection And Removal
Unconstrained Lip-synchronization
Mixed reality, video inpainting.
Video Enhancement
Cross-corpus
Micro-expression recognition, micro-expression spotting.
3D Facial Expression Recognition
Smile Recognition
3D Multi-Object Tracking
Real-time multi-object tracking, multi-animal tracking with identification, trajectory long-tail distribution for multi-object tracking, grounded multiple object tracking, future prediction, human mesh recovery.
Stereo Image Super-Resolution
Burst image super-resolution, satellite image super-resolution, multispectral image super-resolution.
Face Image Quality Assessment
Lightweight face recognition.
Age-Invariant Face Recognition
Synthetic face recognition, face quality assessment, image categorization, fine-grained visual categorization, zero shot segmentation, physics-informed machine learning, soil moisture estimation, video reconstruction.
Overlapped 10-1
Overlapped 15-1, overlapped 15-5, disjoint 10-1, disjoint 15-1, color constancy.
Few-Shot Camera-Adaptive Color Constancy
Hdr reconstruction, multi-exposure image fusion, open vocabulary semantic segmentation, zero-guidance segmentation, deep attention, line detection, visual recognition.
Fine-Grained Visual Recognition
Sign language translation.
Image Cropping
Stereo matching hand.
3D Absolute Human Pose Estimation
Text-to-Face Generation
Image forensics, tone mapping, zero-shot action recognition, natural language transduction, novel class discovery.
Breast Cancer Histology Image Classification (20% labels)
Transparent object detection, transparent objects, video restoration.
Analog Video Restoration
Abnormal event detection in video.
Semi-supervised Anomaly Detection
Surface normals estimation.
Vision-Language Navigation
hand-object pose
Grasp Generation
3D Canonical Hand Pose Estimation
Cross-domain few-shot learning, texture classification, image animation.
Infrared And Visible Image Fusion
Image to 3D
Probabilistic deep learning, unsupervised few-shot image classification, generalized few-shot classification, action quality assessment, highlight detection, pedestrian attribute recognition.
Steganalysis
Sketch Recognition
Face Sketch Synthesis
Drawing pictures.
Photo-To-Caricature Translation
Spoof detection, face presentation attack detection, detecting image manipulation, cross-domain iris presentation attack detection, finger dorsal image spoof detection, computer vision techniques adopted in 3d cryogenic electron microscopy, single particle analysis, cryogenic electron tomography, iris recognition, pupil dilation.
One-shot visual object segmentation
Unbiased Scene Graph Generation
Panoptic Scene Graph Generation
Image to video generation.
Unconditional Video Generation
Action understanding, automatic post-editing.
Dense Captioning
Document image classification.
Image Stitching
Multi-View 3D Reconstruction
Person retrieval, segmentation, open-vocabulary semantic segmentation, universal domain adaptation, surgical phase recognition, online surgical phase recognition, offline surgical phase recognition, blind face restoration.
Face Reenactment
Geometric Matching
Human action generation.
Action Generation
Object categorization, text based person retrieval, diffusion personalization.
Diffusion Personalization Tuning Free
Efficient Diffusion Personalization
Human dynamics.
3D Human Dynamics
Meme classification, hateful meme classification, severity prediction, intubation support prediction, cloud detection.
Text-To-Image
Story visualization, complex scene breaking and synthesis, image fusion, pansharpening, image deconvolution.
Image Outpainting
Table Recognition
Object segmentation.
Camouflaged Object Segmentation
Landslide segmentation, text-line extraction, point clouds, point cloud video understanding, point cloud representation learning.
Semantic SLAM
Object SLAM
Intrinsic image decomposition, line segment detection, sports analytics, situation recognition, grounded situation recognition, image shadow removal, motion detection, multi-target domain adaptation, person identification, visual prompt tuning, single-source domain generalization, evolving domain generalization, source-free domain generalization.
Robot Pose Estimation
Camouflaged Object Segmentation with a Single Task-generic Prompt
Image morphing, rotated mnist, weakly-supervised instance segmentation, image smoothing, fake image detection.
GAN image forensics
Fake Image Attribution
Image steganography, occlusion handling, contour detection.
Crop Classification
Face image quality, lane detection.
3D Lane Detection
Layout design, license plate detection.
Video Panoptic Segmentation
Viewpoint estimation.
Drone navigation
Drone-view target localization, value prediction, body mass index (bmi) prediction, crop yield prediction, multi-object tracking and segmentation.
Zero-Shot Transfer Image Classification
3D Object Reconstruction From A Single Image
CAD Reconstruction
3d point cloud linear classification, multiview learning, person recognition.
Photo Retouching
Motion retargeting, shape representation of 3d point clouds, bird's-eye view semantic segmentation.
Dense Pixel Correspondence Estimation
Human part segmentation.
Document Shadow Removal
Symmetry detection, traffic sign detection, video style transfer, referring image matting.
Referring Image Matting (Expression-based)
Referring Image Matting (Keyword-based)
Referring Image Matting (RefMatte-RW100)
Referring image matting (prompt-based), human interaction recognition, one-shot 3d action recognition, mutual gaze, affordance detection.
Gaze Prediction
Hand detection, image forgery detection, image instance retrieval, amodal instance segmentation, image quality estimation.
Image Similarity Search
Precipitation Forecasting
Referring expression generation, road damage detection.
Space-time Video Super-resolution
Video matting.
Open Vocabulary Attribute Detection
Inverse tone mapping, image/document clustering, self-organized clustering, instance search.
Audio Fingerprint
Open-World Semi-Supervised Learning
Semi-supervised image classification (cold start), 3d shape modeling.
Action Analysis
Art analysis, facial editing.
Food Recognition
Holdout Set
Material classification.
Motion Magnification
Multispectral object detection, semi-supervised instance segmentation, binary classification, LLM-generated text detection, cancer-no cancer per breast classification, cancer-no cancer per image classification, suspicious (BI-RADS 4,5)-no suspicious (BI-RADS 1,2,3) per image classification, cancer-no cancer per view classification, video segmentation, camera shot boundary detection, open-vocabulary video segmentation, open-world video segmentation, lung nodule classification, lung nodule 3D classification, lung nodule detection, lung nodule 3D detection, 3D scene reconstruction.
Zero-Shot Composed Image Retrieval (ZS-CIR)
Event segmentation, generic event boundary detection, image retouching, image-variation, jpeg artifact removal, point cloud super resolution, skills assessment.
Text-based Person Retrieval
Sensor Modeling
Handwriting verification, bangla spelling error correction, video prediction, earth surface forecasting, predict future video frames, ad-hoc video search, audio-visual synchronization, handwriting generation, pose retrieval, scanpath prediction, scene change detection.
Sketch-to-Image Translation
Skills evaluation, synthetic image detection, highlight removal, 3d shape reconstruction from a single 2d image.
Shape from Texture
Deception detection, deception detection in videos.
Video Visual Relation Detection
Human-object relationship detection, 3d open-vocabulary instance segmentation.
3D Shape Representation
3D Dense Shape Correspondence
Birds eye view object detection.
Multiple People Tracking
Network Interpretation
Rgb-d reconstruction, seeing beyond the visible, semi-supervised domain generalization, unsupervised semantic segmentation.
Unsupervised Semantic Segmentation with Language-image Pre-training
Multiple object tracking with transformer.
Multiple Object Track and Segmentation
Constrained lip-synchronization, face dubbing, vietnamese visual question answering, explanatory visual question answering, 3d shape reconstruction, 4d panoptic segmentation, defocus blur detection, event data classification, image comprehension, image manipulation localization, instance shadow detection, kinship verification, medical image enhancement, open vocabulary panoptic segmentation, single-object discovery, training-free 3d point cloud classification, video forensics.
Sequential Place Recognition
Autonomous flight (dense forest), autonomous web navigation, 2d pose estimation, category-agnostic pose estimation, overlapping pose estimation.
Generative 3D Object Classification
Cube engraving classification, facial expression recognition, cross-domain facial expression recognition, zero-shot facial expression recognition, multimodal machine translation.
Face to Face Translation
Multimodal lexical translation, 10-shot image generation, 2d semantic segmentation task 3 (25 classes), document enhancement, action assessment, bokeh effect rendering, drivable area detection, face anonymization, font recognition, horizon line estimation, image imputation.
Long Video Retrieval (Background Removed)
Medical image denoising.
Occlusion Estimation
Physiological computing.
Lake Ice Monitoring
Short-term object interaction anticipation, spatio-temporal video grounding, text-based person retrieval with noisy correspondence.
Unsupervised 3D Point Cloud Linear Evaluation
Wireframe parsing, gaze redirection, single-image-generation, unsupervised anomaly detection with specified settings -- 30% anomaly, root cause ranking, anomaly detection at 30% anomaly, anomaly detection at various anomaly percentages.
Unsupervised Contextual Anomaly Detection
Landmark tracking, muscle tendon junction identification, mistake detection, online mistake detection, 3d object captioning, 3d semantic occupancy prediction, animated gif generation, generalized referring expression comprehension, image deblocking, infrared image super-resolution, motion disentanglement, persuasion strategies, scene text editing, image to sketch recognition, traffic accident detection, accident anticipation, unsupervised landmark detection, visual speech recognition, lip to speech synthesis, continual anomaly detection, weakly supervised action segmentation (transcript), weakly supervised action segmentation (action set), calving front delineation in synthetic aperture radar imagery, calving front delineation in synthetic aperture radar imagery with fixed training amount.
Handwritten Line Segmentation
Handwritten word segmentation.
General Action Video Anomaly Detection
Physical video anomaly detection, monocular cross-view road scene parsing (road), monocular cross-view road scene parsing (vehicle).
Transparent Object Depth Estimation
3d scene editing, age and gender estimation, data ablation.
Occluded Face Detection
Gait identification, historical color image dating, stochastic human motion prediction, image retargeting, image and video forgery detection, motion captioning, personality trait recognition, personalized segmentation, scene-aware dialogue, spatial relation recognition, spatial token mixer, steganographics, story continuation.
Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly
Unsupervised anomaly detection with specified settings -- 1% anomaly, unsupervised anomaly detection with specified settings -- 10% anomaly, unsupervised anomaly detection with specified settings -- 20% anomaly, vehicle speed estimation, visual analogies, visual social relationship recognition, zero-shot text-to-video generation, text-guided-generation, video frame interpolation, 3d video frame interpolation, unsupervised video frame interpolation.
eXtreme-Video-Frame-Interpolation
Continual semantic segmentation, overlapped 5-3, overlapped 25-25, micro-expression generation, micro-expression generation (megc2021), period estimation, art period estimation (544 artists), unsupervised panoptic segmentation, unsupervised zero-shot panoptic segmentation, 3d rotation estimation, camera auto-calibration, defocus estimation, derendering, fingertip detection, hierarchical text segmentation, human-object interaction concept discovery.
One-Shot Face Stylization
Speaker-specific lip to speech synthesis, multi-person pose estimation, neural stylization.
Part-aware Panoptic Segmentation
Population Mapping
Pornography detection, prediction of occupancy grid maps, raw reconstruction, repetitive action counting, svbrdf estimation, semi-supervised video classification, spectrum cartography, supervised image retrieval, synthetic image attribution, training-free 3d part segmentation, unsupervised image decomposition, video propagation, vietnamese multimodal learning, weakly supervised 3d point cloud segmentation, weakly-supervised panoptic segmentation, drone-based object tracking, brain visual reconstruction, brain visual reconstruction from fmri.
Human-Object Interaction Generation
Image-guided composition, fashion understanding, semi-supervised fashion compatibility.
Intensity Image Denoising
Lifetime image denoising, observation completion, active observation completion, boundary grounding.
Video Narrative Grounding
3d inpainting, 3d scene graph alignment, 4d spatio temporal semantic segmentation.
Age Estimation
Few-shot Age Estimation
Brdf estimation, camouflage segmentation, clothing attribute recognition, damaged building detection, depth image estimation, detecting shadows, dynamic texture recognition.
Disguised Face Verification
Few shot open set object detection, gaze target estimation, generalized zero-shot learning - unseen, hd semantic map learning, human-object interaction anticipation, image deep networks, keypoint detection and image matching, manufacturing quality control, materials imaging, micro-gesture recognition, multi-person pose estimation and tracking.
Multi-modal Image Segmentation
Multi-object discovery, neural radiance caching.
Parking Space Occupancy
Partial Video Copy Detection
Multimodal Patch Matching
Perpetual view generation, procedure learning, prompt-driven zero-shot domain adaptation, jersey number recognition, photo to rest generalization, single-shot hdr reconstruction, on-the-fly sketch based image retrieval, thermal image denoising, trademark retrieval, unsupervised instance segmentation, unsupervised zero-shot instance segmentation, vehicle key-point and orientation estimation.
Video Individual Counting
Video-adverb retrieval (unseen compositions), video-to-image affordance grounding.
Vietnamese Scene Text
Visual sentiment prediction, human-scene contact detection, localization in video forgery, video classification, student engagement level detection (four class video classification), multi class classification (four-level video classification), 3d canonicalization, 3d surface generation.
Visibility Estimation from Point Cloud
Amodal layout estimation, blink estimation, camera absolute pose regression, change data generation, constrained diffeomorphic image registration, continuous affect estimation, deep feature inversion, document image skew estimation, earthquake prediction, fashion compatibility learning.
Displaced People Recognition
Finger vein recognition, flooded building segmentation.
Future Hand Prediction
Generative temporal nursing, grounded multimodal named entity recognition, house generation, human fmri response prediction, hurricane forecasting, ifc entity classification, image declipping, image similarity detection.
Image Text Removal
Image-to-gps verification.
Image-based Automatic Meter Reading
Dial meter reading, indoor scene reconstruction, jpeg decompression.
Kiss Detection
Laminar-turbulent flow localisation.
Landmark Recognition
Brain landmark detection, corpus video moment retrieval, mllm evaluation: aesthetics, medical image deblurring, mental workload estimation, meter reading, motion expressions guided video segmentation, natural image orientation angle detection, multi-object colocalization, multilingual text-to-image generation, video emotion detection, nwp post-processing, occluded 3d object symmetry detection, open set video captioning, pso-convnets dynamics 1, pso-convnets dynamics 2, partial point cloud matching.
Partially View-aligned Multi-view Learning
Pedestrian Detection
Thermal Infrared Pedestrian Detection
Personality trait recognition by face, physical attribute prediction, point cloud semantic completion, point cloud classification dataset, point-of-no-return (pnr) temporal localization, pose contrastive learning, portrait generation, procedure step recognition, prostate zones segmentation, pulmonary vessel segmentation, pulmonary artery–vein classification, reference expression generation, safety perception recognition, interspecies facial keypoint transfer, specular reflection mitigation, specular segmentation, state change object detection, surface normals estimation from point clouds, train ego-path detection.
Transform A Video Into A Comics
Transparency separation, typeface completion.
Unbalanced Segmentation
Unsupervised Long Term Person Re-Identification
Video correspondence flow.
Key-Frame-based Video Super-Resolution (K = 15)
Zero-shot single object tracking, yield mapping in apple orchards, lidar absolute pose regression, opd: single-view 3d openable part detection, self-supervised scene text recognition, spatial-aware image editing, video narration captioning, spectral estimation, spectral estimation from a single rgb image, 3d prostate segmentation, aggregate xview3 metric, atomic action recognition, composite action recognition, calving front delineation from synthetic aperture radar imagery, computer vision transduction, crosslingual text-to-image generation, zero-shot dense video captioning, document to image conversion, frame duplication detection, geometrical view, hyperview challenge.
Image Operation Chain Detection
Kinematic based workflow recognition, logo recognition.
MLLM Aesthetic Evaluation
Motion detection in non-stationary scenes, open-set video tagging, satellite orbit determination.
Segmentation Based Workflow Recognition
2d particle picking, small object detection.
Rice Grain Disease Detection
Sperm morphology classification, video & kinematic base workflow recognition, video based workflow recognition, video, kinematic & segmentation base workflow recognition, animal pose estimation.
Open Special Issue Calls
Special Issue on Computer Vision and Games - Call for Papers
Video games and computer vision research have long had a symbiotic relationship. On the one hand, virtual worlds in games are often used to collect training data or as testbeds for computer vision models, because they offer far more flexibility, control and scalability in data collection than the real world. On the other hand, computer vision advancements have pushed the frontiers of what is possible within these artificial game worlds and have transformed the processes by which these worlds are created. However, significant research questions, including technical and engineering challenges, remain unaddressed in both the field (Computer Vision) and the domain (Games).
This special issue invites research papers that aim to bridge the existing gaps between computer vision research and games engineering, with the goal of bringing together the games research community and the computer vision community, which have largely operated independently until now. We are inviting papers for two main tracks. The first track focuses on introducing novel techniques within computer vision research that can advance the field of digital games. The second track focuses instead on leveraging game technologies to advance state-of-the-art techniques in computer vision. The list of topics below is not exhaustive.
1) Computer Vision for Games
CV for game-playing, game testing and player modelling.
Data-driven CV to improve game graphics, animations, level-design, etc. as well as procedural content generation.
HCI through visual interfaces (gestures, posture, gaze, etc.).
Extended reality games.
Synthetic data and media generation based on users' emotions, behaviour, etc.
Improving real-time applicability of vision models integrated within games and game engines.
2) Games for Computer Vision
Game worlds that aid data augmentation techniques.
Rich game-based labelled datasets for tasks such as object detection, segmentation, or depth and flow estimation.
Ethics of game-based data collection and inference.
Forward modelling in and for games.
Generalisation and robustness in vision models leveraging a plethora of existing commercial games.
Unsupervised pre-training of image/video representations and world transition models from gameplay data.
We invite the submission of high quality papers on the topics above in the full paper format. Authors should follow normal IEEE Transactions on Games guidelines for their submissions, but clearly identify their papers for this special issue during the submission process. Extended versions of previously published conference or workshop papers are welcome, provided that the journal paper is a significant extension, and is accompanied by a cover letter explaining the additional contribution. You may visit the submission guidelines for author information guidelines and page length limits.
Important Dates:
Paper submission: January 31, 2024
First decisions: May 31st, 2024
Early access SI publication (online): August 2024
Publication in print: End 2024
Guest Editors:
Chintan Trivedi (University of Malta)
Matthew Guzdial (University of Alberta)
Konstantinos Makantasis (University of Malta)
Julian Togelius (New York University)
Nicu Sebe (University of Trento)
Special Issue on Human-Centred AI in Game Evaluation - Call for Papers
Most games are consciously designed with a specific experience or vision in mind. Games are commonly designed for entertainment and competition purposes, but self-expression, social critique, targeted learning, knowledge discovery as well as physical and mental health are also valid design objectives. Determining whether an objective is fulfilled is often quite difficult due to the complexity of modern games and the variability of human responses. For this reason, games are commonly playtested before being published. However, playtests are expensive and time-consuming and not all aspects of the game can be evaluated to the full extent before it is published.
There is thus a need for more concentrated and systematic work on evaluating and characterising games, their artefacts and the player experience. Researchers have proposed approaches intended to assist game designers using methods from the fields of artificial and computational intelligence (AI and CI, respectively). Still, to our knowledge, there is a surprising lack of generality and validation regarding these methods, even in scientific publications on game design. No central repository for methods currently exists. In this special issue, we want to focus on human-centred AI approaches that aim for a more holistic and systematic approach to game evaluation. We thus seek submissions on related topics for this special issue.
The following is a non-comprehensive list of suggested topics:
Uses of AI agents to evaluate game content
Measures for game evaluation
Game evaluation and play-testing for AR/VR
Relationship between AI agents and player experience
Automatic analysis of play-traces
Mixed-Initiative gameplay evaluation
Player modelling for game evaluation
Automatic evaluation for new game genres
Validation of automatic evaluation methods using human data
Generality of automatic evaluation methods
Differences between different evaluation methods (tested with AI or humans, qualitative vs. quantitative, objective vs subjective measures, etc.)
Evaluation measures and their relationship to business and research goals
Playtesting standards in industry
Correlation between objective and subjective measures
Ethics, privacy and legal aspects of using player data
Evaluation of generated content
We invite the submission of high quality papers on the topics above in the following formats:
Full papers
Short papers
Authors should follow normal IEEE Transactions on Games guidelines for their submissions, but clearly identify their papers for this special issue during the submission process. Extended versions of previously published conference or workshop papers are welcome, provided that the journal paper is a significant extension, and is accompanied by a cover letter explaining the additional contribution. See https://www.transactions.games/submit/submission-guidelines for author information guidelines and page length limits.
Important Dates
Paper submission November 1st, 2023
First decisions January 29th, 2024
Early access SI publication (online) March 2024
Publication in print End 2024
Guest Editors
Alena Denisova (University of York, UK)
Diego Pérez-Liébana (Queen Mary University of London, UK)
Vanessa Volz (modl.ai, DK)
Julian Frommel (Utrecht University, NL)
Sahar Asadi (King, SE)
Ongoing Special Issues
User Evaluation for VR Games - Guest editors: Hai-Ning Liang (Xi’an Jiaotong-Liverpool University, China), Wenge Xu (Birmingham City University, UK), Yiyu Cai (Nanyang Technological University, Singapore), Fotis Liarokapis (CYENS – Centre of Excellence, Cyprus)
Recent Special Issues
Evolutionary Computation for Games - Guest editors: Jacob Schrum, Jialin Liu, Cameron Browne, Anikó Ekárt and Marcus Gallagher (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=10075412&punumber=7782673)
User Experience of AI in Games - Guest editors: Henrik Warpefelt, Christoph Salge, Magy Seif El-Nasr, Jichen Zhu and Mirjam Palosaari Eladhari (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9987665&punumber=7782673)
Team AI in Games - Guest editors: Maxim Mozgovoy, Mike Preuss, Tomoharu Nakashima and Rafael Bidarra (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9652015&punumber=7782673)
Serious Games for Health - Guest editors: Duarte Duque, João L. Vilaça, Marjorie A. Zielke, Nuno Dias, Nuno F. Rodrigues and Ruck Thawonmas (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9301448&punumber=7782673)
Game Competition Frameworks for Research and Education - Guest editors: Jialin Liu, Diego Perez-Liebana, Tristan Cazenave and Ruck Thawonmas (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=8839649&punumber=7782673)
AI-Based and AI-Assisted Game Design - Guest editors: Antonios Liapis, Georgios N. Yannakakis, Michael Cook and Simon Colton (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=8667748&punumber=7782673)
New Proposals
If you are interested in guest editing a special issue, you should send a proposal to the Editor-in-Chief (Georgios N. Yannakakis).
Preparing your Proposal
- Please prepare a Call for Papers which will ultimately be distributed. We would suggest that this is a maximum of one side of A4.
- Please provide an additional page that presents short biographies of the proposed guest editors and explains why they are qualified to guest edit a special issue of the Transactions on Games in the area being suggested.
- The proposal will be sent to the journal’s Associate Editors for comment. You may be asked to revise the proposal and this iterative process will continue until a decision has been made.
Other Points to Note
- The proposed guest editors must include a current Associate Editor of ToG. This will ensure that the same standards are maintained for special issues, as for regular issues.
- A guest editor can submit a maximum of ONE paper with them as an author.
- A special issue cannot be related to a particular event (e.g. a conference) and submissions should be open to all.
- The editorial that is eventually written to introduce the special issue must not mention specific conferences or events.
Previous Special Issues
- Computational Aesthetics in Games (Volume 4, Issue 3 and Editorial)
- Computational Narrative and Games (Volume 6, Issue 2 and Editorial)
- General Games (Volume 6, Issue 4 and Editorial)
- Age of Analytics (Volume 7, Issue 3 and Editorial)
- Physics Simulation (Volume 8, Issue 2 and Editorial)
- Real Time Strategy (Volume 8, Issue 4 and Editorial)
TCPAMI's Grand Collection
Computer vision is experiencing remarkable growth, fueled by breakthroughs in pattern recognition, artificial intelligence, and image processing. TCPAMI's Grand Collection is a compendium of computer vision research that received the Best Paper Award in 2023 from CVPR, WACV, ICCV, and ICCP, covering a range of topics from decision-making of autonomous driving systems to visual programming techniques.
These conferences, organized by the Technical Community on Pattern Analysis and Machine Intelligence (TCPAMI), collectively highlight the rapid advancements and diverse applications of computer vision technology, offering valuable insights and collaboration opportunities for computer vision experts.
Download TCPAMI's Grand Collection to stay abreast of the topics influencing research in 2024. Paper titles include:
- Lossy Image Compression with Quantized Hierarchical VAEs
- Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction
- Visual Programming: Compositional visual reasoning without training
- Planning-oriented Autonomous Driving
- Passive Ultra-Wideband Single-Photon Imaging
- Adding Conditional Control to Text-to-Image Diffusion Models
Learn more about the Technical Community on Pattern Analysis and Machine Intelligence (TCPAMI) and search upcoming events in pattern recognition, computer vision, AI, and more.
©IEEE — All rights reserved. Use of this website signifies your agreement to the IEEE Terms and Conditions.
A not-for-profit organization, the Institute of Electrical and Electronics Engineers (IEEE) is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.
Title: Rank-based No-reference Quality Assessment for Face Swapping
Abstract: Face swapping has become a prominent research area in computer vision and image processing due to rapid technological advancements. Most face swapping methods measure quality using distances between the manipulated image and the source or target image, which presupposes that suitable reference face images are available. There is therefore still a gap in accurately assessing face swapping quality in reference-free scenarios. In this study, we present a novel no-reference image quality assessment (NR-IQA) method specifically designed for face swapping. We address the gap by constructing a comprehensive large-scale dataset, implementing a method for ranking image quality based on multiple facial attributes, and incorporating a Siamese network based on interpretable qualitative comparisons. Our model demonstrates state-of-the-art performance in the quality assessment of swapped faces, providing coarse- and fine-grained assessment. Enhanced by this metric, an improved face-swapping model achieves better results with respect to expressions and poses. Extensive experiments confirm the superiority of our method over existing general no-reference image quality assessment metrics and the latest facial image quality assessment metric, making it well suited for evaluating face swapping images in real-world scenarios.
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
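The abstract above describes ranking image quality and comparing images with a Siamese network, but it does not state the training objective. As a rough illustration only, the sketch below uses a pairwise margin ranking (hinge) loss, which is one common choice for rank-based Siamese training and is an assumption here, not the authors' stated method. The names margin_ranking_loss, mean_pairwise_loss and toy_score are hypothetical, and toy_score is a stand-in for the shared network branch.

```python
def margin_ranking_loss(score_better, score_worse, margin=1.0):
    """Hinge loss: zero once the better image outscores the worse one by `margin`."""
    return max(0.0, margin - (score_better - score_worse))


def mean_pairwise_loss(pairs, score_fn, margin=1.0):
    """Average the ranking loss over (better, worse) pairs.

    The same score_fn is applied to both items of a pair, which mirrors
    the weight sharing of a Siamese network.
    """
    losses = [margin_ranking_loss(score_fn(b), score_fn(w), margin) for b, w in pairs]
    return sum(losses) / len(losses)


def toy_score(features):
    """Hypothetical stand-in for the shared branch: a fixed linear scorer."""
    weights = [0.5, 0.25]
    return sum(w * x for w, x in zip(weights, features))


# Each pair is (features of the higher-ranked image, features of the lower-ranked image).
pairs = [
    ([4.0, 2.0], [1.0, 0.0]),  # correctly ordered with a wide gap: no loss
    ([1.0, 1.0], [2.0, 2.0]),  # ordered the wrong way round: penalised
]

print(mean_pairwise_loss(pairs, toy_score))
```

In this sketch only the relative order within each pair matters, so no reference image or absolute quality label is needed, which is the property a no-reference metric relies on.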
IMAGES
VIDEO
COMMENTS
Because deep learning exhibits strong advantages in feature extraction, it has been widely used in computer vision and other fields, and has gradually replaced traditional machine learning algorithms. This paper first reviews the main ideas of deep learning and presents several frequently used algorithms for computer vision. Afterwards, the current research status of ...
With the development of artificial intelligence, computer vision technology that simulates human vision has received widespread attention. Based on deep learning, the method currently in common use in computer vision technology, this paper outlines the development of deep learning models, and determines the inflection point of the development of the introduction of convolutional neural networks ...
In the previous research, many scholars (Dosovitskiy and Brox, ... They commented in their paper "computer vision is already being put to questionable use and as researchers, we have a responsibility to at least consider the harm our work might be doing and think of ways to mitigate it. ... Proceedings of the 2017 ieee conference on computer ...
The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) aims to initiate further discussion within computer vision applications and research. In 2022, it was encouraged that researchers submit papers and proposals including potential negative societal impacts of their proposed research and possible methods on how to mitigate them.
In robotics, computer vision modeling empowers robots to perceive and interact with their environments effectively. This paper presents a groundbreaking effort to develop a computer vision model tailored to underwater robotic systems aimed explicitly at detecting multiple objects of interest beneath the water's surface. The primary focus of our model lies in accurately differentiating between ...
The Thirty-Fourth IEEE/CVF Conference on Computer Vision and Pattern Recognition. Main conference ... original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to: 3D from multi-view and sensors ... All accepted papers will be made publicly available by the Computer Vision ...
Abstract: The author provides a general introduction to computer vision. He discusses basic techniques and computer implementations, and also indicates areas in which further research is needed. He focuses on two-dimensional object recognition, i.e. recognition of an object whose spatial orientation, relative to the viewing direction is known.
These CVPR 2021 papers are the Open Access versions, provided by the Computer Vision Foundation. Except for the watermark, they are identical to the accepted versions; the final published version of the proceedings is available on IEEE Xplore. This material is presented to ensure timely dissemination of scholarly and technical work.
Evolutionary computer vision (ECV) is at the intersection of two major research fields of artificial intelligence: computer vision and evolutionary computation. This special issue aims to provide an overview of state-of-the-art contributions to the latest research and development in the discipline. Computer vision includes methods for acquiring ...
Reviews Released: 24 January 2023. Rebuttal Period: 24-31 January 2023. Final Decisions: 27 February 2023. Papers in the main technical program must describe high-quality, original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to: 3D from multi-view and sensors.
Computer vision deals with methods for the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. To meet new challenges, the system proposed in this paper is used mainly for object detection and image recognition, so that image search engines become more effective. A large dataset of ...
Computer Vision Foundation. These research papers are the Open Access versions, provided by the Computer Vision Foundation. Except for the watermark, they are identical to the accepted versions; the final published version of the proceedings is available on IEEE Xplore. This material is presented to ensure timely dissemination of scholarly and ...
With the gradual improvement of artificial intelligence technology, image processing has become a common technology and is widely used in various fields to provide people with high-quality services. Starting from computer vision algorithms and image processing technologies, the computer vision display system is designed, and image distortion correction algorithms are explored for reference.
S3 Gaussian: Self-Supervised Street Gaussians for Autonomous Driving. Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang. Comments: Code is available at: this https URL. Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Therefore, this paper makes an in-depth study of computer vision image multimedia technology under the background of big data. In the research, this paper systematically describes the current computer vision image multimedia technology, as well as the future development direction of computer vision image multimedia technology.
Computer Vision and Image Analysis: Past, Present, and Future Trends Ying Bi, Member, IEEE, Bing Xue, Senior Member, IEEE, Pablo Mesejo, Stefano Cagnoni, Senior Member, IEEE, Mengjie Zhang, Fellow, IEEE, Abstract—Computer vision (CV) is a big and important field in artificial intelligence covering a wide range of applications.
In the field of fire safety, enhanced detection systems are crucial for mitigating hazards and facilitating rapid responses. This research presents a novel hybrid and layered fire detection system that integrates machine learning (ML) and computer vision (CV) methodologies with extensive categorization to enhance fire protection in office buildings. To accurately evaluate and detect fire ...
Machine learning and computer vision research is still evolving [1]. Computer vision is an essential part of the Internet of Things, the Industrial Internet of Things, and brain-human interfaces. Complex human activities are recognized and monitored in multimedia streams using machine learning and computer vision.
Abstract: This research focuses on constructing an efficient image processing model, which is rooted in computer vision algorithms, to ameliorate image distortion and optimize visual display systems. The article initially discusses the significance of image distortion correction and its processing flow, encompassing detailed steps ranging from image preprocessing, model construction, parameter ...
WACV 2024 solicits high-quality, original submissions describing research in computer vision, with a particular emphasis on systems and applications with significant, interesting vision components. Application areas include, but are not limited to: Authors are also encouraged to submit more traditional computer vision algorithms papers. Topics ...
Stay informed on the latest trending ML papers with code, research developments, libraries, methods, and datasets. ... You can create a new account if you don't have one. Browse SoTA > Computer Vision Computer Vision. 4685 benchmarks • 1446 tasks • 3046 datasets • 48211 papers with code Semantic Segmentation ... 5342 papers with code
Computer vision technology plays a very important role in the development of artificial intelligence, but its key technologies often perform poorly in application. Traditional neural network algorithms cannot solve the problems of image classification and inaccurate image detection in computer visual perception tasks. In the tide of artificial intelligence, the combination of ...
Special Issue on Computer Vision and Games: Call for Papers. Video Games and Computer Vision research have long held a symbiotic relationship. On the one hand, virtual worlds in games are often used for collecting training data or as testbeds for computer vision models since they provide a greater deal of flexibility, control and scalability in ...
TCPAMI's Grand Collection is a compendium of computer vision research that received the Best Paper Award in 2023 from CVPR, WACV, ICCV, and ICCP, covering a range of topics from decision-making of autonomous driving systems to visual programming techniques. These conferences, organized by the Technical Community for Pattern Analysis and Machine ...
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image ...
In this paper, a tourism information forecasting system in computer aided environment is established. In this way, the sparsity of various data can be effectively integrated. In particular, forecasts for foreign visitors have been optimized. This paper aims to combine machine vision and deep learning to build a new efficient multi-source information perception method. According to the basic ...
All accepted full papers will be published by IEEE (ISBN: 979-8-3503-8843-5) and will be submitted to IEEE Xplore, EI Compendex, Scopus and Inspec for indexing. Important Dates: Full Paper ...