computer vision research papers ieee


June 18-22, 2023, Vancouver, Canada

The Thirty-Fourth IEEE/CVF Conference on Computer Vision and Pattern Recognition

Submission website: https://openreview.net/group?id=thecvf.com/CVPR/2023/Conference

Call for Papers

Papers in the main technical program must describe high-quality, original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to:

Important Dates

Paper Registration Deadline*: Fri, Nov 4, 2022 11:59pm Pacific Time
Submission Deadline*: Fri, Nov 11, 2022 11:59pm Pacific Time
Supplementary Materials Deadline: Fri, Nov 18, 2022 11:59pm Pacific Time
Reviews Released: Tue, Jan 24, 2023
Rebuttal Period: Tue, Jan 24-31, 2023
Final Decisions: Mon, Feb 27, 2023

* Date is fixed, no extension will be given

Paper Submission

All submissions will be handled electronically via the OpenReview conference submission website https://openreview.net/group?id=thecvf.com/CVPR/2023/Conference . All authors must agree to the policies stipulated below. The submission deadline is November 11, 2022 and will not be changed. Supplementary materials can be submitted until November 18, 2022.

In submitting a manuscript to CVPR, authors acknowledge that no paper substantially similar in content has been or will be submitted to another conference or workshop during the review period (November 11, 2022 – February 27, 2023). Please refer to the Author Guidelines on the conference web site for additional details on dual submissions and guidelines concerning prior work.

By submitting a paper to CVPR, the authors agree to the review process and understand that papers are processed by OpenReview to match each manuscript to the best possible area chairs and reviewers.

All accepted papers will be made publicly available by the Computer Vision Foundation (CVF) two weeks before the conference. Authors wishing to submit a patent understand that the paper's official public disclosure is two weeks before the conference or whenever the authors make it publicly available, whichever is first. More information about CVF is available at http://www.cv-foundation.org/ .

Tutorials and Workshops

In addition to the main technical program, the conference will include tutorials and workshops. Information about these can be found on tabs on the main CVPR web page:

Call for Tutorials
Call for Workshop Proposals

For further information and updates about the conference, visit the main conference website, at http://cvpr2023.thecvf.com .

Organizing Committee

Conference producer, general chair, program chair, tutorial chair, workshop chair, social chair, senior pami-tc ombud, conference ombud, virtual platform chair, technical chair, social activities chair, demonstration chair, local chair, publications chair, publicity chair, finance chair, accessibility chair, doctoral consortium chair, program advisory board, web developer.

CVF Sponsored Conferences

CVF Sponsored Conferences Errata

It is the policy of the Computer Vision Foundation to maintain PDF copies of conference papers as submitted during the camera-ready paper collection. These papers are considered the final published versions of the work. We recognize the need for minor corrections after publication, and thus provide links to arXiv versions of the papers where available. If a correction must be made, it should be made as an update to the arXiv version of the paper by the authors. The CVF maintainers should then be notified of the update via email ( [email protected] ). The conference open access website will be updated periodically to indicate changes made to an arXiv version since the original conference publication date. The original camera-ready version of the paper will be maintained within the open access archive, and will not be removed or replaced by request.


WACV 2024 Call for Papers

The IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) provides a forum for computer vision researchers working on practical applications and innovative algorithms to share their latest developments. WACV 2024 solicits high-quality, original submissions describing research in computer vision, with a particular emphasis on systems and applications with significant, interesting vision components.

Application areas include, but are not limited to:

Agriculture
Animals and Insects
Arts, games, and social media
Autonomous driving
Biomedical, healthcare, and medicine
Commercial and retail
Embedded sensing and real-time techniques
Environmental monitoring, climate change, and ecology
Food science and nutrition
Psychology and cognitive science
Remote sensing
Smartphones and end-user devices
Social good
Structural engineering and civil engineering
Virtual and augmented reality
Visualization

Authors are also encouraged to submit more traditional computer vision algorithms papers. Topics of interest include, but are not limited to:

3D computer vision
Adversarial learning, adversarial attack, and defense methods
Biometrics, face, gesture, and body pose
Computational photography
Generative models for image, video, 3D, etc.
Datasets and evaluations
Explainable, fair, accountable, privacy-preserving, and ethical computer vision
Image recognition and understanding (object detection, categorization, segmentation, scene modeling, visual reasoning)
Low-level and physics-based vision
Machine learning architectures, formulations, and algorithms (including transfer, low-shot, semi-, self-, and unsupervised learning)
Video recognition and understanding (tracking, action recognition, etc.)
Vision + language and/or other modalities

All submissions will be handled electronically through CMT: https://cmt3.research.microsoft.com/WACV2024

Papers can be submitted to either the applications or the algorithms tracks, which will have different review criteria. Applications papers will be evaluated on systems-level innovation, novelty of the domain and comparative assessment. Algorithms papers will be evaluated according to the standard conference criteria including algorithmic novelty and quantified evaluation against current, alternative approaches.

Deadlines (note that all PT timezone deadlines are AM, not PM):

Round 1:
Paper registration: June 21st, 2023 11:59 AM PT (June 21st, 2023 06:59 PM GMT)
Submission: June 28th, 2023 11:59 AM PT (June 28th, 2023 06:59 PM GMT)
Supplementary material deadline: June 30th, 2023 11:59 AM PT (June 30th, 2023 06:59 PM GMT)
Reviews and Decisions released to authors: Aug 11th, 2023 Aug 14th, 2023
Rebuttal and Revision submission: Aug. 30th, 2023 11:59 AM PT (Aug. 30th, 2023 06:59 PM GMT)
Final Decisions released to authors: Oct 20th, 2023
Camera ready deadline: November 7th, 11:59 PM PT
Author registration deadline: November 7th, 11:59 PM PT

Round 2:
Paper registration: Aug. 23rd, 2023 11:59 AM PT (Aug. 23rd, 2023 06:59 PM GMT)
Submission: Aug. 30th, 2023 11:59 AM PT (Aug. 30th, 2023 06:59 PM GMT)
Supplementary material deadline: September 1st, 2023 11:59 AM PT (September 1st, 2023 06:59 PM GMT)
Reviews and Final Decisions released to authors: Oct 20th, 2023
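
Authors who want to double-check the PT and GMT pairs listed in the deadlines can reproduce the conversion with a few lines of Python. This is only a sketch: it assumes the June to September 2023 deadlines fall under Pacific Daylight Time (UTC-7), and the helper name `pt_to_gmt` is made up for illustration. It is not part of any official WACV tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the summer 2023 deadlines are in Pacific Daylight Time (UTC-7).
# A fixed offset is used here so the sketch does not depend on a tz database.
PDT = timezone(timedelta(hours=-7), name="PDT")


def pt_to_gmt(year, month, day, hour, minute):
    """Convert a PDT wall-clock time to the equivalent GMT/UTC time."""
    local = datetime(year, month, day, hour, minute, tzinfo=PDT)
    return local.astimezone(timezone.utc)


# Round 1 submission deadline: June 28th, 2023 11:59 AM PT
submission = pt_to_gmt(2023, 6, 28, 11, 59)
print(submission.isoformat())
```

Note that a fixed UTC-7 offset would not be right for the November deadlines, when Pacific Time is normally on standard time (UTC-8).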

Camera ready submission instructions are here: https://docs.google.com/document/d/1UrN19rmmotCk10FajE1y0Zh0W4ruL3uGQHU_oEzK5pY/

As in previous years, WACV 2024 will employ a two-round review process. New papers can be submitted in either the first or the second round. The primary benefit of submitting in Round 1 is that submissions which are not accepted early in the first round can be revised and resubmitted along with a rebuttal, enabling authors to address reviewer concerns. Round 2 submissions will not have a rebuttal.

Please direct any questions to the Program Chairs ( [email protected] ).

Computer Vision

Semantic Segmentation

Tumor Segmentation

Panoptic Segmentation

3D Semantic Segmentation

Weakly-Supervised Semantic Segmentation

Representation learning.

Disentanglement

Graph representation learning, sentence embeddings.

Network Embedding

Classification.

Text Classification

Graph Classification

Audio Classification

Medical Image Classification

Object detection.

3D Object Detection

Real-Time Object Detection

RGB Salient Object Detection

Few-Shot Object Detection

Image classification.

Out of Distribution (OOD) Detection

Few-Shot Image Classification

Fine-Grained Image Classification

Semi-Supervised Image Classification

2d object detection.

Edge Detection

Thermal image segmentation.

Open Vocabulary Object Detection

Reinforcement learning (rl), off-policy evaluation, multi-objective reinforcement learning, 3d point cloud reinforcement learning, deep hashing, table retrieval, domain adaptation.

Unsupervised Domain Adaptation

Domain Generalization

Test-time Adaptation

Source-free domain adaptation, image generation.

Image-to-Image Translation

Text-to-Image Generation

Image Inpainting

Conditional Image Generation

Data augmentation.

Image Augmentation

Text Augmentation

Autonomous vehicles.

Autonomous Driving

Self-Driving Cars

Simultaneous Localization and Mapping

Autonomous Navigation

Image Denoising

Color Image Denoising

SAR Image Despeckling

Grayscale image denoising, meta-learning.

Few-Shot Learning

Sample Probing

Universal meta-learning, contrastive learning.

Super-Resolution

Image Super-Resolution

Video Super-Resolution

Multi-Frame Super-Resolution

Reference-based Super-Resolution

Pose estimation.

3D Human Pose Estimation

Keypoint Detection

3D Pose Estimation

6D Pose Estimation

Self-supervised learning.

Point Cloud Pre-training

Unsupervised video clustering, 2d semantic segmentation, image segmentation, text style transfer.

Scene Parsing

Reflection Removal

Visual question answering (vqa).

Visual Question Answering

Machine Reading Comprehension

Chart Question Answering

Embodied Question Answering

Depth Estimation

3D Reconstruction

Neural Rendering

3D Face Reconstruction

Anomaly detection.

Unsupervised Anomaly Detection

One-Class Classification

Supervised anomaly detection, anomaly detection in surveillance videos, sentiment analysis.

Aspect-Based Sentiment Analysis (ABSA)

Multimodal Sentiment Analysis

Aspect Sentiment Triplet Extraction

Twitter Sentiment Analysis

Temporal Action Localization

Video Understanding

Video generation.

Video Object Segmentation

Action Classification

3d object super-resolution.

One-Shot Learning

Few-Shot Semantic Segmentation

Cross-domain few-shot.

Unsupervised Few-Shot Learning

Activity recognition.

Action Recognition

Human Activity Recognition

Egocentric activity recognition.

Group Activity Recognition

Medical image segmentation.

Lesion Segmentation

Brain Tumor Segmentation

Cell Segmentation

Skin lesion segmentation, monocular depth estimation.

Stereo Depth Estimation

Depth and camera motion.

3D Depth Estimation

Exposure fairness, optical character recognition (ocr).

Active Learning

Handwriting Recognition

Handwritten digit recognition, irregular text recognition, instance segmentation.

Referring Expression Segmentation

3D Instance Segmentation

Real-time Instance Segmentation

Unsupervised Object Segmentation

Facial recognition and modelling.

Face Recognition

Face Swapping

Face Detection

Facial Expression Recognition (FER)

Face Verification

Object tracking.

Multi-Object Tracking

Visual Object Tracking

Multiple Object Tracking

Cell Tracking

Zero-shot learning.

Generalized Zero-Shot Learning

Compositional Zero-Shot Learning

Multi-label zero-shot learning, quantization, data free quantization, unet quantization, continual learning.

Class Incremental Learning

Continual named entity recognition, unsupervised class-incremental learning.

Action Recognition In Videos

3D Action Recognition

Self-supervised action recognition, few shot action recognition.

Scene Understanding

Scene Text Recognition

Scene Graph Generation

Scene Recognition

Adversarial attack.

Backdoor Attack

Adversarial Text

Adversarial attack detection, real-world adversarial attack, active object detection, image retrieval.

Sketch-Based Image Retrieval

Content-Based Image Retrieval

Composed Image Retrieval (CoIR)

Medical Image Retrieval

Dimensionality reduction.

Supervised dimensionality reduction

Online nonnegative cp decomposition, emotion recognition.

Speech Emotion Recognition

Emotion Recognition in Conversation

Multimodal Emotion Recognition

Emotion-cause pair extraction.

Monocular 3D Object Detection

3D Object Detection From Stereo Images

Multiview Detection

Robust 3d object detection, image reconstruction.

MRI Reconstruction

Film Removal

Style transfer.

Image Stylization

Font style transfer, style generalization, face transfer, optical flow estimation.

Video Stabilization

Action localization.

Action Segmentation

Spatio-temporal action localization, image captioning.

3D dense captioning

Controllable image captioning, aesthetic image captioning.

Relational Captioning

Person re-identification.

Unsupervised Person Re-Identification

Video-based person re-identification, generalizable person re-identification, cloth-changing person re-identification, image restoration.

Demosaicking

Spectral reconstruction, underwater image restoration.

JPEG Artifact Correction

Visual relationship detection, lighting estimation.

3D Room Layouts From A Single RGB Panorama

Road scene understanding, action detection.

Skeleton Based Action Recognition

Online Action Detection

Audio-visual active speaker detection, metric learning.

Object Recognition

3D Object Recognition

Continuous object recognition.

Depiction Invariant Object Recognition

Monocular 3D Human Pose Estimation

Pose prediction.

3D Multi-Person Pose Estimation

3d human pose and shape estimation, image enhancement.

Low-Light Image Enhancement

Image relighting, de-aliasing, multi-label classification.

Missing Labels

Extreme multi-label classification, hierarchical multi-label classification, medical code prediction, continuous control.

Steering Control

Drone controller, 3d face modelling.

Semi-Supervised Video Object Segmentation

Unsupervised Video Object Segmentation

Referring Video Object Segmentation

Video Salient Object Detection

Trajectory prediction.

Trajectory Forecasting

Human motion prediction, out-of-sight trajectory prediction.

Multivariate Time Series Imputation

Image quality assessment, no-reference image quality assessment, blind image quality assessment.

Aesthetics Quality Assessment

Stereoscopic image quality assessment, novel view synthesis.

Novel LiDAR View Synthesis

Ground video synthesis from satellite image

Object localization.

Weakly-Supervised Object Localization

Image-based localization, unsupervised object localization, monocular 3d object localization.

Blind Image Deblurring

Single-image blind deblurring, out-of-distribution detection, video semantic segmentation.

Camera shot segmentation

Cloud removal.

Facial Inpainting

Fine-Grained Image Inpainting

2d classification.

Face Generation

Neural Network Compression

Music Source Separation

Cell detection.

Plant Phenotyping

Instruction following, visual instruction following, change detection.

Semi-supervised Change Detection

Saliency detection.

Saliency Prediction

Co-Salient Object Detection

Video saliency detection, unsupervised saliency detection, prompt engineering.

Visual Prompting

Image compression.

Feature Compression

Jpeg compression artifact reduction.

Lossy-Compression Artifact Reduction

Color image compression artifact reduction, explainable artificial intelligence, explainable models, explanation fidelity evaluation, fad curve analysis, image registration.

Unsupervised Image Registration

Ensemble learning, salient object detection, saliency ranking, visual reasoning.

Visual Commonsense Reasoning

Visual tracking.

Point Tracking

Rgb-t tracking, real-time visual tracking.

RF-based Visual Tracking

3d point cloud classification.

3D Object Classification

Few-Shot 3D Point Cloud Classification

Supervised only 3d point cloud classification, zero-shot transfer 3d point cloud classification, motion estimation, image manipulation detection.

Zero Shot Skeletal Action Recognition

Generalized zero shot skeletal action recognition, whole slide images, visual grounding.

3D visual grounding

Person-centric visual grounding.

Phrase Extraction and Grounding (PEG)

Activity prediction, motion prediction, cyber attack detection, sequential skip prediction, gesture recognition.

Hand Gesture Recognition

Hand-Gesture Recognition

RF-based Gesture Recognition

Video question answering.

Zero-Shot Video Question Answer

Few-shot video question answering, text detection, video captioning.

Dense Video Captioning

Boundary captioning, visual text correction, audio-visual video captioning, point cloud registration.

Image to Point Cloud Registration

Robust 3D Semantic Segmentation

Real-Time 3D Semantic Segmentation

Unsupervised 3D Semantic Segmentation

Furniture segmentation, medical diagnosis.

Alzheimer's Disease Detection

Retinal OCT Disease Classification

Blood cell count, thoracic disease classification, 3d point cloud interpolation, visual odometry.

Face Anti-Spoofing

Monocular visual odometry.

Hand Pose Estimation

Hand Segmentation

Gesture-to-gesture translation, rain removal.

Single Image Deraining

Image clustering.

Online Clustering

Face Clustering

Multi-view subspace clustering, multi-modal subspace clustering.

Image Dehazing

Single Image Dehazing

Colorization.

Line Art Colorization

Point-interactive Image Colorization

Color Mismatch Correction

Robot navigation.

PointGoal Navigation

Social navigation.

Sequential Place Learning

Image manipulation, conformal prediction, image editing, rolling shutter correction, shadow removal, multimodel-guided image editing, joint deblur and frame interpolation, multimodal fashion image editing, visual place recognition.

Indoor Localization

3d place recognition, visual localization.

DeepFake Detection

Synthetic Speech Detection

Human detection of deepfakes, multimodal forgery detection.

Unsupervised Image-To-Image Translation

Synthetic-to-Real Translation

Multimodal Unsupervised Image-To-Image Translation

Cross-View Image-to-Image Translation

Fundus to Angiography Generation

Stereo matching.

Crowd Counting

Visual Crowd Analysis

Group detection in crowds, object reconstruction.

3D Object Reconstruction

Human-object interaction detection.

Affordance Recognition

Earth observation, image deblurring, low-light image deblurring and enhancement, image matching.

Semantic correspondence

Patch matching, set matching.

Matching Disparate Images

Video quality assessment, video alignment, temporal sentence grounding, long-video activity recognition, point cloud classification, jet tagging, few-shot point cloud classification, hyperspectral.

Hyperspectral Image Classification

Hyperspectral unmixing, hyperspectral image segmentation, classification of hyperspectral images, document text classification.

Learning with noisy labels

Multi-label classification of biomedical texts, political salient issue orientation detection, 3d point cloud reconstruction.

Weakly Supervised Action Localization

Weakly-supervised temporal action localization.

Temporal Action Proposal Generation

Activity recognition in videos, scene classification.

2D Human Pose Estimation

Action anticipation.

3D Face Animation

Semi-supervised human pose estimation, point cloud generation, point cloud completion, referring expression, reconstruction, 3d human reconstruction.

Single-View 3D Reconstruction

4d reconstruction, single-image-based hdr reconstruction, compressive sensing, keyword spotting.

Small-Footprint Keyword Spotting

Visual keyword spotting, scene text detection.

Curved Text Detection

Multi-oriented scene text detection, camera calibration, boundary detection.

Junction Detection

Image matting.

Semantic Image Matting

Video retrieval, video-text retrieval, video grounding, video-adverb retrieval, replay grounding, composed video retrieval (covr), cross-modal retrieval, image-text matching, cross-modal retrieval with noisy correspondence, multilingual cross-modal retrieval.

Zero-shot Composed Person Retrieval

Cross-modal retrieval on rsitmd, document ai, document understanding, emotion classification.

Motion Synthesis

Motion Style Transfer

Temporal human motion composition, video summarization.

Unsupervised Video Summarization

Supervised video summarization, point cloud segmentation, superpixels.

Sensor Fusion

Few-Shot Transfer Learning for Saliency Prediction

Aerial Video Saliency Prediction

3d anomaly detection, video anomaly detection, artifact detection, remote sensing.

Remote Sensing Image Classification

Change detection for remote sensing images, building change detection for remote sensing images.

Segmentation Of Remote Sensing Imagery

The Semantic Segmentation Of Remote Sensing Imagery

Document layout analysis.

Talking Head Generation

Talking face generation.

Face Age Editing

Facial expression generation, kinship face generation.

Point cloud reconstruction

3D Semantic Scene Completion

3D Semantic Scene Completion from a single RGB image

Garment reconstruction, human detection.

Video Instance Segmentation

Privacy Preserving Deep Learning

Membership inference attack.

Generalized Few-Shot Semantic Segmentation

Video editing, video temporal consistency, virtual try-on.

Generalized Referring Expression Segmentation

Scene flow estimation.

Self-supervised Scene Flow Estimation

Object discovery, 3d classification, depth completion.

Motion Forecasting

Multi-Person Pose forecasting

Multiple Object Forecasting

Face reconstruction, gaze estimation.

CARLA MAP Leaderboard

Dead-reckoning prediction.

Text-guided image editing

Text-based image editing, concept alignment.

Zero-Shot Text-to-Image Generation

Conditional text-to-image synthesis, texture synthesis, machine unlearning, continual forgetting, sign language recognition.

MULTI-VIEW LEARNING

Incomplete multi-view clustering, interactive segmentation.

Breast Cancer Detection

Skin cancer classification.

Breast Cancer Histology Image Classification

Lung cancer diagnosis, classification of breast cancer histology images, image recognition, fine-grained image recognition, license plate recognition, material recognition, disease prediction, disease trajectory forecasting, gait recognition.

Multiview Gait Recognition

Gait recognition in the wild, human parsing.

Multi-Human Parsing

Scene generation, weakly supervised segmentation, object counting, training-free object counting, open-vocabulary object counting, pose tracking.

3D Human Pose Tracking

3D Multi-Person Pose Estimation (absolute)

3D Multi-Person Pose Estimation (root-relative)

3D Multi-Person Mesh Recovery

Event-based vision.

Event-based Optical Flow

Event-Based Video Reconstruction

Event-based motion estimation, interest point detection, homography estimation.

3D Hand Pose Estimation

Facial landmark detection.

Unsupervised Facial Landmark Detection

3D Facial Landmark Localization

3d character animation from a single photo, scene segmentation.

Dichotomous Image Segmentation

Activity detection, inverse rendering, temporal localization.

Language-Based Temporal Localization

Temporal defect localization, multi-label image classification.

Multi-label Image Recognition with Partial Labels

Text-to-video generation, text-to-video editing, subject-driven video generation, 3d object tracking.

3D Single Object Tracking

Template matching, camera localization.

Camera Relocalization

Lidar semantic segmentation, motion segmentation, visual dialog.

Relation Network

Text spotting.

Intelligent Surveillance

Vehicle Re-Identification

Few-shot class-incremental learning, class-incremental semantic segmentation, non-exemplar-based class incremental learning, disparity estimation.

Handwritten Text Recognition

Handwritten document recognition, unsupervised text recognition, knowledge distillation.

Data-free Knowledge Distillation

Self-knowledge distillation, moment retrieval.

Zero-shot Moment Retrieval

Text to video retrieval, partially relevant video retrieval, decision making under uncertainty.

Uncertainty Visualization

Person search, semi-supervised object detection.

Shadow Detection

Shadow Detection And Removal

Unconstrained Lip-synchronization

Mixed reality, video inpainting.

Video Enhancement

Cross-corpus

Micro-expression recognition, micro-expression spotting.

3D Facial Expression Recognition

Smile Recognition

3D Multi-Object Tracking

Real-time multi-object tracking, multi-animal tracking with identification, trajectory long-tail distribution for multi-object tracking, grounded multiple object tracking, future prediction, human mesh recovery.

Stereo Image Super-Resolution

Burst image super-resolution, satellite image super-resolution, multispectral image super-resolution.

Face Image Quality Assessment

Lightweight face recognition.

Age-Invariant Face Recognition

Synthetic face recognition, face quality assessment, image categorization, fine-grained visual categorization, zero shot segmentation, physics-informed machine learning, soil moisture estimation, video reconstruction.

Overlapped 10-1

Overlapped 15-1, overlapped 15-5, disjoint 10-1, disjoint 15-1, color constancy.

Few-Shot Camera-Adaptive Color Constancy

Hdr reconstruction, multi-exposure image fusion, open vocabulary semantic segmentation, zero-guidance segmentation, deep attention, line detection, visual recognition.

Fine-Grained Visual Recognition

Sign language translation.

Image Cropping

Stereo matching hand.

3D Absolute Human Pose Estimation

Text-to-Face Generation

Image forensics, tone mapping, zero-shot action recognition, natural language transduction, novel class discovery.

Breast Cancer Histology Image Classification (20% labels)

Transparent object detection, transparent objects, video restoration.

Analog Video Restoration

Abnormal event detection in video.

Semi-supervised Anomaly Detection

Surface normals estimation.

Vision-Language Navigation

hand-object pose

Grasp Generation

3D Canonical Hand Pose Estimation

Cross-domain few-shot learning, texture classification, image animation.

Infrared And Visible Image Fusion

Image to 3D

Probabilistic deep learning, unsupervised few-shot image classification, generalized few-shot classification, action quality assessment, highlight detection, pedestrian attribute recognition.

Steganalysis

Sketch Recognition

Face Sketch Synthesis

Drawing pictures.

Photo-To-Caricature Translation

Spoof detection, face presentation attack detection, detecting image manipulation, cross-domain iris presentation attack detection, finger dorsal image spoof detection, computer vision techniques adopted in 3d cryogenic electron microscopy, single particle analysis, cryogenic electron tomography, iris recognition, pupil dilation.

One-shot visual object segmentation

Unbiased Scene Graph Generation

Panoptic Scene Graph Generation

Image to video generation.

Unconditional Video Generation

Action understanding, automatic post-editing.

Dense Captioning

Document image classification.

Image Stitching

Multi-View 3D Reconstruction

Person retrieval, segmentation, open-vocabulary semantic segmentation, universal domain adaptation, surgical phase recognition, online surgical phase recognition, offline surgical phase recognition, blind face restoration.

Face Reenactment

Geometric Matching

Human action generation.

Action Generation

Object categorization, text based person retrieval, diffusion personalization.

Diffusion Personalization Tuning Free

Efficient Diffusion Personalization

Human dynamics.

3D Human Dynamics

Meme classification, hateful meme classification, severity prediction, intubation support prediction, cloud detection.

Text-To-Image

Story visualization, complex scene breaking and synthesis, image fusion, pansharpening, image deconvolution.

Image Outpainting

Table Recognition

Object segmentation.

Camouflaged Object Segmentation

Landslide segmentation, text-line extraction, point clouds, point cloud video understanding, point cloud representation learning.

Semantic SLAM

Object SLAM

Intrinsic image decomposition, line segment detection, sports analytics, situation recognition, grounded situation recognition, image shadow removal, motion detection, multi-target domain adaptation, person identification, visual prompt tuning, single-source domain generalization, evolving domain generalization, source-free domain generalization.

Robot Pose Estimation

Camouflaged Object Segmentation with a Single Task-generic Prompt

Image morphing, rotated mnist, weakly-supervised instance segmentation, image smoothing, fake image detection.

GAN image forensics

Fake Image Attribution

Image steganography, occlusion handling, contour detection.

Crop Classification

Face image quality, lane detection.

3D Lane Detection

Layout design, license plate detection.

Video Panoptic Segmentation

Viewpoint estimation.

Drone navigation

Drone-view target localization, value prediction, body mass index (bmi) prediction, crop yield prediction, multi-object tracking and segmentation.

Zero-Shot Transfer Image Classification

3D Object Reconstruction From A Single Image

CAD Reconstruction

3d point cloud linear classification, multiview learning, person recognition.

Photo Retouching

Motion retargeting, shape representation of 3d point clouds, bird's-eye view semantic segmentation.

Dense Pixel Correspondence Estimation

Human part segmentation.

Document Shadow Removal

Symmetry detection, traffic sign detection, video style transfer, referring image matting.

Referring Image Matting (Expression-based)

Referring Image Matting (Keyword-based)

Referring Image Matting (RefMatte-RW100)

Referring image matting (prompt-based), human interaction recognition, one-shot 3d action recognition, mutual gaze, affordance detection.

Gaze Prediction

Hand detection, image forgery detection, image instance retrieval, amodal instance segmentation, image quality estimation.

Image Similarity Search

Precipitation Forecasting

Referring expression generation, road damage detection.

Space-time Video Super-resolution

Video matting.

Open Vocabulary Attribute Detection

Inverse tone mapping, image/document clustering, self-organized clustering, instance search.

Audio Fingerprint

Open-World Semi-Supervised Learning

Semi-supervised image classification (cold start), 3d shape modeling.

Action Analysis

Art analysis, facial editing.

Food Recognition

Holdout Set

Material classification.

Motion Magnification

Multispectral object detection, semi-supervised instance segmentation, binary classification, LLM-generated text detection, cancer-no cancer per breast classification, cancer-no cancer per image classification, suspicious (BIRADS 4,5)-no suspicious (BIRADS 1,2,3) per image classification, cancer-no cancer per view classification, video segmentation, camera shot boundary detection, open-vocabulary video segmentation, open-world video segmentation, lung nodule classification, lung nodule 3D classification, lung nodule detection, lung nodule 3D detection, 3D scene reconstruction.

Zero-Shot Composed Image Retrieval (ZS-CIR)

Event segmentation, generic event boundary detection, image retouching, image-variation, jpeg artifact removal, point cloud super resolution, skills assessment.

Text-based Person Retrieval

Sensor Modeling

Handwriting verification, bangla spelling error correction, video prediction, earth surface forecasting, predict future video frames, ad-hoc video search, audio-visual synchronization, handwriting generation, pose retrieval, scanpath prediction, scene change detection.

Sketch-to-Image Translation

Skills evaluation, synthetic image detection, highlight removal, 3d shape reconstruction from a single 2d image.

Shape from Texture

Deception detection, deception detection in videos.

Video Visual Relation Detection

Human-object relationship detection, 3d open-vocabulary instance segmentation.

3D Shape Representation

3D Dense Shape Correspondence

Bird's-eye view object detection.

Multiple People Tracking

Network Interpretation

Rgb-d reconstruction, seeing beyond the visible, semi-supervised domain generalization, unsupervised semantic segmentation.

Unsupervised Semantic Segmentation with Language-image Pre-training

Multiple object tracking with transformer.

Multiple Object Track and Segmentation

Constrained lip-synchronization, face dubbing, vietnamese visual question answering, explanatory visual question answering, 3d shape reconstruction, 4d panoptic segmentation, defocus blur detection, event data classification, image comprehension, image manipulation localization, instance shadow detection, kinship verification, medical image enhancement, open vocabulary panoptic segmentation, single-object discovery, training-free 3d point cloud classification, video forensics.

Sequential Place Recognition

Autonomous flight (dense forest), autonomous web navigation, 2d pose estimation, category-agnostic pose estimation, overlapping pose estimation.

Generative 3D Object Classification

Cube engraving classification, facial expression recognition, cross-domain facial expression recognition, zero-shot facial expression recognition, multimodal machine translation.

Face to Face Translation

Multimodal lexical translation, 10-shot image generation, 2d semantic segmentation task 3 (25 classes), document enhancement, action assessment, bokeh effect rendering, drivable area detection, face anonymization, font recognition, horizon line estimation, image imputation.

Long Video Retrieval (Background Removed)

Medical image denoising.

Occlusion Estimation

Physiological computing.

Lake Ice Monitoring

Short-term object interaction anticipation, spatio-temporal video grounding, text-based person retrieval with noisy correspondence.

Unsupervised 3D Point Cloud Linear Evaluation

Wireframe parsing, gaze redirection, single-image-generation, unsupervised anomaly detection with specified settings -- 30% anomaly, root cause ranking, anomaly detection at 30% anomaly, anomaly detection at various anomaly percentages.

Unsupervised Contextual Anomaly Detection

Landmark tracking, muscle tendon junction identification, mistake detection, online mistake detection, 3d object captioning, 3d semantic occupancy prediction, animated gif generation, generalized referring expression comprehension, image deblocking, infrared image super-resolution, motion disentanglement, persuasion strategies, scene text editing, image to sketch recognition, traffic accident detection, accident anticipation, unsupervised landmark detection, visual speech recognition, lip to speech synthesis, continual anomaly detection, weakly supervised action segmentation (transcript), weakly supervised action segmentation (action set), calving front delineation in synthetic aperture radar imagery, calving front delineation in synthetic aperture radar imagery with fixed training amount.

Handwritten Line Segmentation

Handwritten word segmentation.

General Action Video Anomaly Detection

Physical video anomaly detection, monocular cross-view road scene parsing (road), monocular cross-view road scene parsing (vehicle).

Transparent Object Depth Estimation

3d scene editing, age and gender estimation, data ablation.

Occluded Face Detection

Gait identification, historical color image dating, stochastic human motion prediction, image retargeting, image and video forgery detection, motion captioning, personality trait recognition, personalized segmentation, scene-aware dialogue, spatial relation recognition, spatial token mixer, steganographics, story continuation.

Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly

Unsupervised anomaly detection with specified settings -- 1% anomaly, unsupervised anomaly detection with specified settings -- 10% anomaly, unsupervised anomaly detection with specified settings -- 20% anomaly, vehicle speed estimation, visual analogies, visual social relationship recognition, zero-shot text-to-video generation, text-guided-generation, video frame interpolation, 3d video frame interpolation, unsupervised video frame interpolation.

eXtreme-Video-Frame-Interpolation

Continual semantic segmentation, overlapped 5-3, overlapped 25-25, micro-expression generation, micro-expression generation (megc2021), period estimation, art period estimation (544 artists), unsupervised panoptic segmentation, unsupervised zero-shot panoptic segmentation, 3d rotation estimation, camera auto-calibration, defocus estimation, derendering, fingertip detection, hierarchical text segmentation, human-object interaction concept discovery.

One-Shot Face Stylization

Speaker-specific lip to speech synthesis, multi-person pose estimation, neural stylization.

Part-aware Panoptic Segmentation

Population Mapping

Pornography detection, prediction of occupancy grid maps, raw reconstruction, repetitive action counting, svbrdf estimation, semi-supervised video classification, spectrum cartography, supervised image retrieval, synthetic image attribution, training-free 3d part segmentation, unsupervised image decomposition, video propagation, vietnamese multimodal learning, weakly supervised 3d point cloud segmentation, weakly-supervised panoptic segmentation, drone-based object tracking, brain visual reconstruction, brain visual reconstruction from fmri.

Human-Object Interaction Generation

Image-guided composition, fashion understanding, semi-supervised fashion compatibility.

Intensity Image Denoising

Lifetime image denoising, observation completion, active observation completion, boundary grounding.

Video Narrative Grounding

3d inpainting, 3d scene graph alignment, 4d spatio temporal semantic segmentation.

Age Estimation

Few-shot Age Estimation

Brdf estimation, camouflage segmentation, clothing attribute recognition, damaged building detection, depth image estimation, detecting shadows, dynamic texture recognition.

Disguised Face Verification

Few shot open set object detection, gaze target estimation, generalized zero-shot learning - unseen, hd semantic map learning, human-object interaction anticipation, image deep networks, keypoint detection and image matching, manufacturing quality control, materials imaging, micro-gesture recognition, multi-person pose estimation and tracking.

Multi-modal Image Segmentation

Multi-object discovery, neural radiance caching.

Parking Space Occupancy

Partial Video Copy Detection

Multimodal Patch Matching

Perpetual view generation, procedure learning, prompt-driven zero-shot domain adaptation, jersey number recognition, photo to rest generalization, single-shot hdr reconstruction, on-the-fly sketch based image retrieval, thermal image denoising, trademark retrieval, unsupervised instance segmentation, unsupervised zero-shot instance segmentation, vehicle key-point and orientation estimation.

Video Individual Counting

Video-adverb retrieval (unseen compositions), video-to-image affordance grounding.

Vietnamese Scene Text

Visual sentiment prediction, human-scene contact detection, localization in video forgery, video classification, student engagement level detection (four class video classification), multi class classification (four-level video classification), 3d canonicalization, 3d surface generation.

Visibility Estimation from Point Cloud

Amodal layout estimation, blink estimation, camera absolute pose regression, change data generation, constrained diffeomorphic image registration, continuous affect estimation, deep feature inversion, document image skew estimation, earthquake prediction, fashion compatibility learning.

Displaced People Recognition

Finger vein recognition, flooded building segmentation.

Future Hand Prediction

Generative temporal nursing, grounded multimodal named entity recognition, house generation, human fmri response prediction, hurricane forecasting, ifc entity classification, image declipping, image similarity detection.

Image Text Removal

Image-to-gps verification.

Image-based Automatic Meter Reading

Dial meter reading, indoor scene reconstruction, jpeg decompression.

Kiss Detection

Laminar-turbulent flow localisation.

Landmark Recognition

Brain landmark detection, corpus video moment retrieval, mllm evaluation: aesthetics, medical image deblurring, mental workload estimation, meter reading, motion expressions guided video segmentation, natural image orientation angle detection, multi-object colocalization, multilingual text-to-image generation, video emotion detection, nwp post-processing, occluded 3d object symmetry detection, open set video captioning, pso-convnets dynamics 1, pso-convnets dynamics 2, partial point cloud matching.

Partially View-aligned Multi-view Learning

Pedestrian Detection

Thermal Infrared Pedestrian Detection

Personality trait recognition by face, physical attribute prediction, point cloud semantic completion, point cloud classification dataset, point-of-no-return (pnr) temporal localization, pose contrastive learning, portrait generation, procedure step recognition, prostate zones segmentation, pulmonary vessel segmentation, pulmonary artery–vein classification, reference expression generation, safety perception recognition, interspecies facial keypoint transfer, specular reflection mitigation, specular segmentation, state change object detection, surface normals estimation from point clouds, train ego-path detection.

Transform A Video Into A Comics

Transparency separation, typeface completion.

Unbalanced Segmentation

Unsupervised Long Term Person Re-Identification

Video correspondence flow.

Key-Frame-based Video Super-Resolution (K = 15)

Zero-shot single object tracking, yield mapping in apple orchards, lidar absolute pose regression, opd: single-view 3d openable part detection, self-supervised scene text recognition, spatial-aware image editing, video narration captioning, spectral estimation, spectral estimation from a single rgb image, 3d prostate segmentation, aggregate xview3 metric, atomic action recognition, composite action recognition, calving front delineation from synthetic aperture radar imagery, computer vision transduction, crosslingual text-to-image generation, zero-shot dense video captioning, document to image conversion, frame duplication detection, geometrical view, hyperview challenge.

Image Operation Chain Detection

Kinematic based workflow recognition, logo recognition.

MLLM Aesthetic Evaluation

Motion detection in non-stationary scenes, open-set video tagging, satellite orbit determination.

Segmentation Based Workflow Recognition

2d particle picking, small object detection.

Rice Grain Disease Detection

Sperm morphology classification, video & kinematic base workflow recognition, video based workflow recognition, video, kinematic & segmentation base workflow recognition, animal pose estimation.

IEEE Transactions on Games
Special Issues

Open Special Issue Calls

Special Issue on Computer Vision and Games - Call for Papers

Video games and computer vision research have long held a symbiotic relationship. On the one hand, virtual worlds in games are often used for collecting training data or as testbeds for computer vision models, since they offer greater flexibility, control and scalability in the data collection process than the real world. On the other hand, computer vision advancements have enabled us to push the frontiers of what is possible within these artificial game worlds and have transformed the processes with which these worlds are created. However, significant research questions remain unaddressed both in the field (Computer Vision) and the domain (Games), including technical and engineering challenges.

This special issue invites research papers that aim to bridge the existing gaps between computer vision research and games engineering, with the goal of bringing together the games research community and the computer vision community, which have largely operated independently until now. We are inviting papers for two main tracks. The first track focuses on introducing novel techniques within computer vision research that can advance the field of digital games. The second track, instead, focuses on leveraging game technologies to advance state-of-the-art techniques in computer vision. The list of topics below is not exhaustive.

1) Computer Vision for Games

CV for game-playing, game testing and player modelling.

Data-driven CV to improve game graphics, animations, level-design, etc. as well as procedural content generation.

HCI through visual interfaces (gestures, posture, gaze, etc.).

Extended reality games.

Synthetic data and media generation based on users' emotions, behaviour, etc.

Improving real-time applicability of vision models integrated within games and game engines.

2) Games for Computer Vision

Game worlds that aid data augmentation techniques.

Rich game-based labelled datasets for tasks such as object detection, segmentation, or depth and flow estimation.

Ethics of game-based data collection and inference.

Forward modelling in and for games.

Generalisation and robustness in vision models leveraging a plethora of existing commercial games.

Unsupervised pre-training of image/video representations and world transition models from gameplay data.

We invite the submission of high quality papers on the topics above in the full paper format. Authors should follow normal IEEE Transactions on Games guidelines for their submissions, but clearly identify their papers for this special issue during the submission process. Extended versions of previously published conference or workshop papers are welcome, provided that the journal paper is a significant extension, and is accompanied by a cover letter explaining the additional contribution. You may visit the submission guidelines for author information guidelines and page length limits.

Important Dates:

Paper submission: January 31, 2024

First decisions: May 31st, 2024

Early access SI publication (online): August 2024

Publication in print: End 2024

Guest Editors:

Chintan Trivedi (University of Malta)

Matthew Guzdial (University of Alberta)

Konstantinos Makantasis (University of Malta)

Julian Togelius (New York University)

Nicu Sebe (University of Trento)

Special Issue on Human-Centred AI in Game Evaluation - Call for Papers

Most games are consciously designed with a specific experience or vision in mind. Games are commonly designed for entertainment and competition purposes, but self-expression, social critique, targeted learning, knowledge discovery as well as physical and mental health are also valid design objectives. Determining whether an objective is fulfilled is often quite difficult due to the complexity of modern games and the variability of human responses. For this reason, games are commonly playtested before being published. However, playtests are expensive and time-consuming and not all aspects of the game can be evaluated to the full extent before it is published.

There is thus a need for more concentrated and systematic work on evaluating and characterising games, their artefacts and player experience. Researchers have proposed approaches intended to assist game designers using methods from the field of artificial and computational intelligence (AI and CI, respectively). Still, to our knowledge, there is a surprising lack of generality and validation regarding these methods, even in scientific publications on game design. No central repository for methods currently exists. In this special issue, we want to focus on human-centred AI approaches aiming for a more holistic and systematic approach to game evaluation. We thus seek submissions on related topics for this special issue.

The following is a non-comprehensive list of suggested topics:

Uses of AI agents to evaluate game content

Measures for game evaluation

Game evaluation and play-testing for AR/VR

Relationship between AI agents and player experience

Automatic analysis of play-traces

Mixed-Initiative gameplay evaluation

Player modelling for game evaluation

Automatic evaluation for new game genres

Validation of automatic evaluation methods using human data

Generality of automatic evaluation methods

Differences between different evaluation methods (tested with AI or humans, qualitative vs. quantitative, objective vs subjective measures, etc.)

Evaluation measures and their relationship to business and research goals

Playtesting standards in industry

Correlation between objective and subjective measures

Ethics, privacy and legal aspects of using player data

Evaluation of generated content

We invite the submission of high quality papers on the topics above in the following formats:

Full papers

Short papers

Authors should follow normal IEEE Transactions on Games guidelines for their submissions, but clearly identify their papers for this special issue during the submission process. Extended versions of previously published conference or workshop papers are welcome, provided that the journal paper is a significant extension, and is accompanied by a cover letter explaining the additional contribution. See https://www.transactions.games/submit/submission-guidelines for author information guidelines and page length limits.

Important Dates

Paper submission: November 1, 2023

First decisions: January 29, 2024

Early access SI publication (online): March 2024

Publication in print: End 2024

Guest Editors

Alena Denisova (University of York, UK)

Diego Pérez-Liébana (Queen Mary University of London, UK)

Vanessa Volz (modl.ai, DK)

Julian Frommel (Utrecht University, NL)

Sahar Asadi (King, SE)

Ongoing Special Issues

User Evaluation for VR Games - Guest editors: Hai-Ning Liang (Xi’an Jiaotong-Liverpool University, China), Wenge Xu (Birmingham City University, UK), Yiyu Cai (Nanyang Technological University, Singapore), Fotis Liarokapis (CYENS – Centre of Excellence, Cyprus)

Recent Special Issues

Evolutionary Computation for Games - Guest editors: Jacob Schrum, Jialin Liu, Cameron Browne, Anikó Ekárt and Marcus Gallagher (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=10075412&punumber=7782673)

User Experience of AI in Games - Guest editors: Henrik Warpefelt, Christoph Salge, Magy Seif El-Nasr, Jichen Zhu and Mirjam Palosaari Eladhari (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9987665&punumber=7782673)

Team AI in Games - Guest editors: Maxim Mozgovoy, Mike Preuss, Tomoharu Nakashima and Rafael Bidarra (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9652015&punumber=7782673)

Serious Games for Health - Guest editors: Duarte Duque, João L. Vilaça, Marjorie A. Zielke, Nuno Dias, Nuno F. Rodrigues and Ruck Thawonmas (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9301448&punumber=7782673)

Game Competition Frameworks for Research and Education - Guest editors: Jialin Liu, Diego Perez-Liebana, Tristan Cazenave and Ruck Thawonmas (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=8839649&punumber=7782673)

AI-Based and AI-Assisted Game Design - Guest editors: Antonios Liapis, Georgios N. Yannakakis, Michael Cook and Simon Colton (https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=8667748&punumber=7782673)

New Proposals

If you are interested in guest editing a special issue, you should send a proposal to the Editor-in-Chief (Georgios N. Yannakakis).

Preparing your Proposal

Please prepare a Call for Papers, which will ultimately be distributed. We suggest a maximum of one side of A4.
Please provide an additional page that presents short biographies of the proposed guest editors and explains why they are qualified to guest edit a special issue of the Transactions on Games in the area being suggested.
The proposal will be sent to the journal’s Associate Editors for comment. You may be asked to revise the proposal, and this iterative process will continue until a decision has been made.

Other Points to Note

The proposed guest editors must include a current Associate Editor of ToG. This ensures that special issues are held to the same standards as regular issues.
A guest editor may submit at most ONE paper on which they are an author.
A special issue cannot be related to a particular event (e.g. a conference), and submissions should be open to all.
The editorial that is eventually written to introduce the special issue must not mention specific conferences or events.

Previous Special Issues

Computational Aesthetics in Games (Volume 4, Issue 3 and Editorial)
Computational Narrative and Games (Volume 6, Issue 2 and Editorial)
General Games (Volume 6, Issue 4 and Editorial)
Age of Analytics (Volume 7, Issue 3 and Editorial)
Physics Simulation (Volume 8, Issue 2 and Editorial)
Real Time Strategy (Volume 8, Issue 4 and Editorial)

TCPAMI's Grand Collection

Computer vision is experiencing remarkable growth, fueled by breakthroughs in pattern recognition, artificial intelligence, and image processing. TCPAMI's Grand Collection is a compendium of computer vision research that received the Best Paper Award in 2023 from CVPR, WACV, ICCV, and ICCP, covering a range of topics from decision-making of autonomous driving systems to visual programming techniques.

These conferences, organized by the Technical Community for Pattern Analysis and Machine Intelligence (TCPAMI), collectively highlight the rapid advancements and diverse applications of computer vision technology, offering valuable insights and collaboration opportunities for computer vision experts.

Download TCPAMI's Grand Collection to stay abreast of the topics influencing research in 2024. Paper titles include:

Lossy Image Compression with Quantized Hierarchical VAEs
Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction
Visual Programming: Compositional visual reasoning without training
Planning-oriented Autonomous Driving
Passive Ultra-Wideband Single-Photon Imaging
Adding Conditional Control to Text-to-Image Diffusion Models

Learn more about the Technical Community on Pattern Analysis and Machine Intelligence (TCPAMI) and search upcoming events in pattern recognition, computer vision, AI, and more.


A not-for-profit organization, the Institute of Electrical and Electronics Engineers (IEEE) is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.


Title: Rank-based No-Reference Quality Assessment for Face Swapping

Abstract: Face swapping has become a prominent research area in computer vision and image processing due to rapid technological advancements. Most face swapping methods measure quality using distances between the manipulated image and the source or target image, which presumes that suitable reference face images are available. There is therefore still a gap in accurately assessing the quality of swapped faces in reference-free scenarios. In this study, we present a novel no-reference image quality assessment (NR-IQA) method specifically designed for face swapping. We address the gap by constructing a comprehensive large-scale dataset, implementing a method for ranking image quality based on multiple facial attributes, and incorporating a Siamese network based on interpretable qualitative comparisons. Our model achieves state-of-the-art performance in the quality assessment of swapped faces, providing both coarse- and fine-grained assessments. Guided by this metric, an improved face-swapping model produced better results with respect to expressions and poses. Extensive experiments confirm the superiority of our method over existing general no-reference image quality assessment metrics and the latest facial image quality assessment metric, making it well suited for evaluating face swapping images in real-world scenarios.
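
The abstract does not say which loss the Siamese network is trained with. As a rough illustration only, the sketch below shows one common way to learn from ranked image pairs, a pairwise margin (hinge) ranking loss over scalar quality scores. The function names, the margin value and the toy scores are all assumptions made for this example and are not taken from the paper.

```python
# Hypothetical sketch: pairwise ranking over scalar quality scores.
# Nothing here is the paper's implementation; names and values are illustrative.

def margin_ranking_loss(score_better, score_worse, margin=1.0):
    """Hinge loss: zero once the better image outscores the worse one by `margin`."""
    return max(0.0, margin - (score_better - score_worse))


def pairwise_accuracy(scores, ranked_pairs):
    """Fraction of (better_idx, worse_idx) pairs that the scores order correctly."""
    correct = sum(1 for better, worse in ranked_pairs if scores[better] > scores[worse])
    return correct / len(ranked_pairs)


# Toy scores for three swapped-face images, as a shared scoring branch might emit.
scores = [0.9, 0.4, 0.1]
# Assumed ground-truth ranking: image 0 is better than 1, and 1 is better than 2.
pairs = [(0, 1), (1, 2), (0, 2)]

total_loss = sum(margin_ranking_loss(scores[b], scores[w], margin=0.3) for b, w in pairs)
```

In a Siamese setup both images of a pair would pass through the same scoring network, so the loss only constrains the relative order of the two outputs and no reference image is needed.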

Comments:	8 pages, 5 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)




COMMENTS

The application of deep learning in computer vision
Because deep learning exhibits strong advantages in feature extraction, it has been widely used in computer vision, among other fields, and has gradually replaced traditional machine learning algorithms. This paper first reviews the main ideas of deep learning and presents several frequently used algorithms for computer vision. Afterwards, the current research status of ...
Computer Vision Technology Based on Deep Learning
With the development of artificial intelligence, computer vision technology that simulates human vision has received widespread attention. Focusing on deep learning, the method currently most commonly used in computer vision, this paper outlines the development of deep learning models, and determines the inflection point of the development of the introduction of convolutional neural networks ...
Deep learning in computer vision: A critical review of emerging
In the previous research, many scholars (Dosovitskiy and Brox, ... They commented in their paper "computer vision is already being put to questionable use and as researchers, we have a responsibility to at least consider the harm our work might be doing and think of ways to mitigate it. ... Proceedings of the 2017 ieee conference on computer ...
Resources for Computer Vision
The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) aims to initiate further discussion within computer vision applications and research. In 2022, researchers were encouraged to include in their papers and proposals the potential negative societal impacts of their research and possible ways to mitigate them.
Multi-Object Detection Using Enhanced Computer Vision ...
In robotics, computer vision modeling empowers robots to perceive and interact with their environments effectively. This paper presents a groundbreaking effort to develop a computer vision model tailored to underwater robotic systems aimed explicitly at detecting multiple objects of interest beneath the water's surface. The primary focus of our model lies in accurately differentiating between ...
CVPR 2024 Call for Papers
The Thirty-Fourth IEEE/CVF Conference on Computer Vision and Pattern Recognition. Main conference ... original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to: 3D from multi-view and sensors ... All accepted papers will be made publicly available by the Computer Vision ...
Computer vision: basic principles
Abstract: The author provides a general introduction to computer vision. He discusses basic techniques and computer implementations, and also indicates areas in which further research is needed. He focuses on two-dimensional object recognition, i.e. recognition of an object whose spatial orientation, relative to the viewing direction is known.
CVPR 2021 Open Access Repository
These CVPR 2021 papers are the Open Access versions, provided by the Computer Vision Foundation. Except for the watermark, they are identical to the accepted versions; the final published version of the proceedings is available on IEEE Xplore. This material is presented to ensure timely dissemination of scholarly and technical work.
PDF IEEE Transactions on Evolutionary Computation Special Issue on
Evolutionary computer vision (ECV) is at the intersection of two major research fields of artificial intelligence: computer vision and evolutionary computation. This special issue aims to provide an overview of state-of-the-art contributions to the latest research and development in the discipline. Computer vision includes methods for acquiring ...
Call for Papers: IEEE/CVF CVPR
Reviews Released: 24 January 2023. Rebuttal Period: 24-31 January 2023. Final Decisions: 27 February 2023. Papers in the main technical program must describe high-quality, original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to: 3D from multi-view and sensors.
Computer Vision Based Object Detection and Recognition ...
Computer vision deals with methods for the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. To meet new challenges, the system proposed in this paper is mainly used for object detection and recognition in images so that the image search engine becomes more useful. A large dataset of ...
CVPR 2023 Call for Papers
The Thirty-Fourth IEEE/CVF Conference on Computer Vision and Pattern Recognition. ... Call for Papers. Papers in the main technical program must describe high-quality, original research. Topics of interest include all aspects of computer vision and pattern recognition including, but not limited to: ... (Microsoft Research)
CVF Open Access
Computer Vision Foundation. These research papers are the Open Access versions, provided by the Computer Vision Foundation. Except for the watermark, they are identical to the accepted versions; the final published version of the proceedings is available on IEEE Xplore. This material is presented to ensure timely dissemination of scholarly and ...
Research on Image Processing Technology of Computer Vision Algorithm
With the gradual improvement of artificial intelligence technology, image processing has become a common technology and is widely used in various fields to provide people with high-quality services. Starting from computer vision algorithms and image processing technologies, the computer vision display system is designed, and image distortion correction algorithms are explored for reference.
Computer Vision and Pattern Recognition
S3 Gaussian: Self-Supervised Street Gaussians for Autonomous Driving. Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang. Comments: Code is available at: this https URL. Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Research on Computer Vision Image Multimedia Technology ...
Therefore, this paper makes an in-depth study of computer vision image multimedia technology under the background of big data. In the research, this paper systematically describes the current computer vision image multimedia technology, as well as the future development direction of computer vision image multimedia technology.
PDF A Survey on Evolutionary Computation for Computer Vision and Image
Computer Vision and Image Analysis: Past, Present, and Future Trends. Ying Bi, Member, IEEE, Bing Xue, Senior Member, IEEE, Pablo Mesejo, Stefano Cagnoni, Senior Member, IEEE, Mengjie Zhang, Fellow, IEEE. Abstract: Computer vision (CV) is a big and important field in artificial intelligence covering a wide range of applications.
Early Detection of Fire using YOLOv8 and Computer Vision
In the field of fire safety, enhanced detection systems are crucial for mitigating hazards and facilitating rapid responses. This research presents a novel hybrid and layered fire detection system that integrates machine learning (ML) and computer vision (CV) methodologies with extensive categorization to enhance fire protection in office buildings. To accurately evaluate and detect fire ...
Machine Learning in Computer Vision
Machine learning and computer vision research is still evolving [1]. Computer vision is an essential part of the Internet of Things, the Industrial Internet of Things, and brain-human interfaces. Complex human activities are recognized and monitored in multimedia streams using machine learning and computer vision.
Research on the Construction of Image Processing Model ...
Abstract: This research focuses on constructing an efficient image processing model, which is rooted in computer vision algorithms, to ameliorate image distortion and optimize visual display systems. The article initially discusses the significance of image distortion correction and its processing flow, encompassing detailed steps ranging from image preprocessing, model construction, parameter ...
Submission
WACV 2024 solicits high-quality, original submissions describing research in computer vision, with a particular emphasis on systems and applications with significant, interesting vision components. Application areas include, but are not limited to: Authors are also encouraged to submit more traditional computer vision algorithms papers. Topics ...
Computer Vision
Stay informed on the latest trending ML papers with code, research developments, libraries, methods, and datasets. ... Computer Vision: 4685 benchmarks • 1446 tasks • 3046 datasets • 48211 papers with code. Semantic Segmentation ... 5342 papers with code
Research on Some Key Technologies of Deep Learning in the ...
Computer vision technology plays a very important role in the development of artificial intelligence, but its key technologies often perform poorly in application. Traditional neural network algorithms cannot solve the problems of image classification and inaccurate image detection in computer visual perception tasks. In the tide of artificial intelligence, the combination of ...
Special Issues
Special Issue on Computer Vision and Games: Call for Papers. Video Games and Computer Vision research have long held a symbiotic relationship. On the one hand, virtual worlds in games are often used for collecting training data or as testbeds for computer vision models since they provide a greater deal of flexibility, control and scalability in ...
TCPAMI's Grand Collection
TCPAMI's Grand Collection is a compendium of computer vision research that received the Best Paper Award in 2023 from CVPR, WACV, ICCV, and ICCP, covering a range of topics from decision-making of autonomous driving systems to visual programming techniques. These conferences, organized by the Technical Community for Pattern Analysis and Machine ...
CVIU
The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image ...
Research on AIGC Classification Content Generation System ...
In this paper, a tourism information forecasting system in computer aided environment is established. In this way, the sparsity of various data can be effectively integrated. In particular, forecasts for foreign visitors have been optimized. This paper aims to combine machine vision and deep learning to build a new efficient multi-source information perception method. According to the basic ...
[Call for paper]2024 5th International Conference on Computer Vision
All accepted full papers will be published in IEEE(ISBN: 979-8-3503-8843-5) and will be submitted to IEEE Xplore, EI Compendex, Scopus and Inspec for indexing. Important Dates: Full Paper ...