
Computer Science > Computer Vision and Pattern Recognition

Title: Causal Reasoning Meets Visual Representation Learning: A Prospective Study

Abstract: Visual representation learning is ubiquitous in real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing. With the emergence of huge amounts of multi-modal heterogeneous spatial, temporal, and spatial-temporal data in the big data era, the lack of interpretability, robustness, and out-of-distribution generalization has become a major challenge for existing visual models. Most existing methods fit the original data or variable distributions and ignore the essential causal relations behind the multi-modal knowledge, and the field lacks unified guidance and analysis of why modern visual representation learning methods easily collapse into data bias and have limited generalization and cognitive abilities. Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms to realize robust representation and model learning with good cognitive ability. In this paper, we conduct a comprehensive review of existing causal reasoning methods for visual representation learning, covering fundamental theories, models, and datasets. The limitations of current methods and datasets are also discussed. Moreover, we propose some prospective challenges, opportunities, and future research directions for benchmarking causal reasoning algorithms in visual representation learning. This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods, publicly available benchmarks, and consensus-building standards for reliable visual representation learning and related real-world applications.


Causal Reasoning Meets Visual Representation Learning: A Prospective Study

  • Yang Liu
  • Yu-Shen Wei
  • Hong Yan
  • Guan-Bin Li
  • Liang Lin



Causal reasoning meets visual representation learning: A prospective study

Beijing Zhongke Journal Publising Co. Ltd.

 Overview of the structure of this paper, including the discussion of related methods, datasets, challenges, and the relations among causal reasoning, visual representation learning, and their integration

Credit: Beijing Zhongke Journal Publising Co. Ltd.

With the emergence of huge amounts of heterogeneous multi-modal data, including images, videos, text, audio, and multi-sensor data, deep learning-based methods have shown promising performance on various computer vision and machine learning tasks, e.g., visual comprehension, video understanding, visual-linguistic analysis, and multi-modal fusion. However, existing methods rely heavily on fitting the data distributions and tend to capture spurious correlations across modalities, and thus fail to learn the essential causal relations behind the multi-modal knowledge, which are what support good generalization and cognitive abilities. Working from the assumption that most data in the computer vision community are independent and identically distributed (i.i.d.), a substantial body of literature has adopted data augmentation, pre-training, self-supervision, and novel architectures to improve the robustness of state-of-the-art deep neural network architectures. However, it has been argued that such strategies only learn correlation-based patterns (statistical dependencies) from data and may not generalize well when the i.i.d. setting is not guaranteed.

Owing to its powerful ability to uncover the underlying structural knowledge about data-generating processes, which allows interventions and generalizes well across different tasks and environments, causal reasoning offers a promising alternative to correlation learning. Recently, causal reasoning has attracted increasing attention in a myriad of high-impact domains of computer vision and machine learning, such as interpretable deep learning, causal feature selection, visual comprehension, visual robustness, visual question answering, and video understanding. A common challenge for these causal methods is how to build a strong cognitive model that can fully discover causality and spatial-temporal relations.

In this paper, researchers aim to provide a comprehensive overview of causal reasoning for visual representation learning, attract attention, encourage discussions, and bring to the forefront the urgency of developing novel causality-guided visual representation learning methods. Although there are some surveys about causal reasoning, these works are intended for general representation learning tasks such as deconfounding, out-of-distribution (OOD) generalization, and debiasing. In contrast, this paper focuses on a systematic and comprehensive survey of related works, datasets, insights, future challenges, and opportunities for causal reasoning, visual representation learning, and their integration. To present the review concisely and clearly, this paper selects and cites related work by considering source, publication year, impact, and coverage of the different aspects of the topic surveyed. Overall, the main contributions of this paper are as follows.

Firstly, this paper presents the basic concepts of causality, the structural causal model (SCM), the independent causal mechanism (ICM) principle, causal inference, and causal intervention. Then, based on this analysis, the paper gives some directions for conducting causal reasoning on visual representation learning tasks. Note that, to the best of the authors' knowledge, this paper is the first to propose potential research directions for causal visual representation learning.

Secondly, a prospective review systematically and structurally examines existing works according to their efforts in the directions identified above for conducting causal visual representation learning more efficiently. Researchers focus on the relation between visual representation learning and causal reasoning, provide a better understanding of why and how existing causal reasoning methods can help visual representation learning, and offer inspiration for future research.

Thirdly, this paper explores and discusses future research areas and open problems related to using causal reasoning methods to tackle visual representation learning. This can encourage and support the broadening and deepening of research in the related fields.

Section 2 provides the preliminaries, which include five parts. The first part covers the basic concepts of causality. Causal learning differs from statistical learning in that it aims to discover causal relationships beyond statistical relations. Learning causality requires machine learning methods not only to predict the outcome of i.i.d. experiments but also to reason from a causal perspective. The second part is the SCM, which gives a formal formulation of causal relations. The third part is the ICM principle, which describes the independence of causal mechanisms. The fourth part is causal inference, whose purpose is to estimate the outcome shift (or effect) of different treatments. The last part is causal intervention, which aims to capture the causal effects of interventions on variables and to take advantage of causal relations in datasets to improve model performance and generalization ability.
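To make the difference between a statistical relation and a causal effect concrete, here is a minimal sketch of a toy linear SCM in Python. It is an illustration only, not a model from the paper: the graph (Z -> X, Z -> Y, X -> Y), the coefficients, and the name sample_scm are all assumptions chosen for the example.

```python
import numpy as np

def sample_scm(n, rng, do_x=None):
    """Sample from a hypothetical linear SCM: Z -> X, Z -> Y, X -> Y.

    If do_x is given, the structural equation for X is replaced by the
    constant do_x, i.e. the intervention do(X = do_x). The mechanisms
    for Z and Y are left untouched.
    """
    z = rng.normal(size=n)                       # confounder
    if do_x is None:
        x = 2.0 * z + rng.normal(size=n)         # X depends on Z
    else:
        x = np.full(n, float(do_x))              # X is set from outside
    y = 1.5 * x + 3.0 * z + rng.normal(size=n)   # true effect of X on Y is 1.5
    return z, x, y

rng = np.random.default_rng(0)
n = 200_000

# Observational regime: regress Y on X while ignoring the confounder Z.
_, x_obs, y_obs = sample_scm(n, rng)
naive_slope = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs, ddof=1)

# Interventional regime: compare E[Y | do(X=1)] with E[Y | do(X=0)].
_, _, y_do1 = sample_scm(n, rng, do_x=1.0)
_, _, y_do0 = sample_scm(n, rng, do_x=0.0)
causal_effect = y_do1.mean() - y_do0.mean()

print(f"naive regression slope: {naive_slope:.2f}")
print(f"effect under do(X):     {causal_effect:.2f}")
```

In this toy setup the regression slope absorbs the path through Z and lands near 2.7, while the interventional contrast recovers the structural coefficient 1.5. The gap between the two is the kind of outcome shift that causal inference tries to estimate.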

Traditional feature learning methods usually learn the spurious correlations introduced by confounders. This reduces the robustness of models and makes them hard to generalize across domains. Causal reasoning, a learning paradigm that reveals the real causality from the outcome, overcomes the essential defect of correlation learning and learns robust, reusable, and reliable features. In Section 3, researchers review recent representative causal reasoning methods for general feature learning, which fall into three main paradigms: 1) structural causal model (SCM) embedded, 2) applying causal intervention or counterfactuals, and 3) Markov boundary (MB) based feature selection.
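One standard way to remove a confounder's influence from observational data is backdoor adjustment, P(Y | do(X=x)) = sum over z of P(Y | X=x, Z=z) P(Z=z). The sketch below applies it to made-up binary data. The variables, probabilities, and helper names are assumptions for illustration and do not come from any method reviewed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Hypothetical binary variables: Z confounds both X and Y.
z = rng.random(n) < 0.5
x = rng.random(n) < np.where(z, 0.8, 0.2)
# By construction, X raises P(Y=1) by 0.2 and Z raises it by 0.5.
y = rng.random(n) < (0.1 + 0.2 * x + 0.5 * z)

def p_y_given(x_val, z_val=None):
    """Empirical P(Y=1 | X=x_val) or P(Y=1 | X=x_val, Z=z_val)."""
    mask = (x == x_val)
    if z_val is not None:
        mask &= (z == z_val)
    return y[mask].mean()

def p_y_do(x_val):
    """Backdoor adjustment over Z: sum_z P(Y=1 | x, z) * P(z)."""
    return sum(p_y_given(x_val, z_val) * (z == z_val).mean()
               for z_val in (False, True))

# Plain conditional contrast (confounded) versus adjusted contrast.
naive = p_y_given(True) - p_y_given(False)
adjusted = p_y_do(True) - p_y_do(False)

print(f"naive difference:    {naive:.3f}")
print(f"adjusted difference: {adjusted:.3f}")
```

On this toy data the naive contrast is close to 0.5 because X and Z move together, while the adjusted contrast is close to the 0.2 built into the generator. The adjustment is only valid here because Z is observed and is the only confounder, which is an assumption of the example.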

Visual representation learning has made great progress in recent years, using spatial and/or temporal information to complete specific tasks, including visual understanding (object detection, scene graph generation, visual grounding, visual commonsense reasoning), action detection and recognition, and visual question answering. In Section 4, researchers introduce these representative visual learning tasks and discuss the existing challenges and the necessity of applying causal reasoning to visual representation learning.

As the visual representation learning methods discussed above show, current machine learning, especially representation learning, faces several challenges: 1) lack of interpretability, 2) poor generalization ability, and 3) over-reliance on correlations in the data distribution. Causal reasoning offers a promising alternative to address these challenges. The discovery of causality helps to uncover the causal mechanism behind the data, allowing the machine to better understand why and to make decisions through intervention or counterfactual reasoning. In Section 5, researchers summarize some recent approaches for causal visual representation learning, an emerging research topic that has appeared in the 2020s. The related tasks can be roughly categorized into three main aspects: 1) causal visual understanding, 2) causal visual robustness, and 3) causal visual question answering. In this section, researchers discuss these three representative causal visual representation learning tasks.
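Counterfactual reasoning is usually described as three steps: abduction (infer the unobserved noise from what was seen), action (change the variable of interest), and prediction (recompute the outcome with the same noise). The following sketch walks through those steps for a hypothetical one-equation SCM, Y := slope * X + U. The equation, the default slope, and the function name are assumptions for illustration.

```python
def counterfactual_y(x_obs, y_obs, x_cf, slope=2.0):
    """Counterfactual outcome in the toy SCM  Y := slope * X + U."""
    # 1) Abduction: recover this unit's noise term from the observation.
    u = y_obs - slope * x_obs
    # 2) Action: replace X by the counterfactual value.
    x = x_cf
    # 3) Prediction: recompute Y while keeping the same noise term.
    return slope * x + u

# "We observed X=1 and Y=2.5. What would Y have been had X been 3?"
y_cf = counterfactual_y(x_obs=1.0, y_obs=2.5, x_cf=3.0)
print(y_cf)  # 6.5
```

The answer differs from a plain prediction at X=3 because the unit-specific noise (0.5 here) is carried over from the observed case.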

Correlation-based models may perform well on existing datasets, not because these models have a strong reasoning capability, but because these datasets cannot fully support the evaluation of the models' reasoning capability. Spurious correlations in these datasets can be exploited by the model to cheat, which means that the model concentrates on superficial correlation learning rather than real causal reasoning and only approximates the distribution of the dataset. For example, in the VQA v1.0 dataset for the VQA task, a model that simply answers “yes” when it sees the question “Do you see a ...” achieves nearly 90% accuracy. Because of this shortcoming in current datasets, researchers need to build benchmarks that can evaluate the true causal reasoning capability of models. In Section 6, researchers take image question answering benchmarks and video question answering benchmarks as examples to analyze the current state of research on related causal reasoning datasets and give some future directions.
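The shortcut described above can be reproduced with a "blind" baseline that never looks at an image and only memorizes the most frequent answer for each question prefix. The sketch below uses made-up toy questions, not VQA v1.0 data, and the function names and the four-word prefix length are assumptions for illustration.

```python
from collections import Counter, defaultdict

def question_prefix(question, prefix_len=4):
    """First prefix_len lowercase words of the question."""
    return " ".join(question.lower().split()[:prefix_len])

def fit_prefix_prior(train, prefix_len=4):
    """Map each question prefix to its most frequent training answer."""
    counts = defaultdict(Counter)
    for question, answer in train:
        counts[question_prefix(question, prefix_len)][answer] += 1
    return {p: c.most_common(1)[0][0] for p, c in counts.items()}

def accuracy(prior, data, prefix_len=4, default="yes"):
    """Accuracy of answering from the prefix alone, ignoring any image."""
    hits = sum(prior.get(question_prefix(q, prefix_len), default) == a
               for q, a in data)
    return hits / len(data)

# Made-up toy splits: 9 of 10 "Do you see a ..." questions are answered "yes".
train = [(f"Do you see a thing{i}?", "yes") for i in range(9)]
train.append(("Do you see a unicorn?", "no"))
test = [(f"Do you see a item{i}?", "yes") for i in range(9)]
test.append(("Do you see a dragon?", "no"))

prior = fit_prefix_prior(train)
blind_accuracy = accuracy(prior, test)
print(blind_accuracy)  # 0.9
```

A high score for such a baseline indicates that the benchmark rewards answer priors. A benchmark meant to test causal reasoning would want this number to be close to chance.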

Section 7 proposes and discusses some future research directions. Causal reasoning with visual representation learning has a variety of applications. Modeling causal reasoning for a variety of tasks can achieve a better perception of the real world. In this section, researchers introduce the applications from five aspects: image/video analysis, explainable artificial intelligence, recommendation systems, human-computer dialog and interaction, and crowd intelligence analysis.

They also discuss how causal reasoning benefits various real-world applications.

Some researchers have successfully implemented causal reasoning for visual representation learning to discover causality and visual relations. However, causal reasoning for visual representation learning is still in its infancy, and many issues remain unsolved. Therefore, Section 8 highlights several possible research directions and open problems to inspire further extensive and in-depth research on this topic. Potential research directions for causal visual representation learning can be summarized as: 1) more reasonable causal relation modeling; 2) more precise approximation of intervention distributions; 3) a more proper counterfactual synthesizing process; 4) large-scale benchmarks and evaluation pipelines.

This paper has provided a comprehensive survey on causal reasoning for visual representation learning. Researchers hope that this survey can help attract attention, encourage discussions, and bring to the forefront the urgency of developing novel causal reasoning methods, publicly available benchmarks, and consensus-building standards for reliable visual representation learning and related real-world applications more efficiently.

See the article:

Causal Reasoning Meets Visual Representation Learning: A Prospective Study

http://doi.org/10.1007/s11633-022-1362-z



Researchers explore causal machine learning, a new advancement for AI in health care

by Ludwig Maximilian University of Munich

Using new methods, machines can learn not only to make predictions, but also to handle causal relationships

Artificial intelligence is making progress in the medical arena. When it comes to imaging techniques and the calculation of health risks, there is a plethora of AI methods in development and testing phases. Wherever it is a matter of recognizing patterns in large data volumes, it is expected that machines will bring great benefit to humanity. Following the classical model, the AI compares information against learned examples, draws conclusions, and makes extrapolations.

Now an international team led by Professor Stefan Feuerriegel, Head of the Institute of Artificial Intelligence (AI) in Management at LMU, is exploring the potential of a comparatively new branch of AI for diagnostics and therapy. Can causal machine learning (ML) estimate treatment outcomes—and do so better than the ML methods generally used to date? Yes, says a study by the group, which has been published in Nature Medicine and is titled "Causal ML can improve the effectiveness and safety of treatments."

In particular, the new ML variant offers "an abundance of opportunities for personalizing treatment strategies and thus individually improving the health of patients," write the researchers, who hail from Munich, Cambridge (United Kingdom), and Boston (United States) and include Stefan Bauer and Niki Kilbertus, professors of computer science at the Technical University of Munich (TUM) and group leaders at Helmholtz AI.

As regards machine assistance in therapy decisions, the authors anticipate a decisive leap forward in quality. Classical ML recognizes patterns and discovers correlations, they argue. However, the causal principle of cause and effect remains closed to machines as a rule; they cannot address the question of why. And yet many questions that arise when making therapy decisions contain causal problems within them.

The authors illustrate this with the example of diabetes: Classical ML would aim to predict how probable a disease is for a given patient with a range of risk factors. With causal ML, it would ideally be possible to answer how the risk changes if the patient gets an anti-diabetes drug; that is, gauge the effect of a cause (prescription of medication). It would also be possible to estimate whether another treatment plan would be better, for example, than the commonly prescribed medication, metformin.

To be able to estimate the effect of a—hypothetical—treatment, however, "the AI models must learn to answer questions of a 'What if?' nature," says Jonas Schweisthal, doctoral candidate in Feuerriegel's team.

"We give the machine rules for recognizing the causal structure and correctly formalizing the problem," says Feuerriegel. Then the machine has to learn to recognize the effects of interventions and understand, so to speak, how real-life consequences are mirrored in the data that has been fed into the computers.

"The software we need for causal ML methods in medicine doesn't exist out of the box," says Feuerriegel. Rather, "complex modeling" of the respective problem is required, involving "close collaboration between AI experts and doctors."

Like his TUM colleagues Stefan Bauer and Niki Kilbertus, Feuerriegel also researches questions relating to AI in medicine, decision-making, and other topics at the Munich Center for Machine Learning (MCML) and the Konrad Zuse School of Excellence in Reliable AI.

In other fields of application, such as marketing, explains Feuerriegel, the work with causal ML has already been in the testing phase for some years now. "Our goal is to bring the methods a step closer to practice. The paper describes the direction in which things could move over the coming years."



COMMENTS


  7. Causal Reasoning Meets Visual Representation Learning: A Prospective Study

    Causal Reasoning Meets Visual Representation Learning: A Prospective Study. Yang Liu, Yu-Shen Wei, Hong Yan, Guan-Bin Li, Liang Lin. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China.


  10. 2204.12037

    Causal Reasoning Meets Visual Representation Learning: A Prospective Study (2204.12037) Published Apr 26, 2022 Abstract. Visual representation learning is ubiquitous in various real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing. ... we conduct a ...


  12. Yang Liu (刘阳)

    Causal reasoning meets visual representation learning: A prospective study. Y Liu, YS Wei, H Yan, GB Li, L Lin. Machine Intelligence Research 19 (6), 485-511, 2022.

  13. PDF Causal reasoning meets visual representation learning: A prospective study

    Causal reasoning meets visual representation learning: A prospective study November 22 2023 Overview of the structure of this paper, including the discussion of related methods, datasets, challenges, and the relations among causal reasoning, visual representation learning, and their integration. Credit: Beijing Zhongke Journal Publising Co.
