CVPR Workshop 2026 on 'From Perception to Persuasion: Challenges and Advances in Misinformation Detection in Society (PP-MisDet)'
Misinformation in visual and multimodal form is no longer just an image forensics problem or a generic “detect the fake” benchmark. It is a systemic risk to civic trust, public safety, democratic process, and crisis response. The proposed workshop explicitly targets how manipulated visuals and misleading captions are weaponised to steer attention, escalate outrage, justify policy action, or erode institutional legitimacy.
Advances in generative vision–language systems have made it easy to fabricate persuasive visual “evidence” and pair it with misleading narratives, accelerating large-scale misinformation campaigns. PP-MisDet focuses on perception and persuasion: how falsified or staged visual content is created, circulated, trusted, and operationalised in high-stakes settings such as elections, health, and conflict. The workshop brings together computer vision, multimedia forensics, human factors, policy, and platform integrity, addressing both algorithmic innovation and interdisciplinary perspectives. We emphasise robustness under coordinated abuse, human–AI collaboration in verification, and trustworthy evaluation benchmarks for real-world harm, beyond binary “deepfake vs. real.”
Topics to be covered
- Multimodal misinformation and persuasion analysis
- Detection, attribution, and provenance of AI-generated or manipulated media
- Media authenticity
- Cross-platform misinformation
- Human perception, cognitive bias, and trust calibration
- Safety-critical verification systems
- Adversarial pressure against verification pipelines
- Explainability, accountability, and auditability of fact-checking AI
- Human–AI collaboration for verification
- Benchmark datasets, evaluation protocols, and stress tests
- Responsible synthetic data generation
- Societal, legal, and policy implications
- Transparency, safety, fairness, accountability, and abuse prevention in multimodal vision–language systems
- Vision for societal good and civic resilience
- Future directions
Submission
All submissions will be handled electronically via the workshop's CMT paper submission website. Papers are limited to eight pages, including figures and tables, in the CVPR style. Additional pages containing only cited references are allowed. A complete paper should be submitted using the CVPR templates, which are formatted for blind review.
Please refer to the CVPR 2026 Author Guidelines for detailed formatting instructions.
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Organisation
Organising Co-Chairs
Dr Priyanka Singh
Senior Lecturer of Cybersecurity
School of Electrical Engineering and Computer Science at The University of Queensland
Dr Singh has a PhD in image forensics and more than 11 years of experience in multimedia forensics, homomorphic encryption, privacy-preserving frameworks, and cloud security. Her recent work spans provenance and tracking of media manipulation, explainable models for digital forensics and cybersecurity, encrypted-domain perceptual hashing for privacy-preserving workflows, and related security and privacy issues.
Professor Xue Li
Professor
School of Electrical Engineering and Computer Science at The University of Queensland
Professor Li's main research interests and expertise include data mining, social computing, health data analytics, and intelligent web information systems. His work on large-scale behavioural pattern discovery and risk monitoring directly supports the assessment of media manipulation and real-time harm.
Associate Professor Pradeep K. Atrey
Co-Director of the Albany Lab for Privacy and Security (ALPS)
University at Albany, SUNY
Associate Professor Atrey's group works on multimedia intelligence in encrypted spaces, media-rich disinformation detection and mitigation, privacy, and secure large-scale analytics. He has served in senior organisational roles for ACM Multimedia and related venues and has an extensive publication record in multimedia forensics, provenance, and content authentication.
Technical Program Committee
- Yanjun Zhang, University of Technology Sydney, Australia
- Guanfeng Liu, Macquarie University, Australia
- Jia Wu, Macquarie University, Australia
- Lin Yue, University of Adelaide, Australia
- Manoranjan Mohanty, Carnegie Mellon University, Qatar
- Vivek K. Singh, Rutgers University, USA
- Sharif Abuadbba, CSIRO’s Data61, Australia
- Shahroz Tariq, CSIRO’s Data61, Australia
- Rajat Subhra Chakraborty, IIT Kharagpur, India
- Ruchira Naskar, IIEST Shibpur, India
- Parag Rughani, National Forensic Sciences University, India
- Ling Chen, National Yang Ming Chiao Tung University, Taiwan
- Marimuthu Palaniswami, The University of Melbourne
- Shantanu Pal, Deakin University
- Mashhuda Glencross, The University of Queensland
- Anne Kruger, The University of Queensland
- Katie Williams, The University of Queensland
- Saeed Akhlaghpour, The University of Queensland
Web and Publicity Chair
- Gagandeep Singh, The University of Queensland, Australia
Keynotes
Mohan S. Kankanhalli
National University of Singapore
Talk title:
Rethinking Misinformation Defense Through the Lens of Human Behaviour
Brief summary:
This keynote explores misinformation defence beyond classifier-based detection, focusing on how human cognition and social motivation shape responses to misleading content. The talk discusses credibility indicators, user sharing behaviour, and explainable AI systems for anticipating human responses to misinformation.
Speaker note:
Prof. Mohan S. Kankanhalli is Provost’s Chair Professor of Computer Science at the National University of Singapore and Founding Director of the NUS AI Institute. His research spans multimodal computing, computer vision, and trustworthy AI.
Gianluca Demartini
The University of Queensland
Talk title:
How Bias in LLMs can Influence Human Decisions
Brief summary:
This keynote explores how bias in large language models can shape human decision-making and online information exposure. The talk discusses political bias in AI systems and how AI can also be leveraged to detect persuasive and manipulative content online.
Speaker note:
Prof. Gianluca Demartini is Professor of Data Science and ARC Future Fellow at The University of Queensland. His research focuses on Information Retrieval, Responsible AI, and human-centred data science.
Vimala Balakrishnan
Universiti Malaya
Talk title:
The Psychology of Digital Deception in the Age of AI
Brief summary:
This keynote examines how misinformation, scams, social engineering, and AI-generated content exploit human behaviour and decision-making. The session also discusses deepfakes, multimodal deception, and emerging AI-driven approaches for misinformation detection.
Speaker note:
Prof. Ts. Dr. Vimala Balakrishnan is a Professor at Universiti Malaya whose research focuses on cyberpsychology, social media behaviour, digital deception, and human-centred cybersecurity.
Dataset Challenge
For Submissions, please fill out this form
DGM⁴+ Challenge on Global Scene Inconsistency Detection
PP-MisDet Workshop @ CVPR 2026
Overview
The DGM⁴+ Challenge on Global Scene Inconsistency Detection aims to advance research on detecting and explaining semantic inconsistencies between visual content and accompanying text.
Modern multimodal misinformation increasingly relies on contextual manipulation, where fabricated foregrounds, misleading backgrounds, and deceptive captions are combined to produce persuasive but false narratives. Unlike traditional deepfake benchmarks that focus on low-level artifacts, DGM⁴+ emphasizes scene-level and narrative-level inconsistency.
This challenge provides a standardized benchmark for evaluating models that jointly reason over images and text to identify misleading content.
Task Description
Given an image–caption pair, participants must jointly perform:
Authenticity Classification: Predict whether the pair is semantically consistent (authentic) or contains contextual manipulation (manipulated).
Manipulation-Type Classification: For manipulated samples, predict whether the inconsistency corresponds to Text Swap or Text Attribute Modification.
Textual Grounding of Inconsistency: Identify caption tokens or spans that introduce misleading, fabricated, or contextually inconsistent information.
All three outputs are mandatory and are jointly evaluated.
Dataset
The DGM⁴+ dataset contains approximately 5,000 news-style image–caption pairs, including:
Authentic scenes
Text swap manipulations
Text attribute manipulations
Narrative reframing
Dataset Splits
Split | Availability | Purpose |
Training | Public | Model development |
Validation | Public | Hyperparameter tuning |
Test | Hidden | Leaderboard evaluation |
Public splits are released under a CC BY-NC 4.0 license.
Dataset is available to view and download at: https://drive.google.com/file/d/1kXNqljyJ7EHmHnRn3ORPgLjawnJ63YHL/view?usp=sharing
For more information about the dataset: https://arxiv.org/abs/2509.26047
Required Outputs
Each submission must provide predictions for all test samples.
(a) Authenticity Prediction
A CSV file containing:
id,label,confidence,manipulation_type
Where:
label ∈ {authentic, manipulated}
confidence ∈ [0,1]
manipulation_type ∈ {origin, text_swap, text_attribute}
For authentic samples, manipulation_type must be origin.
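The rules above can be checked locally before uploading. The following is a minimal sketch, not an official validator: the function name `validate_classification_csv` is hypothetical, and it assumes the header row matches the column list exactly and that the first column is named `label` as in the header shown above.

```python
import csv
import io

LABELS = {"authentic", "manipulated"}
TYPES = {"origin", "text_swap", "text_attribute"}
COLUMNS = ["id", "label", "confidence", "manipulation_type"]


def validate_classification_csv(text):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != COLUMNS:
        return [f"header must be {','.join(COLUMNS)}"]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["label"] not in LABELS:
            problems.append(f"line {line_no}: unknown label {row['label']!r}")
        try:
            in_range = 0.0 <= float(row["confidence"]) <= 1.0
        except (TypeError, ValueError):
            in_range = False
        if not in_range:
            problems.append(f"line {line_no}: confidence must be in [0,1]")
        if row["manipulation_type"] not in TYPES:
            problems.append(f"line {line_no}: unknown manipulation_type")
        # Rule stated above: authentic samples must use 'origin'.
        if row["label"] == "authentic" and row["manipulation_type"] != "origin":
            problems.append(f"line {line_no}: authentic rows must use origin")
    return problems
```

An empty list means no problems were detected by these checks; the organisers' own validation may be stricter.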
(b) Text Grounding Output
A JSON file containing token indices corresponding to misleading or manipulated text:
{
"0001": [3,4,5,6]
}
Tokenization follows the official tokenizer released with the dataset.
For authentic samples, an empty list must be submitted.
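As an illustration of the JSON layout above, the sketch below builds the mapping from an in-memory prediction structure. The helper names and the input structure are hypothetical. Token indices are assumed to come from the official tokenizer, which is not reproduced here, and whether indices start at 0 or 1 is not stated above, so the code only rejects negative values.

```python
import json


def build_text_grounding(predictions):
    """Map sample id -> sorted, de-duplicated token indices.

    `predictions` is a hypothetical structure:
    {sample_id: (label, iterable_of_token_indices)}.
    """
    output = {}
    for sample_id, (label, indices) in predictions.items():
        if label == "authentic":
            output[sample_id] = []  # rule stated above: empty list
            continue
        cleaned = sorted({int(i) for i in indices})
        if cleaned and cleaned[0] < 0:
            raise ValueError(f"{sample_id}: negative token index")
        output[sample_id] = cleaned
    return output


def dump_text_grounding(predictions, path="text_grounding.json"):
    """Write the grounding mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_text_grounding(predictions), handle)
```

Forcing authentic samples to an empty list in one place avoids a mismatch between the CSV label and the grounding file.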
Optional Cross-Modal Grounding and Explanation Track
In addition to required outputs, participants are encouraged to submit cross-modal explanations linking misleading textual elements to relevant visual evidence.
For each manipulated sample, teams may optionally provide:
Alignments between identified misleading tokens and corresponding image regions, and/or
Visualizations illustrating how textual claims are supported or contradicted by visual content.
These explanations are intended to demonstrate how models reason about semantic inconsistency across modalities.
Optional submissions are not included in the official leaderboard evaluation and do not affect ranking.
However, high-quality cross-modal grounding and explanation outputs will be highlighted during the poster session and invited presentations.
Selected teams may receive a Best Explainability Award.
Optional Submission Format
Optional explanations may be submitted as:
Visualization images (PNG/JPEG), and/or
JSON files linking token indices to bounding boxes.
Example format:
{
"0001": {
"tokens": [3,4,5],
"regions": [0.12,0.34,0.56,0.78]
}
}
Submission Format
Each submission must be packaged as:
submission.zip
├── classification.csv
└── text_grounding.json
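One way to produce this archive is with Python's standard `zipfile` module. This is a sketch only: the function name `package_submission` is hypothetical, and it assumes both files must sit at the archive root under exactly the names shown in the tree above.

```python
import zipfile
from pathlib import Path


def package_submission(csv_path, json_path, out_path="submission.zip"):
    """Zip the two required files at the archive root under the expected names."""
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname fixes the name inside the archive regardless of the local path.
        archive.write(csv_path, arcname="classification.csv")
        archive.write(json_path, arcname="text_grounding.json")
    return Path(out_path)
```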
1. Classification Evaluation
Authenticity classification is evaluated using:
Accuracy
F1-score
ROC–AUC
Confidence scores are used for AUC computation.
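For local sanity checks on the validation split, the three metrics can be approximated in plain Python. This is a sketch under stated assumptions, not the official evaluation script: it treats "manipulated" as the positive class and interprets the confidence as the score for "manipulated", neither of which is specified above.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive="manipulated"):
    """F1 for the positive class: 2TP / (2TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores, positive="manipulated"):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5. Quadratic in the number of samples, which is
    acceptable for a dataset of a few thousand pairs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```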
2. Manipulation-Type Evaluation
For manipulated samples, prediction of text swap vs. text attribute is evaluated using:
Accuracy
Macro F1-score
This evaluates the system’s ability to distinguish different forms of semantic inconsistency.
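Macro F1 is the unweighted mean of the per-class F1 scores, so both manipulation types count equally regardless of how many samples each has. The sketch below is illustrative only. How the official scorer treats a manipulated sample predicted as `origin` is not specified above; here it simply counts as a miss for the true class.

```python
def macro_f1(y_true, y_pred, classes=("text_swap", "text_attribute")):
    """Unweighted mean of per-class F1 over the given classes."""
    per_class = []
    for cls in classes:
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)
```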
Evaluation Protocol
Evaluation is conducted independently for each component and combined into a final score.
Final rankings are computed using:
Score = 0.5 × F1_binary + 0.3 × F1_text + 0.2 × F1_type
This weighting prioritizes reliable detection of semantic inconsistency while rewarding accurate explanation and manipulation-type recognition.
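The weighted combination can be reproduced as follows. The function name and the input values in the usage note are hypothetical; only the three weights come from the formula above.

```python
WEIGHTS = {"f1_binary": 0.5, "f1_text": 0.3, "f1_type": 0.2}


def final_score(f1_binary, f1_text, f1_type):
    """Weighted sum of the three component F1 scores, each in [0, 1]."""
    components = {"f1_binary": f1_binary, "f1_text": f1_text, "f1_type": f1_type}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())
```

For example, hypothetical component scores of 0.8, 0.6, and 0.7 give 0.5 × 0.8 + 0.3 × 0.6 + 0.2 × 0.7 = 0.72.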
Challenge Timeline
Date | Milestone |
Jan 10, 2026 | Challenge launch |
Mar 05, 2026 | Submission portal opens |
May 10, 2026 | Test set release |
May 20, 2026 | Final submissions due |
May 30, 2026 | Results announced |
Jun 03, 2026 | On-site presentations and awards |
Participation
To participate:
Register via the challenge platform
Download the training and validation sets
Develop and evaluate models
Submit predictions on the hidden test set
Registration links and instructions will be provided on this page.
Awards
Outstanding contributions will be recognised through:
Best Overall Performance
Best Explainability and Grounding
Best Student Team
Selected teams will be invited to present short papers and demos at PP-MisDet.
Ethical Use Policy
All manipulated samples are synthetically generated and do not depict real individuals or events.
Participants agree to:
Use the dataset exclusively for research purposes
Not generate or disseminate misinformation
Not redistribute the dataset without permission
Violations may result in disqualification.
Transparency and Reproducibility
After the challenge concludes, evaluation scripts, annotation guidelines, and benchmark statistics will be publicly released to support long-term community use.
Updates
Additional information regarding baseline models, leaderboard access, and submission procedures will be posted on this page.
Please check regularly for updates.
Contact
Dr Priyanka Singh
Priyanka.Singh@uq.edu.au
Important Dates
Submission site opens: Feb 7, 2026 (11:59pm AOE)
Workshop paper submission deadline: March 6, 2026 (11:59pm AOE)
Notification to authors: March 20, 2026 (11:59pm AOE)
Camera-ready deadline: April 10, 2026 (11:59pm AOE)