Evaluating Visual Adapters: MIVPG Performance on Single and Multi-Image Inputs

15 Nov 2025

Details MIVPG experiments across single- and multi-image scenarios. The model keeps the LLM and visual encoder frozen, updating only the MIVPG for efficiency.

MIVPG and Instance Correlation: Enhanced Multi-Instance Learning

15 Nov 2025

MIVPG uses a Correlated Self-Attention (CSA) module to unveil instance correlation, fulfilling all MIL properties while outperforming Q-Former.

Multimodal Fusion: MIVPG's Hierarchical MIL Approach for Multi-Image Samples

15 Nov 2025

Details MIVPG's hierarchical approach to MIL for multi-image samples. It treats both image patches and whole images as 'instances' for feature aggregation.

MIL Perspective: Analyzing Q-Former as a Multi-Head Mechanism

14 Nov 2025

Proves Q-Former is a Multi-Head MIL module due to permutation invariance in its cross-attention.

Visual Prompt Generators (VPGs): Encoding Images to LLM Tokens

14 Nov 2025

Explains how MLLMs use VPGs and cross-attention with learnable query embeddings to extract essential visual tokens from image patches for LLM input.

Multiple Instance Learning: Review of Instance and Embedding Level Approaches

13 Nov 2025

Reviews Multiple Instance Learning, contrasting instance-level and embedding-level approaches, while focusing on neural network pooling methods.

MLLM Adapters: Review of VPGs and Multimodal Fusion

12 Nov 2025

Reviews state-of-the-art MLLMs. Highlights the challenge of expanding current models beyond the simple one-to-one image-text relationship.

Dusted Input Images: Visualizing Decision Boundary Distillation

12 Nov 2025

This article explains and visualizes the use of "dusted input images" (inputs perturbed with strong Gaussian noise) to distill the model's decision boundary.

Network Size and Task Number: Ablation Study on IIL Performance and Stability

12 Nov 2025

This article presents an ablation study showing that the proposed IIL method performs well with larger networks.