MulTaBench benchmark evaluates multimodal tabular learning strategies
MulTaBench pairs tabular records with free text and images across clinical, real-estate, and product domains, drawing its datasets from Kaggle and OpenML. It compares four modeling strategies: structured-only, unstructured-only, joint modeling over frozen embeddings, and target-aware representation learning. A paper posted on Hugging Face reports that target-aware representations outperform frozen LLM and VLM embeddings, and concludes that future progress depends on models that jointly process tabular and unstructured modalities while aligning their representations to the prediction target.
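To illustrate the frozen-versus-target-aware distinction, here is a minimal, hypothetical sketch (not the MulTaBench code or the paper's method). It stands in for the unstructured modality with a high-dimensional feature vector: the "frozen" route compresses it through a fixed, target-agnostic projection, as a generic pretrained encoder would, while the "target-aware" route learns the projection from the labels before fitting the predictor. All names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an unstructured modality (e.g. text or image content):
# 300 records, 50 raw features, with the label driven by a sparse signal.
n, d_raw = 300, 50
X_raw = rng.normal(size=(n, d_raw))
w_true = np.zeros(d_raw)
w_true[:3] = [2.0, -2.0, 1.5]                      # only 3 features matter
y = (X_raw @ w_true + 0.1 * rng.normal(size=n) > 0).astype(float)

def fit_logreg(X, y, lr=0.5, steps=800):
    """Plain logistic regression by full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def acc(X, w):
    return float(((X @ w > 0) == (y > 0.5)).mean())

# Frozen route: a fixed, target-agnostic 2-d projection of the raw
# features (a generic encoder that never saw the labels).
P = rng.normal(size=(d_raw, 2))
X_frozen = X_raw @ P
acc_frozen = acc(X_frozen, fit_logreg(X_frozen, y))

# Target-aware route: learn a direction from the labels first, then use
# that supervised projection as the representation.
w_dir = fit_logreg(X_raw, y)
X_aware = (X_raw @ w_dir)[:, None]
acc_aware = acc(X_aware, fit_logreg(X_aware, y))

print(f"frozen: {acc_frozen:.2f}  target-aware: {acc_aware:.2f}")
```

Because the random projection is unlikely to preserve the sparse signal direction, the frozen representation loses most of the predictive information, while the supervised projection keeps it, which mirrors the benchmark's reported gap between frozen and target-aware representations.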