REAL-MM-RAG-Bench is a benchmark designed to evaluate multi-modal retrieval models under realistic and challenging conditions.
AI & ML interests
Enterprise AI and ML, Foundation Models, Responsible AI
Datasets and models of the Otter-Knowledge project
GGUF-formatted versions of IBM Granite 3.2 models. Licensed under the Apache 2.0 license.
- ibm-research/granite-3.2-2b-instruct-GGUF
  Text Generation • 3B • Updated • 489 • 9
- ibm-research/granite-3.2-8b-instruct-GGUF
  Text Generation • 8B • Updated • 677 • 7
- ibm-research/granite-vision-3.2-2b-GGUF
  3B • Updated • 506 • 10
- ibm-research/granite-guardian-3.2-3b-a800m-GGUF
  Text Generation • 3B • Updated • 42
This category highlights the collective efforts of the AI Automation team in advancing Industry 4.0 applications and exploring innovations beyond it.
- AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance
  Paper • 2506.03828 • Published • 13
- FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes
  Paper • 2506.03278 • Published • 5
- ibm-research/AssetOpsBench
  Viewer • Updated • 2.81k • 246 • 1
- AssetOpsBench
  📉 Evaluating Autonomous AI Agents for Industry 4.0 Tasks
Welcome to IBM’s multi-modal foundation model for materials, FM4M, designed to support and advance research in materials science and chemistry.
- ibm-research/materials.pos-egnn
  Graph Machine Learning • Updated • 395 • 7
- ibm-research/materials.mhg-ged
  Feature Extraction • Updated • 565 • 4
- ibm-research/materials.selfies-ted2m
  Feature Extraction • Updated • 74 • 2
- ibm-research/materials.selfies-ted
  Feature Extraction • 0.4B • Updated • 21.3k • 9