ToolBench

Verified

Open-source framework for training LLMs to master thousands of real-world APIs.

Autonomous Agents · Agent Frameworks · 5.7k · Open source
Updated 2026-06-16

What is ToolBench?

ToolBench is an open-source project that builds supervised fine-tuning (SFT) datasets so language models can learn to call real APIs. It gathers thousands of REST endpoints, generates single-tool and multi-tool instructions, and annotates each instruction with a complete solution path: the reasoning steps, the API calls made, and the results returned.

Data creation relies on an enhanced ChatGPT instance together with a depth-first search decision tree that explores tool sequences efficiently. An API retriever component is included so models can discover relevant tools at inference time rather than depending on a fixed list.
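To make the depth-first search idea concrete, here is a minimal sketch of searching over tool sequences with backtracking. It is not ToolBench's implementation: the three toy tools, the integer "state", and the function name dfs_plan are all hypothetical, and the sketch enumerates every tool at each step, whereas a real pipeline would have the language model propose which call to try next.

```python
from typing import Callable, Optional

# Hypothetical toy tools: each maps an integer state to a new state,
# or returns None to signal a failed call.
Tool = Callable[[int], Optional[int]]

TOOLS: dict[str, Tool] = {
    "add_three": lambda x: x + 3,
    "double": lambda x: x * 2,
    "halve_even": lambda x: None if x % 2 else x // 2,
}


def dfs_plan(
    state: int,
    goal: int,
    max_depth: int,
    path: tuple[str, ...] = (),
) -> Optional[tuple[str, ...]]:
    """Depth-first search for a tool sequence that turns `state` into `goal`."""
    if state == goal:
        return path
    if len(path) >= max_depth:
        return None  # depth budget exhausted: backtrack
    for name, tool in TOOLS.items():
        result = tool(state)
        if result is None:
            continue  # failed call: abandon this branch, try the next tool
        found = dfs_plan(result, goal, max_depth, path + (name,))
        if found is not None:
            return found  # first successful path wins
    return None


plan = dfs_plan(1, 8, max_depth=3)
print(plan)  # ('add_three', 'double')
```

The point of the tree structure is that a failed or unproductive call only discards one branch; the search returns to the parent node and tries a sibling instead of restarting the whole attempt.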

The release targets researchers and developers who want reproducible tool-use capabilities in open models without relying solely on closed APIs or manual annotation.

Capabilities

Generate tool-use instruction data
Fine-tune ToolLLaMA models
Evaluate API-calling performance
Support thousands of real-world APIs
Provide a RapidAPI backend service

What you can build with ToolBench

Fine-tuning for API calling

Train or adapt models on the released dataset to improve accuracy on both simple and chained tool invocations.

Benchmarking tool-use agents

Apply the included ToolEval scripts to measure how well different LLMs plan and execute API sequences.
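A pass rate, meaning the fraction of instructions a model completes successfully, is one simple way to summarize this kind of benchmark. The sketch below only illustrates the arithmetic; the Attempt record and its fields are hypothetical and do not reflect ToolEval's actual data format or scripts.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """Hypothetical record of one model attempt at one instruction."""
    query_id: str
    solved: bool
    api_calls: int


def pass_rate(attempts: list[Attempt]) -> float:
    """Fraction of attempts marked as solved (0.0 for an empty list)."""
    if not attempts:
        return 0.0
    return sum(a.solved for a in attempts) / len(attempts)


runs = [
    Attempt("q1", True, 3),
    Attempt("q2", False, 5),
    Attempt("q3", True, 2),
    Attempt("q4", True, 4),
]
rate = pass_rate(runs)
print(rate)  # 0.75
```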

Open-domain tool retrieval

Combine the provided retriever with ToolLLaMA to let models fetch and use APIs they have never seen during training.
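The retrieval step can be pictured as ranking API descriptions by similarity to the user's request and passing only the top matches to the model. The sketch below uses a plain bag-of-words cosine similarity as a stand-in for a learned retriever; the API names, descriptions, and the retrieve function are invented for illustration and are not part of ToolBench.

```python
import math
from collections import Counter

# Hypothetical API catalogue; names and descriptions are made up.
API_DOCS = {
    "weather.get_forecast": "get the weather forecast for a city",
    "flights.search": "search flights between two airports",
    "currency.convert": "convert an amount between two currencies",
}


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the names of the k APIs whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(
        API_DOCS,
        key=lambda name: cosine(q, embed(API_DOCS[name])),
        reverse=True,
    )
    return ranked[:k]


top = retrieve("what is the weather forecast in Paris", k=1)
print(top)  # ['weather.get_forecast']
```

Because the candidate list is computed per request, the model is not limited to a fixed set of tools written into its prompt, which is what makes unseen APIs usable at inference time.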

Install ToolBench

Quick start
git clone git@github.com:OpenBMB/ToolBench.git
cd ToolBench
  1. Clone the ToolBench GitHub repository
  2. Download the latest data archive from the linked Google Drive folder
  3. Install dependencies listed in the project requirements
  4. Launch the RapidAPI backend service or use the hosted key after form approval
  5. Run the supplied fine-tuning or evaluation scripts with your chosen base model

Works with

OpenAI API, Python, LLaMA-2, RapidAPI

ToolBench: pros & cons

Pros

  • Massive, automatically generated dataset with intact reasoning traces
  • Open-source model checkpoints that reduce API hallucination
  • Built-in support for both single-tool and multi-tool scenarios
  • Public evaluation framework covering multiple model families

Cons

  • Requires access to RapidAPI or a local simulation server
  • Training still demands substantial GPU resources
  • Periodic server IP updates needed for the hosted backend

Frequently asked questions

ToolLLaMA is a LLaMA-based model fine-tuned on ToolBench data to improve tool-use performance and reduce hallucinations.
