Alibaba Debuts Open-Source Deep Research Agent with Benchmarks Rivaling OpenAI

Tongyi DeepResearch, a 30B-parameter agent backed by an expanded WebAgent framework, challenges proprietary tools on efficiency

Alibaba’s Tongyi Lab has released Tongyi DeepResearch, a new open-source deep research AI agent, asserting that it matches or approaches the performance of leading proprietary tools in complex information retrieval and reasoning.

The agent has been integrated into Alibaba’s mapping app Amap and its legal research tool Tongyi FaRui, where it now offers enhanced case law retrieval with verified citations.

DeepResearch contains approximately 30.5 billion total parameters, of which only about 3.3 billion are activated per token at inference.
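To put those figures in proportion, the short sketch below works out the share of parameters active per token. It uses only the two approximate counts quoted above; the function and variable names are illustrative and do not come from Alibaba's code.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters activated per token (both counts in billions)."""
    if total_params_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_params_b / total_params_b


# Approximate figures quoted in the article, in billions of parameters.
TOTAL_B = 30.5
ACTIVE_B = 3.3

fraction = active_fraction(TOTAL_B, ACTIVE_B)
print(f"{fraction:.1%} of parameters active per token")  # prints "10.8% of parameters active per token"
```

In other words, roughly one in nine parameters does work on any given token, which is the basis for the efficiency claim.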

It is designed for long-horizon, multi-step information seeking.

On benchmarks such as Humanity’s Last Exam, BrowseComp, WebWalkerQA, xBench-DeepSearch and GAIA, DeepResearch posts state-of-the-art or near state-of-the-art results among open-source agents.

Alibaba has also expanded its WebAgent framework, which includes models and tools like WebWalker, WebDancer, WebSailor, WebShaper, and WebWatcher.

These components aim to enhance web traversal, reasoning, and agentic search.

WebWalkerQA, for example, is a benchmark used to test performance on web navigation tasks involving many pages.

In public statements, Alibaba claims that DeepResearch achieves “incredible efficiency” compared to U.S. proprietary tools, largely because it activates only a fraction of its parameters per token and can handle extended context and complex retrieval without using the full parameter set for every operation.

While early users and reviewers praise its transparency and open-source nature, some analysts warn that independent testing in more varied real-world environments will be needed to confirm robustness, reliability, and scalability—especially for domains that demand high accuracy or are sensitive to error.

