Alibaba AI Model Tops Humans in Reading Comprehension

Score one for the machines in the battle of man versus machine: an Alibaba deep-learning model this month topped humans for the first time in one of the world’s most challenging reading-comprehension tests.

Alibaba’s Institute of Data Science and Technologies (iDST) said Monday its deep neural network model scored 82.44 on the Stanford Question Answering Dataset (SQuAD) on Jan. 11, beating the human score of 82.304 for Exact Match, i.e. the share of questions for which the answer given exactly matches a reference answer. SQuAD is a large-scale reading-comprehension dataset comprising more than 100,000 question-answer pairs based on more than 500 Wikipedia articles.
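For readers curious how an Exact Match score is computed, the sketch below shows one way to do it. It is an illustration only, not Alibaba’s or Stanford’s code: it assumes the normalization conventions commonly used when scoring SQuAD answers (lowercasing, then dropping punctuation, the articles “a”, “an” and “the”, and extra whitespace), and the question IDs and answers are made up.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, then strip punctuation, English articles and extra whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """True if the prediction equals any reference answer after normalization."""
    normalized = normalize_answer(prediction)
    return any(normalized == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions, references):
    """Percentage of questions answered exactly; unanswered questions count as misses."""
    hits = sum(
        exact_match(predictions.get(question_id, ""), gold_answers)
        for question_id, gold_answers in references.items()
    )
    return 100.0 * hits / len(references)


# Hypothetical data: each question may have several acceptable reference answers.
references = {
    "q1": ["gravity", "Gravity."],
    "q2": ["Denver Broncos"],
    "q3": ["1066"],
}
predictions = {
    "q1": "gravity",
    "q2": "The Denver Broncos",  # matches once the article is dropped
    "q3": "in 1066",             # extra word, so not an exact match
}

score = exact_match_score(predictions, references)
```

Here two of the three made-up predictions count as exact matches, so `score` is about 66.7. A model’s headline figure, such as the 82.44 reported above, is this kind of percentage taken over the full test set.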

“It is our great honor to witness the milestone where machines surpass humans in reading comprehension,” said Luo Si, iDST’s chief scientist for Natural Language Processing. “We are thrilled to see NLP research has achieved significant progress over the year. We look forward to sharing our model-building methodology with the wider community and exporting the technology to our clients in the near future.”

Teams competing in the challenge need to build machine-learning models that can answer the questions in the dataset, such as “what causes rain?” The Alibaba model’s accuracy was tied to its ability to read hierarchically, from paragraphs to sentences to words, locating the precise phrases that contain potential answers. The model, which is built on a Hierarchical Attention Network, is viewed as having strong commercial value. Alibaba has used the underlying technology in its 11.11 Global Shopping Festival for several years, with machines answering large numbers of inbound customer inquiries.

Other potential customer-service uses include guided tutorials for museum visitors and online responses to inquiries from medical patients.

SQuAD is regarded as the world’s top machine reading-comprehension test and attracts companies, universities and research institutes, from Google, Facebook, IBM and Microsoft to Carnegie Mellon University, Stanford University and the Allen Research Institute.

While the SQuAD result is a milestone, it is just one of several proof points delivered recently by iDST’s Natural Language Processing team. Other successes include top scores and prizes in the ACM CIKM Cup, which focuses on personalized e-commerce search, in Chinese Grammar Error Diagnosis, and in English named-entity classification tasks at the Text Analysis Conference, a series of workshops organized by the U.S. National Institute of Standards and Technology.

iDST is Alibaba’s primary research arm for artificial intelligence. It focuses heavily on Natural Language Processing and on solving problems that lead to real-world applications.
