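生物通
January
Evaluating large language models (LLMs) in explainable deep reinforcement learning (explainable deep ...
This article evaluates three approaches, CoT, MCTS augmentation, and SFT, for generating explanations of reinforcement learning. It finds that MCTS significantly improves the explanation quality of large models in complex environments (such as Lunar Lander), while SFT is more effective for small and medium-sized models. Using LLMs as judges, the study verifies that the automated evaluation framework agrees closely with human evaluation (Cohen's κ=0.77, Spearman ρ=0.88).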
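As a rough illustration of the two agreement statistics quoted above, the sketch below computes Cohen's κ on categorical labels and Spearman's ρ on numeric scores using only the Python standard library. The function names and the sample ratings are hypothetical and are not taken from the study; they only show how agreement between an LLM judge and a human rater could be measured.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # Fraction of items on which the raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    vx = sum((p - mx) ** 2 for p in rx)
    vy = sum((q - my) ** 2 for q in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one list is constant")
    return cov / (vx * vy) ** 0.5


# Hypothetical ratings, invented for illustration only.
human_labels = ["good", "good", "bad", "bad"]
judge_labels = ["good", "good", "bad", "good"]
human_scores = [1, 2, 3, 4, 5]
judge_scores = [2, 4, 6, 8, 10]

print(cohen_kappa(human_labels, judge_labels))
print(spearman_rho(human_scores, judge_scores))
```

κ corrects raw agreement for the agreement expected by chance, so it suits categorical verdicts; ρ compares only the ordering of scores, so it suits graded quality ratings.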