Training Method of Automatic Driving Strategy Based on Deep Reinforcement Learning

doi:10.12146/j.issn.2095-3135.201703003

Journal of Integration Technology, 2017, Issue 3, pp. 29-40

Abstract:

Autonomous driving is an important application area of artificial intelligence research. This paper proposes a method for learning an autonomous driving policy model based on deep reinforcement learning. The deep network is trained through online interaction with the environment and is pre-trained on experience data from professional drivers, which shrinks the space that has to be explored. Experience replay is then used to speed up convergence. The state space is clustered and then resampled so that the training samples are closer to independent and identically distributed, which improves the generalization ability of the policy model. Compared with the neural fitted Q-iteration algorithm, the proposed method shortens training time by more than 90% and improves control stability by more than 30%. On test tracks slightly more complex than the training set, with track length taken as the baseline, clustering-based resampling increases the policy model's average travel distance by more than 70% compared with Q-learning with experience filtering.
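The abstract describes clustering the state space and resampling from the clusters, but it does not say which clustering algorithm, which state features, or which sampling scheme the authors used. The sketch below is therefore only an illustration of the general idea under stated assumptions: plain k-means over the state vectors, followed by picking a cluster uniformly at random and then a transition uniformly within it. The names `kmeans` and `cluster_resample`, the transition layout, and all parameters are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on tuples of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: random points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_resample(transitions, k, batch_size, seed=0):
    """Draw a training batch spread evenly across state clusters.

    transitions: list of (state, action, reward, next_state) tuples,
    where state is a tuple of floats (hypothetical layout).
    """
    rng = random.Random(seed)
    labels = kmeans([t[0] for t in transitions], k, seed=seed)
    by_cluster = defaultdict(list)
    for t, lab in zip(transitions, labels):
        by_cluster[lab].append(t)
    clusters = list(by_cluster.values())
    # Pick a cluster uniformly, then a transition uniformly inside it, so that
    # rare situations are not swamped by the most common ones.
    return [rng.choice(rng.choice(clusters)) for _ in range(batch_size)]


if __name__ == "__main__":
    # Toy buffer: 95 "straight road" states near 0 and 5 "sharp curve" states near 1.
    straight = [((i / 1000,), 0, 1.0, (i / 1000,)) for i in range(95)]
    curve = [((0.9 + j / 100,), 1, 1.0, (0.9 + j / 100,)) for j in range(5)]
    batch = cluster_resample(straight + curve, k=2, batch_size=200)
    share = sum(1 for t in batch if t[0][0] > 0.5) / len(batch)
    print(f"share of curve states in batch: {share:.2f}")
```

In the toy buffer, curve states make up 5% of the stored transitions, so uniform sampling would rarely show them to the network. Sampling per cluster raises their share in each batch, which is one plausible reading of how resampling could bring the training data closer to the balanced, independent samples the abstract refers to.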

Cite this article

XIA Wei, LI Huiyun. Training Method of Automatic Driving Strategy Based on Deep Reinforcement Learning[J]. Journal of Integration Technology, 2017, 6(3): 29-40

Published online: 2017-05-22