Design Optimization of Direct Heuristic Dynamic Programming Based on Hybrid Estimation of Distributi

Source: 2015 National Conference of Theoretical Computer Science (2015全国理论计算机科学学术年会) | Citations: 0 | Uploaded by: whqqqqqqq

【Authors】: Xiong LUO, Mi ZHOU, Yixuan LV

【Affiliation】: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijin

【Source】: 2015 National Conference of Theoretical Computer Science (2015全国理论计算机科学学术年会)

【Publication date】: 2015, Issue 7

【Keywords】: Approximate Dynamic Programming; Direct Heuristic Dynamic Programming; Estimation


Excerpt from the paper

As an important class of approximate dynamic programming, direct heuristic dynamic programming (DHDP) is discussed in this paper. DHDP performs well owing to its model-free online learning capability. The classical DHDP is implemented with a gradient-based adaptation learning algorithm for its neural networks; in this paper we instead present a design strategy for DHDP that uses a novel hybrid estimation of distribution algorithm for online learning and control. The proposed design optimization method trains the neural network weights with a faster convergence rate and can be viewed as an improvement of DHDP. A simulation is conducted on a practical system plant to test the online learning performance of our approach, and the simulation results show its effectiveness.
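The excerpt does not describe the hybrid algorithm itself, so the following is only a minimal sketch of the general idea of training weights with an estimation of distribution algorithm (EDA) instead of gradients. It uses a plain univariate Gaussian EDA, not the authors' hybrid variant, and a single linear neuron stands in for the DHDP networks. The names `gaussian_eda` and `toy_cost`, the toy data, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def gaussian_eda(cost, dim, pop_size=100, elite_frac=0.3,
                 generations=60, init_std=2.0, seed=0):
    """Minimize `cost` over R^dim with a univariate Gaussian EDA.

    Illustrative only: a generic EDA, not the hybrid algorithm of the paper.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim          # model: one independent Gaussian per weight
    std = [init_std] * dim
    n_elite = max(2, int(pop_size * elite_frac))
    best_w, best_c = None, float("inf")

    for _ in range(generations):
        # 1. Sample candidate weight vectors from the current model.
        population = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(pop_size)]
        # 2. Evaluate every candidate and rank by cost (lower is better).
        scored = sorted(((cost(w), w) for w in population),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_c:
            best_c, best_w = scored[0]
        # 3. Re-estimate the model from the best candidates only.
        elite = [w for _, w in scored[:n_elite]]
        for j in range(dim):
            column = [w[j] for w in elite]
            mean[j] = statistics.mean(column)
            # A small floor keeps the sampling distribution from collapsing.
            std[j] = max(statistics.pstdev(column), 1e-6)

    return best_w, best_c


# Hypothetical toy problem: fit y = w0 * x + w1 to data from y = 2x - 1.
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]
YS = [2.0 * x - 1.0 for x in XS]


def toy_cost(w):
    """Sum of squared errors of a single linear neuron (assumed stand-in)."""
    return sum((w[0] * x + w[1] - y) ** 2 for x, y in zip(XS, YS))


if __name__ == "__main__":
    weights, err = gaussian_eda(toy_cost, dim=2)
    print("weights:", weights, "squared error:", err)
```

The loop needs only cost evaluations, not derivatives, which is why a population-based search of this kind can replace a gradient-based weight update; whether it converges faster on a given plant depends on the problem and on the settings chosen.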

Other papers

Household Electric Power Load Forecasting based on A Novel Combining Model

Individual household electricity consumption is a major part of a city's electricity market. The accurate prediction of household power load is very important for power sector to reasonable de

Conference

support vector machines; seasonal autoregressive integrated moving average; combin

An Effective Herd Behavior Detection Approach based on Density Clustering

Herd behavior is a phenomenon that often appears in the stock market. It is caused by the irrational imitation of investors and is expressed as major investors make similar investing decisions in a sho

Conference

Herd Behavior Detection; Density Clustering; Behavior Computing

A Simplified Enumeration Scheme for Minimizing Total Completion Time plus Total Penalty with Release

In this paper, we consider the problem of scheduling jobs with release dates and rejection on a bounded single parallel batching machine. Our objective is to minimize the sum of total completion time o

Conference

Batch scheduling; PTAS; Release date; Total penalty

Particle Swarm Optimization Based Algorithm for Conditional Probability Neural Network Learning

The conditional probability neural network (CPNN) has particular advantages in pattern classification problems. However, how to find the optimal parameters of the CPNN to achieve better performance is an extra

Conference

Age Estimation; Label Distribution Learning; Conditional Probability Neural Networ

A PM10 concentration level prediction model based on continuous HMM for Lanzhou city

Air pollution in Lanzhou city has caused wide public concern in recent years. Among the factors leading to air pollution in Lanzhou city, high PM10 concentration is an important one. Thus, pre

Conference

PM10 concentration-level; prediction; hidden Markov model of continuous observatio

Flow-based adaptive mesh refinement algorithms in the simulation of flow pasting a cylinder

In CFD simulations, the smaller the cell size is, the more accurate the result is. However, a smaller cell size in all simulation regions means many more cells, which in turn increases the consumption o

Conference

Adaptive mesh refinement; flow-based; OpenFOAM

Stochastic Model Checking Framework for Complex Cloud Applications

In a virtualized and dynamic cloud computing environment, all resources such as infrastructure, hardware, platform, software and data can be virtualized and partitioned into some kinds of resources pool

Conference

Cloud computing; Stochastic Model Checking; Markov Processes; service computing; Mod

Real-time Ligaturing Simulation of Blood Vessel in Virtual Simulation Training System of Liver Surge

This paper presents an integrated method for ligaturing simulation of blood vessel in Virtual Simulation Training System of Liver Surgery. The integrated method mainly includes four aspects: simulation

Conference

Ligaturing simulation; Suture; FTL algorithm; Collision detection; Motion decomposit

Fault Diagnosis in High-speed Train Running Gears with improved Deep Belief Networks

This paper explores the application of Deep Belief Networks (DBNs) to the processing of high-speed train vibration signals. Firstly, a new method based on DBNs is proposed. The vibration signals are prepro

Conference

Deep Belief Networks; Feature Extraction; Fault Diagnosis; K-Nearest Neighbor

A Lower Bound on Max-SAT for Regular(3,4)-CNF

A regular (3, 4)-CNF formula F is a 3-CNF formula, where each variable occurs exactly four times in F. A regular (3, 4, u)-CNF formula F is a regular (3, 4)-CNF formula, where each variable occurs u ti

Conference

MAX-SAT; Lower bound; regular (3, 4, u)-CNF formula; bipartite cover set; clause cover
