Information-Entropy-Based Estimation of the Observation Function in POMDP Models

Published: 2015-10-28 Authors: 钟可立 (Zhong Keli), 王小捷 (Wang Xiaojie)

[Abstract] The partially observable Markov decision process (POMDP) is widely used to model decision tasks. The observation matrix of the model mainly captures the uncertainty of the environment. It is usually hard to obtain directly from training data, so additional information must be introduced to estimate it. This work introduces information entropy to correct the observation matrix, so that the corrected matrix reflects the uncertainty of the environment more faithfully. Experiments in a simulated environment show that the entropy-corrected estimate of the observation matrix improves the performance of the POMDP model, and in a POMDP-based dialogue system the corrected estimate improves the system's decision accuracy.

[Keywords] partially observable Markov decision process; uncertainty; intention identification; observation matrix; information entropy
