

Date: 2019-11-16 08:45


The goal of this Assignment is to expose you to some of the 'delights' of building machine agents for playing video games. For this purpose we will use the 'Deadly Corridor' task from the VizDoom game engine. VizDoom is an open source port of the FPS title Doom that provides several different 'task' configurations as well as the 'death match' for which the game is well known. What makes this platform interesting from a learning agent perspective is that the first person perspective renders the task partially observable, whereas most instances of the Atari arcade games provide full observability. The basic challenge of your Assignment is to demonstrate the application of a visual reinforcement learning agent on the specific instance of the 'Deadly Corridor' task.

In order to get you started, we will provide a paper detailing the application of the Tangled Program Graph (TPG) framework for visual reinforcement learning to VizDoom. You will need to familiarize yourself with this reference. Two code bases for TPG are made available:

  • Python, maintained by Ryan Amaral, with some FAQ.
  • Java, maintained by Robert Smith.
      o Windows OS compatible Eclipse package: copy this into your Eclipse workspace folder and import the Assignment. You might also need to change the Java path to match your version of Java.
      o Once imported, all the dependencies should be properly set up and you should be able to drop the TPG source into the Assignment. Then update your API execution file and it will be able to see all the correct code/DLL paths without any need for additional setup.

The Deadly Corridor task requires you to collect the 'armour' in the last of a sequence of 3 rooms connected by short corridors, as per the following figure:

Your agent is spawned in the first room (LHS) and has to pass the opponent agents present in each room in order to finally collect the armour (RHS). Given that you are using a TPG learning agent for this task, you will have to consider how to provide rewards for achieving useful behaviours in this task. One example might be to reward removing opponent agents from the game as well as minimizing distance to the armour. Other factors might include reducing the cost (to character health) of being hit or experimenting with different methods of reproduction.

The baseline behaviour corresponds to an agent that dies in room 2, and is worth a grade of B-.

You need to do the following tasks for this Assignment:

  • Provide your code, and show outcomes from your Assignment.
  • 4 page written report. Such a report needs to summarize the findings of your Assignment, detailing what you have learned over the course of the Assignment. View this as an opportunity to pass on some important/pragmatic tricks of the trade and/or caveats you picked up over the course of the Assignment. With this in mind, empirical evidence needs to be demonstrated to emphasize the significance of your findings/recommendations.
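
The reward design suggested earlier (rewarding the removal of opponent agents and progress towards the armour, while charging for lost health) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Snapshot` class, the function names and every weight are hypothetical and are not part of VizDoom or of either TPG code base. How the values are read from the game is left open, and the weights would need to be tuned empirically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical per-step summary of the game state (not a VizDoom type)."""
    kills: int                  # opponents removed so far
    health: float               # current character health
    distance_to_armour: float   # distance to the armour, in map units
    dead: bool = False          # True once the agent has died


def shaped_reward(prev, curr, kill_bonus=100.0, progress_weight=1.0,
                  damage_weight=0.5, death_penalty=100.0):
    """Reward for one step, computed from two consecutive snapshots.

    All weights are illustrative assumptions, not recommended values.
    """
    # Reward each opponent removed since the previous step.
    reward = kill_bonus * (curr.kills - prev.kills)
    # Reward getting closer to the armour (negative when moving away).
    reward += progress_weight * (prev.distance_to_armour - curr.distance_to_armour)
    # Charge for health lost; health gained is ignored.
    reward -= damage_weight * max(0.0, prev.health - curr.health)
    # One-off penalty on the step where the agent dies.
    if curr.dead and not prev.dead:
        reward -= death_penalty
    return reward


def episode_fitness(snapshots, **weights):
    """Sum the shaped reward over consecutive snapshot pairs of one episode."""
    return sum(shaped_reward(a, b, **weights)
               for a, b in zip(snapshots, snapshots[1:]))


# Made-up episode: one opponent removed, then the agent dies further along.
episode = [
    Snapshot(kills=0, health=100.0, distance_to_armour=1000.0),
    Snapshot(kills=1, health=90.0, distance_to_armour=900.0),
    Snapshot(kills=1, health=0.0, distance_to_armour=850.0, dead=True),
]
print(episode_fitness(episode))  # prints 100.0
```

Keeping each term behind its own weight makes it straightforward to switch a factor off (for example `damage_weight=0.0`) and compare the resulting behaviour across runs, which is one way to gather the empirical evidence the report asks for.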

