Project Summary

Our hero Sam fights zombies in a 16x16 arena. Unlike the Sam in the video game, our Sam fights with a golden sword. Sam can move and turn to deal damage, and he tries to kill zombies approaching from all directions. Within a 20-second time limit, Sam tries to kill all the enemies while taking as little damage as possible.

Current situation video:

Approach

Deep Q Learning

Evaluation

At present, we evaluate Sam’s performance by the average return over N episodes. We use this metric because a rising average return indicates that the agent is learning which decisions lead to reward.
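
As a rough illustration of this metric, the sketch below keeps the returns of the most recent N episodes and averages them. The class name ReturnTracker, the window parameter, and the reward values are hypothetical and are not taken from our project code.

```python
from collections import deque


class ReturnTracker:
    """Hypothetical helper: average return over the last `window` episodes."""

    def __init__(self, window):
        # deque with maxlen drops the oldest return once the window is full
        self.returns = deque(maxlen=window)

    def add_episode(self, rewards):
        """Record one finished episode given its per-step rewards."""
        episode_return = sum(rewards)
        self.returns.append(episode_return)
        return episode_return

    def average(self):
        """Mean return over the stored episodes (0.0 if none recorded yet)."""
        if not self.returns:
            return 0.0
        return sum(self.returns) / len(self.returns)


# Usage with made-up rewards: three episodes, window of N = 2
tracker = ReturnTracker(window=2)
tracker.add_episode([1.0, -0.5, 2.0])   # return 2.5
tracker.add_episode([0.0, 3.0])         # return 3.0
tracker.add_episode([-1.0, -1.0])       # return -2.0
print(tracker.average())                # mean of the last two returns
```

A sliding window is only one possible choice; averaging over all episodes so far would smooth the curve more but react more slowly to recent learning.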

Qualitative:

Quantitative:

Remaining Goals and Challenges

In future versions, we expect Sam to move and turn faster so that he can dodge enemy attacks. Sam’s attack timing also needs to improve so that he can deal more damage and kill enemies. In addition, the enemies will spawn at random positions on the map. To handle this, we need to improve our Q table and our angle and target-selection functions. For instance, the agent needs to decide which direction to turn and which side has fewer enemies, avoid being knocked back into walls or corners, and attack faster and more accurately. If we can achieve all of the above, we may try different enemies, such as witches who can fly in the air, in which case Sam would need to switch to a weapon such as a bow to shoot them.
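
To make the angle and target-selection idea concrete, here is a minimal sketch of one possible approach: pick the nearest enemy, then compute the smallest signed turn toward it. The function names are hypothetical, enemy positions are assumed to be available as (x, z) pairs, and the yaw convention (0 degrees faces +z, 90 degrees faces -x) is an assumption based on Minecraft's coordinate system. This is not our current implementation.

```python
import math


def nearest_enemy(agent_x, agent_z, enemies):
    """Return the closest (x, z) enemy position, or None if there are none."""
    if not enemies:
        return None
    return min(enemies, key=lambda e: math.hypot(e[0] - agent_x, e[1] - agent_z))


def yaw_to_target(agent_x, agent_z, target_x, target_z):
    """Yaw in degrees that faces the target.

    Assumes yaw 0 faces +z and yaw 90 faces -x.
    """
    dx = target_x - agent_x
    dz = target_z - agent_z
    return -math.degrees(math.atan2(dx, dz))


def turn_delta(current_yaw, desired_yaw):
    """Smallest signed turn in degrees, in the range [-180, 180)."""
    return (desired_yaw - current_yaw + 180.0) % 360.0 - 180.0


# Usage with made-up positions: agent at the origin, currently facing +z
enemies = [(5.0, 5.0), (1.0, -1.0), (3.0, 0.0)]
target = nearest_enemy(0.0, 0.0, enemies)
delta = turn_delta(0.0, yaw_to_target(0.0, 0.0, target[0], target[1]))
print(target, delta)
```

Choosing the nearest enemy is the simplest rule. Weighting targets by how many enemies are on each side, or by distance to walls and corners, would be natural extensions for the goals listed above.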

Resources Used

build_test.py, mob_fun.py, tabular_q_learning.py, and moving_target_test.py from the Canvas page.
https://eg.bucknell.edu/~cld028/courses/379-FA19/MalmoDocs/classmalmo_1_1_mission_spec-members.html
https://microsoft.github.io/malmo/0.17.0/Python_Examples/Tutorial.pdf
https://tsmatz.wordpress.com/2020/07/09/minerl-and-malmo-reinforcement-learning-in-minecraft/
https://github.com/microsoft/malmo/blob/master/Schemas/Types.xsd
https://microsoft.github.io/malmo/0.14.0/Schemas/Mission.html#element_Weather
https://towardsdatascience.com/simple-reinforcement-learning-q-learning-fcddc4b6fe56