v2.0

v1.6

v1.5

Materials

Competitions

Videos

Poster

[Poster] The Neural MMO Platform for Massively Multiagent Research (NeurIPS 2021) (v1.5.3)

Presentations

The IJCAI 2022 Neural MMO Challenge (IJCAI 2022) (v1.5.1)
[Slides] [Video] Neural MMO: Building a Massively Multiagent Research Platform with Ray and RLlib (Ray Summit 2021, Online) (v1.5.0)
[Slides] Neural MMO: A Saga in Deep Reinforcement Learning (English Week 2021, IUT Vannes) (v1.5.0)

v1.5.5: Better Baselines

The CleanRL baseline slightly exceeds the performance of the scripted combat model
Training requires 1 GPU and 32 cores and takes under 3 days
Includes checkpoints and training code

v1.5.4: CleanRL Integration

Emulation layer that makes Neural MMO look like a simpler environment:
- Multiagent -> N single-agent environments
- Flat observations and actions
- Fixed horizons
Single-file CleanRL baseline using the above wrappers
Evaluation tools
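
As a rough illustration of what such an emulation layer does, the sketch below splits a multiagent observation into per-agent observations, flattens each one into a single vector, and counts steps toward a fixed horizon. The names `flatten_obs`, `split_multiagent`, and `FixedHorizon`, as well as the toy observation layout, are hypothetical and are not the Neural MMO or CleanRL API.

```python
from typing import Any, Dict, List


def flatten_obs(obs: Any) -> List[float]:
    """Recursively flatten nested dicts/lists of numbers into one flat list.

    Dict keys are visited in sorted order so the layout is deterministic.
    """
    if isinstance(obs, dict):
        flat: List[float] = []
        for key in sorted(obs):
            flat.extend(flatten_obs(obs[key]))
        return flat
    if isinstance(obs, (list, tuple)):
        flat = []
        for item in obs:
            flat.extend(flatten_obs(item))
        return flat
    return [float(obs)]


def split_multiagent(obs_by_agent: Dict[int, Any]) -> Dict[int, List[float]]:
    """Treat each agent's entry as its own flat, single-agent observation."""
    return {agent_id: flatten_obs(obs) for agent_id, obs in obs_by_agent.items()}


class FixedHorizon:
    """Step counter that reports the end of an episode every `horizon` steps."""

    def __init__(self, horizon: int) -> None:
        self.horizon = horizon
        self.t = 0

    def tick(self) -> bool:
        """Advance one step; return True (and restart) when the horizon is hit."""
        self.t += 1
        if self.t >= self.horizon:
            self.t = 0
            return True
        return False


# Toy nested observation for one agent (layout is an assumption).
obs = {1: {"tile": [[0, 1], [2, 3]], "self": {"health": 10, "food": 5}}}
flat = split_multiagent(obs)
```

A single-agent learner can then consume `flat[1]` as an ordinary fixed-size vector, which is the general idea behind making the environment "look simpler".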

v1.5.3: PettingZoo Integration

v1.5.2: Ray Tune and WandB Integrations

Trinity:
- Added support for simulations with both scripted and trained agents
- Added ability to name scripted agents based on their policy
Embyr:
- Minor aesthetic changes to prefer a flat-shaded style
- Some overlay features are currently broken by an RLlib bug; a fix is in progress
Projekt:
- Replaced Bokeh dashboard with WandB integration
- Wrapped RLlib trainers in Ray Tune to enable parallel evaluation during training
- Added Skill Rating (SR) metric for direct comparison to scripted baselines
- Changed batching mode to agent steps, yielding a large policy improvement

v1.5.1: Competition Build

Blade:
- Modularized configs to enable dynamic environment customization
- Reworked terrain generation to create more diverse terrain
- Increased default map and population size
- Added competition configs and baselines
Trinity: Formal API for scripted agents using the same observation interface as learned models
Embyr: Culled vertices and recalculated normals to improve terrain smoothness and performance

v1.5: Large maps, Dashboard, Scripted Baselines

Blade: Full rework to support large environments and scripted players/NPCs
- Map representation
  - Terrain generation for large maps
  - Environment caching to enable fast resets
  - Tiles are now limited to one occupying agent
  - Reworked tile material enum and properties
- NPCs
  - Passive: Meanders around the map
  - Neutral: Meanders around the map until attacked, then fights back
  - Hostile: Actively hunts and attacks players and other NPCs
  - Level ranges and spawning locations are configurable for all NPC types
  - Navigation based on A* search
- Scripted Baselines
  - Extension of the NPC AI module to support scripted player policies
  - Fixed-horizon food/water min-max search with Dijkstra’s algorithm and dynamic programming backends
  - Intentional exploration capabilities enable broad coverage of large and small maps
- Equipment
  - NPCs spawn with chestplates/platelegs of a level appropriate for their skills
  - Players/NPCs wearing equipment drop it upon death
  - Players automatically equip any items better than their current items
  - Equipment provides a large bonus to defense
  - Reworked combat formulas to account for this new system
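
The A* navigation mentioned for NPCs above can be illustrated with a minimal grid search. This is a generic sketch, not the Neural MMO implementation: the grid encoding (0 passable, 1 impassable), 4-connected movement with unit step cost, and the `astar` function name are all assumptions made for the example.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Pos = Tuple[int, int]


def astar(grid: List[List[int]], start: Pos, goal: Pos) -> Optional[List[Pos]]:
    """Return a shortest 4-connected path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(p: Pos) -> int:
        # Manhattan distance never overestimates on a unit-cost 4-connected grid.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    # Heap entries are (estimated total cost, cost so far, position).
    frontier = [(heuristic(start), 0, start)]
    came_from: Dict[Pos, Optional[Pos]] = {start: None}
    cost: Dict[Pos, int] = {start: 0}

    while frontier:
        _, g, current = heapq.heappop(frontier)
        if g > cost[current]:
            continue  # Stale entry: a cheaper route to this tile was found later.
        if current == goal:
            # Walk the parent links back to the start, then reverse.
            path: List[Pos] = []
            node: Optional[Pos] = current
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue  # Off the map or impassable.
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


# A wall across the middle row forces a detour through the right column.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
```

In a setting where tiles hold at most one agent, occupied tiles could be marked impassable in the grid before each search; that detail is left out here.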
Trinity: New home for non-neural-specific infrastructure and tools
- Serialized observations
  - Maintains a flat tensor representation of the environment state
  - This representation is kept synchronous with the game state representation
  - Each entity (Player/Tile) is represented as discrete and continuous vectors
  - Observations are computed by slicing from tensor representations without traversing game objects
  - Discrete values are flat-indexed for ease of use in embedding layers
- Evaluation
  - Runs the given model on multiple maps and aggregates data for the dashboard
  - Outputs a tabular summary of the results for baselines and publications
  - Usable on training maps, held-out evaluation maps (default), and transfer maps
- Dashboard
  - Environment log function records customizable data for customizable plot types whenever an agent dies
  - Data is aggregated during training and at the end of evaluation
  - Bokeh dashboard is built using the aggregated data for the specified plot types
  - Dashboard is rendered in an interactive browser session
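
To illustrate what flat-indexing discrete values means for the serialized observations above, the sketch below maps each (attribute, value) pair to a unique row of one shared embedding table by adding a per-attribute offset. The attribute names, their cardinalities, and the `flat_index` helper are invented for the example and do not reflect the actual Neural MMO schema.

```python
from itertools import accumulate
from typing import Dict, List

# Hypothetical discrete attributes and the number of values each can take.
CARDINALITIES: Dict[str, int] = {"material": 6, "population": 8, "level": 100}

# Start of each attribute's block inside one shared embedding table.
_names = list(CARDINALITIES)
_starts = [0] + list(accumulate(CARDINALITIES.values()))
OFFSETS: Dict[str, int] = dict(zip(_names, _starts))
TABLE_SIZE: int = _starts[-1]  # Total number of rows the table would need.


def flat_index(entity: Dict[str, int]) -> List[int]:
    """Convert per-attribute values into indices into the shared table."""
    indices = []
    for name in _names:
        value = entity[name]
        if not 0 <= value < CARDINALITIES[name]:
            raise ValueError(f"{name}={value} is out of range")
        indices.append(OFFSETS[name] + value)
    return indices


indices = flat_index({"material": 2, "population": 3, "level": 10})
```

Because every attribute owns a disjoint index range, all discrete attributes can share a single lookup table (for instance one `torch.nn.Embedding(TABLE_SIZE, dim)` in PyTorch) instead of one table per attribute.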
Ethyr: Simplified attribute processing
- The Trinity additions flatten the bottom layer of the observation hierarchy
- This removes a slow loop and significant complexity from IO embedding/unembed modules
- We have standardized on the Recurrent baseline architecture for this release
Embyr: Full rework to support large environments and scripted players/NPCs
- Map representation
  - All terrain representation code has been rewritten using the performant Unity Entity Component System
  - Tiles are loaded into and welded together in chunks
  - Lava/water assets have been replaced with more performant variants
- Visuals
  - Tile textures are now configurable with the hifi (default)/medfi/lofi command
  - Attack animations have been replaced with more distinctive and aesthetic assets
  - A graphical bug causing sharp normals in some tile models has been fixed
  - UI and console retouched to match the new website theme
Projekt: Demo code for evaluation, overlays and logging
- Unified command-line utility for map generation, training, evaluation, visualization, and rendering
- Experiment config for canonical large/small baseline tasks
- Single-file ~400 line RLlib wrapper/demo
- Non-RLlib specific code has been moved to Trinity
- Improved overall code cohesion and quality

v1.4

Materials

ICML 2020 LAOW Workshop [Poster] [Paper] Ingredients for Massively Multiagent Artificial Intelligence Research

RLlib Support and Overlays

Blade: Minor API changes have been made for compatibility with Gym and RLlib
- Exposed the registerOverlay() and getValStim() methods for writing custom overlays
- Environment reset method now returns only obs instead of (obs, rewards, dones, infos)
- Environment obs and dones are now both dictionaries keyed by agent ids rather than agent game objects
- The IO modules from v1.3 now delegate batching to the user (e.g., RLlib), which removes several potential sources of error
- A bug allowing agents to use melee combat from farther away than intended has been fixed
- Minor range and damage balancing has been performed across all three combat styles
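
The reset and step conventions described above can be sketched with a toy environment. `ToyEnv`, its fixed agent count, and its placeholder observations are hypothetical; only the return shapes follow the notes above (reset returns obs alone, and obs and dones are dictionaries keyed by agent id). Keying rewards and infos the same way is an assumption.

```python
from typing import Dict, List, Tuple

Obs = Dict[int, List[float]]


class ToyEnv:
    """Toy multiagent environment illustrating the return conventions only."""

    def __init__(self, n_agents: int = 2, lifetime: int = 3) -> None:
        self.n_agents = n_agents
        self.lifetime = lifetime
        self.t = 0

    def _observe(self) -> Obs:
        # Placeholder observation: the current timestep, keyed by integer id.
        return {agent_id: [float(self.t)] for agent_id in range(self.n_agents)}

    def reset(self) -> Obs:
        """Return only the observations, not (obs, rewards, dones, infos)."""
        self.t = 0
        return self._observe()

    def step(
        self, actions: Dict[int, int]
    ) -> Tuple[Obs, Dict[int, float], Dict[int, bool], Dict[int, dict]]:
        """Advance one tick; every returned dict is keyed by agent id."""
        self.t += 1
        obs = self._observe()
        rewards = {agent_id: 0.0 for agent_id in obs}  # Assumed keying.
        dones = {agent_id: self.t >= self.lifetime for agent_id in obs}
        infos: Dict[int, dict] = {agent_id: {} for agent_id in obs}  # Assumed keying.
        return obs, rewards, dones, infos


env = ToyEnv()
obs = env.reset()
obs, rewards, dones, infos = env.step({agent_id: 0 for agent_id in obs})
```

Keying by plain ids rather than game objects keeps the dictionaries hashable and serializable across processes, which matters once batching is handled by an external library.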
Trinity: This module has been temporarily shelved
- Now hosts the Twisted server code for interfacing with the client
- Core functionality has been ported to RLlib in collaboration with the developers
- We are working with the RLlib developers to add additional features essential to the long-term scalability of Neural MMO
- The Trinity/Ascend namespace will likely be revived in later infrastructure expansions. For now, RLlib is stable enough that delegating infrastructure to it is pragmatic and lets us focus on environment development, baseline models, and research
Ethyr: Proper NN building blocks for complex worlds
- Streamlined IO, memory, and attention modules for use in building PyTorch policies
- A high-quality pretrained baseline reproducible at the scale of a single desktop
Embyr: Overlay shaders for visualizing learned policies
- Pressing tab now brings up an in-game console
- A help menu lists several shader options for visualizing exploration, attention, and learned value functions
- Shaders are rendered over the environment in real-time with partial transparency
- It is no longer necessary to start the client and server in a particular order
- The client no longer needs to be relaunched when the server restarts
- Agents now turn smoothly towards their direction of movement and targeted adversaries
- A graphical bug causing some agent attacks to render at ground level has been fixed
- Moved twistedserver.py into the main neural-mmo repository to better separate client and server
- Confirmed working on Ubuntu, macOS, and Windows + WSL
/projekt: Demo code fully rewritten for RLlib
- The new demo is much shorter, approximately 250 lines of code
- State-of-the-art LSTM + self-attention based policy trained with distributed PPO
- Batched GPU evaluation for real-time rendering
- Trains in a few hours on a reasonably good desktop (5 rollout worker cores, 1 underutilized GTX 1080Ti GPU)
- To avoid introducing RLlib into the base environment as a hard dependency, we provide a small wrapper class over Realm using RLlib’s environment types
- Attempted to migrate from a pip requirements.txt to Poetry for streamlined dependency management, but Poetry is still too buggy at present
- We have migrated configuration to Google Fire for improved command line argument parsing
