Natural Language Training Interface

One of Casino of Life's most powerful features is its natural language interface for training AI agents. This guide explains how to communicate effectively with your agents in plain English.

Introduction to CaballoLoko

CaballoLoko is your AI training assistant in Casino of Life. It acts as an interpreter between your natural language commands and the underlying reinforcement learning systems.

Basic Communication

You can communicate with CaballoLoko using the chat() method:

from casino_of_life.agents import CaballoLoko

# Create the assistant, then send it a plain-English training instruction.
caballo_loko = CaballoLoko()
response = caballo_loko.chat("Train Liu Kang to be aggressive with fireballs")

Training Commands

CaballoLoko understands a wide range of training directives. Here are some examples:

Character Selection

# Reuses the caballo_loko instance created in Basic Communication.
caballo_loko.chat("I want to train Scorpion")
caballo_loko.chat("Switch to Sub-Zero")

Strategy Definition

caballo_loko.chat("Train the agent to focus on defensive play")
caballo_loko.chat("Make the agent more aggressive in the first round")
caballo_loko.chat("Teach the agent to use combos more frequently")

Move-Specific Training

Training Parameters
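
One possible way to pass training parameters is to state them in the prompt itself. The sketch below assumes CaballoLoko can read values such as timesteps and learning rate from plain English, which this guide does not confirm. `TrainingRequest` and its fields are hypothetical helpers for building a consistent prompt; they are not part of Casino of Life.

```python
from dataclasses import dataclass


@dataclass
class TrainingRequest:
    """Hypothetical container for settings to mention in a prompt."""

    character: str
    style: str
    timesteps: int
    learning_rate: float

    def to_prompt(self) -> str:
        # Build one plain-English sentence that names every setting.
        return (
            f"Train {self.character} to play a {self.style} style "
            f"for {self.timesteps:,} timesteps "
            f"with a learning rate of {self.learning_rate}"
        )


request = TrainingRequest("Sub-Zero", "defensive", 100_000, 0.0003)
prompt = request.to_prompt()
print(prompt)

# With the library installed, the prompt would be sent as usual:
# caballo_loko.chat(prompt)
```

Keeping the values in one structure makes it easier to repeat a run with a single parameter changed.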

Getting Feedback

You can ask CaballoLoko for feedback on the training progress:
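
As a minimal sketch, assuming feedback is requested through the same chat() method shown earlier (no dedicated feedback call is documented here), the questions below are illustrative phrasings only. `StandInAssistant` is a hypothetical placeholder so the snippet runs without Casino of Life installed; replace it with `CaballoLoko()` in practice.

```python
class StandInAssistant:
    """Hypothetical placeholder exposing the same chat() call as CaballoLoko."""

    def chat(self, message: str) -> str:
        # A real assistant would return its own analysis; this only echoes.
        return f"(stand-in reply to: {message})"


def ask_for_feedback(assistant, questions):
    """Send each question through chat() and return (question, reply) pairs."""
    return [(question, assistant.chat(question)) for question in questions]


# Illustrative phrasings; the wording CaballoLoko responds to best is an assumption.
feedback_questions = [
    "How is the training progressing?",
    "What weaknesses does the agent still have?",
    "What should I change to improve the agent's defense?",
]

# With the library installed: assistant = CaballoLoko()
assistant = StandInAssistant()
feedback = ask_for_feedback(assistant, feedback_questions)
for question, reply in feedback:
    print(f"> {question}\n{reply}\n")
```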

Advanced Interactions

CaballoLoko can also help with more complex training scenarios.

Best Practices

  1. Be specific: The more specific your instructions, the better CaballoLoko can interpret them.

  2. Build incrementally: Start with basic strategies before adding complexity.

  3. Ask for feedback: Regularly check in on training progress and request suggestions.

  4. Combine approaches: Mix natural language guidance with programmatic fine-tuning for best results.

  5. Save successful conversations: You can save productive training dialogues for future use.
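
The last practice can be sketched as a thin wrapper that records every prompt and reply, then writes them to a JSON file. This guide does not describe a built-in way to save conversations, so `RecordedChat`, its `save()` method, and the file name are hypothetical; the only call taken from this guide is chat(). `StandInAssistant` is a placeholder so the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path


class StandInAssistant:
    """Hypothetical placeholder exposing the same chat() call as CaballoLoko."""

    def chat(self, message: str) -> str:
        return f"(stand-in reply to: {message})"


class RecordedChat:
    """Wrap any object with a chat() method and keep a transcript of each turn."""

    def __init__(self, assistant):
        self.assistant = assistant
        self.turns = []

    def chat(self, message: str) -> str:
        reply = self.assistant.chat(message)
        # Store the reply as text so the transcript is always JSON-serializable.
        self.turns.append({"prompt": message, "response": str(reply)})
        return reply

    def save(self, path) -> None:
        Path(path).write_text(json.dumps(self.turns, indent=2), encoding="utf-8")


# With the library installed: session = RecordedChat(CaballoLoko())
session = RecordedChat(StandInAssistant())
session.chat("I want to train Scorpion")
session.chat("Train the agent to focus on defensive play")

log_path = Path(tempfile.gettempdir()) / "scorpion_defensive_session.json"
session.save(log_path)

# Reload the file to confirm the dialogue can be reused later.
saved_turns = json.loads(log_path.read_text(encoding="utf-8"))
print(f"Saved {len(saved_turns)} turns to {log_path}")
```

Because the wrapper exposes the same chat() call, it can stand in wherever the assistant is used, and the saved prompts can be replayed against a fresh agent.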

Example Training Dialogue

Here's an example of a productive training conversation:
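
The sketch below scripts one possible conversation that follows the practices above: pick a character, set a basic strategy, add complexity, then ask for feedback. The first three prompts are taken from earlier sections of this guide; the final feedback question is an illustrative phrasing. `StandInAssistant` is a hypothetical placeholder that echoes its input, so the replies shown are not what CaballoLoko would actually say.

```python
class StandInAssistant:
    """Hypothetical placeholder exposing the same chat() call as CaballoLoko."""

    def chat(self, message: str) -> str:
        return f"(stand-in reply to: {message})"


# An incremental dialogue: character, basic strategy, refinement, feedback.
dialogue = [
    "I want to train Scorpion",
    "Train the agent to focus on defensive play",
    "Teach the agent to use combos more frequently",
    "How is the training progressing?",
]

# With the library installed: assistant = CaballoLoko()
assistant = StandInAssistant()

transcript = []
for message in dialogue:
    reply = assistant.chat(message)
    transcript.append(f"You: {message}")
    transcript.append(f"CaballoLoko: {reply}")

print("\n".join(transcript))
```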

With CaballoLoko's natural language interface, you can build sophisticated training protocols without manually programming complex reward functions or action patterns.
