Technical Architecture

This document describes the technical architecture of Casino of Life: its core components, how data flows between them, and where it integrates with external libraries.

System Overview

Casino of Life is built from interconnected components that together enable natural-language-driven AI training for retro fighting games. The diagram below shows how instructions, observations, and rewards move between them.

┌─────────────────┐     ┌───────────────────┐     ┌────────────────┐
│                 │     │                   │     │                │
│  Natural        │────▶│  Training         │────▶│  Game          │
│  Language       │     │  Pipeline         │     │  Environment   │
│  Interface      │     │                   │     │                │
│                 │     │                   │     │                │
└────────▲────────┘     └───────┬───────────┘     └────────┬───────┘
         │                      │                          │
         │                      │                          │
         │                      ▼                          ▼
┌────────┴────────┐     ┌───────────────────┐     ┌────────────────┐
│                 │     │                   │     │                │
│  Web            │◀───▶│  Reward           │◀────│  Observation   │
│  Interface      │     │  System           │     │  Processor     │
│                 │     │                   │     │                │
└─────────────────┘     └───────────────────┘     └────────────────┘

Core Components

1. Natural Language Interface

The natural language interface, powered by CaballoLoko, translates human instructions into training configurations.

Key Classes:

  • CaballoLoko: Main interface for natural language processing

  • IntentProcessor: Identifies training intents from text

  • ParameterExtractor: Extracts specific training parameters

  • ResponseGenerator: Creates human-readable responses

Example Flow:

  1. User input is processed by CaballoLoko.chat()

  2. IntentProcessor identifies the training intent

  3. ParameterExtractor pulls specific parameters

  4. Configuration is passed to the training pipeline

  5. ResponseGenerator creates a human-readable response
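
The five steps above can be sketched as a small pipeline. This is a minimal, self-contained illustration and not the library's actual implementation: the function names, the keyword rules, and the `TrainingConfig` fields are hypothetical stand-ins for `IntentProcessor`, `ParameterExtractor`, and `ResponseGenerator`, whose real signatures are not shown on this page.

```python
import re
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Hypothetical container for what the training pipeline would receive."""
    intent: str
    algorithm: str = "PPO"
    timesteps: int = 100_000


def identify_intent(text: str) -> str:
    """Stand-in for IntentProcessor: naive keyword matching."""
    lowered = text.lower()
    if "train" in lowered:
        return "train"
    if "evaluate" in lowered or "test" in lowered:
        return "evaluate"
    return "unknown"


def extract_parameters(text: str) -> dict:
    """Stand-in for ParameterExtractor: pull an algorithm name and a step count."""
    params = {}
    algo = re.search(r"\b(PPO|A2C|DQN|SAC)\b", text, flags=re.IGNORECASE)
    if algo:
        params["algorithm"] = algo.group(1).upper()
    steps = re.search(r"(\d[\d,_]*)\s*(?:time)?steps", text, flags=re.IGNORECASE)
    if steps:
        params["timesteps"] = int(re.sub(r"[,_]", "", steps.group(1)))
    return params


def generate_response(config: TrainingConfig) -> str:
    """Stand-in for ResponseGenerator."""
    return (f"Starting '{config.intent}' with {config.algorithm} "
            f"for {config.timesteps} timesteps.")


def chat(text: str):
    """Stand-in for CaballoLoko.chat(): runs steps 1-5 in order."""
    config = TrainingConfig(intent=identify_intent(text), **extract_parameters(text))
    return config, generate_response(config)


config, reply = chat("Train an aggressive fighter with PPO for 50,000 steps")
print(reply)
```

A real natural language interface would likely rely on a language model instead of regular expressions; the point here is only the order of the stages and the shape of the data handed to the training pipeline.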

2. Game Environment

Built on top of the Stable-Retro library, the game environment component handles game emulation and state management.

Key Classes:

  • RetroEnv: Main environment class for game emulation

  • ObservationProcessor: Processes raw game frames

  • ActionSpace: Defines available actions for the agent

  • StateManager: Handles game state loading and saving

Technical Details:

  • Stochastic frame skipping (2-4 frames)

  • 84x84 grayscale observation processing

  • 4-frame stacking for temporal information

  • Multi-player support (2 players)
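
The preprocessing steps listed above (stochastic frame skip, 84x84 grayscale conversion, 4-frame stacking) can be illustrated with a standard-library-only sketch. The helper names below are hypothetical and are not part of `ObservationProcessor`; a production pipeline would normally use NumPy or OpenCV, and the actual resize method used by Casino of Life is not specified here.

```python
import random
from collections import deque

FRAME_SIZE = 84      # side length of the processed observation
STACK_SIZE = 4       # number of consecutive frames the agent sees


def to_grayscale(frame):
    """Convert an H x W grid of (r, g, b) tuples to luminance (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in frame]


def resize_nearest(gray, size=FRAME_SIZE):
    """Nearest-neighbour resize of a 2-D grid to size x size."""
    height, width = len(gray), len(gray[0])
    return [[gray[y * height // size][x * width // size] for x in range(size)]
            for y in range(size)]


def sample_frame_skip(rng, low=2, high=4):
    """Stochastic frame skip: repeat each action for 2 to 4 emulator frames."""
    return rng.randint(low, high)


class FrameStack:
    """Keeps the most recent STACK_SIZE processed frames."""

    def __init__(self, size=STACK_SIZE):
        self.frames = deque(maxlen=size)

    def reset(self, frame):
        # Fill the stack with the first frame so the observation shape is constant.
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


rng = random.Random(0)
# Dummy 320x224 RGB frame standing in for raw emulator output.
raw = [[(x % 256, y % 256, 128) for x in range(320)] for y in range(224)]
processed = resize_nearest(to_grayscale(raw))
stack = FrameStack()
observation = stack.reset(processed)
skips = [sample_frame_skip(rng) for _ in range(100)]
print(len(observation), len(observation[0]), len(observation[0][0]))  # 4 84 84
```

Stacking frames gives the policy access to motion (for example, whether a character is moving toward or away from the opponent), which a single still frame cannot convey.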

3. Training Pipeline

The training pipeline integrates with Stable-Baselines3 to provide reinforcement learning capabilities.

Key Classes:

  • DynamicAgent: Main agent class with adaptive learning

  • TrainingManager: Handles training configuration and execution

  • ModelRegistry: Manages saved models and checkpoints

  • HyperparameterOptimizer: Optimizes training parameters

Supported Algorithms:

  • PPO (Proximal Policy Optimization)

  • A2C (Advantage Actor Critic)

  • DQN (Deep Q-Network)

  • SAC (Soft Actor-Critic)
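
Not every algorithm in this list accepts every kind of action space. The lookup below reflects the compatibility table in the Stable-Baselines3 documentation; the helper function is a hypothetical illustration, and how Casino of Life itself maps game controls onto each algorithm is not described on this page.

```python
# Action-space types each algorithm accepts, per the Stable-Baselines3 documentation.
SUPPORTED_ACTION_SPACES = {
    "PPO": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "A2C": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "DQN": {"Discrete"},
    "SAC": {"Box"},
}


def compatible_algorithms(action_space_type):
    """Return the algorithms that can train on the given action-space type."""
    return sorted(name for name, spaces in SUPPORTED_ACTION_SPACES.items()
                  if action_space_type in spaces)


print(compatible_algorithms("MultiBinary"))  # ['A2C', 'PPO']
print(compatible_algorithms("Discrete"))     # ['A2C', 'DQN', 'PPO']
```

Retro game environments commonly expose controller buttons as a multi-binary space, so PPO and A2C are the natural fit there; DQN needs a discrete action set and SAC a continuous one.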

4. Reward System

The modular reward system allows for flexible definition of success criteria.

Key Classes:

  • BaseRewardEvaluator: Abstract base class for reward evaluators

  • MultiObjectiveRewardEvaluator: Combines multiple reward sources

  • RewardEvaluatorManager: Manages and switches between reward systems

  • RewardScaler: Scales and normalizes rewards
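
One way to picture how these classes fit together is a weighted sum of individual evaluators followed by scaling and clipping. The sketch below reuses the class names from the list for readability, but the `evaluate` method, the state keys (`damage_dealt`, `damage_taken`), and the scaling rule are assumptions made for illustration, not the library's documented interface.

```python
from abc import ABC, abstractmethod


class BaseRewardEvaluator(ABC):
    """Hypothetical interface: map a game-state dict to a scalar reward."""

    @abstractmethod
    def evaluate(self, state: dict) -> float:
        ...


class DamageDealtEvaluator(BaseRewardEvaluator):
    def evaluate(self, state: dict) -> float:
        return float(state.get("damage_dealt", 0))


class DamageTakenEvaluator(BaseRewardEvaluator):
    def evaluate(self, state: dict) -> float:
        return -float(state.get("damage_taken", 0))


class MultiObjectiveRewardEvaluator(BaseRewardEvaluator):
    """Weighted sum of several evaluators."""

    def __init__(self, evaluators, weights):
        if len(evaluators) != len(weights):
            raise ValueError("one weight per evaluator is required")
        self.evaluators = list(evaluators)
        self.weights = list(weights)

    def evaluate(self, state: dict) -> float:
        return sum(w * e.evaluate(state) for e, w in zip(self.evaluators, self.weights))


class RewardScaler:
    """Scale a reward, then clip it to [-limit, limit]."""

    def __init__(self, scale=0.1, limit=1.0):
        self.scale = scale
        self.limit = limit

    def __call__(self, reward: float) -> float:
        return max(-self.limit, min(self.limit, reward * self.scale))


combined = MultiObjectiveRewardEvaluator(
    [DamageDealtEvaluator(), DamageTakenEvaluator()], weights=[1.0, 0.5]
)
scaler = RewardScaler(scale=0.1, limit=1.0)
state = {"damage_dealt": 12, "damage_taken": 4}
raw_reward = combined.evaluate(state)   # 1.0 * 12 + 0.5 * (-4) = 10.0
print(raw_reward, scaler(raw_reward))
```

Keeping each objective in its own evaluator means the weights can be tuned, or an evaluator swapped out, without touching the others.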

5. Web Interface

The web interface provides visualization and control capabilities.

Key Classes:

  • TrainingServer: Main server class for the web interface

  • DashboardManager: Manages dashboard components and views

  • WebSocketHandler: Handles real-time data streaming

  • APIEndpoints: Defines RESTful API endpoints

Data Flow

  1. Training Initialization:

    • User provides natural language instruction

    • CaballoLoko processes instruction into training parameters

    • Training pipeline configures agent and environment

    • Training begins with specified parameters

  2. Training Loop:

    • Environment produces observation

    • Observation processor converts raw frames to agent input

    • Agent selects action based on policy

    • Environment executes action and returns next observation and reward

    • Reward evaluators calculate composite reward

    • Agent updates its policy based on experience

    • Metrics are collected and sent to web interface

  3. Model Persistence:

    • Checkpoints are saved at configured intervals

    • Models can be loaded for continued training

    • Trained agents can be exported for deployment
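
The training loop and checkpointing stages can be sketched with a toy environment and a random agent. Everything in this block (`ToyEnv`, `RandomAgent`, `train`, and the callback parameters) is a hypothetical stand-in written for illustration; in the real system the agent would be a Stable-Baselines3 model, and observation processing and reward evaluation would sit between the environment and the agent.

```python
import random


class ToyEnv:
    """Stand-in environment: observation is a step counter, episodes last 5 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= 5
        return self.t, reward, done


class RandomAgent:
    """Stand-in agent: random policy; 'learning' just records experience."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.experience = []

    def act(self, observation):
        return self.rng.choice([0, 1])

    def update(self, transition):
        self.experience.append(transition)


def train(env, agent, total_steps, checkpoint_every, on_metrics, on_checkpoint):
    obs = env.reset()
    episode_reward = 0.0
    for step in range(1, total_steps + 1):
        action = agent.act(obs)                      # agent selects an action
        next_obs, reward, done = env.step(action)    # environment executes it
        agent.update((obs, action, reward, next_obs, done))  # learn from experience
        episode_reward += reward
        obs = next_obs
        if done:
            # Report per-episode metrics (e.g. to a dashboard), then start over.
            on_metrics({"step": step, "episode_reward": episode_reward})
            obs, episode_reward = env.reset(), 0.0
        if step % checkpoint_every == 0:
            on_checkpoint(step)                      # save at a configured interval


metrics, checkpoints = [], []
train(ToyEnv(), RandomAgent(), total_steps=20, checkpoint_every=10,
      on_metrics=metrics.append, on_checkpoint=checkpoints.append)
print(len(metrics), checkpoints)  # 4 [10, 20]
```

Passing metrics and checkpoint handling in as callbacks keeps the loop itself independent of the web interface and the storage backend.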

Integration Points

External Libraries

Casino of Life integrates with several key libraries:

  • Stable-Retro: Game emulation and environment

  • Stable-Baselines3: Reinforcement learning algorithms

  • PyTorch: Neural network backend

  • FastAPI: Web server and API

  • React: Frontend dashboard

Custom Integration

You can extend Casino of Life with custom components:

# Example: Custom observation processor
from casino_of_life.environment import ObservationProcessor, RetroEnv


class CustomObservationProcessor(ObservationProcessor):
    """Resizes observations to a custom resolution, then normalizes them."""

    def __init__(self, resolution=(96, 96)):
        super().__init__()
        self.resolution = resolution

    def process(self, observation):
        # _resize and _normalize are assumed to be helpers inherited
        # from ObservationProcessor
        processed_obs = self._resize(observation, self.resolution)
        processed_obs = self._normalize(processed_obs)
        return processed_obs


# Register and use the custom processor
env = RetroEnv(
    game='MortalKombatII-Genesis',
    observation_processor=CustomObservationProcessor(resolution=(96, 96))
)

Performance Considerations

  • Memory Management: Automatic garbage collection for efficient memory use

  • Vectorized Environments: Support for parallel environment execution

  • Frame Skipping: Reduces computational load while maintaining learning capability

  • Checkpointing: Efficient model saving and loading

  • Observation Caching: Reduces redundant processing

The modular architecture of Casino of Life allows for flexible configuration and extension while maintaining performance and stability.


Last updated 3 months ago
