Technical Architecture
This document describes the technical architecture of Casino of Life: its components, data flow, and integration points.
System Overview
Casino of Life consists of several interconnected components that work together to enable natural language-driven AI training for retro fighting games.
Core Components
1. Natural Language Interface
The natural language interface, powered by CaballoLoko, translates human instructions into training configurations.
Key Classes:
CaballoLoko: Main interface for natural language processing
IntentProcessor: Identifies training intents from text
ParameterExtractor: Extracts specific training parameters
ResponseGenerator: Creates human-readable responses
Example Flow:
User input is processed by CaballoLoko.chat()
IntentProcessor identifies the training intent
ParameterExtractor pulls specific parameters
Configuration is passed to the training pipeline
ResponseGenerator creates a human-readable response
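The flow above can be sketched in a few lines of Python. This is an illustrative stand-in, not the project's implementation: the functions identify_intent, extract_parameters, generate_response and chat, the TrainingConfig container, and the keyword and regex matching are all assumptions chosen to show the order of the steps.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TrainingConfig:
    """Hypothetical container for what the training pipeline receives."""
    intent: str
    params: dict = field(default_factory=dict)


def identify_intent(text: str) -> str:
    """Stand-in for IntentProcessor: naive keyword matching."""
    lowered = text.lower()
    if "train" in lowered:
        return "train"
    if "evaluate" in lowered or "test" in lowered:
        return "evaluate"
    return "unknown"


def extract_parameters(text: str) -> dict:
    """Stand-in for ParameterExtractor: pull a few values with regexes."""
    params = {}
    steps = re.search(r"(\d[\d,]*)\s*(?:time)?steps", text, re.IGNORECASE)
    if steps:
        params["total_timesteps"] = int(steps.group(1).replace(",", ""))
    algo = re.search(r"\b(PPO|A2C|DQN|SAC)\b", text, re.IGNORECASE)
    if algo:
        params["algorithm"] = algo.group(1).upper()
    return params


def generate_response(config: TrainingConfig) -> str:
    """Stand-in for ResponseGenerator."""
    if config.intent == "unknown":
        return "Sorry, I could not tell what you want to do."
    details = ", ".join(f"{k}={v}" for k, v in sorted(config.params.items()))
    return f"Starting '{config.intent}' with {details or 'default settings'}."


def chat(text: str):
    """Mirrors the flow: intent, then parameters, then config, then response."""
    config = TrainingConfig(identify_intent(text), extract_parameters(text))
    return config, generate_response(config)
```

Calling chat("Train an aggressive fighter with PPO for 100,000 timesteps") returns a config with intent "train" plus a reply string. In this sketch the config is simply returned; in the real system it would be handed to the training pipeline.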
2. Game Environment
Built on top of the Stable-Retro library, the game environment component handles game emulation and state management.
Key Classes:
RetroEnv: Main environment class for game emulation
ObservationProcessor: Processes raw game frames
ActionSpace: Defines available actions for the agent
StateManager: Handles game state loading and saving
Technical Details:
Stochastic frame skipping (2-4 frames)
84x84 grayscale observation processing
4-frame stacking for temporal information
Multi-player support (2 players)
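To make the frame skipping and frame stacking concrete, here is a minimal sketch using only the standard library. It assumes a simplified step() that returns (observation, reward, done); CountingEnv and SkipAndStack are hypothetical names, and the 84x84 grayscale conversion is left out because it needs an image library.

```python
import random
from collections import deque


class CountingEnv:
    """Toy stand-in for an emulator: each step returns the frame index."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0                 # constant reward per emulated frame
        done = self.t >= 50
        return self.t, reward, done


class SkipAndStack:
    """Repeats each action for a random 2-4 frames and keeps the last 4 observations."""

    def __init__(self, env, min_skip=2, max_skip=4, stack=4, seed=None):
        self.env = env
        self.min_skip, self.max_skip = min_skip, max_skip
        self.frames = deque(maxlen=stack)
        self.rng = random.Random(seed)

    def reset(self):
        first = self.env.reset()
        # Fill the stack with copies of the first frame so its length is constant.
        for _ in range(self.frames.maxlen):
            self.frames.append(first)
        return tuple(self.frames)

    def step(self, action):
        skip = self.rng.randint(self.min_skip, self.max_skip)  # inclusive bounds
        total_reward, done = 0.0, False
        for _ in range(skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward   # rewards from skipped frames are summed
            if done:
                break
        self.frames.append(obs)      # only the last frame of the skip is kept
        return tuple(self.frames), total_reward, done
```

Summing the rewards over skipped frames means the agent still sees everything that happened during the skip. The random skip length adds variation, so the agent cannot rely on exact frame timing.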
3. Training Pipeline
The training pipeline integrates with Stable-Baselines3 to provide reinforcement learning capabilities.
Key Classes:
DynamicAgent: Main agent class with adaptive learning
TrainingManager: Handles training configuration and execution
ModelRegistry: Manages saved models and checkpoints
HyperparameterOptimizer: Optimizes training parameters
Supported Algorithms:
PPO (Proximal Policy Optimization)
A2C (Advantage Actor Critic)
DQN (Deep Q-Network)
SAC (Soft Actor-Critic)
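Not every algorithm works with every action space. The sketch below encodes the action-space support listed in the Stable-Baselines3 documentation as a lookup table. The compatible_algorithms helper is a hypothetical name, and how Casino of Life itself chooses or validates an algorithm is not specified here.

```python
# Action-space support per algorithm, as listed in the Stable-Baselines3 docs.
SUPPORTED_ACTION_SPACES = {
    "PPO": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "A2C": {"Box", "Discrete", "MultiDiscrete", "MultiBinary"},
    "DQN": {"Discrete"},
    "SAC": {"Box"},
}


def compatible_algorithms(action_space_kind: str) -> list:
    """Return the algorithm names that accept the given action-space type."""
    return sorted(
        name
        for name, kinds in SUPPORTED_ACTION_SPACES.items()
        if action_space_kind in kinds
    )
```

For a button-based (MultiBinary) action space this returns only A2C and PPO. DQN needs a Discrete space and SAC needs a continuous (Box) space.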
4. Reward System
The modular reward system allows for flexible definition of success criteria.
Key Classes:
BaseRewardEvaluator: Abstract base class for reward evaluators
MultiObjectiveRewardEvaluator: Combines multiple reward sources
RewardEvaluatorManager: Manages and switches between reward systems
RewardScaler: Scales and normalizes rewards
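One plausible way to combine several reward sources is a weighted sum followed by scaling and clipping. The sketch below assumes exactly that. The WeightedReward class, the clip_scale function, and the info keys enemy_health_lost and health_lost are hypothetical and do not describe the actual MultiObjectiveRewardEvaluator or RewardScaler.

```python
from typing import Callable, Dict, Tuple

RewardFn = Callable[[dict], float]   # maps a game-info dict to a scalar reward


def damage_dealt(info: dict) -> float:
    """Positive reward for health removed from the opponent (hypothetical key)."""
    return float(info.get("enemy_health_lost", 0))


def damage_taken(info: dict) -> float:
    """Negative reward for health the agent lost (hypothetical key)."""
    return -float(info.get("health_lost", 0))


class WeightedReward:
    """Weighted sum of several reward sources."""

    def __init__(self, sources: Dict[str, Tuple[RewardFn, float]]):
        self.sources = sources

    def evaluate(self, info: dict) -> float:
        return sum(weight * fn(info) for fn, weight in self.sources.values())


def clip_scale(reward: float, scale: float = 0.1, limit: float = 1.0) -> float:
    """Scale the reward, then clip it to the range [-limit, limit]."""
    return max(-limit, min(limit, reward * scale))
```

With weights of 1.0 for offense and 0.5 for defense, dealing 10 damage while taking 4 gives a composite reward of 8.0 before scaling. Keeping each source as a separate function makes it easy to adjust weights without touching the sources themselves.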
5. Web Interface
The web interface provides visualization and control capabilities.
Key Classes:
TrainingServer: Main server class for the web interface
DashboardManager: Manages dashboard components and views
WebSocketHandler: Handles real-time data streaming
APIEndpoints: Defines RESTful API endpoints
Data Flow
Training Initialization:
User provides natural language instruction
CaballoLoko processes instruction into training parameters
Training pipeline configures agent and environment
Training begins with specified parameters
Training Loop:
Environment produces observation
Observation processor converts raw frames to agent input
Agent selects action based on policy
Environment executes action and returns next observation and reward
Reward evaluators calculate composite reward
Agent updates its policy based on experience
Metrics are collected and sent to web interface
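The training loop steps can be illustrated with a toy environment and agent. This is a simplified sketch, not the project's pipeline: ToyEnv, ToyAgent, train and the on_metrics callback are hypothetical, the environment reward is used directly instead of going through reward evaluators, and the "policy update" is a simple running average rather than a Stable-Baselines3 algorithm.

```python
import random


class ToyEnv:
    """Two-action stub environment; action 1 is always the rewarded one."""

    def __init__(self, episode_length=20):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0                      # a single dummy observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, self.t >= self.episode_length


class ToyAgent:
    """Epsilon-greedy agent that tracks a running value estimate per action."""

    def __init__(self, n_actions=2, epsilon=0.1, lr=0.1, seed=0):
        self.values = [0.0] * n_actions
        self.epsilon, self.lr = epsilon, lr
        self.rng = random.Random(seed)

    def select_action(self, observation):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))   # explore
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, action, reward):
        # Move the estimate for the chosen action toward the observed reward.
        self.values[action] += self.lr * (reward - self.values[action])


def train(env, agent, episodes=20, on_metrics=print):
    for episode in range(episodes):
        observation = env.reset()
        done, episode_reward = False, 0.0
        while not done:
            action = agent.select_action(observation)
            observation, reward, done = env.step(action)
            agent.update(action, reward)
            episode_reward += reward
        # Stand-in for sending metrics to the web interface.
        on_metrics({"episode": episode, "reward": episode_reward})
```

Passing a list's append method as on_metrics collects one metrics record per episode, which is roughly what a dashboard would consume.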
Model Persistence:
Checkpoints are saved at configured intervals
Models can be loaded for continued training
Trained agents can be exported for deployment
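Interval-based checkpointing can be sketched as follows. The example writes JSON files so that it runs without any ML dependencies; CheckpointWriter and its file naming are hypothetical. Stable-Baselines3 ships its own CheckpointCallback for this purpose, though whether Casino of Life uses it is not stated here.

```python
import json
import tempfile
from pathlib import Path


class CheckpointWriter:
    """Saves state every N steps and can reload the most recent checkpoint."""

    def __init__(self, directory, every_n_steps):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.every_n_steps = every_n_steps

    def maybe_save(self, step, state):
        """Write a checkpoint if the step falls on the interval, else return None."""
        if step == 0 or step % self.every_n_steps != 0:
            return None
        # Zero-padding keeps alphabetical order equal to numeric order.
        path = self.directory / f"checkpoint_{step:08d}.json"
        path.write_text(json.dumps({"step": step, "state": state}))
        return path

    def load_latest(self):
        """Return the newest checkpoint's contents, or None if there are none."""
        paths = sorted(self.directory.glob("checkpoint_*.json"))
        if not paths:
            return None
        return json.loads(paths[-1].read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        writer = CheckpointWriter(tmp, every_n_steps=100)
        for step in range(251):
            writer.maybe_save(step, {"weights": [step]})
        print(writer.load_latest())
```

Resuming training then amounts to loading the latest checkpoint and continuing from its recorded step.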
Integration Points
External Libraries
Casino of Life integrates with several key libraries:
Stable-Retro: Game emulation and environment
Stable-Baselines3: Reinforcement learning algorithms
PyTorch: Neural network backend
FastAPI: Web server and API
React: Frontend dashboard
Custom Integration
You can extend Casino of Life with custom components, for example a reward evaluator derived from BaseRewardEvaluator.
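As an illustration, a custom reward evaluator might look like the sketch below. The BaseRewardEvaluator defined here is a hypothetical stand-in with an assumed evaluate(info) method, so the real base class may expose a different interface. ComboRewardEvaluator and the hit_landed key are likewise invented for the example.

```python
from abc import ABC, abstractmethod


class BaseRewardEvaluator(ABC):
    """Hypothetical stand-in; the real base class may define a different interface."""

    @abstractmethod
    def evaluate(self, info: dict) -> float:
        ...


class ComboRewardEvaluator(BaseRewardEvaluator):
    """Custom component: rewards consecutive hits more than isolated ones."""

    def __init__(self, bonus_per_hit: float = 0.5):
        self.bonus_per_hit = bonus_per_hit
        self.streak = 0

    def evaluate(self, info: dict) -> float:
        if info.get("hit_landed", False):
            self.streak += 1
            # The first hit is worth 1.0; each further hit in a row adds the bonus.
            return 1.0 + self.bonus_per_hit * (self.streak - 1)
        self.streak = 0               # a miss resets the combo
        return 0.0
```

Because the base class is abstract, a subclass that forgets to implement evaluate cannot be instantiated, so the mistake surfaces before training starts.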
Performance Considerations
Memory Management: Automatic garbage collection for efficient memory use
Vectorized Environments: Support for parallel environment execution
Frame Skipping: Reduces computational load while maintaining learning capability
Checkpointing: Efficient model saving and loading
Observation Caching: Reduces redundant processing
The modular architecture of Casino of Life allows for flexible configuration and extension while maintaining performance and stability.