# API Reference

This document is a reference for the key modules and classes in the Casino of Life framework. Use it to look up constructor parameters, methods, and return types while building your applications.

### Module: `casino_of_life.agents`

#### Class: `DynamicAgent`

The primary agent class for training AI in retro fighting games.

**Constructor Parameters**

| Parameter          | Type                  | Default  | Description                                      |
| ------------------ | --------------------- | -------- | ------------------------------------------------ |
| `env`              | `RetroEnv`            | Required | The game environment                             |
| `policy`           | `str`                 | "PPO"    | RL algorithm to use ("PPO", "A2C", "DQN", "SAC") |
| `learning_rate`    | `float`               | 0.0003   | Learning rate for policy optimization            |
| `name`             | `str`                 | None     | Optional name for the agent                      |
| `reward_evaluator` | `BaseRewardEvaluator` | None     | Custom reward evaluator                          |
| `frame_stack`      | `int`                 | 4        | Number of frames to stack for observation        |
| `gamma`            | `float`               | 0.99     | Discount factor for future rewards               |
| `initial_policy`   | `object`              | None     | Pre-trained policy to start from                 |

**Methods**

| Method           | Parameters                      | Return Type       | Description                             |
| ---------------- | ------------------------------- | ----------------- | --------------------------------------- |
| `train`          | `timesteps: int, callback=None` | `dict`            | Train the agent for specified timesteps |
| `evaluate`       | `episodes: int, detailed=False` | `float` or `dict` | Evaluate agent performance              |
| `save`           | `filename: str`                 | None              | Save agent to disk                      |
| `load`           | `filename: str`                 | None              | Load agent from disk                    |
| `predict`        | `observation`                   | `tuple`           | Get action from current observation     |
| `set_parameters` | `params: dict`                  | None              | Update agent parameters                 |
| `get_parameters` | None                            | `dict`            | Get current agent parameters            |
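
To make the `gamma` and `frame_stack` parameters more concrete, the standalone sketch below computes a discounted return and keeps a fixed-length buffer of recent frames. It does not import `casino_of_life`; `discounted_return` and `FrameStack` are hypothetical helpers written for illustration and are not part of the library.

```python
from collections import deque


def discounted_return(rewards, gamma=0.99):
    """Sum rewards, weighting the reward at step t by gamma**t."""
    total = 0.0
    # Iterate backwards so each step folds in the discounted future return.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


class FrameStack:
    """Hypothetical helper: keeps only the most recent `size` frames."""

    def __init__(self, size=4):
        # deque discards the oldest frame once `size` frames are stored.
        self.frames = deque(maxlen=size)

    def push(self, frame):
        self.frames.append(frame)
        return list(self.frames)


stack = FrameStack(size=4)
for frame_id in range(6):
    stacked = stack.push(frame_id)  # integers stand in for image frames
```

A lower `gamma` makes the agent weigh immediate rewards more heavily, while a larger `frame_stack` gives the policy more motion context per observation.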

#### Class: `CaballoLoko`

The natural language interface for training guidance.

**Constructor Parameters**

| Parameter        | Type  | Default   | Description                            |
| ---------------- | ----- | --------- | -------------------------------------- |
| `language_model` | `str` | "default" | Language model to use                  |
| `max_history`    | `int` | 10        | Maximum conversation history to retain |

**Methods**

| Method                | Parameters     | Return Type | Description                        |
| --------------------- | -------------- | ----------- | ---------------------------------- |
| `chat`                | `message: str` | `str`       | Send message and get response      |
| `get_training_config` | `message: str` | `dict`      | Convert message to training config |
| `get_history`         | None           | `list`      | Get conversation history           |
| `clear_history`       | None           | None        | Clear conversation history         |
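
The `max_history` parameter suggests that older messages are dropped once the limit is reached. The sketch below shows one way such a bounded history could behave. `BoundedHistory` and its `add` method are hypothetical stand-ins, and the entry format is an assumption; the actual `CaballoLoko` internals may differ.

```python
from collections import deque


class BoundedHistory:
    """Hypothetical stand-in showing one way `max_history` could behave."""

    def __init__(self, max_history=10):
        # deque drops the oldest entry once maxlen is reached.
        self._messages = deque(maxlen=max_history)

    def add(self, role, text):
        # Assumed entry format: a dict with "role" and "text" keys.
        self._messages.append({"role": role, "text": text})

    def get_history(self):
        return list(self._messages)

    def clear_history(self):
        self._messages.clear()


history = BoundedHistory(max_history=3)
for i in range(5):
    history.add("user", f"message {i}")
retained = [entry["text"] for entry in history.get_history()]
```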

#### Class: `HierarchicalAgent`

Agent with hierarchical policy structure for complex behaviors.

**Constructor Parameters**

| Parameter            | Type              | Default  | Description               |
| -------------------- | ----------------- | -------- | ------------------------- |
| `env`                | `RetroEnv`        | Required | The game environment      |
| `high_level_policy`  | `HighLevelPolicy` | Required | Strategy selection policy |
| `low_level_policies` | `dict`            | Required | Action execution policies |

**Methods**

| Method    | Parameters       | Return Type | Description                            |
| --------- | ---------------- | ----------- | -------------------------------------- |
| `train`   | `timesteps: int` | `dict`      | Train both policy levels               |
| `predict` | `observation`    | `tuple`     | Get action using hierarchical decision |
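
A hierarchical decision can be pictured as two lookups: the high-level policy picks a strategy, and the low-level policy registered for that strategy picks the action. The sketch below illustrates the idea with plain functions. The observation layout, the strategy names, the action names, and the `(action, strategy)` return shape are all assumptions, not the library's actual behavior.

```python
def high_level_policy(observation):
    """Hypothetical strategy selector: returns a strategy name."""
    # Assumed observation layout: a dict with a "distance" key.
    return "attack" if observation["distance"] < 50 else "approach"


# Maps each strategy name to a low-level policy that returns a concrete action.
low_level_policies = {
    "attack": lambda observation: "PUNCH",
    "approach": lambda observation: "MOVE_FORWARD",
}


def hierarchical_predict(observation):
    """Select a strategy, then let that strategy's policy choose the action."""
    strategy = high_level_policy(observation)
    action = low_level_policies[strategy](observation)
    return action, strategy


close_range = hierarchical_predict({"distance": 10})
long_range = hierarchical_predict({"distance": 120})
```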

### Module: `casino_of_life.environment`

#### Class: `RetroEnv`

The game environment that wraps Stable-Retro for fighting games.

**Constructor Parameters**

| Parameter          | Type  | Default      | Description                                      |
| ------------------ | ----- | ------------ | ------------------------------------------------ |
| `game`             | `str` | Required     | Game ROM to use (e.g., "MortalKombatII-Genesis") |
| `state`            | `str` | "tournament" | Initial game state                               |
| `players`          | `int` | 2            | Number of players (1 or 2)                       |
| `character`        | `str` | None         | Player character to use                          |
| `observation_type` | `str` | "grayscale"  | Observation processing type                      |

**Methods**

| Method                  | Parameters     | Return Type | Description                                    |
| ----------------------- | -------------- | ----------- | ---------------------------------------------- |
| `step`                  | `action`       | `tuple`     | Execute action and return next state           |
| `reset`                 | None           | `ndarray`   | Reset environment to initial state             |
| `render`                | `mode="human"` | `ndarray`   | Render current frame                           |
| `close`                 | None           | None        | Close environment and free resources           |
| `clone`                 | `**kwargs`     | `RetroEnv`  | Create copy of environment with new parameters |
| `get_action_space`      | None           | `Space`     | Get available action space                     |
| `get_observation_space` | None           | `Space`     | Get observation space                          |
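
The `reset`, `step`, and `close` methods are typically combined into an episode loop. The sketch below runs such a loop against a stub environment so that it works without a game ROM. `StubEnv` is a hypothetical stand-in, and the `(observation, reward, done, info)` layout of the `step` tuple is an assumption based on the classic Gym convention; check the actual return value of `RetroEnv.step` before relying on it.

```python
import random


class StubEnv:
    """Hypothetical stand-in exposing the same method names as `RetroEnv`."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0]  # placeholder; the real environment returns an ndarray

    def step(self, action):
        self.steps += 1
        observation = [float(self.steps)]
        reward = 1.0
        done = self.steps >= self.episode_length
        return observation, reward, done, {}

    def close(self):
        pass


def run_episode(env, choose_action):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = choose_action(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    env.close()
    return total_reward


total = run_episode(StubEnv(episode_length=5), lambda obs: random.choice([0, 1]))
```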

### Module: `casino_of_life.reward_evaluators`

#### Class: `BaseRewardEvaluator`

Abstract base class for reward evaluators.

**Methods**

| Method     | Parameters                        | Return Type | Description            |
| ---------- | --------------------------------- | ----------- | ---------------------- |
| `evaluate` | `state, next_state, action, info` | `float`     | Calculate reward value |
| `reset`    | None                              | None        | Reset internal state   |
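
A custom evaluator would subclass the base class and implement `evaluate`. The sketch below defines a local stand-in for `BaseRewardEvaluator` that mirrors the two documented methods, so the example runs without the library installed. `HealthDeltaEvaluator` and the `"health"` / `"enemy_health"` state keys are hypothetical.

```python
from abc import ABC, abstractmethod


class BaseRewardEvaluator(ABC):
    """Local stand-in mirroring the two documented methods."""

    @abstractmethod
    def evaluate(self, state, next_state, action, info):
        """Return a scalar reward for the transition."""

    def reset(self):
        """Clear any per-episode state; nothing to clear here."""


class HealthDeltaEvaluator(BaseRewardEvaluator):
    """Hypothetical evaluator: rewards damage dealt, penalizes damage taken."""

    def evaluate(self, state, next_state, action, info):
        # Assumed state layout: dicts with "health" and "enemy_health" keys.
        damage_dealt = state["enemy_health"] - next_state["enemy_health"]
        damage_taken = state["health"] - next_state["health"]
        return float(damage_dealt - damage_taken)


evaluator = HealthDeltaEvaluator()
reward = evaluator.evaluate(
    {"health": 100, "enemy_health": 100},
    {"health": 90, "enemy_health": 75},
    action=None,
    info={},
)
```

With the real library, the subclass would instead inherit from `casino_of_life.reward_evaluators.BaseRewardEvaluator` and be passed to an agent through the `reward_evaluator` parameter.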

#### Class: `BasicRewardEvaluator`

Standard reward evaluator for fighting games.

**Constructor Parameters**

| Parameter        | Type    | Default | Description                          |
| ---------------- | ------- | ------- | ------------------------------------ |
| `health_reward`  | `float` | 1.0     | Reward for maintaining health        |
| `damage_penalty` | `float` | -1.0    | Penalty for taking damage            |
| `hit_reward`     | `float` | 0.5     | Reward for landing hits              |
| `block_reward`   | `float` | 0.2     | Reward for successful blocks         |
| `move_penalty`   | `float` | -0.01   | Penalty to discourage button mashing |

#### Class: `MultiObjectiveRewardEvaluator`

Combines multiple reward evaluators.

**Constructor Parameters**

| Parameter    | Type   | Default  | Description                         |
| ------------ | ------ | -------- | ----------------------------------- |
| `evaluators` | `list` | Required | List of reward evaluator instances  |
| `weights`    | `list` | None     | Optional weights for each evaluator |
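
One plausible way to combine evaluators is a weighted sum of their individual rewards. The sketch below illustrates that approach. `ConstantEvaluator` and `combine_rewards` are hypothetical, and treating a missing `weights` argument as equal weighting is an assumption; the library's actual combination rule is not specified here.

```python
class ConstantEvaluator:
    """Hypothetical evaluator that always returns a fixed reward."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, state, next_state, action, info):
        return self.value


def combine_rewards(evaluators, weights=None, *, state=None, next_state=None,
                    action=None, info=None):
    """Return the weighted sum of each evaluator's reward."""
    if weights is None:
        # Assumption: missing weights mean every evaluator counts equally.
        weights = [1.0] * len(evaluators)
    if len(weights) != len(evaluators):
        raise ValueError("weights and evaluators must have the same length")
    return sum(
        weight * evaluator.evaluate(state, next_state, action, info)
        for evaluator, weight in zip(evaluators, weights)
    )


evaluators = [ConstantEvaluator(2.0), ConstantEvaluator(-1.0)]
unweighted = combine_rewards(evaluators)
weighted = combine_rewards(evaluators, weights=[0.5, 2.0])
```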

### Module: `casino_of_life.training`

#### Class: `TrainingManager`

Manages the training process for agents.

**Constructor Parameters**

| Parameter   | Type           | Default  | Description                        |
| ----------- | -------------- | -------- | ---------------------------------- |
| `agent`     | `DynamicAgent` | Required | Agent to train                     |
| `web_hooks` | `dict`         | None     | Optional web dashboard integration |

**Methods**

| Method            | Parameters                                  | Return Type | Description                             |
| ----------------- | ------------------------------------------- | ----------- | --------------------------------------- |
| `train`           | `timesteps, eval_frequency, save_frequency` | `dict`      | Run training with evaluation and saving |
| `evaluate`        | `episodes`                                  | `dict`      | Evaluate current agent                  |
| `save_checkpoint` | `filename=None`                             | `str`       | Save training checkpoint                |
| `load_checkpoint` | `filename`                                  | None        | Load training checkpoint                |
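
The `eval_frequency` and `save_frequency` arguments imply that evaluation and checkpointing happen at regular step intervals during `train`. The sketch below computes those step counts for a given run. `training_schedule` is a hypothetical helper, and the assumption that both events fire at exact multiples of their frequency may not match the library.

```python
def training_schedule(timesteps, eval_frequency, save_frequency):
    """Return the step counts at which evaluation and saving would occur."""
    # Assumption: events fire at every exact multiple of their frequency.
    eval_steps = list(range(eval_frequency, timesteps + 1, eval_frequency))
    save_steps = list(range(save_frequency, timesteps + 1, save_frequency))
    return {"evaluate_at": eval_steps, "save_at": save_steps}


schedule = training_schedule(
    timesteps=10_000, eval_frequency=2_500, save_frequency=5_000
)
```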

#### Class: `CurriculumTrainer`

Implements curriculum learning for progressive difficulty.

**Constructor Parameters**

| Parameter              | Type           | Default  | Description               |
| ---------------------- | -------------- | -------- | ------------------------- |
| `agent`                | `DynamicAgent` | Required | Agent to train            |
| `curriculum`           | `list`         | Required | List of curriculum stages |
| `evaluation_frequency` | `int`          | 5000     | Steps between evaluations |

**Methods**

| Method        | Parameters | Return Type | Description                          |
| ------------- | ---------- | ----------- | ------------------------------------ |
| `train`       | None       | `dict`      | Run full curriculum training         |
| `train_stage` | `stage`    | `bool`      | Train on a specific curriculum stage |
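
Curriculum training generally means working through stages in order and advancing only when the current stage is passed. The sketch below shows that control flow. The stage format (`name`, `threshold`), the `max_attempts` limit, and the reading of the `bool` from `train_stage` as "stage passed" are all assumptions made for illustration.

```python
def train_stage(stage, evaluate):
    """Return True when the stage's success threshold is met."""
    score = evaluate(stage)
    return score >= stage["threshold"]


def run_curriculum(curriculum, evaluate, max_attempts=3):
    """Advance through stages in order; stop at the first stage never passed."""
    completed = []
    for stage in curriculum:
        for _ in range(max_attempts):
            if train_stage(stage, evaluate):
                completed.append(stage["name"])
                break
        else:
            # The stage was never passed, so later (harder) stages are skipped.
            break
    return completed


# Hypothetical stage format; the real `curriculum` entries may differ.
curriculum = [
    {"name": "easy", "threshold": 0.5},
    {"name": "medium", "threshold": 0.7},
    {"name": "hard", "threshold": 0.9},
]
completed = run_curriculum(curriculum, evaluate=lambda stage: 0.8)
```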

#### Class: `SelfPlayTrainer`

Implements self-play training with progressive sampling.

**Constructor Parameters**

| Parameter              | Type           | Default  | Description                     |
| ---------------------- | -------------- | -------- | ------------------------------- |
| `env`                  | `RetroEnv`     | Required | Game environment                |
| `initial_agent`        | `DynamicAgent` | Required | Starting agent                  |
| `checkpoint_frequency` | `int`          | 10000    | Steps between model checkpoints |
| `opponent_sampling`    | `dict`         | Required | Opponent sampling probabilities |
| `total_timesteps`      | `int`          | Required | Total training steps            |

**Methods**

| Method            | Parameters | Return Type    | Description                  |
| ----------------- | ---------- | -------------- | ---------------------------- |
| `train`           | None       | `dict`         | Run self-play training       |
| `sample_opponent` | None       | `DynamicAgent` | Sample opponent from history |
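
Opponent sampling in self-play usually means drawing an opponent category by probability and then picking an agent from that category. The sketch below shows this with the standard library. The `"latest"` and `"historical"` keys, the `pool` structure, and the two-value return are hypothetical; the actual keys expected by `opponent_sampling` are not documented here.

```python
import random


def sample_opponent(pool, opponent_sampling, rng=random):
    """Pick an opponent category by probability, then an agent from it."""
    categories = list(opponent_sampling)
    weights = [opponent_sampling[name] for name in categories]
    category = rng.choices(categories, weights=weights, k=1)[0]
    return category, rng.choice(pool[category])


# Hypothetical keys; strings stand in for saved agent checkpoints.
opponent_sampling = {"latest": 0.7, "historical": 0.3}
pool = {"latest": ["agent_v9"], "historical": ["agent_v1", "agent_v5"]}

rng = random.Random(0)  # seeded so repeated runs draw the same sequence
draws = [sample_opponent(pool, opponent_sampling, rng)[0] for _ in range(1000)]
latest_share = draws.count("latest") / len(draws)
```

Over many draws, `latest_share` should sit close to the configured probability of 0.7.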

### Module: `casino_of_life.web`

#### Class: `TrainingServer`

Web server for monitoring and controlling training.

**Constructor Parameters**

| Parameter          | Type              | Default     | Description             |
| ------------------ | ----------------- | ----------- | ----------------------- |
| `host`             | `str`             | "localhost" | Server host             |
| `port`             | `int`             | 8080        | Server port             |
| `dashboard_config` | `DashboardConfig` | None        | Dashboard configuration |
| `security_config`  | `SecurityConfig`  | None        | Security settings       |

**Methods**

| Method           | Parameters        | Return Type | Description                   |
| ---------------- | ----------------- | ----------- | ----------------------------- |
| `start`          | None              | None        | Start web server              |
| `stop`           | None              | None        | Stop web server               |
| `register_agent` | `agent, agent_id` | None        | Register agent for monitoring |
| `start_training` | `config`          | `str`       | Start new training session    |
| `stop_training`  | `training_id`     | None        | Stop training session         |
| `get_status`     | `training_id`     | `dict`      | Get training status           |

#### Class: `DashboardConfig`

Configuration for the web dashboard.

**Constructor Parameters**

| Parameter        | Type   | Default    | Description                   |
| ---------------- | ------ | ---------- | ----------------------------- |
| `theme`          | `str`  | "light"    | Dashboard theme               |
| `default_view`   | `str`  | "overview" | Default dashboard view        |
| `refresh_rate`   | `int`  | 5          | Update frequency in seconds   |
| `custom_metrics` | `list` | None       | Additional metrics to display |

### Module: `casino_of_life.evaluation`

#### Class: `TournamentEvaluator`

Evaluates multiple agents in tournament format.

**Constructor Parameters**

| Parameter            | Type   | Default  | Description                         |
| -------------------- | ------ | -------- | ----------------------------------- |
| `agents`             | `list` | Required | List of agents to compete           |
| `matches_per_pair`   | `int`  | 10       | Number of matches between each pair |
| `evaluation_metrics` | `list` | None     | Metrics to track                    |

**Methods**

| Method            | Parameters       | Return Type | Description                     |
| ----------------- | ---------------- | ----------- | ------------------------------- |
| `run_tournament`  | None             | `dict`      | Run full tournament             |
| `run_match`       | `agent1, agent2` | `dict`      | Run single match between agents |
| `display_results` | None             | None        | Display tournament results      |
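
A tournament with `matches_per_pair` is commonly run as a round robin, where every pair of agents meets the same number of times. The sketch below counts wins under that scheme. The `run_match` callback returning the winner's name and the `skill` ratings are hypothetical; the library's `run_match` returns a `dict` whose fields are not specified here.

```python
from itertools import combinations


def run_tournament(agents, run_match, matches_per_pair=10):
    """Play every pair `matches_per_pair` times and count wins per agent."""
    wins = {name: 0 for name in agents}
    for agent1, agent2 in combinations(agents, 2):
        for _ in range(matches_per_pair):
            winner = run_match(agent1, agent2)
            wins[winner] += 1
    return wins


# Hypothetical agents: names mapped to a fixed skill rating.
skill = {"alpha": 3, "beta": 2, "gamma": 1}
results = run_tournament(
    list(skill),
    run_match=lambda a, b: a if skill[a] >= skill[b] else b,
    matches_per_pair=10,
)
```

With `n` agents, this plays `n * (n - 1) / 2` pairs, so three agents at 10 matches per pair produce 30 matches in total.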

### Module: `casino_of_life.data`

#### Class: `DemonstrationRecorder`

Records gameplay demonstrations for imitation learning.
