
[WIP] Add SAC #7

Open
AspadaX wants to merge 12 commits into main from sac


AspadaX (Owner) commented Sep 15, 2025

No description provided.

AspadaX and others added 12 commits September 15, 2025 13:38
- Replace manual normal sampling and logprob calculation with utility function
- Fix tensor operations using proper broadcasting and element-wise methods
- Add ToElement import for tensor casting operations
- Improve numerical stability in probability calculations
- Generalize forward pass to work with any tensor dimension
- Remove AutodiffBackend constraint from QNet to support inference
- Extract temporal difference calculation into separate function
- Add data import for batch processing
- Clone networks for target initialization instead of loading records
- Begin implementing QNet training loop with TD targets
- Modify `train_net` to accept `Tensor<B, 2>` and `DataBatch<B>`
- Update both Q-networks with temporal difference target during training
- Add policy network entropy calculation to training loop
- Replace transition struct fields with batch-compatible states and actions
- Add Q-network forward passes for both critics
- Compute minimum Q-value between two critics
- Calculate policy loss using min Q-value and entropy
- Remove placeholder comment for optimizer (TODO remains)
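
For readers following the commit notes above, here is a minimal, framework-free sketch of the three calculations they mention: the Gaussian log-probability, the temporal difference target, and the policy loss built from the minimum of two critics plus an entropy term. It uses plain `f32` slices rather than the PR's `Tensor<B, 2>` types, and every function name, parameter, and the choice of a diagonal Gaussian policy is an assumption made for illustration, not code taken from this branch.

```rust
// Hypothetical helpers illustrating standard SAC formulas on plain slices.
// None of these names come from the PR; the real code operates on tensors.

/// Log-density of `x` under a diagonal Gaussian with the given mean and
/// log standard deviation, summed over action dimensions.
/// Working with `log_std` directly avoids taking the log of a tiny std.
fn gaussian_log_prob(x: &[f32], mean: &[f32], log_std: &[f32]) -> f32 {
    let log_2pi = (2.0 * std::f32::consts::PI).ln();
    x.iter()
        .zip(mean)
        .zip(log_std)
        .map(|((&x, &m), &ls)| {
            // Standardized residual: (x - mean) / std.
            let z = (x - m) * (-ls).exp();
            -0.5 * z * z - ls - 0.5 * log_2pi
        })
        .sum()
}

/// Soft TD target for one transition:
/// r + gamma * (1 - done) * (min(q1', q2') - alpha * log_pi(a'|s')).
fn td_target(
    reward: f32,
    done: bool,
    gamma: f32,
    alpha: f32,
    q1_next: f32,
    q2_next: f32,
    log_prob_next: f32,
) -> f32 {
    let not_done = if done { 0.0 } else { 1.0 };
    // Taking the minimum of the two critics counters overestimation.
    let soft_value = q1_next.min(q2_next) - alpha * log_prob_next;
    reward + gamma * not_done * soft_value
}

/// Policy loss averaged over a batch: mean(alpha * log_pi - min(q1, q2)).
/// Returns NaN for an empty batch, since the mean is undefined there.
fn policy_loss(alpha: f32, log_probs: &[f32], q1: &[f32], q2: &[f32]) -> f32 {
    let n = log_probs.len() as f32;
    log_probs
        .iter()
        .zip(q1)
        .zip(q2)
        .map(|((&lp, &a), &b)| alpha * lp - a.min(b))
        .sum::<f32>()
        / n
}
```

In a training step one would typically compute `td_target` per transition, regress both Q-networks toward it, and then minimize `policy_loss` with respect to the policy parameters only. The tensor version in this branch may well be organized differently.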
