Design #1

Open
opened 2026-03-25 16:01:07 +00:00 by Quaternions · 0 comments
Owner

Inputs:

  • Render 160x90 depth texture
    • 4 (?) frames of depth history
  • Airtime & height difference to next surf/platform (sourced from bot)
    • Plan next 2 surfs/platforms ahead
  • Velocity vector relative to camera orientation. (Initially struck as redundant because it is implied by the position history; on reflection it is rather important and not quite the same as the previous position.)
  • Position/Angles of 4 (?) previous game ticks as a sort of contextual planning memory ("ah right, that's the curve I was going for")
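
To make the input list concrete, here is a rough sketch of the non-image part of the observation. Everything named here is hypothetical: the constants mirror the tentative counts above (the "4 (?)" values are not settled), the pose is assumed to be 3 position components plus 2 view angles, and `camera_relative_velocity` assumes yaw in radians about the vertical z axis with yaw = 0 looking along +x. It also shows why the velocity input is not the same as the position history: it depends on the current view angle.

```rust
/// Depth frame size from the issue: 160x90.
pub const DEPTH_W: usize = 160;
pub const DEPTH_H: usize = 90;
/// Tentative history lengths ("4 (?)" in the issue).
pub const DEPTH_HISTORY: usize = 4;
pub const POSE_HISTORY: usize = 4;
/// Surfs/platforms planned ahead, each with airtime and height difference.
pub const PLAN_AHEAD: usize = 2;
pub const PLAN_FEATURES: usize = 2;
/// Assumption: a pose is 3 position components plus pitch and yaw.
pub const POSE_FEATURES: usize = 5;

/// Length of the flattened observation under the assumptions above.
pub fn observation_len() -> usize {
    DEPTH_W * DEPTH_H * DEPTH_HISTORY   // depth frames
        + PLAN_AHEAD * PLAN_FEATURES    // airtime + height difference per target
        + 3                             // camera-relative velocity
        + POSE_HISTORY * POSE_FEATURES  // previous positions/angles
}

/// Rotate a world-space velocity into a yaw-aligned camera frame.
/// Returns [forward, left, up]. Pitch is ignored in this sketch.
pub fn camera_relative_velocity(v: [f32; 3], yaw: f32) -> [f32; 3] {
    let (s, c) = yaw.sin_cos();
    [v[0] * c + v[1] * s, -v[0] * s + v[1] * c, v[2]]
}
```

With these assumed counts the depth history dominates the input size, so the scalar features are cheap to include even if some overlap with the pose history.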

Outputs:

  • WASD Space (GameControls), mouse delta
  • Always one strafe tick (0.01s)
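
One possible shape for the output, as a sketch only: the `Action` struct and `from_outputs` below are hypothetical and are not the project's `GameControls` type. It assumes the network emits 5 key logits (W, A, S, D, Space) followed by 2 mouse-delta values, and that a logit above 0 means the key is held for the tick.

```rust
/// Duration of one strafe tick, from the issue.
pub const TICK_SECONDS: f32 = 0.01;

/// Hypothetical per-tick action: key states plus a mouse delta.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Action {
    pub w: bool,
    pub a: bool,
    pub s: bool,
    pub d: bool,
    pub space: bool,
    /// Mouse movement applied during this tick, [dx, dy].
    pub mouse_delta: [f32; 2],
}

impl Action {
    /// Decode raw network outputs: indices 0..5 are key logits
    /// (pressed when > 0), indices 5..7 are the mouse delta.
    pub fn from_outputs(out: [f32; 7]) -> Self {
        Action {
            w: out[0] > 0.0,
            a: out[1] > 0.0,
            s: out[2] > 0.0,
            d: out[3] > 0.0,
            space: out[4] > 0.0,
            mouse_delta: [out[5], out[6]],
        }
    }
}
```

Each decoded `Action` would be applied for exactly `TICK_SECONDS` before the next observation is taken.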

Architecture:

  • Rust, using the Burn deep learning framework
  • reinforcement learning

Training procedure:

  • Use the WR bot for each map as a reference path.
  • Reward the agent with its progress delta along the WR curve: find the closest point on the WR curve, compare it with the agent's previous best progress, and reward the difference. Backwards progress earns no negative reward. On reaching the finish, the agent gets a lump reward of `1/final_time`.
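
The reward rule above could be sketched as follows. This is an illustration under assumptions, not the planned implementation: `ProgressReward` and `finish_bonus` are hypothetical names, the WR path is assumed to be a polyline of positions, and progress is measured as arc length along that polyline (indexing by WR time would be another option).

```rust
fn sub(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn dot(a: [f32; 3], b: [f32; 3]) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Tracks the best progress reached so far along a reference path.
pub struct ProgressReward {
    path: Vec<[f32; 3]>,
    best: f32,
}

impl ProgressReward {
    pub fn new(path: Vec<[f32; 3]>) -> Self {
        Self { path, best: 0.0 }
    }

    /// Arc length along the path of the point closest to `p`.
    fn progress(&self, p: [f32; 3]) -> f32 {
        let mut best_d2 = f32::INFINITY;
        let mut best_s = 0.0;
        let mut s0 = 0.0; // arc length at the start of the current segment
        for w in self.path.windows(2) {
            let (a, b) = (w[0], w[1]);
            let ab = sub(b, a);
            let len2 = dot(ab, ab);
            // Parameter of the projection of `p` onto segment a..b, clamped.
            let t = if len2 > 0.0 {
                (dot(sub(p, a), ab) / len2).clamp(0.0, 1.0)
            } else {
                0.0
            };
            let q = [a[0] + ab[0] * t, a[1] + ab[1] * t, a[2] + ab[2] * t];
            let d = sub(p, q);
            let d2 = dot(d, d);
            let len = len2.sqrt();
            if d2 < best_d2 {
                best_d2 = d2;
                best_s = s0 + t * len;
            }
            s0 += len;
        }
        best_s
    }

    /// Reward for the current position: improvement over the previous best,
    /// never negative.
    pub fn step(&mut self, p: [f32; 3]) -> f32 {
        let s = self.progress(p);
        let reward = (s - self.best).max(0.0);
        self.best = self.best.max(s);
        reward
    }
}

/// Lump reward on finishing: 1/final_time.
pub fn finish_bonus(final_time: f32) -> f32 {
    1.0 / final_time
}
```

Because only improvements over the stored best are paid out, an agent that drifts backwards and then recovers is not rewarded twice for the same stretch of the curve.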
Reference: StrafesNET/strafe-ai#1