Wen Smith

Reputation: 1

Charging scheduling of electric vehicles with the TD3 algorithm (RL)

I define three reward terms, r_anx, r_dep and r_price, where r_dep <= 0, and combine them into the total reward using different weight coefficients. However, depending on the weights, one of two situations occurs: (1) All three reward terms trend upward, but r_dep does not get as close to 0 as I want. (2) r_dep and r_anx increase, but r_price decreases.
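To make the setup concrete, here is a minimal sketch of how the three terms are combined, assuming the total reward is a plain weighted sum. The names `RewardWeights`, `w_anx`, `w_dep`, `w_price` and `total_reward` are hypothetical and only for illustration; the actual weight values and the definitions of the individual terms are not shown here.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical weight coefficients; placeholder values, not the ones I used.
    w_anx: float = 1.0
    w_dep: float = 1.0
    w_price: float = 1.0


def total_reward(r_anx: float, r_dep: float, r_price: float,
                 weights: RewardWeights) -> float:
    """Combine the three reward terms into one scalar (assumed weighted sum)."""
    # r_dep is a penalty term, so it should never be positive.
    if r_dep > 0:
        raise ValueError("r_dep is expected to be <= 0")
    return (weights.w_anx * r_anx
            + weights.w_dep * r_dep
            + weights.w_price * r_price)


# Example: one step with a heavier weight on the r_dep penalty.
step_reward = total_reward(1.0, -2.0, 0.5, RewardWeights(1.0, 2.0, 4.0))
```

The curves in the plots are the three terms logged separately per episode, not the weighted total.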

[enter image description here](https://i.stack.imgur.com/ZZK4J.png)

[enter image description here](https://i.sstatic.net/amExA.png)

How can I solve these problems so that all three reward terms trend upward and r_dep gets as close to 0 as possible?

