Data Science Asked by tmaric on January 26, 2021
I asked this question on Artificial Intelligence, but got no answer, so I am moving it here.
I have two signals that I want to use to model a reward for a reinforcement learning algorithm.
The first one is the CPU TIME: running mean from this diagram:
The second one is the running mean of the MAX RESIDUAL from this diagram:
Both signals are equally important, but they have very different scales. I could combine the signals linearly like this:
$r = w_\rho \rho + w_\tau \tau$
where $r$ is the reward function, $\tau$ is the CPU TIME: running mean, and $\rho$ is the MAX RESIDUAL. The problem is, how to set the weights $w_\tau, w_\rho$ to make the contributions equally important if $\rho$ and $\tau$ are on very different scales?
Reinforcement learning algorithms will learn policies based on the reward, and if one signal has values that are much smaller than the other, it will influence the reward much less, which is not the behavior I would like to model.
Edit: Dataset on Kaggle
Edit: comment from Pedro
It seems that a linear combination of signals is possible for the scaled mean CPU Time (mean to get rid of oscillations) and the scaled MAX residual:
Using z-normalisation ensures that both signals have the same mean and standard deviation (zero mean and unit standard deviation), but of course the individual values will still differ, since the mean and standard deviation depend on the distribution of the data.
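As a minimal sketch of z-normalisation, assuming both signals are available as recorded arrays: the sample values, the helper name `z_normalise`, and the equal weights of 1.0 are made up for illustration and are not taken from the Kaggle dataset. The sign of the weights is left open here; if lower CPU time and lower residual should be rewarded, negative weights would be one way to express that.

```python
import numpy as np

def z_normalise(x):
    """Shift to zero mean and scale to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0.0:
        # A constant signal carries no information; map it to zeros.
        return np.zeros_like(x)
    return (x - x.mean()) / std

# Hypothetical sample values on very different scales.
tau = np.array([120.0, 95.0, 80.0, 70.0, 65.0])   # CPU time running mean
rho = np.array([1e-2, 3e-3, 1e-3, 4e-4, 1e-4])    # max residual

tau_z = z_normalise(tau)
rho_z = z_normalise(rho)

# Equal weights (an assumption): after normalisation both terms
# have the same spread, so neither dominates the sum.
w_tau, w_rho = 1.0, 1.0
reward = w_rho * rho_z + w_tau * tau_z
```

Written in terms of the raw signals, this amounts to choosing each weight inversely proportional to the standard deviation of its signal, plus a constant offset.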
An alternative is to use feature scaling (min-max scaling), where you force the values of both signals into the range between 0 and 1.
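Scaling to the range 0 to 1 can be sketched in the same way. Again, the sample values, the helper name `min_max_scale`, and the weights of 0.5 are illustrative assumptions; the minimum and maximum are taken from the recorded data, so values outside that range would fall outside [0, 1].

```python
import numpy as np

def min_max_scale(x):
    """Linearly map the values of x onto the interval [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        # A constant signal has no range to scale; map it to zeros.
        return np.zeros_like(x)
    return (x - x.min()) / span

# Hypothetical sample values on very different scales.
tau = np.array([120.0, 95.0, 80.0, 70.0, 65.0])   # CPU time running mean
rho = np.array([1e-2, 3e-3, 1e-3, 4e-4, 1e-4])    # max residual

tau_s = min_max_scale(tau)
rho_s = min_max_scale(rho)

# Equal weights summing to 1 (an assumption) keep the reward in [0, 1].
w_tau, w_rho = 0.5, 0.5
reward = w_rho * rho_s + w_tau * tau_s
```

Unlike z-normalisation, this scaling is determined entirely by the two extreme values, so a single outlier can compress the rest of the signal into a narrow band.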
Answered by Noah Weber on January 26, 2021