TransWikia.com

How to combine two differently scaled, but equally important "running" signals into a reward function?

Data Science Asked by tmaric on January 26, 2021

I asked this question on Artificial Intelligence, but got no answer, so I am moving it here.

I have two signals that I want to use to model a reward for a reinforcement learning algorithm.

The first one is the CPU TIME: running mean from this diagram:

[Figure: CPU TIME and its running mean]

The second one is the running mean of the MAX RESIDUAL from this diagram:

[Figure: MAX RESIDUAL and its running mean]

Both signals are equally important, but they have very different scales. I could combine the signals linearly like this:

$r = w_\rho \rho + w_\tau \tau$

where $r$ is the reward function, $\tau$ is the running mean of the CPU TIME, and $\rho$ is the running mean of the MAX RESIDUAL. The problem is how to set the weights $w_\tau, w_\rho$ so that both contributions are equally important when $\rho$ and $\tau$ are on very different scales.

Reinforcement learning algorithms will learn policies based on the reward, and if one signal has values that are much smaller than the other, it will influence the reward much less, which is not the behavior I would like to model.

Edit: Dataset on Kaggle

Edit: comment from Pedro

It seems that a linear combination of signals is possible for the scaled mean CPU Time (mean to get rid of oscillations) and the scaled MAX residual:

[Figure: scaled mean CPU time and scaled MAX RESIDUAL]

One Answer

Z-normalisation gives both signals the same mean (0) and the same standard deviation (1). The individual values will of course still differ, since the mean and standard deviation depend on the distribution of the data.
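As a rough illustration of the z-normalisation idea, here is a minimal sketch. The sample values for `tau` and `rho`, the helper name `z_normalise`, and the equal weights of 0.5 are all assumptions made for the example, not values from the question's dataset.

```python
from statistics import fmean, pstdev


def z_normalise(values):
    """Rescale a signal to zero mean and unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0.0:
        # A constant signal carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical samples on very different scales (not from the Kaggle dataset).
tau = [120.0, 95.0, 80.0, 70.0, 65.0]   # CPU TIME running mean
rho = [1e-3, 5e-4, 2e-4, 1e-4, 8e-5]    # MAX RESIDUAL running mean

tau_z = z_normalise(tau)
rho_z = z_normalise(rho)

# Equal weights now mean equal influence, since both signals have unit spread.
w_tau = w_rho = 0.5
r = [w_rho * p + w_tau * t for p, t in zip(rho_z, tau_z)]
```

If lower CPU time and lower residual are what the agent should prefer, the combined value would presumably need to be negated before it is used as a reward; that sign convention is an assumption and is not stated in the question.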

An alternative is min-max feature scaling, which forces the values of both signals into the range between 0 and 1.
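Min-max scaling could be sketched as follows. The sample values, the helper name `min_max_scale`, and the equal weights are again assumptions for illustration only. The sketch also assumes the minimum and maximum of each signal are known up front, which may not hold while the agent is still collecting data.

```python
def min_max_scale(values):
    """Map a signal linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant signal has no range to scale; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical samples on very different scales (not from the Kaggle dataset).
tau = [120.0, 95.0, 80.0, 70.0, 65.0]   # CPU TIME running mean
rho = [1e-3, 5e-4, 2e-4, 1e-4, 8e-5]    # MAX RESIDUAL running mean

tau_s = min_max_scale(tau)
rho_s = min_max_scale(rho)

# With weights that sum to 1, the combined value also stays within [0, 1].
w_tau = w_rho = 0.5
r = [w_rho * p + w_tau * t for p, t in zip(rho_s, tau_s)]
```

Compared with z-normalisation, this keeps the reward bounded, but a single outlier in either signal compresses all of its other values towards one end of the interval.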

Answered by Noah Weber on January 26, 2021
