Artificial Intelligence Questions, Problems & Solutions : TransWikia.com ~ Page 3

Classification or regression for deep Q learning

DQN implemented at https://github.com/PacktPublishing/PyTorch-1.x-Reinforcement-Learning-Cookbook/blob/master/Chapter07/chapter7/dqn.py uses the mean square error loss function for the neural network to learn the state -> action mapping: self.criterion = torch.nn.MSELoss(). Could cross-entropy be used...

Asked on 12/16/2021

0 answers

Is the Bellman equation that uses sampling weighted by the Q values (instead of max) a contraction?

It is proved that the Bellman update is a contraction (1). Here is the Bellman update that is used for Q-Learning: $$Q_{t+1}(s, a) = Q_{t}(s, a) + \alpha*(r(s, a,...

Asked on 12/16/2021 by sirfroggy

0 answers

Why does reinforcement learning using a non-linear function approximator diverge when using strongly correlated data as input?

While reading the DQN paper, I found that learning from randomly selected samples reduced divergence in RL using a non-linear function approximator (e.g., a neural network). So,...

Asked on 12/13/2021

1 answer

How do Graph Convolutional Neural Networks forward propagate?

In the basic variant of GCN we have the following: Here we aggregate the information from the...

Asked on 12/13/2021

1 answer

In which cases is the categorical cross-entropy better than the mean squared error?

In my code, I usually use the mean squared error (MSE), but the TensorFlow tutorials always use the categorical cross-entropy (CCE). Is the CCE loss function better than MSE? Or...

Asked on 12/11/2021

3 answers

What are the keys and values of the attention model for the encoder and decoder in the "Attention Is All You Need" paper?

I have recently encountered the paper on NLP. It is very new to me and I am still unable to see how that works. I have used all the resources...

Asked on 12/11/2021

1 answer

Is my 57% sports betting accuracy correct?

I have been creating sports betting algorithms for many years using Microsoft Access, and I am now transitioning to the ML world and trying to get a grasp on determining the...

Asked on 12/11/2021 by Sports_Stats

1 answer

Understanding the "unrolling" step in the proof of the policy gradient theorem

In the proof of the policy gradient theorem in the RL book of Sutton and Barto (that I shamelessly paste here): ...

Asked on 12/09/2021

2 answers

Forcing a neural network to be close to a previous model - Regularization through given model

I'm wondering, has anyone seen any paper where one trains a network but biases it to produce similar outputs to a given model (such as one given from expert opinion...

Asked on 12/09/2021 by BLBA

0 answers

Why is DDPG not learning or converging?

I have used a different setting, but DDPG is not learning and it does not converge. I have used these codes 1, 2, and ...

Asked on 12/09/2021 by I_Al-thamary

0 answers
