Tensorflow model works for classification but not for regression (all predictions equal the output layer bias)

Data Science Asked by SuperCodeBrah on October 15, 2020

I’m trying to build a model for FX prediction. It gives promising results when classifying each period as buy/sell/neutral: actual returns are converted to 0, 1, or 2 to represent the possible actions (sell=0, neutral=1, buy=2), based on whether the future price change exceeds a threshold to the high side or the low side.

However, when trying to predict the future change as a regression problem, it makes nonsensical predictions. Every time I train the model, it quickly converges to predicting a single fixed value for every period, and that value differs from one training run to the next.

Here’s a boiled down version of the model build/compilation:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build(self, is_evolution=False):
    self.model = models.Sequential()
    # self.num_layers is the number of units per hidden layer
    self.model.add(layers.Dense(self.num_layers, activation="relu", input_shape=(self.x_train.shape[1],)))
    self.model.add(layers.Dense(self.num_layers, activation="relu"))
    if self.is_classification:
        # Initialize the output bias to the observed class frequencies
        class_counts = np.array([self.y_train[self.y_train == c].shape[0] for c in self.classes])
        total_count = self.y_train.shape[0]
        bias_initializer = tf.constant_initializer(tuple(class_counts / total_count))
        self.model.add(layers.Dense(len(self.classes), activation="sigmoid", bias_initializer=bias_initializer))
    else:
        # Regression head: a single linear output unit
        self.model.add(layers.Dense(1))

    loss = "sparse_categorical_crossentropy" if self.is_classification else tf.keras.losses.MeanSquaredError()
    # Note: the "accuracy" metric is only meaningful in the classification case
    self.model.compile(optimizer=tf.keras.optimizers.Adam(), loss=loss, metrics=["accuracy"])

Here’s what happens when is_classification is set to False (the screenshot of the predictions is not reproduced here): every prediction is identical, which is nonsensical on its own, and the predicted values are also very large even though the actual mean is very close to zero. The fixed predicted value can vary quite a bit between training runs.

Based on the suggestions here: Tensorflow regression model giving same prediction every time, I’ve tried changing the learning rate, to no avail. I’m also making sure to scale the train and test sets, although I’m not sure that should matter for regression.
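For reference, the scaling step can be sketched in plain NumPy (a hedged sketch, since the original scaling code isn’t shown): standardize using statistics computed on the training set only, then apply those same statistics to the test set.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # placeholder data
x_test = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

# Compute mean/std on the training set only, then reuse them on the test set
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)
x_train_scaled = (x_train - mean) / std
x_test_scaled = (x_test - mean) / std
```

Fitting the statistics on the training set alone avoids leaking test-set information into training.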

At first, I thought this was just because the dataset is very noisy, but on reflection, a model that can classify buys and sells should also be able to make regression predictions when the output layer is modified as it is above.

Any thoughts are appreciated.

Update: I was inspecting individual weights and I see that the regression predictions are exactly equal to the output layer’s bias; the other weights, however, appear “normal” (i.e., a mix of positive and negative values).
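That observation is consistent with the output layer’s kernel having collapsed to (near) zero: a dense layer computes y = xW + b, so with W ≈ 0 every input maps to the bias. A small NumPy illustration (with hypothetical weights, not the actual model’s):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(5, 8))  # 5 samples, 8 features

W = np.zeros((8, 1))         # collapsed output kernel
b = np.array([3.7])          # output-layer bias

y = x @ W + b                # dense-layer forward pass
# Every prediction equals the bias, regardless of the input
```

In Keras, the output layer’s kernel and bias can be read with model.layers[-1].get_weights(), which is how this can be checked directly.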

One Answer

Surprisingly, changing the optimizer from Adam to tf.keras.optimizers.SGD(lr=0.01, momentum=0.9) eliminated the identical predictions, though it looks like more tinkering is required to get results as good as the classification model’s.
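For context, the swapped-in optimizer applies the classic momentum update rule (this is a sketch of the rule, not TensorFlow’s internal code; in newer TensorFlow versions the argument is spelled learning_rate rather than lr):

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: v <- momentum*v - lr*grad; w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

w = np.array([1.0, -2.0])
v = np.zeros_like(w)
grad = np.array([0.5, -0.5])
w, v = sgd_momentum_step(w, grad, v)
# First step: v = -lr * grad, so w moves by [-0.005, 0.005]
```

The velocity term accumulates past gradients, which changes the effective step sizes relative to Adam’s per-parameter adaptive scaling.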

The remaining question is why this change makes such a difference. I'd be curious to know the answer if anyone has one.

Answered by SuperCodeBrah on October 15, 2020
