Classification model not working for a large dataset

Question

I'm working with an online retail order dataset consisting of 3 columns: Client ID, month of purchase and Product ID (one-hot encoded). Something like this:

My objective is to use this dataset to train a Keras classification model. While this approach shows good results on a very small sample of the dataset (e.g. 5-10 rows), the model does not converge on larger samples (e.g. 100-200 rows) or on the full dataset (around 540k rows). This is my Python code:
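# Imports assumed; the original snippet omitted them.
from keras.models import Sequential
from keras.layers import Dense

# saidas_units (presumably the number of product classes), myInputs and myProducts are defined elsewhere.
model = Sequential()
# Two input features: client ID and month of purchase.
model.add(Dense(units=64, activation='relu', input_dim=2))
# One softmax output per product.
model.add(Dense(units=saidas_units, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
# Note: validation_data is the training data itself, so the validation metrics are not a held-out estimate.
model.fit(myInputs, myProducts,
          batch_size=10,
          epochs=2000,
          verbose=1,
          validation_data=(myInputs, myProducts))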

Weirdly enough, in some cases even though the "loss" value is low and the "accuracy" value is high, the product predictions produced by the model are always the same, no matter the inputs (clientId/month).

Demetri Pananos · Accepted Answer

Weirdly enough, in some cases even though the "loss" value is low and the "accuracy" value is high, the product predictions produced by the model are always the same, no matter the inputs (clientId/month).

I'm betting the model is learning the marginal probability of the positive class. That would likely mean your features carry little information. Are you using only month and client ID to predict whether a customer bought a product? Also, can you give me the number your model is predicting and the sample mean of the outcome in the training data?
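
One way to run that comparison is sketched below. This is only an illustration, not the asker's code: the helper names (marginal_baseline, compare_to_marginal), the tolerance atol and the toy data are all hypothetical, and it assumes the targets are one-hot rows and the predictions are softmax probabilities of the same shape.

```python
import numpy as np

def marginal_baseline(y_onehot):
    """Return the class frequencies, i.e. the sample mean of each one-hot column."""
    y = np.asarray(y_onehot, dtype=float)
    return y.mean(axis=0)

def compare_to_marginal(pred_probs, y_onehot, atol=0.05):
    """Summarize how close the predictions are to the marginal class frequencies.

    atol is an arbitrary (hypothetical) tolerance for "close enough".
    """
    p = np.asarray(pred_probs, dtype=float)
    marginal = marginal_baseline(y_onehot)
    return {
        "marginal": marginal,
        # Average predicted probability per class.
        "mean_prediction": p.mean(axis=0),
        # Spread of the predictions across rows; near zero means the
        # output barely depends on the inputs.
        "max_std_across_rows": float(p.std(axis=0).max()),
        # Accuracy obtained by always predicting the most frequent class.
        "baseline_accuracy": float(marginal.max()),
        # True when every row's prediction is within atol of the marginal.
        "looks_like_marginal": bool(np.allclose(p, marginal, atol=atol)),
    }

# Hypothetical toy data: 3 products, 10 orders, product 0 bought 7 times.
y_toy = np.eye(3)[[0, 0, 0, 0, 0, 0, 0, 1, 1, 2]]
# A "model" that outputs the same probabilities for every row.
p_toy = np.tile([0.7, 0.2, 0.1], (10, 1))
report = compare_to_marginal(p_toy, y_toy)
```

With the model from the question, the inputs would presumably be pred_probs = model.predict(myInputs) and y_onehot = myProducts. If looks_like_marginal is true and the reported accuracy is close to baseline_accuracy, the network is most likely ignoring clientId and month and simply reproducing the class frequencies.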
