
Deep neural networks for mostly zero-valued data

Data Science Asked by user1670773 on March 1, 2021

I have some data with around 1000 features. The problem is that most of the features are 0: in each row, usually only around 100 features have values and the rest are 0. A sample of the data is given below.

ID  f1  f2  f3  ...  f1000  label
 1  45   0   0          32      1
 2  23   0   0           0      2
 3   0  40   0           0      1
 4   0   0  23          24      3
 5  78   0  64           0      2

As you can see, there are a lot of 0s in each row. Which deep neural network method should I use? I was thinking about an RNN or a CNN, but I am not sure these are the best choices. I was also thinking about methods that ignore zero values, i.e., that do not include the zeros in the calculation. I think an RNN can do this, but I am not exactly sure.
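For context, data shaped like this can be held as a sparse matrix so the zeros are neither stored nor touched in computations. Here is a minimal sketch with made-up numbers, assuming numpy and scipy are available:

# Made-up stand-in for the data above: 5 rows, 1000 features,
# roughly 100 nonzero values per row.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
dense = np.zeros((5, 1000))
for row in dense:
    cols = rng.choice(1000, size=100, replace=False)
    row[cols] = rng.integers(1, 100, size=100)

# CSR format keeps only the nonzero entries; most scikit-learn
# estimators accept such matrices directly.
X = sparse.csr_matrix(dense)
print(X.nnz, "stored values instead of", dense.size)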

3 Answers

Why use deep learning at all? It seems you already have a big set of features, so maybe try Random Forests? They are pretty good at handling outliers (in your case, the useless 0 values).

CNNs are good for dealing with two-dimensional input data, like pictures, while RNNs should be used when dealing with time series. In my opinion, deep learning should thus be reserved for data with that kind of structure.
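A minimal sketch of the Random Forest suggestion, assuming scikit-learn; X and y below are random stand-ins with roughly 90% zeros, not the asker's actual data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Stand-in data: 500 rows, 1000 features, ~90% zeros, labels 1..3
X = rng.random((500, 1000)) * (rng.random((500, 1000)) < 0.1)
y = rng.integers(1, 4, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tree ensembles simply split on feature thresholds, so the many
# zeros need no special treatment.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))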

Answered by Andreas Look on March 1, 2021

I don't understand why you have decided to use a CNN or an RNN. If you need deep learning, this problem can be solved with a plain ANN; CNNs are good at image recognition tasks and RNNs are good at time series analysis. I'd suggest trying a couple of classical machine learning algorithms first, as they might be enough for this. You can also use gradient boosting methods. Hope this helps.
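A minimal sketch of both suggestions, assuming scikit-learn; X and y are again random stand-ins with the shape described in the question:

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-in data: 500 rows, 1000 features, ~90% zeros, labels 1..3
X = rng.random((500, 1000)) * (rng.random((500, 1000)) < 0.1)
y = rng.integers(1, 4, size=500)

# A plain feed-forward ANN with one hidden layer, and gradient boosting
ann = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=50, random_state=0)

for name, model in [("plain ANN", ann), ("gradient boosting", gbm)]:
    scores = cross_val_score(model, X, y, cv=3)
    print(name, "accuracy:", round(scores.mean(), 3))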

Answered by Dhaval Thakkar on March 1, 2021

There's a common misunderstanding that deep learning is suitable whenever you have loads of data, which, judging from your comment, is your case. This belief is generally inaccurate.

Deep learning models (including CNNs and RNNs) are complex models with thousands of parameters, able to capture complex relationships. Such relationships are generally "hidden" in vast amounts of data, but this is not always the case. Your data may be generated from a simple distribution, in which case no complex model is needed to approximate it, even if your sample size is huge.

Here's a fabricated example. Say you have the function y = a*x1 + b*x2 + c*x3 + d*x4. This function entails a linear relationship between each independent variable and the dependent variable y. You could sample this function a million times, and you would still have data that can be approximated by linear regression.
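A quick numerical check of this point (the coefficients below are chosen arbitrarily):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
a, b, c, d = 2.0, -1.0, 0.5, 3.0       # arbitrary coefficients
X = rng.random((1_000_000, 4))          # a million samples
y = X @ np.array([a, b, c, d])          # y = a*x1 + b*x2 + c*x3 + d*x4

# Plain linear regression recovers the coefficients exactly,
# no matter how large the sample is.
reg = LinearRegression().fit(X, y)
print(reg.coef_)                        # approximately [ 2. -1.  0.5  3. ]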

Coming to your case: to identify what kind of algorithm you need, you first have to look at your data, perhaps by performing a statistical analysis. Then I would suggest starting simple. Try logistic regression, for starters. Is the model satisfactory in validation? If not, try decision trees, or a shallow neural net. Always validate (you have loads of data, so validation should be easy!). My (admittedly wild) guess is that your classification problem could be addressed with much simpler algorithms than DNNs. But YMMV, of course.
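A minimal sketch of that starting point, assuming scikit-learn; X and y are random stand-ins for the asker's data:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-in data with the shape described in the question
X = rng.random((500, 1000)) * (rng.random((500, 1000)) < 0.1)
y = rng.integers(1, 4, size=500)

# Start simple and always validate: 5-fold cross-validated
# logistic regression as the baseline.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)
print("mean validation accuracy:", round(scores.mean(), 3))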

FWIW, here's another answer similar to your case. And another one.

Answered by insys on March 1, 2021
