Data Science Asked by aslconwnb on April 29, 2021
I am trying to write an autoencoder to reduce the dimensionality of my genomic data. Currently, each sample is a $273278 \times 1$ vector. Each element of the vector indicates whether a position has no mutations (0), one mutation (1), or two mutations (2). As such, the input and output of my autoencoder look like this:
$$\begin{bmatrix}
0 \\
1 \\
0 \\
2 \\
\vdots
\end{bmatrix}$$
This uses label encoding to represent the categorical data. It works, but the autoencoder isn’t very accurate, presumably because label encoding treats 0, 1, and 2 as ordered numeric values when I want them handled as unrelated categories.
I am considering using one-hot encoding to create a $273278 \times 3$ matrix where each column corresponds to 0, 1, or 2. As such, the above vector would turn into this:
$$\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\vdots & \vdots & \vdots
\end{bmatrix}$$
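For concreteness, here is a minimal NumPy sketch of the conversion I have in mind. The four-element `genotypes` array is a made-up stand-in for the real 273278-element vector, and the variable names are only illustrative.

```python
import numpy as np

# Hypothetical toy genotype vector; the real one has 273278 entries.
genotypes = np.array([0, 1, 0, 2])

# Row i of the 3x3 identity matrix is the one-hot code for class i,
# so indexing the identity with the labels gives an (n_positions, 3) matrix.
one_hot = np.eye(3, dtype=np.float32)[genotypes]

print(one_hot.shape)
print(one_hot)
```

Each row of `one_hot` has exactly one nonzero entry, in the column matching the original label.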
However, I am unsure of how to input this matrix into a (keras) neural network. Is there a function to do this? Would flattening this matrix be mathematically appropriate? Is there another method to do this?
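To make the flattening question concrete, this is roughly what I mean, sketched in NumPy on toy data (the names are hypothetical). As far as I can tell the reshape itself loses no information, since the matrix and the labels can be recovered from the flat vector. I assume `keras.layers.Flatten` and `keras.layers.Reshape` would do the equivalent inside a model, but I have not verified that this is the right approach.

```python
import numpy as np

# Toy one-hot matrix for the labels [0, 1, 0, 2]; shape (4, 3).
one_hot = np.eye(3, dtype=np.float32)[np.array([0, 1, 0, 2])]

# Row-major flattening concatenates the rows: [1,0,0, 0,1,0, ...].
flat = one_hot.reshape(-1)  # shape (12,)

# Reshaping back to (n_positions, 3) recovers the original matrix exactly.
restored = flat.reshape(-1, 3)

# Taking the argmax of each row maps the one-hot rows back to 0/1/2 labels.
labels = restored.argmax(axis=1)
```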