Artificial Intelligence Asked by Alena Volkova on December 2, 2020
I want to use an RNN to classify whole sequences of events generated by website visitors. Each event has some categorical properties and a Unix timestamp:
sequence1 = [{'timestamp': 1597501183, 'some_field': 'A'}, {'timestamp': 1597681183, 'some_field': 'B'}]
sequence2 = [{'timestamp': 1596298782, 'some_field': 'B'}]
sequence3 = [{'timestamp': 1596644362, 'some_field': 'A'}, {'timestamp': 1596647951, 'some_field': 'C'}]
Unfortunately, they can’t be treated as classic time series, because they’re of variable length and irregular, so timestamps contain essential information and cannot be ignored. While categorical features can be one-hot encoded or made into embeddings, I’m not sure what to do with the timestamps. It doesn’t look like a good idea to use them raw. I’ve come up with two options so far:
I’m wondering if there are common ways to deal with this? I haven’t found much on this subject.
I would say that offsets from the previous N events are necessary in this case. You can also encode the maximum and average of those offsets as additional features, which compresses the irregular sampling times into a vector.
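A minimal sketch of that idea, under some assumptions: it uses only the offset from the immediately preceding event (N = 1), assigns the first event an offset of 0, and applies `log1p` scaling to the offsets. The scaling step and the helper name `time_features` are illustrative choices, not something the answer prescribes.

```python
import math

def time_features(sequence):
    """Return (log-scaled per-event gaps, max gap, mean gap); gaps are in seconds."""
    timestamps = sorted(event["timestamp"] for event in sequence)
    # Offset from the previous event; the first event has no predecessor,
    # so it is given an offset of 0 (an assumption of this sketch).
    deltas = [0] + [b - a for a, b in zip(timestamps, timestamps[1:])]
    # log1p compresses gaps that range from seconds to days (assumed scaling).
    log_deltas = [math.log1p(d) for d in deltas]
    # Sequence-level summaries are computed over real gaps only.
    gaps = deltas[1:]
    max_gap = max(gaps) if gaps else 0
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return log_deltas, max_gap, mean_gap

sequence1 = [{'timestamp': 1597501183, 'some_field': 'A'},
             {'timestamp': 1597681183, 'some_field': 'B'}]
sequence2 = [{'timestamp': 1596298782, 'some_field': 'B'}]

log_deltas1, max_gap1, mean_gap1 = time_features(sequence1)
log_deltas2, max_gap2, mean_gap2 = time_features(sequence2)
```

The per-event values in `log_deltas` can be concatenated with the one-hot or embedded categorical features at each RNN step, while the max and mean gaps describe the whole sequence and can be fed to the classifier alongside the RNN's final state.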
Answered by F4RZ4D on December 2, 2020