Data Science Asked by Brutus35 on February 14, 2021
I am fairly new to ML and I’m working on a problem, but not sure which algorithm to choose. The dataset contains a set of incremental time-series events, in consistent and set intervals, with a list of potential observations. Something akin to what the output of an audio recognition program might produce where it looks something like:
listener_id | interval_timestamp | candidate_song |
---|---|---|
12345 | 00:00:00 | Hey Jude (The Beatles) |
12345 | 00:00:05 | Hey Jude (The Beatles) |
12345 | 00:00:10 | Hey Jude (The Beatles) |
12345 | 00:00:15 | Hey Jude (The Beatles) |
12345 | 00:00:20 | Hey Jude (The Beatles) |
12345 | 00:00:20 | Yesterday (The Beatles) |
12345 | 00:00:25 | Hey Jude (The Beatles) |
12345 | 00:00:30 | Hey Jude (The Beatles) |
… | ||
12345 | 00:01:30 | Yellow Submarine (The Beatles) |
12345 | 00:01:30 | Hey Jude (The Beatles) |
12345 | 00:01:30 | Circle Sky (The Monkees) |
12345 | 00:01:30 | I’m a Believer (The Monkees) |
… | ||
12345 | 00:03:20 | Hey Jude (The Beatles) |
12345 | 00:03:25 | Imagine (John Lennon) |
12345 | 00:03:30 | Imagine (John Lennon) |
12345 | 00:03:35 | Imagine (John Lennon) |
12345 | 00:03:40 | Imagine (John Lennon) |
I want to be able to have the program pick the best possible option based on previous values, so the output would like something like:
listener_id | interval_timestamp | candidate_song |
---|---|---|
12345 | 00:00:00 | Hey Jude (The Beatles) |
12345 | 00:00:05 | Hey Jude (The Beatles) |
12345 | 00:00:10 | Hey Jude (The Beatles) |
12345 | 00:00:15 | Hey Jude (The Beatles) |
12345 | 00:00:20 | Hey Jude (The Beatles) |
12345 | 00:00:25 | Hey Jude (The Beatles) |
12345 | 00:00:30 | Hey Jude (The Beatles) |
… | ||
12345 | 00:01:30 | Hey Jude (The Beatles) |
… | ||
12345 | 00:03:20 | Hey Jude (The Beatles) |
12345 | 00:03:25 | Imagine (John Lennon) |
12345 | 00:03:30 | Imagine (John Lennon) |
12345 | 00:03:35 | Imagine (John Lennon) |
12345 | 00:03:40 | Imagine (John Lennon) |
1 observation per time interval, with the best guess based on the previous values.
I did some research on RNN and LSTM, but those seem to be mostly used for predictions and forecasts on what the next interval probably is, not choosing from a set of captured observations.
Would anyone be able to point me in the right direction in terms of algorithms/approches to best solve this problem?
Any help is appreciated.
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP