Stack Overflow Asked by Cale Kanack on December 16, 2021
I have created a query where I have device identifiers with the MIN(first_login_datetime), MIN(first_purchase_datetime), and a list of push_datetimes (multiple pushes may have been received per device. See example table below for a single device.
I am attempting to filter out devices that received ANY push between the first_login_datetime and first_purchase_datetime. The device in question (see device_a in table below) has received pushes both between the first_login_datetime and the first_purchase_datetime, but also after the first_purchase_datetime. Because of this, I am not able to add a filter condition for "WHERE push_datetime NOT BETWEEN first_login_datetime AND first_purchase_datetime", as this removes the instances of the push_datetime that are between the two, but the device_a remains in my cohort since there were pushes received after the first_purchase_datetime, even though the device did still have a push between the two datetimes. I need to be able to completely remove this device (and any device that had a push between the two datetimes) from the query, any suggestions?
device_id first_login_datetime first_purchase_datetime push_datetime
1 device_a 2020-06-27T00:26:10 2020-07-02T19:35:05 2020-06-28T20:00:14.187559
2 device_a 2020-06-27T00:26:10 2020-07-02T19:35:05 2020-07-01T20:07:13.020859
3 device_a 2020-06-27T00:26:10 2020-07-02T19:35:05 2020-07-01T23:52:33.943492
4 device_a 2020-06-27T00:26:10 2020-07-02T19:35:05 2020-07-03T16:10:09.884610
5 device_a 2020-06-27T00:26:10 2020-07-02T19:35:05 2020-07-03T19:46:51.505925
I think a window function is the simplest (and fastest) solution:
select t.* except (bad_cnt)
from (select t.*,
countif(push_datetime between first_login_datetime and first_purchase_datetime) over (partition by device_id) as bad_cnt
from t
) t
where bad_cnt > 0
Answered by Gordon Linoff on December 16, 2021
You need to get the distinct devices that you want to remove, then left join against your original table.
with my_table as (select * from `project.dataset.table`),
devices_to_remove as (
select distinct device_id
from my_table
where push_datetime between first_login_datetime and first_purchase_datetime
)
select my_table.*
from my_table
left join devices_to_remove using(device_id)
where devices_to_remove.device_id is null
This will remove all records from any devices that meet your elimination criteria.
Answered by rtenha on December 16, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP