Data Science Asked by sofissecondhuman on November 27, 2020
Let’s say we have a 6×4 data frame in which the third and fourth columns contain missing values:
1 2 3 L1
4 5 6 L2
7 8 9 L3
4 8 NaN NaN
2 3 4 5
7 9 NaN NaN
I’d like to fill the missing values by looking at another row that has the same value in the first column. So, in the end, I should have:
1 2 3 L1
4 5 6 L2
7 8 9 L3
4 8 6 L2 <- Taken from 4 5 6 L2 row
2 3 4 L4
7 9 9 L3 <- Taken from 7 8 9 L3 row
How can we do it with Pandas in the fastest way possible?
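Sort by the first column, using a stable sort so that each complete row stays above the rows that share its key, then forward-fill the NaN values:
import numpy as np
import pandas as pd

data = np.array([[1,2,3,'L1'],[4,5,6,'L2'],[7,8,9,'L3'],[4,8,np.nan,np.nan],[2,3,4,5],[7,9,np.nan,np.nan]],dtype='object')
df = pd.DataFrame(data,columns=['A','B','C','D'])
# kind='stable' keeps the original row order within each value of A
df = df.sort_values(by='A', kind='stable')
# ffill() replaces the deprecated fillna(method='ffill'); assign the result,
# then sort_index() restores the original row order
df = df.ffill().sort_index()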
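If sorting is undesirable, the same fill can be sketched per key with a groupby. This is a minimal sketch, not a benchmarked solution: it assumes the donor for a key is the first non-null value found among rows with that key, and the names `donor` and `filled` are purely illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the data frame built above.
df = pd.DataFrame(
    {
        "A": [1, 4, 7, 4, 2, 7],
        "B": [2, 5, 8, 8, 3, 9],
        "C": [3, 6, 9, np.nan, 4, np.nan],
        "D": ["L1", "L2", "L3", np.nan, 5, np.nan],
    }
)

# For each group of rows sharing the same A, take the first non-null
# value of C and D and broadcast it back to every row of that group.
donor = df.groupby("A")[["C", "D"]].transform("first")

# Only the missing cells are replaced; existing values and row order are kept.
filled = df.fillna(donor)
print(filled)
```

Unlike a plain forward-fill, this cannot borrow a value from a different key: a key with no complete row simply stays NaN. It also fills a row whose donor appears later in the frame.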
Answered by 10xAI on November 27, 2020