Stack Overflow Asked on December 18, 2021
I have a dataframe df where Col1, Col2 and Col3 are column names:
Col1  Col2  Col3
      a     b
B     2     3
C     10    6
The first row above, with values a and b, holds the subcategory of each column, so Col1 is empty for that row.
I am trying to get the following:
B Col2 a 2
B Col3 b 3
C Col2 a 10
C Col3 b 6
I am not sure how to approach this.
Edit:
df.to_dict()
Out[16]:
{'Unnamed: 0': {0: nan, 1: 'B', 2: 'C'},
'Col2': {0: 'a', 1: '2', 2: '10'},
'Col3': {0: 'b', 1: '3', 2: '6'}}
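Note that the to_dict() output above labels the first column 'Unnamed: 0', while the answers below refer to it as Col1. A minimal sketch for rebuilding the frame and renaming that column, assuming the unnamed column is the one the question calls Col1:

```python
import numpy as np
import pandas as pd

# Rebuild the frame exactly as printed by df.to_dict() in the question.
df = pd.DataFrame({
    "Unnamed: 0": {0: np.nan, 1: "B", 2: "C"},
    "Col2": {0: "a", 1: "2", 2: "10"},
    "Col3": {0: "b", 1: "3", 2: "6"},
})

# Assumption: the unnamed first column is what the question calls Col1.
df = df.rename(columns={"Unnamed: 0": "Col1"})

print(df.columns.tolist())
```

The values in Col2 and Col3 are strings here ('2', '10'), because the subcategory row forces those columns to object dtype.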
You can melt the dataframe, create a new column that holds the value only on the rows where Col1 is null, forward-fill it, and then filter out the rows where value and the new column are equal (the a and b rows themselves):
(
df.melt("Col1")
.assign(temp=lambda x: np.where(x.Col1.isna(), x.value, np.nan))
.ffill()
.query("value != temp")
)
  Col1 variable value temp
1    B     Col2     2    a
2    C     Col2    10    a
4    B     Col3     3    b
5    C     Col3     6    b
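A variant of the melt approach above, written as a self-contained sketch. It forward-fills only the helper column rather than the whole frame, and drops the subcategory rows by their missing Col1 instead of comparing values. The column name sub and the final sort are choices made here for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Col1": [np.nan, "B", "C"],
    "Col2": ["a", "2", "10"],
    "Col3": ["b", "3", "6"],
})

melted = df.melt("Col1")  # columns: Col1, variable, value

# Rows whose Col1 is missing carry the subcategory label in `value`.
# Keep the label on those rows only, then fill it downwards.
sub = melted["value"].where(melted["Col1"].isna()).ffill()

result = (
    melted.assign(sub=sub)
    .dropna(subset=["Col1"])  # drop the subcategory rows themselves
    .loc[:, ["Col1", "variable", "sub", "value"]]
    .sort_values(["Col1", "variable"])
    .reset_index(drop=True)
)
print(result)
```

Sorting by Col1 and variable gives the row order requested in the question (B Col2, B Col3, C Col2, C Col3).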
Answered by sammywemmy on December 18, 2021
You can do the following:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': {0: np.nan, 1: 'B', 2: 'C'},
                   'Col2': {0: 'a', 1: '2', 2: '10'},
                   'Col3': {0: 'b', 1: '3', 2: '6'}})
# long form; dropna() removes the subcategory row, whose Col1 is NaN
melted = (pd.melt(df, id_vars=['Col1'], value_vars=['Col3', 'Col2'])
          .dropna().reset_index(drop=True))
# subcategory of each column, hard-coded here
subframe = pd.DataFrame({'Col2': ['a'], 'Col3': ['b']}).melt()
melted.merge(subframe, on='variable')
Out[1]:
  Col1 variable value_x value_y
0    B     Col3       3       b
1    C     Col3       6       b
2    B     Col2       2       a
3    C     Col2      10       a
Then you can rename the columns as you like.
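The subcategory lookup does not have to be typed by hand. A rough sketch, assuming the first row always holds the subcategories: it builds the lookup from that row and names the value columns up front, so the merge produces no value_x/value_y suffixes. The column name subcategory is a hypothetical choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Col1": [np.nan, "B", "C"],
    "Col2": ["a", "2", "10"],
    "Col3": ["b", "3", "6"],
})

# Lookup built from the first row: variable -> subcategory.
subframe = df.iloc[[0]].drop(columns="Col1").melt(value_name="subcategory")

# Data rows only (those with a Col1 label), in long form.
melted = df.iloc[1:].melt(id_vars="Col1")

result = melted.merge(subframe, on="variable")[
    ["Col1", "variable", "subcategory", "value"]
]
print(result)
```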
Answered by Youssef Ali on December 18, 2021
You can try this, replacing the NaN with a blank (or with any string you want the column to be named):
(df.fillna('').set_index('Col1').T
   .set_index('', append=True).stack().reset_index())
Output:
  level_0     Col1   0
0    Col2  a     B   2
1    Col2  a     C  10
2    Col3  b     B   3
3    Col3  b     C   6
# same idea, naming the subcategory column Col0 instead of leaving it blank
(df.fillna('Col0').set_index('Col1').T
   .set_index('Col0', append=True).stack().reset_index(level=[1, 2]))
Output:
     Col0 Col1   0
Col2    a    B   2
Col2    a    C  10
Col3    b    B   3
Col3    b    C   6
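The same transpose-and-stack idea can also be written without filling the NaN with a placeholder name, by attaching the subcategories to the column labels first. This is only a sketch of that alternative; the level names variable and subcategory and the column name value are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Col1": [np.nan, "B", "C"],
    "Col2": ["a", "2", "10"],
    "Col3": ["b", "3", "6"],
})

# Data rows only, indexed by Col1; the columns are still Col2 and Col3.
wide = df.iloc[1:].set_index("Col1")

# Pair each column name with its subcategory from the first row.
wide.columns = pd.MultiIndex.from_arrays(
    [wide.columns, df.iloc[0, 1:].tolist()],
    names=["variable", "subcategory"],  # hypothetical level names
)

# Transpose so (variable, subcategory) is the row index, then stack Col1.
long = wide.T.stack().rename("value").reset_index()
print(long)
```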
Answered by Scott Boston on December 18, 2021
Use stack and join:
df_final = (df.iloc[1:].set_index('Col1').stack().reset_index(0)
.join(df.iloc[0,1:].rename('1')).sort_values('Col1'))
Out[345]:
     Col1   0  1
Col2    B   2  a
Col3    B   3  b
Col2    C  10  a
Col3    C   6  b
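For readability, the stack-and-join one-liner can be split into named steps and finished with descriptive column names. A sketch under the same assumptions as the answer (the first row holds the subcategories); the names subcategory, value and variable are illustrative, not from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Col1": [np.nan, "B", "C"],
    "Col2": ["a", "2", "10"],
    "Col3": ["b", "3", "6"],
})

# Subcategory per column, indexed by column name (Col2, Col3).
sub = df.iloc[0, 1:].rename("subcategory")

# Stack the data rows: the index becomes (Col1, column name).
stacked = df.iloc[1:].set_index("Col1").stack().rename("value")

# Move Col1 back to a column; the remaining index holds the column names,
# which is what the join with `sub` aligns on.
long = stacked.reset_index(0).join(sub)

result = (
    long.rename_axis("variable")
    .reset_index()[["Col1", "variable", "subcategory", "value"]]
    .sort_values(["Col1", "variable"])
    .reset_index(drop=True)
)
print(result)
```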
Answered by Andy L. on December 18, 2021
import numpy as np
import pandas as pd

df = pd.DataFrame.from_dict({'Col1': {0: np.nan, 1: 'B', 2: 'C'},
                             'Col2': {0: 'a', 1: '2', 2: '10'},
                             'Col3': {0: 'b', 1: '3', 2: '6'}})
# the first row holds the subcategory of each value column (Col2 -> a, Col3 -> b)
subcats = df.iloc[0, 1:]
# get rid of the subcategory row and melt the remaining rows
answer = pd.melt(df.iloc[1:], id_vars=['Col1'], value_vars=['Col2', 'Col3'], value_name='vals')
# look up the subcategory by column name; using the first row as a row index
# instead would pair a/b with the rows B/C rather than with Col2/Col3
answer[0] = answer['variable'].map(subcats)
answer[['Col1', 'variable', 0, 'vals']]
  Col1 variable  0 vals
0    B     Col2  a    2
1    C     Col2  a   10
2    B     Col3  b    3
3    C     Col3  b    6
Answered by MattR on December 18, 2021