Data Science Asked on November 10, 2021
I would like to compare two columns and find common value sets in each column, then output the rows with the common values.
Let’s say I have a dataframe with:
no.(col1) | Username (col2) | Referral(col3) | email(col4)
0 | john | mike | [email protected]
1 | peter | paul | [email protected]
2 | joan | patricia | [email protected]
3 | mike | john | [email protected]
The output would be “0 | john | mike | [email protected]” and “3 | mike | john | [email protected]”, because the col2 and col3 values of one row appear, swapped, in col3 and col2 of the other.
Suppose the data frame is named df. One approach is to pull the two columns out as separate NumPy arrays and then compare them to find the matching pairs:
column1 = df.iloc[:,1].values
column2 = df.iloc[:,2].values
Then collect every index i whose (column1[i], column2[i]) pair appears swapped in some row j:
equal_indices = []
n = len(column1)  # column1 and column2 have the same length
for i in range(n):
    for j in range(n):
        # row j mirrors row i: Username and Referral are swapped
        if column1[i] == column2[j] and column2[i] == column1[j]:
            equal_indices.append(i)
            print(i, j)
            print(column1[i], column2[i])
            break  # stop at the first match so i is not appended twice
Now equal_indices contains all the indices you want. You can then either delete those rows, or simply select them from the dataframe df:
df.iloc[equal_indices]
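The nested loop above compares every row with every other row, which gets slow on large frames. As a rough alternative (a sketch, not the only way to do it), the same rows can be found with a set lookup. The column names Username and Referral are taken from the question's table, the "no." column is assumed to be the index, and the email values are placeholders because the originals are redacted.

```python
import pandas as pd

# Sample data modelled on the question's table.
# The email addresses are hypothetical placeholders.
df = pd.DataFrame({
    "Username": ["john", "peter", "joan", "mike"],
    "Referral": ["mike", "paul", "patricia", "john"],
    "email": ["john@example.com", "peter@example.com",
              "joan@example.com", "mike@example.com"],
})

# Every (Username, Referral) pair that occurs in the frame.
pairs = set(zip(df["Username"], df["Referral"]))

# Keep a row when its swapped pair (Referral, Username) also occurs.
mask = [(referral, username) in pairs
        for username, referral in zip(df["Username"], df["Referral"])]

result = df.loc[mask]
print(result)
```

Like the loop, this also keeps a row whose Username equals its own Referral, since such a row mirrors itself; filter those out separately if that is not wanted.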
Answered by Fatemeh Asgarinejad on November 10, 2021