How to compare and find common values from different columns in same dataframe?

Question

I would like to compare two columns, find the sets of values they have in common, and then output the rows that contain those common values.

Let's say I have a dataframe with:

no. (col1) | Username (col2) | Referral (col3) | email (col4)
0 | john | mike | email0@email.com
1 | peter | paul | email1@email.com
2 | joan | patricia | email2@email.com
3 | mike | john | email3@email.com

Fatemeh Asgarinejad · Answer

Suppose the data frame is named df. One approach is to pull the two columns out into separate arrays and then compare them to find the matching pairs:

column1 = df.iloc[:,1].values
column2 = df.iloc[:,2].values

Then collect every index i in column1 whose (Username, Referral) pair appears reversed in some row of the dataframe:

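equal_indices = []
for i in range(len(column1)):
    # column1 and column2 have the same length, so either works here
    for j in range(len(column2)):
        # row j holds the same pair as row i, with the two values swapped
        if column1[i] == column2[j] and column2[i] == column1[j]:
            equal_indices.append(i)
            print(i, j)
            print(column1[i], column2[i])
            break  # one match is enough; avoids adding i more than once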

Now equal_indices contains all the indices you want. You can then drop those rows from df, or simply select them from df by position:

df.iloc[equal_indices]
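
The nested loop above compares every row with every other row, which can get slow on large frames. As a rough sketch of an alternative (not part of the original answer), the same swapped-pair check can be done with a set lookup. The helper name reciprocal_rows and the plain lists standing in for the two columns are assumptions made for illustration:

```python
def reciprocal_rows(first, second):
    """Return positions i where the pair (first[i], second[i])
    also occurs somewhere with its two values swapped."""
    # All (first, second) pairs that exist, for constant-time lookup
    pairs = set(zip(first, second))
    # Keep position i if the reversed pair is present in the data
    return [i for i, (a, b) in enumerate(zip(first, second)) if (b, a) in pairs]


# Hypothetical stand-ins for the Username and Referral columns in the question
usernames = ["john", "peter", "joan", "mike"]
referrals = ["mike", "paul", "patricia", "john"]

matches = reciprocal_rows(usernames, referrals)
print(matches)  # [0, 3]
```

With the dataframe from the question, this could presumably be used as df.iloc[reciprocal_rows(df.iloc[:, 1], df.iloc[:, 2])]. Like the loop, it also reports a row whose Username and Referral are identical, so filter those out first if that is not wanted.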
