TransWikia.com

Pandas/Python - comparing two columns for matches not in the same row

Data Science Asked on December 9, 2021

I have this data:

[image: screenshot of a DataFrame with columns A and B; values in column A carry a .AX suffix]

I want to compare columns A and B for matches, not row by row, but by checking whether each value in A (A0, A1, and so on) appears anywhere in column B. I also want to ignore the .AX suffix in column A, because with it no value would match anything in column B.

I tried the following, but it compares the values row by row and returns True or False. I would like to write the matches to a new column C instead:

    df3['match'] = df3.A == df3.B

Thank you.

One Answer

To clarify, this question is about comparing two columns to check if the 3-letter combinations match.

So, I would approach this in the following manner:

    # Extract the 3-letter combination from column A (drops the ".AX" suffix)
    df3["A normalised"] = df3["A"].str[:3]

    # Collect the values in column B that also occur in `A normalised`
    b_matches = df3.loc[df3["B"].isin(df3["A normalised"]), "B"].unique()

    # Flag the rows whose normalised A value is one of those matches
    df3["match"] = False
    b_match_idx = df3.index[df3["A normalised"].isin(b_matches)]
    df3.loc[b_match_idx, "match"] = True

EDIT: The parentheses have now been resolved. Also the .loc warning can now be mitigated.
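
As a possible variation, the question asks for the matched values themselves in a column C rather than a True/False flag. The sketch below is one way that might be done, not the method from the answer above. The sample data is hypothetical (the original screenshot is not available), and it assumes the suffix is always a trailing `.AX`, which is stripped with a regular expression rather than by slicing so that codes of any length still work.

```python
import pandas as pd

# Hypothetical sample data; the values in the original screenshot are unknown.
df3 = pd.DataFrame({
    "A": ["ABC.AX", "XYZ.AX", "DEF.AX", "GHI.AX"],
    "B": ["DEF", "ABC", "JKL", "MNO"],
})

# Assumption: the suffix is always a trailing ".AX". Strip it with a regex
# instead of slicing the first three characters.
a_codes = df3["A"].str.replace(r"\.AX$", "", regex=True)

# True where the stripped A value occurs anywhere in column B (not row-wise).
is_match = a_codes.isin(df3["B"])

# Column C holds the matched code, or NaN where there is no match.
df3["C"] = a_codes.where(is_match)

print(df3)
```

If only the matching rows are needed, `df3[is_match]` would filter the frame instead of adding a column.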

Answered by shepan6 on December 9, 2021
