TransWikia.com

Look up a number inside a list within a pandas cell, and return corresponding string value from a second DF

Data Science Asked by Donum on January 14, 2021

(I’ve edited the first column name in the labels_df for clarity)

I have two DataFrames, train_df and labels_df. Each train_df cell holds a list of integers that map to attribute names in labels_df. I would like to look up each number in a given train_df cell and return the corresponding attribute names from labels_df in an adjacent cell.


I’ve tried variations of the function below but fear I am wayyy off:

def my_mapping(df1, df2):
    tags = df1['attribute_ids']
    for i in tags.iteritems():
        df1['new_col'] = df2.iloc[i]
    return df1

The data are originally from two CSV files, train.csv and labels.csv (shown as screenshots in the original post).

I tried this suggestion from @Danny:

sample_train_df['attribute_ids'].apply(lambda x: [sample_labels_df[sample_labels_df['attribute_name'] == i]
                                              ['attribute_id_num'] for i in x])

Note: I am running the above code on samples of each DataFrame because of the run times on the original DataFrames. It returned:

[screenshot of the output]

One Answer

I created my own data.

train.csv

id,attrib
1,1 2 3
2,3 4 5
3,2 3 5
4,1 1 1

labels.csv

attrib_id,attrib_name
1,a
2,b
3,c
4,d
5,e

Read the CSV files into two DataFrames: df1 (from train.csv) and df2 (from labels.csv).
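As a minimal sketch of that loading step, the snippet below reads the two sample files from inline strings so that it runs without anything on disk. The names TRAIN_CSV and LABELS_CSV are placeholders introduced here for illustration; with real files you would pass the paths to pd.read_csv instead.

```python
import io

import pandas as pd

# Inline copies of the two sample files above (TRAIN_CSV and LABELS_CSV are
# illustrative names, not part of the original answer).
TRAIN_CSV = """id,attrib
1,1 2 3
2,3 4 5
3,2 3 5
4,1 1 1
"""

LABELS_CSV = """attrib_id,attrib_name
1,a
2,b
3,c
4,d
5,e
"""

# With files on disk this would be pd.read_csv("train.csv") and
# pd.read_csv("labels.csv").
df1 = pd.read_csv(io.StringIO(TRAIN_CSV))
df2 = pd.read_csv(io.StringIO(LABELS_CSV))

# attrib is read as text ("1 2 3"), while attrib_id is read as integers,
# which is why the ids have to be converted before they can be compared.
print(df1.dtypes)
print(df2.dtypes)
```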

After that, use:

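def get_name(x):
    # x is a space-separated string of ids, e.g. "1 2 3"
    result = []
    for t in x.split(' '):
        # cast each token to int so it matches the integer attrib_id column
        result.append(df2[df2['attrib_id'] == int(t)]['attrib_name'].values[0])
    return result

# replace each string of ids with the list of matching attribute names
df1['attrib'] = df1['attrib'].apply(get_name)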

This will result in df1 looking like:

[screenshot of the resulting df1]

I guess you were doing the same thing when you tried @Danny's suggestion. The only important thing here is to convert each string into an integer before comparing it with attrib_id.
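If filtering df2 once per id turns out to be slow on the full data, one possible variation (not part of the original answer) is to build an id-to-name dictionary once and look each id up in it. The sketch below assumes the same sample data as above; id_to_name, ids_to_names and the attrib_names column are names made up for this example.

```python
import pandas as pd

# Same sample data as above, built inline so the sketch is self-contained.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "attrib": ["1 2 3", "3 4 5", "2 3 5", "1 1 1"],
})
df2 = pd.DataFrame({
    "attrib_id": [1, 2, 3, 4, 5],
    "attrib_name": ["a", "b", "c", "d", "e"],
})

# Build the lookup once instead of filtering df2 for every single id.
# (id_to_name is an illustrative name.)
id_to_name = dict(zip(df2["attrib_id"], df2["attrib_name"]))


def ids_to_names(cell):
    """Turn a string such as "1 2 3" into a list of attribute names."""
    # int(token) is the string-to-integer conversion the answer stresses;
    # split() with no argument also tolerates repeated spaces.
    return [id_to_name[int(token)] for token in cell.split()]


# Write to a new column (attrib_names, hypothetical) so the ids are kept.
df1["attrib_names"] = df1["attrib"].apply(ids_to_names)
print(df1)
```

An id that is missing from df2 raises a KeyError here, which may or may not be what you want; id_to_name.get(int(token)) would return None instead.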

Answered by shivam shah on January 14, 2021
