Stack Overflow Asked on November 22, 2021
I have thousands of dictionaries that I need to put into a single pandas data frame. The dictionaries look like this:
{'screen_width': 375,
'city': 'London',
'source': 'Mobile',
'appVersion': '5.3.0',
'connectionType': 'wifi',
'sheetName': 'Regional Asset',
'$device': 'iPhone',
'$user_id': '[email protected]',
'$device_id': '172fe47',
'$os': 'iOS',
'$manufacturer': 'Apple',
'$os_version': '13.4.1',
'$lib_version': '1.3.0',
'distinct_id': '[email protected]',
'fieldName': 'barcode',
'$screen_height': 812,
'mp_country_code': 'UK',
'$model': 'iPhone12,3',
'time': 1593404157
}
The problem is that any given dictionary might be missing an entry (such as city), in which case the key isn't there either. Looking up that key then fails, and this is causing me massive problems.
What I’ve tried so far:
file = ('{0}.csv'.format(file_name))
df = pd.read_json(file)
df1 = pd.DataFrame(columns = [Column_names])
for i in range(df.shape[0]):
    df1.loc[i] = ([df.iloc[i,0]] + [df.iloc[i,1]['$screen_width']] + [df.iloc[i,1]['$city']] + [df.iloc[i,1]['source']] + [df.iloc[i,1]['connectionType']]
        + [df.iloc[i,1]['sheetName']] + [df.iloc[i,1]['$device']] + [df.iloc[i,1]['$user_id']] + [df.iloc[i,1]['$device_id']]
        + [df.iloc[i,1]['$os']] + [df.iloc[i,1]['mp_country_code']] + [df.iloc[i,1]['$manufacturer']] + [df.iloc[i,1]['$os_version']] + [df.iloc[i,1]['$lib_version']]
        + [df.iloc[i,1]['distinct_id']] + [df.iloc[i,1]['$screen_height']]+ [df.iloc[i,1]['$model']] + [df.iloc[i,1]['$region']]
        + [df.iloc[i,1]['mp_lib']] + [df.iloc[i,1]['time']] + [df.iloc[i,1]['mp_processing_time_ms']] + [df.iloc[i,1]['$browser']] + [df.iloc[i,1]['$insert_id']])
But as soon as it comes across a dictionary with city missing I get
KeyError: '$city'
I’ve also tried to add
try:
    # the row-building code from the loop above
except (KeyError):
    pass
But that just returns an empty data frame.
Can anyone help?
Thanks
@Venkat J provided the list of dictionaries used here. You can pass such a list directly to the DataFrame constructor; any key missing from a dictionary becomes NaN in that row.
import pandas as pd
data = [
{"col1": "10",
"col2": "London",
"col3": "Mobile",
"col4": "Mobile"},
{"col1": "20",
"col2": "TOKYO",
"col4": "Mobile",
"col5": "Mobile"},
{"col1": "30",
"col2": "NewYork",
"col3": "Mobile",
"col4": "Mobile",
"col5": "Mobile"}
]
pd.DataFrame(data)
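To tie this back to the question, here is a minimal sketch with a record that has no city key. The `records` list and its values are made up for illustration; the point is only that the constructor fills the gap with NaN rather than raising KeyError.

```python
import pandas as pd

# Hypothetical records shaped like the question's dictionaries;
# the second one has no 'city' key.
records = [
    {"city": "London", "source": "Mobile", "time": 1593404157},
    {"source": "Mobile", "time": 1593404200},
]

df = pd.DataFrame(records)

# The missing key shows up as NaN in that row instead of raising KeyError.
print(df)
print(df["city"].isna().tolist())
```

If the dictionaries sit in a column of an existing frame, as the question's `df.iloc[i,1]` suggests, something like `pd.DataFrame(df.iloc[:, 1].tolist())` should work the same way, though that is an assumption about how the file is laid out.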
Answered by jsmart on November 22, 2021
If you know all the possible columns that exist in your JSON file, you can define a DataFrame with all possible columns and then load each dictionary into it. The file sample.txt contains a list of dictionaries:
[
{"col1": "10",
"col2": "London",
"col3": "Mobile",
"col4": "Mobile"},
{"col1": "20",
"col2": "TOKYO",
"col4": "Mobile",
"col5": "Mobile"},
{"col1": "30",
"col2": "NewYork",
"col3": "Mobile",
"col4": "Mobile",
"col5": "Mobile"}
]
Program:
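import pandas as pd
import json

if __name__ == "__main__":
    # Start with an empty frame that lists every possible column.
    result = pd.DataFrame(columns=['col1', 'col2', 'col3', 'col4', 'col5'])
    with open('sample.txt', 'r') as f:
        raw_data = json.loads(f.read())
    for i in raw_data:
        # DataFrame.append was removed in pandas 2.0; pd.concat replaces it.
        result = pd.concat([result, pd.DataFrame([i])], ignore_index=True)
        print(i)
    print(result)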
The output of the program is:
  col1     col2    col3    col4    col5
0   10   London  Mobile  Mobile     NaN
1   20    TOKYO     NaN  Mobile  Mobile
2   30  NewYork  Mobile  Mobile  Mobile
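If only a fixed set of keys is needed, the KeyError from the question can also be avoided with `dict.get`, which returns None for an absent key. The sketch below is one possible way to do it; the `wanted` list and the sample `records` are hypothetical and stand in for the real column names and data.

```python
import pandas as pd

# Hypothetical column list and records, for illustration only.
wanted = ["city", "source", "time"]
records = [
    {"city": "London", "source": "Mobile", "time": 1593404157},
    {"source": "Mobile", "time": 1593404200, "extra": "ignored"},
]

# dict.get returns None for an absent key instead of raising KeyError,
# so every row has one value per wanted column.
rows = [[rec.get(key) for key in wanted] for rec in records]

df = pd.DataFrame(rows, columns=wanted)
print(df)
```

Keys that are not in `wanted` (such as `extra` here) are simply left out, which may or may not be what you want.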
Answered by Venkat J on November 22, 2021
import pandas as pd
my_dict ={'screen_width': 375,
'city': 'London',
'source': 'Mobile',
'appVersion': '5.3.0',
'connectionType': 'wifi',
'sheetName': 'Regional Asset',
'$device': 'iPhone',
'$user_id': '[email protected]',
'$device_id': '172fe47',
'$os': 'iOS',
'$manufacturer': 'Apple',
'$os_version': '13.4.1',
'$lib_version': '1.3.0',
'distinct_id': '[email protected]',
'fieldName': 'barcode',
'$screen_height': 812,
'mp_country_code': 'UK',
'$model': 'iPhone12,3',
'time': 1593404157
}
my_dict_2 = {'screen_width': 375,
'source': 'Mobile',
'appVersion': '5.3.0',
'connectionType': 'wifi',
'sheetName': 'Regional Asset',
'$device': 'iPhone',
'$user_id': '[email protected]',
'$device_id': '172fe47',
'$os': 'iOS',
'$manufacturer': 'Apple',
'$os_version': '13.4.1',
'$lib_version': '1.3.0',
'distinct_id': '[email protected]',
'fieldName': 'barcode',
'$screen_height': 812,
'mp_country_code': 'UK',
'$model': 'iPhone12,3',
'time': 1593404157}
my_dict_3 = {'screen_width': 375,
'city': 'London',
'source': 'Mobile',
'appVersion': '5.3.0',
'connectionType': 'wifi',
'$device': 'iPhone',
'$user_id': '[email protected]',
'$device_id': '172fe47',
'$os': 'iOS',
'$manufacturer': 'Apple',
'$os_version': '13.4.1',
'$lib_version': '1.3.0',
'distinct_id': '[email protected]',
'fieldName': 'barcode',
'$screen_height': 812,
'mp_country_code': 'UK',
'$model': 'iPhone12,3',
'time': 1593404157}
list_of_dictionaries = [my_dict, my_dict_2, my_dict_3]
start = True
for my_dict in list_of_dictionaries:
    if start:
        my_df = pd.DataFrame.from_dict([my_dict])
        start = False
    else:
        my_df = pd.concat([my_df, pd.DataFrame.from_dict([my_dict])])
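One thing to note about the loop: every single-row frame carries index label 0, and `pd.concat` keeps those labels, so the result has a repeated index. A possible variation, sketched here with shortened made-up dictionaries, collects the frames first and concatenates once with `ignore_index=True`, which also avoids copying the growing frame on each pass.

```python
import pandas as pd

# Hypothetical stand-ins for the dictionaries above; the second lacks 'city'.
list_of_dictionaries = [
    {"city": "London", "source": "Mobile"},
    {"source": "Mobile"},
    {"city": "London", "source": "Mobile"},
]

# Build one single-row frame per dictionary, then concatenate once.
frames = [pd.DataFrame([d]) for d in list_of_dictionaries]

# ignore_index=True renumbers the rows 0..n-1 instead of repeating label 0.
my_df = pd.concat(frames, ignore_index=True)
print(my_df)
```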
Answered by rab1262 on November 22, 2021