Stack Overflow Asked by HattrickNZ on January 3, 2022
Can I convert a CSV file into JSON as follows:
csv = headers in line1 with values below
json = [{"key1":"value1",...},{"key1":"value2",...}...]
This is my csv file:
$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"
This is my script:
$ cat csv_to_json.py
#!/usr/bin/python
#from here
#https://stackoverflow.com/a/7550352/2392358
import csv, json
csvreader = csv.reader(open('head_data.csv', 'rb'), delimiter='\t',
                       quotechar='"')
data = []
for row in csvreader:
    r = []
    for field in row:
        if field == '': field = None
        else: field = unicode(field, 'ISO-8859-1')
        r.append(field)
    data.append(r)
jsonStruct = {
    'header': data[0],
    'data': data[1:]
}
open('head_data.json', 'wb').write(json.dumps(jsonStruct))
Running my script and its output:
$ python csv_to_json.py
$ cat -v head_data.json
{"header": ["Rec Open Date,"MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description""], "data": [["2016-05-30,"686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)""], ["2016-05-30,"533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan""], ["2016-05-30,"191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan""], ["2016-05-30,"660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN""], ["2016-05-30,"182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -""], ["2016-05-30,"993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan""], ["2016-05-30,"894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2""], ["2016-05-30,"600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -""], ["2016-05-30,"234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan""]]}
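Each row in the output above holds a single field. That is what csv.reader yields when the configured delimiter never occurs in a line, which suggests the script's delimiter was a tab ('\t') while the file is comma-separated. A minimal Python 3 sketch of the difference (the `sample` line is illustrative and mirrors the first two header columns; it is not part of the original script):

```python
import csv
import io

# Illustrative sample: the first two header columns from head_data.csv.
sample = '"Rec Open Date","MSISDN"\n'

# With a tab delimiter the comma is never treated as a separator,
# so the whole line comes back as a single field.
tab_row = next(csv.reader(io.StringIO(sample), delimiter="\t", quotechar='"'))

# With a comma delimiter each quoted value becomes its own field.
comma_row = next(csv.reader(io.StringIO(sample), delimiter=",", quotechar='"'))

print(len(tab_row), tab_row)
print(len(comma_row), comma_row)
```

Switching the delimiter to a comma gives one field per column, which is the starting point for building one dictionary per row.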
Can I slightly modify the code so that I get output like this:
[{"Rec Open Date":"2016-07-03","MSISDN":540,"IMEI":990,"Data Volume (Bytes)":36671453,"Device Manufacturer":"HUAWEI Technologies Co Ltd","Device Model":"H1512","Product Description":"PREPAY PLUS - $0 -"},
{"Rec Open Date":"2016-07-03","MSISDN":334,"IMEI":340,"Data Volume (Bytes)":129835114,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone S (A1530)","Product Description":"$29.95 Plan"},
{"Rec Open Date":"2016-07-03","MSISDN":133,"IMEI":870,"Data Volume (Bytes)":42213030,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone 6 Plus (A1524)","Product Description":"$49.95 Plan"}]
Edit 1: I found this here, but it does the conversion in the browser, and I think it uses JavaScript.
This is the file I want to convert
$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung,A, SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"
This is the script:
$ cat -v csv_to_json2.py
#!/usr/bin/python
#from here
#https://stackoverflow.com/a/38193687/2392358
import csv
import json
from collections import OrderedDict
dR = csv.DictReader(open("head_data.csv"))
oD = [OrderedDict(
          sorted(dct.iteritems(),
                 key=lambda item: dR.fieldnames.index(item[0])))
      for dct in dR]
#print to terminal
print json.dumps(oD)
#write to file
#json.dump(oD,"head_op.json")
open('head_op.json', 'wb').write(json.dumps(oD))
Running the script:
$ python csv_to_json2.py
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]
This is the output:
$ cat -v head_op.json
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]
Using the pandas library was the easiest for me.
Install pandas:
pip install pandas
Create a csv2json.py script:
import sys
import pandas as pd
data_frame = pd.read_csv(sys.argv[1])
data_frame.to_json(sys.argv[1].replace('.csv', '.json'), orient='records', indent=2)
Run the csv2json.py script on an example.csv input file:
python csv2json.py example.csv
This generates an example.json file.
Example:
input (example.csv):
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"
output (example.json):
[
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":686,
    "IMEI":230,
    "Data Volume (Bytes)":63979,
    "Device Manufacturer":"Samsung SM-G935FD ",
    "Device Model":"Samsung SM-G935FD",
    "Product Description":"$29.95 Carryover Plan (1GB)"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":533,
    "IMEI":970,
    "Data Volume (Bytes)":171631866,
    "Device Manufacturer":"Apple iPhone 6 (A1586)",
    "Device Model":"iPhone 6 (A1586)",
    "Product Description":"$69.95 Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":191,
    "IMEI":610,
    "Data Volume (Bytes)":145713,
    "Device Manufacturer":"Samsung GT-I9195",
    "Device Model":"Samsung GT-I9195",
    "Product Description":"$29.95 Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":660,
    "IMEI":660,
    "Data Volume (Bytes)":2994742,
    "Device Manufacturer":"Samsung SM-N920I",
    "Device Model":"Samsung SM-N920I",
    "Product Description":"GOVERNMENT TIER 2 PLAN"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":182,
    "IMEI":970,
    "Data Volume (Bytes)":37799939,
    "Device Manufacturer":"Samsung SM-J200Y",
    "Device Model":"Samsung SM-J200Y",
    "Product Description":"PREPAY PLUS - $0 -"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":993,
    "IMEI":360,
    "Data Volume (Bytes)":14096114,
    "Device Manufacturer":"Samsung SM-A300Y",
    "Device Model":"Samsung SM-A300Y",
    "Product Description":"$39.95 Carryover Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":894,
    "IMEI":730,
    "Data Volume (Bytes)":9851177,
    "Device Manufacturer":"Samsung GT-N7105",
    "Device Model":"Samsung GT-N7105",
    "Product Description":"PREPAY STD - $0 - #2"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":600,
    "IMEI":70,
    "Data Volume (Bytes)":18420650,
    "Device Manufacturer":"Apple iPhone 5C (A1529)",
    "Device Model":"Apple iPhone 5C (A1529)",
    "Product Description":"PREPAY PLUS - $0 -"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":234,
    "IMEI":0,
    "Data Volume (Bytes)":1769661,
    "Device Manufacturer":"Galaxy S7 SM-G930F ",
    "Device Model":"Galaxy S7 SM-G930F",
    "Product Description":"$39.95 Plan"
  }
]
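Note that in the output above the IMEI values "070" and "000" came out as 70 and 0, because read_csv inferred an integer column. If identifier-like columns should stay as text, one option is the dtype argument of pd.read_csv. A minimal sketch, assuming MSISDN and IMEI are the columns to keep as strings (the in-memory `csv_text` is a cut-down excerpt of example.csv, used here only for illustration):

```python
import io
import json

import pandas as pd

# Three-column excerpt of example.csv, held in memory for illustration.
csv_text = (
    '"MSISDN","IMEI","Data Volume (Bytes)"\n'
    '"600","070","18420650"\n'
    '"234","000","1769661"\n'
)

# Assumption: MSISDN and IMEI are identifiers, so read them as strings;
# the remaining columns are still type-inferred by pandas.
data_frame = pd.read_csv(io.StringIO(csv_text), dtype={"MSISDN": str, "IMEI": str})

# Without a path argument, to_json returns the JSON text instead of writing a file.
json_text = data_frame.to_json(orient="records", indent=2)
records = json.loads(json_text)
print(json_text)
```

The byte count is still parsed as a number, while the leading zeros in IMEI survive.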
Answered by Florent Roques on January 3, 2022
An alternative command line solution:
pip install pyexcel-cli pyexcel-text
pyexcel transcode --name-columns-by-row 0 --output-file-type json example.csv -
output:
{"example.csv": [{"Data Volume (Bytes)": 63979, "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung SM-G935FD", "IMEI": 230, "MSISDN": 686, "Product Description": "$29.95 Carryover Plan (1GB)", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 171631866, "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "IMEI": 970, "MSISDN": 533, "Product Description": "$69.95 Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 145713, "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "IMEI": 610, "MSISDN": 191, "Product Description": "$29.95 Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 2994742, "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "IMEI": 660, "MSISDN": 660, "Product Description": "GOVERNMENT TIER 2 PLAN", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 37799939, "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "IMEI": 970, "MSISDN": 182, "Product Description": "PREPAY PLUS - $0 -", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 14096114, "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "IMEI": 360, "MSISDN": 993, "Product Description": "$39.95 Carryover Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 9851177, "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "IMEI": 730, "MSISDN": 894, "Product Description": "PREPAY STD - $0 - #2", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 18420650, "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "IMEI": "070", "MSISDN": 600, "Product Description": "PREPAY PLUS - $0 -", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 1769661, "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", "IMEI": "000", "MSISDN": 234, "Product Description": "$39.95 Plan", "Rec Open Date": 
"2016-05-30"}]}
Answered by chfw on January 3, 2022
If you want to keep the order of the keys, don't use csv.DictReader,
since it overcomplicates things. Just record the header and then zip
it with each of the rows:
import csv
from collections import OrderedDict
reader = csv.reader(open("text.csv"))
header = next(reader)
data = [OrderedDict(zip(header,fields)) for fields in reader]
Then you can write it to a file with this:
import json
with open("new.json","w") as f:
    json.dump(data, f)
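The snippets above leave every value as a string, whereas the desired output in the question shows numbers for MSISDN, IMEI and Data Volume (Bytes). Here is a Python 3 sketch of the same header-plus-zip idea with an optional conversion step. `to_number` and `csv_to_records` are hypothetical helpers, and the rule of leaving values with leading zeros (such as "070") as strings is an assumption, not something the question specifies:

```python
import csv
import io
import json


def to_number(value):
    """Return an int when the text round-trips exactly, else the text unchanged."""
    try:
        number = int(value)
    except ValueError:
        return value
    # "070" -> 70 -> "70" does not round-trip, so it stays a string.
    return number if str(number) == value else value


def csv_to_records(lines):
    """Zip the header row with each data row; dicts keep insertion order on Python 3.7+."""
    reader = csv.reader(lines)
    header = next(reader)
    return [dict(zip(header, map(to_number, fields))) for fields in reader]


# In-memory sample standing in for the real file.
sample = io.StringIO(
    '"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)"\n'
    '"2016-05-30","686","230","63979"\n'
    '"2016-05-30","600","070","18420650"\n'
)
records = csv_to_records(sample)
print(json.dumps(records))
```

For a real file, the same function could be called as `csv_to_records(f)` inside `with open("text.csv", newline="") as f:`.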
Answered by Tadhg McDonald-Jensen on January 3, 2022
If you don't care about the order of the keys, just do:
import csv
import json
json.dumps(list(csv.DictReader(open("file.csv"))))
Check the pretty-printing section of the json manual for more options, or do
json.dumps(list(csv.DictReader(open("file.csv")))).replace("}, ", "},\n")
to get your expected output.
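As a sketch of the pretty-printing options (Python 3; the `rows` list is sample data standing in for the DictReader result): json.dumps accepts an indent argument, and dumping each record separately gives one object per line without string replacement, which would also rewrite any value that happens to contain "}, ".

```python
import json

# Sample records standing in for list(csv.DictReader(open("file.csv"))).
rows = [
    {"MSISDN": "686", "Product Description": "$29.95 Plan"},
    {"MSISDN": "533", "Product Description": "$69.95 Plan"},
]

# Option 1: indented output via the standard indent argument.
indented = json.dumps(rows, indent=2)

# Option 2: one record per line, by dumping each record on its own.
one_per_line = "[" + ",\n".join(json.dumps(row) for row in rows) + "]"

print(indented)
print(one_per_line)
```

Both strings parse back to the same list, so the choice is purely about how the file reads.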
If you want ordered printing, you may order the keys via OrderedDict:
import csv
import json
from collections import OrderedDict
dR = csv.DictReader(open("/tmp/ah.csv"))
oD = [OrderedDict(
          sorted(dct.iteritems(),
                 key=lambda item: dR.fieldnames.index(item[0])))
      for dct in dR]
json.dumps(oD)
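The snippet above is Python 2 (it relies on iteritems). On Python 3.7 and later the re-sorting step should not be needed, since csv.DictReader builds each row in fieldnames order and dictionaries preserve insertion order. A minimal sketch with an in-memory sample in place of the real file:

```python
import csv
import io
import json

# In-memory sample standing in for the CSV file.
sample = io.StringIO(
    '"Rec Open Date","MSISDN","IMEI"\n'
    '"2016-05-30","686","230"\n'
)

reader = csv.DictReader(sample)
# Each row is built in header order, so no explicit sorting is required.
rows = list(reader)

json_text = json.dumps(rows)
print(json_text)
```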
Answered by xvan on January 3, 2022