
converting a csv file to json + python with specific json format

Stack Overflow Asked by HattrickNZ on January 3, 2022

Can I convert a CSV file into JSON as follows?
CSV = headers on line 1, with values below
JSON = [{"key1":"value1",...},{"key1":"value2",...}...]

This is my csv file:

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"

This is my script:

$ cat csv_to_json.py

#!/usr/bin/python

#from here
#https://stackoverflow.com/a/7550352/2392358

import csv, json
csvreader = csv.reader(open('head_data.csv', 'rb'), delimiter='\t',
quotechar='"')
data = []
for row in csvreader:
    r = []
    for field in row:
        if field == '': field = None
        else: field = unicode(field, 'ISO-8859-1')
        r.append(field)
    data.append(r)
jsonStruct = {
    'header': data[0],
    'data': data[1:]
}
open('head_data.json', 'wb').write(json.dumps(jsonStruct))

Running my script, and its output:

$ python csv_to_json.py


$ cat -v head_data.json
{"header": ["Rec Open Date,\"MSISDN\",\"IMEI\",\"Data Volume (Bytes)\",\"Device Manufacturer\",\"Device Model\",\"Product Description\""], "data": [["2016-05-30,\"686\",\"230\",\"63979\",\"Samsung SM-G935FD \",\"Samsung SM-G935FD\",\"$29.95 Carryover Plan (1GB)\""], ["2016-05-30,\"533\",\"970\",\"171631866\",\"Apple iPhone 6 (A1586)\",\"iPhone 6 (A1586)\",\"$69.95 Plan\""], ["2016-05-30,\"191\",\"610\",\"145713\",\"Samsung GT-I9195\",\"Samsung GT-I9195\",\"$29.95 Plan\""], ["2016-05-30,\"660\",\"660\",\"2994742\",\"Samsung SM-N920I\",\"Samsung SM-N920I\",\"GOVERNMENT TIER 2 PLAN\""], ["2016-05-30,\"182\",\"970\",\"37799939\",\"Samsung SM-J200Y\",\"Samsung SM-J200Y\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"993\",\"360\",\"14096114\",\"Samsung SM-A300Y\",\"Samsung SM-A300Y\",\"$39.95 Carryover Plan\""], ["2016-05-30,\"894\",\"730\",\"9851177\",\"Samsung GT-N7105\",\"Samsung GT-N7105\",\"PREPAY STD - $0 - #2\""], ["2016-05-30,\"600\",\"070\",\"18420650\",\"Apple iPhone 5C (A1529)\",\"Apple iPhone 5C (A1529)\",\"PREPAY PLUS - $0 -\""], ["2016-05-30,\"234\",\"000\",\"1769661\",\"Galaxy S7 SM-G930F \",\"Galaxy S7 SM-G930F\",\"$39.95 Plan\""]]}

Can I slightly modify the code so that I get output like this?

[{"Rec Open Date":"2016-07-03","MSISDN":540,"IMEI":990,"Data Volume (Bytes)":36671453,"Device Manufacturer":"HUAWEI Technologies Co Ltd","Device Model":"H1512","Product Description":"PREPAY PLUS - $0 -"},
{"Rec Open Date":"2016-07-03","MSISDN":334,"IMEI":340,"Data Volume (Bytes)":129835114,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone S (A1530)","Product Description":"$29.95 Plan"},
{"Rec Open Date":"2016-07-03","MSISDN":133,"IMEI":870,"Data Volume (Bytes)":42213030,"Device Manufacturer":"Apple Inc","Device Model":"Apple iPhone 6 Plus (A1524)","Product Description":"$49.95 Plan"}]

Related questions: here and here.

EDIT1: I found this here, but it does the conversion in the browser, and I think it uses JavaScript.

EDIT2: based on the answer below, this is what I want.

This is the file I want to convert

$ cat -v head_data.csv
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung,A, SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"

This is the script:

$ cat -v csv_to_json2.py
#!/usr/bin/python

#from here
#https://stackoverflow.com/a/38193687/2392358

import csv
import json
from collections import OrderedDict

dR=csv.DictReader(open("head_data.csv"))
oD=[ OrderedDict(
         sorted(dct.iteritems(),
                key=lambda item:dR.fieldnames.index(item[0])))
     for dct in dR ]

#print to terminal
print json.dumps(oD)

#write to file
#json.dump(oD,"head_op.json")
open('head_op.json', 'wb').write(json.dumps(oD))

Running the script:

$ python csv_to_json2.py
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]

This is the output:

$ cat -v head_op.json
[{"Rec Open Date": "2016-05-30", "MSISDN": "686", "IMEI": "230", "Data Volume (Bytes)": "63979", "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung,A, SM-G935FD", "Product Description": "$29.95 Carryover Plan (1GB)"}, {"Rec Open Date": "2016-05-30", "MSISDN": "533", "IMEI": "970", "Data Volume (Bytes)": "171631866", "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "Product Description": "$69.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "191", "IMEI": "610", "Data Volume (Bytes)": "145713", "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "Product Description": "$29.95 Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "660", "IMEI": "660", "Data Volume (Bytes)": "2994742", "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "Product Description": "GOVERNMENT TIER 2 PLAN"}, {"Rec Open Date": "2016-05-30", "MSISDN": "182", "IMEI": "970", "Data Volume (Bytes)": "37799939", "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "993", "IMEI": "360", "Data Volume (Bytes)": "14096114", "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "Product Description": "$39.95 Carryover Plan"}, {"Rec Open Date": "2016-05-30", "MSISDN": "894", "IMEI": "730", "Data Volume (Bytes)": "9851177", "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "Product Description": "PREPAY STD - $0 - #2"}, {"Rec Open Date": "2016-05-30", "MSISDN": "600", "IMEI": "070", "Data Volume (Bytes)": "18420650", "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "Product Description": "PREPAY PLUS - $0 -"}, {"Rec Open Date": "2016-05-30", "MSISDN": "234", "IMEI": "000", "Data Volume (Bytes)": "1769661", "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", 
"Product Description": "$39.95 Plan"}]

4 Answers

Using the pandas library was the easiest for me.

  1. Install dependencies
pip install pandas
  2. Create your CSV-to-JSON script (let's call it csv2json.py)
import sys
import pandas as pd

data_frame = pd.read_csv(sys.argv[1])
data_frame.to_json(sys.argv[1].replace('.csv', '.json'), orient='records', indent=2)
  3. Run the csv2json.py script on an example.csv input file
python csv2json.py example.csv
  4. Your JSON has been generated in the example.json file

Example:

input (example.csv):

"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2016-05-30","686","230","63979","Samsung SM-G935FD ","Samsung SM-G935FD","$29.95 Carryover Plan (1GB)"
"2016-05-30","533","970","171631866","Apple iPhone 6 (A1586)","iPhone 6 (A1586)","$69.95 Plan"
"2016-05-30","191","610","145713","Samsung GT-I9195","Samsung GT-I9195","$29.95 Plan"
"2016-05-30","660","660","2994742","Samsung SM-N920I","Samsung SM-N920I","GOVERNMENT TIER 2 PLAN"
"2016-05-30","182","970","37799939","Samsung SM-J200Y","Samsung SM-J200Y","PREPAY PLUS - $0 -"
"2016-05-30","993","360","14096114","Samsung SM-A300Y","Samsung SM-A300Y","$39.95 Carryover Plan"
"2016-05-30","894","730","9851177","Samsung GT-N7105","Samsung GT-N7105","PREPAY STD - $0 - #2"
"2016-05-30","600","070","18420650","Apple iPhone 5C (A1529)","Apple iPhone 5C (A1529)","PREPAY PLUS - $0 -"
"2016-05-30","234","000","1769661","Galaxy S7 SM-G930F ","Galaxy S7 SM-G930F","$39.95 Plan"

output (example.json):

[
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":686,
    "IMEI":230,
    "Data Volume (Bytes)":63979,
    "Device Manufacturer":"Samsung SM-G935FD ",
    "Device Model":"Samsung SM-G935FD",
    "Product Description":"$29.95 Carryover Plan (1GB)"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":533,
    "IMEI":970,
    "Data Volume (Bytes)":171631866,
    "Device Manufacturer":"Apple iPhone 6 (A1586)",
    "Device Model":"iPhone 6 (A1586)",
    "Product Description":"$69.95 Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":191,
    "IMEI":610,
    "Data Volume (Bytes)":145713,
    "Device Manufacturer":"Samsung GT-I9195",
    "Device Model":"Samsung GT-I9195",
    "Product Description":"$29.95 Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":660,
    "IMEI":660,
    "Data Volume (Bytes)":2994742,
    "Device Manufacturer":"Samsung SM-N920I",
    "Device Model":"Samsung SM-N920I",
    "Product Description":"GOVERNMENT TIER 2 PLAN"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":182,
    "IMEI":970,
    "Data Volume (Bytes)":37799939,
    "Device Manufacturer":"Samsung SM-J200Y",
    "Device Model":"Samsung SM-J200Y",
    "Product Description":"PREPAY PLUS - $0 -"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":993,
    "IMEI":360,
    "Data Volume (Bytes)":14096114,
    "Device Manufacturer":"Samsung SM-A300Y",
    "Device Model":"Samsung SM-A300Y",
    "Product Description":"$39.95 Carryover Plan"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":894,
    "IMEI":730,
    "Data Volume (Bytes)":9851177,
    "Device Manufacturer":"Samsung GT-N7105",
    "Device Model":"Samsung GT-N7105",
    "Product Description":"PREPAY STD - $0 - #2"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":600,
    "IMEI":70,
    "Data Volume (Bytes)":18420650,
    "Device Manufacturer":"Apple iPhone 5C (A1529)",
    "Device Model":"Apple iPhone 5C (A1529)",
    "Product Description":"PREPAY PLUS - $0 -"
  },
  {
    "Rec Open Date":"2016-05-30",
    "MSISDN":234,
    "IMEI":0,
    "Data Volume (Bytes)":1769661,
    "Device Manufacturer":"Galaxy S7 SM-G930F ",
    "Device Model":"Galaxy S7 SM-G930F",
    "Product Description":"$39.95 Plan"
  }
]
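
Note that in the output above pandas inferred IMEI as an integer column, so "070" became 70 and "000" became 0. If those leading zeros matter, one option is to read that column as text. The following is a minimal sketch and not part of the original script; the inlined sample rows and the decision to keep only IMEI as a string are assumptions made for illustration.

```python
import io
import json

import pandas as pd

# Two rows from the sample data, inlined so the sketch is self-contained.
CSV_TEXT = '''"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)"
"2016-05-30","600","070","18420650"
"2016-05-30","234","000","1769661"
'''

# dtype={"IMEI": str} asks pandas not to infer a numeric type for that column,
# so the leading zeros survive; the other columns are still inferred as usual.
data_frame = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"IMEI": str})

# With no path argument, to_json returns the JSON text instead of writing a file.
json_text = data_frame.to_json(orient="records")
records = json.loads(json_text)
print(records)
```

Passing dtype=str instead of a per-column mapping would keep every column as text, which is closer to what the csv module produces.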

Answered by Florent Roques on January 3, 2022

An alternative command line solution:

  1. Install dependencies:
pip install pyexcel-cli pyexcel-text
  2. Run the following command to transform csv to json
pyexcel transcode --name-columns-by-row 0 --output-file-type json example.csv -

output:

{"example.csv": [{"Data Volume (Bytes)": 63979, "Device Manufacturer": "Samsung SM-G935FD ", "Device Model": "Samsung SM-G935FD", "IMEI": 230, "MSISDN": 686, "Product Description": "$29.95 Carryover Plan (1GB)", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 171631866, "Device Manufacturer": "Apple iPhone 6 (A1586)", "Device Model": "iPhone 6 (A1586)", "IMEI": 970, "MSISDN": 533, "Product Description": "$69.95 Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 145713, "Device Manufacturer": "Samsung GT-I9195", "Device Model": "Samsung GT-I9195", "IMEI": 610, "MSISDN": 191, "Product Description": "$29.95 Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 2994742, "Device Manufacturer": "Samsung SM-N920I", "Device Model": "Samsung SM-N920I", "IMEI": 660, "MSISDN": 660, "Product Description": "GOVERNMENT TIER 2 PLAN", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 37799939, "Device Manufacturer": "Samsung SM-J200Y", "Device Model": "Samsung SM-J200Y", "IMEI": 970, "MSISDN": 182, "Product Description": "PREPAY PLUS - $0 -", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 14096114, "Device Manufacturer": "Samsung SM-A300Y", "Device Model": "Samsung SM-A300Y", "IMEI": 360, "MSISDN": 993, "Product Description": "$39.95 Carryover Plan", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 9851177, "Device Manufacturer": "Samsung GT-N7105", "Device Model": "Samsung GT-N7105", "IMEI": 730, "MSISDN": 894, "Product Description": "PREPAY STD - $0 - #2", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 18420650, "Device Manufacturer": "Apple iPhone 5C (A1529)", "Device Model": "Apple iPhone 5C (A1529)", "IMEI": "070", "MSISDN": 600, "Product Description": "PREPAY PLUS - $0 -", "Rec Open Date": "2016-05-30"}, {"Data Volume (Bytes)": 1769661, "Device Manufacturer": "Galaxy S7 SM-G930F ", "Device Model": "Galaxy S7 SM-G930F", "IMEI": "000", "MSISDN": 234, "Product Description": "$39.95 Plan", "Rec Open Date": 
"2016-05-30"}]}

Answered by chfw on January 3, 2022

If you want to keep the order of the keys, don't use csv.DictReader, since it overcomplicates things. Just record the header and then zip it with each of the rows:

import csv
from collections import OrderedDict
reader = csv.reader(open("text.csv"))

header = next(reader)

data = [OrderedDict(zip(header,fields)) for fields in reader]

Then you can write it to a file with this:

import json

with open("new.json","w") as f:
    json.dump(data, f)
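
The csv module returns every field as a string, so the JSON written this way contains "686" rather than 686. If numeric values are wanted, as in the expected output in the question, a conversion step can be added while zipping. This is only a sketch: the INT_COLUMNS set and the convert helper are hypothetical names, and leaving IMEI as a string (to keep values such as "070" intact) is an assumption, not something the question specifies.

```python
import csv
import io
import json

# A few columns from the sample data, inlined so the sketch is self-contained.
CSV_TEXT = '''"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)"
"2016-05-30","686","230","63979"
"2016-05-30","600","070","18420650"
'''

# Hypothetical choice of which columns should become JSON numbers.
INT_COLUMNS = {"MSISDN", "Data Volume (Bytes)"}


def convert(name, value):
    """Return int(value) for the chosen columns, otherwise the original string."""
    return int(value) if name in INT_COLUMNS else value


reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)  # the first row holds the column names
data = [
    {name: convert(name, value) for name, value in zip(header, fields)}
    for fields in reader
]
json_text = json.dumps(data)
print(json_text)
```

A plain dict is used here instead of OrderedDict, which relies on Python 3.7 or later keeping dict insertion order.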

Answered by Tadhg McDonald-Jensen on January 3, 2022

If you don't care about the order of the keys, just do:

import csv
import json
json.dumps(list(csv.DictReader(open("file.csv"))))

Check the pretty-printing section of the manual for more options, or do

json.dumps(list(csv.DictReader(open("file.csv")))).replace("}, ", "},\n")

to get your expected output.
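
A string replace on the finished JSON text would also change any field value that happens to contain "}, ". One way to avoid that is to serialize each row on its own and join the pieces, sketched here with inlined sample rows (an assumption made so the example runs without a file):

```python
import csv
import io
import json

# A few columns from the sample data, inlined so the sketch is self-contained.
CSV_TEXT = '''"Rec Open Date","MSISDN","IMEI"
"2016-05-30","686","230"
"2016-05-30","533","970"
'''

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Serialize each row separately, then join the pieces with ",\n" so that
# every record lands on its own line inside one JSON array.
json_text = "[" + ",\n".join(json.dumps(row) for row in rows) + "]"
print(json_text)
```

The result is still a single valid JSON array; only the whitespace between records differs from plain json.dumps.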


If you want ordered printing, you may order the keys via OrderedDict:

import csv
import json
from collections import OrderedDict

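dR=csv.DictReader(open("/tmp/ah.csv"))
# sort each row's items by the position of their key in the header
oD=[ OrderedDict(
         sorted(dct.items(),  # items() works on both Python 2 and Python 3
                key=lambda item:dR.fieldnames.index(item[0])))
     for dct in dR ]
json.dumps(oD)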
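
On newer interpreters the sorting step may be unnecessary: csv.DictReader builds each row in header order, and dicts keep insertion order from Python 3.7 on. The sketch below assumes Python 3.7 or later and uses inlined sample data; it checks that the keys already come out in the header's order.

```python
import csv
import io
import json

# One row from the sample data, inlined so the sketch is self-contained.
CSV_TEXT = '''"Rec Open Date","MSISDN","IMEI"
"2016-05-30","686","230"
'''

reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = list(reader)

# Each row's keys come out in the same order as the header line.
key_order = list(rows[0].keys())
json_text = json.dumps(rows)
print(key_order == reader.fieldnames)
print(json_text)
```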

Answered by xvan on January 3, 2022
