Stack Overflow Asked on December 5, 2021
NAME    ADDRESS:COUNTRY  AGE  JOB       EMAIL
BART    RF_STREET:USA    66   ENGINEER  BART@YAHOO
KYLE    78_STREET:AUS                   KYLE@GOOGLE
WILLIE  6_STREET:AUS                    WILLIE@GOOGLE
TRIPP   H_STREET:NZ      55   DOCTOR    TRIPP@YAHOO
.
.
.
I have a txt file that looks like the above. I tried to replace the spaces with commas without dropping the empty fields, such as AGE and JOB for KYLE and WILLIE. Below is my code:
input_file = open('A.txt', mode='r')
input_read = input_file.readlines()
input_file.close()
data = []
for i in input_read:
    data.append(i.split())
My output from the above code is:
NAME,ADDRESS:COUNTRY,AGE,JOB,EMAIL
BART,RF_STREET:USA,66,ENGINEER,BART@YAHOO
KYLE,78_STREET:AUS,KYLE@GOOGLE
WILLIE,6_STREET:AUS,WILLIE@GOOGLE
TRIPP,H_STREET:NZ,55,DOCTOR,TRIPP@YAHOO
.
.
.
Meanwhile, my desired output is:
NAME,ADDRESS:COUNTRY,AGE,JOB,EMAIL
BART,RF_STREET:USA,66,ENGINEER,BART@YAHOO
KYLE,78_STREET:AUS,,,KYLE@GOOGLE
WILLIE,6_STREET:AUS,,,WILLIE@GOOGLE
TRIPP,H_STREET:NZ,55,DOCTOR,TRIPP@YAHOO
.
.
.
Here is one way to do it, splitting all the rows based on the width of the header columns:
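import re
# ... open file
input_read = input_file.readlines()
# Column start offsets: 0, plus the end of each run of 2+ whitespace characters in the header
colBreaks = [0] + [m.end() for m in re.finditer(r"\s{2,}", input_read[0])]
data = []
for line in input_read:
    # Slice each line at the header's column offsets and strip the padding
    data.append([line[i:j].strip() for i, j in zip(colBreaks, colBreaks[1:] + [None])])
print([','.join(result) for result in data])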
Splitting on r"\s{2,}" (two or more whitespace characters) means a header name can contain a single space and the split will still work properly.
Output:
['NAME,ADDRESS:COUNTRY,AGE,JOB,EMAIL',
'BART,RF_STREET:USA,66,ENGINEER,BART@YAHOO',
'KYLE,78_STREET:AUS,,,KYLE@GOOGLE',
'WILLIE,6_STREET:AUS,,,WILLIE@GOOGLE',
'TRIPP,H_STREET:NZ,55,DOCTOR,TRIPP@YAHOO']
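For anyone who wants to try the approach above without the original file, here is a minimal self-contained sketch. The function name `split_fixed_width` and the column widths (8, 17, 5, 10) are assumptions made for illustration; the real file's padding may differ. It also strips trailing whitespace from the header first, so that padding after the last header name is not mistaken for an extra column.

```python
import re

def split_fixed_width(lines):
    """Split fixed-width lines at the column offsets found in the header (first line)."""
    header = lines[0].rstrip()
    # Each run of 2+ whitespace characters ends where the next column starts.
    col_breaks = [0] + [m.end() for m in re.finditer(r"\s{2,}", header)]
    # Pair each start offset with the next one; None means "to the end of the line".
    spans = list(zip(col_breaks, col_breaks[1:] + [None]))
    return [[line[i:j].strip() for i, j in spans] for line in lines if line.strip()]

# Sample rows padded to assumed column widths; the last field is left unpadded.
widths = (8, 17, 5, 10)
rows = [
    ("NAME", "ADDRESS:COUNTRY", "AGE", "JOB", "EMAIL"),
    ("BART", "RF_STREET:USA", "66", "ENGINEER", "BART@YAHOO"),
    ("KYLE", "78_STREET:AUS", "", "", "KYLE@GOOGLE"),
]
lines = ["".join(v.ljust(w) for v, w in zip(r, widths)) + r[-1] + "\n" for r in rows]

for fields in split_fixed_width(lines):
    print(",".join(fields))
```

Note that this only works when every value starts at the same offset as its header and does not run over into the next column.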
Due credit to this answer and this answer for their helpful one-liners.
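The point about single spaces in header names can be checked with a small sketch. The header below, including the "JOB TITLE" column, is hypothetical and is not taken from the question's file.

```python
import re

# Hypothetical header: "JOB TITLE" contains a single space on purpose.
# ljust pads each name so that columns are separated by at least two spaces.
header = "NAME".ljust(8) + "JOB TITLE".ljust(11) + "EMAIL"

# A single space never matches \s{2,}, so "JOB TITLE" is not split.
col_breaks = [0] + [m.end() for m in re.finditer(r"\s{2,}", header)]
print(col_breaks)  # [0, 8, 19]

spans = list(zip(col_breaks, col_breaks[1:] + [None]))
names = [header[i:j].strip() for i, j in spans]
print(names)  # ['NAME', 'JOB TITLE', 'EMAIL']
```

A header with two or more consecutive spaces inside one name would be split into separate columns, so the pattern would need adjusting for such files.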
Answered by jdaz on December 5, 2021