Stack Overflow Asked by AndrewLittle1 on January 3, 2022
I have a list_3, with one element, a string:
[['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2016\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n'], ['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tMacKenzie T Stout,\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2020\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tYes\t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tPre-Seed\t\t\t\t\t\t\t\n\n']]
I want to use regex to strip \n, \t and \r from the output and return the text in an easy-to-read format.
This is what I have tried:
list_33 = []
for i in list_3:
    string = ''.join(list_3)
    list_33.append(re.sub('\s+','', string))
print(list_33)
output:
['HeadquartersorRegionalOfficeMainHeadquarters', 'FoundersThomasLonVan', 'FounderDiversityN/A', 'YearFounded2016', '#ofEmployees1-10', 'SeekingFunding?No', 'FundingPhaseN/A']
This is almost what I need, but I would like one space between each word and a colon after the first text block from list_3, i.e.:
['Headquarters or Regional Office: Main Headquarters', 'Founders: Thomas Lon Van', 'Founder Diversity: N/A', 'Year Founded: 2015', '# of Employees 1-10', 'Seeking Funding?: No', 'Funding Phase: N/A']
Any ideas of how I can incorporate both regex functions into one?
Thanks
ps. I know that I don’t need to use a for loop for a list with just one element, but in the future the list will have more elements, I am trying to generalize the code structure using just one input right now.
You can iterate over each string in the list and then use re.sub
to replace each run of two or more whitespace characters with ': ', stripping the leftover ': ' from both ends:
>>> import re
>>> lst = ['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2016\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n']
>>> [re.sub(r'\s\s+', ': ', word).strip(': ') for word in lst]
['Headquarters or Regional Office: Main Headquarters', 'Founders: Thomas Lon Van', 'Founder Diversity: N/A', 'Year Founded: 2016', '# of Employees: 1-10', 'Seeking Funding?: No', 'Funding Phase: N/A']
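Since list_3 in the question is printed as a list of lists, the same substitution can probably be applied one level deeper. Here is a rough sketch, assuming every inner element is a string in the same key/value shape; the helper names clean_entry and clean_nested are made up for illustration, and the sample data is a shortened stand-in for the real list:

```python
import re

def clean_entry(raw):
    """Collapse each run of 2+ whitespace characters into ': ' and trim the ends."""
    # Hypothetical helper: same regex and strip() call as the one-liner above.
    return re.sub(r'\s\s+', ': ', raw).strip(': ')

def clean_nested(groups):
    """Apply clean_entry to every string in a list of lists."""
    return [[clean_entry(raw) for raw in group] for group in groups]

# Shortened sample in the same shape as list_3 (assumed, not the full data)
list_3 = [
    ['\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n',
     '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n'],
    ['\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2020\t\t\t\t\t\t\t\n\n'],
]

cleaned = clean_nested(list_3)
print(cleaned)
# [['Founders: Thomas Lon Van', 'Seeking Funding?: No'], ['Year Founded: 2020']]
```

Note that strip(': ') also removes any colon or space that legitimately starts or ends a value, so this only fits data where that cannot happen.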
Answered by Prem Anand on January 3, 2022