Unix & Linux Asked by Joni on December 10, 2020
I have a semicolon-separated data file in which only a few lines contain the "complete" dataset. Each complete line sits at the end of the block of lines to which it applies. I want to copy the data from each complete row to the incomplete rows above it, using a shell script (or a similar command-line tool).
For example, let’s say the file I have contains the following data:
86540701
86951202
86262402
86509002
86770802
86459902
86301002
86485102
86556002;Vivo Y11;1630000;NULL;;;
86447404
86161405
86388604
86106105
86426405;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
I want to be able to find the complete lines and substitute that data into the incomplete lines above, like so:
86540701;Vivo Y11;1630000;NULL;;;
86951202;Vivo Y11;1630000;NULL;;;
86262402;Vivo Y11;1630000;NULL;;;
86509002;Vivo Y11;1630000;NULL;;;
86770802;Vivo Y11;1630000;NULL;;;
86459902;Vivo Y11;1630000;NULL;;;
86301002;Vivo Y11;1630000;NULL;;;
86485102;Vivo Y11;1630000;NULL;;;
86556002;Vivo Y11;1630000;NULL;;;
86447404;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
86161405;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
86388604;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
86106105;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
86426405;Xiaomi Redmi 8A Pro (Redmi 8A Dual);1465000;4;;;
This is a task where we can use tac to parse the file in reverse order:
tac file | awk -F';' 'NF > 1 {p = substr($0,index($0,FS))} {print $1 p}' | tac
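So we don't store any lines; we print each one right after reading it. When NF > 1, we store the substring from the first FS to the end of the line for future use. That stored part is then appended to the first field of every line read afterwards, that is, to the lines that precede the complete line in the original order.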
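To see the reverse pass at work, here is a small sketch that wraps the pipeline above in a shell function and runs it on a five-line sample. The function name fill_from_below and the sample data are made up for illustration, and tac is assumed to be available (it ships with GNU coreutils).

```shell
# fill_from_below is a hypothetical helper name; it wraps the pipeline from the answer.
fill_from_below() {
    # Reverse the file so that each complete line is read before the lines it applies to.
    tac "$1" |
    awk -F';' '
        NF > 1 { p = substr($0, index($0, FS)) }  # keep ";field2;field3..." of the latest complete line
        { print $1 p }                            # first field followed by the stored part
    ' |
    tac  # restore the original line order
}

# Made-up sample: two blocks, each ending with a complete line.
sample=$(mktemp)
printf '%s\n' '1001' '1002;Vivo Y11;1630000;NULL;;;' '1003' '1004' '1005;Redmi 8A;1465000;4;;;' > "$sample"
fill_from_below "$sample"
rm -f "$sample"
```

Incomplete lines at the very end of the file, with no complete line after them, are printed unchanged, because p is still empty when they are read.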
Correct answer by thanasisp on December 10, 2020
Another awk-based solution using a double-pass approach (requires GNU awk for the gensub() function):
awk -F';' 'FNR==NR{if (NF>1) data[++i]=gensub(/^[^;]+/,"","1");next}
{if (NF==1) $0=$0 data[j+1]; else j++;} 1' input.csv input.csv
This will scan the file twice. The first time, it creates an array of "data parts" of those lines that contain more than one field. The second time, it substitutes the data part where it is missing, and increases the array counter every time it encounters a "complete" line so that the next data part is substituted for the following lines.
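If gensub() is not available, the same two-pass idea can be sketched with the POSIX sub() function instead. This is a variant written for illustration, not the answer's original code; the function name fill_two_pass and the variable rest are invented here.

```shell
# fill_two_pass is a hypothetical helper name, not part of the original answer.
fill_two_pass() {
    # The same file is given twice: pass 1 collects the data parts, pass 2 fills them in.
    awk -F';' '
        FNR == NR {                       # pass 1: only true while reading the first copy
            if (NF > 1) {
                rest = $0
                sub(/^[^;]+/, "", rest)   # drop the first field, keep ";field2;..."
                data[++i] = rest
            }
            next
        }
        {                                 # pass 2
            if (NF == 1) $0 = $0 data[j + 1]  # incomplete line: append the upcoming data part
            else j++                          # complete line: advance to the next data part
            print
        }
    ' "$1" "$1"
}

# Made-up sample: two blocks, each ending with a complete line.
sample=$(mktemp)
printf '%s\n' '1001' '1002;Vivo Y11;1630000;NULL;;;' '1003' '1005;Redmi 8A;1465000;4;;;' > "$sample"
fill_two_pass "$sample"
rm -f "$sample"
```

Because the input is read twice, this only works on a regular file, not on data piped through standard input.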
Answered by AdminBee on December 10, 2020
Using GNU sed with the extended regex mode turned on (-E):
$ sed -Ee '/\n/ba
H;/;/!d;z;x;D;:a
s/\n(.*\n)?[^;]+(;.*)/\2&/
P;/\n.*\n/D;s/.*\n//
' file
Using Perl:
$ perl -lne '$, = ";";
push(@A,$_),next if !/;/;
my $a = s/.*?;//r;
print $_, $a for splice @A;
print;
' file
Answered by guest_7 on December 10, 2020
Using sed:
sed -E '
/;/!{ :a N;/;/!{ s/\n/-/;ta; }; };
/;/ { s/\n/-/; };
:c s/([^-]*)-([^;]*)(;.*)$/\1\3\n\2\3/; tc' infile
Answered by αғsнιη on December 10, 2020