Stack Overflow Asked by NEHA CHOUDHARY on December 8, 2020
I have a file1
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(),.S(),.Z(n9)); |4
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
I need to find lines in file1 whose 1st field matches. Fields are separated by "-".
If a match is found, delete the 1st matching line.
I want this output:
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
Here NR2SKRD12BWP210H6P51CNODSVT and MUX2D2BWP210H6P51CNODSVT each appear twice as $1, so the 1st matching line of each should be deleted.
I tried this code:
awk -F'-' 'FNR==NR{a[$1];next} !(($1) in a)' file1
But this code finds matches and deletes lines between two files. How can I find matches and delete lines within a single file?
*Delete only the first matching line. Keep the second, third, fourth, etc. repeats.
To delete the first occurrence of each duplicated key:
awk -F- 'NR==FNR {++a[$1]; next} a[$1]==1; {a[$1]=1}' file file
Read the same file twice: count $1 on the first read, then use that count on the second read to decide whether to print each line.
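A quick check on a made-up input may help show what the two passes do when a key repeats more than twice. The temporary file and its four lines are assumptions for illustration and are not taken from the question.

```shell
# Hypothetical sample: key A appears three times, key B once.
tmp=$(mktemp)
printf '%s\n' 'A-1' 'B-1' 'A-2' 'A-3' > "$tmp"

# Pass 1 counts each key. Pass 2 prints a line only when its counter is 1,
# then sets the counter to 1 so every later repeat of that key is printed.
result=$(awk -F- 'NR==FNR {++a[$1]; next} a[$1]==1; {a[$1]=1}' "$tmp" "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

On this input only A-1 is dropped; B-1, A-2 and A-3 are printed in their original order.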
Answered by rowboat on December 8, 2020
Another awk:
$ awk -F- 'NR==FNR{a[$1]++; next} !(--a[$1])' file{,}
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
LHCSNQD1BWP210H6P51CNODSVT-(.CDN(n10),.D(),.E(1),.SDN(),.Q(n11)); |6
OAI21D8BWP210H6P51CNODSVT-(.A1(n11),.A2(),.B(),.ZN(n12)); |9
DCCKND16BWP210H6P51CNODSVT-(.I(n12),.ZN(n13)); |10
INVSKFD14BWP210H6P51CNODSVT-(.I(n13),.ZN(n14)); |11
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
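Double scan of the file: the first round counts the occurrences of each key, the second round prints only the last occurrence of each key. This matches the requested output when a key appears at most twice.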
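To see the "last one only" behaviour, the same command can be run on a made-up input where one key appears three times. The temporary file and its contents are assumptions for illustration. The file name is passed twice explicitly because file{,} relies on brace expansion, which not every shell supports.

```shell
# Hypothetical sample: key A appears three times, key B once.
tmp=$(mktemp)
printf '%s\n' 'A-1' 'B-1' 'A-2' 'A-3' > "$tmp"

# Pass 1 counts each key. Pass 2 decrements the counter on every line and
# prints only when it reaches zero, i.e. on the last occurrence of the key.
result=$(awk -F- 'NR==FNR{a[$1]++; next} !(--a[$1])' "$tmp" "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Here A-1 and A-2 are both dropped and only B-1 and A-3 are printed, so this variant differs from "delete only the first match" once a key repeats three or more times.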
Answered by karakfa on December 8, 2020
This might work for you (GNU sed):
sed -E 'H;x;s/^(\n[^-]*-)[^\n]*(.*\1)/\2/;x;$!d;x;s/.//' file
Make a copy of the current line in the hold space.
If the current key already exists in the hold space, remove the first line.
At the end of the file, swap to the hold space, remove the first newline that was introduced when making copies and print the results.
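The buffer-everything-then-print idea can also be sketched in awk for input that arrives on a pipe and so cannot be read twice. This is an illustrative alternative, not the sed approach itself; the array names line, key, count and dropped and the sample input are hypothetical.

```shell
# Hypothetical sample piped on stdin: key A appears three times, key B once.
result=$(printf '%s\n' 'A-1' 'B-1' 'A-2' 'A-3' |
awk -F- '
    # Buffer every line with its key and count how often each key occurs.
    { line[NR] = $0; key[NR] = $1; count[$1]++ }
    END {
        for (i = 1; i <= NR; i++) {
            # Skip only the first occurrence of a key that appears more than once.
            if (count[key[i]] > 1 && !dropped[key[i]]++) continue
            print line[i]
        }
    }')
printf '%s\n' "$result"
```

The trade-off is memory: the whole input is held in arrays until the END block runs, much as the sed command keeps every line in the hold space.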
Answered by potong on December 8, 2020
Could you please try the following, written and tested with the shown samples in GNU awk.
awk '
BEGIN{ FS="-" }
FNR==NR{
  arr[$1]++
  next
}
arr[$1]>1 && ++arrAgain[$1]==1{ next }
1
' Input_file Input_file
Explanation: a detailed explanation of the above.
awk '                              ##Start the awk program.
BEGIN{ FS="-" }                    ##Set the field separator to a dash.
FNR==NR{                           ##FNR==NR is TRUE only while Input_file is being read the first time.
  arr[$1]++                        ##Count the occurrences of each 1st field in array arr.
  next                             ##Skip all further statements for this line.
}
arr[$1]>1 && ++arrAgain[$1]==1{    ##If the 1st field occurs more than once and this is its first occurrence in the second read, skip the line.
  next                             ##Skip all further statements for this line.
}
1                                  ##1 prints the current line.
' Input_file Input_file            ##Input_file is named twice so that it is read twice.
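As a rough check, the program can be run on an abridged copy of the question's sample. The five lines below are taken from file1; the temporary file and the choice of which lines to keep are assumptions made to keep the example short.

```shell
# Abridged sample from the question: two duplicated keys and one unique key.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(),.S(),.Z(n9)); |4
CKLHQD16BWP210H6P51CNODSVT-(.CPN(#),.E(1),.TE(n9),.Q(n10)); |5
NR2SKRD12BWP210H6P51CNODSVT-(.A1(n7),.A2(n1),.ZN(n8)); |2
MUX2D2BWP210H6P51CNODSVT-(.I0(n8),.I1(n2),.S(),.Z(n9)); |4
EOF

# Same two-pass program as above, with the file passed twice.
result=$(awk '
BEGIN{ FS="-" }
FNR==NR{ arr[$1]++; next }
arr[$1]>1 && ++arrAgain[$1]==1{ next }
1
' "$tmp" "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The first NR2SKRD12... and MUX2D2... lines are skipped, and the CKLHQD16... line plus the second copy of each duplicated key are printed.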
Answered by RavinderSingh13 on December 8, 2020