Unix & Linux Asked by SKG on September 14, 2020
I have two files, file1 and file2
file1:
r11_abc_gkhsa 1.0 1.5 1.9
r11_bcd_gkhsa 1.0 1.5 1.7
r11_acd_gkhsa 1.3 1.6 1.5
r11_xyz_gkhsa 1.0 1.5 1.9
file2:
sd1_bcd_gkhsa 1.8 1.5 1.9
ab1_abc_gkhsa 1.6 1.4 1.5
sfs_xyz_gkhsa 1.4 1.6 1.4
sd1_acd_gkhsa 1.2 1.3 1.5
sfs_ryb_gkhsa 1.5 1.2 1.7
I want to match the middle field (abc, bcd, acd, and xyz) of each line in file1 against file2. Whenever a match is found in file2, I want to print the two lines side by side, like this:
Output:
r11_abc_gkhsa 1.0 1.5 1.9 ab1_abc_gkhsa 1.6 1.4 1.5
r11_bcd_gkhsa 1.0 1.5 1.7 sd1_bcd_gkhsa 1.8 1.5 1.9
r11_acd_gkhsa 1.3 1.6 1.5 sd1_acd_gkhsa 1.2 1.3 1.5
r11_xyz_gkhsa 1.0 1.5 1.9 sfs_xyz_gkhsa 1.4 1.6 1.4
sfs_ryb_gkhsa 1.5 1.2 1.7
I can use Perl or sed. Can someone give me some ideas to work from?
If you just want to use plain bash arrays --
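# read both files and group the lines by the middle field of the first column
unset arr1; declare -A arr1
while read -r -u3 line; do
    i=${line%_*}    # strip the last "_" and everything after it, e.g. r11_abc
    i=${i#*_}       # strip everything up to the first "_", e.g. abc
    arr1[$i]+=" $line"
done 3< <(cat file1 file2)
# output the array by iterating through the keys;
# the unquoted expansion lets word splitting drop the leading space
for k in "${!arr1[@]}"; do
    echo ${arr1[$k]}
done | sort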
Output --
r11_abc_gkhsa 1.0 1.5 1.9 ab1_abc_gkhsa 1.6 1.4 1.5
r11_acd_gkhsa 1.3 1.6 1.5 sd1_acd_gkhsa 1.2 1.3 1.5
r11_bcd_gkhsa 1.0 1.5 1.7 sd1_bcd_gkhsa 1.8 1.5 1.9
r11_xyz_gkhsa 1.0 1.5 1.9 sfs_xyz_gkhsa 1.4 1.6 1.4
sfs_ryb_gkhsa 1.5 1.2 1.7
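The two parameter expansions do all of the matching work in the loop above, so it may help to see them in isolation. This is only a minimal sketch: middle_key is a hypothetical helper name that does not appear in the answer, and it assumes underscores occur only in the first column of each line.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the answer above): print the middle
# "_"-delimited field of a line, using the same two expansions as the loop.
middle_key() {
    local key="${1%_*}"   # remove the shortest suffix matching "_*"
    key="${key#*_}"       # remove the shortest prefix matching "*_"
    printf '%s\n' "$key"
}

middle_key 'r11_abc_gkhsa 1.0 1.5 1.9'   # prints: abc
middle_key 'sfs_ryb_gkhsa 1.5 1.2 1.7'   # prints: ryb
```

Lines from both files that yield the same key are appended to the same array element, which is what pairs them up.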
Answered by jai_s on September 14, 2020
Using join, sort, and sed:
join -j 2 -t_ -a 1 -a 2 -o 1.1,1.2,1.3,1.9999,2.1,2.2,2.3 \
    <(sort -t_ -k2,2 file1) <(sort -t_ -k2,2 file2) |
sed 's/__/ /g;s/^ *//g' | sort
1. sort file1 and file2 using bash's process substitution, then...
2. Using _ as a field separator, join the two sorted files on common instances of field #2, and also print singly any line from either file that doesn't match. The nonexistent field 1.9999 separates each joined pair with an extra _ to simplify step #3.
3. sed replaces each double underscore with a space and strips any leading spaces.
4. sort the results.
Output:
r11_abc_gkhsa 1.0 1.5 1.9 ab1_abc_gkhsa 1.6 1.4 1.5
r11_acd_gkhsa 1.3 1.6 1.5 sd1_acd_gkhsa 1.2 1.3 1.5
r11_bcd_gkhsa 1.0 1.5 1.7 sd1_bcd_gkhsa 1.8 1.5 1.9
r11_xyz_gkhsa 1.0 1.5 1.9 sfs_xyz_gkhsa 1.4 1.6 1.4
sfs_ryb_gkhsa 1.5 1.2 1.7
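To see why the empty 1.9999 field helps, it can be useful to look at what join emits before sed runs. The following is a rough sketch rather than the answer's exact command: it uses a trimmed copy of the sample data in a temporary directory, sorts the files in place instead of using process substitution, and spells -j 2 as -1 2 -2 2. The variable names tmp, raw, and clean are assumptions made for illustration.

```bash
#!/usr/bin/env bash
export LC_ALL=C   # make sort and join agree on one collation order

# Trimmed sample data in a scratch directory (stand-ins for file1 and file2)
tmp=$(mktemp -d)
printf '%s\n' 'r11_xyz_gkhsa 1.0 1.5 1.9' 'r11_abc_gkhsa 1.0 1.5 1.9' > "$tmp/file1"
printf '%s\n' 'ab1_abc_gkhsa 1.6 1.4 1.5' 'sfs_xyz_gkhsa 1.4 1.6 1.4' \
              'sfs_ryb_gkhsa 1.5 1.2 1.7' > "$tmp/file2"

# join needs both inputs sorted on the join field (field 2 with "_" as separator)
sort -t_ -k2,2 -o "$tmp/file1" "$tmp/file1"
sort -t_ -k2,2 -o "$tmp/file2" "$tmp/file2"

# Raw join output: paired lines are glued with "__", and a line found only
# in file2 starts with "____" because all four file1 fields are empty.
raw=$(join -1 2 -2 2 -t_ -a 1 -a 2 -o 1.1,1.2,1.3,1.9999,2.1,2.2,2.3 \
        "$tmp/file1" "$tmp/file2")
printf 'raw:\n%s\n' "$raw"

# The same sed cleanup as in the answer
clean=$(printf '%s\n' "$raw" | sed 's/__/ /g;s/^ *//g')
printf 'clean:\n%s\n' "$clean"

rm -r "$tmp"
```

Because every underscore inside a real record is single, the double underscore is a safe marker for the seam between the two halves of a joined line.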
Answered by agc on September 14, 2020