Unix & Linux Asked by cdxun on December 20, 2020
I have two .csv files that I need to match based on column 1.
The two file structures look like this.
FILE1
gopAga1_00004004-RA,1122.825534, -2.497919969, 0.411529843
gopAga1_00010932-RA,440.485381, 1.769511316, 0.312853434
gopAga1_00007012-RA, 13.37565185, -1.973108929, 0.380227982
etc...
FILE2
gopAga1_00004004-RA, ENSACAP00000013845
gopAga1_00009937-RA, ENSACAP00000000905
gopAga1_00010932-RA, ENSACAP00000003279
gopAga1_00000875-RA, ENSACAP00000000296
gopAga1_00010837-RA, ENSACAP00000011919
gopAga1_00007012-RA, ENSACAP00000012682
gopAga1_00017831-RA, ENSACAP00000016147
gopAga1_00005588-RA, ENSACAP00000011117
etc..
This is my current join command, adapted from what I have read in other threads here:
join -1 1 -2 1 -t , -a 1 -e "NA" -o "2.2,1.1,1.2,1.3" <(sort -k 1 healthy_vs_unhealthy_de.csv) <(sort RBH.csv) > output.txt
However, every time I run this command, it only writes the first row to the output.
Does anyone know why my command behaves like this instead of merging the two files based on the GOP ID?
We should also specify the comma as the field delimiter for sort (-t','), since join expects both inputs to be sorted on the join field:
# join -1 1 -2 1 -t , -a 1 -e "NA" -o "2.2,1.1,1.2,1.3" <(sort -t',' -k 1 healthy_vs_unhealthy_de.csv) <(sort -t',' RBH.csv)
ENSACAP00000013845,gopAga1_00004004-RA,1122.825534, -2.497919969
ENSACAP00000012682,gopAga1_00007012-RA, 13.37565185, -1.973108929
ENSACAP00000003279,gopAga1_00010932-RA,440.485381, 1.769511316
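A self-contained way to try this is sketched below. It rebuilds small copies of the two files from the rows shown in the question, sorts them into intermediate files instead of using process substitution (so it also works in a plain POSIX sh), and then joins them. Two things here are assumptions rather than part of the command above: the key is restricted to the first field with -k1,1 (a bare -k 1 makes the key run from field 1 to the end of the line), and LC_ALL=C is exported so that sort and join agree on the collation order. The temporary directory and the *.sorted.csv names are placeholders.

```shell
#!/bin/sh
# Sketch only: the temporary directory and *.sorted.csv names are placeholders.
set -eu
export LC_ALL=C   # assumption: byte-wise collation for both sort and join

workdir=$(mktemp -d)
cd "$workdir"

# Sample rows copied from the question.
cat > healthy_vs_unhealthy_de.csv <<'EOF'
gopAga1_00004004-RA,1122.825534, -2.497919969, 0.411529843
gopAga1_00010932-RA,440.485381, 1.769511316, 0.312853434
gopAga1_00007012-RA, 13.37565185, -1.973108929, 0.380227982
EOF

cat > RBH.csv <<'EOF'
gopAga1_00004004-RA, ENSACAP00000013845
gopAga1_00009937-RA, ENSACAP00000000905
gopAga1_00010932-RA, ENSACAP00000003279
gopAga1_00000875-RA, ENSACAP00000000296
gopAga1_00010837-RA, ENSACAP00000011919
gopAga1_00007012-RA, ENSACAP00000012682
gopAga1_00017831-RA, ENSACAP00000016147
gopAga1_00005588-RA, ENSACAP00000011117
EOF

# Sort each file on the first comma-separated field only.
sort -t, -k1,1 healthy_vs_unhealthy_de.csv > de.sorted.csv
sort -t, -k1,1 RBH.csv > rbh.sorted.csv

# -a 1 keeps every row of the first file; -e NA fills fields that have no match.
join -t, -1 1 -2 1 -a 1 -e NA -o 2.2,1.1,1.2,1.3 \
    de.sorted.csv rbh.sorted.csv > output.txt

cat output.txt
```

Note that join treats any space after a comma as part of the field, so with the sample rows exactly as typed above the ENSACAP ID is written with the leading space it has in RBH.csv.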
Answered by Siva on December 20, 2020