Unix & Linux Asked by Renan41 on December 3, 2021
I have two large text files, about 30 MB each, and I would like to grep one against the other, as in:
grep -f "file01.txt" "file02.txt" > file03.txt
Doing so returns a "memory exhausted" error.
How could those files be compared disregarding alphabetic order?
Unless your file01.txt contains actual regular expressions, try:
grep -Ff "file01.txt" "file02.txt" > file03.txt
-F tells grep to treat file01.txt as fixed strings, not regular expressions. This will both greatly increase the speed and greatly reduce the memory requirements.
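To see what -F changes, here is a minimal sketch on made-up sample data (the file contents and the scratch directory are assumptions for illustration, not the asker's files). The same pattern file is applied once as regular expressions and once as fixed strings:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: one pattern that contains a dot.
printf '%s\n' 'a.c' > file01.txt
printf '%s\n' 'abc' 'a.c' 'xyz' > file02.txt

# Regex mode: the dot matches any character, so "abc" and "a.c" both match.
grep -f file01.txt file02.txt > regex.out

# Fixed-string mode: the dot is literal, so only "a.c" matches.
grep -Ff file01.txt file02.txt > fixed.out
```

If the two outputs differ on your real data, file01.txt contains characters that grep interprets as regex syntax, and you should decide which interpretation you actually want.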
Alternatively, if your file01.txt really does contain regular expressions, you can split it into parts and apply grep to each part separately:
split -d -n l/10 "file01.txt" ./tmp-file01.
for f in ./tmp-file01.*; do grep -f "$f" "file02.txt"; done >file03.txt
The above splits file01.txt into 10 parts without breaking any line in two (this is GNU split; a plain -n 10 splits by byte count and can cut a pattern in half). Depending on your available memory, you may need more than that.
If file01.txt does not contain regular expressions, use -F in the second line:
for f in ./tmp-file01.*; do grep -Ff "$f" "file02.txt"; done >file03.txt
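One side effect of the split approach is that a line of file02.txt matching patterns in more than one part is printed once per part. The sketch below, on made-up sample data, assumes GNU split (for the l/N form of -n, which never cuts a line in half) and assumes that the order of the output lines does not matter, so duplicates can be removed with sort -u. It also cleans up the temporary pattern files afterwards:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: four patterns, three candidate lines.
printf '%s\n' 'ap*le' 'kiwi' 'pie' 'pear' > file01.txt
printf '%s\n' 'apple pie' 'banana' 'pear tart' > file02.txt

# GNU split: l/2 produces 2 parts and keeps every line whole.
split -d -n l/2 file01.txt ./tmp-file01.

# Run grep once per part; sort -u drops lines matched by several parts.
for f in ./tmp-file01.*; do
    grep -f "$f" file02.txt
done | sort -u > file03.txt

# Remove the temporary pattern files.
rm -f ./tmp-file01.*
```

If the original line order of file02.txt must be preserved, sort -u is not suitable and the duplicates would have to be removed some other way.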
Answered by John1024 on December 3, 2021
You can't: all the patterns must be loaded into grep, and this exhausts the memory.
But if you want to compare the files, why don't you simply use diff (after sorting the contents)?
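A minimal sketch of the sort-then-compare idea, on made-up sample data. Note that this compares whole lines, unlike grep, which also matches substrings. The comm -12 step is not part of the answer above; it is added here as one possible way to list only the lines the two files share:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data in different orders.
printf '%s\n' 'pear' 'apple' 'kiwi' > file01.txt
printf '%s\n' 'kiwi' 'banana' 'pear' > file02.txt

# Sort in the C locale so sort and comm agree on ordering; -u drops duplicate lines.
LC_ALL=C sort -u file01.txt > file01.sorted
LC_ALL=C sort -u file02.txt > file02.sorted

# diff exits with status 1 when the inputs differ, hence the "|| true".
diff file01.sorted file02.sorted > differences.txt || true

# comm -12 keeps only the lines present in both sorted files.
LC_ALL=C comm -12 file01.sorted file02.sorted > common.txt
```

Sorting works on disk-backed temporary files when needed, so this approach should not run into the memory limit that grep -f does.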
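For a pattern file with one pattern per line (such as a list of MD5 sums):
# IFS= and -r keep leading whitespace and backslashes in each pattern intact
while IFS= read -r md5; do
grep -w "$md5" file02.txt
done < file01.txt > file03.txt
This is of course much slower, because file02.txt is read once per pattern, especially when file02.txt is big and does not fit into the cache, but it works for any size of the pattern file file01.txt.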
Answered by Yfa Kolh on December 3, 2021