Unix & Linux Asked by JugaruGabi on January 29, 2021
So I am having the following file:
Certificate Name,Expiry Date,Type,Certificate file path
CN=server10.de.go.com:app, 16-Jan-2021, physical, /path/to/file/
CN=sv999-c.int.de.aaas.internet.go.com:testclient, 31-Oct-2020, vm, /path/to/file/
CN=server26.de.go.com:app, 06-Feb-2021, physical, /path/to/file/
How can I sort the file to show the earliest date first and the latest last?
Tried with the command below, but it doesn’t do the trick (it sorts only the days part from the date and that is it):
sort -t "," -k 2 file
Also, I need to make sure that the header is not being captured by the sorting command.
A little bit of help here, please?
I like to use msort for its flexibility. Here is a snippet of a demonstration script:
# Utility functions: print-as-echo, print-line-with-visual-space.
pe() { for _i;do printf "%s" "$_i";done; printf "n"; }
pl() { pe;pe "-----" ;pe "$*"; }
pl " Input data file $FILE:"
head $FILE
pl " Modified ( ', ' -> ',') input file t1:"
sed 's/, /,/g' $FILE > t1
head t1
pl " Results, msort:"
{
head -n1
msort -q -l -j -d "," -n 2,2 --comparison-type date --date-format "d-m-y"
} < t1
producing:
-----
Input data file data1:
Certificate Name,Expiry Date,Type,Certificate file path
CN=server10.de.go.com:app, 16-Jan-2021, physical, /path/to/file/
CN=sv999-c.int.de.aaas.internet.go.com:testclient, 31-Oct-2020, vm, /path/to/file/
CN=server26.de.go.com:app, 06-Feb-2021, physical, /path/to/file/
-----
Modified ( ', ' -> ',') input file t1:
Certificate Name,Expiry Date,Type,Certificate file path
CN=server10.de.go.com:app,16-Jan-2021,physical,/path/to/file/
CN=sv999-c.int.de.aaas.internet.go.com:testclient,31-Oct-2020,vm,/path/to/file/
CN=server26.de.go.com:app,06-Feb-2021,physical,/path/to/file/
-----
Results, msort:
Certificate Name,Expiry Date,Type,Certificate file path
CN=sv999-c.int.de.aaas.internet.go.com:testclient,31-Oct-2020,vm,/path/to/file/
CN=server10.de.go.com:app,16-Jan-2021,physical,/path/to/file/
CN=server26.de.go.com:app,06-Feb-2021,physical,/path/to/file/
Run on:
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 5.7.0-3-amd64, x86_64
Distribution : Debian GNU/Linux bullseye/sid
bash GNU bash 5.0.18
Part of the flexibility comes from a good design of features, for example from the man page:
-f,--date-format <date format>
Permutation of ymd with separators, e.g. y-m-d for international
date format, m/d/y for American date format, or a permutation of
yd with separators, e.g. y-d, for day-of-year dates. All three
components may be numbers in any available number system. The
month field may also be a month name, determined by the same de-
vices as independent month name fields.
The msort code is available in many repositories, here is some additional information:
msort sort records in complex ways (man)
Path : /usr/bin/msort
Version : 8.53
Type : ELF 64-bit LSB shared object, x86-64, version 1 ( ...)
Help : probably available with -h,--help
Repo : Debian GNU/Linux bullseye/sid
Home : http://www.billposer.org/Software/msort.html (pm)
Best wises ... cheers, drl
Answered by drl on January 29, 2021
Perl can do date parsing and sorting:
perl -MTime::Piece -F, -ane '
next if $. == 1;
push @lines, [$_, Time::Piece->strptime($F[1], " %d-%b-%Y")->epoch];
END {
@sorted = sort {$a->[1] <=> $b->[1]} @lines;
print $sorted[$_]->[0] for (0,-1);
}
' file
CN=sv999-c.int.de.aaas.internet.go.com:testclient, 31-Oct-2020, vm, /path/to/file/
CN=server26.de.go.com:app, 06-Feb-2021, physical, /path/to/file/
Answered by glenn jackman on January 29, 2021
With GNU sort
or compatible¹, you can use the M
sorting flag for month names:
{
head -n1
LC_ALL=C sort -t, -k2.9n -k2.5,2.7M -k2n
} < file
Or to ignore the spacing after the ,
(in case it's not always exactly one SPC character):
{
head -n1
LC_ALL=C sort -t, -k2.8bn -k2.4b,2.6bM -k2n
} < file
(here using the C locale to expect English month names as opposed to those in the language of the user).
Here, we avoid passing the header to sort
, but it wouldn't be a problem if we did as the first sort key (year) for that header would evaluate to 0
and would be sorted first. Using head
like that works for regular file (or other seekable files) as input, it wouldn't work for pipes (with GNU sed
, you could replace head -n1
with sed -u q
).
POSIXly, you could do something like:
awk -F',[[:space:]]*' -v months=JanFebMarAprMayJunJulAugSepOctNovDec '
NR == 1 {print; next}
{
split($2, f, "-")
printf "%04d%02d%02dt%sn", f[3], index(months, f[2]),
f[1], $0 | "sort | cut -f2-"
}' < file
Where we prepend each line (except the first which is printed first unmodified) with 2020xx15
, where xx
is the position of the month abbreviation inside that months
variable (so values 01, 04, 07, 10... not the month numbers, but something that grows with the month number), which are then sorted lexically and removed with cut
.
If your awk
is mawk
(which doesn't support POSIX character classes), you can replace [[:space:]]
with [ t]
(add other whitespace character to the list as needed).
¹ GNU sort
is found on GNU systems and most BSDs, though some BSDs have since replaced it with their own with a compatible API. busybox sort
now also supports that M
flag (though at least with 1.30.1, it seems buggy when combined with key specifications that include offsets; I find that I need to remove the key end part for it to work here).
Answered by Stéphane Chazelas on January 29, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP