TransWikia.com

Reformat date to unix time stamp in csv table

Unix & Linux Asked on December 21, 2021

I have a .csv file containing the date and time in the format 01/20/2016 23:53:01 in the first column. My columns are separated by semicolons, i.e.

01/21/2016 03:03:01;18616;0;1
01/21/2016 03:13:01;29040;36553;2
…
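For reference, the converted output should look like this (epoch values depend on the timezone, so the sketch below pins TZ=UTC to make them concrete; the while/read loop is only for illustration):

```shell
# Illustrative only: convert the two sample rows with GNU date, pinned to UTC
printf '%s\n' '01/21/2016 03:03:01;18616;0;1' '01/21/2016 03:13:01;29040;36553;2' |
while IFS=';' read -r d rest; do
  printf '%s;%s\n' "$(TZ=UTC date -d "$d" +%s)" "$rest"
done
# → 1453345381;18616;0;1
# → 1453345981;29040;36553;2
```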

On SO I found the bash command

date -d '06/12/2012 07:21:22' +"%s" 

which works for what I want. I've been trying to integrate it with awk to replace the first column. I found an answer to a similar problem:

awk -F'"' -v OFS='"' '$8 {cmd="date -d \"" $8 "\" +%FT%T%z"; cmd | getline $8; close(cmd)} 1' input.json

I've tried to adapt this to my own input, but I've been getting empty output.

3 Answers

Hmmmm. Okay, this was a while ago, but I figure I may as well throw in a suggestion.

I'm fairly certain that dropping into the shell to call date for every line of your file will be slow if you have lots of lines.

I was going to walk through a script I wrote to process timesheet entries extracted from Google Calendar and output HTML to convert to a PDF invoice, but that got too long, so I'll just give you the code to save on the reading.

I'm using the gawk-specific functions gensub and mktime [https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html]. The mktime function expects its input in the datespec format "YYYY MM DD HH MM SS [DST]", so your input needs to be rearranged, which is where gensub comes in. Here's what I've got for you...

awk -F';' -v OFS=';' '{ $1 = mktime(gensub(/(..)\/(..)\/(....) (..):(..):(..)/, "\\3 \\1 \\2 \\4 \\5 \\6", 1, $1)) } 1' datetime.csv > unix.csv

I would do this a little differently and print the individual fields instead of writing back to the first one. A tiny bit more self-documenting ;) TIMTOWTDI

awk -F';' -v OFS=';' '{ print mktime(gensub(/(..)\/(..)\/(....) (..):(..):(..)/, "\\3 \\1 \\2 \\4 \\5 \\6", 1, $1)), $2, $3, $4 }' datetime.csv > unix.csv

I know the script is a little more verbose, but it should be more performant.

Hope that helps you or anyone else looking at the same thing.

Answered by ConceptRat on December 21, 2021

GNU date has a -f option to convert dates read from a file, line by line. If your file is long, this will be faster than invoking date once per line. The date needs to be alone on the line; hence the plan is to isolate the first column (cut -d ';' -f 1), run that through date -f - to perform the conversion, and paste the result with the remaining columns.

paste -d ';' <(<input cut -d ';' -f 1 | date -f - +%s) <(<input cut -d ';' -f 2-)

This assumes that your shell supports process substitution (ksh93, bash, zsh). With plain sh, on a Unix variant that supports /dev/fd (most do), you can use file descriptor shuffling:

<input cut -d ';' -f 2- | {
  exec 3<&0
  <input cut -d ';' -f 1 | date -f - +%s | paste -d ';' - /dev/fd/3
}

Answered by Gilles 'SO- stop being evil' on December 21, 2021

I finally rubber-ducked it while writing up the question. So here is my solution:

awk -F';' -v OFS=';' '$1 {cmd="date -d \"" $1 "\" +%s"; cmd | getline $1; close(cmd)} 1' datetime.csv > unix.csv

It turned out to be a combination of two things: I was missing the " on +%s", and there was a broken line in my input.
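As a quick check, running the command on one sample row (TZ=UTC is my addition here, to make the epoch value reproducible):

```shell
# One sample row; awk shells out to GNU date for the conversion
echo '01/21/2016 03:03:01;18616;0;1' |
TZ=UTC awk -F';' -v OFS=';' '$1 {cmd="date -d \"" $1 "\" +%s"; cmd | getline $1; close(cmd)} 1'
# → 1453345381;18616;0;1
```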

Answered by Murch on December 21, 2021
