TeX - LaTeX Asked on May 17, 2021
Following up this answer, how can I have an optional argument of the command getValue
that specifies the column in which I look up the cell value so that it has the following syntax?
getValue<macro name>%
[<optional column number with default of 2, and the counting starts with 1 at the very first column>]%
{<search phrase>}%
{<data of the csv file>}
For example, getValueSndParThrd[4]{Second Parameter}{mydata}
(after tweaking the following code) should store 44
in SndParThrd
.
Also, getValueSndParFrst{Second Parameter}{mydata}
should store 3
in SndParFrst
.
begin{filecontents*}{test.csv}
Third Parameter , 7 , 9 ,
First Parameter , 5 , 10 ,
Second Parameter , 3 , 6 , 44
end{filecontents*}
documentclass{article}
usepackage{xparse}
ExplSyntaxOn
% Step 1: reading the file
ior_new:N l__diaa_csv_ior
ior_new:N l__diaa_csv_str_bool
NewDocumentCommand ReadCSV { s m >{ SplitArgument {1} {,} }O{ 1,2 } m }
{
IfBooleanTF {#1}
{ bool_set_true:N l__diaa_csv_str_bool }
{ bool_set_false:N l__diaa_csv_str_bool }
diaa_csv_read:nnnn {#2} #3 {#4}
}
cs_new_protected:Npn diaa_csv_read:nnnn #1 #2 #3 #4
{
prop_new:c { g__diaa_csv_#1_prop }
ior_open:NnTF l__diaa_csv_ior {#4}
{ __diaa_csv_read:cnn { g__diaa_csv_#1_prop } {#2} {#3} }
{ msg_error:nnn { diaa } { file-not-found } {#4} }
}
msg_new:nnn { diaa } { file-not-found }
{ File~`#1'~not~found. }
cs_new_protected:Npn __diaa_csv_read:Nnn #1 #2 #3
{
bool_if:NTF l__diaa_csv_str_bool
{ ior_str_map_inline:Nn }
{ ior_map_inline:Nn }
l__diaa_csv_ior
{
prop_put:Nxx #1
{ clist_item:nn {##1} {#2} }
{ clist_item:nn {##1} {#3} }
}
}
cs_generate_variant:Nn __diaa_csv_read:Nnn { c }
%
% Step 2: getting the values
NewDocumentCommand getValue { m m m }
{ tl_set:Nx #1 { diaa_csv_item:nn {#2} {#3} } }
NewExpandableDocumentCommand CSVItem { m m }
{ diaa_csv_item:nn {#1} {#2} }
cs_new:Npn diaa_csv_item:nn #1 #2
{ prop_item:cn { g__diaa_csv_#2_prop } {#1} }
ExplSyntaxOff
begin{document}
ReadCSV{mydata}{test.csv}
getValuerdPar{Third Parameter}{mydata}
rdPar
edefstPar{CSVItem{First Parameter}{mydata}}%
stPar
end{document}
This is a modified version of Phelype Oleinik's answer that you linked to in the question. The first method is shorter, the second is better. :-)
Column numbers in the CSV file start at 1. The new commands have the following syntaxes:
ReadCSV[*] {label} [key column] {value columns} {file}
where key column defaults to 1 and value columns is a comma-separated list of column numbers (only values from these columns are stored by ReadCSV
to allow later retrieval by getValue
and CSVItem
).
getValue {macro} [value column] {key} {label}
This is what you asked for in the question. value column defaults to 2. The contents of value column corresponding to the specified key is stored in macro
(which could be an expl3
token list variable).
CSVItem {key} [value column] {label}
This command is fully expandable. It recursively expands to the contents of value column corresponding to the specified key. value column defaults to 2, as in getValue
.
begin{filecontents*}{test.csv}
Third Parameter , 7 , 9 ,
First Parameter , 5 , 10 ,
Second Parameter , 3 , 6 , 44
end{filecontents*}
documentclass{article}
usepackage{xparse}
ExplSyntaxOn
% Step 1: reading the file
ior_new:N l__diaa_csv_ior
bool_new:N l__diaa_csv_str_bool
seq_new:N l__diaa_csv_tmp_seq
% str mode (bool/star), label, key column, value columns, file
NewDocumentCommand ReadCSV { s m O{1} m m }
{
IfBooleanTF {#1}
{ bool_set_true:N l__diaa_csv_str_bool }
{ bool_set_false:N l__diaa_csv_str_bool }
diaa_csv_read:nnnn {#2} {#3} {#4} {#5}
}
% label, key column, value columns, file
cs_new_protected:Npn diaa_csv_read:nnnn #1 #2 #3 #4
{
seq_set_split:Nnn l__diaa_csv_tmp_seq { , } {#3}
seq_map_inline:Nn l__diaa_csv_tmp_seq
{ prop_new:c { g__diaa_csv_#1_##1_prop } }
ior_open:NnTF l__diaa_csv_ior {#4}
{
__diaa_csv_read:nn {#1} {#2}
ior_close:N l__diaa_csv_ior
}
{ msg_error:nnn { diaa } { file-not-found } {#4} }
}
msg_new:nnn { diaa } { file-not-found }
{ File~`#1'~not~found. }
% label, key column
cs_new_protected:Npn __diaa_csv_read:nn #1 #2
{
bool_if:NTF l__diaa_csv_str_bool
{ ior_str_map_inline:Nn }
{ ior_map_inline:Nn }
l__diaa_csv_ior
{
seq_map_inline:Nn l__diaa_csv_tmp_seq
{
prop_put:cxx { g__diaa_csv_#1_####1_prop }
{ clist_item:nn {##1} {#2} }
{ clist_item:nn {##1} {####1} }
}
}
}
% Step 2: getting the values
% macro or tl var, value column, key, label
NewDocumentCommand getValue { m O{2} m m }
{ tl_set:Nx #1 { diaa_csv_item:nnn {#3} {#2} {#4} } }
% key, column containing the desired value, label
NewExpandableDocumentCommand CSVItem { m O{2} m }
{ diaa_csv_item:nnn {#1} {#2} {#3} }
% key, column containing the desired value, label
cs_new:Npn diaa_csv_item:nnn #1 #2 #3
{ prop_item:cn { g__diaa_csv_#3_#2_prop } {#1} }
ExplSyntaxOff
begin{document}
% Use default column for the key (1)
ReadCSV{mydata}{2,4}{test.csv} % discard the third column
getValuerdPar[4]{Second Parameter}{mydata}
rdPar % 44
getValuerdPar{Second Parameter}{mydata}
rdPar % 3
getValuerdPar{Third Parameter}{mydata}
rdPar % 7
edefrdPar{CSVItem{First Parameter}{mydata}}%
rdPar % 5
edefrdPar{CSVItem{First Parameter}[4]{mydata}}%
ifxrdParempty
textlangle emptytextrangle
else
rdPar
fi
end{document}
This is based on the previous code, but:
parsing of the CSV lines is now done with seq_set_split:Nnn
instead of clist
functions in order not to ignore empty cells;
it is more memory-efficient (only one l3prop
is used to store all selected columns);
column addressing is different: when retrieving values with getValue
or CSVItem
, column 1 is the first selected column, column 2 is the second selected column, etc. Accordingly, the default column number when retrieving values with these functions is 1 (instead of 2).
getValue
now has a star version that performs a global assignment.
Since you want autodetection of the number of columns, I had to change the syntax of ReadCSV
as follows:
ReadCSV[*] [key column] {label} [value columns] {file}
key column defaults to 1 as above. However, when value columns is blank, the first line of file is read to determine the number n of columns, and value columns are defined to be all columns from 2 to n (both inclusive).
Spaces on both sides of column-delimiter commas are ignored when reading the file, and one set of outer braces around each cell contents is removed, if any (this is illustrated by the cell containing foo, bar
in the sample CSV file below).
begin{filecontents*}{test.csv}
Third Parameter , 7 , 9 ,
First Parameter , 5 , {foo, bar} ,
Second Parameter , 3 , 6 , 44
end{filecontents*}
documentclass{article}
usepackage{xparse}
ExplSyntaxOn
% Step 1: reading the file
ior_new:N l__diaa_csv_ior
bool_new:N l__diaa_csv_str_bool
seq_new:N l__diaa_csv_tmp_seq
% str mode (bool/star), key column, label, value columns, file
NewDocumentCommand ReadCSV { s O{1} m O{} m }
{
IfBooleanTF {#1}
{ bool_set_true:N l__diaa_csv_str_bool }
{ bool_set_false:N l__diaa_csv_str_bool }
diaa_csv_read:nnnn {#3} {#2} {#4} {#5}
}
% label, key column, value columns, file
cs_new_protected:Npn diaa_csv_read:nnnn #1 #2 #3 #4
{
tl_if_blank:nTF {#3} % Detect number of columns and use 2 to last
{
ior_open:NnTF l__diaa_csv_ior {#4}
{
bool_if:NTF l__diaa_csv_str_bool
{ ior_str_get:NN }
{ ior_get:NN }
l__diaa_csv_ior l_tmpa_tl
ior_close:N l__diaa_csv_ior
seq_set_split:NnV l_tmpa_seq { , } l_tmpa_tl
seq_clear:N l__diaa_csv_tmp_seq
int_step_inline:nnn { 2 } { seq_count:N l_tmpa_seq }
{ seq_put_right:Nn l__diaa_csv_tmp_seq {##1} }
}
{ msg_error:nnn { diaa } { file-not-found } {#4} }
}
{ seq_set_split:Nnn l__diaa_csv_tmp_seq { , } {#3} } % explicit columns
ior_open:NnTF l__diaa_csv_ior {#4}
{
prop_new:c { g__diaa_csv_#1_prop }
__diaa_csv_read:nn {#1} {#2}
ior_close:N l__diaa_csv_ior
}
{ msg_error:nnn { diaa } { file-not-found } {#4} }
}
msg_new:nnn { diaa } { file-not-found }
{ File~`#1'~not~found. }
cs_generate_variant:Nn prop_put:Nnn { cxV }
% label, key column
cs_new_protected:Npn __diaa_csv_read:nn #1 #2
{
bool_if:NTF l__diaa_csv_str_bool
{ ior_str_map_inline:Nn }
{ ior_map_inline:Nn }
l__diaa_csv_ior
{
seq_set_split:Nnn l_tmpa_seq { , } {##1} % split one CSV row
tl_clear:N l_tmpa_tl
seq_map_inline:Nn l__diaa_csv_tmp_seq
{
tl_put_right:Nx l_tmpa_tl { { seq_item:Nn l_tmpa_seq {####1} } }
}
prop_put:cxV { g__diaa_csv_#1_prop }
{ seq_item:Nn l_tmpa_seq {#2} }
l_tmpa_tl
}
}
% Step 2: getting the values
% star → global assignment, macro or tl var, value column, key, label
NewDocumentCommand getValue { s m O{1} m m }
{
IfBooleanTF {#1} { tl_gset:Nx } { tl_set:Nx }
#2 { diaa_csv_item:nnn {#4} {#3} {#5} }
}
% key, value column, label
NewExpandableDocumentCommand CSVItem { m O{1} m }
{ diaa_csv_item:nnn {#1} {#2} {#3} }
cs_generate_variant:Nn tl_item:nn { f }
% key, value column, label
cs_new:Npn diaa_csv_item:nnn #1 #2 #3
{
tl_item:fn { prop_item:cn { g__diaa_csv_#3_prop } {#1} } {#2}
}
ExplSyntaxOff
begin{document}
% Use default column for the key (1). The second empty optional argument (list
% of value columns) means we want to autodetect the value columns; then, the
% first column is for keys and all other columns are used as value columns.
ReadCSV{mydata}{test.csv}
getValuerdPar{Second Parameter}{mydata}
rdPar % 3
getValuerdPar[2]{Second Parameter}{mydata}
rdPar % 6
getValuerdPar[3]{Second Parameter}{mydata}
rdPar % 44
getValuerdPar{Third Parameter}{mydata}
rdPar % 7
edefrdPar{CSVItem{First Parameter}{mydata}}%
rdPar % 5
edefrdPar{CSVItem{First Parameter}[2]{mydata}}%
rdPar % foo, bar
edefrdPar{CSVItem{First Parameter}[3]{mydata}}%
ifxrdParempty
textlangle emptytextrangle
else
rdPar
fi
end{document}
Correct answer by frougon on May 17, 2021
With the readarray
package...
Note, the leading/trailing spaces in the input data are retained, for good or ill.
begin{filecontents*}{test.csv}
Third Parameter , 7 , 9 ,
First Parameter , 5 , 10 ,
Second Parameter , 3 , 6 , 44
end{filecontents*}
documentclass{article}
usepackage{readarray,pgffor}
newcommandgetValue[3][2]{%
deftmpA{#2}%
foreachz in{1,...,nrows}{%
arraytomacro#3[z,1]tmpB
ifxtmpAtmpB #3[z,#1]fi
}%
}
begin{document}
readarraysepchar{,}
readdef{test.csv}mydata
readarraymydatamyarray[-,ncols]
nrows{} rows in data
Cell(2,2): myarray[2,2]
Cell(3,4): myarray[3,4]
getValue{Third Parameter }{myarray}
and
getValue[4]{Second Parameter }{myarray}
end{document}
Answered by Steven B. Segletes on May 17, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP