Mathematica Asked by KHAAAAAAAAN on October 27, 2020
I’m running into an annoying issue when importing a table of tab-separated data. Several columns are numeric, while several are strings. Using Import[url, "TSV"] basically works perfectly. However, some of the strings are “5d2”, “4e1” or things of that nature, which Import then interprets as scientific notation. For instance, ImportString["4d2", "TSV"] yields {{400.}}, which I do not want. At the same time, some columns really are in scientific notation (e.g. 2.3e+02), and those I do want interpreted as numbers. Is there a clean way to selectively import certain table columns as numbers, leaving others as strings?
Without knowing more, I would first look at the "Numeric" -> False option, as in
Import["data.tsv", "TSV", "Numeric" -> False]
This seems to leave everything as strings (I had never worked with this functionality until now, and I got the idea from here).
It also avoids the misinterpretation of strings as scientific notation, as shown by
ImportString[#, "TSV", "Numeric" -> False] & /@ {"4e1", "5d2"}
InputForm@%
{{{4e1}},{{5d2}}}
{{{"4e1"}}, {{"5d2"}}}
Then, once everything is imported as strings, you can change the columns of scientific notation strings to numbers. For example,
data[[;;, column]] = Internal`StringToDouble /@ data[[;;, column]]
(I also borrowed Internal`StringToDouble from here.)
All together, for a file with a single column where only the last entry should become a number:
data = Import["data.tsv", "TSV", "Numeric" -> False]
data[[-1]] = Internal`StringToDouble /@ data[[-1]];
data
{{"1"}, {"2"}, {"3"}, {"4"}, {"5"}, {"6"}, {"7"}, {"8"}, {"9"}, {"10"}, {"4e1"}, {"5d2"}, {"2.3e+02"}}
{{"1"}, {"2"}, {"3"}, {"4"}, {"5"}, {"6"}, {"7"}, {"8"}, {"9"}, {"10"}, {"4e1"}, {"5d2"}, {230.}}
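The same idea extends to a table with several columns. Below is a minimal sketch, assuming a two-column table in which only column 2 should become numeric. The sample string and the names raw, numericColumns and data are made up for illustration, and Internal`StringToDouble is an undocumented function, so its behavior may differ between versions.

```mathematica
(* Hypothetical sample data: column 1 holds labels, column 2 holds numbers *)
raw = ImportString["4d2\t2.3e+02\n5e1\t-1.3e-05", "TSV", "Numeric" -> False];

(* Assumption: you know in advance which column positions should be numeric *)
numericColumns = {2};

(* Convert only those columns; every other column stays as strings *)
data = raw;
data[[All, numericColumns]] =
  Map[Internal`StringToDouble, raw[[All, numericColumns]], {2}];

data // InputForm
```

Listing the numeric columns explicitly keeps the decision in your hands, so a label such as "4d2" is never handed to the number parser.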
Answered by NonDairyNeutrino on October 27, 2020
Suppose your file is like this (two different types of columns for simplicity):
"4d2" 2.3e+02
"5e1" -1.3e-05
Here I use StringToStream to simulate the file. For a real file, pass its path instead, as in ReadList["path to file", {Word, Number}], with the number and types of columns that your file actually has:
ReadList[StringToStream["\"4d2\"\t2.3e+02\n\"5e1\"\t-1.3e-05"], {Word, Number}]
which gives
{{"4d2", 230.}, {"5e1", -0.000013}}
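As far as I can tell, Word does not treat quotation marks specially, so the quotes from the file stay inside the strings that ReadList returns. Here is a hedged sketch of stripping them afterwards. The names stream, rows and cleaned are illustrative, and the sample text mirrors the two-line file above.

```mathematica
(* Simulate the two-line file; for a real file, use its path instead *)
stream = StringToStream["\"4d2\"\t2.3e+02\n\"5e1\"\t-1.3e-05"];
rows = ReadList[stream, {Word, Number}];
Close[stream];

(* Remove a leading and trailing double quote from the first column only *)
cleaned = MapAt[StringTrim[#, "\""] &, rows, {All, 1}];

cleaned // InputForm
```

If the file has no quotes around the strings, the StringTrim step simply leaves them unchanged.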
Answered by Alx on October 27, 2020
If your file.tsv has data such as:
"4d2" 2.3e+02 105.5
"5e1" -1.3e05 235
Then SemanticImport may help:
data = SemanticImport["file.tsv",{"String","Number","Number"}, "HeaderLines"-> 0]
Set "HeaderLines" to match the number of header rows in your file.
Answered by Lee on October 27, 2020