TransWikia.com

Import an RDATA, SPSS, SAS, or STATA file into Mathematica

Mathematica Asked by Sulli on January 10, 2021

I would like to import the World Values Survey data file into Mathematica, but it is only provided in SPSS, SAS, or Stata formats, which I believe are formats used by statistical analysis software.

I can't import these files into Mathematica; the formats are not recognized.

I've found a Wolfram executable that should allow me to import SAS files, but I can't make it work on Linux. Has anyone managed to import such files into Mathematica?

3 Answers

You are right: those formats belong to statistical software packages.
My impression from Big Data discussions is that Mathematica leaves support for additional data formats to third-party software rather than extending its own list. I would recommend a conversion tool such as Stat/Transfer, which is common practice for this kind of task. If you take the WVS data in, say, Stata's .dta format (which I prefer), you can easily convert it to a .csv or .dat file.
Just choose the ASCII - Delimited output option.
The resulting .csv file is then imported into Mathematica as usual.
Another option is ASCII Fixed Format + All Programs, which produces a .dat file that is also easily imported into Mathematica (I've checked both cases).

Answered by garej on January 10, 2021

This should do the trick. Disclaimer: the file is 1.4 GB, so everything takes a very long time on my MacBook Air, and you will need an active internet connection for InstallR to work. Note that double quotes inside the R code, such as those around file paths, have to be escaped (\") in the Mathematica string.

First, download the .rdata version of the file from the link given.

Needs["RLink`"];

InstallR[];

REvaluate[
 "load(\"~/Downloads/WVS_Longitudinal_1981_2014_R_v2015_04_18.rdata\")"];

(*{".Traceback","WVS_Longitudinal_1981_2014_R_v2015_04_18"}*)

This loads the data into an R data frame called WVS_Longitudinal_1981_2014_R_v2015_04_18.

REvaluate[
 "write.csv(WVS_Longitudinal_1981_2014_R_v2015_04_18, file=\"WVS_data_rlink.csv\")"];

This writes the data back out to a CSV file. Note that in this case the data is a simple table; the approach may be more problematic for some of the more complex data structures that the .rdata format permits.

Theoretically, you can import the WVS_Longitudinal_1981_2014_R_v2015_04_18 data directly into Mathematica and work with it natively from that point, but that requires working through the RLink tutorial a bit more than I have time for right now. :)

EDIT: Some useful tips to explore the data directly.

Get the column names:

REvaluate["names(WVS_Longitudinal_1981_2014_R_v2015_04_18)"] // Short

Get a single column, in this case S001:

REvaluate["WVS_Longitudinal_1981_2014_R_v2015_04_18$S001"]

Get the first 10 rows of the R data.frame; note the R data objects in the result. Here head is like the Unix head command, not Mathematica's Head[]. This data is so wide and deep that it's hard to get it into a readable format without subsetting, slicing, and dicing.

REvaluate["head(WVS_Longitudinal_1981_2014_R_v2015_04_18,n=10)"]
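
If only a handful of variables are needed, a smaller CSV is much quicker to import than the full export. As a rough sketch outside Mathematica, assuming the CSV written above has a header row, the following uses only Python's standard csv module to stream the file and keep selected columns. The function name subset_csv_columns, the output file name, and the column S002 are illustrative placeholders, not part of the original workflow.

```python
import csv


def subset_csv_columns(src_path, dst_path, columns):
    """Stream src_path row by row and write only the named columns to dst_path.

    Returns the number of data rows written. The file is never loaded into
    memory as a whole, which matters for an export of more than a gigabyte.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        header = reader.fieldnames or []
        missing = [c for c in columns if c not in header]
        if missing:
            raise KeyError(f"columns not found in header: {missing}")
        # extrasaction="ignore" drops every column that was not requested.
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows


# Example call (file and column names are placeholders):
# subset_csv_columns("WVS_data_rlink.csv", "WVS_subset.csv", ["S001", "S002"])
```

The reduced file can then be imported into Mathematica like any other CSV.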

Answered by Gordon Coale on January 10, 2021

Since Mathematica can connect to Python, I found this snippet of code helpful. It imports a Stata .dta file using pandas and then returns it as a Wolfram Dataset.

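     importDTAMP[fileName_] := Module[{session, pythonFileName, ds},
       (* Start a Python session; the version should match an installation registered with Mathematica *)
       session =
        StartExternalSession[<|"System" -> "Python", "Version" -> "3.7.3"|>];
       (* Use forward slashes so that Windows paths survive inside the Python string literal *)
       pythonFileName = StringReplace[fileName, "\\" -> "/"];

       (* The value of the last Python expression, df, is returned to Mathematica *)
       ds = ExternalEvaluate[
         session,
         "
         import pandas as pd;
         path = '" <> pythonFileName <> "';
         df = pd.read_stata(path);
         df
         "
         ];

       (* Close the session before returning the result *)
       DeleteObject[session];
       ds

       ]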
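
For a very large file, returning the whole DataFrame through the external session may be slow. As a hedged alternative, here is a sketch that runs pandas directly in Python and writes a CSV that Mathematica can import afterwards. It assumes pandas is installed; the function name dta_to_csv and the chunk size are illustrative choices, not something from the snippet above.

```python
import pandas as pd


def dta_to_csv(src_path, dst_path, chunksize=50_000):
    """Convert a Stata .dta file to CSV chunk by chunk.

    Returns the number of rows written. Reading in chunks keeps memory use
    bounded; chunksize is an arbitrary default, not a tuned value.
    """
    rows = 0
    with pd.read_stata(src_path, chunksize=chunksize) as reader, \
         open(dst_path, "w", newline="", encoding="utf-8") as out:
        for chunk in reader:
            # Write the header only once, together with the first chunk.
            chunk.to_csv(out, index=False, header=(rows == 0))
            rows += len(chunk)
    return rows


# Example call (file names are placeholders):
# dta_to_csv("wvs.dta", "wvs.csv")
```

Writing to CSV sidesteps the conversion to a Dataset inside ExternalEvaluate, at the cost of losing Stata metadata such as variable labels.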

Answered by Andy Krock on January 10, 2021
