TransWikia.com

Is it appropriate to use an 'invariant' variable in a multivariate test?

Cross Validated Asked by terauser on December 20, 2021

I have been assigned the task of finding out how (and if) the biological diversity of tropical insects (ants) is related to environmental variables. So I did a small round of fieldwork to get some initial data and design a preliminary analysis before putting the bulk of the resources into further field sampling.
So now I have environmental data (humidity at the leaf surface, temperature, % vegetation cover) and biological data (number of individuals of each ant species and their derived diversity indices) at 3 heights (0, 5, and 10 m) and 3 times of the year (January, May, November).
In my field the usual way is to throw all of these data into a multivariate method such as PCA or RDA. At first I did that too, and two variables appear as important (i.e. each shows up as a long, clearly visible arrow in the RDA plot).

The problem is that one of these two variables is not really variable. For some reason the values for this variable (humidity) appear with 4 decimals (e.g. 66.0344) in the database provided by the technical staff, and we know that humidity, for all our purposes and methods, is meaningful only to 1 or 2 decimals. This means that basically all values of this variable are functionally and statistically equivalent. For our purposes, let's say that the ants I study don't care whether the humidity is 66.0051 or 66.0056. Rather, they are affected by much larger differences, e.g. 66 vs. 30, or at the very least 66 vs. 64. You get my point. So biologically I can safely say that all humidity values in my data set are, e.g., mean = 66.00 with SD = 0.05.
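To make the "not really variable" point concrete, here is a minimal sketch that rounds humidity to the precision that is meaningful and checks how much variation is left. The readings are made up for illustration (they are not my real data), and `effective_variation` is just a hypothetical helper name.

```python
import numpy as np

# Hypothetical humidity readings (%), invented for illustration only.
# They mimic the 4-decimal values described above; they are NOT the real data.
humidity = np.array([66.0344, 66.0051, 66.0056, 66.0212, 65.9987,
                     66.0129, 66.0403, 65.9961, 66.0078])


def effective_variation(values, decimals):
    """Round to the given number of decimals and summarise what variation remains."""
    rounded = np.round(values, decimals)
    return {
        "n_unique": int(np.unique(rounded).size),  # distinct values after rounding
        "sd": float(rounded.std(ddof=1)),          # sample standard deviation
    }


raw = effective_variation(humidity, 4)         # as stored in the database
meaningful = effective_variation(humidity, 1)  # at the precision that matters

print("as stored:       ", raw)
print("meaningful digits:", meaningful)
```

With these invented readings, every value collapses to the same number once rounded to 1 decimal, so the rounded column has a single unique value and zero standard deviation. A column like that carries no information a multivariate method could use.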

So my question is the following: given that we already know that the humidity is effectively the same for all cases, does it make sense to still include this variable in the RDA only because the statistical package reports it as statistically relevant? Wouldn't it be more appropriate to leave it out and assume its "statistical importance" must be coming from those non-significant decimals, which are meaningless for my purposes? Or would you include this variable anyway? If so, how should I interpret the results? "A given percent of variability was explained by this factor, which was actually the same throughout the study"?
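As a sketch of how meaningless decimals could still look "important": Pearson correlation, like the fitted values of a linear regression with an intercept, does not change when a predictor is linearly rescaled. The example below uses simulated data only (the variable names and numbers are assumptions for illustration). It squeezes a wide-ranging predictor into a tiny band around 66 and shows that its correlation with a response stays the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data, for illustration only: 9 cases (e.g. 3 heights x 3 dates).
wide = rng.uniform(30, 90, size=9)                  # predictor spanning 30..90
response = 0.5 * wide + rng.normal(0, 5, size=9)    # response loosely tied to it

# The same predictor squeezed into a tiny band around 66
# (differences now live only in the 3rd and 4th decimals).
narrow = 66.0 + (wide - wide.mean()) * 1e-4

r_wide = np.corrcoef(wide, response)[0, 1]
r_narrow = np.corrcoef(narrow, response)[0, 1]

print(f"range of narrow predictor: {narrow.max() - narrow.min():.4f}")
print(f"correlation (wide):   {r_wide:.6f}")
print(f"correlation (narrow): {r_narrow:.6f}")
```

The two correlations agree up to floating-point error, so a scale-invariant statistic cannot tell a biologically trivial spread from a large one. Whether the decimals in my real data behave like this is exactly what I cannot tell from the package output alone.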
