Stack Overflow Asked by dnv89 on November 15, 2021
I have a table that looks like this. (In reality, there are over a million rows and 70-something columns)
ind1 ind2 ind3 ... indn
0.1 0.2 0.3 0.4 0.5
1.0 0.9 0.8 0.7 0.6
1.0 1.0 1.0 1.0 1.0
0.9 0.9 0.9 0.9 0.9
I want an automated procedure to create a histogram for every column in the table. This is the code I have, but it doesn’t work:
for (i in 1:10){
  plN <- ggplot(cdf, aes(x=colnames(cdf)[i])) + geom_histogram(binwidth = 0.01)
  plot(plN)
}
How do I extract the column name for the ith column of the dataframe, so I can plug that in in place of x=…? The current setup doesn’t work.
(Answered in Duck’s comment below. Thanks for the help!)
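Duck's comment is not reproduced on this page. For context, the loop above fails because `colnames(cdf)[i]` is a character string, so `aes()` maps x to that constant string rather than to the column's values, and `geom_histogram()` rejects the discrete x. Below is a minimal sketch of one common fix, using the `.data` pronoun that ggplot2 makes available inside `aes()`. The small `cdf` here is made-up stand-in data, not the asker's table.

```r
library(ggplot2)

# Stand-in data (an assumption); the real cdf has far more rows and columns.
set.seed(1)
cdf <- data.frame(ind1 = runif(100), ind2 = runif(100), ind3 = runif(100))

for (i in seq_along(cdf)) {
  col <- colnames(cdf)[i]
  # .data[[col]] looks the column up by its name inside aes()
  p <- ggplot(cdf, aes(x = .data[[col]])) +
    geom_histogram(binwidth = 0.01) +
    xlab(col)  # without this, the axis label would read ".data[[col]]"
  print(p)
}
```

Each plot is printed inside the loop on purpose. If the plots were stored and drawn later, `.data[[col]]` would be evaluated with whatever value `col` holds at that point, so building the plot inside a function (as in the first answer) or injecting the name with `!!` (as in the second) is safer for stored plots.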
I also like writing single functions and then using a loop to call the function.
I'll create your dataset with some random numbers.
library(tidyverse)
set.seed(123)
cdf <- data.frame(
  A = rnorm(1000, -3, 1),
  B = rnorm(1000, 1, 1),
  C = rnorm(1000, 5, 10),
  D = rnorm(1000, -3, 2)
)
Then write a function to graph a given column of a given dataset. The printit argument controls whether the plot is also printed.
myhist_function <- function(mydata, mycolumn, printit = FALSE) {
  # Look up the column name from its position
  mycolname <- colnames(mydata)[mycolumn]
  # .data[[...]] refers to the column by name inside aes()
  plN <- ggplot(mydata, aes(x = .data[[mycolname]])) +
    geom_histogram(binwidth = 0.1) +
    xlab(mycolname)
  if (printit) {
    print(plN)
  }
  return(plN)
}
A single call looks like this:
myhist_function(mydata = cdf, mycolumn = 2, printit = TRUE)
I can then create an empty list, loop over whichever columns I want from any given dataset, and populate the list with ggplot2 graph objects.
mygraphs <- list()
columns_toplot <- names(cdf)
for (i in seq_along(columns_toplot)) {
  mygraphs[[i]] <-
    myhist_function(mydata = cdf, mycolumn = i, printit = FALSE)
}
names(mygraphs) <- columns_toplot
You can view any graph with mygraphs[[i]], but you can also use the ggarrange() function from the ggpubr package to arrange several of them in a grid:
library(ggpubr)
ggarrange(mygraphs[[1]],
          mygraphs[[2]],
          mygraphs[[3]],
          mygraphs[[4]],
          ncol = 2,
          nrow = 2)
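With 70-something columns, listing every plot by hand gets unwieldy. As a rough sketch, `ggarrange()` also accepts a whole list through its `plotlist` argument. The `demo_df` and `demo_graphs` names below are hypothetical and rebuilt from made-up data so the snippet runs on its own.

```r
library(ggplot2)
library(ggpubr)

# Made-up data with five columns (an assumption for illustration)
set.seed(123)
demo_df <- data.frame(
  A = rnorm(200, -3, 1),
  B = rnorm(200, 1, 1),
  C = rnorm(200, 5, 10),
  D = rnorm(200, -3, 2),
  E = rnorm(200, 0, 1)
)

# One histogram per column; each call to the anonymous function
# has its own environment, so every plot keeps its own column name
demo_graphs <- lapply(names(demo_df), function(col) {
  ggplot(demo_df, aes(x = .data[[col]])) +
    geom_histogram(bins = 30) +
    xlab(col)
})
names(demo_graphs) <- names(demo_df)

# Pass the whole list at once, at most four plots per 2 x 2 grid
pages <- ggarrange(plotlist = demo_graphs, ncol = 2, nrow = 2)
```

As far as I know, when there are more plots than fit on one grid, `ggarrange()` returns a list of pages instead of a single plot, so with five plots `pages[[1]]` would hold the first four and `pages[[2]]` the remaining one.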
Answered by akaDrHouse on November 15, 2021
We can store the output in a list
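library(ggplot2)
# Pre-allocate one list slot per column
plN <- vector('list', ncol(cdf))
for(i in seq_along(cdf)) {
  # !! rlang::sym() turns the column name into a symbol that aes() can map
  plN[[i]] <- ggplot(cdf, aes(x = !! rlang::sym(names(cdf)[i]))) +
    geom_histogram(binwidth = 0.01)
}
plN[[1]]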
Answered by akrun on November 15, 2021