Web Applications Asked by Wil on January 18, 2021
I am trying to calculate the correlation coefficient of a dataset with missing data. It’s difficult to describe why data is missing, so I posted an editable spreadsheet to help illustrate the problem:
https://docs.google.com/spreadsheets/d/1acO8hZARorwd8_NsAnAuKaulpemQe7W0AUzgzf3Mku8/edit?usp=sharing
Sheet 3 contains the dataset that I want to calculate the correlation coefficient for. Column B was collected from Sheet1, and column C was collected from Sheet2. The missing data results from combining the data from Sheet1 and Sheet2. Is there any way to interpolate the missing data while considering the dates and times posted in column A?
For instance, cell B5 is blank. The average of the cells above and below B5 would be "4", but because the dates/times between A4 and A5 are much closer, B5 should be filled in with a higher number, such as 4.7 or 4.8. Is this possible to do in Sheets?
Wil, this is tricky business. But I think I've come up with something that will work for you. While we could put it all into one mega-formula, I think it will be best as three separate array formulas (one per column) due to the size of the formulas.
I've set up a new sheet in your sample spreadsheet. It's called "Erik Help." In it, you'll find the following three formulas:
In A2:
=QUERY({Sheet1!A2:A;Sheet2!A2:A},"Select * Where Col1 Is Not Null Order By Col1 Asc")
In B2:
=ArrayFormula(IF(A2:A="","",IFERROR(VLOOKUP(A2:A,Sheet1!A2:B,2,FALSE),IF(ROW(A2:A)=2,"",IF(ROW(A2:A)=COUNTA(A:A),"",IFERROR((VLOOKUP(A2:A,Sheet1!A2:B,2,TRUE))+(VLOOKUP(VLOOKUP(A2:A,{Sheet1!A2:A,ROW(Sheet1!A2:A)},2,TRUE)+1,{ROW(Sheet1!A2:A),Sheet1!B2:B},2,FALSE)-VLOOKUP(A2:A,Sheet1!A2:B,2,TRUE))*((A2:A-VLOOKUP(A2:A,Sheet1!A2:A,1,TRUE))/(VLOOKUP(VLOOKUP(A2:A,{Sheet1!A2:A,ROW(Sheet1!A2:A)},2,TRUE)+1,{ROW(Sheet1!A2:A),Sheet1!A2:A},2,FALSE)-(VLOOKUP(A2:A,Sheet1!A2:A,1,TRUE))))))))))
In C2:
=ArrayFormula(IF(A2:A="","",IFERROR(VLOOKUP(A2:A,Sheet2!A2:B,2,FALSE),IF(ROW(A2:A)=2,"",IF(ROW(A2:A)=COUNTA(A:A),"",IFERROR((VLOOKUP(A2:A,Sheet2!A2:B,2,TRUE))+(VLOOKUP(VLOOKUP(A2:A,{Sheet2!A2:A,ROW(Sheet2!A2:A)},2,TRUE)+1,{ROW(Sheet2!A2:A),Sheet2!B2:B},2,FALSE)-VLOOKUP(A2:A,Sheet2!A2:B,2,TRUE))*((A2:A-VLOOKUP(A2:A,Sheet2!A2:A,1,TRUE))/(VLOOKUP(VLOOKUP(A2:A,{Sheet2!A2:A,ROW(Sheet2!A2:A)},2,TRUE)+1,{ROW(Sheet2!A2:A),Sheet2!A2:A},2,FALSE)-(VLOOKUP(A2:A,Sheet2!A2:A,1,TRUE))))))))))
The formulas in B2 and C2 are the same except for the references to either Sheet1 or Sheet2 throughout (i.e., all relative ranges are the same between the two).
If a cell that is missing data is either the first cell in the range or the last cell in the range, I left it blank, because there isn't enough data to extrapolate what might have happened before or after, respectively.
Correct answer by Erik Tyler on January 18, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP