Thursday, September 3, 2015

Google Sheets Query involving two groups of data


The easiest way to explain what I'm trying to do is to give an example of the data, so here's a simple example:

A                B          C            D
Name1        4          Name4     3
Name2        3          Name4     2
Name3        5          Name1     1.4
Name4        3          Name3     6
Name5        1          Name6     8
Name2        3          Name7     10
Name1        4          Name5     2
With that above data, I want to produce a pair of columns, the first that has a list of all the unique entries in A and C, the second having the average of the corresponding values in either B or D for the entry in A or C. The result should look something like:
Name1      3.133333
Name2      3
Columns A and C are guaranteed to be strings, B and D are guaranteed to be positive numbers. It is also guaranteed that A and C will never contain the same value.
Then I also want to do another query similar, except instead of taking the average of all the corresponding values, I want to count the number of times some specific value occurs with respect to the total. As an example:
A                B          C            D           E
Name1        4          Name4     3           Y
Name2        3          Name4     2           Y
Name3        5          Name1     1.4        N
Name4        3          Name3     6           Y
Name5        1          Name6     8           Y
Name2        3          Name7     10         N
Name1        4          Name5     2           Y
In this case, I want to count the percentage of times Y occurs in E over the total number of times each item in A and C occurs. This time, B and D are no longer important. For example, Name1 occurs 3 times, and has an E value of Y twice, so I want 0.66667. So the result should look like:
Name1    0.66667
Name2    0.5
Name3    0.5
Name4    1

If at any point any of this isn't clear, please ask so I can clarify. Thanks.


Screenshot of Spreadsheet:

I have the following formula in cell G1:
=query({A:B;C:D},"Select Col1,avg(Col2) where Col1<>'' group by Col1 label Col1 'Name', avg(Col2) 'Average' format avg(Col2) '0.00'")

and the following formula in cell J1:
=query(arrayformula(iferror(query({{A:A,E:E,E:E};{C:C,E:E,E:E}},"select Col1,count(Col3) where Col1<>'' group by Col1 pivot Col2")*1,query({{A:A,E:E,E:E};{C:C,E:E,E:E}},"select Col1,count(Col3) where Col1<>'' group by Col1 pivot Col2"))),"select Col1,(Col3/(Col2+Col3)) label Col1 'Name', (Col3/(Col2+Col3)) 'Percentage' format (Col3/(Col2+Col3)) '0.00'")

No comments:

Post a Comment