Excel question, deciles of various number of rows

Here is what I am trying to do. I have columns of data, each of varying numbers of rows, that I want to calculate decile averages for. One column could have three hundred rows, which would mean 10 deciles of 30 rows each. I could then manually create an average formula for each of the ten deciles. The problem is I have thousands of such columns of data, with different numbers of rows. It will take way too long to manually average the deciles. If they were all 30 rows to a decile it would be no problem. So, does anyone know of a simple solution where one formula will be able to grab the average of the first decile (and 2 - 10 in subsequent formulas), while automatically accounting for whether there are 30 or 50 or whatever rows in that decile. Then I could just copy that formula for the other columns.

Your repeated use of the word “deciles” is impressive. No idea what the question is, but I’m impressed none the less.

Right here bro: =IF(ISERROR(SUM(OFFSET(A$1,(ROW(A1)-1)*10,10,)))/10,"",SUM(OFFSET(A$1,(ROW(A1)-1)*10,10,))/10) If the columns are on a sheet starting at A1, just paste this formula beside the columns and drag it to make it apply to all columns And boom goes the dynamite

Uh, so if I understand correctly, if column A has 1000 rows, you need ten numbers? That is, the average of the first decile, average of the second decile, … etc.? Is the data sorted or non-sorted? If it’s unsorted, do you want to organize deciles by value or by the position in the column? In any case, it seems like you should just write a macro to do this. Do you know any VBA?

Use something besides excel. If that’s not acceptable, you could do something like this: 1) Create a new sheet just for the decile information 2) First column put 1 through 10, for each decile 3) You need to count the number of real entries in each column and then calculate the edges of each decile, so for the first one and decile 1, it would be something like, in cell B1, =small(sheet1!A:A,COUNT(A:A)*$A1/10) 4) after you copy this over, create another sheet (not really necessary), you can put in the 1-10, but not needed 5) in cell B1 (if you have the 1-10 in), type =sumif(sheet1!A:A,"<="&sheet2:B1) for B2, you could need to do the same thing but subtract what is above 6) You need another sheet to do a similar calculation but for the counts, (replace sumif with countif) 7) Divide the sums from the counts. You might be able to simplify this so you just need one averageif, but I can’t say for sure since I don’t recall if the logical arguments work with more than one statement. Should come out to what you want.

If you have thousands of columns, then excel is clearly not your tool. Even if you had a simple macro that could calculate it for a row, you’d still have to run it on thousands of rows. You could, I suppose, loop through the rows, but you’d still have to analyze a thousand different columns of rankings, which would be 1000 x 10 deciles or 10,000. Excel just doesn’t seem like the appropriate tool. If you have 30 columns of data, then maybe you can do it. 100 at the outside. The RANK function might come in handy here. TRUNC(10* (RANK(*,*) / COUNT(*))), should get you the decile for each entry. Depending on how the function is implemented, you might have to add 1. A wrinkle might come in with how you’ve coded missing data and how rank counts blank entries. I would make a separate worksheet that gives the decile for each data point in the corresponding master data sheet. You can then get the averages either by doing a SUMIF / COUNTIF equation, or perhaps by running pivot tables. This is a project more suited to R or Matlab, though.

Oh, that’s pretty close to jmh’s solution. We both agree that excel is not really suited to this.

1/ Insert 10 rows ABOVE all your data. 2/ enter the following ARRAY formula (ctrl-shift-enter) into A1 {=SUM(OFFSET(A$11,0,0,COUNT(A$11:A$65000),1)*(CEILING(RANK(OFFSET(A$11,0,0,COUNT(A$11:A$65000),1),OFFSET(A$11,0,0,COUNT(A$11:A$65000),1))/COUNT(A$11:A$65000)*10,1)=ROW()))/SUM(1*(CEILING(RANK(OFFSET(A$11,0,0,COUNT(A$11:A$65000),1),OFFSET(A$11,0,0,COUNT(A$11:A$65000),1))/COUNT(A$11:A$65000)*10,1)=ROW()))} 3/ Fill down to A10, then fill right A1:A10 will give you decile averages for col A, B1:B10 for col B… andd soo on and sooo on

I just saw this query, Actually you can use function ‘large’ so you insert one column at the left and 51 (1/10 of the maximum rows +1) rows at the top of your data A2:A51 = 1 to 50 b1 = =COUNT(B52:B7001), b2 = =IF($A2

Thanks for the help, I think I have a decent idea of how I am going to do this