I am thinking that the 0.4/0.6/0.8 method is not very accurate.

The reason is that we have 6 questions in one item set, and the scoring bands work out as follows:

- <=50: any combination of 0/6, 1/6, 2/6, or 3/6. The average of these scores is 0.25.

- 50-70: only 4/6, i.e. roughly 0.66.

- >70: 5/6 or 6/6; the average is approximately 0.92.

If you can calculate a weighted score based on the above, that would be great!

For example, if a section is weighted 18 marks and you scored between 50 and 70, the estimated score is 18 × 0.66 ≈ 12.

Then the total score is divided by 360.
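To make the proposal concrete, here is a minimal sketch of the weighted calculation (the 18-mark example and the 360-mark total come from the post above; the function and variable names are mine):

```python
# Band multipliers proposed above: <=50 -> 0.25, 50-70 -> 0.66, >70 -> 0.92.
BAND_MULTIPLIER = {"<=50": 0.25, "50-70": 0.66, ">70": 0.92}

def estimated_score(sections, total_marks=360):
    """sections: list of (marks, band) pairs; returns an estimated percentage."""
    earned = sum(marks * BAND_MULTIPLIER[band] for marks, band in sections)
    return 100 * earned / total_marks

# An 18-mark section in the 50-70 band contributes 18 * 0.66 = 11.88,
# i.e. roughly 12 marks, matching the example above.
```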

Thanks

Based on 0.4/0.6/0.8 : 70.6%

Based on 0.25/0.66/0.92 : 78.4%

min: 58.9%

max: 86.7%

You are correct; people go through this analysis every year to get a more precise estimate.

That’s a HUGE difference!

Think about it: when you are doing mock exams, how many mocks do you have to write to improve your score from 70 to 78??

I didn’t do any mock exams from Schweser. Just did the sample essay questions and item set questions from CFAI (which contain actual exam questions).

If people can post their scores based on 0.25/0.66/0.92, that would be great.

A proper method would take into account the expected score given being in one of the three bands.

92%? Really? You think that, conditional upon scoring over 70%, your expected score is 92%? That’s ridiculous. Why are you taking the arithmetic mean?

Gosh, the two obvious ways are to use a prior distribution or a Monte Carlo simulation.

The reason I thought of this is that, using 40/60/80, there were lots of people who failed and people who passed with very similar scores.

I think 40/60/80 overestimates the score for people who got “<=50” and underestimates the score for people who got “>70”.

I read the 40/60/80 thread, and there were people scoring at band 3 who had scores similar to people who passed.

In a 6-question item set, a “<=50” result means you scored 0, 1, 2, or 3 out of 6.

In a 6-question item set, the only way into the “>70” bracket is to score 5/6 or 6/6, which means at least 83.33%; so using 80% understates the score of people in the “>70” band.

Obviously each topic might have more than one item set, but that was the reasoning behind 0.25/0.66/0.92.

Not sure what the logic of 0.4/0.6/0.8 is.

Of course, this system is not perfect, but I think it will give people who failed a more realistic idea of how they scored compared to people who passed.

I also agree that this logic works better for the Level 2 exam, because the Level 3 exam has the essay component.

Hmm.

AM: (0.25 * 16 + 0.66 * 62 + 0.92 * 102) / 180 = 77%

PM: (0.25 * 18 + 0.92 * 162) / 180 = 85.3%

Average: 81.2%

I like this method

You’re not connecting the reasoning to the numbers.

If I flip a fair coin six times, and then tell you that I got either 5 or 6 heads, on that information alone what is the expected number of heads? 5.5? Of course not; it is 36/7, about 5.1.
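For anyone who wants to verify that number, a quick sketch:

```python
from math import comb

# P(k heads) for 6 fair coin flips, then the conditional mean given k >= 5.
n, p = 6, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
band = [5, 6]
cond_expectation = sum(k * pmf[k] for k in band) / sum(pmf[k] for k in band)
# 36/7, about 5.14 -- not the arithmetic mean 5.5
```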

Thanks 1recho!

You did really well to have a huge portion of your scores in the “>70” bucket

+1 The outcomes are more likely binomially distributed, not uniformly distributed as the OP is proposing, but it is still an improvement over the 40/60/80 method. Cue the “what is your 38.10% / 66.67% / 85.71% score?” thread.

Since most of us model for a living, maybe we should create a separate thread just to discuss what is the best way to do this analysis.

I would do it based on a binomial distribution and then updated iteratively.

For any person, if you know p, the probability that they got a question selected at random right, the exercise becomes trivial.

So how to get p? Start with an “initial guess,” you can base it off the 40/60/80 method as it is close enough. Then, using the # of questions (to determine the possible percentages in that band), you can use Bayesian analysis to determine the probability that any of those particular percentages was attained. Then, use those probabilities to get the “expected score” on that section…do that for all sections to get an updated estimate for p.

Now, perform the same process using the updated p to get an even newer estimate…keep on doing this until p converges around a certain point. Then use that value of p to do a final binomial thing, and you have your answer.
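A rough sketch of that loop, assuming every item set has 6 questions and a single underlying p (the band cutoffs and the section inputs are my assumptions, not an exact spec of the method described above):

```python
from math import comb

# Which correct-counts are consistent with each band label.
BANDS = {"<=50": lambda k, n: k / n <= 0.5,
         "50-70": lambda k, n: 0.5 < k / n <= 0.7,
         ">70": lambda k, n: k / n > 0.7}

def conditional_mean(n, p, band):
    """E[correct | band] for one n-question section under Binomial(n, p)."""
    counts = [k for k in range(n + 1) if BANDS[band](k, n)]
    pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in counts}
    total = sum(pmf.values())
    return sum(k * w for k, w in pmf.items()) / total

def iterate_p(sections, p0, tol=1e-9, max_iter=1000):
    """sections: list of (n_questions, band). Iterate until p converges."""
    p = p0
    total_q = sum(n for n, _ in sections)
    for _ in range(max_iter):
        expected = sum(conditional_mean(n, p, band) for n, band in sections)
        p_new = expected / total_q
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p
```

Start p0 from the 40/60/80 estimate, as suggested; the converged p times the total question count gives the expected score.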

^ Why can’t CFAI just give us the numbers?

It’s silly to model when the real information should be available freely.

Fking CFAI likes to make a big deal about everything. “Oooooooooh our exam questions are so so special, if you discuss them we will cut your X off. But we can publish Level III AM questions, that’s OK. Only Level I, II and PM questions are unmentionable.”

Palisoc_xb… Can you give a bit more background on how you got the “38.10% / 66.67% / 85.71%” ?

I started this thread immediately after the “25%/66.67%/92%” idea struck me this morning just as I woke up (still in bed), so I haven’t put too much thought into this.

Kartelite: thanks for your suggestion, but isn’t it difficult to calculate if the only score you have is your own?

1recho, I heard from past candidates that they used to provide rankings (and scores?), but candidates began to compare rankings and scores across years, which may not be meaningful: a 70% score from one year might not mean the same as a 70% score the next year, since exam difficulty changes.

Every year, threads like these end up being about which way is best and whatnot. haha

Btw, my score is:

AM: 83.6%

PM: 86.7%

Total: 85.1%

Sorry. How I got that was by assuming that the scores in the item sets were binomially distributed with N trials and P probability of getting each answer correct, then taking the conditional expectation per band. I just used N=6 and P=50% (a crude assumption). 38.10% is the conditional expectation of the lower band, 66.67% for the middle (as there is only one possible score in the middle band), and 85.71% for the upper band.

I think this is also how kartelite got approx. 5.1 for the upper band; I just divided that by 6 to get 85.71%.

This is only applicable for the PM section, though, as the AM section is quite different.
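For what it’s worth, the three numbers reproduce directly from the stated assumptions (Binomial with N=6, P=50%, conditional expectation within each band):

```python
from math import comb

n, p = 6, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def band_pct(counts):
    """Conditional expected percentage, given the correct-count fell in `counts`."""
    return 100 * sum(k * pmf[k] for k in counts) / (n * sum(pmf[k] for k in counts))

low = band_pct([0, 1, 2, 3])   # <=50 band  -> 38.10%
mid = band_pct([4])            # 50-70 band -> 66.67%
high = band_pct([5, 6])        # >70 band   -> 85.71%
```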

That’s true. But you could get a pretty rough idea if someone ran a huge simulation over lots of scores based on your “initial” 40/60/80 score. As in, given that a 40/60/80 score was X, we can obtain a confidence interval about the score using the iterated method (which should be a good estimate itself).

This would be less accurate as it doesn’t account for the number of questions that were in a section for each particular band.

Maybe we could get a large sample size for the vignette sections and estimate P by using Maximum Likelihood Estimation but still assuming a binomial distribution?
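That could look something like this (a sketch under the stated assumptions: 6-question item sets, one shared P across the sample, and a simple grid search standing in for a proper optimizer):

```python
from math import comb, log

# Correct-counts consistent with each band, for a 6-question item set.
BANDS = {"<=50": [0, 1, 2, 3], "50-70": [4], ">70": [5, 6]}

def band_prob(band, p, n=6):
    """P(score falls in a band) under Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in BANDS[band])

def mle_p(observed_bands, grid=999):
    """Grid-search MLE of p from a sample of observed band labels."""
    best_p, best_ll = None, float("-inf")
    for i in range(1, grid + 1):
        p = i / (grid + 1)
        ll = sum(log(band_prob(b, p)) for b in observed_bands)
        if ll > best_ll:
            best_p, best_ll = p, ll
    return best_p
```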

300hours, may we have a peek at some of your data?

Thanks palisoc_xb.

I like your 38.10% / 66.67% / 85.71% score breakdown.