Score reporting is among the most challenging aspects of test development facing testing agencies today. It is no longer enough to have a psychometrically sound instrument that provides a valid and reliable measure of student proficiency, nor is it satisfactory to summarily label test performance with a single scale score. Stakeholders, including the examinees themselves, want context for the scores test takers receive and increasingly seek information that connects those scores back to the purpose of the test and what it purports to measure (Ryan, 2006). Context can mean many things, depending on the purpose of the test and the intended uses of the data, but it includes (and is not limited to) comparisons to and between reference groups, diagnostic performance data at the subdomain or even item level, narrative descriptions of strengths and weaknesses, and performance-level descriptions that elaborate on what examinees at different proficiency levels know and can do.