Test identification

Name of test Renaissance Star Maths
Version 2nd Edition
Previous version(s) 1st Edition
Subjects Maths
Summary A computer-adaptive assessment that uses item calibration and psychometric modelling to adjust dynamically to each child's responses. The test can be taken at any time during the year and as often as results are required.

Assessment screening

Subscales Single test comprising 24 adaptively selected questions: the first 8 on numeration concepts, the next 8 on computation processes, and the final 8 drawn from the remaining six areas.
Authors Dr Damian W Betebenner
Publisher Renaissance Learning
Test source http://www.renlearn.co.uk/star-maths/
Guidelines available? Yes
Norm-referenced scores? Yes
Age range 6-8 years
Key Stage(s) applicable to KS1
UK standardisation sample Yes
Publication date 2019
Re-norming date n/a

Eligibility

Validity measures available? Yes
Reliability measures available? Yes
Reason for exclusion from shortlist n/a (shortlisted)

Evaluation and Appraisal

Additional information about what the test measures Measures maths attainment including Numeration Concepts, Computation Processes, Word Problems, Approximation, Data Analysis and Statistics, Shape and Space, Measurement and Algebra.
Are additional versions available? This is the 2nd Edition.
Can subtests be administered in isolation? n/a
Administration group size small group, whole class
Administration duration 11-30 minutes (not time-limited; older participants tend to take slightly longer than younger ones)
Description of materials needed to administer test computer
Any special testing conditions? no

Response format

Response mode Electronic
What device is required computer and keyboard or mouse
Question format multiple choice
Progress through questions adaptive
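
As an illustration of what "adaptive" progression means in practice: a computer-adaptive test re-estimates the pupil's ability after each response and selects the next item to match it. Star Maths' actual item-selection algorithm is proprietary and not described here; the Python sketch below is a generic Rasch-based illustration, and all names in it (select_next_item, update_ability) are invented for the example.

import math

def rasch_probability(ability, difficulty):
    # Probability of a correct response under the one-parameter (Rasch) model
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def select_next_item(item_bank, administered, ability):
    # Choose the unused item whose difficulty is closest to the current ability estimate
    unused = [item for item in item_bank if item["id"] not in administered]
    return min(unused, key=lambda item: abs(item["difficulty"] - ability))

def update_ability(ability, difficulty, correct, step=0.5):
    # Crude update: move the estimate towards (or away from) the item difficulty.
    # Operational adaptive tests use maximum-likelihood or Bayesian estimation instead.
    residual = (1.0 if correct else 0.0) - rasch_probability(ability, difficulty)
    return ability + step * residual

A 24-item administration would simply loop: select an item, record the response, update the ability estimate, and stop after 24 items.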

Assessor requirements

Is any prior knowledge/training/professional accreditation required for administration? no
Is administration scripted? Yes

Scoring

Description of materials needed to score test automatic scoring
Types and range of available scores Scaled scores (Rasch ability scale, 0-1400); criterion-referenced scores for numeration and computation; standardised score; percentile rank; student growth percentile (a measure of how much a student has changed from one Star test to the next relative to other students with similar starting scores). Only the student growth percentile is based on US norms; the other scores are based on UK norms. (The relationship between standardised scores and percentile ranks is illustrated in the sketch at the end of this section.)
Score transformation for standard score age standardised
Age bands used for norming 1 month
Scoring procedures computer scoring with direct entry by test taker/computer scoring with manual entry of responses from paper form/simple manual scoring key – clerical skills required
Automatised norming computerised
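
To illustrate how the age-standardised score and the percentile rank listed above relate to one another: assuming the conventional standardised-score scale with mean 100 and SD 15 (an assumption for the example; the exact parameters are documented in the technical manual), the percentile rank is the percentage of the norming sample expected to score at or below a given standardised score. A minimal Python sketch:

from statistics import NormalDist

def percentile_rank(standardised_score, mean=100.0, sd=15.0):
    # Percentage of the norm group expected to score at or below this standardised score
    return round(100 * NormalDist(mu=mean, sigma=sd).cdf(standardised_score))

print(percentile_rank(100))  # 50 -- an average score sits at the 50th percentile
print(percentile_rank(115))  # 84 -- one standard deviation above the mean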

Construct Validity

Does it adequately measure literacy, mathematics or science?
Does it reflect the multidimensionality of the subject? Generic Maths
Construct validity comments (and reference for source) The technical manual indicates strong content (face) validity: constructs are based on the UK national curriculum, TIMSS data, US curricular documents, etc. Item Response Theory was used during development, and items that do not correlate well with the overall score are excluded. Structural validity is indexed through exploratory and confirmatory factor analyses, which indicate excellent fit for a single scale for years 2-3, years 4-9 and years 10-13.

Criterion Validity

Does test performance adequately correlate with later, current or past performance?
Summarise available comparisons Criterion validity is supported by correlations between Star Maths and Progress in Maths for years 2 to 9, with 200-400 participants in each age group. Correlations ranged from 0.58 in year 2 to 0.77 in year 6; all were above 0.7 except for the youngest age groups (years 2 and 3). Correlations with Key Stage 2 maths tests for year 6 participants were 0.83 and 0.84. In a sample of 815 pupils, Star Maths predicted which children would reach the expected level on their Key Stage 2 maths test with 89% accuracy; sensitivity was 0.89 and specificity was 0.90. Several US studies on predicting readiness for college show similar levels of accuracy, and many studies examining correlations with US tests in US samples report very similar correlations.
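
For readers unfamiliar with these classification statistics, the sketch below shows how sensitivity, specificity and overall accuracy are computed from confusion-matrix counts. The counts are invented purely so that the arithmetic lands near the figures quoted above; they are not the actual counts from the 815-pupil study.

def classification_metrics(tp, fn, tn, fp):
    # tp/fn: pupils who did reach the expected level, predicted correctly/incorrectly
    # tn/fp: pupils who did not reach the expected level, predicted correctly/incorrectly
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Invented counts for 815 pupils: sensitivity 0.89, specificity ~0.90, accuracy ~0.89
print(classification_metrics(tp=445, fn=55, tn=284, fp=31))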

Reliability

Is test performance reliable?
Summarise available comparisons The technical manual reports multiple measures of reliability with very large UK samples. Internal (generic) reliability was calculated using conditional standard error of measurement statistics; estimates ranged from 0.87 for year 1 to 0.94 for years 10, 11 and 13. Split-half reliability estimates on the same sample ranged from 0.88 for year 1 to 0.94/0.95 for years 10, 11, 12 and 13. Correlations between two administrations of the test (note that the questions differ each time, so this reflects equivalence reliability more than test-retest reliability) taken between 59 and 114 days apart were: year 1: 0.65; year 2: 0.75; year 3: 0.75; year 4: 0.80; year 5: 0.82; year 6: 0.83; year 7: 0.83; year 8: 0.85; year 9: 0.85; year 10: 0.83; year 11: 0.81; year 12: 0.84; year 13: 0.74; total: 0.90. IRT calibration was carried out with US children stratified to approximate a nationally representative sample; younger children took 36-item tests, older children took 46-item tests, and 2,471 items were trialled. Items with an item-total correlation below .30 were discarded, as were items that were too easy or too difficult, items that did not fit a Rasch curve, and items where a distractor showed positive discrimination. A standard error of measurement is calculated for each participant; these are smaller for participants in the middle of the distribution and larger for those at the extremes, with an average of 36 on a scaled-score range of 0 to 1400.
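
For context on the figures above, two of the classical formulas involved: a split-half estimate correlates scores on the two halves of a test and steps the result up to full test length with the Spearman-Brown correction, and a (non-conditional) standard error of measurement can be derived from a reliability coefficient and the score standard deviation. The Python sketch below uses these textbook formulas with invented inputs; the manual's conditional SEMs are computed per pupil from the IRT model rather than this way, and the SD of 140 is an assumption chosen only so the output is of the same order as the reported average SEM of 36.

import math

def spearman_brown(half_test_correlation):
    # Step a half-test correlation up to an estimate for the full-length test
    r = half_test_correlation
    return 2 * r / (1 + r)

def standard_error_of_measurement(score_sd, reliability):
    # Classical SEM: expected spread of observed scores around a pupil's true score
    return score_sd * math.sqrt(1 - reliability)

print(round(spearman_brown(0.80), 2))                      # 0.89
print(round(standard_error_of_measurement(140, 0.93), 1))  # 37.0 scaled-score points (illustrative inputs)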

Is the norm-derived population appropriate and free from bias?

Is population appropriate and free from bias? No
If any biases are noted in sampling, these will be indicated here. All test takers were assessed at the start of the school year (August to December). Data were collected from schools that chose to use the test, resulting in an over-representation of disadvantaged schools and schools in the South East. However, it is a large sample.

Sources

Sources Renaissance. (2019). Star Assessments for Maths: Technical Manual. London, UK: Renaissance Learning, Inc.
Sewell, J., Sainsbury, M., Pyle, K., Keogh, N., & Styles, B. (2007). Renaissance Learning Equating Study. Report.