Renaissance Star Maths

About the measure

Version (current version/edition number)

2nd Edition

Previous version(s) (name and acronym of previous/original version(s) of the test, if applicable)

1st Edition

Subject

Maths

Assessment screening

Subscales (list of subscales, if applicable)

Single test comprising 24 adaptively selected questions: the first 8 on numeration concepts, the next 8 on computation processes, and the final 8 on the other 6 areas.

Publisher

Renaissance Star Assessments

Test source

http://www.renlearn.co.uk/star-maths/

Guidelines available? (administration guidelines available to the review team)

Yes

Norm-referenced scores

Yes

Age range (specific population and age range that the publisher states the test is intended/suitable for)

6 – 8 years

Key Stage

Key Stage 1

UK standardisation sample

Yes

Publication date

2019

Re-norming date

N/a

Eligibility

Validity measures available?

Yes

Reliability measures available?

Yes

Note whether shortlisted, and reasons why not if relevant

Shortlisted

Administration format

Additional information about what the test measures

Measures maths attainment including numeration concepts, computation processes, word problems, approximation, data analysis and statistics, shape and space, measurement and algebra.

Are additional versions available?

This is the 2nd Edition.

Can subtests be administered in isolation?

N/a

Administration Group Size

Small group, Whole class

Administration duration

11 – 30 minutes (not time limited; older participants tend to take slightly longer than younger ones).

Description of materials needed to administer test

Computer

Any special testing conditions?

Response format

Response mode

Electronic

What device is required?

Computer and keyboard or mouse

Question format

Multiple Choice

Progress through questions

Adaptive

Assessor requirements

Is any prior knowledge/training/profession accreditation required for administration?

Is administration scripted?

N/a

Assessor requirements

Description of materials needed to score test

Automatic scoring

Types and range of available scores

Scaled scores (Rasch ability scale, 0 – 1400); criterion-referenced scores for numeration and computation; standardised score; percentile rank; student growth percentile (a measure of how a student's score changed from one Star test to the next, relative to other students with similar starting scores). Only the student growth percentile is based on US norms; the other scores are based on UK norms.

Score transformation for standard score

Age standardised

Age bands used for norming

1 month

Scoring procedures

Computer scoring with direct entry by test taker/computer scoring with manual entry of responses from paper form/simple manual scoring key — clerical skills required.

Automatised norming

Computerised

Construct Validity

Rating Construct

Does it reflect the multidimensionality of the subject?

Generic maths

Construct validity comments (and reference for source)

The technical manual indicates strong face validity (content validity): constructs are based on the UK national curriculum, TIMSS data, US curricular documents and similar sources. Item Response Theory was used during development, and items that do not correlate well are excluded. Structural validity is indexed through exploratory and confirmatory factor analyses, which indicate excellent fit with a single scale for Years 2 and 3, Years 4 – 9 and Years 10 – 13.

Criterion Validity

Rating Criterion

Summarise available comparisons

Criterion validity is supported by correlations between Star Maths and progress in maths for Years 2 to 9, with 200 – 400 participants in each age group. Correlations ranged from 0.58 in Year 2 to 0.77 in Year 6. All were above 0.7 except for the youngest age groups (Years 2 and 3). Correlations with Key Stage 2 maths tests for Year 6 participants were 0.83 and 0.84. In a sample of 815 pupils, Star Maths predicted which children would reach expected levels on their Key Stage 2 maths test with 89% accuracy (sensitivity 0.89, specificity 0.90). Several US studies on predicting readiness for college show similar levels of accuracy. Many studies of correlations with US tests in US samples show very similar levels of correlation.
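The accuracy, sensitivity and specificity figures quoted above are all derived from the same four-cell table of predicted versus actual outcomes. The sketch below shows how the three statistics relate to each other. The counts are hypothetical and chosen only for illustration (they are not taken from the technical manual), and treating "reached the expected level" as the positive class is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical screening outcomes; NOT data from the technical manual."""
    true_pos: int   # predicted to reach the expected level, and did
    false_neg: int  # predicted not to reach it, but did
    true_neg: int   # predicted not to reach it, and did not
    false_pos: int  # predicted to reach it, but did not


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, sensitivity and specificity for a 2x2 table."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return {
        # Share of all pupils whose outcome was predicted correctly.
        "accuracy": (c.true_pos + c.true_neg) / total,
        # Share of pupils who reached the level and were predicted to.
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        # Share of pupils who did not reach the level and were predicted not to.
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
    }


# Illustrative counts only (assumption), not the 815-pupil sample.
example = ConfusionCounts(true_pos=80, false_neg=10, true_neg=90, false_pos=10)
metrics = classification_metrics(example)
```

Accuracy depends on how many pupils fall into each outcome group, so it can sit anywhere between the sensitivity and specificity values; the three figures are best read together.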

Reliability

Rating Reliability

Summarise available comparisons

The technical manual reports multiple measures of reliability with very large UK samples. Internal (generic) reliability was calculated using conditional standard error of measurement statistics. Estimates ranged from 0.87 for Year 1 to 0.94 for Years 10, 11 and 13. Split-half reliability estimates on the same sample ranged from 0.88 for Year 1 children to 0.94/0.95 for Years 10, 11, 12 and 13.

Correlations were also reported between two different forms of the test taken between 59 and 114 days apart (questions are different every time, so this reflects equivalence reliability more than test-retest reliability). Year 1: 0.65; Year 2: 0.75; Year 3: 0.75; Year 4: 0.80; Year 5: 0.82; Year 6: 0.83; Year 7: 0.83; Year 8: 0.85; Year 9: 0.85; Year 10: 0.83; Year 11: 0.81; Year 12: 0.84; Year 13: 0.74. Total: 0.90.

IRT testing was carried out with children in the US, in a sample stratified to be close to nationally representative. Younger children took 36-item tests and older children took 46-item tests. In total, 2471 items were tested. Items were discarded if their correlation with the overall score was below 0.30, if they were too easy or too difficult, if they did not show a Rasch curve, or if a distracter showed positive discrimination.

Standard error of measurement is calculated for each participant. These errors are smaller for participants in the middle of the distribution and larger for those at the edges. The average is 36, where the scaled score varies from 0 to 1400.
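To make the standard error of measurement concrete, the sketch below turns a scaled score into an approximate 95% band and computes a reliability estimate from per-participant errors. It rests on assumptions: the function names are hypothetical, the average SEM of 36 is applied to a single score (the actual SEM is conditional on the participant), a normal error model is assumed, and the reliability formula (one minus mean error variance over observed score variance) is one common way of deriving reliability from conditional SEMs, not necessarily the formula used in the technical manual.

```python
from statistics import fmean, pvariance


def score_band(scaled_score, sem=36.0, z=1.96, floor=0.0, ceiling=1400.0):
    """Approximate confidence band for one scaled score.

    sem=36.0 is the average SEM reported above; using it for an individual
    is a simplification (assumption), since the real SEM varies by participant.
    The band is clipped to the 0 to 1400 scaled-score range.
    """
    half_width = z * sem
    return (
        max(floor, scaled_score - half_width),
        min(ceiling, scaled_score + half_width),
    )


def generic_reliability(scores, sems):
    """Reliability as 1 - (mean squared SEM) / (variance of observed scores).

    Assumed formula for illustration; requires scores that are not all equal.
    """
    error_variance = fmean(s * s for s in sems)
    return 1.0 - error_variance / pvariance(scores)


# Hypothetical inputs, for illustration only.
low, high = score_band(500)
reliability = generic_reliability([400, 500, 600], [30, 30, 30])
```

With the average SEM, a scaled score of 500 gives a band of roughly 429 to 571, which shows how much of the 0 to 1400 scale a single test result can plausibly span.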

Is the norm-derived population appropriate and free from bias?

Does the standardisation sample represent the target/general population well?

If any biases are noted in sampling, these will be indicated here.

All test takers were at the start of the school year (August to December). Data were collected from schools that chose to use the test, resulting in an oversampling from disadvantaged schools and schools in the South East. However, it is a large sample.

Sources

Renaissance (2019). Star Assessments for Maths: Technical Manual. London, UK: Renaissance Learning, Inc.

Sewell, J., Sainsbury, M., Pyle, K., Keogh, N., & Styles, B. (2007). Renaissance Learning Equating Study. Report.
