Validation of the NumberSenseMMR® Framework

An analysis of the consistency of the Dynamo Maths framework was conducted independently by the University of Oxford, led by Dr. Ann Dowker (2016).

The study used the assessment data collected from 3465 students in 368 schools across England, Wales, Scotland and Europe (English-speaking schools using the UK curriculum).

The correlation analysis was carried out on the assessment data after scores (Visual Numbers and Counting through to Multiplication) were added together and grouped into Number Meaning, Number Magnitude and Number Relationship components.

The data was analysed from ages 7 through to 11, to avoid the influence of too many small and diverse age groups.

The analysis showed that all the components in the NumberSenseMMR® framework correlated significantly with one another (p < 0.001).
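As a rough illustration of the pairwise correlation step, the sketch below computes a correlation coefficient between two component score lists in plain Python. It assumes Pearson's product-moment correlation, which the report does not specify. The function name pearson_r and the five pupils' scores are hypothetical and are not taken from the study data.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical summed component scores for five pupils (not study data).
number_meaning = [18, 19, 17, 20, 16]
number_magnitude = [12, 14, 10, 15, 9]

r = pearson_r(number_meaning, number_magnitude)
print(f"r = {r:.3f}")
```

Repeating the call for each pair of components (Meaning and Magnitude, Meaning and Relationship, Magnitude and Relationship) gives the full set of pairwise correlations; a significance test on each r would be a separate step.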

The multiple regression analysis in Table 1 below showed that Number Magnitude was a highly significant independent predictor of Number Relationship (beta = 0.313; t = 12.92; p < 0.001).

Similarly, Number Meaning was a significant independent predictor (beta = 0.091; t = 3.784; p < 0.001), and Age was also significant (beta = 0.48; t = 2.46; p < 0.014).

Arguably, Age could be excluded from the multiple regression analysis, as the data did not contain a truly continuous age variable.

Number Relationship    Standardised Coefficients    t        p
Number Magnitude       0.313                        12.92    p < 0.001
Number Meaning         0.091                        3.784    p < 0.001
Age                    0.48                         2.46     p < 0.014

Table 1. Multiple Regression Analysis
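To make the term "standardised coefficient" concrete, here is a minimal sketch, assuming ordinary least-squares regression on z-scored variables. In that setting the standardised betas solve the linear system R_xx * beta = r_xy, where R_xx holds the correlations among the predictors and r_xy the correlations of each predictor with the outcome. All function names and the eight pupils' scores are hypothetical; the study's own software and data are not reproduced here.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardised_betas(predictors, outcome):
    """Standardised regression coefficients, one per predictor."""
    r_xx = [[pearson_r(a, b) for b in predictors] for a in predictors]
    r_xy = [pearson_r(x, outcome) for x in predictors]
    return solve_linear(r_xx, r_xy)


# Hypothetical scores for eight pupils (not study data).
magnitude = [12, 14, 10, 15, 9, 13, 11, 16]
meaning = [18, 19, 17, 20, 18, 17, 19, 20]
age = [7, 8, 7, 10, 8, 9, 11, 10]
relationship = [9, 12, 7, 15, 6, 11, 10, 17]

betas = standardised_betas([magnitude, meaning, age], relationship)
for name, beta in zip(["Number Magnitude", "Number Meaning", "Age"], betas):
    print(f"{name}: beta = {beta:.3f}")
```

Because every variable is put on the same scale, the betas can be compared directly across predictors, which is how Table 1 is read. The t and p values in the table would need the residual variance as well and are not computed in this sketch.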

The ANOVA in Table 2 is a between-participants analysis of variance with Age as the factor and the Number Meaning, Number Magnitude and Number Relationship components as the dependent variables.
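For readers unfamiliar with the F statistic, the sketch below shows how a one-way ANOVA with a single grouping factor can be computed by hand. It assumes the standard between-groups and within-groups sums of squares; the function name one_way_anova and the three small groups of scores are hypothetical and are not study data.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three age groups (not study data).
scores_by_age = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova(scores_by_age)
print(f"F({df_between},{df_within}) = {f_value:.2f}")
```

The two degrees of freedom are the number of groups minus one and the number of pupils minus the number of groups. A reported F(4,2381) is therefore consistent with five age groups and 2386 pupils in the analysis.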

The analysis showed that there was a highly significant effect of Age on Number Meaning score (F(4,2381) = 13.26; p = 0.001). The mean scores were 18.65 (s.d. 2.113) for 7-year-olds; 18.99 (s.d. 1.76) for 8-year-olds; 19.3 (s.d. 1.39) for 9-year-olds; 14.038 (s.d. 5.198) for 10-year-olds; and 14.54 (s.d. 1.66) for 11-year-olds.

The Tamhane's T2 post hoc tests showed that there were no significant differences between 7- and 8-year-olds, 8- and 9-year-olds, 9- and 10-year-olds or 9- and 11-year-olds or 10- and 11-year-olds; but there were highly significant differences between 7- and 9-year-olds, 7- and 10-year-olds, 7- and 11-year-olds, 8- and 10-year-olds, 9- and 10-year-olds and 9- and 11-year-olds. All significant differences were in the direction of older pupils scoring higher.

The analysis showed that there was a highly significant effect of Age on Number Magnitude score (F(4,2381) = 24.467; p = 0.001). The mean scores were 12.18 (s.d. 4.74) for 7-year-olds; 11.78 (s.d. 4.75) for 8-year-olds; 12.87 (s.d. 5.177) for 9-year-olds; 14.038 (s.d. 6.63) for 10-year-olds; and 14.54 (s.d. 5.48) for 11-year-olds.

The Tamhane's T2 post hoc tests showed that there were no significant differences between 7- and 8-year-olds, 7- and 9-year-olds, 8- and 9-year-olds or 10- and 11-year-olds; but there were highly significant differences between 7- and 10-year-olds, 7- and 11-year-olds, 8- and 9-year-olds, 8- and 10-year-olds, 8- and 11-year-olds, 9- and 10-year-olds and 9- and 11-year-olds. All significant differences were in the direction of older pupils scoring higher.

The analysis showed that there was a highly significant effect of Age on Number Relationship score (F(4,2381) = 12.86; p = 0.001). The mean scores were 9.54 (s.d. 6.976) for 7-year-olds; 9.776 (s.d. 7.15) for 8-year-olds; 10.007 (s.d. 6.74) for 9-year-olds; 19.41 (s.d. 1.35) for 10-year-olds; and 19.28 (s.d. 7.1) for 11-year-olds.

Factor                 F                      p            Mean      s.d.
Number Meaning         F(4,2381) = 13.26      p = 0.001
  Age 7                                                    18.65     2.113
  Age 8                                                    18.99     1.76
  Age 9                                                    19.3      1.39
  Age 10                                                   14.038    5.198
  Age 11                                                   14.54     1.66
Number Magnitude       F(4,2381) = 24.467     p = 0.001
  Age 7                                                    12.18     4.74
  Age 8                                                    11.78     4.75
  Age 9                                                    12.87     5.177
  Age 10                                                   14.038    6.63
  Age 11                                                   14.54     5.48
Number Relationship    F(4,2381) = 12.86      p = 0.001
  Age 7                                                    9.54      6.976
  Age 8                                                    9.776     7.15
  Age 9                                                    10.007    6.74
  Age 10                                                   19.41     1.35
  Age 11                                                   19.28     7.1

Table 2. ANOVA Variance analysis with Age and dependent variables

The Tamhane's T2 post hoc tests showed that there were no significant differences between 7- and 8-year-olds, 7- and 9-year-olds, 8- and 9-year-olds or 10- and 11-year-olds; but there were highly significant differences between 7- and 10-year-olds, 8- and 11-year-olds, 9- and 10-year-olds and 9- and 11-year-olds. All significant differences were in the direction of older pupils scoring higher.

This shows that there is a large difference in Number Meaning between 7-year-olds and older children, and that there may have been a tendency towards a ceiling effect at age 9. Similarly, for Number Magnitude and Number Relationship, the biggest difference was between the 7-to-9 and the 10-to-11 age groups.

Puffin Maths Assessment Standardisation

Standardised Scores

What are Standardised Scores?
Standardised scores allow comparison of an individual’s performance with well-defined reference groups. The normative scores are sometimes referred to as standardised scores.

The standardised score indicates the degree to which an individual’s score deviates from the average for people of the same age.

The scale is based on the ‘normal’ distribution of scores that would be expected within the population. It is calculated so that the overall mean (average) standardised score is 100 and the standard deviation is 15, which means that about 68% of people will score between 85 and 115.
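The arithmetic behind this scale can be sketched as follows, assuming the conventional linear conversion from a raw score to a standardised score via the age group's mean and standard deviation. The actual Puffin conversion procedure is not described here and may differ. The function name and the example raw score, age-group mean and age-group standard deviation are hypothetical.

```python
from statistics import NormalDist

SCALE_MEAN = 100  # mean of the standardised scale
SCALE_SD = 15     # standard deviation of the standardised scale


def standardised_score(raw_score, age_group_mean, age_group_sd):
    """Convert a raw score to the 100/15 scale using same-age norms."""
    z = (raw_score - age_group_mean) / age_group_sd  # deviation in SD units
    return SCALE_MEAN + SCALE_SD * z


# Share of a normal population scoring within one SD of the mean (85 to 115).
scale = NormalDist(mu=SCALE_MEAN, sigma=SCALE_SD)
share_within_one_sd = scale.cdf(115) - scale.cdf(85)
print(f"{share_within_one_sd:.1%}")  # 68.3%

# Hypothetical pupil: raw score 12, age-group mean 10, age-group s.d. 4.
print(standardised_score(12, 10, 4))  # 107.5
```

A pupil scoring exactly at the age-group mean receives 100, and each standard deviation above or below the mean moves the standardised score by 15 points.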

Test Construction

The Puffin Assessment has 647 test items that are derived from dyscalculia research and the UK national curriculum, taking into account the following areas:

  • Progression of strands.
  • Progression of questions within each strand.
  • Questions within each strand meet the strand’s objective.
  • An Individual Support Plan with signposts to the Puffin Intervention.
  • Progression of questions within the Meaning, Magnitude and Relationship areas.
  • Dynamic generation of questions based on responses and progression across the strands.
  • Random generation of questions within each strand.
  • Universality of language and operational symbols.
  • Questions are developmentally appropriate for the baselined ages.
  • Universality in the use of the illustrations.
  • Universality in the use of the language.
  • Measurement of the response time for each question.
  • Questions reviewed for gender and cultural balance.
  • The use of technology to ensure that when the questions are read, there is a common set of instructions for all students.
  • Accessibility settings so that the necessary screen adjustments can take place.
  • The provision of Support Tools:
      • Student Profile Questionnaire to gain a snapshot of the pupil’s current development and functioning so that the necessary adjustments can be offered during the assessment.
      • Working-out Notes for students to show their thinking on paper.
      • Observation Notes for the test administrator to observe the pupil’s methods, approaches and thinking during the assessment.

The data extract was confirmed as comprising results from online tests administered independently at schools by SEN Coordinators, Maths Coordinators, class teachers and Higher-Level Teaching Assistants.

A sample of 3465 students was used for the standardisation of the Puffin Maths Assessment. The sample data represented primary schools and centres covering all four regions – England, Wales, Scotland and Europe.

The sample data was chosen to achieve the widest balance of content, both within the sets and across the sample as a whole, to represent the population.

The whole sample represented collective attainment across the three stages in all of the NumberSenseMMR® framework components, and the component measurements were taken as raw scores.

The data was grouped into seven age sets (Ages 6, 7, 8, 9, 10, 11 and 12+) for the purposes of analysing the mean and standard deviation. The age groups are shown in Table 2 below.

The student results were stratified by the NumberSenseMMR® framework components and were not based on Key Stage results, as this information was not available in the same format.

About the Sample Data

The sample test results data set was extracted for the assessments carried out between the period September 2014 and April 2015. The data sets for analysis represented 3465 students in 368 schools in England, Wales, Scotland and Europe (English-speaking schools using the UK curriculum).

The distribution of number of schools and number of students to represent the population is shown in Table 1, and the percentage distribution of the sample data by regions is shown in Diagram 1.

A total of 5 student data sets that represented incomplete assessments or expired time were removed from the extracted sample data for analysis.

Region      No. of students    %      No. of schools    %      Population    %
North       521                15     54                15     12570         17
Midlands    525                15     62                17     15384         21
South       2038               59     200               54     35515         49
Wales       91                 3      18                5      3145          4
Europe      141                4      20                5      1811          3
Scotland    149                4      14                4      3642          5
Total:      3465               100    368               100    72067         100

Table 1. Number of schools and students in the standardised sample
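The percentage columns in Table 1 can be reproduced from the counts with a short calculation. The sketch below uses the student and school counts from the table and assumes rounding to the nearest whole percent; the function name percentage_shares is hypothetical.

```python
# Counts per region, copied from Table 1.
students = {"North": 521, "Midlands": 525, "South": 2038,
            "Wales": 91, "Europe": 141, "Scotland": 149}
schools = {"North": 54, "Midlands": 62, "South": 200,
           "Wales": 18, "Europe": 20, "Scotland": 14}


def percentage_shares(counts):
    """Each region's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {region: round(100 * n / total) for region, n in counts.items()}


student_shares = percentage_shares(students)
school_shares = percentage_shares(schools)
for region in students:
    print(f"{region}: {student_shares[region]}% of students, "
          f"{school_shares[region]}% of schools")
```

Because each share is rounded separately, a column of whole percentages will not always sum to exactly 100.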

Diagram 1. Distribution of schools and students by regions in the standardised sample

The sample data represented a distribution of students between 6 and 15 years old. The equivalent of school curricular measures used in England, Scotland and Europe is shown in Table 2.

The sample comprised 52% male and 48% female students. The detailed split by age is shown in Diagram 2 below.

Diagram 2. Profile of male and female students in the standardised sample

Age              England         Europe British    Scotland        Male    Female    No. of students    No. of schools
6 – 7 years      Year 1          Year 1            P1              270     248       518                16
7 – 8 years      Year 2          Year 2            P2              369     321       690                36
8 – 9 years      Year 3          Year 3            P3              364     344       708                45
9 – 10 years     Year 4          Year 4            P4              281     319       600                76
10 – 11 years    Year 5          Year 5            P5              197     207       404                96
11 – 12 years    Year 6          Year 6            P6              119     83        202                49
12 – 13 years    Intervention    Year 7            Intervention    61      49        110                9
13 + years       Intervention    Year 8            Intervention    137     96        233                41
Total:                                                             1798    1667      3465               368

Table 2. Number of students and schools participating in field testing

Intervention Validation


There have been many attempts to raise the performance of children with low numeracy skills, although not specifically for dyscalculia. In the United States, for example, evidence-based approaches have focused on children from deprived backgrounds, usually of low socioeconomic status (C. Mussolin et al. 2009; R. Price, D. Ansari, Curr. Biol. 2007).

The 2003 Primary National Strategy in the United Kingdom gave special attention to children with low numeracy skills by:

(i) Diagnosing each child’s conceptual gaps in understanding
(ii) Giving the child more individual support in working through visual, verbal, and physical activities designed to bridge each gap.

Unfortunately, there is little quantitative evaluation of the effectiveness of these strategies: it has not been possible to tell whether identifying and targeting an individual’s conceptual gaps with a more individualised version of the same teaching is effective. A further problem is that these interventions are effective when there has been specialist training for teaching assistants, but not all schools can provide this (A. Dowker, 2009).

Standardised approaches depend on curriculum-based definitions of typical arithmetical development, and how children with low numeracy differ from the typical trajectory.

In contrast, neuroscience research suggests that, rather than addressing isolated conceptual gaps, remediation should build the foundational number concepts first. It offers a clear cognitive target for assessment and intervention that is largely independent of the learners’ social and educational circumstances. In the assessment of individual cognitive capacities, set enumeration and comparison can supplement performance on curriculum-based standardised tests of arithmetic to differentiate dyscalculia from other causes of low numeracy (B. Butterworth, D. Laurillard, 2010; K. Landerl, Child Psychol. 2009).

B. Butterworth and D. Laurillard (Science, 2011) show that intervention that strengthens the meaningfulness of numbers, especially the link between maths facts and their component meanings, is crucial. Typical retrieval of simple arithmetical facts from memory elicits activation of the numerical value of the component numbers.

Without specialized intervention, most dyscalculic learners struggle with basic arithmetic in secondary school (R. S. Shalev Child Neurol 2005).

Effective early intervention may help to reduce the later impact of poor numeracy skills, as it does in dyslexia (Goswami 2006).

Although this approach is very expensive, it promises to repay 12 to 19 times the investment (J. Gross, Every Child a Chance Trust 2009).

A further study was conducted to confirm the effectiveness of the Dynamo Intervention Program. The study took a sample of 50 pupils aged between 6 and 15 who were using Dynamo Intervention.

These pupils had taken the first assessment at the beginning of the Spring school term 2015 and were independently provided with 12 weeks of intervention support followed by a second assessment at the end of the school term.

An analysis was carried out to compare the first and second assessments. The analysis showed that the percentage improvement for the combined MMR stages was 11.67% for the intervention period of 12 weeks.
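The report does not state how the percentage improvement was calculated. A minimal sketch, assuming it is the change in score between the first and second assessment expressed as a percentage of the first, looks like this. The function name and all scores are hypothetical and are not study data.

```python
from statistics import mean


def percentage_improvement(first_score, second_score):
    """Change from first to second score as a percentage of the first."""
    return (second_score - first_score) / first_score * 100


# Hypothetical single pupil: 40 on the first assessment, 46 on the second.
print(f"{percentage_improvement(40, 46):.2f}%")

# Hypothetical cohort: compare the mean scores of the two assessments.
first_assessment = [40, 35, 50]
second_assessment = [46, 42, 52]
cohort_change = percentage_improvement(mean(first_assessment),
                                       mean(second_assessment))
print(f"{cohort_change:.2f}%")
```

Whether the figure is computed per pupil and then averaged, or from the cohort means as in the second example, changes the result, so the choice needs to be stated alongside any reported percentage.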

Further analysis showed that the improvement in the Magnitude and Relationship stages was 21.44% and, for the Relationship stage alone, 31.94%.

This shows that a small improvement made in the Number Meaning and Number Magnitude stages brought a large improvement in the Number Relationship stages (Maths Foundation).

This analysis further provides confidence in the reliability of the NumberSenseMMR® framework to support the findings from neuroscience research.


We have enjoyed using Puffin and found it really easy to access. The results of the assessment were very straightforward to interpret and there was diagnostic information and teaching ideas. A really good and cost-effective assessment tool. Thanks. Gill Mitchell, East Bergholt Primary School

I have been very impressed by the quality of Puffin Maths Lesson Plans. This will help our teachers to deliver quality well targeted interventions based on the gaps identified by the assessment tools. Thomas Vandewiele, Northway Primary and Nursery School

Puffin Maths has been excellent with our SEN pupils. In one year a pupil with SEN, including Dyscalculia and Dyslexia, has doubled his maths results.  This is down to good differentiated quality teaching in his maths classes but also down to the use of the Puffin Maths Programme. We have rolled the programme out to other children at the school and it is proving popular with pupils, teachers and their parents. Yvette Unsworth, Head of Learning Support and Extension

We really like the programme.  The children enjoy it. Staff find it easy to use and we have seen a good impact, particularly with our pupils in Year 2 and Year 3. Ellen Smith, Apley Wood Primary School

The outcomes are excellent. We cater for pupils aged 3 to 19. All of them have the autism spectrum condition or severe learning difficulties. We wanted to improve standards in their number work. Since we bought Dynamo last September, the 66 students who are using it are making progress. Some of them are racing through it! It targets specific areas of learning and provides teachers with a structured plan of materials and activities. I wish I’d found it earlier - we use it every day and it makes such a difference. Corinne Owen, Head of Education, The Beacon (Foxwood School)

Puffin Maths illustrates how technology can be used as a great tool to support children with mathematics learning difficulties since it can be accurate and specific in its assessment and also provide fruitful and enjoyable learning experiences to these learners who can in turn make much desired progress.  Dr Ann Dowker, University of Oxford & Esmeralda Zerafa, Doctorate Researcher

One of the great things about Puffin Maths is that it indicates whether pupils simply have gaps in their knowledge, whether they have a developmental delay where maths and number sense are concerned, or whether they have underlying dyscalculic tendencies. Normally this kind of insight is very difficult to obtain unless you bring in a dyscalculia specialist. After running the program for three months, we reassessed the pupils. All children improved significantly and over half the group doubled their scores. Puffin Maths is both teacher- and child-friendly and has worked well in our school. I cannot recommend it highly enough. Dawn Bradshaw, Huntingtower Community Primary Academy

The program has exceeded our expectations. The children have come on in leaps and bounds. Jeanette Rimmer, Boldmere School