Learning and reasoning with mathematical symbols


Jeffrey Bye






UCLA Psychology


Postdoctoral Researcher



Patricia Cheng


The concept of an algebraic variable is both important in its own right and foundational for higher levels of mathematics, but many students struggle to understand the meaning and purpose of a variable. Common math education practices often fail to help students gain meaningful insights into the interpretation of a variable, its mathematical purpose, or its relevance in solving real-world problems. Such conceptual impoverishment prevents these students from appreciating algebra and from building on its concepts in more advanced math. We have created and begun assessing new educational materials, in the form of online multimedia videos, that encourage and support students’ discovery of the meaning and purpose of a variable, guided by principles from educational psychology and the cognitive psychology of learning.


Our primary objective in this research is to examine the efficacy of a framework we term ‘Purpose-Driven Progressive Formalization’ (P-PF) for teaching students the meaning and purpose of an algebraic variable. We have implemented this framework in interactive multimedia videos, which use educational techniques that stem naturally from an underlying focus on developing students’ conceptual understanding of algebraic variables, in particular their meaning and purpose.

The Experimental condition’s (P-PF) version of these videos implements three primary manipulations: contextual facilitation of intuitive thinking, constructive struggling, and contrast comparisons; these manipulations are combined to gradually introduce students to increasingly abstract representations for unknown quantities, in a way that highlights the purpose of each component of variables’ symbolic form. Students learn through a sequence of story problems written to facilitate their intuitive, contextual thinking. They are encouraged to first attempt each novel problem without any instruction (constructive struggling), generating their own insights into the problems and the concepts and procedures which enable their solution, and increasingly specific ‘context-level’ hints help guide their intuitions if they fail to find the answer. The problems are ordered in a way that facilitates students’ inferences about their mathematical structures, and to this end, students are frequently asked to make contrast comparisons that elucidate differences in both problem structure and the increasingly abstract mathematical representations used in solution procedures. As students progress through the materials, they gradually and actively transition to the use of letters as symbols that can represent unknown numbers, after first using sentence descriptions and then word equations as intermediate representations.

In the Control version of these videos, content and time-on-task are matched as exactly as possible, but the materials differ in structure/schedule: students begin with a traditionally ‘formal’ introduction to variables, and each problem is preceded by a formal demonstration of how to solve the new problem using variable manipulation. Then students are allowed to practice each problem type, and are shown an explanation video if they are incorrect. A third condition consists of Khan Academy videos that cover the same topics as the Experimental and Control conditions, as a baseline measure against which to compare our materials.


Our goal is to assess whether the characteristics of the P-PF framework lead students to develop a deeper conceptual understanding of the interpretation and usefulness of algebraic variables, as well as a more robust, longer-lasting ability to carry out solution procedures for algebraic problems.


We hope that students will develop a more conceptually rich and intuitive understanding of the meaning and purpose of algebraic variables, as well as a more flexible ability to carry out algebraic procedures to solve problems. This experiment will assess the efficacy of our approach and, because the multimedia videos are easy to distribute over the internet, could lead to effective and unique lessons that can be shared with the public.


Experimental results from this research at the UCLA Lab School, along with other school sites, will be analyzed and written up for publication, ideally in an open-access journal which can be shared with the public. Additionally, the research will be submitted for presentation at education and psychology conferences around the country. Finally, we would be happy to share the results from our research with the UCLA Lab School, and would be willing to meet with UCLA Lab School faculty and staff to discuss the project results and implications.




Any student in 6th grade is eligible to participate; ideally, at least 30 students would take part.


The experiment consists of 3 separate sessions (ideally, the first 2 sessions would be consecutive days or at least during the same week, and the third would be approximately 1 month after the second, depending on your scheduling availability). Each session will last approximately 60-75 minutes, though some students are likely to finish sooner. For all three sessions, students will need access to an individual computer or tablet (with internet connection) and headphones (we can provide headphones if necessary).

Students will be randomly assigned a number, which will be used to group them into one of the three conditions (e.g., students 1-10 in Group 1, 11-20 in Group 2, 21-30 in Group 3) and also to protect their identity (i.e., they will use their number, not name, for all materials). For Sessions 1 and 2 (the learning phase), students in each group will be given a different URL to go to. Each URL leads to a section of our lab’s website, which hosts the video materials with an interface for students to enter answers. Both Sessions 1 and 2 take place entirely on the website, but students will be encouraged to take notes and record the math they do on scratch paper, which will be collected at the end of each session (and identified by their random number to match to their online data). Ideally, Session 2 will be held within a week of Session 1, since it is a continuation of the learning from Session 1.
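The number-based assignment described above could be carried out along the following lines. This is only an illustrative sketch, not the procedure actually used in the study: the function name `assign_conditions`, the placeholder student names, the group labels, and the optional `seed` argument are all assumptions introduced for the example. The class list is shuffled, each student receives the next number, and consecutive blocks of numbers map to the three groups.

```python
import random


def assign_conditions(names, groups=("Group 1", "Group 2", "Group 3"), seed=None):
    """Give each student a random number and a group (hypothetical helper).

    Returns a dict mapping name -> (random_number, group_label).
    """
    rng = random.Random(seed)      # seed is optional; it only makes the shuffle repeatable
    shuffled = list(names)         # copy so the caller's list is left untouched
    rng.shuffle(shuffled)          # random order decides who receives which number

    # Block size is the class size divided by the number of groups, rounded up,
    # so that 30 students yield numbers 1-10, 11-20, and 21-30.
    block = -(-len(shuffled) // len(groups))

    roster = {}
    for index, name in enumerate(shuffled):
        number = index + 1                      # numbers start at 1
        roster[name] = (number, groups[index // block])
    return roster


if __name__ == "__main__":
    # Placeholder names; in practice only the teacher would hold this master list.
    class_list = [f"student_{i}" for i in range(1, 31)]
    for name, (number, group) in sorted(assign_conditions(class_list).items(),
                                        key=lambda item: item[1][0]):
        print(number, group, name)
```

In a sketch like this, the name-to-number roster would stay with the teacher, and only the numbers and group labels would appear on the materials.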

Session 3, held approximately 1 month later, is a delayed measure of students’ retention of the material, both the procedures and conceptual information. Most of this session consists of a paper-and-pencil worksheet, which is given in two parts. After completing the first worksheet, which contains most of the assessment, students will watch a few more short videos on the website (all groups watch the same videos, which last a total of about 10 minutes) and then complete the second (shorter) worksheet. This third session is identical for all three groups. Both worksheets consist of standard algebra word problems analogous to those learned in Sessions 1-2 (to assess retention of problem solutions) as well as conceptual items, adapted from various mathematical understanding scales. The end of the second worksheet also contains a short survey to allow students to record their impression of the materials.





Our measurements will assess students’ learning from each session, their retention of solution procedures after a 1-month delay, and their understanding of concepts such as the interpretation of an equals sign and the evaluation of relationships between variables.

Students are assessed based on both immediate and delayed post-tests. The immediate post-tests consist of 3 word problems given at the end of each learning session (Sessions 1 and 2), in order to assess what they learned in the given session. The delayed post-test makes up the bulk of Session 3, and consists of both procedural and conceptual measures. The procedural items are near- and far-transfer analogues of the problems learned during Sessions 1 and 2, while the conceptual items are adapted from various mathematical understanding scales (Falkner, Levi, & Carpenter, 1999; Linchevski & Herscovics, 1996; Collis, 1975; MacGregor & Stacey, 1997; Weinberg et al., 2004).

Additionally, we would like to request anonymized demographic information about the students who participate in the experiment, ideally matched to their random number (but still unidentifiable); these data would be used potentially as covariates, but also as a way to assess whether there are specific subgroups of students who benefit more or less from the materials. This would allow us to better assess our ability to make materials that benefit underserved and underrepresented demographics in math. We would like to be able to use each student’s grade, age, race/ethnicity, gender identity, whether they qualify for free or reduced lunch, the language(s) they speak in their home, and their math level or grade, if these data are available. These data are not required for our research to be carried out, but they would help us greatly to make more informed assessments of the materials.


The randomized experimental design will allow us to make causal inferences about the efficacy of each condition’s material, in particular whether the P-PF approach leads to more robust procedural and conceptual learning than the Control or Khan Academy conditions. The inclusion of the Khan Academy condition is necessary as a baseline against which to assess the Control condition, so that we can vouch for the efficacy of the Control. Thus, any improvement over and above the Control that the P-PF leads to is strongly indicative of an advantage of the structure of the P-PF approach over the more traditional Control.


We believe that the blanket consent at the Lab School will cover our project (this was approved last year), but if necessary, we have an IRB-approved recruitment letter, parent permission form, and child assent form that we can use.


The interactive video data (answers provided on our own web server) will be collected from each subject individually on a computer or tablet device, and will not be identifiable (only their random number will link the data between sessions and to their scratch paper and worksheets). Each student will have individual access to a computer/tablet and their own packet of paper, and will complete the video, worksheet, and survey in the same amount of privacy as they would experience when taking a test in class. The administration and collection of the packets should follow the same privacy procedures that would be followed in giving the class a paper test, and since students’ names will not be recorded on these papers, there will be even less risk.

The materials for the experiment involve math problems to solve (video and worksheet), questions about the problems, survey items, and possible follow-up interview questions about the materials. The consequences of a loss of privacy for a math problem's solution would be no greater than is typically encountered in a classroom if one student manages to see another student's answers to a worksheet or test problem.


There is no deception. We would be happy to debrief / explain the research to participants after the final session has been completed, if the teacher would like us to.


The only identifiable data for this experiment will be each student's name, which will never be directly used on experimental materials; instead, a master list of the names and their random numbers will be kept only during the experiments, and will be accessible only to the teacher so that they can remind students of their random number at the beginning of each session. Names will be kept completely separate from the actual materials and data collected online or on paper. The researchers will only have access to each student's non-identifiable random number. As such, all data collected online and on paper will be unidentifiable by researchers. Unidentifiable data will be saved on our web server and downloaded to personal computers, while unidentifiable scratch paper and worksheets will be kept in Dr. Bye’s office for future reference, after research assistants have independently coded the students’ answers and methods. Additionally, if available, the demographic information requested in the Instruments section above would be completely unidentifiable and kept as a digital file on Dr. Bye’s computer.


Please feel free to contact us if anything is unclear or you have any questions! Thank you!


We ran a previous version of this study last year, and we worked with Sandra Smith and Dr. Enyedy to receive approval. Kevin North was the 6th grade teacher who helped us.


Any teacher at the 6th grade level would be free to consent or not consent to participate in the research. We would only carry out the research in classes whose teachers have consented.


None yet, but any teacher at the 6th grade level can be involved if they wish to help us with the research. Additionally, we may need help from IT to ensure that the computers/tablets are able to access the videos on our servers without any issues.




As described in the Instruments section above, we would like to request the following anonymized demographic data from the UCLA Lab School, ideally matched to the student’s unidentifiable random number: grade, age, race/ethnicity, gender identity, whether they qualify for free or reduced lunch, the language(s) they speak in their home, and their SBAC math scores/levels. These data are not required for the research but would be greatly beneficial for our assessment.


We are not aware of any special requirements, other than students’ individual access to computers/tablets with internet connection and headphones.