Using a large language model to analyse problem solving and simulate student solutions in lower secondary mathematics

Keywords

mathematical problem solving
a priori didactic analysis
large language models
result verification
lower secondary education

How to Cite

Přibyl, J., & Tichá, M. (2026). Using a large language model to analyse problem solving and simulate student solutions in lower secondary mathematics. Scientia in Educatione, 17(1), 63–90. https://doi.org/10.14712/18047106.5220

Abstract

This study examines how a large language model (LLM) can support lower secondary mathematics education, not by measuring performance, but by analysing how complete and didactically useful its solution processes are. The focus is on whether the model explicitly verifies results, which is an important but often neglected phase of mathematical problem solving. Two exploratory experiments were conducted with Gemini 2.5 Pro using 24 lower secondary mathematics tasks presented in Czech. In Experiment 1, the model solved all tasks in a baseline condition and then again using a structured prompt inspired by Pólya’s four phases of problem solving. In Experiment 2, the model was asked to simulate solution attempts of three contrasting student profiles. The results show that verification is a fragile step. In the baseline condition, the model often produced correct answers but did not check them. When guided by the structured prompt, verification appeared in every task. In student simulation, the model produced plausible mistakes and omissions in routine tasks, but often became unrealistically advanced in non-routine and construction problems, reducing profile fidelity. Overall, LLM outputs can support a priori didactic analysis, but they require careful interpretation, especially when used to simulate student thinking in demanding tasks.

References

Almarashdi, H. S., Jarrah, A. M., Abu Khurma, O., & Gningue, S. M. (2024). Unveiling the potential: A systematic review of ChatGPT in transforming mathematics teaching and learning. Eurasia Journal of Mathematics, Science and Technology Education, 20(12), em2555. https://doi.org/10.29333/ejmste/15739

Brousseau, G. (1997). Theory of didactical situations in mathematics (N. Balacheff, M. Cooper, R. Sutherland, & V. Warfield, Eds.). Kluwer Academic Publishers.

Brousseau, G., & Gibel, P. (2005). Didactical handling of students’ reasoning processes in problem solving situations. Educational Studies in Mathematics, 59(1–3), 13–58. https://doi.org/10.1007/s10649-005-2532-y

Carlson, M. P., & Bloom, I. (2005). The cyclic nature of problem solving: An emergent multidimensional problem-solving framework. Educational Studies in Mathematics, 58(1), 45–75. https://doi.org/10.1007/s10649-005-0808-x

Cifarelli, V. V., & Cai, J. (2005). The evolution of mathematical explorations in open-ended problem-solving situations. Journal of Mathematical Behavior, 24(3–4), 302–324. https://doi.org/10.1016/j.jmathb.2005.09.007

Guerra, E., Pena, M., & Araya, R. (2025). The inevitable and unpredictable role of large language models in education: A commentary on Huettig and Christiansen (2024). Cognitive Science, 49, e70105. https://doi.org/10.1111/cogs.70105

Hoth, J., Larrain, M., & Kaiser, G. (2022). Identifying and dealing with student errors in the mathematics classroom: Cognitive and motivational requirements. Frontiers in Psychology, 13, 1057730. https://doi.org/10.3389/fpsyg.2022.1057730

Jonassen, D. H. (1997). Instructional design models for well-structured and ill-structured problem-solving learning outcomes. Educational Technology Research and Development, 45(1), 65–94. https://doi.org/10.1007/BF02299613

Kasneci, E., Sessler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., Krusche, S., Kutyniok, G., Michaeli, T., Nerdel, C., Pfeffer, J., Poquet, O., Sailer, M., Schmidt, A., Seidel, T., Stadler, M., Weller, J., Kuhn, J., & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274. https://doi.org/10.1016/j.lindif.2023.102274

Kontorovich, I. (2019). Why do students not check their solutions to mathematical problems? A field-based hypothesis on epistemological status. International Journal of Mathematical Education in Science and Technology, 50(7), 1050–1062. https://doi.org/10.1080/0020739X.2019.1650304

Liu, J., Huang, Z., Liu, Q., Ma, Z., Zhai, C., & Chen, E. (2025). Knowledge-centered dual-process reasoning for math word problems with large language models. IEEE Transactions on Knowledge and Data Engineering, 37(6), 3457–3471. https://doi.org/10.1109/TKDE.2025.3556367

Malik, R., Abdi, D., Wang, R., & Demszky, D. (2025). Scaffolding middle school mathematics curricula with large language models. British Journal of Educational Technology, 56(3), 999–1027. https://doi.org/10.1111/bjet.13571

Matematická olympiáda. (2021). Czech Mathematical Olympiad: 71st year, category Z9, Round I (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3492117/z71i-9.pdf

Matematická olympiáda. (2022a). Czech Mathematical Olympiad: 72nd year, category Z9, Round I (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3494370/z72i-9.pdf

Matematická olympiáda. (2022b). Czech Mathematical Olympiad: 72nd year, category Z9, Round II (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3516462/z72ii-9-zr.pdf

Matematická olympiáda. (2022c). Czech Mathematical Olympiad: 72nd year, category Z9, Round III (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3516464/z72iii-9-zr.pdf

Matematická olympiáda. (2024a). Czech Mathematical Olympiad: 74th year, category Z9, Round II (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3828323/z74ii-zadani.pdf

Matematická olympiáda. (2024b). Czech Mathematical Olympiad: 74th year, category Z9, Round III (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3824271/z74iii-9.pdf

McGalliard, W., & Otten, S. (2025). AI responses to challenging problems and educator responses to AI availability. Digital Experiences in Mathematics Education, 11, 319–332. https://doi.org/10.1007/s40751-024-00167-4

Nováková, H. (2013). Analýza a priori jako součást přípravy učitele na výuku. Scientia in educatione, 4(2), 20–51. https://doi.org/10.14712/18047106.70

Ocak, C., Kopcha, T. J., Hodges, C. B., Sadik, O., & Ozogul, G. (2025). How artificial intelligence will reshape education: Conversations with the educational technology community. TechTrends. https://doi.org/10.1007/s11528-025-01149-y

Pando, M., & León, M. (2025). Mathematics disciplinary literacy: A case study of a bilingual teacher’s interaction with ChatGPT. Language and Education. https://doi.org/10.1080/09500782.2025.2601055

Pankow, L., Kaiser, G., Busse, A., König, J., Blömeke, S., Hoth, J., & Döhrmann, M. (2016). Early career teachers’ ability to focus on typical students’ errors in relation to the complexity of a mathematical topic. ZDM Mathematics Education, 48, 55–67. https://doi.org/10.1007/s11858-016-0763-2

Papadopoulos, I., & Dagdilelis, V. (2008). Students’ use of technological tools for verification purposes in geometry problem solving. The Journal of Mathematical Behavior, 27(4), 311–325. https://doi.org/10.1016/j.jmathb.2008.11.001

Pepin, B., Buchholtz, N., & Salinas-Hernández, U. (2025). A scoping survey of ChatGPT in mathematics education. Digital Experiences in Mathematics Education, 11, 9–41. https://doi.org/10.1007/s40751-025-00172-1

Pólya, G. (1945). How to solve it. Princeton University Press.

Rendl, M., Vondrová, N., Hříbková, L., Jirotková, D., Kloboučková, J., Kvasz, L., Páchová, A., Pavelková, I., Smetáčková, I., Tauchmanová, E., & Žalská, J. (2013). Kritická místa matematiky na základní škole očima učitelů. Univerzita Karlova, Pedagogická fakulta.

Schoenfeld, A. H. (1985). Mathematical problem solving. Academic Press.

Stannard, A., & Foster, C. (2025). Secondary school mathematics teachers’ accuracy at predicting student errors. School Science and Mathematics. https://doi.org/10.1111/ssm.18404

Suyitno, A., Suyitno, H., Rochmad, & Dwijanto. (2018). Use of open-ended problems as the basis for the mathematical creativity growth disclosure of student. Journal of Physics: Conference Series, 983(1), 012110. https://doi.org/10.1088/1742-6596/983/1/012110

Turmuzi, M., Azmi, S., & Kertiyani, N. M. I. (2026). ChatGPT in school mathematics education: A systematic review of opportunities, challenges, and pedagogical implications. Teaching and Teacher Education, 170. https://doi.org/10.1016/j.tate.2025.105286

Urban, M., Lukavský, J., Brom, C., Hein, V., Svacha, F., Děchtěrenko, F., & Urban, K. (2025). Prompting for creative problem-solving: A process-mining study. Learning and Instruction, 99. https://doi.org/10.1016/j.learninstruc.2025.102156

Zhang, X., Zhang, C., Sun, J., & Xiao, J. (2025). EduPlanner: LLM-based multi-agent systems for customized and intelligent instructional design. IEEE Transactions on Learning Technologies, 1–11. https://doi.org/10.1109/TLT.2025.3561332

Zhuang, Y., & Zhang, S. (2025). Integrating ChatGPT in mathematics teacher education: AI-based simulation role-playing to support practice-based teaching. International Journal of Artificial Intelligence in Education, 35, 3873–3895. https://doi.org/10.1007/s40593-025-00519-0

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Copyright (c) 2026 Jiří Přibyl, Michaela Tichá