Abstract
This study examines how a large language model (LLM) can support lower secondary mathematics education, not by measuring performance, but by analysing how complete and didactically useful its solution processes are. The focus is on whether the model explicitly verifies results, which is an important but often neglected phase of mathematical problem solving. Two exploratory experiments were conducted with Gemini 2.5 Pro using 24 lower secondary mathematics tasks presented in Czech. In Experiment 1, the model solved all tasks in a baseline condition and then again using a structured prompt inspired by Pólya’s four phases of problem solving. In Experiment 2, the model was asked to simulate the solution attempts of three contrasting student profiles. The results show that verification is a fragile step. In the baseline condition, the model often produced correct answers but did not check them. When guided by the structured prompt, verification appeared in every task. In the student simulations, the model produced plausible mistakes and omissions in routine tasks, but often became unrealistically advanced in non-routine and construction problems, reducing profile fidelity. Overall, LLM outputs can support a priori didactic analysis, but they require careful interpretation, especially when used to simulate student thinking in demanding tasks.
References
Almarashdi, H. S., Jarrah, A. M., Abu Khurma, O., & Gningue, S. M. (2024). Unveiling the potential: A systematic review of ChatGPT in transforming mathematics teaching and learning. Eurasia Journal of Mathematics, Science and Technology Education, 20(12), em2555. https://doi.org/10.29333/ejmste/15739
Brousseau, G. (1997). Theory of didactical situations in mathematics (N. Balacheff, M. Cooper, R. Sutherland, & V. Warfield, Eds.). Kluwer Academic Publishers, Dordrecht.
Brousseau, G., & Gibel, P. (2005). Didactical handling of students’ reasoning processes in problem solving situations. Educational Studies in Mathematics, 59(1–3), 13–58. https://doi.org/10.1007/s10649-005-2532-y
Carlson, M. P., & Bloom, I. (2005). The cyclic nature of problem solving: An emergent multidimensional problem-solving framework. Educational Studies in Mathematics, 58(1), 45–75. https://doi.org/10.1007/s10649-005-0808-x
Cifarelli, V. V., & Cai, J. (2005). The evolution of mathematical explorations in open-ended problem-solving situations. The Journal of Mathematical Behavior, 24(3–4), 302–324. https://doi.org/10.1016/j.jmathb.2005.09.007
Guerra, E., Pena, M., & Araya, R. (2025). The inevitable and unpredictable role of large language models in education: A commentary on Huettig and Christiansen (2024). Cognitive Science, 49, e70105. https://doi.org/10.1111/cogs.70105
Hoth, J., Larrain, M., & Kaiser, G. (2022). Identifying and dealing with student errors in the mathematics classroom: Cognitive and motivational requirements. Frontiers in Psychology, 13, 1057730. https://doi.org/10.3389/fpsyg.2022.1057730
Jonassen, D. H. (1997). Instructional design models for well-structured and ill-structured problem-solving learning outcomes. Educational Technology Research and Development, 45(1), 65–94. https://doi.org/10.1007/BF02299613
Kasneci, E., Sessler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., Krusche, S., Kutyniok, G., Michaeli, T., Nerdel, C., Pfeffer, J., Poquet, O., Sailer, M., Schmidt, A., Seidel, T., Stadler, M., Weller, J., Kuhn, J., & Kasneci, G. (2023). ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274. https://doi.org/10.1016/j.lindif.2023.102274
Kontorovich, I. (2019). Why do students not check their solutions to mathematical problems? A field-based hypothesis on epistemological status. International Journal of Mathematical Education in Science and Technology, 50(7), 1050–1062. https://doi.org/10.1080/0020739X.2019.1650304
Liu, J., Huang, Z., Liu, Q., Ma, Z., Zhai, C., & Chen, E. (2025). Knowledge-centered dual-process reasoning for math word problems with large language models. IEEE Transactions on Knowledge and Data Engineering, 37(6), 3457–3471. https://doi.org/10.1109/TKDE.2025.3556367
Malik, R., Abdi, D., Wang, R., & Demszky, D. (2025). Scaffolding middle school mathematics curricula with large language models. British Journal of Educational Technology, 56(3), 999–1027. https://doi.org/10.1111/bjet.13571
Matematická olympiáda. (2021). Czech Mathematical Olympiad: 71st year, category Z9, Round I (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3492117/z71i-9.pdf
Matematická olympiáda. (2022a). Czech Mathematical Olympiad: 72nd year, category Z9, Round I (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3494370/z72i-9.pdf
Matematická olympiáda. (2022b). Czech Mathematical Olympiad: 72nd year, category Z9, Round II (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3516462/z72ii-9-zr.pdf
Matematická olympiáda. (2022c). Czech Mathematical Olympiad: 72nd year, category Z9, Round III (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3516464/z72iii-9-zr.pdf
Matematická olympiáda. (2024a). Czech Mathematical Olympiad: 74th year, category Z9, Round II (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3828323/z74ii-zadani.pdf
Matematická olympiáda. (2024b). Czech Mathematical Olympiad: 74th year, category Z9, Round III (problem set). Retrieved February 8, 2026, from https://www.matematickaolympiada.cz/media/3824271/z74iii-9.pdf
McGalliard, W., & Otten, S. (2025). AI responses to challenging problems and educator responses to AI availability. Digital Experiences in Mathematics Education, 11, 319–332. https://doi.org/10.1007/s40751-024-00167-4
Nováková, H. (2013). Analýza a priori jako součást přípravy učitele na výuku. Scientia in educatione, 4(2), 20–51. https://doi.org/10.14712/18047106.70
Ocak, C., Kopcha, T. J., Hodges, C. B., Sadik, O., & Ozogul, G. (2025). How artificial intelligence will reshape education: Conversations with the educational technology community. TechTrends. https://doi.org/10.1007/s11528-025-01149-y
Pando, M., & León, M. (2025). Mathematics disciplinary literacy: A case study of a bilingual teacher’s interaction with ChatGPT. Language and Education. https://doi.org/10.1080/09500782.2025.2601055
Pankow, L., Kaiser, G., Busse, A., König, J., Blömeke, S., Hoth, J., & Döhrmann, M. (2016). Early career teachers’ ability to focus on typical students’ errors in relation to the complexity of a mathematical topic. ZDM Mathematics Education, 48, 55–67. https://doi.org/10.1007/s11858-016-0763-2
Papadopoulos, I., & Dagdilelis, V. (2008). Students’ use of technological tools for verification purposes in geometry problem solving. The Journal of Mathematical Behavior, 27(4), 311–325. https://doi.org/10.1016/j.jmathb.2008.11.001
Pepin, B., Buchholtz, N., & Salinas-Hernández, U. (2025). A scoping survey of ChatGPT in mathematics education. Digital Experiences in Mathematics Education, 11, 9–41. https://doi.org/10.1007/s40751-025-00172-1
Pólya, G. (1945). How to solve it. Princeton University Press.
Rendl, M., Vondrová, N., Hříbková, L., Jirotková, D., Kloboučková, J., Kvasz, L., Páchová, A., Pavelková, I., Smetáčková, I., Tauchmanová, E., & Žalská, J. (2013). Kritická místa matematiky na základní škole očima učitelů. Univerzita Karlova, Pedagogická fakulta.
Schoenfeld, A. H. (1985). Mathematical problem solving. Academic Press.
Stannard, A., & Foster, C. (2025). Secondary school mathematics teachers’ accuracy at predicting student errors. School Science and Mathematics. https://doi.org/10.1111/ssm.18404
Suyitno, A., Suyitno, H., Rochmad, & Dwijanto. (2018). Use of open-ended problems as the basis for the mathematical creativity growth disclosure of student. Journal of Physics: Conference Series, 983(1), 012110. https://doi.org/10.1088/1742-6596/983/1/012110
Turmuzi, M., Azmi, S., & Kertiyani, N. M. I. (2026). ChatGPT in school mathematics education: A systematic review of opportunities, challenges, and pedagogical implications. Teaching and Teacher Education, 170. https://doi.org/10.1016/j.tate.2025.105286
Urban, M., Lukavský, J., Brom, C., Hein, V., Svacha, F., Děchtěrenko, F., & Urban, K. (2025). Prompting for creative problem-solving: A process-mining study. Learning and Instruction, 99. https://doi.org/10.1016/j.learninstruc.2025.102156
Zhang, X., Zhang, C., Sun, J., & Xiao, J. (2025). EduPlanner: LLM-based multi-agent systems for customized and intelligent instructional design. IEEE Transactions on Learning Technologies, 1–11. https://doi.org/10.1109/TLT.2025.3561332
Zhuang, Y., & Zhang, S. (2025). Integrating ChatGPT in mathematics teacher education: AI-based simulation role-playing to support practice-based teaching. International Journal of Artificial Intelligence in Education, 35, 3873–3895. https://doi.org/10.1007/s40593-025-00519-0

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright (c) 2026 Jiří Přibyl, Michaela Tichá
