Assessing student reasoning in STACK

George Kinnear, The University of Edinburgh

Re-posted from https://blogs.ed.ac.uk/georgekinnear/2026/06/23/assessing-student-reasoning-in-stack/

One way to assess small steps of reasoning is to use a multiple-choice question (MCQ) that offers students a list of reasons to choose from.

In this example, I asked why a given series does not converge. I provided several plausible-looking incorrect reasons, along with one correct answer ("s_n diverges"). Notice that I also included a "some other reason" option, where students could enter a reason of their own.

[Screenshot: the question "Does the series converge?", shown as a multiple-choice question with "some other reason" as an option.]

How to add an "other" option

I implemented this by appending an additional option to the definition of the MCQ in the question variables. The list of given reasons is permuted, but the "other" option always comes last:

ta6: append(
  random_permutation([
    [castext("{@a*k+b <= (a+b)*k^2@}, so the series converges by comparison with a \\(p\\)-series"), false],
    ["\\(s_n\\) converges to 0", false],
    ["\\(s_{n+1}\\geq s_1\\) for all \\(n\\in\\mathbb{N}\\) so the partial sums are bounded below", false],
    ["\\(s_n\\) is an increasing sequence", false],
    ["\\(s_n\\) diverges", true]
  ]),
  [["some other reason (please type it below)", false]]
);

Then, in the question text, I used a reveal block so that the text box (ans7) is shown only when the student selects the "other" option in the MCQ (it's the 6th item in the list of options, hence the check for value="6"):

Does the series converge?

[[input:ans5]][[validation:ans5]] because
[[input:ans6]][[validation:ans6]]
[[reveal input="ans6" value="6" ]] [[input:ans7]][[validation:ans7]] [[/reveal]]
[[feedback:prt5]]

Grading the "other" answers

To give students a textarea to type in, I had made the text box (ans7) a "Notes" input. Unfortunately, I only realised after the students had completed the quiz that you can't refer to Notes inputs in the PRTs for grading! (This means there is no way to automatically grade a Notes input.)

From STACK 4.13.0, there is a "Free-text" input type. With that input type, students get a textarea to type in, and you can access its value as a string in the PRT.

To assess the "other" reasons given by my students, I constructed some regular expressions. Here is what I came up with (explained in more detail below!):

valid_reasons: [
  "Sn is not bounded.*",
  ".*(s_n|S_n|sn|Sn|partial sum).*(go to infinity|do not have bound|not bounded above|unbounded|unbouneded|grow without a bound|not bounded|no upper bound).*",
  ".*(s_n|S_n|sn|Sn|partial sum).*(\\&rarr\\;|go(es)? to|tends? to|approach(es)?|diverge(s)?|going towards).*(infinity|inf)?.*",
  ".*(terms|Terms|sequence|Sequence).*(does not|do not|doesn.?t).*(approach|approaches|tends? to).*(0|zero).*"
];
valid_reason_given: maplist(lambda([pat], regex_match_exactp(pat, ans7)), valid_reasons);

Then, in the PRT, I check whether the student has selected the "other" option; if they have, I give them full points when apply("or",valid_reason_given) is true.
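For trying the patterns out away from STACK, the same matching logic can be sketched in Python. This is only an approximation under stated assumptions: it assumes that Python's re.fullmatch behaves like regex_match_exactp (the pattern has to cover the whole answer, which is why most patterns start and end with .*), and that Python's regex dialect agrees with Maxima's for these particular patterns. The function names are made up for illustration; the patterns are the ones listed above and the sample answers are the student responses quoted later in this post.

```python
import re

# Patterns copied from valid_reasons above. In Python raw strings the
# backslashes are written once; the Maxima string literals need them doubled.
VALID_REASONS = [
    r"Sn is not bounded.*",
    r".*(s_n|S_n|sn|Sn|partial sum).*(go to infinity|do not have bound|not bounded above|unbounded|unbouneded|grow without a bound|not bounded|no upper bound).*",
    r".*(s_n|S_n|sn|Sn|partial sum).*(\&rarr\;|go(es)? to|tends? to|approach(es)?|diverge(s)?|going towards).*(infinity|inf)?.*",
    r".*(terms|Terms|sequence|Sequence).*(does not|do not|doesn.?t).*(approach|approaches|tends? to).*(0|zero).*",
]


def valid_reason_given(answer):
    """Return one boolean per pattern, mirroring the maplist call above."""
    return [re.fullmatch(pattern, answer) is not None for pattern in VALID_REASONS]


def is_valid_reason(answer):
    """True if at least one pattern matches the whole answer (the "or" step)."""
    return any(valid_reason_given(answer))


sample_answers = [
    "Sn is not bounded, by the boundedness test for series, the series is therefore divergent.",
    "The partial sums go to infinity as n goes to infiinty",
    "Sn is an increasing sequence that is not bounded above",
    "The partial sums do not have bound.",
]

for answer in sample_answers:
    print(is_valid_reason(answer), "|", answer)
```

Returning the per-pattern list as well as the combined result makes it easy to see which pattern was responsible for accepting a given answer.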

How to come up with the regular expression?

For my quiz, students did not get immediate feedback, so I could see all of the students' answers when I developed the grading approach.

I went to the "Analyze responses" page on the STACK question's dashboard, where the "Inputs" tab gives a handy summary of all of the distinct responses for each of the inputs. For ans7, I had a big list with lines like the following:

1 (  0.29%); &quot;Sn is not bounded, by the boundedness test for series, the series is therefore divergent. &quot;
1 (  0.29%); &quot;The partial sums go to infinity as n goes to infiinty&quot;
1 (  0.29%); &quot;Sn is an increasing sequence that is not bounded above&quot;
1 (  0.29%); &quot;The partial sums do not have bound.&quot;

I used our in-house LLM service, ELM, to help with the tedious job of reformatting this. I pasted in the full list of lines like the above, and asked:

please turn this into a list of maxima strings. e.g. it should start: ["Sn is not bounded, by the boundedness test for series, the series is therefore divergent.", "The partial sums go to infinity as n goes to infiinty", .... ];
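The same reformatting could also be scripted. The following is a minimal Python sketch, not what I actually ran: it assumes every line has the shape shown above (a count, a percentage in brackets, a semicolon, then the response wrapped in &quot;), and the function names are invented for illustration. It deliberately leaves any other HTML entities in the responses untouched, since the patterns in valid_reasons match &rarr; literally, which suggests the PRT sees the entities as they are.

```python
import re

# Assumed line shape: "<count> ( <percent>%); &quot;<response>&quot;"
LINE_PATTERN = re.compile(r"^\s*\d+\s*\(\s*[\d.]+%\);\s*(.*)$")
QUOTE = "&quot;"


def parse_responses(raw_text):
    """Extract the response strings from the pasted summary lines."""
    responses = []
    for line in raw_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip blank lines and anything in an unexpected format
        text = match.group(1).strip()
        # Remove only the wrapping &quot; markers; other entities stay as-is.
        if len(text) >= 2 * len(QUOTE) and text.startswith(QUOTE) and text.endswith(QUOTE):
            text = text[len(QUOTE):-len(QUOTE)]
        responses.append(text.strip())
    return responses


def to_maxima_list(strings):
    """Format Python strings as a Maxima list of string literals."""
    def quote(s):
        # Maxima strings escape backslashes and double quotes with a backslash.
        return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return "[\n  " + ",\n  ".join(quote(s) for s in strings) + "\n];"


sample = """\
1 (  0.29%); &quot;Sn is not bounded, by the boundedness test for series, the series is therefore divergent. &quot;
1 (  0.29%); &quot;The partial sums do not have bound.&quot;
"""

print(to_maxima_list(parse_responses(sample)))
```

The printed text can be pasted into Maxima in the same way as the LLM's output.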

I then pasted that list into Maxima on my desktop, and read through it to separate the entries into two lists - one called true_reasons and one called false_reasons - based on whether I thought the student's reasoning was correct.

I iteratively developed the list of valid_reasons by running the following code, and adding entries to valid_reasons to cover any cases that still appeared in the not_matched list:

valid_reasons: [
  "Sn is not bounded.*",
  ".*partial sums.*(go to infinity|do not have bound|not bounded above).*"
]$
/* Entries of true_reasons that none of the patterns match yet. */
not_matched :
  sublist(
    true_reasons,
    lambda([ans7],
      not apply(
        "or",
        maplist(lambda([pat], regex_match_exactp(pat, ans7)), valid_reasons)
      )
    )
  );

Note that this relies on STACK's regex_match_exactp function, which you can find here: https://github.com/maths/moodle-qtype_stack/blob/f9c01cc70233137d6223233384a570062c124856/stack/maxima/stackstrings.mac#L341

It may also be helpful to give an LLM the lists true_reasons and false_reasons, and ask for a list of regular expressions that match the true ones but not the false ones, at least as a starting point.
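Whichever way the patterns are produced, it seems worth checking them in both directions: true reasons that are missed, and false reasons that are wrongly accepted. Here is a small Python sketch of that two-way check. It assumes re.fullmatch is a fair stand-in for regex_match_exactp, it uses the two starter patterns from the Maxima code above, and the example reasons are hypothetical rather than real student responses.

```python
import re

# The two starter patterns from the Maxima code above, as Python raw strings.
VALID_REASONS = [
    r"Sn is not bounded.*",
    r".*partial sums.*(go to infinity|do not have bound|not bounded above).*",
]

# Hypothetical examples for illustration only.
true_reasons = [
    "The partial sums go to infinity as n goes to infinity",
    "Sn is unbounded",
]
false_reasons = [
    "The terms tend to 0",
    "The partial sums do not go to infinity",
]


def matches_any(answer):
    """True if some pattern matches the whole answer."""
    return any(re.fullmatch(pattern, answer) for pattern in VALID_REASONS)


# Correct reasons that the patterns still miss (add or widen a pattern).
not_matched = [a for a in true_reasons if not matches_any(a)]

# Incorrect reasons that the patterns accept (tighten a pattern).
wrongly_matched = [a for a in false_reasons if matches_any(a)]

print("not matched:", not_matched)
print("wrongly matched:", wrongly_matched)
```

In this made-up data, "Sn is unbounded" is missed, while "The partial sums do not go to infinity" is accepted because the second pattern only looks for the phrases and cannot see the negation. Cases like the latter only show up when the patterns are also run against false_reasons.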

Concluding thoughts

This was very much a proof of concept, and I only had a small sample of student responses (since out of ~350 students, 80% chose the correct MCQ option and only 27 students chose the "other" option). But I was encouraged by how readily I could make the regular expressions fit to what the students wrote, and I'm interested to see how well this approach would scale up (e.g., if the correct answer is "other"...). It may be useful in that case to borrow some ideas from the existing pattern-match question type, which has already been quite successful (e.g., in physics education).