Assessment · March 31, 2025

Revolutionizing Assessment in the AI Era: How Togeder's Platform Transforms Math Education Evaluation

By Prof. Baruch Schwarz

The Assessment Crisis I'm Facing

Like many of my colleagues in higher education, I've been struggling with a fundamental challenge: How do I fairly evaluate my students in an era where AI tools can instantly generate answers to almost any question I pose? The traditional exams I've relied on for years suddenly feel obsolete, especially in my online courses where monitoring is nearly impossible.

I found myself in a paradoxical position - discouraging students from using AI during assessments while knowing full well they'll need these same tools in their professional lives. After discussions with fellow faculty members facing the same dilemma, I began to question our approach: Shouldn't we be measuring students' abilities to leverage the cultural tools available to them rather than artificially restricting them?

My New Assessment Solution

This semester, I decided to try something different in my graduate mathematics education course. With 16 motivated students and a topic I'm passionate about, it seemed like the perfect testing ground for a new assessment strategy.

For years, I've taught this course covering trends in mathematics education research - from mathematical abstraction to problem-solving approaches. Like many of us post-pandemic, I've settled into a hybrid format, combining in-person sessions with synchronous Zoom meetings that I record for students who can't attend.

My typical online session follows a pattern I've refined over time:

  • I begin with a 30-minute lecture on a research topic in mathematics education
  • Students then break into groups to apply these concepts to solving specific problems
  • We reconvene for reflection, with groups sharing their work and me providing guidance

Discovering Togeder's Potential for Assessing Students' Understanding

When covering mathematical problem-solving, I introduced my students to Polya's model of problem solving (its general stages and the importance of heuristics) and to metacognitive processes. For our group activities, I began using Togeder, initially just to help me navigate between virtual rooms.

What started as a simple classroom management tool soon revealed much greater potential. I found I could create private channels to communicate with student guides, monitor discussions without interrupting the flow, and - most importantly for my assessment challenge - automatically capture detailed data about each group's interactions.

Reimagining My Final Exam

After seeing the capabilities of this platform, I decided to transform my final exam. I maintained a traditional written component (worth 75%) but dedicated 25% to a group problem-solving activity that would truly test application of knowledge.

For this portion, I selected "The Gardener" problem, a challenging problem with an infinite number of solutions. I was transparent with my students about how I would evaluate them:

"Your participation will be measured individually," I explained, "but most criteria will be evaluated at the group level. After all, in real-world settings, your team succeeds or fails together."

I added:

"The quality of group work depends on you. I will also evaluate the contribution of each of you to the group work"

I developed criteria that mattered to me as an educator:

What I Evaluated

  • Individual participation - Personal factor (0-1) - Automatically tracked by Togeder
  • Group balance - 10% - How equitably students contributed
  • Problem-solving stages - 20% - Did they follow Polya's four steps?
  • Articulation of methods - 30% - How clearly they explained their approaches
  • Reasoning processes - 20% - Quality of inductive/deductive reasoning
  • Solution attainment - 20% - Assessment of the final outcome
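
To make the arithmetic concrete, here is a minimal sketch in Python of how these criteria could combine into an individual grade. The weights come straight from the list above; treating the personal factor as a multiplier on the group score, and folding in the 75/25 split with the written exam, is my own reading of the scheme rather than a formula built into Togeder, and the names are illustrative.

# A hypothetical sketch of the grading scheme above. The criterion
# weights are taken from the list; combining them this way (personal
# factor as a multiplier, 75/25 split with the written exam) is one
# reading of the scheme, not a formula built into Togeder.

GROUP_WEIGHTS = {
    "group_balance": 0.10,
    "problem_solving_stages": 0.20,
    "articulation_of_methods": 0.30,
    "reasoning_processes": 0.20,
    "solution_attainment": 0.20,
}

def exam_grade(written_score, group_scores, personal_factor):
    """Combine the written exam (75%) with the group activity (25%).

    Scores are on a 0-100 scale; personal_factor (0-1) scales the
    group component for each individual student.
    """
    group = sum(w * group_scores[c] for c, w in GROUP_WEIGHTS.items())
    return 0.75 * written_score + 0.25 * personal_factor * group

# Example: a strong group, and a student who participated fully.
print(exam_grade(
    written_score=88,
    group_scores={
        "group_balance": 90,
        "problem_solving_stages": 85,
        "articulation_of_methods": 80,
        "reasoning_processes": 75,
        "solution_attainment": 95,
    },
    personal_factor=1.0,
))  # 87.0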

As noted above, the gardener problem (see Figure 1) admits an infinite number of solutions. Its complexity is too high for isolated individual work; collaboration is particularly well suited to solving it.

Figure 1: The Gardener Problem

My Assessment Experience

As my students worked in groups during the exam, I used Togeder's Assessment Dashboard to move between groups, offering guidance when needed without directly helping with solutions. The dashboard also captured everything - conversations, participation metrics, and even summaries of key points.

When it came time to grade, I found the process remarkably efficient. The platform had already:

  • Calculated each student's participation level
  • Analyzed the balance of contributions within each group
  • Prepared transcripts and summaries of each group's work
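
Togeder handled this analysis for me, and it does not publish the formula behind its balance metric. For readers curious what such a score might look like, here is one plausible sketch (the function and its talk-time input are illustrative, not Togeder's actual API): normalized Shannon entropy over each student's share of the discussion, where 1.0 means perfectly equal contributions and values near 0 mean one voice dominated.

# Hypothetical illustration: one plausible "balance of contributions"
# metric, computed as normalized Shannon entropy over each student's
# share of the group's talk time. Not Togeder's documented method.

import math

def balance_score(talk_seconds):
    """Return a 0-1 equity score from per-student talk times."""
    total = sum(talk_seconds)
    if total == 0 or len(talk_seconds) < 2:
        return 0.0
    shares = [t / total for t in talk_seconds if t > 0]
    if len(shares) < 2:
        return 0.0  # only one student spoke
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(talk_seconds))  # divide by max entropy

print(balance_score([300, 280, 310, 290]))  # ~0.999: well balanced
print(balance_score([900, 40, 30, 30]))     # ~0.31: one student dominates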

I still reviewed everything personally - looking at transcripts, examining solution approaches, and making final judgments about the quality of reasoning. But what would have taken me hours of manual note-taking and analysis was now streamlined to about five minutes per group per criterion.

Figure 2: Togeder's Assessment Dashboard showing group analysis

What I've Learned

This experimental approach has changed how I think about assessment. I found that:

  • My students engaged more authentically when solving problems collectively
  • The evaluation captured skills that matter in professional settings - collaboration, communication, and critical thinking
  • The assessment was inherently "AI-resilient" - no AI tool could replace the dynamic interaction and reasoning
  • I gained insights into my students' thinking processes that traditional exams never revealed. In contrast with traditional exams, which measure a yes/no acquisition of knowledge, I could see potential, half-formed ideas, and sound mathematical thinking even when the computations went wrong
  • The time I saved on grading logistics could be invested in providing better feedback

My Path Forward

I've shared this approach with several colleagues in my department, and many are eager to adapt it for their own courses. We're discovering that the assessment crisis brought on by AI can actually push us toward more meaningful evaluation methods.

For my part, I'm planning to expand this approach in my future courses. I'm convinced that embracing collaborative, process-focused assessment offers a path forward that acknowledges the reality of AI tools while still measuring what truly matters - a student's ability to think, collaborate, and apply knowledge in context.

As one of my colleagues remarked after seeing my results, "Maybe the AI revolution isn't threatening our assessment methods - it's finally pushing us to improve them."

Tags: AI in Education · Assessment · Mathematics Education · Collaborative Learning