A transparent document separating what T-Twice has implemented, what was observed in the proof of concept, and what still needs rigorous empirical validation.
The POC database gives early usage signals. These numbers are useful for product validation, but they are not statistical evidence of learning impact.
The environment includes four calibrated mathematical levels and several demo or anonymised student identifiers. It was built to test the interaction model: students write reasoning first, T-Twice diagnoses the reasoning, then asks a targeted question.
The profiles stored in the POC are learning-behaviour profiles, not medical diagnoses. They indicate how students interact with mathematical reasoning, writing, notation and proof structure.
The most frequent events concerned writing, hypotheses, theorem use, and rigour. This supports a core T-Twice idea: reasoning failure needs a finer vocabulary than right or wrong.
The largest category was poor mathematical writing. That does not mean students were incapable of understanding the mathematics. It means the POC repeatedly surfaced a gap between partial understanding and rigorous expression.
This supports the need for guided proof structure, contextual notation support, natural-language conversion, and a clear separation between reasoning quality and writing quality.
The POC stores profile categories that describe learning behaviour. They are not medical labels and should not be interpreted as diagnoses.
Good mathematical intuition, but insufficient formal precision in writing.
Conceptual understanding appears stronger than proof production.
Procedures are applied before hypotheses are understood.
Frequent confusion around quantifiers, implications or equivalences.
Some students did not have enough repeated events for a profile.
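The rule implied by the profile list above, that a profile is only stored once a behaviour has recurred often enough, can be sketched as follows. This is a minimal illustration, not the T-Twice implementation: the event labels, the profile names, the `assign_profile` function and the threshold of three repeated events are all assumptions made for the example.

```python
from collections import Counter
from typing import Optional, Sequence

# Hypothetical threshold: the source only says some students lacked
# "enough repeated events"; the actual cut-off is not documented.
MIN_REPEATED_EVENTS = 3

# Hypothetical mapping from logged event labels to learning-behaviour
# profiles. These are behaviour descriptions, not medical labels.
PROFILE_BY_EVENT = {
    "imprecise_writing": "intuition ahead of formal precision",
    "weak_proof_production": "understanding ahead of proof production",
    "unchecked_hypotheses": "procedure before hypotheses",
    "logic_confusion": "quantifier and implication confusion",
}


def assign_profile(
    events: Sequence[str], min_events: int = MIN_REPEATED_EVENTS
) -> Optional[str]:
    """Return the profile for the most repeated event, or None if too sparse."""
    # Count only events that map to a known profile category.
    counts = Counter(e for e in events if e in PROFILE_BY_EVENT)
    if not counts:
        return None
    label, occurrences = counts.most_common(1)[0]
    # Withhold a profile until the behaviour has recurred often enough.
    if occurrences < min_events:
        return None
    return PROFILE_BY_EVENT[label]
```

Returning `None` rather than a default profile keeps sparse data from being over-interpreted, which matches the page's stance that these signals are qualitative.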
These are design observations from the POC. They are useful because they show where the product elicits learning behaviour different from that of answer-giving AI.
Partial feedback such as “your reasoning is 80% correct” reopened the exercise instead of closing it. Students were pushed to inspect one precise weakness rather than receive the final answer.
Targeted questions encouraged rereading, correcting a missing hypothesis, identifying a forgotten case, and reformulating the proof more rigorously.
Many sticking points arose in formalisation, notation, proof structure or writing effort, not only in mathematical understanding.
Natural-language-to-symbol conversion reduced input friction. The keyboard is not only an interface feature; it protects the continuity of mathematical thought.
The POC includes calibrated classes, professor comments and professor corrections, which positions T-Twice as a teacher-aligned learning space rather than just a chatbot.
The strongest engagement signal was not badges or points. It was the student seeing how they reason, where they repeatedly get stuck, and what they can improve.
This section separates live, inspectable product work from documented concepts that are not yet implemented as working POC features.
The core engagement systems are implemented in the POC and can be inspected through the live platform and logged interactions.
The neuroadaptive system is designed and documented, but it is not yet implemented as an automated real-time adaptation engine in the current POC.
This is the credibility section. The point is to show discipline: T-Twice has promising qualitative signals, not proof of impact at scale.
Large-scale empirical validation is one of the first roadmap milestones once the project is funded.
Compare T-Twice against generic AI and no-AI conditions with pre-defined outcome measures.
Measure whether reasoning quality improves across several weeks and whether AI dependency decreases.
Run the evaluation with academic supervision and publish results regardless of outcome.
T-Twice is not yet a proven large-scale educational intervention. It is a working proof of concept with early qualitative evidence suggesting that AI can be designed to protect reasoning instead of replacing it.