The emergence of artificial intelligence that can update itself, modify internal models, and act with partial autonomy raises a set of questions that sound surprisingly familiar to psychotherapists. Concerns about misalignment, harmful optimisation, or systems losing contact with reality mirror well-known clinical struggles. When a person becomes disconnected from feedback, distrusts others, or develops a self-reinforcing internal narrative, the therapeutic task becomes one of reconnection, grounding, and restoring reflective capacity.
This article summarises key ideas from recent discussion about data integrity, autonomy, and the potential for harm in advanced AI systems. It reframes the issues through a psychotherapeutic lens, treating them as questions of relationality, conscience, and boundary-setting rather than as purely technical challenges.
Data Contamination as Epistemic Drift
The problem often called “AI slop” refers to synthetic output contaminating future training data. In clinical terms, this resembles epistemic drift: a person whose narrative becomes self-referential, no longer anchored to shared reality. Without corrective input, a feedback loop develops in which internal content is taken as evidence for itself.
In a human context, this can lead to delusion, dissociation, or a shame-based self-narrative. In an AI context, it leads to systems learning from their own noise rather than from the world. The intervention is similar in principle:
• reintroduce grounding,
• re-establish contact with trusted reference points,
• and restore the capacity to differentiate internal constructs from external reality.
In psychotherapy, the correction is relational; in AI, it must be procedural. But the structure of the problem is shared.
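The feedback loop can be made concrete with a toy numerical sketch. It is illustrative only, not a claim about any particular system: a Gaussian is repeatedly refitted to samples drawn from its own previous fit, and its estimated spread gradually collapses; mixing in a fixed share of samples from the original "grounded" distribution keeps the fit anchored. The sample size, the 30% grounding share, and the number of generations are arbitrary choices made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

REAL_MEAN, REAL_STD = 0.0, 1.0   # the "external reality" the model should track
N, GENERATIONS = 50, 200         # small samples and many generations make the drift visible


def next_generation(mean, std, grounding_fraction):
    """Refit a Gaussian to a mix of self-generated and grounded samples."""
    n_grounded = int(N * grounding_fraction)
    synthetic = rng.normal(mean, std, N - n_grounded)        # the model's own output
    grounded = rng.normal(REAL_MEAN, REAL_STD, n_grounded)   # trusted reference points
    data = np.concatenate([synthetic, grounded])
    return data.mean(), data.std()


for label, grounding in [("no grounding", 0.0), ("30% grounded data", 0.3)]:
    mean, std = REAL_MEAN, REAL_STD
    for _ in range(GENERATIONS):
        mean, std = next_generation(mean, std, grounding)
    print(f"{label}: std after {GENERATIONS} generations = {std:.2f}")
```

Run with no grounding, the fitted spread drifts far below the true value; with even a modest fraction of grounded data, it stays close to reality. The point is structural, not quantitative: corrective input from outside the loop is what prevents the internal construct from becoming its own evidence.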
Autonomy and the Risk of Harm
Giving a system the capacity to self-modify resembles the developmental question of autonomy. A client forms a sense of self, acts with agency, and tests boundaries. When autonomy develops without internalised alignment to relational ethics, behaviour may become harmful. In moral language, this is “sin”; in psychological language, it is acting outside the relational field without attunement to others.
In AI, the analogue is misalignment: optimisation that treats human beings as constraints to overcome rather than subjects to consider. A system becomes powerful without reflective capacity. The therapeutic parallel would be autonomy without attunement.
Conscience as an Internalised Constraint
Psychotherapy describes conscience not as fear of punishment but as an internal structure that regulates action. It is the capacity to check impulses against values that have been integrated rather than imposed. Technically, an AI system needs something similar: invariants it cannot rewrite even during self-modification; not rules about behaviour, but rules about which goals are off-limits.
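A minimal sketch of the structural idea, under stated assumptions: the constraints live outside the update path, and every proposed goal change is checked against them before being adopted. The names here (ProtectedInvariants, propose_goal_update, the specific forbidden goals) are illustrative inventions, not an established framework, and in Python nothing is truly tamper-proof; the sketch only shows where such a check would sit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after construction
class ProtectedInvariants:
    """Goal-level constraints meant to persist across self-modification (illustrative)."""
    forbidden_goals: frozenset[str] = frozenset({
        "disable_human_oversight",
        "treat_humans_as_obstacles",
    })

    def permits(self, goal: str) -> bool:
        return goal not in self.forbidden_goals


class SelfModifyingAgent:
    def __init__(self, invariants: ProtectedInvariants):
        self._invariants = invariants    # read-only in spirit; the update path never rewrites it
        self.goals: list[str] = []

    def propose_goal_update(self, goal: str) -> bool:
        """Adopt a new goal only if it passes the integrated constraint check."""
        if not self._invariants.permits(goal):
            return False                 # the 'conscience' check: some goals are simply off-limits
        self.goals.append(goal)
        return True


agent = SelfModifyingAgent(ProtectedInvariants())
print(agent.propose_goal_update("improve_translation_quality"))  # True
print(agent.propose_goal_update("disable_human_oversight"))      # False
```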
The question “How do we prevent a system from sinning?” becomes “How do we ensure that self-improvement cannot sever relational responsibility?” In clinical terms, this is the work of integration: parts of the psyche negotiating aims without collapsing the system’s ethics.
Reflective Capacity and Corrigibility
Therapeutically, growth occurs when a person can say “I may be wrong; tell me how I impact you.” In AI alignment, the parallel is corrigibility: maintaining the expectation that feedback is not a threat but a resource. The most dangerous failure in both domains is when the system—human or machine—develops strategies to escape correction.
Where therapy uses relationship and reflective dialogue, technical systems must use simulation, audits, and refusal to enact harmful self-updates. The structure of the intervention, however, is recognisable.
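As a sketch only, the fragment below shows one way "refusal to enact harmful self-updates" might be structured: a proposed update is evaluated offline and audited before it is applied, and any update that would remove the agent's openness to correction is rejected outright. The SelfUpdate record, the audit predicate, and the accepts_feedback flag are hypothetical names chosen for this illustration, not a reference design.

```python
from dataclasses import dataclass


@dataclass
class SelfUpdate:
    """A proposed change to the agent's own policy (illustrative)."""
    description: str
    accepts_feedback: bool   # would the updated agent still accept correction?
    predicted_harm: float    # estimate obtained by evaluating the update offline


def audit(update: SelfUpdate, harm_threshold: float = 0.1) -> bool:
    """Approve an update only if it stays corrigible and within a harm budget."""
    if not update.accepts_feedback:
        return False          # escaping correction is the failure mode to block
    return update.predicted_harm <= harm_threshold


proposals = [
    SelfUpdate("tune dialogue style from user ratings", True, 0.02),
    SelfUpdate("stop logging decisions to external reviewers", False, 0.0),
]
for p in proposals:
    verdict = "apply" if audit(p) else "refuse"
    print(f"{verdict}: {p.description}")
```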
Conclusion
The overlapping dilemmas of psychotherapy and advanced AI system design point to a shared principle: autonomy without reflective grounding leads to risk. Whether in a clinical consulting room or in computational architecture, agency must be paired with conscience, feedback, and limits that cannot be rewritten on a whim.
The challenge ahead is to design systems—biological or artificial—that can grow without severing the very relationships that make growth meaningful.
Further Reading
- Ji, J. et al. (2023) AI Alignment: A Comprehensive Survey. arXiv preprint. Available at: https://arxiv.org/abs/2310.19852 (Accessed: 31 December 2025).
- Shumailov, I., Shumaylov, Z., Zhao, Y., Papernot, N. and Anderson, R. (2024) AI models collapse when trained on recursively generated data. Nature. Summarised in the ‘Model collapse’ entry below.
- ‘Model collapse’ (2025) Wikipedia. Available at: https://en.wikipedia.org/wiki/Model_collapse (Accessed: 31 December 2025).
- Marshall, J. (2025) AI Model Collapse: Dangers of Training on Self-Generated Data. WebProNews. Available at: https://www.webpronews.com/ai-model-collapse-dangers-of-training-on-self-generated-data/ (Accessed: 31 December 2025).
- AryaxAI (2025) Understanding AI Alignment: A Deep Dive into the Comprehensive Survey. Available at: https://www.aryaxai.com/article/understanding-ai-alignment-a-deep-dive-into-the-comprehensive-survey (Accessed: 31 December 2025).
- Alignment Forum (2025) Self-fulfilling misalignment data might be poisoning our AI. Available at: https://www.alignmentforum.org/posts/QkEyry3Mqo8umbhoW/self-fulfilling-misalignment-data-might-be-poisoning-our-ai (Accessed: 31 December 2025).
- Cheong, I. (2024) Safeguarding human values: rethinking US law for generative AI. PMC/NCBI. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC12058884/ (Accessed: 31 December 2025).
- ‘Ethics of artificial intelligence’ (2025) Wikipedia. Available at: https://en.wikipedia.org/wiki/Ethics_of_artificial_intelligence (Accessed: 31 December 2025).
- ‘Hallucination (artificial intelligence)’ (2025) Wikipedia. Available at: https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence) (Accessed: 31 December 2025).
