Stoic Philosophy and Engineering Decision Making: Better Decisions Through the Dichotomy of Control, Premortem, and Negative Visualization

Introduction: Why Stoic Philosophy for Engineers

Software engineers make dozens of decisions every day: which tech stack to choose, whether to refactor or rewrite, whether to roll back or hotfix during an incident. These decisions require more than technical knowledge. We need a thinking framework that supports judgment under uncertainty, emotional stability, and a long-term perspective.

About 2,000 years ago, Greek and Roman Stoic philosophers grappled with the same underlying problems. Epictetus taught: do not waste energy on what you cannot control. Seneca advised: visualize worst-case scenarios in advance. Marcus Aurelius practiced: record and reflect on your own judgments. These principles apply remarkably well to modern engineering decision-making.

This article maps core Stoic principles to software engineering and provides practical frameworks for everyday decision-making.

"It is not things that disturb us, but our judgments about things." - Epictetus, Enchiridion


Chapter 1: The Dichotomy of Control

The Core Teaching of Epictetus

The most fundamental principle of Stoic philosophy is the Dichotomy of Control. Epictetus opens the Enchiridion with it:

"Some things are within our power, while others are not. Within our power are opinion, motivation, desire, aversion - in a word, whatever is of our own doing. Not within our power are our body, our property, reputation, office - in a word, whatever is not of our own doing."

Applied to engineering, we can categorize things as follows.

What Engineers Can Control

  • Code quality: Test coverage, code review processes, refactoring decisions
  • Architecture design: System structure, technology choices, interface definitions
  • Communication: Documentation, RFC writing, stakeholder engagement
  • Learning: Exploring new technologies, post-incident analysis, knowledge sharing
  • Process improvement: CI/CD pipelines, monitoring, alert configuration

What Engineers Cannot Control

  • External service outages: AWS region failures, third-party API disruptions
  • Requirement changes: Business pivots, regulatory changes
  • Team composition changes: Key personnel departures, organizational restructuring
  • Hardware failures: Disk failures, network equipment issues
  • User behavior: Unexpected usage patterns, traffic spikes

Practical Application: Incident Response

When an incident occurs, many engineers panic. Applying the Dichotomy of Control enables a structured response.

Situation: Production database performance degradation alert

Controllable:
  - Query analysis and optimization
  - Connection pool configuration adjustment
  - Traffic distribution to read replicas
  - Status communication to team members
  - Rollback decision

Not controllable:
  - The performance degradation that already occurred
  - The latency users already experienced
  - Cloud provider's underlying infrastructure state
  - Business stakeholder reactions

Action plan: Focus on controllable items. Accept the rest.
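
The split above can also be written down as data, which makes it easy to paste into an incident channel. The following is a minimal sketch, not a prescribed tool: the `IncidentItem` type and the `triage` function are hypothetical names introduced here for illustration, and the items are taken from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentItem:
    """One concern raised during an incident (hypothetical type)."""
    description: str
    controllable: bool


def triage(items):
    """Split items into (act_on, accept) lists, preserving input order."""
    act_on = [i.description for i in items if i.controllable]
    accept = [i.description for i in items if not i.controllable]
    return act_on, accept


items = [
    IncidentItem("Query analysis and optimization", True),
    IncidentItem("The latency users already experienced", False),
    IncidentItem("Traffic distribution to read replicas", True),
    IncidentItem("Cloud provider's underlying infrastructure state", False),
]

act_on, accept = triage(items)
print("Focus on:", act_on)
print("Accept:", accept)
```

The point of the explicit boolean is that someone has to decide, per item, which side it belongs on. That decision is the Stoic exercise; the code only records it.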

Dichotomy of Control Decision Matrix

Domain | Controllable | Not Controllable | Stoic Approach
System outage | Recovery procedures, communication | The outage itself | Focus on recovery, no blame
Technical debt | Refactoring priorities | Past design decisions | Best possible choice now
Team conflict | My attitude and communication style | Others' reactions | Clear communication, then acceptance
Deadline pressure | Scope adjustment, communication | The deadline itself | Propose realistic scope
Performance issues | Profiling, optimization | User traffic patterns | Data-driven improvements

Daily Practice

Spend 5 minutes each morning answering these questions.

What challenges might I face today?
  - [List situations]

Which of these can I control?
  - [Controllable items]

Am I spending energy on things outside my control?
  - [Check]

What actions should I focus on today?
  - [Specific action plan]

Chapter 2: Premortem and Negative Visualization (Premeditatio Malorum)

Seneca's Teaching

Seneca advises his friend Lucilius in his Letters:

"We should think about bad things before they happen. The blow that has been anticipated falls more gently."

This is the ancestor of what modern management calls the premortem technique - assuming a project has failed and working backward to identify the causes.

The Premortem Technique: Engineering Application

A postmortem is conducted after a failure has already occurred. A premortem is run before the project even starts: the team assumes that "this project has completely failed 6 months from now" and lists the possible causes of that failure.

Premortem Workshop Guide

Step 1: Setup (5 min)
  "It is now 6 months in the future.
   Our project has completely failed.
   Why did it fail?"

Step 2: Individual Writing (10 min)
  Each person writes failure causes on sticky notes.
  Write at least 5 causes.
  Include technical, organizational, and external factors.

Step 3: Sharing and Categorization (15 min)
  Share all causes and categorize them.
  Merge duplicates and add missing items.

Step 4: Prioritization (10 min)
  Rank by Likelihood and Impact.
  Select the top 5 risks.

Step 5: Mitigation Planning (20 min)
  Develop specific mitigation plans for each risk.
  Assign owners and deadlines.
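
Step 4 can be sketched in a few lines of code. This is one possible scoring scheme, not the only one: the 1 to 5 scales, the `likelihood * impact` score, the `Risk` type, and the sample risks are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """A failure cause collected in Step 2 (hypothetical type)."""
    cause: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (project-ending)

    @property
    def score(self):
        # Simple product of the two ratings; other weightings are possible.
        return self.likelihood * self.impact


def top_risks(risks, n=5):
    """Return the n highest-scoring risks, highest first."""
    return sorted(risks, key=lambda r: r.score, reverse=True)[:n]


# Made-up sample data for the sketch.
risks = [
    Risk("Key engineer leaves mid-project", 2, 5),
    Risk("Third-party API changes its contract", 3, 4),
    Risk("Scope grows without a deadline change", 4, 4),
    Risk("Load tests skipped before launch", 3, 3),
    Risk("Migration script corrupts data", 1, 5),
    Risk("Monitoring gaps hide early failures", 2, 3),
]

for risk in top_risks(risks, n=3):
    print(risk.score, risk.cause)
```

The ratings themselves are still a group judgment. The score only makes disagreements visible, for example when two people rate the same cause very differently.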

Negative Visualization for System Design

Negative Visualization systematically explores the question "What if the worst happens?" Chaos Engineering asks the same question, but answers it by deliberately injecting failures into a running system.

Negative Visualization System Design Checklist

Database:
  - What if the master DB goes down?
  - What if replication lag exceeds 30 minutes?
  - What if the disk is 100% full?
  - What if backups are corrupted?

Network:
  - What if DNS stops responding?
  - What if the load balancer goes down?
  - What if an entire availability zone fails?
  - What if inter-region networking breaks?

Application:
  - What if a memory leak occurs?
  - What if the code enters an infinite loop?
  - What if a third-party API does not respond for 30 seconds?
  - What if the application crashes immediately after deployment?
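
Each question in the checklist can be turned into a small test that simulates the failure and checks that the system degrades gracefully. The sketch below does this for the unresponsive third-party API. All names (`call_with_fallback`, `unresponsive_api`, `healthy_api`) are hypothetical, and it assumes the client signals a timeout by raising Python's built-in `TimeoutError`.

```python
def call_with_fallback(fetch, fallback, timeout_s=2.0):
    """Call fetch(timeout_s); return fallback if it times out.

    Assumes fetch raises the built-in TimeoutError when the
    dependency does not answer within timeout_s seconds.
    """
    try:
        return fetch(timeout_s)
    except TimeoutError:
        return fallback


def unresponsive_api(timeout_s):
    """Stand-in for a dependency that never answers (simulated failure)."""
    raise TimeoutError(f"no response within {timeout_s}s")


def healthy_api(timeout_s):
    """Stand-in for the same dependency on a good day."""
    return {"price": 42}


# Rehearse the worst case: the caller should get the fallback, not hang.
degraded = call_with_fallback(unresponsive_api, fallback={"price": None})
normal = call_with_fallback(healthy_api, fallback={"price": None})
print(degraded, normal)
```

The simulated failure raises immediately instead of waiting 30 seconds, so the worst case can be rehearsed on every test run rather than discovered in production.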