analysis Feb 16, 2026

Code Smells from First Principles

Why code smells exist, grounded in cognitive science and information theory.

# Code Smells from First Principles

Let me rebuild this from the ground up, thinking about what code fundamentally is and why certain patterns make it difficult to work with.

## First Principle: Code Is Communication Across Time

### The Core Truth

Code has two audiences:

1. The computer (which executes it)
2. Humans (who read, modify, and reason about it)

The computer doesn't care about variable names, comments, or structure. It executes the compiled or interpreted instructions either way. Every code smell exists because it makes the human's job harder.

### Why This Matters for Review

Code review is fundamentally a transfer of a mental model. The author knows:

- What problem they're solving
- Why they chose this approach
- What alternatives they considered
- What edge cases they're handling
- What assumptions they're making

The reviewer starts with little or none of this context and must reconstruct the mental model by reading the code.

This is why "file too long" is a smell. It's not arbitrary; it's about working memory limits.

## Cognitive Load and Working Memory

### The Human Constraint

George Miller's famous 1956 paper "The Magical Number Seven, Plus or Minus Two" proposed that humans can hold roughly 7±2 items in working memory simultaneously.

Later research (Cowan, 2001) suggests the limit is closer to about 4 chunks.

This is the fundamental constraint that generates most code smells.

### Long Files/Functions: A Working Memory Problem

When you review a 2000-line file, you can't hold the entire thing in your head. You must:

1. Load context (what does this file do?)
2. Navigate (where is the relevant code?)
3. Understand (what does this section do?)
4. Remember (what did I see 500 lines ago?)
5. Connect (how does this relate to that other thing?)

Each step consumes working memory. By the time you're on step 5, the context from steps 1-2 has often been pushed out.

From first principles: A function should fit in one "chunk" of understanding.

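One way to make the size constraint checkable is to flag functions that exceed a line budget. The sketch below uses Python's standard `ast` module; the helper name `long_functions` and the default budget of 40 lines are assumptions chosen for illustration, not thresholds the argument above prescribes.

```python
import ast

def long_functions(source: str, max_lines: int = 40) -> list[tuple[str, int]]:
    """Return (name, line_count) for each function longer than max_lines."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno/end_lineno are the first and last source lines of the def.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                flagged.append((node.name, length))
    return flagged

sample = '''
def short():
    return 1

def longer(x):
    a = x + 1
    b = a * 2
    c = b - 3
    return c
'''

# A tiny budget is used here only so the sample triggers the check.
print(long_functions(sample, max_lines=3))  # [('longer', 5)]
```

Line count is only a proxy for "one chunk": a short function can still be dense, so a check like this is a prompt for a closer look rather than a verdict.
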
## Cyclomatic Complexity: Exponential Path Explosion

### The Fundamental Problem

Cyclomatic complexity isn't just about "number of if statements." It counts the linearly independent paths through code (for a single function, the number of decision points plus one), so the metric itself grows linearly. The number of distinct end-to-end execution paths can grow much faster.

Mathematical reality, for independent if statements executed one after another:

- 1 if statement = 2 paths
- 2 sequential if statements = 4 paths
- 3 sequential if statements = 8 paths
- n independent conditions = up to 2^n paths

This is exponential growth.

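To see the gap between the metric and the path count, the sketch below enumerates every taken/not-taken combination for n independent `if` statements run one after another and compares that with McCabe's number for the same code. Both function names are hypothetical, and the model assumes each condition is independent and none of them exits early.

```python
from itertools import product

def sequential_paths(n: int) -> int:
    """Count distinct paths through n independent ifs run one after another."""
    # Each path is one combination of taken / not-taken outcomes.
    return sum(1 for _ in product((True, False), repeat=n))

def cyclomatic_complexity(n: int) -> int:
    """McCabe's number for a function with n binary decision points."""
    return n + 1

for n in (1, 2, 3, 10):
    print(n, cyclomatic_complexity(n), sequential_paths(n))
# 1 2 2
# 2 3 4
# 3 4 8
# 10 11 1024
```

A function with ten independent checks scores a modest 11 on the metric but has 1024 input combinations a reviewer would have to consider to cover every path.
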
### Why This Destroys Reviewability

When you review code, you're mentally executing it with different inputs. You're asking:

- "What happens if this is null?"
- "What if this array is empty?"
- "What if the user is not logged in?"

With high cyclomatic complexity, you cannot mentally trace all paths.

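A rough estimate of cyclomatic complexity can be computed by counting decision points in the syntax tree and adding one. The sketch below is an approximation under stated assumptions: the set of node types in `DECISION_NODES` is a simplification chosen for this example (comprehensions and `match` statements are ignored), and `can_view` is a made-up function used only as input.

```python
import ast

# Node types treated as decision points in this rough estimate.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def estimate_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a or b` has two operands and adds one short-circuit branch.
            decisions += len(node.values) - 1
    return decisions + 1

sample = '''
def can_view(user, doc):
    if user is None:
        return False
    if doc.is_public or user.is_admin:
        return True
    for group in user.groups:
        if group in doc.allowed_groups:
            return True
    return False
'''

print(estimate_complexity(sample))  # 6
```

Each counted decision corresponds to one of the "what if" questions a reviewer has to ask, which is why the number tracks review effort reasonably well.
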
## Naming: Compression and Decompression

### The Information Theory Perspective

A variable name is compressed information about what the variable represents.

When you read code, you're constantly:

1. Decompressing names into concepts
2. Holding those concepts in working memory
3. Using those concepts to understand logic

Bad names have high decompression cost.

### The Context Window Problem

Single-letter variables work in tiny scopes. 'i' is immediately understood as a loop index. But what is 'i' 50 lines later?

From first principles: Variable name quality should scale with scope size. Larger scope = more descriptive name required, because the name must survive longer in memory or be re-decompressed when encountered again.

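The scaling rule can be illustrated with two names at opposite ends of the scope range. Everything in this sketch is hypothetical, including the function names and the 30 kg limit; the point is only how much each name has to say on its own.

```python
def total_weight(weights: list[float]) -> float:
    # Tiny scope: `w` lives for one line, so a short name is cheap to decode.
    return sum(w for w in weights)

# Large scope: a module-level value is read far from where it is defined,
# so the name has to carry its meaning and unit on its own.
MAX_PARCEL_WEIGHT_KG = 30.0  # hypothetical limit, for illustration only

def is_shippable(parcel_weights_kg: list[float]) -> bool:
    """Return True if the combined parcel weight is within the limit."""
    return total_weight(parcel_weights_kg) <= MAX_PARCEL_WEIGHT_KG

print(is_shippable([10.0, 12.5]))  # True
print(is_shippable([20.0, 12.5]))  # False
```

Renaming `MAX_PARCEL_WEIGHT_KG` to `m` would not change behavior, but every reader of `is_shippable` would have to look up the definition to recover what the comparison means.
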
## Deep Nesting: The Indentation Tax

### The Cognitive Cost of Indentation

Each level of indentation represents a context you must remember.

At each level, you're asking: "Under what conditions does this code execute?"

By 4 levels deep, you're juggling 4+ conditions in working memory while also trying to understand what the code does.

### The Early Return Pattern: Reducing Cognitive Load

From first principles: Each early return settles one condition and removes it from what you must keep in mind. With n guard clauses there are n+1 paths (one for each guard plus the success path), and they can be read one at a time, rather than the up to 2^n combinations that arise when n independent conditions all remain live.

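The difference is easiest to see side by side. The sketch below writes the same hypothetical discount rule twice, once nested and once with guard clauses; the function names, the `active` flag, and the 10% rate are all invented for illustration.

```python
from typing import Optional

def discount_nested(user: Optional[dict], cart_total: float) -> float:
    # Nested form: the innermost line runs only if three enclosing
    # conditions all hold, and the reader must remember each of them.
    if user is not None:
        if user.get("active"):
            if cart_total > 0:
                return cart_total * 0.1
            else:
                return 0.0
        else:
            return 0.0
    else:
        return 0.0

def discount_guarded(user: Optional[dict], cart_total: float) -> float:
    # Guard clauses: each check is settled and left behind.
    if user is None:
        return 0.0
    if not user.get("active"):
        return 0.0
    if cart_total <= 0:
        return 0.0
    return cart_total * 0.1  # success path

# Three guards plus the success path: four cases cover every path.
cases = [
    (None, 50.0),
    ({"active": False}, 50.0),
    ({"active": True}, 0.0),
    ({"active": True}, 50.0),
]
for user, total in cases:
    assert discount_nested(user, total) == discount_guarded(user, total)
```

Both versions have the same four paths. What changes is that in the guarded version the success line sits at the top indentation level, so the reader no longer has to carry the enclosing conditions while reading it.
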
## Separation of Concerns: Modularity as Comprehension

### Why Mixed Concerns Are Impossible to Review

When code does multiple things at once, you cannot understand or verify any single thing in isolation.

From first principles: Separation of concerns converts multiplication of complexity into addition of complexity.

- Mixed: Must understand validation AND crypto AND database AND email simultaneously (multiplicative)
- Separated: Can understand validation, then crypto, then database, then email (additive)

Mathematical: Understanding cost of N concerns, as a rough model:
- Mixed: O(concern₁ × concern₂ × ... × concernₙ)
- Separated: O(concern₁ + concern₂ + ... + concernₙ)

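A minimal sketch of the separated form, using a user registration flow as the running example. All function names are hypothetical, a plain dict stands in for the database, the email check is deliberately simplistic, and the email-sending step is left out to keep the example short. The only library calls are `hashlib.pbkdf2_hmac` and `os.urandom` from the standard library.

```python
import hashlib
import os

def validate_email(email: str) -> bool:
    """Validation only: a deliberately simple shape check."""
    local, sep, domain = email.partition("@")
    return bool(local) and sep == "@" and "." in domain

def hash_password(password: str, salt: bytes) -> bytes:
    """Crypto only: derive a key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def save_user(store: dict, email: str, salt: bytes, password_hash: bytes) -> None:
    """Storage only: a dict stands in for a database."""
    store[email] = {"salt": salt, "password_hash": password_hash}

def register(store: dict, email: str, password: str) -> bool:
    """Coordinator: wires the steps together and holds no logic of its own."""
    if not validate_email(email):
        return False
    salt = os.urandom(16)
    save_user(store, email, salt, hash_password(password, salt))
    return True

users: dict = {}
print(register(users, "ada@example.com", "correct horse"))  # True
print(register(users, "not-an-email", "x"))                 # False
```

Each function can be reviewed and tested without reading the others: a reviewer checking `hash_password` needs no knowledge of how emails are validated or where users are stored.
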
## The Fundamental Theorem of Code Review

Code is reviewable if and only if:

1. It fits in working memory (size constraints)
2. Its behavior is traceable (complexity constraints)
3. Its intent is clear (naming/documentation constraints)
4. Its dependencies are explicit (coupling constraints)
5. Its correctness is verifiable (testability constraints)

Every code smell is a violation of one or more of these principles.

## The Economics of Code Quality

Code is written once but read many times. A widely cited estimate (Robert C. Martin, "Clean Code") puts the ratio of time spent reading code to time spent writing it at well over 10 to 1.

Poor code has a recurring cost:

Total Cost = Writing Cost + (Reading Cost × Number of Reads × Number of Readers)

Under this model, extra time spent writing pays for itself once the reading time it saves, multiplied across reads and readers, exceeds it.

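The cost formula can be evaluated directly to compare two ways of writing the same code. Every number in the sketch below is a hypothetical figure in hours, picked only to show the shape of the trade-off; none of them is a measurement.

```python
def total_cost(writing: float, reading: float, reads: int, readers: int) -> float:
    """Total Cost = Writing Cost + (Reading Cost x Number of Reads x Number of Readers)."""
    return writing + reading * reads * readers

# Hypothetical scenario: the careful version takes twice as long to write
# but halves the time each read takes.
quick = total_cost(writing=2.0, reading=1.0, reads=10, readers=3)
careful = total_cost(writing=4.0, reading=0.5, reads=10, readers=3)

print(quick, careful)  # 32.0 19.0
```

With these inputs the extra two hours of writing are recovered after only a handful of reads. With very few reads or readers the result flips, which is why throwaway scripts and long-lived shared code deserve different levels of polish.
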
## Toward Better Code: Design Principles from First Principles

1. Minimize Cognitive Load: Code should require minimal working memory to understand.
2. Maximize Locality: Related concepts should be close together.
3. Make Dependencies Explicit: What code depends on should be obvious.
4. Optimize for Reading: Code is read far more often than it is written. Choose clarity over cleverness.
5. Enable Verification: It should be possible to check that code is correct, through tests or direct reasoning.
