I’ve been tracking the "AI is taking all remote work" narrative, but it felt like the raw data was missing a major variable: Structural Complexity.
I audited 87 micro-tasks against the Scale AI RLI benchmarks and O*NET labor baselines. Decomposing 10 large projects into their atomic requirements, I found a sharp "Complexity Kink" at an instruction entropy of roughly 10^3. Below that threshold, agents are highly applicable; above it, their marginal productivity collapses and the "Human Agency Moat" actually widens.
Technical details:
- Metric: Instruction Entropy (E) = Solution_Tokens / Requirement_Tokens (sketched below)
- Model: Heckman Two-Step, correcting for benchmark selection bias (p = 0.012); a minimal version is sketched further down
- Robustness: Rosenbaum Sensitivity Analysis (stable up to Gamma = 1.8)
- Standardization: Applied 5% Winsorization to normalize for "human slop" in self-reported freelance durations (also in the sketch below)
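For concreteness, here's a minimal sketch of how E, the winsorization, and the kink search fit together. This is not the repo code: the CSV name (tasks.csv), the column names (solution_tokens, requirement_tokens, duration_hours, agent_productivity), and the crude two-segment fit are all stand-ins for illustration.

```python
# Minimal sketch (not the repo code): compute E, winsorize the durations,
# and locate the kink with a crude two-segment least-squares fit.
import numpy as np
import pandas as pd
from scipy.optimize import minimize_scalar
from scipy.stats.mstats import winsorize

# Hypothetical export: one row per audited micro-task.
df = pd.read_csv("tasks.csv")

# Instruction Entropy as defined above: solution tokens per requirement token.
df["E"] = df["solution_tokens"] / df["requirement_tokens"]

# 5% winsorization on self-reported durations to tame the "human slop" outliers.
df["duration_w"] = winsorize(df["duration_hours"].to_numpy(), limits=[0.05, 0.05])

# Crude kink search: split log10(E) at a candidate breakpoint, fit a line on
# each side, and pick the breakpoint that minimizes total squared error.
x = np.log10(df["E"].to_numpy())
y = df["agent_productivity"].to_numpy()  # placeholder outcome column

def sse_at(b):
    total = 0.0
    for mask in (x < b, x >= b):
        if mask.sum() < 3:
            return 1e18  # large penalty for degenerate segments
        coef = np.polyfit(x[mask], y[mask], 1)
        total += np.sum((y[mask] - np.polyval(coef, x[mask])) ** 2)
    return total

res = minimize_scalar(sse_at, bounds=(x.min() + 0.1, x.max() - 0.1), method="bounded")
print(f"Estimated kink at E ~ 10^{res.x:.2f}")
```

The breakpoint search here is deliberately dumb; it's just to show the shape of the analysis, not the actual estimator.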
The full paper (PDF) and code are in the repo.
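If you don't want to dig through the repo, here's the shape of the two-step correction as a manual sketch: a probit for whether a task shows up in the benchmark sample, then an inverse Mills ratio term in the second-stage OLS. Again, the columns (in_benchmark, domain_popularity, agent_productivity) are hypothetical placeholders, not the real variable names.

```python
# Minimal manual Heckman two-step sketch (hypothetical columns, not the repo code).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

df = pd.read_csv("tasks.csv")
df["E"] = df["solution_tokens"] / df["requirement_tokens"]

# Step 1: probit selection equation -- was the task observed in the benchmark?
# domain_popularity stands in for an exclusion restriction (affects selection,
# not the outcome).
Z = sm.add_constant(df[["requirement_tokens", "domain_popularity"]])
probit = sm.Probit(df["in_benchmark"], Z).fit(disp=0)

# Inverse Mills ratio from the probit's latent index.
idx = probit.fittedvalues
imr = pd.Series(norm.pdf(idx) / norm.cdf(idx), index=df.index)

# Step 2: outcome equation on the selected tasks only, with the IMR as a regressor.
sel = df["in_benchmark"] == 1
X = sm.add_constant(pd.DataFrame({
    "log_E": np.log10(df.loc[sel, "E"]),
    "imr": imr[sel],
}))
ols = sm.OLS(df.loc[sel, "agent_productivity"], X).fit()
print(ols.summary())  # a significant imr coefficient flags selection bias
```

One caveat on the sketch: plain OLS standard errors in the second stage are too optimistic; proper two-step implementations adjust them.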
I'd love to hear from people building production agents—are you seeing this 10^3 threshold in your own error logs, or am I over-simplifying the "connective tissue" problem?