Jack Welch made the bell curve famous in performance management. GE's "rank and yank" system forced managers to bucket employees into top 20%, middle 70%, and bottom 10%, with the bottom exiting each year. That model spread across the Fortune 500 in the 2000s, then quietly collapsed. Microsoft dropped stack ranking in 2013. GE itself retired it in 2015. The research didn't support the theory, the effects on teams turned out to be destructive, and the assumption that workforce performance naturally fits a bell curve was never actually true in most organizations.
How the Bell Curve Applies to Performance Ratings
The statistical bell curve (normal distribution) has most observations clustered around a mean with fewer at the extremes. Forced distribution (also called stack ranking or forced ranking) imposes that pattern on performance ratings by requiring managers to rate a fixed percentage of employees at each level: for example, 20% top performers, 70% average, 10% low performers.
The design logic is straightforward. Without a forced distribution, most managers rate everyone above average (the "Lake Wobegon effect"), which makes the rating system useless for differentiation. Forcing distribution pushes managers to actually discriminate between performers.
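To make the mechanism concrete, here is a minimal Python sketch of a forced distribution using the 20/70/10 split from the example above. The function name, the rounding rule, and the team data are all hypothetical, chosen for illustration rather than taken from any real company's system.

```python
def forced_distribution(scores, top_share=0.20, bottom_share=0.10):
    """Label each employee 'top', 'middle', or 'bottom' by rank alone.

    scores: dict of employee name -> numeric score (hypothetical data).
    Slot counts are rounded from the shares, with a minimum of one each.
    That rounding rule is an assumption; real schemes vary.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    n_top = max(1, round(n * top_share))
    n_bottom = max(1, round(n * bottom_share))

    labels = {}
    for position, name in enumerate(ranked):
        if position < n_top:
            labels[name] = "top"
        elif position >= n - n_bottom:
            labels[name] = "bottom"
        else:
            labels[name] = "middle"
    return labels


# A hypothetical team of ten, all scoring between 4.0 and 4.9 out of 5.
team = {f"emp{i}": 4.0 + i * 0.1 for i in range(10)}
labels = forced_distribution(team)
```

The labels depend only on rank within the team, not on the scores themselves: in this sketch `emp0` lands in the bottom slot with a 4.0 out of 5 because someone has to.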
Why Most Large Companies Abandoned Forced Distribution
The theory sounds clean; the practice created predictable problems. In a high-performing team, forced distribution required managers to label strong performers as "needs improvement" solely because someone had to fill the bottom slot. In a weak team, the reverse: managers had to label mediocre performers as "top." Either case undermined the rating's meaning.
Research on forced distribution systems has documented weaker collaboration (employees compete rather than cooperate), lower engagement scores, and higher voluntary turnover among high performers who don't want to work under the system. Research from McKinsey, among others, has also pushed back on the assumption that employee performance follows a normal distribution at all; more recent studies suggest it often follows a power law, with a small number of very high performers and a long tail.
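The difference between the two shapes is easier to see with numbers. The sketch below compares how much of total output the top 10% of people account for under a normal distribution and under a heavy-tailed (Pareto) one. All parameters are illustrative assumptions, not estimates from any study: a mean of 100 with standard deviation 15 for the normal case, and a Pareto shape of 1.5 for the power-law case.

```python
from statistics import NormalDist


def top_share(values, fraction=0.10):
    """Share of the total contributed by the highest `fraction` of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


n = 1000
# Evenly spaced quantiles stand in for a workforce of n people,
# so the comparison is deterministic (no random sampling).
quantiles = [(i + 0.5) / n for i in range(n)]

# Bell-curve assumption: output is normal around 100 with SD 15 (illustrative).
bell = NormalDist(mu=100, sigma=15)
normal_output = [bell.inv_cdf(p) for p in quantiles]

# Heavy-tailed assumption: Pareto with shape 1.5 and minimum 1 (illustrative),
# generated from its inverse CDF x = (1 - p) ** (-1 / alpha).
alpha = 1.5
pareto_output = [(1 - p) ** (-1 / alpha) for p in quantiles]

normal_top = top_share(normal_output)
pareto_top = top_share(pareto_output)
print(f"top 10% share, normal: {normal_top:.0%}")
print(f"top 10% share, Pareto: {pareto_top:.0%}")
```

Under the normal assumption the top decile's share of output stays not far above its 10% share of headcount; under the heavy-tailed assumption it is several times larger. A rating scheme built for the first shape has no good way to represent the second.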
Does Anyone Still Use Stack Ranking?
A small number of large companies still use versions of forced distribution, often modified (soft distributions, ranges rather than hard percentages). Public-sector organizations, some financial services firms, and certain consulting firms maintain stack-ranking elements. As of 2026, most Fortune 500 companies that tried it in the 2000s have either dropped it or significantly softened it.
What Replaced the Bell Curve in Modern Performance Management
Most modern performance review systems use rating scales without forced distribution, often paired with calibration sessions where managers discuss ratings across teams to identify inconsistencies without imposing artificial quotas. Some companies have moved away from numerical ratings entirely, replacing them with narrative reviews and ongoing feedback.
The core insight from post-bell-curve systems is that differentiation is valuable when it reflects real performance differences, not when it's manufactured to fit a statistical pattern. Strong calibration, clear competency anchors like BARS (behaviorally anchored rating scales), and manager training on how to rate without bias do more to produce meaningful ratings than any forced distribution ever did.
Using Distribution Data Without Forcing the Bell Curve
Distribution data is still useful diagnostically even when not imposed. If a manager rates 95% of their team as top performers while peer managers rate about 30% of theirs that way, that's useful information for HR and the manager's own leadership. The response isn't to force redistribution; it's to investigate whether the manager is actually developing stronger talent, failing to differentiate, or being overly generous for conflict-avoidance reasons.
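One way to run this kind of diagnostic is to compare each manager's share of top ratings with the median share among their peers and flag large gaps for a conversation. The sketch below assumes a hypothetical input (a mapping of manager to the fraction of their team rated as top performers) and an arbitrary tolerance of 25 percentage points. Both are placeholders to tune, not recommended values.

```python
from statistics import median


def flag_outlier_managers(top_share_by_manager, tolerance=0.25):
    """Return managers whose top-rating share differs from the peer median
    by more than `tolerance`, mapped to the size of the gap.

    The result is a list of people to talk to, not a correction to apply.
    """
    flagged = {}
    for manager, share in top_share_by_manager.items():
        peers = [s for m, s in top_share_by_manager.items() if m != manager]
        if not peers:
            continue  # nobody to compare against
        gap = share - median(peers)
        if abs(gap) > tolerance:
            flagged[manager] = round(gap, 2)
    return flagged


# Hypothetical data echoing the 95% versus 30% example above.
shares = {
    "manager_a": 0.95,
    "manager_b": 0.30,
    "manager_c": 0.28,
    "manager_d": 0.35,
}
flagged = flag_outlier_managers(shares)
```

Here only `manager_a` is flagged. The code deliberately stops at flagging: whether the gap reflects a stronger team, a failure to differentiate, or conflict avoidance is a question for calibration, not for arithmetic.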
Ongoing calibration, external benchmarks, and tying ratings to observed behaviors (not aggregate percentages) produce the differentiation forced distribution was supposed to create, without the collateral damage. The bell curve remains a useful statistical lens on rating data; imposing it as a rating target is what the research has moved away from.