
High vs. Low Load Training NOT to Failure

This is a pretty cool study. It’s the first paper comparing hypertrophy in high rep/low load and low rep/high load training NOT to failure.  This is important because people often claim that muscle growth is the same between high and low load training only if the low load sets are taken to failure.  However, as far as I’m aware, no one had actually investigated that idea.  We know that if sets are taken to failure, hypertrophy is similar.  However, we don’t know that hypertrophy wouldn’t be similar if the sets stopped near (but not at) failure.  One paper I recently reviewed in MASS came close to investigating this question, but the groups in that study not going to failure still went to what we’d call an RPE 10 (you didn’t fail a rep, but you couldn’t have done one more), which is how most people in the gym conceptualize going to failure anyways.

Briefly:  A group of untrained young men performed knee extensions for 8 weeks.  One group did 3 sets of 8 with 80% of their 1RM (something that should be pretty challenging, but not all the way to failure), while one group did 12 sets of 8 with 30% of their 1RM (a ton of sets, but each individual set should have been pretty easy).

The researchers looked at isometric and dynamic strength increases and thickness of the rectus femoris (assessed via ultrasound).  They took a few other measures, but those are the ones that are most important to us.  1RM tests occurred once every two weeks.

They found no statistically significant differences in hypertrophy, but the raw percentage change seemed to favor the group using heavier loads (20.4% for the high load group vs. 11.3% for the low load group), so there may actually be a meaningful difference that couldn’t be detected due to low statistical power.  Furthermore, strength gains were almost identical (40.9% for the low load group and 36.2% for the high load group for 1RM; 24% for the low load group and 25.5% for the high load group for isometric strength).

Let’s unpack this a bit.

I’m personally not a fan of how they went about keeping the high rep group from failure.  There would have been a few “good” options (ordering is just my opinion):

Good:  Match reps per set and relative volume.  In this study, the 30% group did 4x as many sets, leading to a much higher relative volume.  80%×3×8=19.2.  30%×12×8=28.8.  They could have, instead, compared 40% to 80%, and had the 40% group do 6 sets of 8.  40%×6×8=19.2.

Better:  For each individual, have them perform a rep max test with their assigned load, then match the number of sets between groups, and assign reps for each individual based on a given percentage of the maximum number of reps possible with the training load.  For example, if you put that percentage at 75% (doing 75% as many reps as they could do with a rep max test), and someone did 8 reps with 80%, you could have them do 8×0.75=6 reps per set.  If someone did 28 reps with 30%, you could have them do 28×0.75=21 reps per set.  Each set would get progressively closer to failure, but using something like 70-75% of max reps should keep people away from failure over 3 sets.  Loads and reps could be reassessed after every max test.

Best:  Use reps in reserve.  People can accurately assign reps in reserve, even with low loads, within 1-2 reps to failure.  They could have just matched the number of sets and kept people with an RIR of 1-2.

As it was, they didn’t equate any training parameter except for reps per set (which was probably the least important thing to equate), and didn’t even use a protocol that was ecologically valid (I doubt many people do 12 sets of knee extensions).
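The arithmetic behind the "Good" and "Better" options above can be sketched in a few lines of Python.  This is just an illustration: the function names are mine, "relative volume" is simply load (as a fraction of 1RM) × sets × reps as used above, and rounding the rep target down is an assumption rather than anything the study specified.

```python
def relative_volume(load_fraction, sets, reps):
    """Relative volume: load (fraction of 1RM) x sets x reps."""
    return load_fraction * sets * reps


def submax_reps(rep_max, fraction=0.75):
    """Reps per set as a fixed fraction of a rep-max test (rounded down, by assumption)."""
    return int(rep_max * fraction)


high_load = relative_volume(0.80, 3, 8)    # the study's 80% group
low_load = relative_volume(0.30, 12, 8)    # the study's 30% group
matched = relative_volume(0.40, 6, 8)      # the suggested 40% alternative

print(round(high_load, 1), round(low_load, 1), round(matched, 1))  # 19.2 28.8 19.2
print(submax_reps(8), submax_reps(28))                             # 6 21
```

The 30% group ends up with 1.5x the relative volume of the 80% group, while the hypothetical 40% group would have matched it.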

All of that being said, there was one thing that really interested me about this study. This study, much like the 50% vs. 80% Morton paper last year, included fairly frequent 1RM tests and, like the Morton paper, found similar strength gains in spite of big differences in loading. It would be cool to see a study specifically designed to investigate whether semi-frequent 1RM tests are enough to mitigate the strength advantage of high load over low load training.

Such a study could include 4 groups:

  1. A high load group only testing 1RM pre- and post-training (maybe week 0 and 12)
  2. A low load group only testing 1RM pre- and post-training
  3. A high load group testing 1RM semi-frequently (maybe week 0, 3, 6, 9, and 12), and…
  4. A low load group testing 1RM semi-frequently.

If it turned out that simply hitting a 1RM every 2-4 weeks was enough to ensure a solid rate of progress, that would be very useful to know.  It would allow coaches and athletes a LOT more flexibility in designing training programs, especially in situations where strength is an important goal but not the main goal.

p.s. I plan on doing more of these short study write-ups since my time to write full articles has been severely curtailed with grad school.  Any time I DO have to write is devoted to MASS at this point.

p.p.s. That probably won’t happen, but a boy can dream.

Group Data Don’t Tell You Much About Individuals

I’ve been poking around in the USAPL dataset my last article was based on, and I came across something that’s worth a quick share.

I was interested in seeing whether your current level of strength is predictive of the rate you could expect to make gains in the short-to-medium term.  To investigate, I picked out everyone who showed up in the database multiple times.  Since the date of each competition was reported, I could calculate the rate of strength gain or loss per day between meets using this formula:  (second total – first total)/(first total × days between meets) = rate of strength gains.

For example, if someone totalled 1200 on March 4th, and then totalled 1260 on May 13th of the same year, their rate of strength gains would be (1260 – 1200)/(1200 × 70) = 0.07% stronger per day.
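For anyone who wants to run the same calculation on their own meet results, here is a minimal sketch of that formula.  The function name is mine, and the year in the example is arbitrary (the example above only specifies the month and day).

```python
from datetime import date


def daily_gain_rate(first_total, second_total, first_date, second_date):
    """(second total - first total) / (first total x days between meets)."""
    days = (second_date - first_date).days
    if days <= 0:
        raise ValueError("the second meet must come after the first")
    return (second_total - first_total) / (first_total * days)


# The worked example: 1200 on March 4th, 1260 on May 13th (year chosen arbitrarily).
rate = daily_gain_rate(1200, 1260, date(2017, 3, 4), date(2017, 5, 13))
print(f"{rate:.2%} stronger per day")  # 0.07% stronger per day
```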

The only people I removed from the dataset were very clear outliers (i.e. z-scores of ±8 or more) who pretty obviously either had a good meet and then got injured in their next meet (so they posted a total way below what they were capable of in the second meet, showing an unrealistic loss in strength), or vice versa (they posted a total way below what they were capable of in the first meet, showing an unrealistic gain in strength).

I used allometrically scaled strength instead of absolute strength to account for body size (i.e. you may expect that someone who weighs 150lbs would struggle to add to a 1500lb total, whereas someone who’s 300lbs would have no problem adding to a 1500lb total).

Here were the results for men:

And here were the results for women:

If you looked at these two graphs and thought to yourself, “hey, the relatively weaker people may have made slightly faster progress, but there’s really not much of a relationship at all here,” you’d be correct.  The r² value with a simple linear regression was only 0.06 for men, and 0.12 for women, meaning variation in strength only explained 6-12% of the variation in rate of progress.  That’s counterintuitive because we assume that weaker people will predictably make faster gains than stronger people – while that relationship did show up, I’d wager that most people wouldn’t expect initial relative strength to be such a weak predictor.

Now, however, let’s look at the data expressed another way.  In the graphs below, I grouped people based on their initial relative strength levels.

Here’s how it looks for men:

And here are the women:

These graphs come from the exact same data, without any sort of underhanded manipulation.  I decided how large to make each range of strength to group people together before analysis, and I didn’t tinker with those ranges or go back and fiddle with any data to get a better fit.  On a group level, rate of progress declines almost perfectly linearly for men as relative strength increases.  For the women, there’s an almost perfect exponential decrease in rate of strength gains as relative strength increases.

However, these very clear group trends mask the tremendous variability between individuals.
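To see how a near-perfect group trend and a weak individual-level relationship can come from the same data, here is a toy sketch.  The numbers are made up purely for illustration (they are NOT the USAPL data): five hypothetical strength levels, five lifters per level, with each lifter's rate of progress equal to a linearly declining group mean plus a fixed individual offset.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r^2 of a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical toy data: group mean rate is (10 - strength level),
# and individuals scatter around it by a fixed set of offsets.
strength_levels = [1, 2, 3, 4, 5]
individual_offsets = [-4, -2, 0, 2, 4]

strength, rate = [], []
for level in strength_levels:
    for offset in individual_offsets:
        strength.append(level)
        rate.append(10 - level + offset)

# Individual-level fit: strength explains little of the variation.
individual_r2 = r_squared(strength, rate)

# Group-level fit: average each strength level first, then regress the means.
group_means = [
    sum(r for s, r in zip(strength, rate) if s == level) / len(individual_offsets)
    for level in strength_levels
]
group_r2 = r_squared(strength_levels, group_means)

print(individual_r2)  # 0.2
print(group_means)    # [9.0, 8.0, 7.0, 6.0, 5.0]
print(group_r2)       # 1.0
```

Averaging within each group throws away the individual scatter, which is why the group means fall on a perfect line even though strength only explains 20% of the individual variation in this toy example.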

This is a challenge we need to deal with when approaching scientific data.  Studies tend to report changes at the group level and differences between groups, but as you can see, there’s a lot of individual variation lurking beneath the surface.

Assuming that the characteristics of a group accurately describe all the individuals in that group is (depending on the circumstance) either a fallacy of division or an ecological fallacy.  They’re easy traps to fall into, even among bright people.  When you discuss the results of a study, I think it’s important to work the phrase “on average” into your discussion pretty liberally when describing group-level results.  As a reader/listener, you should assume those “on average”s are peppered throughout the discussion, even when they’re not explicitly stated.  That will help keep you from falling into this very pervasive and sneaky trap.

Now, don’t get me wrong:  it’s not that there’s anything wrong with knowing/reporting group averages.  They’re extremely valuable as a starting point.  As a coach, knowing that relatively weaker people tend to progress faster than relatively stronger people is worthwhile when setting up a program with a set progression scheme, for example.  You may want to start weaker people off with a faster progression scheme, and stronger people with a slower progression scheme.  However, you need to be aware that the weaker person may need to progress slower, or the stronger person may be able to handle a faster progression scheme – they can be different from the group averages without being particularly abnormal.  The same applies to essentially any training variable, from volume to intensity to frequency to exercise selection, etc.  Group data are great for establishing a starting point, but individual experimentation is needed past that point.

If you’re interested in science and value thorough and honest data analysis, you’ll probably like the research review I’m launching with Eric Helms and Dr. Mike Zourdos.  You can pick up the first copy for free here, or by clicking the image below.

Thanks again to /u/ferruix for curating the data over at OpenPowerlifting, and /u/TechnoAllah for hooking me up with the complete dataset.  Also, thanks to Andrew Vigotsky for inspiring this article.

What Everyone Gets Wrong About FFMI and the “Natty Limit”

I constantly see the claim that an FFMI of 25 is the “natty limit” of muscularity, and that it’s impossible (or at least unbelievably unlikely) that you can get more muscular than that without the use of steroids.

To backtrack a bit for people who feel like they’re stepping into the middle of a conversation, the Fat-Free Mass Index (FFMI) is a measure of muscularity.  You calculate it by dividing lean body mass (in kg) by height (in meters) squared.

FFMI = lean body mass (kg) / height (m)²

It’s essentially the same formula as Body Mass Index (BMI), but for lean body mass instead of total body mass.  The higher your FFMI, the more jacked you are.
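As a quick sketch of the calculation, assuming you already have a body fat estimate to get from total body mass to lean mass (the function names and the example lifter below are hypothetical):

```python
def lean_mass_kg(body_mass_kg, body_fat_fraction):
    """Lean body mass from total mass and body fat (0.12 means 12%)."""
    return body_mass_kg * (1 - body_fat_fraction)


def ffmi(lean_kg, height_m):
    """Fat-Free Mass Index: lean body mass (kg) divided by height (m) squared."""
    return lean_kg / height_m ** 2


# Hypothetical lifter: 85 kg at 12% body fat, 1.78 m tall.
lean = lean_mass_kg(85, 0.12)
print(round(lean, 1))              # 74.8
print(round(ffmi(lean, 1.78), 1))  # 23.6
```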

It’s been proposed by several prominent members of the online fitness community that no drug-free lifters can attain an FFMI above 25 – if someone has an FFMI over 25, you know for sure they’re on drugs.  The less extreme view is that one or two rare outliers may be able to attain an FFMI over 25 without drugs, but doing so would be so incredibly unlikely that you can still be 99% sure someone’s on the sauce if their FFMI exceeds 25.

In this article, I want to explain why that position is probably wrong or, at the very least, why there’s insufficient evidence to make such a statement.

This is a topic I’ve addressed before, but:

  1. It was in a rather dry methodology section in a previous article.  It wouldn’t surprise me if most people simply skipped this discussion to get to the more exciting stuff.
  2. This is a claim I still see all the time (like, seriously at least once or twice a day), so I think it deserves its own article to debunk it once and for all.

The claim that an FFMI of 25 is the “natty limit” can be traced back to this study:  “Fat-Free Mass in Users and Nonusers of Anabolic-Androgenic Steroids” by Kouri et al., 1995.

To pull a quote from the discussion of this article:  “In an examination of 157 athletes, comprising 83 steroid users and 74 nonusers, we calculated normalized FFMI using height, weight, and body fat based on skinfold measurements.  With this simple measurement, we found that athletes who had not used steroids all had values of <25.0, whereas a large proportion of steroid-using athletes easily exceeded this limit.”

That seems pretty cut and dry, right?  As I’m sure you can surmise from the introduction, I think we need to dig a bit deeper.  I’ll be pulling a lot of direct quotes from the study, but the full text is available for free (and it’s not overly technical), so I’d encourage you to read it for yourself.

What the Researchers Did

From the study:

“One hundred fifty-six men in a large controlled study of athletes recruited at gymnasiums in the Boston and Los Angeles areas, were administered physical examinations as part of a larger study (14).  These physical examinations included determinations of height, body weight, and body fat, the latter computed from the sum of six skinfold measurements using an equation derived from the data of Jackson and Pollock.”

That doesn’t tell you all that much about the people included in the trial, so I tracked down the prior study that expanded upon the inclusion criteria:

“We advertised in four gymnasiums in the Boston, Mass, area and in three gymnasiums in the Santa Monica, Calif, area to recruit subjects.  We offered $60 for a confidential interview to any male aged 16 years or older who had lifted weights for at least 2 years.”

This is our first red flag:  If you’re designing a study to see what the limits of drug-free muscularity are, you’d want to make sure your subjects are actually at least near their own genetic ceilings.  As it is, the only requirements were being at least 16 years old, and lifting weights for at least two years.  I hope we can all agree that a) most gym-goers don’t train particularly effectively and b) most people aren’t closing in on their genetic limits after just 2 years of training.

Now, it’s likely that there were a few subjects who were actually pretty close to their muscular limits.¹  However, odds are very good that most of the participants were just typical gym-goers – not the population you want to study if you’re interested in the limits of drug-free muscularity.  At the very least, there was an incentive for anyone to participate (getting paid $60), and no methods in place to specifically screen for people who were nearing their limits.

It’s not uncommon to re-analyze data that had been collected for a separate study.  However, it’s important to make sure the data are equipped to answer the research question proposed in the new study.  In this case, they aren’t.

The next few paragraphs discuss how some of the men didn’t have skinfold measurements and couldn’t be included in the analysis, and how an extra batch of subjects from a study in progress were added, leaving them with a pool of 157 subjects.  “Of these, 74 (47%) had never used steroids (henceforth called ‘nonusers’) and 83 (53%) had used steroids (‘users’).”

This is our second red flag:  If you’re designing a study to assess the limits of any human trait, you’d better make sure your sample size is larger than 74 individuals.  Even if you have a sample of 74 exceptionally tall people, you’re probably not going to find any 8′ people (the world record is 8’11.1″).  Even if you have a sample of 74 fast people, you’re probably not going to find anyone who runs a 9.8s 100m (the world record is 9.58s).  Even if you have a sample of 74 exceptionally strong people, you’re probably not going to find any 600lb benchers (the world record is 738.5lbs).  People who are 8′ tall, run a 9.8s 100m, and bench press 600lbs are freaks, but not particularly close to the highest level of human attainments in those domains.

In short, if you want to know how jacked someone can possibly get without drugs, you’re going to need more than 74 subjects, regardless of who those subjects are.

To the researchers’ credit, they acknowledge this.  From the conclusion:

“Admittedly, one cannot definitively diagnose steroid use simply on the basis of the FFMI, much as one cannot make a definitive diagnosis of alcohol intoxication in a man who displays ataxia and dysarthria upon getting out of his automobile.  In the latter case, however, the individual may be required for forensic reasons to produce a breath or urine sample.  Perhaps we could ultimately follow an analogous procedure in forensic situations with individuals displaying an abnormally elevated FFMI.”

The researchers knew that their data weren’t sufficient to assume anyone with an FFMI of 25+ was automatically on steroids.  They proposed that FFMI should work as nothing more than an initial screen:  if someone has a really high FFMI, that just means there may be sufficient reason to do a blood or urine test for steroids.

I think we can all agree that’s reasonable.  There’s a higher chance that super jacked people are on steroids than less jacked people.  However, labeling an FFMI of 25 as a hard limit for non-users was a subsequent invention of the internet.  It’s not something proposed by this study, and it’s not something the researchers themselves would agree with.

Next, the researchers plotted the FFMIs of users and nonusers and discussed their data (lengthy quote incoming):

“Figure 1 shows a plot of FFMI versus height in meters for all of the subjects in the study.  The nonusers extended up to a well-defined limit, shown as a diagonal line in the figure; many nonusers were just below this line, but none exceeded it.  On the other hand, users extended well beyond the line with 37 (45%) of the users attaining levels of FFMI beyond the uppermost of the nonusers.

The ‘cutoff’ line has a positive slope rather than a zero slope in the figure, perhaps because the factor of height⁻² in the FFMI calculation does not fully account for the fact that human beings are three-dimensional rather than two-dimensional objects.  In other words, the tallest athletes were not only taller, but also wider and thicker than the shorter athletes of apparently comparable muscularity; thus, the tallest athletes scored somewhat higher on the FFMI calculation.  Our clinical impressions supported this speculation.  During the preparation of this article, we called in the shortest nonuser (height 1.59m) and one of the tallest nonusers (height 1.93m) and remeasured both of them.  The shortest athlete displayed an FFMI (without normalization) of 23.5, whereas the tall one scored 25.4.  However, on visual inspection, the short athlete appeared more muscular than did the tall one.

To generate an approximate correction for this apparent effect of height, we calculated the slope of a regression line drawn through a plot of all the ‘elite’ nonuser athletes with FFMI scores of 22 or above. (We limited the regression calculation to this subgroup because we felt that the distribution of the elite group would more closely reflect the dictates of physiology and not be confounded by lack of achievement, as in the less muscular subjects.) This calculation yielded a slope of 6.1kg/m2.  We then used this value to calculate a ‘normalized’ FFMI, in which the FFMI was normalized to that of a 1.8-m athlete (the mean height of the nonusers):

Normalized FFMI = FFMI + 6.1 x (1.8 – h)

where h is height in meters.

Using normalized FFMI, we obtained the plot shown in Fig. 2.  Again, it can be seen that the nonusers ‘stop’ abruptly at a maximum value of 25.0, whereas many users extend well beyond this limit.”

First, let’s take a look at the data they’re referring to:

Next, let’s unpack these paragraphs:

1)  The authors acknowledge that FFMI itself may not be a great way to assess muscularity in the first place.²

Something like a version of the corpulence index (CI) applied to lean mass may work better.  While BMI is mass divided by height squared, CI is mass divided by height cubed (to account for the fact that humans are three-dimensional).  The FFMI formula is the same as the BMI formula, except that it only deals with lean mass instead of total mass; lean mass divided by height cubed (similar to the CI) may work better.

On the other hand, other work has shown that there’s actually a negative relationship between BMI and height, suggesting that you should instead raise height to a power smaller than 2 to accurately scale body mass to height.  The same may apply to lean mass as well.

TL;DR:  scaling is tricky, and it’s not even clear that FFMI is actually a valid, meaningful measure to compare human muscularity.
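To make the scaling issue concrete, here is a small sketch that scores the two remeasured nonusers from the quote above (1.59m with a raw FFMI of 23.5, and 1.93m with a raw FFMI of 25.4) using different height exponents.  Their lean masses are back-calculated from the reported FFMIs, and the function name is mine.  This is purely illustrative; it isn’t a claim that any particular exponent is the “right” one.

```python
def lean_mass_index(lean_kg, height_m, exponent):
    """Lean mass divided by height raised to a chosen power (2 gives FFMI)."""
    return lean_kg / height_m ** exponent


# Lean mass back-calculated from the raw FFMIs reported in the quote.
short_height, tall_height = 1.59, 1.93
short_lean = 23.5 * short_height ** 2
tall_lean = 25.4 * tall_height ** 2

for exponent in (2, 3):
    short_score = lean_mass_index(short_lean, short_height, exponent)
    tall_score = lean_mass_index(tall_lean, tall_height, exponent)
    print(exponent, round(short_score, 2), round(tall_score, 2))
# 2 23.5 25.4
# 3 14.78 13.16
```

With height squared, the tall lifter scores higher; with height cubed, the ranking flips.  Which lifter counts as “more muscular” depends on a scaling choice that hasn’t really been validated.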

2) Going by raw FFMI values, there was actually at least one individual in the nonusers group who had an FFMI above 25.

One guy was 1.93m tall (6’3”) with an FFMI of 25.4, meaning he had about 94.6kg (208.5lbs) of lean mass.  I shouldn’t need to tell you this, but that’s pretty damn big.  For context, that means he’d step on a bodybuilding stage at 7% body fat at around 102kg (225lbs).  The FFMI “cutoff” of 25 doesn’t arise until the researchers applied a “correction” to their data.
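Those numbers come from running the FFMI formula backwards.  A minimal sketch (function names are mine; the pounds conversion uses 2.20462 lb per kg):

```python
LB_PER_KG = 2.20462


def lean_mass_from_ffmi(ffmi, height_m):
    """Invert FFMI = lean mass / height^2 to recover lean mass in kg."""
    return ffmi * height_m ** 2


def body_mass_at_body_fat(lean_kg, body_fat_fraction):
    """Total body mass that carries a given lean mass at a given body fat level."""
    return lean_kg / (1 - body_fat_fraction)


lean = lean_mass_from_ffmi(25.4, 1.93)
stage = body_mass_at_body_fat(lean, 0.07)

print(round(lean, 1), round(lean * LB_PER_KG, 1))  # lean mass in kg and lbs
print(round(stage), round(stage * LB_PER_KG))      # stage weight in kg and lbs
```

This lands at roughly 94.6kg of lean mass and a stage weight of about 102kg, in line with the figures above.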

3) The correction they applied³ was post-hoc and fairly arbitrary.

In the methods section of the paper, the authors state that their intention was simply to calculate FFMIs of the athletes using the typical FFMI formula (lean mass divided by height squared).⁴  They didn’t decide to make any adjustments until they’d already collected their data.  That’s not necessarily a “bad” thing, but results you only get after doing some post-hoc fiddling with your data aren’t supposed to be heralded as the main finding of a study; they typically just get a brief mention in the discussion.

You can look at the scatterplot itself to see that the correction they applied probably wasn’t necessary.  If there were an overall positive trend between FFMI and height, a correction might be warranted.  In this case, it’s pretty clear that the relationship between FFMI and height is either weak or nonexistent.  The line drawn through the data isn’t a trendline; it’s just an arbitrary line on which the drug-free people with the 1st, 3rd, 6th, and 13th highest FFMIs in the study fell.

To calculate the correction (which they admit is an “approximate” correction), they picked a subgroup of the nonusers and looked at the relationship between height and FFMI.  Importantly, they didn’t report a correlation coefficient to tell us the strength of the relationship; if it wasn’t a strong relationship in the first place, it would seem odd to use it to calculate the correction.

TL;DR:  without a correction, there were one or two people in a random sample of 74 gym rats with an FFMI over 25.  The authors’ justification for applying a correction is pretty flimsy, and the correction was a post-hoc addition in the first place.

4) The authors themselves don’t even think the correction “worked.”

This is pretty easy to miss if you’re not paying attention, but the authors state:

“During the preparation of this article, we called in the shortest nonuser (height 1.59m) and one of the tallest nonusers (height 1.93m) and remeasured both of them.  The shortest athlete displayed an FFMI (without normalization) of 23.5, whereas the tall one scored 25.4.  however, on visual inspection, the short athlete appeared more muscular than did the tall one.”

Here are those two individuals:

[Figure: FFMI Kouri scatterplot 2]

We know enough about them to calculate their “normalized” FFMIs.  It’s 24.78 for the short guy, and 24.6 for the tall guy – virtually identical.  The 0.18 point difference is effectively meaningless (around .5kg/1lb of lean mass).
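If you want to check those numbers, the normalization from the paper (Normalized FFMI = FFMI + 6.1 x (1.8 – h)) is easy to apply.  The function and parameter names below are mine; the slope and reference height come straight from the quoted formula.

```python
def normalized_ffmi(ffmi, height_m, slope=6.1, reference_height_m=1.8):
    """Height-normalized FFMI: FFMI + 6.1 x (1.8 - h), per the quoted formula."""
    return ffmi + slope * (reference_height_m - height_m)


short_guy = normalized_ffmi(23.5, 1.59)
tall_guy = normalized_ffmi(25.4, 1.93)

print(round(short_guy, 2))  # 24.78
print(round(tall_guy, 2))   # 24.61
```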

The authors themselves say they thought the shorter guy seemed more muscular than the taller guy, but their formula says they’re equally jacked.  However, if they applied a larger correction to reflect that, it would mean pushing the short guy over the “magic” FFMI threshold of 25.

Next, the study goes from “okay, this isn’t great, but if we overlook some flaws, we can still probably learn something,” to “holy crap, how the heck did this even get published”:

“To further test the limits of FFMI, we obtained the heights, weights, and ages, at the time of competition, of all Mr. America winners from 1939 to 1959.  Because anabolic steroids were not available in gymnasiums during this era (Todd T, personal communication, July 1994), these athletes likely represented the maximum FFMI attainable without drugs.  The second author (H.G.P.) estimated the body fat of each athlete from contemporaneous photographs in bodybuilding magazines of the era, averaging the estimates from several photographs of each athlete. [Dr. Pope based these estimates on having performed body fat measurements with calipers on >200 men in the course of previous studies, thus giving him substantial experience in estimating fat from a subject’s appearance.]  The athlete’s face and written identifying information were obscured during this exercise to render all estimates blind.  Adequate photographs could not be found for two Mr. America winners (Park, 1952; and Schaefer, 1956).  The estimated normalized FFMIs for the other 20 athletes are shown in Table 2 and charted on the left-hand side of Fig. 3.  It will be seen that the presteroid Mr. America winners displayed a mean (+/- SD) normalized FFMI of 25.4 +/- 1.5, with only three having values of >27.0.”

[Figure: Mr. America FFMI]

Let’s just take this point by point.

  1. “We obtained the heights, weights, and ages, at the time of competition, of all Mr. America winners from 1939 to 1959.”  How do they know the information was accurate?  I have a copy of the book they cited as a source (The Super Athletes by Willoughby), but the book doesn’t cite a source to verify the numbers.  Right off the bat, it’s entirely possible that the reported heights and weights were wrong.
  2. “Because anabolic steroids were not available in gymnasiums during this era…”  Eric Helms has done a great job of documenting the history of steroid creation and dissemination in this article, but in short, it’s not true that steroids weren’t available all the way up to 1959.  We can be 99.9% sure that all winners before 1944 were truly drug-free, and quite confident that all winners before 1954 were drug-free (the first corroborated reports of testosterone use in US bodybuilding circles come from the early 50s).  However, there’s a decent chance that a fair number of the bodybuilders in the late 50s had dabbled with steroids.  This isn’t a major issue, but you’d expect more due diligence in a journal article.
  3. “…these athletes likely represented the maximum FFMI attainable without drugs.”  That’s a HUGE reach.  Bodybuilding was a tiny sport in the 1940s and 1950s, so to assume the bodybuilders of that day attained the absolute peak of drug-free human muscularity is absurd.  Compare the best athletes from the 40s and 50s to the best athletes in essentially any sport today – almost without exception, the top pros of yesteryear would be middling amateurs today as talent pools have grown.  There were even several drug-free lifters in the Kouri study with FFMIs higher than several of the Mr. America winners of this era!  I’m not going to argue that Grimek (FFMI of 26.9 in 1942), Stanko (FFMI of 27.3 in 1944), Eiferman (FFMI of 27.7 in 1948) and Delinger (FFMI of 28.0 in 1949) weren’t super jacked.  However, it’s asinine to assume they represented the absolute peak of drug-free muscularity.  In fact, we don’t even know that they were at their all-time best when they won the Mr. America.  After the organizers feared that Grimek was unbeatable in 1942, they instituted a rule saying that you were only allowed to win the contest once – all four of these men may very well have gotten more muscular after winning the contest, but they weren’t allowed to compete again.
  4. “The second author (H.G.P.) estimated the body fat of each athlete from contemporaneous photographs in bodybuilding magazines of the era, averaging the estimates from several photographs of each athlete.”  This is where I nearly spat out my coffee.  Visually estimating body fat percentages?  Based on images from bodybuilding magazines that very well may have been edited?  Is this a journal article or a bodybuilding.com thread?

You’d be totally entitled to disregard this section of the study entirely, as it doesn’t live up to literally any reasonable scientific standards.  However, I think we can take one thing away from it – unless the cited heights and weights were way off, and unless the body fat estimations were way off, this section kills the notion that an FFMI of 25 is a hard limit.

To use Stanko as an example (FFMI of 27.3 in an era where we can be 99.99% sure he was truly drug-free) – he was 2.3 FFMI points over the “natty limit” of 25.  Stanko was apparently 5’11.5” (1.816m) and weighed 223lbs (101.15kg).  An FFMI of 27.3 means he had 199.2lbs (90.37kg) of lean body mass, putting his estimated body fat percentage at 10.67%.

To have an FFMI of only 25, he could have at most 182.49lbs (82.78kg) of lean body mass, putting his body fat percentage at 18.16%.  In other words, either his reported weight was way off, one of the authors estimated he was a pretty lean 10% body fat when he was actually closer to 20%, or his FFMI was considerably higher than 25 in an era where we can be almost 100% certain he was truly drug-free.
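Here is a sketch of that back-calculation.  One assumption to flag loudly: it treats the 27.3 as a normalized FFMI and undoes the paper’s height normalization before multiplying by height squared.  That reading reproduces the figures above to within rounding, but the function names and that interpretation are mine.

```python
KG_PER_LB = 0.45359237
M_PER_INCH = 0.0254


def raw_ffmi_from_normalized(normalized, height_m, slope=6.1, reference_height_m=1.8):
    """Undo the normalization: raw FFMI = normalized - 6.1 x (1.8 - h)."""
    return normalized - slope * (reference_height_m - height_m)


def implied_lean_mass_and_body_fat(normalized, height_m, body_mass_kg):
    """Lean mass (kg) and body fat fraction implied by a normalized FFMI."""
    lean_kg = raw_ffmi_from_normalized(normalized, height_m) * height_m ** 2
    return lean_kg, 1 - lean_kg / body_mass_kg


height = 71.5 * M_PER_INCH  # 5'11.5"
mass = 223 * KG_PER_LB      # 223 lbs

lean_reported, bf_reported = implied_lean_mass_and_body_fat(27.3, height, mass)
lean_at_25, bf_at_25 = implied_lean_mass_and_body_fat(25.0, height, mass)

print(round(lean_reported, 1), f"{bf_reported:.1%}")  # 90.4 10.7%
print(round(lean_at_25, 1), f"{bf_at_25:.1%}")        # 82.8 18.2%
```

The gap between the two scenarios is about 7.5 percentage points of body fat, which is a lot to miss when eyeballing a lean bodybuilder.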

Let’s move on to the discussion:

“These findings must be regarded as preliminary and subject to several possible methodological limitations.”

Good starting point.

First limitation:  some users may have slipped into the nonuser group.  “However, athletes were recruited under circumstances for which they had no particular motivation to lie about steroid use nor anything to gain from doing so.  Furthermore, all 74 nonusers produced urine samples negative for all steroids.  Finally, even if an occasional self-described nonuser had in fact used steroids, this phenomenon would not affect our estimates of a maximum FFMI in the region of 25 because many nonusers clustered just below 25, and it is impossible that all of the individuals in this cluster were lying.”  Fair, and reasonable.

Second limitation:  “Our sample size of 74 nonusers might not have been large enough to exhibit fully the upper limits of muscularity naturally attainable.”  You don’t say.

They go on to explain that the data from Mr. America winners were supposed to help mitigate this limitation.  The average FFMI of the Mr. America winners from 1939 to 1959 was 25.4.  Of the 20, 13 had FFMIs above 25, 8 had FFMIs above 26, and 3 had FFMIs above 27, with a peak of 28.0.  And again, it’s ludicrous to assume that one of a handful of bodybuilders before bodybuilding was even a major sport just happened to reach the absolute peak of drug-free human muscularity.  I’m sure there are errors in this data set, but those errors would need to be systematic, correlated, and huge to not completely destroy the notion that it’s impossible for any drug-free lifters to achieve an FFMI above 25.

Third limitation:  “our calculations of body fat are based on skinfold measurements taken by a single investigator, and our calculations for the Mr. America winners are based on body fat estimates from blinded examination of several photographs of the individual.  These methods are certainly prone to a degree of error.”

That’s an understatement.

“However, calculations from skinfold measurements, using the above equation, display a standard error of 3.4% of body fat and thus are sufficiently accurate for our purposes.  For example, a 1.8-m, 90kg (71-inch, 198-pound) athlete, measured at 10% body fat would have a normalized FFMI of 25.0.  If this body fat measurement were off by 3%, and true body fat were 13%, the FFMI would still be 24.2, a difference of only 0.8 units.”

Of course, the error could go in the other direction:  if body fat were overestimated by that same 3% (a true body fat of 7%), the athlete’s FFMI would actually be 25.8.  With 9 people in the non-users group having FFMIs between 24.0 and 24.9, it’s very likely that at least one had a normalized FFMI above 25 that was masked by a body fat estimation error.
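To make the arithmetic concrete, here is a minimal Python sketch of the calculation, using the fat-free mass, FFMI, and normalized FFMI formulas from the study (listed in the footnotes).  The function names are made up for illustration, and the 3-point error in each direction is just the example figure from the quote above.

```python
def fat_free_mass(weight_kg, body_fat_pct):
    """Fat-free mass in kg: body weight x [1 - (% body fat / 100)]."""
    return weight_kg * (1 - body_fat_pct / 100)


def ffmi(weight_kg, height_m, body_fat_pct):
    """FFMI = fat-free mass / height^2 (kilograms and meters)."""
    return fat_free_mass(weight_kg, body_fat_pct) / height_m ** 2


def normalized_ffmi(weight_kg, height_m, body_fat_pct):
    """Normalized FFMI = FFMI + 6.1 x (1.8 - height)."""
    return ffmi(weight_kg, height_m, body_fat_pct) + 6.1 * (1.8 - height_m)


# The paper's example athlete: 1.8 m, 90 kg, measured at 10% body fat.
# True body fat could be 3 points higher or lower than the measurement.
for true_body_fat in (13, 10, 7):
    value = normalized_ffmi(90, 1.8, true_body_fat)
    print(f"{true_body_fat}% body fat -> normalized FFMI {value:.1f}")
```

The three values come out to 24.2, 25.0, and 25.8, so the same measurement error that could pull a true FFMI down to 24.2 could just as easily be hiding a true FFMI of 25.8.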

Fourth limitation:  “our formula may not be satisfactory for fat individuals.  Because a gain in the fat component of the body is consistently accompanied by some gain in the lean component, it is possible that fat individuals might be able to exceed substantially an FFMI of 25 without steroids.”  This is very valid.

What do they mean by “fat?”  The nonuser group had an average body fat percentage of 12.5 ± 5.5%, so it should apply to people down to at least 7% body fat, and up to people with at least 18% body fat.  I often see people say that the FFMI “limit” only applies to very lean people (i.e. sub-10% body fat), but that’s not something that can be taken away from this study.

Fifth limitation:  FFMI may not be a useful screening tool for endurance athletes because endurance athletes can take gear and still be scrawny.  Well, yeah.  That’s obvious.

Wrapping this sucker up

The idea that an FFMI of 25 is any sort of “natty limit” could only come from a really bad interpretation of one really bad study.

The study itself was not set up to investigate the limits of drug-free human muscularity.  Its sample was too small, and its inclusion criteria were way too lax.  The raw data themselves don’t support an FFMI “limit” of 25.0 (as one or two subjects out of just 74 had FFMIs above 25) – that came about only after a fairly arbitrary post-hoc “correction.” Furthermore, with the error inherent in estimating body fat percentages via calipers and the cluster of people with FFMIs just below 25 after “correction,” it’s very likely that at least one or two people out of 74 had normalized FFMIs above 25.0 that were masked by body fat misestimations.

Furthermore, the presented FFMIs of Mr. America winners pre-1960 should either destroy any notion that an FFMI of 25 is a “limit,” really shake your confidence in the study as a whole (again – visual body fat estimations?  Seriously?), or both.

Finally, the authors themselves say that their findings should be regarded as preliminary, and that an FFMI cutoff of 25 should only be used as an initial screening tool.  They don’t propose that everyone with an FFMI over 25 is on steroids.  They just think that an FFMI over 25 should be a red flag to warrant actual drug screening.

Make no mistake:  this study is an example of very bad science.  It’s so flawed in so many ways that I’m really not sure how it got published.

However, the subsequent interpretations of this study have been even worse.  Anyone using this paper to argue that no lifter can attain an FFMI of 25 without drugs either doesn’t know how to critically appraise and interpret research, is purposefully misrepresenting it to make an invalid point, or is just parroting the idea from some other source without actually reading the paper in the first place.

It’s baffling to me that the notion of an FFMI “natty limit” of 25.0 ever got started in the first place.  If you’re using this paper as a guide, a more accurate interpretation is just that you’re unlikely to find many (if any) drug-free bros in a random gym with an FFMI over 25.  But by no means does it support the notion that an FFMI of 25 is impossible or nearly impossible to achieve with good genetics and years of hard work.  If you take the data at face value, they say that not all that many people will pass an FFMI of 25 without steroids, but that people with great genetics can achieve FFMIs of 26-27+.

In fact, I think proposing a “limit” is wrongheaded in the first place, since human traits tend to be normally distributed.  That’s why I’ve always addressed this question probabilistically instead of using black-and-white terms.  Probability assessment isn’t as exciting as simplistic (and wrong) black-and-white thinking, but it’s the more rigorous and intellectually honest way to approach this question.

So in summation:  stop talking about the “natty limit.” Just stop it.  Odds are very low someone hit it before the advent of steroids, and now that steroids exist and drug tests are imperfect, we’ll never know for sure what it is (or even if it exists as any sort of hard limit in the first place).  As such, the entire concept is a silly construct that’s unproven and likely unprovable, and if it exists in the first place, no one has any earthly idea where it is.

p.s. I know I said in my last article that I’m done talking about steroids for a long time.  I stand by that.  This article just discusses a specific claim I wanted to address since I see it parroted so often.  As such, I’m filing it under “general myth-busting” and “critical appraisal of research.”

p.p.s. There’s a study being published soon that’s going to blow the idea of an FFMI of 25 as the “natty limit” out of the water.  This is just an examination of why the idea was almost fractally wrong in the first place.  Edit:  the study is up now.


  1. “the nonusers included many dedicated bodybuilders.  Several had competed successfully in ‘natural’ bodybuilding contests, two held world records in strength events, and many others were recognized by their associates as highly successful weightlifters”

  2. “…perhaps because the factor of height⁻² in the FFMI calculation does not fully account for the fact that human beings are three-dimensional rather than two-dimensional objects.  In other words, the tallest athletes were not only taller, but also wider and thicker than the shorter athletes of apparently comparable muscularity; thus, the tallest athletes scored somewhat higher on the FFMI calculation.”

  3. Normalized FFMI = FFMI + 6.1 x (1.8 – h)

  4. “After calculating percentage bodyfat for all of the subjects, fat-free mass was calculated using the following formula:

    fat-free mass = body weight x [1 – (% body fat / 100)]

    FFMI was then calculated as follows:

    FFMI = fat-free mass x height⁻²

    where weight was measured in kilograms and height in meters.”

Making your novice strength training routine more effective – Two quick tips

The fitness-related content on this site has all been moved over to Strengtheory.com, my new website.

If you want to keep reading on this page, that’s perfectly fine. If you want to read this article on Strengtheory, just replace “gregnuckols” in the address bar with “strengtheory,” and don’t forget to check Strengtheory.com regularly for new articles!  If you’d like to share this article with your friends (please do!), then I’d appreciate it if you shared the Strengtheory.com URL.  It’s a prettier site for your friends to use, and it helps with the new site’s ranking in search engines.


This is something all new lifters need to read when they’re doing Starting Strength, Stronglifts 5×5, Greyskull LP, or any of the other beginner programs out there.

From a practical standpoint, it’ll help them get the most out of their first couple of years under the bar.  Taking the long view, it’ll also be a good introduction to some basic principles of program design.

1.  Periodize

Periodization is a massive subject, and it’s easy to get overwhelmed by the minutiae.  However, in the simplest terms, periodization simply means “having defined times in your training where you emphasize different goals.”  The application can get really hairy, but the easiest way to periodize your training without an in-depth knowledge of the theory behind it is to change your set and rep schemes.

Yep, it can be that simple.

So, should you periodize your training?  In a word:  “YES!”

A 2004 meta-analysis essentially showed that periodized training is almost always better than non-periodized training.  To quote the authors, “As a result of this statistical review of the literature, it is concluded that periodized training is more effective than non-periodized training for men and women, individuals of varying training backgrounds, and for all age groups.”  That’s about the most conclusive statement you’ll hear from an exercise scientist.

Here’s the easiest way to periodize one of the common beginner training programs:  instead of sticking with the kosher 3-5 sets of 5 reps for everything, proceed thusly:

Start with 3×8 for your lifts, adding weight each session until you’re unable to do so.  Once you can’t add weight every session anymore…

Switch to 5×5.  Repeat the process.

Then 5×3.

You don’t have to switch all your lifts over to the new rep scheme all at once.  If you plateau on your bench or OHP before your squat or deadlift, go ahead and switch the stalled lift to the new rep scheme, and continue as you were with the others.

This setup allows you to stick with the basic progressive overload you would usually get from a beginner’s program, while also implementing some basic periodization, which will almost certainly make the program more effective for you.  You’ll be able to linearly add weight for a longer period of time, and odds are very good that you’ll end up with bigger maxes than if you stuck with 3-5×5 for the entire program.
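As a rough sketch, the progression for a single lift could be written like this.  The 2.5 kg jump, the function name, and the choice to carry the same weight into the new rep scheme are assumptions for illustration; the only things taken from the steps above are the order of the schemes and the rule for when to switch.

```python
# Rep schemes in the order described above, as (sets, reps).
SCHEMES = [(3, 8), (5, 5), (5, 3)]


def plan_next_session(weight, scheme_index, hit_all_reps, increment=2.5):
    """Return (weight, scheme_index) for this lift's next session.

    hit_all_reps: True if every set and rep was completed last session.
    increment:    assumed jump in weight per session (not from the article).
    """
    if hit_all_reps:
        # Still progressing: keep the scheme and add weight.
        return weight + increment, scheme_index
    if scheme_index < len(SCHEMES) - 1:
        # Stalled: move this lift to the next scheme (same weight assumed).
        return weight, scheme_index + 1
    # Stalled on the last scheme: nothing left to switch to.
    return weight, scheme_index


# Each lift is tracked separately, so a stalled bench can move to 5x5
# while the squat stays on 3x8.
bench = plan_next_session(60.0, 0, hit_all_reps=False)
squat = plan_next_session(100.0, 0, hit_all_reps=True)
print("bench:", bench[0], SCHEMES[bench[1]])
print("squat:", squat[0], SCHEMES[squat[1]])
```

Tracking the scheme per lift rather than per program is what lets one lift move on while the others keep going as they were.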

2.  When you finally plateau, add volume

Something I’ve never understood is the stock advice of “when you stall with your linear gains, take 10% off the bar, and build back up using the same progression.”

What’s supposed to happen in the couple of weeks while you build back to your old plateau?  Is that when the gains fairy visits to defy the basic principle of progressive overload, thereby granting you a substantially improved response to the exact same stimulus?

Pictured: Gainz Fairy

Instead, if you decide to stick with the same program, deload a little more than you otherwise would, and build back up with 1-2 more sets per exercise.  So if you were doing 3 sets, do 5 sets.  If you were doing 5 sets, do 6 or 7 sets.  The scientific literature agrees almost unanimously that more volume is better for both strength and hypertrophy.  Some studies don’t reach significance, but that’s mainly because small sample sizes leave them without enough statistical power (a common problem in this field).

If you want to combine these two pieces of advice, deload to about 10% below where you switched from 3×8 to 5×5.  Build back up by proceeding from 5×8 to 6×5 to 7×3.  This will more reliably keep your progress going than sticking with 3-5×5, deloading a bit, and building back up with the same sets and reps.
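Here’s a small sketch of that combined approach.  The function names and the 100 kg example are hypothetical, and “about 10%” is treated as exactly 10% for simplicity.

```python
# (sets, reps) for the first run through the program and the higher-volume rerun.
FIRST_RUN = [(3, 8), (5, 5), (5, 3)]
SECOND_RUN = [(5, 8), (6, 5), (7, 3)]


def restart_weight(switch_weight, deload_fraction=0.10):
    """Starting weight for the rerun: about 10% below the weight at which
    you switched from 3x8 to 5x5 the first time through."""
    return switch_weight * (1 - deload_fraction)


def total_reps(scheme):
    """Reps per exercise per session for a (sets, reps) scheme."""
    sets, reps = scheme
    return sets * reps


# Hypothetical lifter who moved from 3x8 to 5x5 at 100 kg.
print("restart at", restart_weight(100.0), "kg")
for old, new in zip(FIRST_RUN, SECOND_RUN):
    print(f"{old[0]}x{old[1]} ({total_reps(old)} reps) -> "
          f"{new[0]}x{new[1]} ({total_reps(new)} reps)")
```

The per-session rep totals go from 24, 25, and 15 the first time through to 40, 30, and 21 on the rerun, which is the extra volume doing the work that a plain 10% deload doesn’t.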

I’m sure if you’re a regular Strength & Science reader, none of this is new to you, BUT it will be new and helpful to a lot of novice lifters.  Share it around so they can see better results in their first few months under the bar, and perhaps get their first exposure to the practical application of periodization.