TSH Reference Ranges in the USA

We keep seeing claims that the top of the TSH range in the USA is 3.00. (And that “they” start treatment if TSH goes above that.) This has been repeated by people who really, really should know better, such as professionals, as well as by ordinary people like us. :-) We all have to base our understanding on what we see and find for ourselves.

The other day I posted some links on this post:


- and one of them contained a list of TSH ranges from real, live USA labs:

Reference range

0.1 - 5.5

0.1 - 5.5

0.4 - 3.7

0.3 - 4.8

0.2 - 5.5

0.1 - 5.5

0.4 - 6.0

0.3 - 4.0

The PDF is linked here:


And the above appears on page 30.

As you can see, the very lowest top of range is 3.7. I guess that the ranges vary by the precise analyser/kit in use more than any other single factor.

This claim of 3.00 being the top of the reference range has been repeated all over the place. But, so far as I can tell, it is not backed up by what I have seen. PR4NOW posted about this the other day - seems it was a recommendation but was never implemented.

Further, the very papers which made the recommendation of tightening the ranges did not adequately take into account the differences between the various analyser machine/kit manufacturers and labs. All too many papers seem to assume standardised reference ranges (at least across the country the paper comes from) but this is simply sloppy science/medicine. They vary within every country which has more than one make of TSH analyser in use.


Image is a small fragment of the quoted PDF showing the various analysers used and the reference ranges for each of them.

48 Replies

  • Thanks for posting this. Another claim I have seen repeated around here and elsewhere is that 'the UK has the highest top of TSH range in the world', or similar wording.

  • Thanks Rod....

  • Rod, 'as so often' PR

  • I thought the claim was the cut-off for starting treatment for Hypothyroidism was a TSH of 3 or above in the USA? Like ours here in the UK is now 10.

    That's a very different thing from a lab reference range or what the assay used for the test can record. It's about what is considered 'abnormal' and an indicator of when pharmacological treatment of Hypothyroidism should begin.

  • Even if that is the case, it seems pretty clear that a result of 3.0 would represent very different points on the curve depending on the assay in use. One range goes up to 6.0 - so 3.0 is actually below the halfway point (between 0.4 and 6.0, the arithmetic midpoint is 3.2). Another range is 0.4 to 3.7 - with its arithmetic midpoint at 2.05 - where 3.0 would be in the upper quarter.

    Policies based on numbers must only be applied where test results are calibrated to be consistent. It might be reasonable to suggest that 75%, 90%, 95% or some other point on the scale is a sensible cut-off but the numeric result will vary from lab to lab. For major players in the USA to continue to spout numeric cut-offs makes me leery of their intentions.
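
As a rough illustration of the point above, here is a minimal Python sketch that works out where a result of 3.0 sits within each of the reference ranges listed in the original post, and what number a 75% cut-off would correspond to on each. The function names are made up for this example, and treating "position in the range" as a simple linear fraction is an assumption made only for illustration.

```python
# Reference ranges copied from the list in the original post (duplicates removed).
RANGES = [(0.1, 5.5), (0.4, 3.7), (0.3, 4.8), (0.2, 5.5), (0.4, 6.0), (0.3, 4.0)]


def position_in_range(result, low, high):
    """Return where `result` falls between low and high (0 = bottom, 1 = top)."""
    return (result - low) / (high - low)


def value_at_fraction(fraction, low, high):
    """Return the result that sits the given fraction of the way up the range."""
    return low + fraction * (high - low)


if __name__ == "__main__":
    for low, high in RANGES:
        pos = position_in_range(3.0, low, high)
        cut = value_at_fraction(0.75, low, high)
        print(f"{low} - {high}: 3.0 is {pos:.0%} of the way up; 75% point is {cut:.2f}")
```

The same percentage point gives a different number on every range, which is why a single numeric cut-off cannot mean the same thing at every lab.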

    Our use of 10 is perverse. There is a reference range (albeit the values vary from lab to lab), but the medics then ignore it! In my book, the only reasons to wait are to exclude transient hypothyroidism or something affecting the result. Whereas the USA 3.0 is within every single one of the reference ranges I posted.

    I happen to think that Wiki's basic definition is reasonable:

    In health-related fields, a reference range or reference interval usually describes the variations of a measurement or value in healthy individuals. It is a basis for a physician or other health professional to interpret a set of results for a particular patient.

    If the USA had a policy of treating at a result of 3.0, I'd infer that people with results over 3.0 are in need of treatment and therefore not healthy. They must be including a lot of non-healthy people in the calibration of the reference ranges for the various assays for them to go so far above 3.0.

    The latest and most sensitive TSH tests can record very much lower and very much higher values than these reference ranges. I'd suggest their technical ability would be more like 0.01 to 100 (and they might simply not bother distinguishing values over 100 even if they could - after all, they are obviously miles over any range and the difference between, say, 100 and 120 is quite likely of no clinical significance).

  • Found the 2012 American Thyroid Association / American Association of Clinical Endocrinologists (ATA / AACE) paper which does discuss this in detail, amongst other things frequently mentioned on here like monotherapy vs. combination therapy, NDT, when to dose, things that affect uptake etc.

    The name of the paper is 'Clinical Practice Guidelines For Hypothyroidism in Adults:..' and it can be found here:


  • In general terms I think that's a well-rounded and well-considered paper with good supporting evidence for each point, playing devil's advocate with itself along the way.

    Going back to assays and their reference ranges listed above - it looks like they are related to some of the studies listed in the paper I've linked, and it continues to be a controversial subject so I'll end on a controversial note on TSH extracted from it in line with your statistical analysis viewpoint:

    "the upper normal should be either 2.5 or 3.0 mIU/L (86) for a number of reasons:

    • The distribution of TSH values used to establish the normal reference range is skewed to the right by values between 3.1 and 4.12 mIU/L.

    • The mean and median values of approximately 1.5 mIU/L are much closer to the lower limit of the reported normal reference range than the upper limit.

    • When risk factors for thyroid disease are excluded, the upper reference limit is somewhat lower."

    Page 19.

  • Heartily agree it is an interesting document and has a lot more sense than many.

    To discuss some of the specific points:

    We so very often read of medics treating the distribution of TSH results as if it were a Gaussian (or "Normal") distribution which it most certainly is not. But I am absolutely convinced that a typical GP interpretation is based on the idea that it is.
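
To show why the Gaussian assumption matters, here is a small Python sketch using made-up numbers. The sample is entirely synthetic (a log-normal shape picked only because it is right-skewed with a median near 1.5); the parameters are assumptions and it is not real patient data. It compares a "mean plus or minus two standard deviations" interval with the central 95% taken directly from the sorted values.

```python
import random
import statistics

# SYNTHETIC data only: a log-normal sample chosen purely to give a right-skewed
# shape with a median near 1.5. The parameters are assumptions, not real results.
random.seed(0)
sample = sorted(random.lognormvariate(0.405, 0.6) for _ in range(10_000))

mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)

# "Gaussian" style interval: mean plus or minus two standard deviations.
gauss_low, gauss_high = mean - 2 * sd, mean + 2 * sd

# Percentile style interval: the central 95% of the sorted values.
pct_low = sample[int(0.025 * len(sample))]
pct_high = sample[int(0.975 * len(sample))]

if __name__ == "__main__":
    print(f"mean {mean:.2f}, median {median:.2f}")
    print(f"mean +/- 2 SD:    {gauss_low:.2f} to {gauss_high:.2f}")
    print(f"2.5th-97.5th pct: {pct_low:.2f} to {pct_high:.2f}")
```

On a skewed sample like this one the mean sits above the median, and the lower "Gaussian" bound drops below zero, which no real TSH result could do.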

    The comment about values between 3.1 and 4.12 suggests that the specific TSH range being discussed has a top at 4.12. The same discussion based on another lab's range might use different numbers.

    The TSH ranges at many labs in the USA and the UK have shrunk over the past 8 or so years. As I understand, this has been partly due to more rigorous exclusion of people with thyroid antibodies, or other issues, from the groups used to set the reference ranges. And partly due to TSH tests being better at preventing interference from, for example, macro-TSH so at least some false results are being properly excluded.

    I have read in the past that a part of the reason for allowing TSH to go up to 10 before treatment was to make some allowance for the imperfections of the TSH tests. If, as I have suggested, the tests are often better now, then the need to get to 10 should be recognised as inappropriate. (But I do not want to suggest that the tests are in any sense perfect - they are not and misleading results are still very possible.)

    This issue makes the discussion of older and newer results rather difficult. Results are not consistent over the years. The paper mentions the Whickham survey which is definitely "old" in that sense and it is so very easy to fall into the trap of thinking that the same people in that survey, at the same stages, would have the same TSH results if tested now. I suggest that is unlikely to be true. (Obviously only a thought experiment!)

    I end by saying that if people should be treated at 3.0 (on the scale of one lab), then the top of the reference range should be 3.0 - or 2.9.


  • Rod, I think the magic number '10' is an artifact from the Whickham study which would have used a first generation TSH test, some of which had an upper limit of 10 for the reference range. I have never found any other science to back up the magic number 10. It is true that if you wait until 10 is reached the patient will most certainly be hypo by then and will have spent several years needlessly suffering.

    After AACE suggested 0.3-3.0 in their 2002 article all hell broke loose in the field of endocrinology. Apparently the members hadn't been consulted before the announcement and there was great discord in the ranks and among other endocrine institutions, hence, the TSH reference range wars and few, if any, labs adopting the 0.3-3.0 standard. I also have always felt that the concept of 'mild' or 'subclinical' hypothyroidism is an artificial construct and does a great disservice to thyroid patients. PR

  • I have read (where? cannot remember :-( ) that the earlier TSH tests were prone to interference by, for example, macro-TSH. Hence a higher top of range.

    The logic of that escapes me. It might be sort-of true if you are considering the entire population being tested. But you simply cannot apply that logic to a single test on a single patient.

    Me being one of those who was treated when only the tiniest bit over range (5.05, if I remember right), I know what an incredible panoply of issues have been improved or resolved since treatment. I most certainly agree that there is a silly gap.

    A time delay might be arguable - to avoid treating those who have a genuine temporary blip. But when you see people having test after test over months or years - all in the so-called sub-clinical range, it is heartbreaking.

  • A couple of things you missed that are relevant.

    "The guidelines are not inclusive of all proper approaches or methods, or exclusive of others. The guidelines do not establish a standard of care, and specific outcomes are not guaranteed. Treatment decisions must be made based on the independent judgment of health care providers and each patient’s individual circumstances. A guideline is not intended to take the place of physician judgment in diagnosing and treatment of particular patients..."

    This sounds great except that most allopathic doctors treat the lab test and ignore the patient and their symptoms which results in an inferior standard of treatment.

    "The guidelines presented here principally address the management of ambulatory patients with biochemically confirmed primary hypothyroidism whose thyroid status has been stable for at least several weeks."

    This is a very limited viewpoint of the world of thyroid problems and again results in an inferior standard of treatment. Doctors need to learn to treat the patient and their symptoms first and realize the lab tests are just crude clues at best. PR

  • In the real world, guidelines are used to replace knowledge in those who are ignorant. They exercise some sort of control which inhibits those who are supposed to be guided.

    In one document I read again the other day it says words to the effect "If you don't know the specific lab reference range, use this standard range." In my book it should say - "if you don't know the range, ask yourself why not and find it out pronto".

  • Or, screw the range, look at the human being standing in front of you and treat the patient and their symptoms. 'If only!' PR

  • That's a novel concept. Do you think you could patent it?

  • Thanks for posting this Rod. I have not read the paper but just wanted to throw my 2p's worth in.

    Is it not possible that machines at different labs are calibrated using a standard stock solution of known concentration which is say 3mmol (or whatever the units are) or whatever figure the lab chooses from whatever guidelines but each machine reads that solution differently, leading to the different ranges for each machine? So that say you have the same sample of blood read on each machine you will get a different number because of the calibration but essentially the actual concentration is exactly the same?

    I can honestly say that I have lost faith in medics understanding what is going on but when it comes to basic science and journal articles I am more inclined to think that there is some piece of information that we are not privy to which would make everything make much more sense. Frankly I find this difference in reference ranges baffling but can only hope that there is some switched on scientist out there to whom it all makes sense :-)

  • I don't know whether things have changed but at one time there simply was not a stock reference solution for TSH. TSH is not something that we can readily make - it is a complex molecule.

    It should be the case that a sample which is at the extreme bottom of the range on machine 1 is also at the bottom of the range on machine 2 - even if the number at the bottom of the range is different for the two machines. E.g. 0.4 on the machine with range 0.4 to 3.7 and 0.2 on the machine with range 0.2 to 5.5. (And similarly for the top of the ranges.)

    If that were the case, it would indeed be a fairly straightforward thing to adjust the calibration to make them all read the same.

    I think that the lack of an absolute standard for measuring TSH makes this difficult to achieve.
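
If results really did line up that way (bottom with bottom, top with top), the adjustment could be as simple as the straight-line rescaling sketched here in Python. This is only an illustration of the idea in this reply: the function name is made up, and a linear relationship between two machines is an assumption that real assays may well not satisfy.

```python
def rescale_between_ranges(result, from_range, to_range):
    """Map a result linearly so the bottom and top of one range land on the other."""
    from_low, from_high = from_range
    to_low, to_high = to_range
    # Fraction of the way up the first range (0 = bottom, 1 = top).
    fraction = (result - from_low) / (from_high - from_low)
    return to_low + fraction * (to_high - to_low)


if __name__ == "__main__":
    machine_1 = (0.4, 3.7)  # ranges taken from the list in the original post
    machine_2 = (0.2, 5.5)
    for value in (0.4, 2.05, 3.7):
        mapped = rescale_between_ranges(value, machine_1, machine_2)
        print(f"{value} on machine 1 -> {mapped:.2f} on machine 2")
```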

    Mind, I am struggling now and would very much welcome someone with a better understanding of the issues helping to explain them to me as well!

  • Yes, I am comforting myself (in denial!) with the fact that there must be someone out there who this all makes perfect sense to and that there is a very easy answer...

    I bet the guy who invented the TSH test had no idea the trouble he was about to cause!

  • That would have been Dr. Robert Utiger who wrote his paper in 1965 but he reportedly hoped that doctors would still practice medicine and pay attention to the patient and he liked T3. PR

  • All of which makes me really curious just what Prof. Thienpont came up with as a way to harmonize the TFTs. There also was some mention that she had worked on the cortisol tests because of a similar problem. Maybe we could get Diogenes to teach a class on the finer points of biochemical testing. I certainly would like to fill in some of the holes in my knowledge, which is limited at best. PR

  • Me too :-)

  • Surely they don't specifically need TSH protein in the stock solution for calibration, just the protein/molecule which is being directly measured? I don't know how the assay works but from looking at the link you posted through my hypo-addled brain it looks like an antibody is measured and not the TSH directly. The TSH conc would then be assumed or back-calculated depending on how it binds the antibody.

  • In fact looking again, it says the systems measure luminescence which would be emitted from a chemical attached to a secondary antibody. So I would have thought that all you need is the conjugated antibody (or both antibodies) for calibration.

  • If you wish to prove a test works, then when everything else has been done, I think you really do need to be able to push some known concentration of TSH through the test and see if it comes up with the expected answer. And then throw everything else you can think of through to check that it is never positive when it shouldn't be.

    For TSH there is an extra complication - there are several slightly different forms of TSH (isoforms) with different "sugars" added to the basic molecule. Do they all have precisely the same biological activity?


  • Good point. Possibly not. The progesterone receptor isoforms all have different functions, however I think that is across the board. Here they show that it is a specific demographic where they found the one particular isoform so maybe this one is the result of a spontaneous mutation, suggesting that there was no prior requirement for it specifically. However if the mutation is now prevalent then perhaps a function has evolved for it over time.

  • On the other hand, is there only 1 TSH receptor isoform? Different TSHs might imply that there are different TSHRs and consequently different cellular functions.

  • Also what you suggest sounds very sensible but then there is the question of how do you know the concentration of the TSH solution you are checking it against. Chicken and egg situation. I think the immunoassay method in principle is supposed to be relatively reliable (your antibody should only bind your protein of interest once and only once giving an accurate calculation) but when different isoforms crop up which have not been seen before then the whole thing becomes messy and they need to go back to the drawing board to design their antibodies again so they are specific.

  • As I see it, the variants on TSH are like coffee. (Hold it - I don't think I have gone mad...)

    Each cup of coffee can have one, two or three spoons of sugar. The sugar can be white, or brown, or golden. Or some of each.

    But if you test the cups - well they all have coffee in them, don't they? :-)

    The protein part of TSH is identical in all cases - and that is what triggers the antibody binding. But maybe the sugars affect how long it remains active in the TSH receptor? Maybe the different isoforms have subtle effects?

  • Hehe I like this analogy :-)

    So is the only difference between the isoforms the different sugar residues? I wondered if perhaps the protein sequences were slightly different too.

  • Yes. Maybe the sugars affect whether it binds receptors at all. And also the antibodies...

  • Sorry for all the separate messages! I am typing on my phone.

    When antibodies are designed I think they check databases of millions of gene and/or protein sequences to confirm that their antibody will be specific in theory. However in practice there are loads of proteins which are as yet undiscovered which aren't in the database or haven't been sequenced yet so when you run the assay with a biological sample you may get the antibody cross reacting with something else you weren't expecting.

    Possibly in the cohort of samples they tested this assay on originally there were no known problems but with science I don't think you can say anything is certain.

    Maybe the TSH test should never have been rolled out with this in mind but I guess that's the same with any other assay out there too.

    Of course we all know it should never have been rolled out anyway considering its correlation with symptoms :-p

  • At least some TSH tests are well known to respond to both TSH and macro-TSH (where an antibody to TSH is attached to a molecule of TSH). This can result in false elevation of TSH results.

    Some tests (maybe now most?) protect themselves against this by ensuring that macro-TSH cannot get to the detector.

  • Really? How bizarre. What is the purpose of macro-TSH?

    Sounds like they are slowly trying to iron out the problems but not as quickly as we patients would like!

  • Not sure it has a purpose.

    The immune system has (somehow) decided that TSH is "foreign" and therefore (somehow) creates antibodies to attach to it. When macro-TSH is formed, I would expect that it can be recognised and gobbled up by a lymphocyte. Or something not too many miles away from that!

  • Oh ok. Is it related to hashi's?

  • No idea - the whole issue of why we have the antibody we have is endlessly confusing to me!

  • mmm yes I don't have hashi's so know absolutely nothing about it!

  • Surely you would want to include the concentration of macroTSH though? If the pituitary is making it in the first place doesn't that suggest it is needed to flog the thyroid into action?

  • I understand but I suspect that it is unimportant.

    This is pure speculation:

    If one macro-TSH unit has the same impact as, say, 10,000 molecules of TSH, then blocking that will have negligible impact on TSH measurement overall but will prevent this massive impact from distorting the results.

  • I think I'm lost... I may have to come back to macro TSH another day :-)

  • Also Rod, would you be so kind as to explain the units to me in the table you posted please? ul = microlitres but what does the U stand for? And mL?

  • μIU/mL

    micro International Units per milliLitre

    International Units are substance-specific amounts which are often used when the actual substance is of unknown or variable composition.

    For vitamin D, there are 40 IUs in one microgram. But when originally defined, the vitamin D IU was based on biological activity. Vitamin D IUs are entirely different to IUs of TSH or anything else.
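
To make the "substance-specific" point concrete, here is a minimal Python sketch. Only the vitamin D figure (40 IU per microgram) comes from this reply; the lookup table and function names are hypothetical, and the code deliberately refuses to convert anything it has no factor for rather than guessing.

```python
# IU conversion factors are substance-specific. Only the vitamin D figure
# (40 IU per microgram) comes from the post; the table layout is hypothetical.
IU_PER_MICROGRAM = {"vitamin D": 40.0}


def micrograms_to_iu(substance, micrograms):
    """Convert micrograms to IU, refusing substances with no recorded factor."""
    if substance not in IU_PER_MICROGRAM:
        raise KeyError(f"no IU conversion factor recorded for {substance!r}")
    return micrograms * IU_PER_MICROGRAM[substance]


def iu_to_micrograms(substance, iu):
    """Convert IU back to micrograms using the same substance-specific factor."""
    if substance not in IU_PER_MICROGRAM:
        raise KeyError(f"no IU conversion factor recorded for {substance!r}")
    return iu / IU_PER_MICROGRAM[substance]


if __name__ == "__main__":
    print(micrograms_to_iu("vitamin D", 25), "IU in 25 micrograms of vitamin D")
    print(iu_to_micrograms("vitamin D", 400), "micrograms in 400 IU of vitamin D")
```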

    I think they are all in my document:



  • Thank you, I never understood that about vit d either! :-)

  • They did not know the molecular structure of vitamin D (or TSH!). So they measured it by its effect.

    Sort of like one spoon of granulated sugar being an IU of "sweetness". But if it were saccharin, you'd only have one of those tiny weeny sweeteners. And if it were inulin, you might need several spoons to have the same effect. That is - it is a totally arbitrary effect-based way of determining equivalence. And you would not need to know the molecular structure of the sugar, saccharin or inulin to be able to make quite precise comparisons.

  • How interesting. You would have thought by now though they would have ditched it for something more accurate, seeing as they now know the structure.

  • millilitre with a capital L?

  • Oh I guess cos it's American right?

  • No - not because of that.

    Many medical abbreviations use non-standard capitalisation to avoid the problem of distinguishing between lower-case ell, upper-case eye and numeric one, or upper-case oh and numeric zero. All these confusions have been spectacularly possible in some fonts.

    (As far as I am concerned, with the exception of very "arty" fonts, the fundamental requirement of all fonts is that the various characters are distinguishable.)

    So they use capital L for litre where real science uses lower-case l.

    Look at some of the results that people have posted here when they include units. (That is, where they have copied-and-pasted or posted an image.) Not universal but quite common.


  • Love this quote:

    "Physicians rely heavily on TSH results in the face of non-specific symptoms

    But no lab test is infallible"

