Accuracy of anthropometric measurements by general practitioners in overweight and obese patients

Background: We recently showed that abdominal obesity measurements (waist and hip circumference, waist-to-hip ratio) were inaccurate when performed by general practitioners (GPs). We hypothesise that measurement error could be even higher in overweight and obese patients due to difficulty in locating anatomical landmarks. We aimed to estimate GPs’ measurement error of general (weight, height and body mass index (BMI)) and abdominal obesity measurements across BMI subgroups.
Methods: This cross-sectional study involved 26 GPs in Geneva, Switzerland. They were asked to take measurements on 20 volunteers within their practice. Two trained research assistants repeated the measures after the GPs (“gold standard”). The proportion of measurement error was computed by comparing the GPs’ values (N = 509) to the average value of two measurements taken in turn by the research assistants and stratified by BMI subgroup (normal/underweight: <25 kg/m2, overweight: 25 ≤ BMI < 30 kg/m2, obese: ≥30 kg/m2).
Results: General obesity measurements were less prone to measurement error than abdominal obesity measurements, regardless of the BMI subgroup. The proportions of error increased across BMI subgroups (except for height), and were particularly high for abdominal obesity measurements in obese patients.
Conclusions: Abdominal obesity measurements are particularly inaccurate when GPs use these measurements to assess overweight and obese patients. These findings add further strength to recommendations for GPs to favour use of general obesity measurements in daily practice, particularly when assessing overweight or obese patients.


Background
The high prevalence of obesity has become a major global health challenge, since obesity is associated with severe health consequences, contributing to the increase in cardiovascular morbidity and mortality [1,2]. Early detection by general practitioners (GPs) of patients at high risk for the development of cardiovascular disease is therefore essential. Anthropometric measurements are a non-invasive and inexpensive method to assess patients' nutritional status and have been suggested for wide use in clinical practice. Measuring abdominal in addition to general obesity has been recommended in order to improve cardiometabolic risk assessment [3,4], because the pattern of fat distribution has been shown to have a large influence on cardiometabolic risk [5] and, as a consequence, abdominal obesity seems to predict the development of cardiovascular diseases better than overall obesity [6][7][8].
The waist circumference (WC) and the waist-to-hip ratio (WHR), which is determined by dividing the WC by hip circumference (HC), have become widely accepted measures for assessing abdominal obesity [9,10]. However, these abdominal obesity anthropometric measurements suffer from higher measurement error than body mass index (BMI), even when they are performed by specially trained measurers [11]. We previously showed in two studies conducted in primary care settings that, in contrast to BMI, obesity was not accurately detected by these abdominal obesity measurement methods and that these measures led to frequent diagnostic misclassification [12,13]. This could be related to the fact that specific manipulations are required to take these measurements and that GPs are less used to them than they are to measuring BMI.
Measurement error could be particularly high in overweight and obese patients due to difficulty in locating anatomical landmarks [11]. To our knowledge, very few studies have examined the influence of BMI subgroup on the risk of measurement error, and none in primary care settings. WC and HC were found to have high reliabilities regardless of BMI subgroup in a small study by Wang (n = 76), where the participants were measured by a research assistant twice with a 10-min interval [14]. Another small study by Nordhamn (n = 51), in which WC and HC were measured by two raters, showed that reliability tended to decrease in overweight participants for WC and WHR, but not for HC [15]. Note however that these two studies suffer from important limitations: the absence of a gold standard precluded the true assessment of measurement error, and the authors used Asian definitions of BMI subgroups, with lower BMI cut-offs.
Our objective was thus to estimate GPs' measurement error of general (weight, height and BMI) and abdominal obesity (WC, HC, WHR) anthropometric measurements across BMI subgroups.

Methods
This is a secondary analysis of data collected in a study of the accuracy of anthropometric measurements in general practice [13]. The methods have been described in detail elsewhere and are briefly summarized here. The research protocol was approved by the local research ethics committee.

Recruitment of doctors and volunteers
This study involved a convenience sample of 26 GPs practicing in the canton of Geneva, Switzerland. The GPs were asked to recruit 20 adult volunteers among their patients, ten for each of two preplanned measurement sessions. The patients' next scheduled appointment was synchronized with one of the measurement sessions. In the original study, doctors were randomly assigned to two separate groups for the second measurement session. The intervention group received special training in anthropometric measures (the doctors received a training document, prepared by the authors, explaining the appropriate measurement methods according to international recommendations [16][17][18]); the other group acted as control. Since the intervention did not appear to be associated with a significant improvement in GPs' measurement accuracy, measurements from both groups and both sessions were pooled for the present analysis [13].

Data collection
The measurement sessions took place within the normal routine of the practice. The GPs were asked to perform the measurements as usual, within the consultation. After having been measured by the GP in his/her consultation room, and while the GP was taking care of the next patient, each volunteer was measured in turn by the two research assistants in a quiet room, close to the consultation room. The research assistants took the measurements according to the recommended procedure for which they had been trained (see below). They were asked to take the measurements with a calibrated flat beam scale for mobile use (SECA 877, scale division: 100 g, capacity: 200 kg), a stadiometer (SECA 217, graduation length: 1 cm, range: 20-205 cm) and a measuring tape.

Training of the research assistants
Prior to the start of the study, a specialist in anthropometric measurements provided theoretical and practical training to the two research assistants. As part of this training, a dozen volunteers were measured at the same time by the specialist and the research assistants to confirm that the measurements were accurate. The training was based on international guidelines (see Appendix 1) [16][17][18]. The instructor of the research assistants was a senior attending physician at Geneva University Hospitals, in the department of therapeutic education for obesity and chronic diseases.
Note that the research assistants were already rather skilled before this training, because they had previously been trained in anthropometric measurements as part of the Bus Santé study (an annual cross-sectional population-based survey collecting data on cardiovascular risk factors in the canton of Geneva).

Gold standard
The average value of the measurements performed by the two trained research assistants was used as the gold standard against which the GPs' measurements were compared to assess their accuracy.

Statistical analysis
We used frequencies to describe the GPs' characteristics. We assessed the inter-observer variability between the two research assistants by computing the technical error of measurement (TEM) for each anthropometric measurement. We also computed %TEM (TEM/mean × 100), which expresses TEM relative to the mean (a coefficient of variation), because TEMs are difficult to compare directly owing to the positive association between TEM and measurement size. Then, the difference between the GP's measurement and the gold standard (= measurement error) was computed for each volunteer. We verified the assumption of normality using the Shapiro-Francia test [19]. One-sample t-tests were then performed to compare the GPs' and the research assistants' measurements, i.e. to check whether mean differences were statistically significantly different from zero. Analyses were stratified by BMI subgroup into underweight/normal (BMI <25 kg/m2), overweight (25 ≤ BMI <30 kg/m2) and obese participants (BMI ≥30 kg/m2). Since only a small subgroup of patients were underweight (n = 16, 3.1%), these were grouped with normal participants. Mean relative differences (obtained by dividing the absolute differences by the average values of the measurements taken by the research assistants) [13,20], i.e. the percentage error of the measurements, were preferred to mean absolute differences, because they allowed us to make comparisons between BMI subgroups and anthropometric measurements, as relative values do not depend on the magnitude of these measurements. Then, using one-way analysis of variance (ANOVA), each difference was compared among the three BMI subgroups, which were also compared using Cuzick's nonparametric test for trend across ordered groups.
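As an illustration of the inter-observer TEM and %TEM described above, the following is a minimal sketch in Python. It assumes the standard two-observer formula, TEM = sqrt(sum of squared differences / 2n); the text does not spell out the formula the authors applied, so this is an assumption. The function names and the sample values are hypothetical.

```python
import math

def tem(obs_a, obs_b):
    """Technical error of measurement for two observers measuring the same subjects.

    Assumed formula (standard two-observer form): sqrt(sum(d^2) / (2 * n)).
    """
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("need two equal-length, non-empty series")
    sum_sq = sum((a - b) ** 2 for a, b in zip(obs_a, obs_b))
    return math.sqrt(sum_sq / (2 * len(obs_a)))

def percent_tem(obs_a, obs_b):
    """%TEM = TEM / mean * 100, with the mean taken over all readings of both observers."""
    grand_mean = (sum(obs_a) + sum(obs_b)) / (2 * len(obs_a))
    return tem(obs_a, obs_b) / grand_mean * 100

# Hypothetical waist circumferences (cm) recorded by the two research assistants.
assistant_1 = [80.0, 95.0, 102.0, 110.0]
assistant_2 = [81.0, 95.0, 101.0, 112.0]

print(round(tem(assistant_1, assistant_2), 3))
print(round(percent_tem(assistant_1, assistant_2), 3))
```

Because %TEM is scaled by the mean, it can be compared between, say, height and waist circumference, which is the reason given above for reporting it alongside TEM.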
Finally, multiple linear regression, taking into account the repeated measures design, was performed to assess the association of BMI subgroups with the measurement errors after adjusting for potential confounders (session (before/after training) and group (intervention/control)). As a measurement difference of <3% is unlikely to be clinically relevant [11], we considered the measurements accurate when the relative measurement errors were <3%.
Statistical significance was set at a two-sided p-value ≤0.05. We performed all statistical analyses with STATA version 14.0.
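The stratification and the 3% accuracy criterion described in this section can be sketched as follows. This is a simplified illustration, not the authors' Stata analysis: the function names, the record layout and the sample values are hypothetical, and the use of a signed difference (GP minus gold standard) is an assumption, since the text does not say whether signed or unsigned differences were averaged.

```python
from collections import defaultdict
from statistics import mean

def bmi_subgroup(bmi):
    """BMI subgroups as defined in the text: <25, 25 to <30, >=30 kg/m2."""
    if bmi < 25:
        return "underweight/normal"
    if bmi < 30:
        return "overweight"
    return "obese"

def relative_error_pct(gp_value, assistant_1, assistant_2):
    """Signed GP-minus-gold-standard difference as a percentage of the gold standard.

    The gold standard is the average of the two research assistants' readings.
    Using the signed difference is an assumption of this sketch.
    """
    gold = (assistant_1 + assistant_2) / 2
    return (gp_value - gold) / gold * 100

def summarise(records, threshold_pct=3.0):
    """Mean relative error and share of accurate readings per BMI subgroup.

    Each record is (gold-standard BMI, GP reading, assistant 1, assistant 2).
    A reading counts as accurate when |relative error| < threshold_pct.
    """
    errors = defaultdict(list)
    for gold_bmi, gp, ra1, ra2 in records:
        errors[bmi_subgroup(gold_bmi)].append(relative_error_pct(gp, ra1, ra2))
    return {
        group: {
            "mean_error_pct": mean(errs),
            "share_accurate": mean([1.0 if abs(e) < threshold_pct else 0.0 for e in errs]),
        }
        for group, errs in errors.items()
    }

# Hypothetical waist circumference readings (cm), one volunteer per subgroup.
records = [
    (22.0, 80.0, 79.0, 81.0),
    (27.5, 97.0, 94.0, 96.0),
    (33.0, 115.0, 109.0, 111.0),
]
for group, stats in summarise(records).items():
    print(group, stats)
```

The subgroup is derived from the gold-standard BMI rather than the GP's own value, so that an inaccurate GP reading cannot move a volunteer into the wrong stratum.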

Results
Participating GPs were aged 44.1 years on average (standard deviation (SD) 6.1, range 33-59), and 58% were women; they were experienced doctors (years since certification 16.3 (SD 5.8), range 7-32). The GPs and two research assistants conducted measurements on 509 volunteers.
Overall, the mean relative measurement differences, computed from the absolute measurement differences (see Appendix 2), were not associated with BMI subgroups in crude analysis, and were smaller for weight, height, BMI and HC, compared to WC and WHR (Table 1). Height was the most accurate anthropometric measurement (measurement differences: 0.07% (SD 0.58) in underweight/normal, 0.04% (1.00) in overweight and 0.07% (0.72) in obese participants). Only height, BMI and HC were not statistically different when measured by the GPs or the research assistants. For weight, although the differences were small, they were statistically significant (0.36% measurement error for the normal group (absolute difference: 0.22 kg), 0.44% for the overweight group (absolute difference: 0.30 kg) and 0.37% for the obese group (absolute difference: 0.34 kg)).
In multivariable analysis (Table 2), the proportions of error were higher in overweight and obese than in underweight/normal participants. They increased across BMI subgroups, except for height (e.g. for weight, the coefficients were 0 for normal/underweight (baseline), 0.46% for overweight and 0.91% for obese participants), and they increased more for the abdominal than for the general obesity anthropometric measurements, although the association with BMI subgroups did not follow a statistically significant trend, except for WHR. In addition, since a measurement difference of <3% is unlikely to be clinically relevant [11], the differences in proportions of error between overweight/obese and underweight/normal participants were clinically not significant for general

Summary
Our findings indicate that GPs' weight, height, BMI and HC measurements are more accurate than their measures of WC and WHR, with errors becoming larger in higher BMI subgroups.

Comparison with existing literature
Several authors showed that general obesity measurements were more reliable than abdominal obesity measurements and we recently confirmed these findings when these measurements were taken by GPs within their own practice [12,21,22]. As suggested in our previous paper, [13] these results are probably explained by the fact that weight and height measurements are universally known and performed using a relatively simple procedure. In comparison, abdominal obesity measurements are newer concepts and require specific manipulation. We discussed the role of GPs' knowledge and their usual practice in anthropometrics in our previous paper [13]. We showed that, compared to weight, height and BMI, a majority of GPs hardly ever used the abdominal obesity measures and their knowledge regarding these measurements was relatively low. We showed statistically significant results regarding the mean relative differences between GPs' weight measurement and the gold standard. However, these differences are not clinically relevant (0.36% measurement error for normal group (absolute difference: 0.22 kg), 0.44% for overweight group (absolute difference: 0.30 kg) and 0.37% for obese group (absolute difference: 0.34 kg)).
We did not record the gender of the volunteers; therefore, we cannot compute the percentage of misclassification, as the definition of abdominal obesity differs by gender. However, in a previous study, [12] we showed that only 1% of the volunteers were misclassified when the measurements were based on the BMI, compared to 6% when using WC measurement, and 23% when using WHR determination.
Our study suggests that the proportions of error increase across BMI subgroups, except for height. These results contrast slightly with a small study in which Wang examined the association between BMI subgroup and intrarater reliability. Two inexperienced research assistants received training and together measured WC and HC on 76 participants, twice within a 10-min interval: one was responsible for placing the tape and the other for reading the tape and recording the data. The reliability of these measurements was found to be high for all subgroups, without significant differences across BMI subgroups [14]. The design of this study did not include a gold standard, thus precluding the assessment of measurement error and direct comparison with our findings. In addition, the authors used Asian definitions of overweight and obesity to define BMI subgroups, with lower BMI cut-offs. In another study by Nordhamn [15], WC and HC were measured by two raters in 26 overweight and 25 lean participants. Each participant was measured four times, three times on the first occasion and once on the second. The authors concluded that reliability decreased in overweight participants for WC and WHR determination, but not for HC [15]. This study also had important limitations, which make comparisons with our results difficult: measurement error could not be calculated (no gold standard), WC and HC were measured with the participant in supine position,

Limitations
First, the GPs were not selected at random (convenience sample); as a consequence, our findings may be too conservative, as these GPs may have been more concerned with the subject covered by our study, and therefore may take the anthropometric measurements more frequently and/or more carefully. Second, the study was carried out only in the Geneva area and the study sample may not be representative of all GPs in Switzerland, or Europe more broadly. Third, the study sample was biased towards a higher proportion of overweight (33.2%) and obese (21.6%) volunteers than in the general population in Switzerland [23]. Fourth, we did not record the age and gender of the volunteers, so we cannot provide adjusted mean differences.

Strengths
The study was undertaken within the normal conditions of day-to-day clinical practice and involved GPs with no particular training in anthropometric measurements. There were only minimal differences between the two research assistants in their mean measurements; this added strength to the value of our gold standard. Misclassification was unlikely, as we used the research assistants' measurements to stratify the subjects into BMI subgroups. Finally, the BMI subgroups were defined using the usual WHO definitions of overweight and obesity [24].

Conclusion
Our study suggests that abdominal obesity measurements are particularly inaccurate when GPs use them to assess overweight and obese patients. We recommend that GPs primarily use general obesity measurements (i.e. weight, height and BMI determination) in daily practice, particularly when assessing overweight or obese patients.

Appendix 1
Points emphasised during training of the research assistants (two one-hour sessions)

WEIGHT
The participant should remove his shoes and all clothes except underwear, and then step on the centre of the scale, remaining in a relaxed position. Weight is recorded to the nearest 0.05 kg; if the participant refuses to remove his clothes, 1 kg is subtracted from the measurement reading to account for the garments worn and the refusal is reported.

HEIGHT
The participant should remove his shoes and all clothes except underwear, and stand erect on the floorboard of the stadiometer with his back on the side of the vertical board; the weight should be evenly distributed on both feet, the legs closed and stretched, the arms to the sides and the shoulders relaxed; the heels, buttocks and back should slightly touch the vertical board.
The participant is asked to look straight ahead, inhale deeply and stand fully erect while the examiner lowers the horizontal bar to the head crown with hair compressed, and takes the measurement to the nearest 0.1 cm.

WAIST CIRCUMFERENCE
The participant should remove his shoes and all clothes except underwear, and stand erect; he is asked to roll up the shirt/sweater and to lower the trouser/skirt waistband, so the examiner can palpate the hip area to identify the measurement reference points, and to mark the level of measurement (the midpoint between the lowest rib and the iliac crest). The measuring tape is placed horizontally, with sufficient tension to avoid slipping off but without compressing the skin.
The measurement is made at the end of a normal expiration, twice to the nearest 0.1 cm, the arms of the participant to the sides; if the difference between the two recorded measurements is greater than 0.5 cm, a third measurement is taken, and the mean of the two nearest values is recorded.

HIP CIRCUMFERENCE
The participant should remove his shoes and all clothes except underwear, and stand erect; he is asked to lower the trouser/skirt waistband, so the examiner can palpate the hip area. The tape is placed at the maximum extension of the buttocks, horizontal to the floor, with sufficient tension to avoid slipping off but without compressing the skin.
The measurement is made twice to the nearest 0.1 cm, the arms of the participant to the sides; if the difference between the two recorded measurements is greater than 0.5 cm, a third measurement is taken, and the mean of the two nearest values is recorded.
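The repeat-measurement rule used for both waist and hip circumference can be sketched as below. The function name is hypothetical, and one step is an assumption: the appendix only says what to record when a third reading is needed, so the sketch records the mean of the two readings when they agree within 0.5 cm.

```python
from itertools import combinations

def recorded_circumference(first, second, third=None, tolerance_cm=0.5):
    """Value to record for a circumference measured twice (and a third time if needed).

    If the first two readings differ by more than tolerance_cm, a third reading
    is required and the mean of the two nearest values is recorded.
    """
    if abs(first - second) <= tolerance_cm:
        # Assumption: when the two readings agree, their mean is recorded.
        return (first + second) / 2
    if third is None:
        raise ValueError("readings differ by more than the tolerance; a third reading is required")
    # Pick the pair of readings that are closest to each other
    # (on a tie, the first such pair in reading order is used).
    closest = min(combinations((first, second, third), 2),
                  key=lambda pair: abs(pair[0] - pair[1]))
    return sum(closest) / 2

# Hypothetical readings in cm.
print(recorded_circumference(88.0, 88.4))
print(recorded_circumference(88.0, 89.0, third=88.2))
```

Raising an error when the third reading is missing, rather than averaging two discordant readings, keeps the sketch faithful to the stated procedure.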
Waist circumference (WC) and waist-to-hip ratio (WHR): abdominal obesity in men when WC ≥ 102 cm and/or WHR ≥ 0.95; abdominal obesity in women when WC ≥ 88 cm and/or WHR ≥ 0.8.
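The cut-offs above, together with the definition of WHR as WC divided by HC, translate into a short check. This is an illustrative sketch only: the function names and the "men"/"women" keys are hypothetical, and the sample values are invented.

```python
# Cut-offs as stated in the text: (WC in cm, WHR).
CUTOFFS = {
    "men": (102.0, 0.95),
    "women": (88.0, 0.80),
}

def waist_to_hip_ratio(wc_cm, hc_cm):
    """WHR is the waist circumference divided by the hip circumference."""
    if hc_cm <= 0:
        raise ValueError("hip circumference must be positive")
    return wc_cm / hc_cm

def has_abdominal_obesity(group, wc_cm, hc_cm):
    """True when WC and/or WHR reaches the cut-off for the given group."""
    wc_cutoff, whr_cutoff = CUTOFFS[group]
    return wc_cm >= wc_cutoff or waist_to_hip_ratio(wc_cm, hc_cm) >= whr_cutoff

# Hypothetical volunteers.
print(has_abdominal_obesity("men", 100.0, 110.0))
print(has_abdominal_obesity("women", 85.0, 100.0))
```

Since the criterion is "and/or", a single measure above its cut-off is enough to classify a volunteer, which is why an error in either WC or HC can change the classification.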