Design and validation of a food frequency questionnaire (FFQ) for the nutritional evaluation of food intake in the Peruvian Amazon

Background Food frequency questionnaires (FFQs) are dietary surveys widely used in studies of nutritional epidemiology because they are inexpensive, easy to complete and provide significant information on dietary intake over a long period of time. FFQs should be written specifically for each study group since diet may be influenced by ethnicity, culture, economic status and environmental factors. The aim of this work was to design and validate a food frequency questionnaire for the Peruvian Amazon population. Methods Three hundred and three individuals were selected and completed three 24-h recalls (R24). The FFQ was administered on two different occasions (FFQ.1 and FFQ.2). The validity of the FFQ was evaluated by comparing the average daily nutrient intake estimated by the FFQs with that obtained from the three R24. The reproducibility was measured by comparing the mean nutrient intake of the two FFQs. Agreement was assessed using the Pearson correlation coefficient, the intraclass correlation coefficient (ICC), Lin's concordance correlation coefficient (CCC) and the Bland-Altman plot. Results The results obtained to establish the validity showed a high correlation, with an average Pearson's correlation coefficient of 0.70, an ICC of 0.65 and a CCC of 0.60. Approximately 60% of the nutrients had a CCC above 0.60. In terms of reproducibility, better results were obtained, with an average Pearson's correlation coefficient of 0.67, 0.64 for ICC and 0.58 for CCC. Conclusions The correlation coefficients show good validity and reproducibility, and therefore, the FFQ we have developed may be considered a useful and valid tool to estimate the dietary intake of the Peruvian Amazon population.


Background
Dietary intake is one of the determining factors in the development of chronic diseases. Unlike other habits, it is difficult to measure because people are rarely able to state exactly what and how much they eat [1,2]. In addition, an inaccurate dietary assessment can often be a major obstacle to understanding the impact of diet on the risk of developing diet-related diseases [2].
A range of different tools, from 24-h recalls (R24) or dietary diaries to food frequency questionnaires (FFQ), are used nowadays to measure dietary intakes. FFQs are the most commonly employed tools in studies of nutritional epidemiology because they are inexpensive, easy to develop and provide useful information on dietary intake over a long period of time [3][4][5]. Furthermore, this tool can be used to conduct personal interviews. This is extremely important when studying population groups with high illiteracy rates.
The FFQ should be developed specifically for the target population of the study since diet may be influenced by ethnicity, culture, economic status and environmental factors [6]. Besides, the questionnaire must be validated to ensure that the measurements are correct and therefore provide relevant information [7]. To assess the validity of the measurements obtained, it is necessary to check that the results obtained are similar to the ones observed when employing other methods [8][9][10].
The Department of Loreto is the largest territory in the Peruvian Amazon. Its capital city is Iquitos, located in the Great Plains of the Amazon Basin on the banks of the Amazon River at the confluence of the Nanay and Itaya Rivers. According to the data from the Peruvian National Institute of Statistics and Informatics (INEI) in 2015 [11], "this province has been placed in the third group in the range of extreme poverty incidence, with a high rate of chronic malnutrition in children and of overweight and persistent diseases in adults". According to diverse studies, the population of Loreto has not been eating properly for a considerable period of time [12][13][14][15]. Given the amount of food resources available in this area [14], this situation can be put down to a poor food culture along with a low purchasing power [15]. For this reason, conducting dietary intake studies represents a fundamental step towards understanding the vulnerability of the population of Loreto and the severity of the problems associated with their nutrition. At the same time, few studies have focused on assessing intake and eating patterns in this region. It is important to highlight that its geographical location, the climate, the large number of indigenous people and its borders with three other countries (Ecuador, Colombia and Brazil) lead to significant differences in the eating habits of its inhabitants with respect to other regions of Peru [14,15].
However, the FFQ currently in use in Peru is not suitable for this area since it is made up of foods commonly consumed in urban areas and does not take into account the food availability, consumption and eating habits of the people who live in the jungle.
The objective of this work is, therefore, to design and validate a new FFQ specifically aimed to study food intake habits of the Peruvian Amazon population analysing their food consumption patterns in greater depth and identifying possible deficiencies and their relationship with chronic diseases.

Study design, population and sample
We carried out an observational survey study divided into three phases. The target population was made up of the inhabitants of Pueblo Libre, a populated centre of Belén located on the periphery of Iquitos, an area affected by floods between February and June. This location lends it a transitional character between the communities who live close to the rivers in the jungle and the city itself. It was selected for this study because it clearly represents this transition and because it allows us to obtain more information about the local foods consumed in the area.
The statistical-epidemiological package EPIDAT (version 4.1) and the Pueblo Libre population census (Belén District, Iquitos, 2009) were used to determine the sample size [16]. For an expected prevalence of chronic malnutrition of 24.9% [17], a precision of 5% and a confidence level of 95%, a minimum sample size of 218 persons was obtained.
Stratified random sampling without replacement was performed by age and gender based on the random selection of 40 homes corresponding to Sector 12 of Pueblo Libre. It was assumed that the family unit of each home was made up of an average of 6 people. In the end, a total of 303 individuals aged between 3 and 83 years were selected.
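As an illustration only, the sample size calculation can be sketched with the standard normal-approximation formula for estimating a proportion, with an optional finite population correction. The exact routine used by EPIDAT and the census size are not given in the text, so the function name, the `population` argument and the value 1000 below are assumptions for illustration and will not necessarily reproduce the figure of 218.

```python
import math


def sample_size_proportion(p, d, z=1.959964, population=None):
    """Minimum n to estimate a proportion p with absolute precision d.

    z is the standard normal quantile for the confidence level
    (1.959964 for 95%). If a finite population size is supplied,
    the finite population correction is applied.
    """
    n0 = z ** 2 * p * (1 - p) / d ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


# Expected prevalence 24.9%, precision 5%, confidence level 95%.
n_infinite = sample_size_proportion(p=0.249, d=0.05)
# Hypothetical census size of 1000 (NOT taken from the paper).
n_finite = sample_size_proportion(p=0.249, d=0.05, population=1000)
print(n_infinite, n_finite)
```

The correction always lowers the required sample size, which is why a census-based calculation can give a smaller minimum than the infinite-population formula.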

Reference method
To design the FFQ and subsequently validate it, the R24 was taken as the reference method. This type of dietary survey provides information not only about an individual's food consumption on a given day but also about the amounts of food ingested, the ingredients and the cooking method used for each recipe. Besides, since it is carried out as an interview, it is an ideal method for population groups with a low literacy level, as in this case [18].
For each individual in the sample, three R24 were collected over three consecutive days: two on working days and one on a non-working day. The R24 were conducted through a personal interview by a qualified and trained nutritionist at each participant's home. In the case of children, their mothers were interviewed. The interviewees explained the amounts of food they had consumed using domestic measurement units (a tablespoon, a cup, etc.). These were shown to the interviewer, who then estimated the weight in grams of every portion. The R24 were administered to the same person on all occasions to avoid variability in food intake.
After that, the average amounts of food, energy and nutrients ingested were calculated using Nutriplato 4.7 software [19]. It was updated with information from the Peruvian Amazon food composition database compiled by us and the nutritional details of the dishes most frequently consumed by this population group [15].

Food frequency questionnaire
To design and validate our FFQ, we developed a food composition database specifically for the Peruvian Amazon population. We used it together with the FFQ that was being used in other areas of Peru [15]. We omitted foods that are not typical of the Amazon and added others commonly consumed in the studied area. The resulting FFQ was composed of 132 food types classified into 10 categories ((1) dairy products; (2) eggs, meat, fish and seafood; (3) vegetables; (4) fruits; (5) cereals, legumes and dry fruits; (6) fats and oils; (7) creams; (8) sweets and snacks; (9) drinks; and (10) condiments). The amount of each food consumed was calculated from the reported frequency of consumption on a daily, weekly or monthly basis. The weight in grams of the average servings was estimated using the mean consumption obtained in the R24 (Additional file 1).
One hundred and six of the 303 individuals who completed the R24 went through the FFQ personal interview twice: first 2 months after completing the R24 and again 5 months later (Fig. 1).
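The conversion from reported frequencies to a daily amount can be sketched as follows. This is a minimal illustration, assuming a 7-day week and a 30-day month; the conversion factors, the function name and the example items and serving weights are hypothetical and are not taken from the questionnaire or from Additional file 1.

```python
# Assumed conversion from the reporting period to "times per day".
TIMES_PER_DAY = {"daily": 1.0, "weekly": 1.0 / 7.0, "monthly": 1.0 / 30.0}


def grams_per_day(times, period, serving_g):
    """Average daily amount (g) of a food item.

    times     -- how many times the item is eaten in the period
    period    -- "daily", "weekly" or "monthly"
    serving_g -- average serving weight in grams
    """
    return times * TIMES_PER_DAY[period] * serving_g


# Hypothetical answers: (item, times, period, average serving in g).
answers = [
    ("item A", 2, "daily", 150.0),
    ("item B", 3, "weekly", 70.0),
    ("item C", 1, "monthly", 120.0),
]

daily_intake = {item: grams_per_day(t, p, g) for item, t, p, g in answers}
for item, grams in daily_intake.items():
    print(f"{item}: {grams:.1f} g/day")
```

The daily amounts obtained in this way would then be multiplied by the composition values in the food database to obtain energy and nutrient intakes.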
The average food, energy and nutrient consumption estimate was also performed using the updated Nutriplato 4.7 [19] software.
The FFQs were given to the same person on all occasions to avoid variability in food intake.

Validation and analysis of the data
The quantitative variables were described by their mean, median and standard deviation (SD) for the R24 and FFQ. The significance of the differences in nutrient intake between the FFQ assessments and the mean value of the three R24 was assessed using the Wilcoxon test or the t test for related samples, according to the distribution of the data. Gross classification errors were calculated using contingency tables for the average intake value of the three R24 and the FFQ.1 since this one was the closest in time.
To evaluate the validity of the FFQ, we compared the average daily nutrient intake values obtained for the two FFQs with those obtained for the three R24 [4,20,21,22]. Histograms with normal curves and the Kolmogorov-Smirnov test were used to check the normality of the distribution of each nutrient, taking p > 0.05 as indicating normality. Variables that did not meet this criterion were log10-transformed. Pearson's correlation coefficient was used to obtain an initial approximation of the correlation between the variables. Next, the intraclass correlation coefficient (ICC) was calculated [23,24] in order to account for inter- and intra-individual variation. To do this, the data were adjusted for energy according to the residual method proposed by Willett [24,25] to control the confounding effect of the calories. Finally, Lin's concordance correlation coefficient (CCC) was used to assess both the precision and the accuracy of the FFQ relative to the reference method. This enabled us to observe how far the data obtained by the two methods deviated from the 45° line through the origin, which corresponds to perfect concordance or perfect reproducibility.
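The cross-classification can be illustrated with a short sketch. It assumes that individuals are ranked into quintiles separately for each method using the sample's own percentiles, and that a gross classification error means being placed in opposite extreme quintiles; the text does not spell out either definition, so both are assumptions, as are the function names.

```python
import numpy as np


def quintile_labels(values):
    """Assign each value to a quintile (0 = lowest, 4 = highest)."""
    values = np.asarray(values, dtype=float)
    cuts = np.percentile(values, [20, 40, 60, 80])
    return np.searchsorted(cuts, values, side="right")


def cross_classification(reference, test):
    """5 x 5 contingency table plus agreement percentages."""
    q_ref = quintile_labels(reference)
    q_test = quintile_labels(test)
    table = np.zeros((5, 5), dtype=int)
    np.add.at(table, (q_ref, q_test), 1)  # count each (ref, test) pair
    distance = np.abs(q_ref - q_test)
    return {
        "table": table,
        "same_or_adjacent_pct": 100.0 * np.mean(distance <= 1),
        # Assumed definition: lowest quintile by one method,
        # highest quintile by the other.
        "gross_error_pct": 100.0 * np.mean(distance == 4),
    }


# Synthetic intakes for 100 individuals (not study data).
r24 = np.arange(100, dtype=float)
identical = cross_classification(r24, r24)
reversed_order = cross_classification(r24, r24[::-1])
print(identical["same_or_adjacent_pct"], reversed_order["gross_error_pct"])
```

Rows of the table correspond to the reference method and columns to the FFQ, so perfect agreement places every individual on the diagonal.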
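The two less familiar steps in this paragraph, the residual method of energy adjustment and Lin's concordance correlation coefficient, can be sketched as below. This is an illustration under stated assumptions rather than the procedure run in SPSS: a simple linear regression of nutrient on energy is assumed for the residual method, the CCC uses the usual population (1/n) moments, and the function names and example values are hypothetical.

```python
import numpy as np


def energy_adjust(nutrient, energy):
    """Residual method: regress nutrient intake on energy intake and
    return the residuals plus the intake predicted at mean energy."""
    nutrient = np.asarray(nutrient, dtype=float)
    energy = np.asarray(energy, dtype=float)
    slope, intercept = np.polyfit(energy, nutrient, 1)
    residuals = nutrient - (intercept + slope * energy)
    return residuals + (intercept + slope * energy.mean())


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two methods."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * covariance / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)


# Hypothetical values for five individuals (not study data).
energy_kcal = [1800, 2000, 2200, 2500, 3000]
protein_g = [60, 75, 70, 90, 95]
adjusted = energy_adjust(protein_g, energy_kcal)

print(lin_ccc([1, 2, 3], [1, 2, 3]))  # identical measurements
print(lin_ccc([1, 2, 3], [2, 3, 4]))  # same ranking, constant offset
```

The second call shows why the CCC is stricter than Pearson's coefficient: a constant offset between methods leaves the Pearson correlation at 1 but pulls the CCC below 1, because the points no longer lie on the 45° line.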
The Bland-Altman plot was used for energy and adjusted macronutrients [26,27] to check graphically the limits of agreement between the measurements carried out by the R24 and FFQ.
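The quantities behind a Bland-Altman plot can be sketched as follows, assuming the conventional limits of agreement of the mean difference ± 1.96 SD of the differences; the text does not state which limits were used, and the function name and example values are hypothetical.

```python
import numpy as np


def bland_altman(method_a, method_b):
    """Bias and 95% limits of agreement between two methods."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b                # plotted on the vertical axis
    pair_mean = (a + b) / 2.0   # plotted on the horizontal axis
    bias = diff.mean()
    sd = diff.std(ddof=1)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    outside = (diff < lower) | (diff > upper)
    return {
        "pair_mean": pair_mean,
        "diff": diff,
        "bias": bias,
        "lower": lower,
        "upper": upper,
        "outside_pct": 100.0 * outside.mean(),
    }


# Hypothetical energy intakes (kcal/day) for four individuals.
ffq = [2100, 1900, 2500, 2300]
r24 = [2000, 2000, 2400, 2400]
result = bland_altman(ffq, r24)
print(result["bias"], result["lower"], result["upper"])
```

Agreement is judged by how many points fall outside the interval between `lower` and `upper` and by whether the differences show a trend across `pair_mean`.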
The reproducibility of the FFQ was assessed by measuring the correlation between the two FFQs, administered 5 months apart, using the Pearson correlation coefficient, the ICC and the CCC [8,23,25]. The calculations were made using SPSS, version 15.0 (SPSS Inc., Chicago, USA). In all statistical tests, the significance level used was p < 0.05.
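For readers unfamiliar with the ICC, a minimal sketch is given below. The text does not state which form of the ICC was computed, so the one-way random-effects, single-measurement form is assumed here purely for illustration; the function name and example values are hypothetical.

```python
import numpy as np


def icc_oneway(measurements):
    """One-way random-effects ICC for single measurements.

    measurements -- array of shape (n_subjects, k_repeats),
                    e.g. columns FFQ.1 and FFQ.2.
    """
    data = np.asarray(measurements, dtype=float)
    n, k = data.shape
    subject_means = data.mean(axis=1)
    grand_mean = data.mean()
    # Between-subject and within-subject mean squares.
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((data - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical intakes on two occasions (not study data).
perfect = icc_oneway([[50, 50], [70, 70], [90, 90]])
crossed = icc_oneway([[1, 2], [2, 1]])
print(perfect, crossed)
```

Under this form the coefficient becomes negative when the variation within individuals exceeds the variation between them, so negative ICC values are possible.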

Characterisation of the sample
Three hundred and three individuals between 3 and 83 years old were selected, thus exceeding the previously calculated minimum sample size of 218.
Regarding gender distribution, the balance was almost exact: 156 men (51.3%) and 147 women. These data coincide with the gender structure recorded in the 2009 census [16] (Table 1).
As for the age and gender distribution, the population structure has also been respected. If we group this variable in the same age strata, we find that it follows a distribution similar to the one published in the census (Table 2).
The mean, median and SD of nutrient intakes are shown in Table 3. The values estimated by the FFQs were significantly higher than those of the R24 for calcium, phosphorus, potassium, iron, folic acid and vitamins B12, C and D. No significant differences were found for macronutrients and energy. Regarding the comparison of intake between both FFQs, no significant differences were found, except for calcium, phosphorus and riboflavin.
The percentages of classification errors are shown in Table 4. In both FFQs and the R24, between 60% and 96% of the individuals were classified in the same quintile or the adjacent one.

Validity
The validity results for energy and nutrient intake between R24 and FFQ.1 are shown in Table 5. Pearson's correlation coefficient ranged from 0.65 to 0.87 for macronutrients, from 0.55 to 0.89 for minerals and from 0.12 to 0.83 for vitamins. The highest correlations were obtained for energy, sodium, thiamine, vitamin B6 and folic acid and the lowest correlations were found for vitamins A and D. With unadjusted data, the ICC ranged from −0.25 to 0.88, with values above 0.60 for all nutrients except for calcium, iron and vitamins A and D. When adjusted for energy, the ICC values were lower for many nutrients, although nearly all showed a correlation greater than 0.6. The values for Lin's concordance correlation coefficient were higher for macronutrients and energy, with values between 0.76 and 0.89, compared with 0.44 to 0.81 for minerals and 0.09 to 0.76 for vitamins. In all cases, the values obtained for vitamins A and D showed the lowest correlation.
Table 6 shows the validity results between R24 and FFQ.2. The Pearson correlation coefficient was 0.87 for energy. For macronutrients, a range from 0.77 to 0.88 was observed, with minerals between 0.21 and 0.91 and vitamins between −0.11 and 0.84. The ICC for unadjusted data was 0.86 for energy, between 0.77 and 0.84 for macronutrients, from 0.10 to 0.87 for minerals (with calcium showing the lowest correlation) and between −0.026 and 0.85 for vitamins. Except for calcium, phosphorus, iron and vitamins A and D, all the nutrients obtained an ICC above 0.6. After adjusting the data, minimal variations were observed and the results were the same as for the unadjusted data, with calcium, iron and vitamins A and D being the micronutrients that obtained the lowest values. The values for CCC were higher for macronutrients and energy: between 0.84 and 0.88. For minerals, very different values were obtained: 0.04 for calcium, 0.38 for iron, 0.72 for potassium and 0.86 for sodium. In the case of vitamins, only vitamin C, folic acid and vitamin B12 obtained a CCC higher than 0.6. Vitamins A and D were the ones that obtained the lowest values: 0.02 and −0.07, respectively.
The Bland-Altman plots indicated a high level of agreement between both methods (Fig. 2) since there were very few observations outside the limits for energy, protein, lipids and carbohydrates.

Reproducibility of the FFQ
The Pearson correlation coefficient showed very high values for energy and macronutrients (0.61-0.87). In the case of minerals, values between 0.22 and 0.95 were obtained, with calcium at the bottom end and sodium and phosphorus at the top. The ICC gave a value of 0.87 for energy, a range from 0.61 to 0.84 for unadjusted macronutrients and from 0.80 to 0.91 for adjusted macronutrients. In the case of minerals, the values ranged between 0.12 and 0.93 for non-adjusted data and between 0.12 and 0.91 for adjusted data, with calcium obtaining the lowest results in both cases. The CCC showed very high values, most of which were above 0.6, with energy, proteins, lipids, sodium and potassium showing the greatest concordance. Again, calcium and vitamins A and D obtained the lowest values (Table 7). The Bland-Altman plots showed a high level of agreement between both FFQs, since very few observations were found outside the limits.

Discussion
An FFQ was designed and validated to make a nutritional assessment of food intake in the Peruvian Amazon. To determine the validity, the average nutrient intake estimated by the FFQ was compared with that obtained from three R24 collected on consecutive days. As in many other studies [3,9,20,25,28,29,30,31,32], R24 was chosen as the reference method.
To determine reproducibility, the FFQ data were compared on two different occasions, with a time interval of 5 months between them. Several authors have indicated that to ensure minimum variation between the results obtained from two surveys collected from the same individual, there must be an intervening time period of 4 to 6 months [2,26].
Regarding the distribution of the sample, it should be noted that there are not many individuals aged over 50 included in the age groups due to the low life expectancy of the population in this area as shown in the last census [11,12,16].
The results obtained to establish the validity show a close correlation, with an average Pearson's correlation coefficient of 0.70, an average ICC of 0.65 and an average CCC of 0.60. Approximately 60% of the nutrients had a CCC above 0.60. When validity was estimated by comparing R24 with FFQ.2, a slight decrease in the correlation values was observed (0.67, 0.63 and 0.54), probably because the time elapsed between the two surveys exceeded 12 weeks [2,26].
As for reproducibility, better results were obtained, with an average Pearson's correlation coefficient of 0.67, 0.64 ICC and 0.58 CCC. It should be noted that, in this case, 70.6% of the nutrients obtained a CCC above 0.6.
Only vitamins A and D and calcium had a much lower correlation. This is probably due to the lack of nutritional information about these vitamins and this mineral in the South American food composition tables [33,34] added to the Peruvian Amazon food database and used for this validation. The lack of information about common foods in food composition tables is considered a source of error in the validation of FFQs [2]. After adjusting the data by the residual method, there was a slight fall in the ICC for almost all nutrients, as shown in other studies [10,23,35,36,37].
Our results are higher than those obtained in the studies performed by Dehgham [8,38], Elorriaga [31], Satvinder [35] and Marcinkevage [39] and similar to others, such as the one carried out by Jackson [9] on the Jamaican population group, which obtained an ICC range between 0.5 and 0.88. That study was also unable to establish a correlation for fat-soluble vitamins. Another study, performed on a New Zealand population group by Wong JE, obtained a Spearman's correlation coefficient of 0.71 and an ICC of 0.69 when estimating reproducibility [40].
The Bland-Altman plot, based on a graphical interpretation, was used to obtain further information about the relationship between the FFQs and the results obtained via the R24. The results we observed are similar to those reported in the studies conducted by Trinidad [10], Zapata [32], Goni [41] and De Salvo [42], where a small number of individuals fell outside the recommended limits, confirming an acceptable level of agreement between both methods. The classification capacity obtained by the FFQ was greater than the one observed in other studies [25,31,32,43,44] when comparing the extreme misclassification measured with FFQ.1 and the average of the three R24. However, it should be noted that some studies, for comparison purposes, grouped the participants in tertiles [25] or quartiles [45] instead of quintiles, as was the case in our study.
We can, therefore, confirm that the correlation coefficients indicate good validity and reproducibility, since they are within the values considered acceptable (0.5-0.8) according to Cade et al. and Willett [2,46] (Fig. 3).

Conclusions
We conclude that the FFQ employed is a useful and valid tool to estimate the dietary intake of the Peruvian Amazon population, and thus, it can be used in future studies that link dietary patterns to chronic diseases.

Limitations
The main limitation of this research is the fact that it was impossible to use accepted biomarkers as a reference value for the validation of the FFQ. However, numerous researchers have also used R24 as the reference method [8,9,20,23,31,32,35,38,39,44,45].