Identification of dietary patterns by factor analysis and study of the relationship with nutritional status of rural adolescents using factor scores

This study was undertaken to examine food and nutrient consumption patterns and their relationship with nutritional status among rural adolescents in Orissa. It was a community-based cross-sectional study conducted at district level in the State of Orissa. Data on 686 adolescent boys and 689 adolescent girls were utilized. Factor analysis was used to identify dietary patterns, and discriminant analysis was used to examine their relationship with undernutrition. Among adolescent boys, there were five patterns among food groups and three patterns among nutrients, explaining 52% and 76% of the total variation, respectively. Similarly, among adolescent girls, there were seven patterns among food groups and three patterns among nutrients, explaining 67% and 80% of the total variation. The discriminant analysis using the factor scores correctly classified 56% of adolescent boys and 53% of girls overall. About 46% of boys who were actually thin were predicted as normal, while 40% who were normal were predicted as thin. Among girls, 50% who were actually thin were predicted as normal, while 36% who were normal were predicted as thin. In conclusion, there is a considerable relationship between dietary patterns and nutritional status among rural adolescents.


Background
Adolescence is the period of most rapid growth and human development after infancy [1]. Adequate intake of foods and nutrients contributes significantly to growth and development during adolescence, particularly among girls, the future mothers [2]. The complexity of dietary patterns presents a challenge when studying the prevalence of undernutrition in relation to dietary intakes. Food intake is generally studied in terms of adequacy of nutrients [3]. However, foods contain other chemical compounds, some of which are established, some poorly characterized, and others completely unknown and therefore not measurable. Since there is very little information about dietary intakes and their relationship with the nutritional status of the adolescent population in India [4], an attempt was made to examine the relationship between dietary intakes and nutritional status.
About 45% of rural adolescents in India are currently reported to suffer from undernutrition, as assessed by Body Mass Index (BMI) < −2SD [5]. The relationship between intake of a food group and the prevalence of undernutrition may erroneously be attributed to a single component, overlooking the multicollinearity that exists between nutrients and foods, which can be demonstrated by employing multivariate statistical procedures, such as principal components and factor analysis, to derive food consumption patterns [6]. Factor analysis is used to find latent variables, or factors, among observed variables [7]. In other words, factor analysis reduces the number of variables by grouping together variables with similar characteristics. The reduced set of factors can also be used for further analysis. Thus, using factor analysis, it is possible to study food and nutrient intake patterns and their relationship with nutritional status. For developing general descriptions of dietary patterns, Principal Component Analysis (PCA) followed by factor analysis is used. The objective is to transform a large set of correlated variables into a smaller set of non-correlated variables, known as principal components or factors [8]. In factor analysis, rather than a diet indicator being established in advance, the data objectively indicate how the measurements are clustered. The aim of this technique is to identify the underlying structure in the data matrix by summarizing and condensing the data to arrive at a systematic measurement of the diet. To summarize the data, factor analysis seeks dimensions that, when interpreted and understood, describe the data in terms of a much smaller number of items than the individual variables [8].
The objective of the present analysis is to describe the food and nutrient consumption patterns among an apparently homogeneous rural adolescent population, and to relate these to the prevalence of thinness in the State of Orissa, India, using factor analysis.

Study design and sample
The data on diet and nutritional status collected in the survey of district nutrition profiles in Orissa state were used. In total, 12,000 households (400 households per district) from all 30 districts of Orissa state were covered [9]. Anthropometric measurements, viz. height and weight, were taken on all available individuals in the households. In every alternate household covered for nutrition assessment, a 24-hour recall family diet survey was carried out. Individual dietary intakes were also assessed for 12,621 individuals of different ages and both sexes. From the average daily intake of foods, nutrient intakes were computed using food composition tables [10]. The present analysis included data on 686 adolescent boys and 689 adolescent girls, aged ≥ 10 and < 18 years, for whom data on both anthropometry and dietary intake were available.

Data collection
Data were collected by qualified staff (nutritionists, social workers and anthropologists) who were trained for a period of 3 weeks in standard survey methodologies. Anthropometric measurements such as height and weight were collected using standard equipment and procedures [11]. Height was measured using an anthropometer rod and weight was measured using a SECA weighing balance. The diet survey was carried out using the 24-hour recall method in alternate households covered for anthropometry [12]. To ensure quality control, scientists from the Institute revisited the households completed on the previous day by the project staff.

Statistical analysis
The mean, median, and inter-quartile range of various food-groups and nutrients for adolescent boys and girls were calculated. Diet patterns were obtained by exploratory factor analysis for 13 food-groups and 11 (3 macro and 8 micro) nutrients. Further, the proportion of adolescents consuming < 50% of Recommended Dietary Intake (RDI) [13] for foods and nutrients was calculated for different age groups of both the sexes.
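The proportion of adolescents consuming less than 50% of the RDI can be illustrated with a minimal sketch. The function name `pct_below_half_rdi`, the intake values and the RDI of 400 g are hypothetical and are not taken from the survey data or from reference [13].

```python
def pct_below_half_rdi(intakes, rdi):
    """Percentage of individuals whose intake is below 50% of the RDI."""
    cutoff = 0.5 * rdi
    below = sum(1 for intake in intakes if intake < cutoff)
    return 100.0 * below / len(intakes)


# Hypothetical daily intakes (g) of one food group for four adolescents,
# compared against a hypothetical RDI of 400 g.
example_intakes = [100, 250, 300, 120]
example_pct = pct_below_half_rdi(example_intakes, rdi=400)
print(example_pct)
```

In practice the RDI would differ by age group and sex, so the calculation would be repeated within each age-sex stratum.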
Exploratory factor analysis, an explorative multivariate statistical technique, was used to identify factors in a set of dietary measurements. Such factors correspond to indicators, and all variables were considered simultaneously, each one in relation to the others. Principal Component Analysis was used for extraction of factors, with orthogonal rotation (varimax option) to derive non-correlated factors and to minimize the number of indicators that have high loadings on any one factor [6,7]. The first component extracted is the one that accounts for the maximum possible variance in the dataset. The second component, independent of the first, is the one that explains the largest possible share of the remaining variance, and so on, without the components being correlated with each other [8].
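The extraction and rotation steps described above can be sketched as follows. This is an illustrative NumPy sketch on synthetic data, not the SPSS procedure used in the study; the function names, the simulated variables and the regression method for factor scores are assumptions made for illustration. Components are retained when their eigenvalue exceeds 1, the criterion reported in the Results.

```python
import numpy as np


def pca_loadings(X, min_eigenvalue=1.0):
    """Principal-component loadings from the correlation matrix of X."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue             # retain components above the cutoff
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()   # share of total variance per component
    return R, loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns rotated loadings and rotation matrix."""
    p, k = loadings.shape
    T = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ T
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        U, s, Vt = np.linalg.svd(loadings.T @ target)
        T = U @ Vt
        if s.sum() < previous * (1 + tol):      # stop when the criterion stalls
            break
        previous = s.sum()
    return loadings @ T, T


def regression_scores(X, R, rotated):
    """Factor scores by the regression method (an assumed choice)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return Z @ np.linalg.solve(R, rotated)


# Synthetic data: six observed variables driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                   [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
X = latent @ mixing.T + 0.5 * rng.normal(size=(200, 6))

R, loadings, explained = pca_loadings(X)
rotated, T = varimax(loadings)
scores = regression_scores(X, R, rotated)
```

Because the rotation is orthogonal, it redistributes variance among the retained factors without changing each variable's communality or the total variance explained.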
Since the dietary data were collected using the 24-hour recall method, actual intakes were measured. When quantitative data are available, factor analysis is a suitable statistical method and gives better results than the Food Consumption Score or the Dietary Diversity Score, because those methods use a scoring procedure instead of actual dietary intakes. Since we did not aim to compare various methods of determining dietary intake patterns, the other methods were not applied.
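The adequacy of the data was evaluated using the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity. The KMO measure, which represents sampling adequacy, compares the magnitude of the observed correlation coefficients with that of the partial correlation coefficients. Bartlett's test of sphericity examines whether the correlation matrix differs significantly from an identity matrix, that is, whether the variables are sufficiently correlated for factor analysis.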
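For readers who want to see how these two diagnostics are computed, the following is a minimal NumPy sketch based on the standard textbook formulas. It is not the SPSS output used in the study, and the function names and the example correlation matrix are hypothetical. The Bartlett statistic is returned with its degrees of freedom; a p-value would be obtained from the chi-square distribution.

```python
import numpy as np


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlations (off-diagonal)
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    r2 = (R[off_diagonal] ** 2).sum()
    p2 = (partial[off_diagonal] ** 2).sum()
    return r2 / (r2 + p2)


def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test of sphericity."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return statistic, df


# Hypothetical correlation matrix: three variables, all pairwise r = 0.5.
R_example = np.array([[1.0, 0.5, 0.5],
                      [0.5, 1.0, 0.5],
                      [0.5, 0.5, 1.0]])
kmo_value = kmo(R_example)
statistic, df = bartlett_sphericity(R_example, n=100)
```

A KMO value closer to 1 means the partial correlations are small relative to the observed correlations, which favours factor analysis.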
Undernutrition of a given individual was assessed on the basis of BMI, computed as weight in kg divided by the square of height in metres (kg/m²). Using WHO reference values [14], the adolescents were categorized into one of two groups: Group 1, 'Thinness' (BMI < median −2 SD units), and Group 2, 'Normal' (BMI ≥ median −2 SD units).
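The grouping rule can be written out as a short sketch. The reference median and SD below are placeholders chosen for illustration only; the actual cutoffs are the age- and sex-specific WHO reference values [14], which are not reproduced here, and the function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_group(weight_kg, height_m, ref_median, ref_sd):
    """Return 'Thinness' if BMI < median - 2 SD of the reference, else 'Normal'."""
    cutoff = ref_median - 2 * ref_sd
    return "Thinness" if bmi(weight_kg, height_m) < cutoff else "Normal"


# Placeholder reference values (NOT WHO values): median 18.0, SD 2.0, cutoff 14.0.
print(bmi_group(30.0, 1.5, ref_median=18.0, ref_sd=2.0))  # BMI 13.3, below cutoff
print(bmi_group(45.0, 1.5, ref_median=18.0, ref_sd=2.0))  # BMI 20.0, above cutoff
```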
Discriminant function analysis was used to study the relationship between food and nutrient intake and nutritional status [8]. The factor scores obtained from the factor analysis of foods and nutrients were treated as continuous independent variables, and the BMI category of the adolescent was treated as the dichotomous dependent variable. Statistical analysis was performed using SPSS software (version 19.0).
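A two-group discriminant function of this kind can be sketched with Fisher's linear discriminant. The sketch below assumes equal prior probabilities and a pooled covariance matrix, and it runs on synthetic scores; it is not the SPSS analysis reported in this paper, and the function names are hypothetical. The separation between the synthetic groups is deliberately large and says nothing about the study data.

```python
import numpy as np


def fit_two_group_lda(X, y):
    """Fisher's linear discriminant for groups coded 0 and 1 (equal priors assumed)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    pooled = ((n0 - 1) * np.cov(X0, rowvar=False)
              + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(np.atleast_2d(pooled), m1 - m0)   # discriminant weights
    cutoff = w @ (m0 + m1) / 2.0                          # midpoint between group means
    return w, cutoff


def predict_group(X, w, cutoff):
    """Assign group 1 when the discriminant score exceeds the cutoff."""
    return (X @ w > cutoff).astype(int)


# Synthetic factor scores for two groups of 100 individuals each.
rng = np.random.default_rng(1)
group0 = rng.normal(loc=0.0, size=(100, 2))
group1 = rng.normal(loc=3.0, size=(100, 2))
X_scores = np.vstack([group0, group1])
y_group = np.array([0] * 100 + [1] * 100)

w, cutoff = fit_two_group_lda(X_scores, y_group)
accuracy = (predict_group(X_scores, w, cutoff) == y_group).mean()
```

Comparing predicted with observed group membership in this way yields the kind of classification table from which the percentage correctly classified is read.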

Ethical approval
The study was approved by the Institutional Ethical Review Committee and the Scientific Advisory Committee (SAC) of the National Institute of Nutrition. The study did not involve any biochemical investigations; hence, only informed written consent was obtained, from the head of the village and from the head of each household.

Results
The average and interquartile range of intakes of various foods according to sex are presented in Table 1. The food grouping followed that adopted in the National Nutrition Monitoring Bureau surveys in India over the past 40 years, which is well accepted by nutritionists. The mean and median intakes were comparable in both boys and girls with respect to cereals and millets, pulses and legumes, roots and tubers, other vegetables, condiments and spices, and fats and oils. However, the median intakes of nuts and oil seeds, fruits, fish and other sea foods, meat and poultry, milk and milk products, and sugar and jaggery indicated that 50% were consuming these foods inadequately. In fact, 75% of the adolescents were consuming income-elastic foods such as fish and other sea foods, meat and poultry, and milk and milk products in amounts < 50% of the Recommended Dietary Intake (RDI). However, the intakes were more or less similar in adolescent boys and girls.
The median intakes of various nutrients according to age group and sex are presented in Table 2. The median energy intakes for boys and girls were 1902 kcal and 1844 kcal, respectively. A large variation in the consumption of various nutrients was observed among adolescents. There were no sex differentials in the median intakes of various nutrients.
Since the Recommended Dietary Intakes (RDIs) differ by age group and sex, the intakes were compared with the RDI according to age group and sex (Table 3). About 7 to 16% of adolescents were not meeting even 50% of the RDI for cereals. In the case of pulses, more than half of the adolescents were not meeting 50% of the RDI. An improving trend with age in the consumption of different foods was observed in both boys and girls, with no significant sex differentials.
The consumption of energy was relatively better, with 86% consuming > 50% of the RDA, while about 70% were consuming > 50% of the RDA for protein (Table 4). More than three-fourths of the adolescent boys and girls consumed less than 50% of the RDA for fats and for micronutrients such as calcium, riboflavin, iron and vitamin A.
The KMO measure indicated sufficient correlation between different foods (boys: 0.613; girls: 0.513) to proceed with factor analysis. Similarly, in the case of nutrients, the KMO measure indicated a higher correlation between nutrients (boys: 0.793; girls: 0.755). The Bartlett test of sphericity for foods and nutrients among boys and girls was highly significant (p < 0.01), indicating that the variables were sufficiently correlated for factor analysis.
Five components for boys and seven components for girls were extracted by factor analysis for the different food groups (Table 5). The five components (factors) in the initial solution had an eigenvalue over 1, accounting for about 52% of the observed variation in the food-consumption pattern among the boys, while among the girls the seven factors accounted for about 67% of the observed variation in the food-consumption pattern.
The first factor, which accounted for 16.6% of the total variance among the boys, was labeled as income-elastic foods, as high factor loadings were observed for sugar and jaggery, followed by milk and milk products, fats and oils, and pulses and legumes. The second factor explained 10.1% of the total variance and was labeled as plant foods, with green leafy vegetables, fruits, and cereals and millets as the major contributors. The third factor accounted for 9.5% of the total variance and was labeled as traditional, contributed by the intake of fish and other sea foods, meat and poultry, and condiments and spices. The fourth factor explained 8.4% of the total variance and was also labeled as plant foods, contributed by other vegetables and roots and tubers. The fifth and last factor explained 7.9% of the total variance and was labeled as protein-rich foods (nuts and oilseeds).
In the case of the girls, the first factor, which accounted for 13.7% of the total variance, was labeled as income-elastic foods, with high factor loadings for milk and milk products followed by sugar and jaggery. The second factor explained 10.4% of the total variance and was labeled as traditional (fish and other sea foods, and fats and oils). The third factor accounted for 9.7% of the total variance and was labeled as plant foods, characterized by the intake of other vegetables and roots and tubers. The fourth factor explained 9.2% of the total variance and was labeled as staple, comprising cereals and millets along with fruits. The fifth factor explained 8.0% of the total variance and was labeled as plant protein (pulses and legumes). The sixth and seventh factors each explained 7.7% of the total variation and consisted of meat and poultry, condiments and spices, nuts and oil seeds, and green leafy vegetables (Table 5).
In the case of nutrients, three components were extracted for each sex (Table 6). Among the boys, energy, thiamin, niacin, protein, riboflavin, and free folic acid had higher loadings on factor 1, which explained 51.8% of the total variance and was labeled as macronutrients and B-vitamins. Factor 2, labeled as 'vitamins', had high loadings for vitamins A and C and explained 14.5% of the total variation. The third factor, characterized by the intake of calcium and iron, was labeled as 'minerals' and accounted for 9.3% of the total variance. In the case of girls, factor 1 explained 47.2% of the total variance, with loadings similar to those observed in boys. The second factor, which explained 17% of the total variation, was labeled as fats and minerals (calcium and total fat), and the third factor, explaining 15.4% of the total variation, was labeled as macronutrients and vitamins.
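As a quick arithmetic check, the per-factor percentages reported in the text can be summed to recover the cumulative variance explained by each solution. The dictionary keys below are hypothetical labels; the numbers are the percentages quoted above. Small differences from the rounded totals quoted in the text (about 52%, 67%, 76% and 80%) are to be expected, since each per-factor value is itself rounded.

```python
# Percentage of total variance explained by each retained factor, as reported in the text.
variance_explained = {
    "boys_food_groups": [16.6, 10.1, 9.5, 8.4, 7.9],
    "girls_food_groups": [13.7, 10.4, 9.7, 9.2, 8.0, 7.7, 7.7],
    "boys_nutrients": [51.8, 14.5, 9.3],
    "girls_nutrients": [47.2, 17.0, 15.4],
}

# Cumulative variance explained per solution, rounded to one decimal place.
totals = {name: round(sum(values), 1) for name, values in variance_explained.items()}

for name, total in totals.items():
    print(f"{name}: {total}%")
```

The sums come to 52.5% and 66.4% for food groups (boys and girls) and 75.6% and 79.6% for nutrients (boys and girls).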
The prevalence of thinness (BMI < −2SD) was 49.4% among the boys and 46.6% among the girls.
The distributions of predicted thinness based on the BMI values derived by the discriminant function analysis using the factor scores of various foods and nutrients, as against those based on the anthropometric measurements, are presented in Table 7. The discriminant analysis correctly classified 56% of adolescent boys and 53% of girls overall. About 46% of boys who were actually thin were predicted as normal, while 40% who were normal were predicted as thin. Among girls, 50% who were actually thin were predicted as normal, while 36% who were normal were predicted as thin.
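To make clear how the overall percentage correctly classified and the two misclassification percentages are derived from a two-by-two classification table, a minimal sketch follows. The counts are hypothetical and do not reproduce Table 7; the function name is also an assumption.

```python
def classification_summary(thin_as_thin, thin_as_normal, normal_as_thin, normal_as_normal):
    """Summarize a 2x2 table of actual versus predicted nutritional status."""
    n_thin = thin_as_thin + thin_as_normal
    n_normal = normal_as_thin + normal_as_normal
    total = n_thin + n_normal
    return {
        "overall_correct_pct": 100.0 * (thin_as_thin + normal_as_normal) / total,
        "thin_predicted_normal_pct": 100.0 * thin_as_normal / n_thin,
        "normal_predicted_thin_pct": 100.0 * normal_as_thin / n_normal,
    }


# Hypothetical counts: 100 actually thin and 100 actually normal adolescents.
summary = classification_summary(thin_as_thin=50, thin_as_normal=50,
                                 normal_as_thin=40, normal_as_normal=60)
print(summary)
```

With these counts, 55% are correctly classified overall, 50% of the thin group are predicted as normal, and 40% of the normal group are predicted as thin. The overall figure depends on the relative sizes of the two groups as well as on the two within-group rates.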

Discussion
Studies have shown that the conventional approach to dietary patterns has some limitations for studying the relationship between dietary patterns and type 2 diabetes as well as cardiovascular diseases [15,16]. Our earlier study established a relationship between factor scores of dietary patterns derived through factor analysis and chronic energy deficiency [17]. In the present manuscript, a similar technique has been used to demonstrate the relationship between dietary pattern and nutritional status among adolescents.
The dietary data were collected using the 24-hour recall method, in which actual intakes were measured. When quantitative data are available, multivariate factor analysis, which is based on the interrelations within the entire data set, is a suitable statistical method. The results obtained through this method are robust in identifying dietary patterns.
It is interesting to note that half of the rural adolescents were not consuming green leafy vegetables, nuts and oil seeds, fruits, fish and other sea foods, meat and poultry, milk and milk products, and sugar and jaggery. Low consumption of green leafy vegetables among adolescents may be due to non-availability and low purchasing capacity among the study population. Consequently, about 70-90% of them were not meeting even 50% of the RDI for various nutrients such as calcium, iron, vitamin A and riboflavin [18]. Results of an earlier study revealed that both PCA and cluster analysis are useful approaches for the assessment of dietary patterns [19]. A common criticism of the two techniques is that they involve several subjective but important decisions, such as the grouping of foods and nutrients and possible transformations of variables [20].
PCA also involves decisions about the number of components to be retained and their subsequent labeling [20]. Another disadvantage of these techniques is that though they generate patterns based on variation in diet, there is no guarantee that these patterns will be predictive of a particular health outcome. However, the techniques have the advantage that they are empirically derived and are, therefore, not limited by mere a priori knowledge [8].
In developing countries, studies identifying dietary patterns and their relationship with nutritional status are scarce. The intakes of foods and nutrients are highly correlated, and classification methods based on univariate analysis may lead to flawed estimates [8]. Therefore, multivariate methods, such as factor analysis, represent an alternative approach to the evaluation of individual foods and nutrients for examining the association of dietary patterns with nutritional status. Moreover, it should be kept in mind that individuals' intakes of nutrients depend on their intakes of different types of foods, which are influenced by many factors, such as cultural, socioeconomic and demographic characteristics. Describing food intake in terms of different consumption patterns may be useful in developing community-based intervention programmes. It is possible, using factor analysis, to find a smaller number of measures that would permit the identification of persons who are nutritionally at risk [21]. Several studies have examined the relationship between dietary patterns and nutritional status, overweight, obesity, etc., but there were many inconsistencies in establishing a clear relationship between them [22,23]. However, the present data showed a considerable relationship between dietary pattern and nutritional status, as indicated by the distribution of predicted BMI based on factor scores. About half of the adolescents with thinness (boys: 54% and girls: 50%) were correctly classified based on the scores obtained by factor analysis. Factor scores derived by factor analysis correctly classified 52.8% (thin as thin and normal as normal). Among adolescent girls, 50.4% who were originally thin (< −2SD) were predicted as normal, and 36.3% who were originally normal were predicted as thin. Similar findings were also observed among the adult population in our earlier study [17].
The National Institute of Nutrition conducted a workshop to disseminate the results of the survey to all stakeholders of the Government of Orissa and distributed brochures on the need for consumption of locally available foods such as green leafy vegetables and fruits. This will be useful for politicians, policy planners and development partners in formulating optimal dietary interventions for this vulnerable population [24].

Conclusion
Factor analysis allowed the identification of dietary patterns based on data on food and nutrient intake. Among the rural adolescent population, the factors extracted for boys and girls differed in the case of food groups but were similar in the case of nutrients.
There exists a relationship between specific dietary patterns and thinness among the rural adolescent population. These results will be useful for identifying adolescent boys and girls with thinness in the community and will help planners in formulating dietary interventions for them.