Application of Multivariate Analysis to Agronomic Trial

CHAPTER ONE

 Objective of the Study

The specific objectives of the study are:

  • To test the multivariate normality assumption on agronomic data
  • To study the relationship of the agronomic measured characteristics in both the growth and yield
  • To determine the percent contribution of both direct and indirect correlation in path analysis
  • To reduce the dimensionality of the measured characteristics from m to p, where p < m

CHAPTER TWO

LITERATURE REVIEW

Introduction

Agronomy is the science of crop production and of managing cultural practices so as to improve the productivity of both the crop and the soil. In practice, agronomic trials include varietal trials, fertilizer trials, planting-date trials, weed management, inter- and intra-row spacing, crop protection, and so on. Although few examples of multivariate analysis of agronomic trials exist in the literature, such analysis can provide useful additional information as a supplement to the usual univariate analysis.

Multivariate techniques provide statistical tools that researchers can use to analyze a complete set of data. They handle problems that involve independent variables and several dependent variables, which may be correlated with one another to varying degrees (Anderson, 1984). This chapter provides an overview of multivariate concepts and a detailed treatment of path analysis, principal component analysis and factor analysis. This thesis focuses on reducing the dimensionality of the agronomic variables and on testing the assumption of multivariate normality.

Normality

Normality of data can be assessed either from the shape of the histogram or from the linearity of the quantile-quantile (Q-Q) plot. In a Q-Q plot, we compare the observed values of a variable against the corresponding standardized values (Kendall et al., 2004). Suppose that f(x, θ) is the density function of the population, where θ is the unknown parameter. The Q-Q plot can be used to assess normality through the order statistics X(1) ≤ X(2) ≤ X(3) ≤ … ≤ X(n), where X(i) is the i-th quantile of the sample and Z(i) is the corresponding standard normal quantile. If the plot of the pairs (X(i), Z(i)), that is, the Q-Q plot, is linear, then the variable is assessed as normal. The linearity of the Q-Q plot can also be tested with a sample correlation coefficient, rQ (Kendall et al., 2004).

Let ρ be the correlation between X(i) and Z(i). We reject H0: ρ = 1 (i.e. linearity) in favour of H1: ρ < 1 if the sample correlation coefficient is less than the critical value. The correlation coefficient is given by:

$$ r_Q = \frac{\mathrm{Cov}\left(X_{(i)}, Z_{(i)}\right)}{\sqrt{\mathrm{Var}\left(X_{(i)}\right)\,\mathrm{Var}\left(Z_{(i)}\right)}}, \qquad 0 \le r_Q \le 1 $$

The Q–Q plot is considered linear if the value of rQ is high, i.e. close to 1.
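As a rough illustration of how rQ could be computed, the following minimal sketch correlates the ordered sample with standard normal quantiles. The plotting positions (i - 0.5)/n, the function name `qq_correlation` and the sample values are assumptions made for illustration; they are not taken from the study, and other plotting-position conventions exist.

```python
from math import sqrt
from statistics import NormalDist


def qq_correlation(sample):
    """Correlation between the ordered sample and standard normal quantiles."""
    n = len(sample)
    x = sorted(sample)  # order statistics X(1) <= ... <= X(n)
    # Assumed plotting positions (i - 0.5) / n for the normal quantiles Z(i).
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    cov = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_z = sum((b - mean_z) ** 2 for b in z)
    # The common 1/(n-1) factors cancel between numerator and denominator.
    return cov / sqrt(var_x * var_z)


# Hypothetical measurements, for illustration only.
heights = [41.2, 38.5, 44.1, 40.3, 39.8, 42.7, 43.0, 37.9, 40.9, 41.6]
r_q = qq_correlation(heights)
print(f"r_Q = {r_q:.4f}")
```

A value of `r_q` close to 1 would be read as support for linearity of the Q-Q plot; the critical value to compare it against depends on the sample size and the chosen significance level.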

Path Analysis

Path analysis is a statistical technique that differentiates between correlation and causation. It partitions correlations into direct and indirect effects (Afifi and Clark, 1984). The technique is based on multiple linear regression and generates standardized partial regression coefficients (path coefficients). Because path coefficients are independent of the units of measurement, the relative importance of the causal relationships can be determined (Loehlin, 1987; Pantone et al., 1989). It has been used rather extensively in agronomic studies of the factors affecting plant yield (Pantone et al., 1992).

Path coefficient analysis partitions the simple correlation coefficient between linearly related variables into the direct effect of an independent variable on the dependent variable and the indirect effect(s) exerted through the other independent variables, each indirect effect being the product of the relevant correlation coefficient and path coefficient. The simple correlation coefficient is also known by several other names, such as linear correlation coefficient, first-degree polynomial response and total correlation coefficient (Dewey and Lu, 1959). It is called the linear correlation coefficient to distinguish it from curvilinear responses. It is known as the first-degree polynomial response because polynomials of higher degree, such as quadratic (second degree), cubic (third degree) and quartic (fourth degree), can also be estimated, and their significance is tested in the analysis of variance (ANOVA) by comparing the response with the deviation from that response. Normally, polynomials of degree four to five explain all the variance.

It is essential that a cause-and-effect relationship between the independent and dependent variables be established beyond doubt through logic and the literature. Otherwise, the simple correlation coefficient becomes one of the most abused statistics, since any two variables can generally be shown to be related. The way the case of smoking and lung cancer lingered on for decades is a good example.

The principle of path analysis is explained using a simple example with two independent variables and one dependent variable, followed by another example with three independent variables and one dependent variable. It is also possible to compute path coefficients with more than one related dependent variable.
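For the case of two independent variables and one dependent variable, the partition can be sketched as follows. With standardized variables, the correlations satisfy r1y = p1 + r12·p2 and r2y = p2 + r12·p1, where p1 and p2 are the path coefficients (direct effects). The function name and the correlation values below are hypothetical and are not taken from the cowpea data.

```python
from math import sqrt


def path_two_predictors(r1y, r2y, r12):
    """Split each correlation with y into a direct and an indirect effect.

    Solves  r1y = p1 + r12 * p2  and  r2y = p2 + r12 * p1  for p1, p2.
    """
    denom = 1.0 - r12 ** 2
    if denom == 0:
        raise ValueError("x1 and x2 are perfectly correlated")
    p1 = (r1y - r12 * r2y) / denom  # direct effect of x1 on y
    p2 = (r2y - r12 * r1y) / denom  # direct effect of x2 on y
    r_squared = p1 * r1y + p2 * r2y  # variation in y explained by x1 and x2
    return {
        "x1": {"direct": p1, "indirect_via_x2": r12 * p2},
        "x2": {"direct": p2, "indirect_via_x1": r12 * p1},
        "residual": sqrt(1.0 - r_squared),  # path from unexplained causes
    }


# Hypothetical correlations, for illustration only.
effects = path_two_predictors(r1y=0.80, r2y=0.60, r12=0.50)
for name in ("x1", "x2"):
    print(name, effects[name])
print("residual", round(effects["residual"], 4))
```

In this sketch the direct and indirect effects of each variable add up to its simple correlation with the dependent variable, which is the property that path analysis exploits.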

The choice of variables and their number is very important. It is not worth selecting variables with negative correlations and path coefficients, because they do not contribute to the dependent variable. Also, a larger number of factors may result in a high residual, which cannot be explained.

 

CHAPTER THREE

METHODOLOGY

Sources of Data

The data used in this project were obtained from the research station of the Institute for Agricultural Research, ABU, Samaru, Zaria, Nigeria. They are the results of a rain-fed cowpea fertility trial conducted in the 2004/05 and 2005/06 rainy seasons. The experiment was laid out in a randomized complete block design with five fertilizer rates (0, 30, 60, 90 and 120 grams), replicated three times. Measurements were taken at both the vegetative and the production stages.

Quantile-Quantile Plot

Before any multivariate method is used, we must test for normality. The multivariate tools PCA, FA and CA cannot be applied unless the variables X1, X2, . . . , XP have a multivariate normal distribution. If each Xi is normally distributed, then it is reasonable to assume that X1, X2, . . . , XP have a multivariate normal distribution. If not, an appropriate transformation, such as the log, square root, arcsine or Box-Cox transformation, can be applied in order to obtain overall multivariate normality.
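The transformations named above can be sketched in a few lines. The Box-Cox form used here is the usual one-parameter power transformation, (x^λ - 1)/λ for λ ≠ 0 and log x for λ = 0; the helper names and the sample values are assumptions made for illustration, and the choice of λ would normally be guided by the data.

```python
from math import asin, log, sqrt


def box_cox(x, lam):
    """One-parameter Box-Cox transformation of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return log(x)  # limiting case as lambda tends to 0
    return (x ** lam - 1.0) / lam


def arcsine_sqrt(p):
    """Arcsine square-root transformation of a proportion in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return asin(sqrt(p))


# Hypothetical right-skewed measurements, for illustration only.
values = [1.2, 1.5, 2.1, 3.4, 8.9]
logged = [box_cox(v, 0) for v in values]      # log transformation
rooted = [sqrt(v) for v in values]            # square root transformation
powered = [box_cox(v, 0.5) for v in values]   # Box-Cox with lambda = 0.5
print(logged, rooted, powered, sep="\n")
```

After a transformation is applied, the Q-Q plot would be drawn again on the transformed values to check whether linearity has improved.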

CHAPTER FOUR

RESULTS AND DISCUSSION

Introduction

This chapter presents the results of the analyses conducted in this research work. The data obtained from the Institute for Agricultural Research, Samaru, ABU Zaria were analyzed using the SPSS ver. 15 statistical package. Path analysis was carried out on the variables, and the correlation coefficients were partitioned into direct and indirect contributions; shoot dry weight and crop growth rate were observed to contribute most to the yield of cowpea. Principal component analysis was carried out on a 90 x 10 data matrix, whose dimensions correspond to the runs of the trial and the parameters measured on the crop. It extracted three components from the original dimensions, which together explain about 95.6% of the variation.

Prior to the analysis, the data were subjected to a normality test using the Q-Q plot. Since the Q-Q plots below are linear, the variables are assessed as normal and the data are taken to follow a multivariate normal distribution. Had this not been the case, a transformation such as the log, square root or exponential transformation could have been applied in order to obtain overall multivariate normality.

CHAPTER FIVE

SUMMARY, CONCLUSION AND RECOMMENDATION

Summary and Conclusion

The results show that the data used in the study satisfy the normality assumption, as shown by the Q-Q plots in Fig. 1 and 2: the points on each Q-Q plot fall along a straight line. The data therefore follow a multivariate normal distribution.

Table 3 of chapter four shows that Pod Yield and Number of Pods were positively and significantly correlated with all the components assessed (Number of Branches, Weight of Defoliated Leaves, Number of Pods, Length of Pods, Shoot Dry Weight, Plant Height, Number of Leaves, Leaf Area Plant, and Crop Growth Rates) except Leaf Area Index and Weight of Defoliated Leaves. The various parameters exhibited significant interrelationships with each other, which shows their importance to the pod yield of cowpea. Ado et al. (1988) reported that the correlation coefficient between two variables is the sum of the paths connecting them; partitioning the correlation gives the direct and indirect contributions of the different independent parameters (components) to the dependent variable (pod yield). In Table 2, Shoot Dry Weight and Crop Growth Rate contribute most to the yield of cowpea. Principal component analysis was also used to reduce the dimensionality of the data collected. The variables were reduced to three components, with the first component explaining 59.15%, the second 24.10% and the third 12.32% of the variation. Cumulatively, 95.57% of the variation was explained by the three components, as shown in both the table of principal component coefficients and the scree plot (Table 1 and Fig. 3).
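The cumulative figure can be checked directly from the three percentages reported above. The short sketch below does this and also shows how a retention rule based on a cumulative threshold might look; the helper name `components_needed` and the 90% threshold are assumptions for illustration and are not taken from the study.

```python
from itertools import accumulate

# Percent of variation explained by each retained component (from the text).
explained = [59.15, 24.10, 12.32]

# Running totals, rounded to two decimals to match the reported precision.
cumulative = [round(total, 2) for total in accumulate(explained)]


def components_needed(percentages, threshold):
    """Smallest number of leading components whose cumulative share
    reaches the threshold, or None if the threshold is never reached."""
    total = 0.0
    for k, pct in enumerate(percentages, start=1):
        total += pct
        if total >= threshold:
            return k
    return None


print(cumulative)
print(components_needed(explained, 90.0))  # hypothetical 90% threshold
```

Under the assumed 90% threshold, all three components are needed, which is consistent with the three components retained in the study.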

Recommendation

This research provides baseline information that the data generated from field trials of cowpea follow a multivariate normal distribution, and that data reduction techniques can be employed to help in analyzing and understanding such data.

This should encourage users of this type of data to apply multivariate statistical methods, all the more so because the variables are often correlated.

REFERENCES

  • Adebayo, A. A. and Tukur, A. L. (1999). Adamawa in the maps. 1st ed. Pp. 112.
  • Ado, S. G., Tanimu, B., Echekwu, C. A. and Alabi, S. O. (1988). Correlation and path coefficient analysis between yield and other characteristics in sunflower. Nigeria Journal of Agronomy.
  • Afifi, A. A. and Clark, V. (1984). Path analysis. Pp. 235-237. In Computer-Aided Multivariate Analysis. Lifetime Learning Publ., Belmont, CA.
  • Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. 2nd edition. New York: John Wiley and Sons.
  • Dewey, D. R. and Lu, K. K. (1959). A correlation and path-coefficient analysis of components of crested wheatgrass seed production. Agronomy Journal 51:515-518.
  • Everitt, B. S. and Dunn, G. (2001). Applied Multivariate Data Analysis. 2nd edition. Arnold.
  • Farnham, I. M., Johannesson, K. H., Singh, A. K., Hodge, V. F. and Stetzenbach, K. J. (2003). Factor analytical approaches for evaluating groundwater trace element chemistry data. Anal. Chim. Acta 490:123-138.
  • Garcia, J. H., Li, W. W., Arimoto, R., Okrasinski, R., Greenlee, J. and Walton, J. (2004). Characterization and implication of potential fugitive dust sources in the Paso del Norte region. Sci. Total Environ. 325:95-122.
  • Harman, H. H. (1976). Modern Factor Analysis. 3rd edition, revised. USA.
  • Hotelling, H. (1954). Multivariate analysis. In O. Kempthorne et al. (eds.), Statistics and Mathematics in Biology, pp. 67-80. The Iowa State University Press, Ames.