Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Correlation Plot

A correlation measures the strength of the association between two variables. A correlation plot shows the strength of any linear relationship between each pair of variables. The ellipse package provides the plotcorr function for this purpose. A linear relationship between two variables indicates that as the value of one variable changes, the value of the other changes in proportion. The degree of correlation ranges over $[-1, 1]$, with 1 being perfect positive correlation, -1 being perfect negative correlation, and 0 being no linear correlation.

The Pearson correlation coefficient is the most common statistic. R also supports Kendall's tau and Spearman's rho, which are rank-based measures of association. These are regarded as more robust and are recommended unless the data come from a bivariate normal distribution. The cor function calculates the correlation between two numeric vectors, or the correlation matrix between the columns of a numeric matrix or data frame. A correlation matrix is always symmetric about the diagonal, and the diagonal consists of 1s (each variable is perfectly correlated with itself!).
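As a minimal sketch of the three statistics mentioned above, the following calls cor with each method and checks the symmetry and unit diagonal of the result. The mtcars dataset and the three chosen columns are assumptions used only for illustration (mtcars ships with R); they are not part of the wine example.

```r
# Assumption: mtcars (bundled with R) stands in for any all-numeric data frame.
vars <- mtcars[, c("mpg", "wt", "hp")]

pearson  <- cor(vars)                       # default: method = "pearson"
spearman <- cor(vars, method = "spearman")  # rank-based
kendall  <- cor(vars, method = "kendall")   # rank-based

# A correlation matrix is symmetric, with 1s on the diagonal.
stopifnot(isSymmetric(pearson))
stopifnot(isTRUE(all.equal(unname(diag(pearson)), rep(1, ncol(vars)))))

# Every coefficient lies in [-1, 1] (small tolerance for rounding).
stopifnot(all(abs(pearson) <= 1 + 1e-12))

print(round(pearson, 2))
```

The rank-based matrices generally differ from the Pearson matrix when relationships are monotonic but not linear, which is one reason to compare them.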

The sample R code here calculates the correlations between the variables in the wine dataset (cor) and then orders the variables according to their correlation with the first variable (Type, row [1,] of the matrix). The rows and columns of the matrix are reordered accordingly, and the ellipses are drawn with a colour fill taken from cm.colors.



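library(ellipse)
# Assumes wine is already loaded as an all-numeric data frame (Type coded as a number).
wine.corr <- cor(wine)
# Order the variables by their correlation with Type (the first row).
ord <- order(wine.corr[1,])
xc <- wine.corr[ord, ord]
# Map each correlation in [-1, 1] to one of 11 colours (index 5*xc + 6 runs from 1 to 11).
plotcorr(xc, col=cm.colors(11)[5*xc + 6])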

http://rattle.togaware.com/code/rplot-wine-corr.R
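The index expression 5*xc + 6 in the plotting call can be examined on its own. The sketch below applies it to a few hand-picked correlation values (the vector r and the name palette11 are illustrative assumptions) to show how the interval [-1, 1] is spread across the 11 colours.

```r
# cm.colors lives in grDevices, which is attached by default.
palette11 <- cm.colors(11)

# Hypothetical correlation values spanning the full range.
r <- c(-1, -0.5, 0, 0.5, 1)

# Linear map from [-1, 1] onto [1, 11].
idx <- 5 * r + 6
print(idx)

# R truncates fractional indices towards zero, so 3.5 selects colour 3
# and 8.5 selects colour 8.
cols <- palette11[idx]
print(cols)
```

A correlation of 0 picks the middle colour of the palette, while strongly negative and strongly positive correlations pick colours from opposite ends.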

The correlation matrix is:

> wine.corr
                       Type     Alcohol       Malic          Ash  Alcalinity
Type             1.00000000 -0.32822194  0.43777620 -0.049643221  0.51785911
Alcohol         -0.32822194  1.00000000  0.09439694  0.211544596 -0.31023514
Malic            0.43777620  0.09439694  1.00000000  0.164045470  0.28850040
Ash             -0.04964322  0.21154460  0.16404547  1.000000000  0.44336719
Alcalinity       0.51785911 -0.31023514  0.28850040  0.443367187  1.00000000
Magnesium       -0.20917939  0.27079823 -0.05457510  0.286586691 -0.08333309
Phenols         -0.71916334  0.28910112 -0.33516700  0.128979538 -0.32111332
Flavanoids      -0.84749754  0.23681493 -0.41100659  0.115077279 -0.35136986
Nonflavanoids    0.48910916 -0.15592947  0.29297713  0.186230446  0.36192172
Proanthocyanins -0.49912982  0.13669791 -0.22074619  0.009651935 -0.19732684
Color            0.26566757  0.54636420  0.24898534  0.258887259  0.01873198
Hue             -0.61736921 -0.07174720 -0.56129569 -0.074666889 -0.27395522
Dilution        -0.78822959  0.07234319 -0.36871043  0.003911231 -0.27676855
Proline         -0.63371678  0.64372004 -0.19201056  0.223626264 -0.44059693

                  Magnesium     Phenols Flavanoids Nonflavanoids
Type            -0.20917939 -0.71916334 -0.8474975     0.4891092
Alcohol          0.27079823  0.28910112  0.2368149    -0.1559295
Malic           -0.05457510 -0.33516700 -0.4110066     0.2929771
Ash              0.28658669  0.12897954  0.1150773     0.1862304
Alcalinity      -0.08333309 -0.32111332 -0.3513699     0.3619217
Magnesium        1.00000000  0.21440123  0.1957838    -0.2562940
Phenols          0.21440123  1.00000000  0.8645635    -0.4499353
Flavanoids       0.19578377  0.86456350  1.0000000    -0.5378996
Nonflavanoids   -0.25629405 -0.44993530 -0.5378996     1.0000000
Proanthocyanins  0.23644061  0.61241308  0.6526918    -0.3658451
Color            0.19995001 -0.05513642 -0.1723794     0.1390570
Hue              0.05539820  0.43368134  0.5434786    -0.2626396
Dilution         0.06600394  0.69994936  0.7871939    -0.5032696
Proline          0.39335085  0.49811488  0.4941931    -0.3113852

                Proanthocyanins       Color         Hue     Dilution    Proline
Type               -0.499129824  0.26566757 -0.61736921 -0.788229589 -0.6337168
Alcohol             0.136697912  0.54636420 -0.07174720  0.072343187  0.6437200
Malic              -0.220746187  0.24898534 -0.56129569 -0.368710428 -0.1920106
Ash                 0.009651935  0.25888726 -0.07466689  0.003911231  0.2236263
Alcalinity         -0.197326836  0.01873198 -0.27395522 -0.276768549 -0.4405969
Magnesium           0.236440610  0.19995001  0.05539820  0.066003936  0.3933508
Phenols             0.612413084 -0.05513642  0.43368134  0.699949365  0.4981149
Flavanoids          0.652691769 -0.17237940  0.54347857  0.787193902  0.4941931
Nonflavanoids      -0.365845099  0.13905701 -0.26263963 -0.503269596 -0.3113852
Proanthocyanins     1.000000000 -0.02524993  0.29554425  0.519067096  0.3304167
Color              -0.025249931  1.00000000 -0.52181319 -0.428814942  0.3161001
Hue                 0.295544253 -0.52181319  1.00000000  0.565468293  0.2361834
Dilution            0.519067096 -0.42881494  0.56546829  1.000000000  0.3127611
Proline             0.330416700  0.31610011  0.23618345  0.312761075  1.0000000
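The ordering step can be checked by hand against the Type column printed above. This sketch copies those values into a named vector (type.corr is a name introduced here for illustration) and applies order in the same way as the sample code.

```r
# Correlations with Type, copied from the matrix printed above.
type.corr <- c(Type = 1.00000000, Alcohol = -0.32822194, Malic = 0.43777620,
               Ash = -0.04964322, Alcalinity = 0.51785911,
               Magnesium = -0.20917939, Phenols = -0.71916334,
               Flavanoids = -0.84749754, Nonflavanoids = 0.48910916,
               Proanthocyanins = -0.49912982, Color = 0.26566757,
               Hue = -0.61736921, Dilution = -0.78822959,
               Proline = -0.63371678)

# order() returns the indices that sort the values in ascending order.
ord <- order(type.corr)
ordered.names <- names(type.corr)[ord]
print(ordered.names)
```

The variables run from the most negatively correlated with Type (Flavanoids) to the most positively correlated (Alcalinity), with Type itself last because its correlation with itself is 1.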

Copyright © 2004-2010 Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010