TransWikia.com

Find clusters of positive correlations

Mathematica Asked by user72725 on February 3, 2021

I have a square matrix that shows relationships between 71 plants: 1 is a positive, -1 is negative, 0 is inconclusive and blanks are unknown:

I would like find the largest groups of plants that only have a positive relationship with no negative relationships among any of the members.

data = {{"", "Basil", "Cucumber", "Tomato", "Potato", "Peanut"}, {"Basil", 
  "", 0, -1, 0, ""}, {"Cucumber", "", "", "", -1, -1}, {"Tomato", -1,
   "", "", "", ""}, {"Potato", "", -1, 0, "", ""}, {"Peanut", 1, -1, 
  "", "", ""}}

The full set:

https://pastebin.com/06krccza

I was able to find out who to make a Weighted Adjacency Graph using:

Generating social network graph from a CSV file

However, I am looking for something that simplifies relationships to where I can pick out groups of good matches.

One Answer

Ingest the data, make a matrix of the correlations, make a list with plant names:

data = Get["~/Downloads/06krccza.txt"];
matData = data[[2 ;; -1, 2 ;; -1]];
lsPlantNames = Rest@data[[1]];
Length[lsPlantNames]

(*70*)

Make association of correlations and distances:

aCors = Association@
   Map[lsPlantNames[[#[[1]]]] -> #[[2]] &, 
    Most[ArrayRules[SparseArray[matData]]]];
aDists = Map[
   N@Which[TrueQ[# == 1], 0, TrueQ[# == -1], 1000, True, 1] &, aCors];

Note that in order to address question's main, non-trivial condition

[...] find the largest groups of plants that only have a positive relationship with no negative relationships among any of the members.

the distances in aDists that correspond to negative correlations are (very) large numbers.

Make a nearest neighbors graph:

gr = NearestNeighborGraph[lsPlantNames, {90, 0.1}, 
  DistanceFunction -> (Lookup[aDists, Key[{#1, #2}], 1000] &), 
  Method -> "Octree", DirectedEdges -> False, 
  GraphLayout -> "SpringElectricalEmbedding", VertexLabels -> "Name"]

enter image description here

Find cliques / clusters:

lsClqs = FindClique[gr, Infinity, All];
Length[lsClqs]

Examine the clusters lengths:

Tally[Length /@ lsClqs]

(*{{4, 1}, {3, 10}, {2, 32}, {1, 36}}*)

Verify the found clusters do not have negative correlations

aHasNegativeCor = 
  Association[# -> FreeQ[Outer[aCors[{##}] &, #, #], -1] & /@ clqs];
Tally[Values[aHasNegativeCor]]

(*{{True, 78}, {False, 1}}*)

Examine the negative correlation and/or delete it:

Select[aHasNegativeCor, ! # &]

(*<|{"Beans, Runner", "Garlic", "Leek"} -> False|>*)

Final result:

lsClqs2 = Keys[Select[aHasNegativeCor, # &]];
lsClqs2[[1 ;; 4]]

(*{{"Onion", "Pea", "Potato", "Tomato"}, {"Onion", "Parsnip", 
  "Tomato"}, {"Leek", "Onion", "Pea"}, {"Garlic", "Leek", "Pea"}}*)

First answer

Some code that might help these questions.

Since data was not provided, let us make some:

SeedRandom[32]; 
data2 = Block[{lsWords = Sort@RandomWord[71], res},
   res = Flatten[
     Table[{lsWords[[i]], lsWords[[j]], 
       RandomChoice[{0.1, 0.8, 0.1} -> {-1, 0, 1}]}, {i, 1, 
       Length[lsWords]}, {j, i + 1, Length[lsWords]}], 1];
   res = Union[Join[res, res[[All, {2, 1, 3}]]]];
   Select[res, #[[3]] != 0 &]
   ];

Make a graph with the positive correlations only:

 gr = Graph[UndirectedEdge @@@ Select[data2, #[[3]] > 0 &]]

Find graph communities:

 CommunityGraphPlot[gr, VertexLabels -> "Name"]

enter image description here

If you provide the actual data more adequate answers might be given.

Answered by Anton Antonov on February 3, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP