Which supervised learning algorithms are available for matching?

Question

I'm working at a non-profit where we try to help prospective university applicants by matching them with alumni who want to share their experience and wisdom. At the moment, the matching is done manually. I'll have two tables, one with students and one with alumni (they may have some features in common, but not necessarily all of them):

$\begin{array}{|l|c|c|} \text{Name} & \text{Gender} & \text{Height} \\ \hline \text{Kathy} & F & 165 \\ \hline \text{Tommy} & M & 182 \\ \hline \text{Ruth} & F & 163 \\ \hline ... & ... & ... \end{array}$ $\begin{array}{|l|c|c|} \text{Name} & \text{Gender} & \text{Weight} \\ \hline \text{Miss Lucy} & F & 65 \\ \hline \text{Miss Geraldine} & F & 70 \\ \hline \text{Miss Emily} & F & 60 \\ \hline ... & ... & ... \end{array}$

Currently, we are manually matching the members of table 1 with those in table 2. We will also collect information after the match ("Was it a good match? Please rate it on a scale from 1 to 10"). So it will look something like this:
$$
\begin{array}{|l|l|c|}
\text{Person #1} & \text{Person #2} & \text{Match?} \\ \hline
\text{Ruth} & \text{Miss Lucy} & N \\ \hline
\text{Tommy} & \text{Miss Emily} & Y \\ \hline
\text{Kathy} & \text{Miss Geraldine} & N \\ \hline
\text{Ruth} & \text{Miss Emily} & N \\ \hline
... & ... & ...
\end{array}$$

I would like to use a learning algorithm for this process. I know a little bit of machine learning, but I am still very much a novice (so it's also an opportunity for me to learn more about it). What I can't wrap my head around is how you would do this kind of supervised learning when you have two sets, both of which have multiple features. What sort of matching algorithms are available to do this? (Also, I prefer to work in R.)

(By the way, I would be grateful if you could just point me in the right direction and I'll try to read about it and solve it myself. Also, I know how deeply frustrating it is to see questions that have already been answered -- if this is the case, please don't hesitate to let me know without answering the question. I have already tried to search for various strings on Google and StackExchange, but I mostly find lecture slides on graph theory that don't seem to be what I'm looking for (although it may just be because it's a bit over my head). Many thanks!)

João Almeida · Accepted Answer

You can try to frame this problem as a recommender system, where you have users (prospective students) and items (alumni), and you want to recommend one item to each user.

It's not a perfect fit, as you want just one item for each user and you don't have previous match data for each user. However, you could investigate this idea a bit further. I'm applying these techniques to the recruitment problem, matching users with job offers, and I'm having some success.

Try to read a bit about recommender systems. To start, I recommend chapter 9 of Mining of Massive Datasets; it's really introductory, but it gives a good overview of the most common techniques.

DaL · Answer

I would separate the problem into two parts:

1. Predicting whether a certain pair will be a good match.
2. Matching the pairs.

First, let's discuss the prediction problem.
I think you should treat predicting whether a pair is a good match as a supervised learning problem and not as a recommendation problem.
As João Almeida wrote, a new student won't have any previous relations with alumni.
Even an alumnus will have very few previous relations.
I would add to each alumnus some features based on aggregations (e.g., the number of past relations, the ratio of past good matches).
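
As a rough illustration of the aggregation idea, here is a minimal Python sketch (the same steps translate to R). It reuses the four example pairs from the question; the function name `alumni_aggregates` and the feature names `n_past` and `good_ratio` are made up for this example.

```python
from collections import defaultdict

# Past pairs taken from the question's example table:
# (student, alumnus, was_good_match)
past_pairs = [
    ("Ruth", "Miss Lucy", False),
    ("Tommy", "Miss Emily", True),
    ("Kathy", "Miss Geraldine", False),
    ("Ruth", "Miss Emily", False),
]

def alumni_aggregates(pairs):
    """Return {alumnus: {"n_past": count, "good_ratio": share of good matches}}."""
    counts = defaultdict(lambda: [0, 0])  # alumnus -> [n_past, n_good]
    for _student, alumnus, good in pairs:
        counts[alumnus][0] += 1
        counts[alumnus][1] += int(good)
    return {
        alumnus: {"n_past": n, "good_ratio": n_good / n}
        for alumnus, (n, n_good) in counts.items()
    }

features = alumni_aggregates(past_pairs)
print(features["Miss Emily"])  # {'n_past': 2, 'good_ratio': 0.5}
```

One caveat when using such features for training: compute them only from pairs that happened before the pair being predicted, otherwise the label leaks into the features.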

After that, you should build a dataset of the past pairs, using 'Match?' as the target label (the concept to learn).
It is not clear whether you will be able to learn a good match rule, even if one exists.
I guess that your dataset is quite small. If the probability of a match is low, you might also have a class imbalance problem.
As AN6U5 commented, height and weight are quite strange features for matching students to alumni.
Compute the relation between each feature and the label (e.g., mutual information, Pearson correlation) to see whether you have useful features.
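
A minimal sketch of that check, assuming a single hypothetical pair-level feature `same_gender` derived from the question's example tables, with the Pearson correlation written out by hand so that nothing beyond the standard library is needed:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

# Gender columns and past pairs from the question's example tables.
students = {"Kathy": "F", "Tommy": "M", "Ruth": "F"}
alumni = {"Miss Lucy": "F", "Miss Geraldine": "F", "Miss Emily": "F"}
past_pairs = [
    ("Ruth", "Miss Lucy", False),
    ("Tommy", "Miss Emily", True),
    ("Kathy", "Miss Geraldine", False),
    ("Ruth", "Miss Emily", False),
]

# One value per past pair: a pair-level feature (hypothetical) and the label.
same_gender = [int(students[s] == alumni[a]) for s, a, _ in past_pairs]
label = [int(good) for _, _, good in past_pairs]

r = pearson(same_gender, label)
print(r)  # -1.0 on these four rows
```

With only four rows the perfect correlation means nothing, which illustrates the small-dataset concern: the same check only becomes informative once there are many more rated pairs.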

As for the second problem, even if you can predict well whether a pair will be a good match, you still have the algorithmic problem of choosing which pairs to use.
Consider a "super alumnus" who would be a good match for any student. You wouldn't want to match them with a "super student", but rather with a student who would be hard to match with other alumni.
Luckily, there are matching algorithms that you can use.
Build a graph with the students and alumni as nodes. Create an edge wherever you predict a good match, and run a matching algorithm (for example, maximum bipartite matching) on it.
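
To make the graph step concrete, here is a small sketch of one possible matching algorithm: maximum bipartite matching via augmenting paths. The predicted edges below are invented for illustration (Miss Emily plays the "super alumnus", Kathy the "super student"), and the function name is my own.

```python
def max_bipartite_matching(edges):
    """edges: {student: [alumni predicted to be a good match]}.
    Returns {student: alumnus} for a maximum-cardinality matching."""
    matched_to = {}  # alumnus -> student currently assigned

    def try_assign(student, visited):
        for alumnus in edges.get(student, []):
            if alumnus in visited:
                continue
            visited.add(alumnus)
            # Take a free alumnus, or re-route the student currently holding this one.
            if alumnus not in matched_to or try_assign(matched_to[alumnus], visited):
                matched_to[alumnus] = student
                return True
        return False

    for student in edges:
        try_assign(student, set())
    return {student: alumnus for alumnus, student in matched_to.items()}

# Hypothetical predicted good matches.
predicted_edges = {
    "Kathy": ["Miss Emily", "Miss Lucy", "Miss Geraldine"],  # matches anyone
    "Tommy": ["Miss Emily"],                                 # hard to match
    "Ruth": ["Miss Emily", "Miss Geraldine"],
}

matching = max_bipartite_matching(predicted_edges)
print(matching)
```

A greedy pass would give Miss Emily to Kathy and leave Tommy unmatched; the augmenting step moves Kathy to Miss Lucy so that all three students get an alumnus. If you keep predicted scores instead of yes/no edges, the weighted version is the assignment problem, for which SciPy's `linear_sum_assignment` is one option.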

Peter · Answer

When you have some historical data on good/bad matches (and "okay" features to describe these matches), you can try a Siamese neural network. This type of model is a "few-shot" model, meaning that it is designed to work with a relatively small amount of (training) data and potentially noisy features.

Essentially, you fit a model to pairs of (training) data ("matches"). The model will learn to interpret the available features in order to calculate a Euclidean distance between pairs, e.g. $0 =$ "no distance" (perfect match) vs. $1 =$ "max. distance" (very poor match). The good thing is that you can generate a lot of training data (by making pairs) even when there are few observations (the whole idea comes from image classification with few training images, and it works like a charm!).
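
To show why making pairs multiplies the training data, here is a small sketch of pair construction in the class-labelled setting (as with image classes). The observations and the 0/1 target convention are assumptions for illustration only, not taken from any particular implementation.

```python
from itertools import combinations

# Hypothetical labelled observations: (feature vector, class label).
observations = [
    ([5.1, 3.5], "a"),
    ([4.9, 3.0], "a"),
    ([6.3, 3.3], "b"),
    ([5.8, 2.7], "b"),
]

def make_pairs(obs):
    """Every unordered pair becomes one training example.
    Target distance: 0.0 for the same class, 1.0 for different classes."""
    return [
        (x1, x2, 0.0 if y1 == y2 else 1.0)
        for (x1, y1), (x2, y2) in combinations(obs, 2)
    ]

pairs = make_pairs(observations)
print(len(observations), "observations ->", len(pairs), "pairs")  # 4 observations -> 6 pairs
```

In general, n observations yield n(n-1)/2 pairs. For the student/alumni case, the pairs would presumably be the rated student-alumnus pairs themselves, with the target derived from the rating.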

Once you have a trained model and you want to find a new match for a university applicant, you can simply make (hypothetical) pairs to be predicted by the model, e.g.:

1 Applicant A <-> Alumni 1
2 Applicant A <-> Alumni 2
3 Applicant A <-> Alumni 3

When you feed these (hypothetical) matches to the network and make a prediction, you will get a vector of "distances" for the pairs, something like:

1 0.83
2 0.06
3 0.56

Provided that your training data are okay and that your model works well, you can immediately make a ranking of possible matches (lower distance would be better, i.e. $2 \succ 3 \succ 1$ in the example above).
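
The ranking step itself is just a sort on the predicted distances. A minimal Python sketch, using the three example distances from above in place of real model output:

```python
# Example distances from above, keyed by candidate pair number.
predicted_distance = {1: 0.83, 2: 0.06, 3: 0.56}

# Sort candidates by ascending distance: a smaller distance means a better match.
ranking = sorted(predicted_distance, key=predicted_distance.get)
print(ranking)  # [2, 3, 1]
```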

I'm a great fan of Siamese networks. They can be a little complex, but when you get the model right, Siamese networks are extremely powerful (and often more powerful than alternative models/approaches).

I don't have sample code in R at hand. However, see this GitHub repo for a "toy version" of a Siamese neural network with Keras/TensorFlow in Python. The model uses the "Iris data" (three classes). You should be able to use most of the code (data prep and model) off the shelf for your use case. You may need to adjust some parts of the model, such as the hidden layers. Since Keras is also available for R, you could adapt the model to R.

For the prediction part, just predict single pairs as described above (rather than averaging over classes as in the code sample).

Note that the model in the sample code uses a custom Lambda layer, so when you save the model, you need to save the weights only. When you load the model for prediction, rebuild the model in plain code and then load the weights. Keras' save/load commands cannot handle Lambda layers properly, so hardcoding the model and loading only the weights is required.
