Recommender Systems - Learning To Rank

Nish · July 24, 2023

Intro

Modern recommendation systems can involve multiple components, as depicted below, all of which are necessary in order to make a final recommendation.

Simplified view of making a recommendation. The general takeaway is that you start off with lots of items, and these are reduced until you are left with your final recommendations.

In essence this can be broken down into 3 parts:

  1. Retrieval Component
    • Selecting the top k items to feed into the ranking model.
  2. Ranking Component
    • Ranking is a ubiquitous task that appears in many systems, not just recommendation systems. These include search, question answering, etc.
    • More generally, ranking is the act of ordering a specific set of items for a specific context.
    • Learning to rank (LTR) specifically refers to learning the function $f$ which generates scores for items given a certain context, such that the ranking obtained by sorting on those scores is optimal.
  3. Post-ranking Component
    • Filtering down the ranked list based on other factors that might be important (fairness, recent trends etc).
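
To make the ranking step above concrete, here is a minimal sketch of "score each item with $f$ for a given context, then sort by score". The scorer `overlap_score` and the example strings are hypothetical stand-ins chosen for illustration; in learning to rank, $f$ would be a trained model rather than a hand-written rule.

```python
from typing import Callable, List, Sequence


def rank_items(
    items: Sequence[str],
    context: str,
    score_fn: Callable[[str, str], float],
) -> List[str]:
    """Order items by descending score f(context, item)."""
    scored = [(score_fn(context, item), item) for item in items]
    # Python's sort is stable, so tied items keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]


def overlap_score(context: str, item: str) -> float:
    """Hypothetical scorer: number of words shared by the context and the item."""
    context_words = set(context.lower().split())
    item_words = set(item.lower().split())
    return float(len(context_words & item_words))


candidates = ["intro to cooking", "learning to rank tutorial", "rank tracking tools"]
ranking = rank_items(candidates, "learning to rank", overlap_score)
```

Swapping `overlap_score` for a learned model changes the quality of the ordering but not the structure of the step: the ranking is always a sort over the scores.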

The focus of this article is the ranking component.

Resources

  1. Libraries
    • TensorFlow Recommenders provides an end-to-end, production-grade library for developing recommendation applications, with various useful components. Its main advantage is that it has been battle tested and is used as part of Google's YouTube and Google Play services.
    • TensorFlow Ranking provides useful functionality specifically for ranking tasks.

How does LTR work at a high level?

As mentioned in the intro, you are essentially learning a function $f$. This learning can be broken down roughly as follows.

  1. Create and feed in your dataset.
    • Typically a dataset takes the form of tuples of vectors. Each element in the dataset is a list of items which you want to rank, alongside a label per item which acts as a relevance proxy.
    • Mathematically, each element looks something like $(\textbf{x}, \textbf{y}) \in \mathcal{X}^{n} \times \mathbb{R}^{n}$, and the dataset $\mathcal{D}$ is a collection of such pairs.
  2. Create your model which outputs scores.
    • This would typically be some neural-network-based model which takes in the above dataset in batches and outputs a score for each of the items.
    • The way in which the model outputs scores can differ and typically falls into 3 buckets.
      • Pointwise methods
        • Consider whether each item is independently relevant and output a score for each item (regression/classification).
      • Pairwise methods
        • Don’t care whether items are independently relevant, but instead model relative preferences between pairs of items.
        • Can be thought of in terms of the question “Can I correctly predict relevancy between pairs?”
      • Listwise methods
        • State-of-the-art methods which consider the entire list and the ordering between all of its items.
        • Associated loss functions are more complex.
  3. Adjust your model using a loss function.
    • The loss can act in the standard way as follows: $\mathcal{L}(f) = \frac{1}{\lvert \mathcal{D} \rvert} \sum_{(\textbf{x}, \textbf{y}) \in \mathcal{D}} \ell(f(\textbf{x}), \textbf{y})$, which you can optimize as usual using gradient descent.
      • The main difference is that $\ell$ acts on a vector of scores for a whole list rather than on a single item.
    • Specific types of losses for the ranking use case can be found on the TensorFlow Ranking loss docs.
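
The steps above can be sketched in plain Python. The block below shows one common textbook form of a per-list loss $\ell$ for each of the three buckets (mean squared error, pairwise logistic, softmax cross-entropy), plus the dataset-level average $\mathcal{L}(f)$. These are illustrative choices and not necessarily the exact formulations or reductions used by TensorFlow Ranking; the function names and numeric values are assumptions, and the model is left out by working directly with the scores $f(\textbf{x})$.

```python
import math
from typing import Callable, List, Sequence, Tuple

Scores = Sequence[float]  # f(x): one score per item in the list
Labels = Sequence[float]  # y: one relevance label per item in the list


def pointwise_mse(scores: Scores, labels: Labels) -> float:
    """Pointwise: each item's score is regressed on its own label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_logistic(scores: Scores, labels: Labels) -> float:
    """Pairwise: penalise pairs where the more relevant item is not scored higher."""
    terms = [
        math.log1p(math.exp(-(s_i - s_j)))
        for s_i, y_i in zip(scores, labels)
        for s_j, y_j in zip(scores, labels)
        if y_i > y_j
    ]
    return sum(terms) / len(terms) if terms else 0.0


def listwise_softmax(scores: Scores, labels: Labels) -> float:
    """Listwise: cross-entropy between normalised labels and softmax(scores)."""
    total = sum(labels)
    if total == 0:
        return 0.0
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return -sum((y / total) * (s - log_z) for s, y in zip(scores, labels))


def dataset_loss(
    dataset: List[Tuple[Scores, Labels]],
    per_list_loss: Callable[[Scores, Labels], float],
) -> float:
    """L(f): average of the per-list loss over every (f(x), y) in the dataset."""
    return sum(per_list_loss(s, y) for s, y in dataset) / len(dataset)


# Hypothetical toy dataset: two lists of three items each.
toy_dataset = [
    ([2.0, 1.5, 0.1], [3.0, 1.0, 0.0]),
    ([0.2, 0.9, 0.4], [0.0, 2.0, 1.0]),
]
toy_losses = {
    "pointwise": dataset_loss(toy_dataset, pointwise_mse),
    "pairwise": dataset_loss(toy_dataset, pairwise_logistic),
    "listwise": dataset_loss(toy_dataset, listwise_softmax),
}
```

Note how only the pointwise loss compares a score to its label in isolation; the pairwise and listwise losses depend on how scores within the same list relate to each other, which is why $\ell$ takes whole vectors.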

What tools can be used to implement these?

A library known as TensorFlow Ranking can help with many of these tasks. This library has been tested and is recommended by Google developer advocates for use when designing production-grade systems.

In particular, it includes ranking-tailored losses, ranking metrics, data loaders, network layers, models, etc., which make it a lot easier to build models.
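
As an illustration of what a ranking metric measures, here is a minimal pure-Python sketch of NDCG (normalized discounted cumulative gain), a widely used ranking metric. It is not the library's implementation: the function names `dcg` and `ndcg` are assumptions made for this example, and the exponential gain $2^{y} - 1$ with a $\log_2$ position discount is one common convention among several.

```python
import math
from typing import Optional, Sequence


def dcg(labels_in_ranked_order: Sequence[float], k: Optional[int] = None) -> float:
    """Discounted cumulative gain over the top-k positions (all positions if k is None)."""
    top = labels_in_ranked_order[:k]
    # Position 0 gets discount log2(2) = 1, position 1 gets log2(3), and so on.
    return sum((2 ** y - 1) / math.log2(pos + 2) for pos, y in enumerate(top))


def ndcg(scores: Sequence[float], labels: Sequence[float], k: Optional[int] = None) -> float:
    """DCG of the ranking induced by the scores, divided by the best achievable DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked_labels = [labels[i] for i in order]
    ideal_dcg = dcg(sorted(labels, reverse=True), k)
    return dcg(ranked_labels, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical example: the same labels scored in the right and the wrong order.
perfect = ndcg([3.0, 2.0, 1.0], [2.0, 1.0, 0.0])
reversed_order = ndcg([1.0, 2.0, 3.0], [2.0, 1.0, 0.0])
```

A metric like this is used for evaluation rather than training, since sorting makes it non-differentiable with respect to the scores; the losses act as differentiable surrogates for it.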

Citation Information

If you find this content useful & plan on using it, please consider citing it using the following format:

@misc{nish-blog,
  title = {Recommender Sytems - Learning To Rank},
  author = {Nish},
  howpublished = {\url{https://www.nishbhana.com/Learning-To-Rank/}},
  note = {[Online; accessed]},
  year = {2023}
}
