TransWikia.com

Trying to figure out the optimal selection based on a set of rules

Software Engineering Asked by Envin on January 24, 2021

Background: We have software that displays different products to the user

Problem: With a given set of rules, determine which is the primary product we should show the user. These are images. We are writing this in Python, but I don’t think that matters.

Rules:

These don’t really matter so I’m just going to abbreviate them. I can calculate a boolean return value for each of these. What matters is that some must match completely, some must match in an order of precedence.

Rule    Notes
Rule_1   The image HAS to have this.
Rule_2   The image HAS to have this.
Rule_3   The image HAS to have this.
Rule_4   (see the next two for actual rule) The image must match one of the following:
  Rule_4a   If image is this, use it
  Rule_4b   Otherwise, use this
Rule_5   The image must match one of the following in order (descending preference)
  Rule_5a   If this passes, then this image could be used
  Rule_5b   If this is a different type, let's say a "front" image, use this
  Rule_5c   If this is another type, "back" of the product image.
Rule_6  This image must be at least this size

What I’m thinking:

I’m thinking of iterating through the list of images, and attaching a score to each one and then returning the highest score.

Problem with that:

I’m not sure how to handle the priority order (let me know if that makes sense). The best way I can put it is that some of these rules combine with && and some with ||, but I don’t know how to express || with an order of precedence like this.

2 Answers

Your idea of giving each image a score is a good one.

To deal with the rules "match all subrules", "match any subrule without preference" and "match any subrule with preference", I would assign the scores like this:

  • The scores are on a scale of 0 to 100. 0 means the rule is not matched at all and 100 means a perfect match. Truly binary rules (an image either matches or not) can only give a score of 0 or 100.

  • A "match all subrules" rule has as its score the minimum of the scores of all its subrules.

  • A "match any subrule without preference" rule has as its score the maximum of the scores of all its subrules.

  • A "match any subrule with preference" rule gives each subrule a preference factor, which is a value in the range (0, 1] (greater than 0, at most 1). The score of each subrule is multiplied by its preference factor before taking the maximum of the resulting scores.

    The preference factors can be assigned either manually or automatically (for example by spacing them roughly evenly over the interval, e.g. Rule_5a = 1, Rule_5b = 0.6, Rule_5c = 0.3).

This way, a match with a lower preference will usually give the image a lower score and the preference mechanism works like a partial match to the rule.
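A minimal sketch of this scoring scheme, assuming each image is represented as a dict of precomputed boolean rule results. The key names (`rule_1`, `rule_5a`, ...), the helper names, and the factor 0.5 for Rule_4b are assumptions made for illustration; they do not come from the question.

```python
def binary(passed):
    """Score for a truly binary rule: 100 if it passes, otherwise 0."""
    return 100 if passed else 0


def match_all(scores):
    """'Match all subrules': the weakest subrule decides the score."""
    return min(scores)


def match_any(scores):
    """'Match any subrule without preference': the best subrule decides."""
    return max(scores)


def match_preferred(scores, factors):
    """'Match any subrule with preference': scale each score, then take the max."""
    return max(s * f for s, f in zip(scores, factors))


def score_image(img):
    """Combine the six rules from the question; all six must hold."""
    rule_4 = match_preferred(
        [binary(img["rule_4a"]), binary(img["rule_4b"])],
        [1.0, 0.5],  # assumed factors: 4a preferred over 4b
    )
    rule_5 = match_preferred(
        [binary(img[k]) for k in ("rule_5a", "rule_5b", "rule_5c")],
        [1.0, 0.6, 0.3],  # factors from the example above
    )
    return match_all([
        binary(img["rule_1"]), binary(img["rule_2"]), binary(img["rule_3"]),
        rule_4, rule_5, binary(img["rule_6"]),
    ])


def pick_primary(images):
    """Return the highest-scoring image, or None if no image scores above 0."""
    best = max(images, key=score_image, default=None)
    if best is None or score_image(best) == 0:
        return None
    return best
```

Because the top-level rule takes the minimum, two images can tie (for example, one limited by Rule_4b and another limited by Rule_5b may end up close or equal depending on the factors), so a tie-breaking rule may still be needed in practice.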

Correct answer by Bart van Ingen Schenau on January 24, 2021

If you're not going to put weights on the rule set, I believe something like this should work:

Rule1: (0 / 1)
Rule2: (0 / 1)
Rule3: (0 / 1)
Rule4: (0 -> 3) (or 0 / 1 / 2 if XOR, failing on 0)
    Rule4a: 2
    Rule4b: 1
Rule5: (sum of subrules: 0 -> 7)
        # every value maps to a unique case (e.g. 5b + 5c = 3);
        # every sub-rule value is greater than the sum of the following ones
        # (sorting alphanumerically might scale better for a large subrule set)
    Rule5a: 4
    Rule5b: 2
    Rule5c: 1
Rule6: (0 / 1)

2*2*2*3*8*2 = 384 cases in your example.
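The case count can be checked, and the per-rule values packed into a single integer, with a mixed-radix encoding. This is only a sketch of one possible packing: the names `RADICES`, `encode` and `decode` are hypothetical, and it assumes the 0 / 1 / 2 variant of Rule4.

```python
from math import prod

# Number of possible values per rule: Rule1, Rule2, Rule3, Rule4, Rule5, Rule6.
RADICES = (2, 2, 2, 3, 8, 2)
TOTAL_CASES = prod(RADICES)  # 2*2*2*3*8*2


def encode(digits):
    """Pack per-rule digits into one integer using mixed-radix place values."""
    value = 0
    for digit, radix in zip(digits, RADICES):
        if not 0 <= digit < radix:
            raise ValueError(f"digit {digit} out of range for radix {radix}")
        value = value * radix + digit
    return value


def decode(value):
    """Recover the per-rule digits from the packed integer."""
    digits = []
    for radix in reversed(RADICES):
        value, digit = divmod(value, radix)
        digits.append(digit)
    return tuple(reversed(digits))
```

With this packing the integers run from 0 (every rule fails) to `TOTAL_CASES - 1` (every rule and subrule passes), and comparing two integers gives the same order as comparing the digit sequences rule by rule.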

You can then assign an integer flag in the range 0 to 383 to each image to express its priority (or just keep the resulting string of digits like the one below, which still shows how well the image complies with each rule):

111251

It's just a (string or) integer, isn't it? The higher the number (or alphanumeric string in a sorted list of them), the more the images are in accordance with the rules. You can set and scale precedence with more digits/characters reserved per rule (for XOR or case-select with precedence).
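A sketch of this digit encoding in Python, assuming each image's rule results are available as a dict of booleans. The key names and the helpers `rule_digits`, `is_eligible`, `best_image` and `as_string` are hypothetical, and Rule4 uses the 0 / 1 / 2 variant. Tuples are compared instead of strings, since Python orders tuples lexicographically in the same way.

```python
def rule_digits(img):
    """Return one digit per rule, e.g. (1, 1, 1, 2, 5, 1)."""
    rule_4 = 2 if img["rule_4a"] else (1 if img["rule_4b"] else 0)
    # Each subrule weighs more than all later ones combined (4 > 2 + 1),
    # so every sum identifies a unique combination of passed subrules.
    rule_5 = 4 * img["rule_5a"] + 2 * img["rule_5b"] + 1 * img["rule_5c"]
    return (int(img["rule_1"]), int(img["rule_2"]), int(img["rule_3"]),
            rule_4, rule_5, int(img["rule_6"]))


def is_eligible(digits):
    """Every rule must be satisfied at least partially (no digit is 0)."""
    return all(d > 0 for d in digits)


def best_image(images):
    """Pick the eligible image whose digit tuple sorts highest, else None."""
    eligible = [img for img in images if is_eligible(rule_digits(img))]
    return max(eligible, key=rule_digits, default=None)


def as_string(digits):
    """Render the digits as the compact string form, e.g. '111251'."""
    return "".join(str(d) for d in digits)
```

The string form only sorts correctly while every rule fits in a single character; once a rule needs values above 9, comparing the tuples (or zero-padded fields) avoids that problem.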

List comprehension on those resulting values should be fast enough to check for rules separately or whatever else you might need to change.

If you're going to use weights, a NumPy array might do the job, combining a system similar to the one described above (but binary) with weights per rule and sub-rule.

Answered by DocLeonard on January 24, 2021
