TransWikia.com

Achievement System based on many past records

Software Engineering Asked by Miguel Stevens on February 9, 2021

I want to build an achievement system that awards certain badges based on conditions.

I wonder how to tackle this. If, for example, a user has run 9,999 kilometers and I want to award him a badge when he reaches 10,000 km, that would mean that after each activity I have to loop over all previous activities and sum the distances.

It feels like this isn’t going to scale well, but what is the other option?

  • Let’s say I store the total kilometers for that user and update it after each run, so the achievement is based on the total km stored in his profile. This limits me to simple badges, though: what if I wanted a badge that is awarded when a user runs 100 km per week, 4 weeks in a row?

  • I could keep track of and update a data store after each activity: for example, with the latest activity, update the amount run this week, the previous week, this day, … But this is bound to have data synchronization issues. And what if we’re tracking the data of not one person but a group of people?

  • Should I use in-memory storage to keep all relevant data close by, for example memcached or Redis? This has the problem of duplicating data between the cache and the real (MySQL) database.

  • Would Event Sourcing (which I know nothing about) be beneficial?

If I also want to take into account that badges can be added, removed and modified in retrospect, I already have to let go of some of these options.

Are there any known models for this? I looked at the "Rule engine" approach, but that’s more of an architectural decision, which doesn’t address this problem.

4 Answers

You should keep your original source of truth - your list of all activities performed - especially because any way you try to condense it might make it impossible to retroactively add new badges later. But then you can implement an additional auxiliary cache with the relevant data for each class of badges.

For the "total distance run" badge system, you can store the total distance per user. When this system starts up, it won't know any total distances, so it will have to read all the runs so far and calculate the totals. But it will only have to do this once. Subsequently, every time an activity is posted, it can update and check the total immediately.

If you find any bug in the updating logic, you can just delete the cache and let the system rebuild it - and remove badges from users who haven't run 10000km.
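As a minimal sketch of that idea in Python (the class name `TotalDistanceBadges`, the `(user_id, km)` tuple format and the in-memory dictionaries are assumptions for illustration, not something from the question), the cache can be a small component that is fed each activity and can be rebuilt from the full log at any time:

```python
from collections import defaultdict

BADGE_KM = 10_000  # threshold taken from the question


class TotalDistanceBadges:
    """Auxiliary cache of per-user totals, derived from the activity log."""

    def __init__(self):
        self.totals = defaultdict(float)  # user_id -> total km so far
        self.awarded = set()              # user_ids currently holding the badge

    def on_activity(self, user_id, km):
        """Update the running total and check the badge in constant time."""
        self.totals[user_id] += km
        if self.totals[user_id] >= BADGE_KM:
            self.awarded.add(user_id)

    def rebuild(self, activities):
        """Discard the cache and replay the full log (the source of truth)."""
        self.totals.clear()
        self.awarded.clear()
        for user_id, km in activities:
            self.on_activity(user_id, km)


# Hypothetical activity log: (user_id, km) pairs.
log = [("ann", 9_999.0), ("bob", 120.0)]

badges = TotalDistanceBadges()
badges.rebuild(log)              # one-time cost at startup

log.append(("ann", 1.5))         # a new run is stored in the log first...
badges.on_activity("ann", 1.5)   # ...and then fed to the cache
```

Because `rebuild` starts from an empty `awarded` set, replaying the log after a bug fix also drops badges that should never have been granted.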

This can be implemented separately from most of the application. All it needs is a live feed of activities performed, including past ones.

If you want to implement "runs a 100km per week, 4 weeks in a row", that needs a completely different kind of cache storage, perhaps a ring buffer per user. If you implement it as a bolted-on module like this, it can have exactly that.
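One way such a per-user ring buffer could look, sketched in Python with `collections.deque`. The class name `WeeklyStreak`, the Monday-to-Sunday week definition and the requirement that runs arrive in chronological order are all assumptions made for this example:

```python
from collections import deque
from datetime import date

WEEKS_NEEDED = 4
KM_PER_WEEK = 100.0


class WeeklyStreak:
    """Ring buffer holding one user's most recent weekly totals."""

    def __init__(self):
        # Each slot is [week_index, km]; the oldest slot drops off automatically.
        self.weeks = deque(maxlen=WEEKS_NEEDED)

    @staticmethod
    def week_index(day):
        # Ordinal day 1 (0001-01-01) is a Monday, so this counts Monday-based weeks.
        return (day.toordinal() - 1) // 7

    def add_run(self, day, km):
        week = self.week_index(day)
        if not self.weeks:
            self.weeks.append([week, 0.0])
        if week < self.weeks[-1][0]:
            raise ValueError("this sketch expects runs in chronological order")
        # Pad skipped weeks with zero totals so that a gap breaks the streak.
        while self.weeks[-1][0] < week:
            self.weeks.append([self.weeks[-1][0] + 1, 0.0])
        self.weeks[-1][1] += km

    def earned(self):
        return (len(self.weeks) == WEEKS_NEEDED
                and all(km >= KM_PER_WEEK for _, km in self.weeks))


streak = WeeklyStreak()
for monday in (date(2021, 1, 4), date(2021, 1, 11),
               date(2021, 1, 18), date(2021, 1, 25)):
    streak.add_run(monday, 60.0)
    streak.add_run(monday, 45.0)   # 105 km in each of the four weeks

print(streak.earned())
```

The buffer never holds more than four slots per user, which is the appeal of this approach. The trade-off is that late or corrected activities cannot be applied without rebuilding from the full activity list.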

Answered by user253751 on February 9, 2021

"If" is not an analysis.

You're not asking a concrete question. You're dreaming up possibilities and pondering what the world would be like if those possibilities were actually relevant. That's not a productive approach to implementing the current requirements of your application.

What you're doing is the equivalent of saying "I have to design a new T-shirt. But what if people had 3 arms?". It's an interesting thought exercise but it adds nothing of value to the current work you have to do to design a T-shirt for 2-armed people.

Rather than focus on the things you can imagine, focus on the things you know for a fact. Then decide your implementation based on the actual tangible requirements.


"Let's say I store the total kilometers for that user, and update after each run, so base the achievement on the total km number set in his profile"

This is the simplest implementation that works for the badge you need.

"but this limits me to only a simple badge"

But it's the badge you needed to implement. If it's "only a simple badge", and that is somehow a problem, then why are you trying to implement this badge to begin with?

"what if I wanted a badge that is awarded when a user runs a 100km per week, 4 weeks in a row?"

Do you need such a badge? If you do, then track the data you need to observe the badge completion (in this case, storing the last 4 week totals would suffice).

If you don't, then why theorize about it? Pondering features that you don't need is not productive when trying to consider how to implement the features you do need.

"what if we're tracking the data of not one but a group of people?"

The same response applies here. Is this actually something you need or not? Because there's no point discussing how to implement a feature you don't need.

If you do need it, then this needs proper analysis. How is a group defined? Can people belong to several groups? If so, does their personal progress get shared with all groups? Is the group badge based on personal minimums for everyone, a group total, a group average, ...?

This is an unanalyzed point. But whatever the analysis finds, the end result will always be to track the data that is relevant to observe badge completion.

"Should I use an in-memory storage to keep all relevant data closeby? For example memcached or redis?"

This question has nothing to do with which data you need to store to track a certain badge's completion. What you store and where you store it are unrelated topics.

"Using Event Sourcing (which I know nothing about), would this be beneficial?"

There is nothing in your question that leads me to believe event sourcing would add functionality that is (a) necessary and (b) not already available.

"If I want to keep into account that badges can be added/removed and modified in retrospect"

If you want the ability to retroactively apply future badges, that means that you need to store all data. Any data you fail to store is going to lead to the inability to create a badge in the future that relies on this data (that you won't have anymore).

You really need to weigh the hassle and performance hit of storing every possible snippet of data against the odds that you're going to need that data in the future and the consequences of not having it when you do implement a future badge that needs it. Don't forget to account for all the effort you're going to spend on data that you're never actually going to use if you never end up creating a badge that requires it.

This is not objectively answerable by a random stranger on the internet. This is a matter of you deciding what matters to you.

Answered by Flater on February 9, 2021

You mentioned so many "what-ifs" about the possible conditions for those awards that I am under the impression you need to do some more requirements analysis first. You will be better off if you don't try to build some "catch-all" system with dozens of "just-in-case" assumptions. Instead:

  • Design your database initially with as little redundant information as possible. From what you wrote, I guess storing the number of kilometers per activity per user, together with the timestamp of the activity, is enough; everything else can be deduced.

  • Implement one rule type at a time. After each rule, measure performance! Only when you notice real, relevant performance issues, introduce redundant pieces of information like intermediate sums which suit that specific rule (like the sum of all kilometers of all activities for a certain user, or sum over some time interval, or sums over some groups, or average values, or minimum/maximum values over a certain interval, or number of consecutive days with minimum run length, whatever your rules require).

  • The only optimization you should care about beforehand is proper indexing of your DB. When you are selecting or aggregating the activities of one user, your program should probably not do a full table scan over all records of all users.

  • Do you have to care about error corrections in existing activities? Then you need to provide methods for recalculating redundant data from the "base data", and you need a clear distinction between what is "base data" and what is derived data.

  • If activities are persisted as immutable records, with no changes allowed afterwards, you can handle corrections not by changing older activities but by adding an extra "correction activity". Then you are already doing some sort of "Event Sourcing", and the distinction between "base data" and "derived data" is quite unimportant. Recalculating redundant data from the event data might still be necessary, but only in case of a system failure or program error, not as part of a regular correction use case.
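To make the last few points concrete, here is a rough sketch using Python's built-in `sqlite3` module. The question mentions MySQL; SQLite is used here only so the example runs on its own. The table layout, the `corrects` column and the function names are assumptions for illustration. Activities are only ever inserted, a correction is stored as an extra row carrying the difference, and a composite index keeps per-user aggregation from scanning every user's rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (
        id        INTEGER PRIMARY KEY,
        user_id   INTEGER NOT NULL,
        ran_at    TEXT    NOT NULL,                -- ISO 8601 timestamp
        km        REAL    NOT NULL,                -- a correction row stores a delta
        corrects  INTEGER REFERENCES activity(id)  -- NULL for ordinary runs
    );
    -- Supports per-user, per-interval sums without touching rows of other users.
    CREATE INDEX idx_activity_user_time ON activity (user_id, ran_at);
""")


def record_run(user_id, ran_at, km):
    cur = conn.execute(
        "INSERT INTO activity (user_id, ran_at, km) VALUES (?, ?, ?)",
        (user_id, ran_at, km))
    return cur.lastrowid


def correct_run(activity_id, actual_km):
    """Append a delta row instead of updating the original activity."""
    user_id, ran_at = conn.execute(
        "SELECT user_id, ran_at FROM activity WHERE id = ?",
        (activity_id,)).fetchone()
    # Distance currently on record: the original row plus earlier corrections.
    (recorded,) = conn.execute(
        "SELECT SUM(km) FROM activity WHERE id = ? OR corrects = ?",
        (activity_id, activity_id)).fetchone()
    conn.execute(
        "INSERT INTO activity (user_id, ran_at, km, corrects) VALUES (?, ?, ?, ?)",
        (user_id, ran_at, actual_km - recorded, activity_id))


def total_km(user_id, since=""):
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(km), 0) FROM activity"
        " WHERE user_id = ? AND ran_at >= ?",
        (user_id, since)).fetchone()
    return total


first = record_run(1, "2021-02-01T07:00:00", 12.0)
record_run(1, "2021-02-03T07:00:00", 5.0)
record_run(2, "2021-02-03T18:30:00", 21.0)
correct_run(first, 10.0)   # the first run was really 10 km, not 12

print(total_km(1))
print(total_km(1, since="2021-02-02"))
```

The correction row reuses the timestamp of the run it corrects, so sums over a time interval stay consistent. Whether that is the right choice depends on how your rules should treat corrections.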

Answered by Doc Brown on February 9, 2021

Hold those horses!

  • How are you calculating these achievements?
  • When are they awarded?

First off, these are two different species of "achievement".

  • Total KM run is a simple accumulator.
  • X Km run Per T Time Period, Starting At Y Time Offset, for N consecutive time periods is either a series of accumulated buckets or something else entirely, depending on what the Y offset is.

Second, when you award these achievements will also be a factor in which solutions are even reasonable.

An offline solution done once per day is unacceptable if the awarding needs to happen when the run is completed.


For achievements of the first ilk it makes sense to phrase it as a single bucket with levels on it: a level at 10, 100, 500, 1000, etc., or whichever thresholds make sense for what the bucket is.
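A small Python sketch of such a levelled bucket, using the standard `bisect` module. The threshold list and the function names are made up for the example:

```python
from bisect import bisect_right

# Example thresholds in km; sorted ascending, extend as needed.
LEVELS_KM = [10, 100, 500, 1000]


def level_for(total_km):
    """Number of thresholds reached so far (0 means no level yet)."""
    return bisect_right(LEVELS_KM, total_km)


def newly_reached(old_total, new_total):
    """Thresholds crossed when the bucket grows from old_total to new_total."""
    return LEVELS_KM[level_for(old_total):level_for(new_total)]


print(level_for(9.9))            # below the first threshold
print(newly_reached(95, 510))    # one update can cross several levels
```

Comparing the level before and after an update means a single long run (or a bulk import) still awards every level it passes.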

You could calculate this total:

  • periodically (like once per day/week) using all the events.
  • cache the result from yesterday and add the day's tally.
  • update the tally as events come in.

It really depends on when the achievement is to be "awarded".


As for the second ilk of achievement, that is more complex. You really need to answer what the Y Time Offset is.

If this is exactly 168 hours ago then welcome to fun land. Your only solution is to calculate each and every bucket of time each time you check to see if an "award" is due.

If you can get it to mean each week starting from Monday, then it's possible to use a series of buckets. There is no need to recalculate historic buckets, and for the current bucket any of the single bucket approaches would work.
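A sketch of the Monday-aligned variant in Python, for one user. The class name `WeeklyBuckets` and the plain dictionary storage are assumptions; in practice the buckets would live wherever you keep derived data:

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


class WeeklyBuckets:
    """One accumulator per calendar week for a single user."""

    def __init__(self):
        self.km = defaultdict(float)   # Monday date -> km run that week

    def add_run(self, day, km):
        self.km[week_start(day)] += km

    def streak_ending(self, day, min_km):
        """Consecutive weeks, ending with the week of `day`, reaching min_km."""
        if min_km <= 0:
            raise ValueError("min_km must be positive")
        monday = week_start(day)
        count = 0
        # .get avoids creating empty buckets while walking backwards.
        while self.km.get(monday, 0.0) >= min_km:
            count += 1
            monday -= timedelta(weeks=1)
        return count


buckets = WeeklyBuckets()
for offset in range(4):
    tuesday = date(2021, 1, 5) + timedelta(weeks=offset)
    buckets.add_run(tuesday, 55.0)
    buckets.add_run(tuesday + timedelta(days=3), 55.0)   # Friday, same week

streak = buckets.streak_ending(date(2021, 1, 31), min_km=100.0)
print(streak >= 4)   # "100 km per week, 4 weeks in a row"
```

Only the bucket of the current week ever changes, so historic buckets can be stored once and left alone.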

If the weeks slide by day (or hour) you can still make this work by bucketing to the slide period (days/hours) and then summing the buckets when checking for the award. Of course this is more work.
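For the sliding case, a sketch along the same lines with daily buckets, again for one user and with hypothetical names. A "week" is then any run of 7 daily buckets, and the check sums them on demand:

```python
from collections import defaultdict
from datetime import date, timedelta


class DailyBuckets:
    """One accumulator per calendar day for a single user."""

    def __init__(self):
        self.km = defaultdict(float)   # date -> km run that day

    def add_run(self, day, km):
        self.km[day] += km

    def window_total(self, end_day, days=7):
        """Sum of the `days` daily buckets ending on `end_day` (inclusive)."""
        return sum(self.km.get(end_day - timedelta(days=i), 0.0)
                   for i in range(days))

    def consecutive_windows(self, end_day, min_km, windows=4, days=7):
        """True if each of the last `windows` back-to-back windows reaches min_km."""
        return all(
            self.window_total(end_day - timedelta(days=days * w), days) >= min_km
            for w in range(windows)
        )


daily = DailyBuckets()
last_day = date(2021, 2, 9)
for i in range(28):
    daily.add_run(last_day - timedelta(days=i), 15.0)   # 15 km every day

print(daily.consecutive_windows(last_day, min_km=100.0))
print(daily.consecutive_windows(last_day + timedelta(days=1), min_km=100.0))
```

Each check touches 28 buckets here instead of 4, which is the extra work mentioned above. Bucketing by hour would multiply that by 24.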

Answered by Kain0_0 on February 9, 2021
