In this post, we focus on presentation bias, a negative effect present in almost all ML systems with a user interface (UI).
ML models often produce scores and rankings that are displayed in UIs for human decision-makers. Depending on how ML results are presented in these UIs, you can expect different behaviors from end-users.
Here are some common types of presentation biases:
❌ Position bias
The probability of receiving user feedback on an item depends on where the item is shown: this is the root cause of position bias. As an illustration, Bar-Ilan et al. (2009) conducted a user study to identify presentation biases in search engines. They showed that an item's placement on the search engine's results page influences how users assess its quality more than the actual content displayed.
Worse, some ML results are sometimes not presented to end-users at all. There is then no way to collect feedback on them and learn properly from user behavior. This is a well-known problem in statistics, called censoring. For example, in an e-commerce recommender system, if low-scored products are removed from the set of items that is presented, we bias our model, since we only learn from customer behavior on products that already score high.
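To make position bias concrete, here is a minimal simulation (all numbers are illustrative assumptions, not taken from the cited study): two items with identical true appeal are shown at different positions, and the observed click-through rate differs only because users are less likely to examine lower slots.

```python
import random

random.seed(0)

# Hypothetical parameters: both items are equally appealing, but users
# examine position 1 far more often than position 5.
RELEVANCE = 0.5                        # same true appeal for both items
POSITION_EXAM_PROB = {1: 0.9, 5: 0.2}  # chance the user even looks at the slot

def simulate_clicks(position, n_impressions=10_000):
    """A user clicks only if they examine the position AND like the item."""
    clicks = 0
    for _ in range(n_impressions):
        if (random.random() < POSITION_EXAM_PROB[position]
                and random.random() < RELEVANCE):
            clicks += 1
    return clicks / n_impressions

ctr_top = simulate_clicks(1)
ctr_low = simulate_clicks(5)
# Identical items, yet the top slot's observed CTR is several times higher.
print(f"CTR at position 1: {ctr_top:.2f}, CTR at position 5: {ctr_low:.2f}")
```

A model naively trained on these clicks would conclude the top item is more relevant, even though the two items are identical by construction.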
Fortunately, there are remedies:
✅ Presentation discounting
One technique focuses on items that were previously shown to end-users but not selected by them. A simple remedy is to present these items less frequently. This is called presentation (or impression) discounting and was introduced by Li et al. (2014). The technique is interesting because it does not change the ML model itself: it only reduces (discounts) how often unselected items are presented.
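The mechanism can be sketched as follows (a minimal illustration of the idea, not the algorithm from Li et al.; the decay factor and variable names are assumptions): each impression without a click multiplies the item's display score by a decay factor, while the underlying model scores stay untouched.

```python
from collections import defaultdict

DECAY = 0.8  # illustrative discount per ignored impression

model_scores = {"a": 0.9, "b": 0.8, "c": 0.7}  # ML scores, never modified
impressions_without_click = defaultdict(int)

def record_feedback(shown, clicked):
    """Track how many times each item was shown but ignored."""
    for item in shown:
        if item in clicked:
            impressions_without_click[item] = 0   # reset on positive feedback
        else:
            impressions_without_click[item] += 1  # another ignored impression

def display_score(item):
    # The discount is applied only at presentation time.
    return model_scores[item] * DECAY ** impressions_without_click[item]

record_feedback(shown=["a", "b", "c"], clicked={"b"})
ranking = sorted(model_scores, key=display_score, reverse=True)
print(ranking)  # "b" keeps its full score; "a" and "c" are discounted once
```

Note how "b", the clicked item, now outranks "a" despite a lower model score: the discount reshuffles presentation without retraining anything.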
✅ Introducing randomness
Another way to address the problem is to introduce some form of randomness into the presentation of items (Radlinski et al., 2006). The more randomness we add, the more we degrade the short-term performance of the model, but the more we reduce presentation bias in the long term. There is thus a trade-off between short-term and long-term performance that the developers of the ML system have to tune.
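A minimal sketch in the spirit of Radlinski & Joachims' minimally invasive randomization: randomly swap disjoint adjacent pairs in the ranking, so that within each pair either item can appear on top and clicks give a less position-biased preference signal. The swap probability below is an illustrative parameter, not a value from the paper.

```python
import random

SWAP_PROB = 0.5  # illustrative: how often each adjacent pair is flipped

def randomize_ranking(ranking, rng=random):
    """Return a copy of the ranking with adjacent pairs randomly swapped."""
    result = list(ranking)
    # Walk disjoint adjacent pairs (0,1), (2,3), ... and flip a coin for each.
    for i in range(0, len(result) - 1, 2):
        if rng.random() < SWAP_PROB:
            result[i], result[i + 1] = result[i + 1], result[i]
    return result

random.seed(42)
print(randomize_ranking(["a", "b", "c", "d"]))
```

Because only neighbors are swapped, the user-perceived quality of the list barely changes (items move at most one position), which is what keeps the short-term cost of the randomization low.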
At Giskard, we reduce presentation bias by confronting ML results with human reactions using a visual quality inspection tool.
- Bar-Ilan, J., Keenoy, K., Levene, M., & Yaari, E. (2009). Presentation bias is significant in determining user preference for search results: A user study.
- Li, P., Lakshmanan, L. V., Tiwari, M., & Shah, S. (2014). Modeling impression discounting in large-scale recommender systems.
- Radlinski, F., & Joachims, T. (2006). Minimally invasive randomization for collecting unbiased preferences from clickthrough logs.