Recommender Systems (RS) are a classic example of a technology we use without being fully aware of it. Just think of all the times Netflix suggests the next movie or series to watch, Amazon suggests a new product to buy, or adverts appear while we browse the internet. In short, Recommender Systems, or recommendation systems, are machine learning algorithms that suggest to users the items most relevant to them: films to watch, products to buy, articles to read… Their importance has grown over time, and today they are fundamental tools in many sectors because they make a service more relevant to its users. It is certainly no coincidence that Netflix once offered a one-million-dollar prize, the Netflix Prize, to the team that could build a recommendation system more accurate than the one it already had in use.
What are Recommender Systems?
In short, to ‘know’ in advance which videos, films, news or music a user will want, services such as Google or Netflix rely on a family of Machine Learning techniques called Recommender Systems. RS comprise a class of techniques and algorithms capable of suggesting relevant items to users. Ideally, the suggested items are as relevant as possible, so that the user is likely to interact with them.
Candidate items are then ranked by relevance, and the most pertinent ones are shown to the user. The role of Recommender Systems is so significant that already four years ago Harvard Business Review wrote: ‘Perhaps the single most important algorithmic distinction between born-digital and legacy companies is not personnel, data sets or computing resources but their real-time commitment to providing accurate and actionable customer advice. Recommender Systems actually lead organisations to radically rethink how they can get more value from their data while creating greater value for their customers. […] The more people use a company’s Recommender System, the more valuable it becomes, and the more valuable it becomes, the more people use it’. It is important to underline how the approach to Recommender Systems has evolved over the years. What was once treated simply as a classification or prediction problem has come to be framed as a sequential decision problem, modelled as a Markov Decision Process (MDP), to which Reinforcement Learning and Deep Reinforcement Learning methods are applied. This shift has further widened the range of applications for recommendation engines.
The different approaches to Recommender Systems
To achieve their purpose, which is to suggest to users the items relevant to their navigation, RS rely mainly on two methodologies: collaborative filtering and the content-based approach. A hybrid form combines the two.
Collaborative filtering predicts a user’s interests starting from a very large set of information about the preferences of other users. That is, it relies on the interactions previously recorded and stored in the user-item interaction matrix. To give a concrete example, this type of recommendation is at work every time an e-commerce site shows a message such as ‘Customers who bought this product also bought…’ or ‘Visitors who viewed this page also viewed…’.
This approach assumes that past user-item interactions are sufficient to detect similar users and/or similar items, and to make predictions based on the estimated proximity between them.
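As a rough illustration of the ‘also bought’ idea, the sketch below computes item-to-item cosine similarity over a tiny interaction matrix. The users, products and function names are all hypothetical, and real systems use far larger matrices and more refined similarity measures; this is only a minimal sketch of the principle.

```python
from math import sqrt

# Hypothetical user-item interaction matrix: 1 means the user bought the item.
interactions = {
    "alice": {"laptop": 1, "mouse": 1, "keyboard": 1},
    "bob":   {"laptop": 1, "mouse": 1},
    "carol": {"mouse": 1, "keyboard": 1, "monitor": 1},
    "dave":  {"laptop": 1, "monitor": 1},
}


def item_vector(item):
    """Return one column of the matrix: the users who interacted with the item."""
    return {user: row[item] for user, row in interactions.items() if item in row}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def also_bought(item, top_n=2):
    """Rank the other items by how similar their buyers are to this item's buyers."""
    all_items = sorted({i for row in interactions.values() for i in row})
    target = item_vector(item)
    scores = [(other, cosine(target, item_vector(other)))
              for other in all_items if other != item]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


print(also_bought("laptop"))
```

In this toy data, the mouse ranks first for the laptop because two of the three laptop buyers also bought a mouse. Note that no description of the products is needed, only the purchase history.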
The main advantage of this approach is that it does not require collecting any descriptive information about users or items, and it therefore adapts to many use cases. It is enough for users to interact with the items for the recommendations to become more accurate and effective. This advantage is also its limitation: since only past interactions are taken into account, collaborative filtering is not effective with new users or with new items, whether products or services. Where interactions are scarce, the system loses its effectiveness.
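The limitation with new users can be seen in a small user-based sketch. It assumes a hypothetical history of purchases and uses Jaccard overlap as the similarity between users, which is just one possible choice. A user with no history has zero similarity to everyone, so nothing can be recommended.

```python
# Hypothetical purchase history; "new_user" has not interacted with anything yet.
history = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "new_user": set(),
}


def jaccard(a, b):
    """Share of items two users have in common (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user):
    """Suggest unseen items, weighted by the similarity of the users who own them."""
    seen = history[user]
    scores = {}
    for other, items in history.items():
        if other == user:
            continue
        weight = jaccard(seen, items)
        if weight == 0:
            continue  # no overlap: this neighbour tells us nothing
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + weight
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice"))     # has neighbours with overlapping history
print(recommend("new_user"))  # no history, so no neighbours and no suggestions
```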
The content-based approach needs additional information about the items or the user. This is descriptive information (attributes, keywords and tags) that is matched against the user’s profile, which is built from their explicit preferences as well as from collected navigation data, also translated into attributes. In practice, the system compares the content of the items with what defines the user’s profile, and suggests the articles, products or services most likely to interest them. The methodology often relies on probability estimates, obtained for example with Bayesian classifiers. The classic example used to explain how content-based RS work is film recommendation. Such systems combine user information such as age, gender or other personal details with film data such as duration, actors and director to suggest the next movie to watch.
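The comparison between item content and user profile can be sketched as follows. This example deliberately swaps the Bayesian classifier for a simpler technique, cosine similarity between attribute counts, and the film titles and attributes are invented for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: each film is described by a set of attributes.
films = {
    "Film A": {"sci-fi", "director:X", "long"},
    "Film B": {"sci-fi", "director:X", "short"},
    "Film C": {"romance", "director:Y", "short"},
    "Film D": {"sci-fi", "director:Y", "long"},
}


def build_profile(liked):
    """User profile: how often each attribute appears in the films the user liked."""
    profile = Counter()
    for title in liked:
        profile.update(films[title])
    return profile


def score(profile, attributes):
    """Cosine similarity between the profile counts and a film's attribute set."""
    dot = sum(profile[a] for a in attributes)
    norm_profile = sqrt(sum(v * v for v in profile.values()))
    norm_item = sqrt(len(attributes))
    if norm_profile == 0 or norm_item == 0:
        return 0.0
    return dot / (norm_profile * norm_item)


def recommend(liked, top_n=2):
    """Rank the films the user has not liked yet by similarity to the profile."""
    profile = build_profile(liked)
    candidates = [(title, score(profile, attrs))
                  for title, attrs in films.items() if title not in liked]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


print(recommend(["Film A", "Film B"]))
```

A user who liked Film A and Film B gets Film D first, since it shares the genre and length attributes with their profile, even if nobody has watched Film D yet.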
Compared to collaborative filtering, content-based Recommender Systems suffer much less from the ‘cold start’ problem that occurs in the absence of historical data. As long as descriptive attributes are available, they can provide relevant suggestions for both new users and new items.
There are also hybrid recommendation engines, which combine the transactional data used by collaborative filtering with the metadata used by content-based Recommender Systems, and for this very reason overcome the limits of each of the two approaches described above. In a hybrid recommendation engine, tags can be generated for each product or item with natural language processing, and vector operations can be used to calculate the similarity between products. A collaborative filtering matrix can then be used to recommend items to users based on their behaviours, activities and preferences. Among the textbook examples of this approach we again find Netflix, which takes into account both the habits of similar users, as in the collaborative approach, and the descriptions or characteristics of the film or show, as in the content-based one.
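One simple way to combine the two signals, among many, is a weighted average of a collaborative score and a content-based score for each item. The sketch below assumes both scores have already been computed and normalised to the range 0 to 1; the scores, titles and the `alpha` weight are hypothetical, and this is not a description of how Netflix or any specific service does it.

```python
def hybrid_scores(collab, content, alpha=0.5):
    """Blend two score dictionaries; alpha is the weight given to collaborative scores."""
    items = set(collab) | set(content)
    return {
        item: alpha * collab.get(item, 0.0) + (1 - alpha) * content.get(item, 0.0)
        for item in items
    }


# Hypothetical, pre-computed scores. "Film New" has no interactions yet,
# so it is missing from the collaborative scores and relies on its content score.
collab = {"Film B": 0.8, "Film C": 0.3}
content = {"Film B": 0.4, "Film C": 0.2, "Film New": 0.9}

ranked = sorted(hybrid_scores(collab, content).items(),
                key=lambda pair: pair[1], reverse=True)
print(ranked)
```

The new film still reaches the ranking through its content score alone, which is how a hybrid design softens the cold-start limitation of pure collaborative filtering.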