
Recommender Systems Handbook

Published
Published
Author
Unknown
URL
Status
Genre
Book Name
Recommender Systems Handbook
Modified
Last updated December 26, 2023
Summary
The Recommender Systems Handbook is a comprehensive guide to recommendation systems and the associated algorithms, techniques, technologies, and applications. It explains how to design, develop, and evaluate such systems, and gives an overview of the challenges and opportunities involved.

Key Learnings:
- Features and algorithms of modern recommendation systems
- Evaluation strategies and methods
- Performance optimization and advanced applications
- Challenges and opportunities of this field

Why Read for UX Designers: As a UX designer, it's important to understand how algorithms such as those behind modern recommendation systems can enhance the user experience. This book reviews the fundamentals and applications of these systems, so you can make informed decisions about how best to use them.

Other Books of Interest: If you want to explore the intersection between UX, algorithms, and recommender systems further, consider Designing the Interactions (Matthew Milan), Algorithms of the Intelligent Web (Haralambos Marmanis and Dmitry Babenko), and Programming the Semantic Web (Toby Segaran, Colin Evans, and Jamie Taylor).
Created time
Aug 24, 2022 11:13 AM

🎀 Highlights

it is common to turn to one’s peers for recommendations when selecting a book to read;
The development of RSs initiated from a rather simple observation: individuals often rely on recommendations provided by others
time affects baseline predictors, b_u(t) and b_i(t). However, temporal dynamics go beyond this; they also affect user preferences and thereby the interaction between users and items.
temporal effects are the hardest to capture, because preferences are not as pronounced as main effects (user-biases), but are split over many factors.
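A minimal sketch of what time-dependent baseline predictors b_u(t) and b_i(t) can look like. The parameterization used here (a dampened linear drift for the user bias and a per-time-bin offset for the item bias), the function names, and the defaults beta=0.4 and bin_size=30 are illustrative assumptions; the highlights above do not specify them.

```python
import math


def dev(t, t_mean, beta=0.4):
    """Signed, dampened distance between day t and the user's mean rating day."""
    diff = t - t_mean
    if diff == 0:
        return 0.0
    return math.copysign(abs(diff) ** beta, diff)


def baseline(mu, b_u, alpha_u, t_mean_u, b_i, b_i_bin, t, bin_size=30):
    """Return mu + b_u(t) + b_i(t) for a rating given on day t.

    b_u(t) = b_u + alpha_u * dev(t)        (assumed: user bias drifts over time)
    b_i(t) = b_i + b_i_bin[t // bin_size]  (assumed: item bias varies per time bin)
    """
    b_u_t = b_u + alpha_u * dev(t, t_mean_u)
    b_i_t = b_i + b_i_bin.get(t // bin_size, 0.0)
    return mu + b_u_t + b_i_t


# Hypothetical numbers: global mean 3.5, a slightly generous user,
# and an item that was rated higher than usual in time bin 3.
prediction = baseline(mu=3.5, b_u=0.2, alpha_u=0.0, t_mean_u=100,
                      b_i=-0.1, b_i_bin={3: 0.3}, t=100)
print(round(prediction, 2))
```

This covers only the main effects. The user-item interaction terms that the highlight says are harder to capture would need time-dependent latent factors on top of this.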
Similarly, for context aware recommendation where the prediction depends on the current situation, e.g., location, mood, time, the context can be added as additional variables to a factorization machine resulting in a context-aware scoring function.
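The idea of adding context as extra input variables can be sketched with a second-order factorization machine scoring function. The feature names, weights, and latent vectors below are made up for illustration, and the pure-Python implementation is an assumption rather than the book's code.

```python
def fm_score(x, w0, w, V):
    """Second-order factorization machine score for a sparse input x.

    x:  dict feature -> value (one-hot user, item, and context indicators)
    w0: global bias
    w:  dict feature -> linear weight
    V:  dict feature -> list of k latent factors
    """
    linear = sum(w.get(j, 0.0) * xj for j, xj in x.items())
    k = len(next(iter(V.values())))
    pairwise = 0.0
    for f in range(k):
        # Standard identity: the sum over pairs equals
        # 0.5 * ((sum of terms)^2 - sum of squared terms), per factor.
        s = sum(V[j][f] * xj for j, xj in x.items())
        s_sq = sum((V[j][f] * xj) ** 2 for j, xj in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical parameters for one user, one item, and one context value.
V = {"user=alice": [1.0, 0.0], "item=film7": [1.0, 1.0], "time=weekend": [0.0, 2.0]}
w = {"item=film7": 0.1}

without_context = fm_score({"user=alice": 1, "item=film7": 1}, 0.5, w, V)
with_context = fm_score({"user=alice": 1, "item=film7": 1, "time=weekend": 1}, 0.5, w, V)
print(without_context, with_context)
```

Adding the context indicator changes the score only through its linear weight and its pairwise interactions with the user and item features, which is what makes the scoring function context-aware.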
A good recommender system is often contextual, considering the situation in which a recommendation should be made, e.g., what product the user is currently browsing, or time information such as weekday vs. weekend.
Item recommendation can be formulated as a context-aware ranking problem where the whole set of items should be ordered given a query context. In traditional collaborative filtering, the context would be the user; in more complex cases, the context might be user and time, a user and location, the sequence of previously selected items by a user, etc.
2 Problem Definition
The goal of item recommendation is to retrieve a subset of interesting items for a given context c ∈ C from a set of items I. This chapter uses the general concept of a context as a placeholder to cover many common item recommendation scenarios. In the simplest case, the context could be the user; in more complex cases, the context might be user and time, a user and location, the sequence of previously selected items by a user, etc.
6.3 Dynamic User Model
When a user interacts with a recommender system, the system should be responsive and change the recommendation based on a user’s feedback. For example, after watching a video, the system should be able to make better recommendations taking into account this new information.
Many traditional recommendation algorithms ignore the chronological order of users’ historical interactions. Nonetheless, user-item interactions are essentially sequential. A user’s short-term interests have a huge impact on her decisions. Time context (e.g., holidays, Black Friday, etc.) also affects user behavior. Moreover, item popularity is dynamic rather than static over time.
The temporal dynamics call for sequence-aware recommender systems. Learning preference representation from the sequence of actions becomes the fundamental task in sequence-aware recommendation.
Wu et al. [124] propose recurrent recommender networks (RRN), which use LSTMs to capture the temporal dependencies for both users and items.
Intuitively, if a user has no activities for a relatively long time, her last action might have less impact on her current decision. Thus, Xu et al. [131] propose a time kernel to learn functional time representations in a self-attention model.
There are many other methods for temporal dynamics modeling. For example, Chen et al. [79] propose a hierarchical gating network to discriminate item importance based on users’ preferences via a gating mechanism.
Chen et al. [16] design a memory-augmented network based sequential recommendation algorithm. It utilizes an external memory matrix to store, access, and manipulate users’ historical records in a more explicit fashion.
The proposed framework DeepPage is able to adaptively optimize a page of items based on the user’s real-time actions. Zheng et al. [154] present DRN, a reinforcement learning based news recommender method which considers: (1) the dynamic changes of news content and user preferences, (2) the return patterns of users (to the service), and (3) increased diversity of recommendations. Xian et al. [128] propose a reinforcement learning approach that finds paths in a knowledge graph to enhance the explainability of recommendations.
Zhao et al. [151] present a spatio-temporal gated network (STGN) for POI recommendation by enhancing the long short-term memory network with gating mechanisms. Specifically, a time gate and a distance gate are proposed to control the updates of short-term and long-term preference representations.
Many existing approaches to recommender systems focus on recommending the most relevant items to individual users and do not take into consideration any contextual information, such as time, place, and the company of other people (e.g., for watching movies or dining out).
it may not be sufficient to consider only users and items—it is also important to incorporate the contextual information into the recommendation process in order to recommend items to users under certain circumstances.
mobile app recommenders [73, 130], and many others. In particular, mobile recommender systems constitute an important special case of context-aware recommenders, where context is often defined by spatial-temporal information such as location and time, and there exists a large body of literature dedicated specifically to mobile recommender systems (e.g., see [69, 124, 130, 132, 134, 137] for a few representative examples).
Netflix considers the following types of contexts in their recommendation methods: location (country and region within the country), the type of device being used for watching videos, time, cultural/religious/national festivities, attention (if the user is focused on watching the video or it is being played in the background), companion with whom the video is being watched, outside events occurring at the same time (sport events, elections, etc.), weather, seasonality, and user’s daily patterns (e.g., commuting to work) [19].
For example, when choosing which songs to play for a given user, some streaming platforms infer the mood of the user based on the users’ short-term goals and their recent activities. In addition to the inferred mood, Spotify also relies on the following types of contexts in their recommendation algorithms when providing song recommendations, among others: day of the week, time of the day, user’s region, type of user’s device, and the platform on the device being used for listening [54, 58,
Similarly, LinkedIn uses various types of contextual information, including date, time, location, device/platform and page, to provide career-related recommendations [11]. In fact, context plays a central role in LinkedIn recommendations, the goal for which has been explicitly formulated as “predict probability that a user will respond to an item in a given context” [emphasis added] [11].
user’s personal characteristics and item’s content attributes, can have an effect on user’s preferences for items. Moreover, aside from the user and item features, a number of other factors that reflect the user’s circumstances while consuming the items, may also impact these preferences, such as time, location and weather.
Turning to recommender systems, one of their goals is estimating user’s utility,
We start with the traditional and popular representational approach to modeling contextual information in Sect. 2.2, explore and describe a broader classification of modeling contextual factors in Sect. 2.3, and discuss the ways to design and obtain contextual factors in Sect. 2.4.
It is important to note that Context can be modeled in CARS in a variety of different ways, as will be discussed in Sect. 2.3 in more detail. One popular approach—we refer to it as the traditional (representational) approach to context-aware recommender systems—assumes that the contextual information, such as time, location, and the company of other people, is explicitly described by a set of pre-defined contextual factors (sometimes called contextual dimensions, variables, or attributes), the structure of which does not change over time (i.e., is static).
User: the people to whom movies are recommended; it is defined by UserID, but can have additional user characteristics available (e.g., demographic and socioeconomic attributes).
Further, the contextual information consists of the following three contextual factors: Location: the location from which the user watches the movie that is represented by LocationType (“home”, “theater”, “airplane”, and “other”). Time: the time when the movie can be or has been seen; it is represented by Date. Depending on the relevant granularity for a given application, Date can also be aggregated in different ways, such as DayOfWeek (with values Mon, Tue, Wed, Thu, Fri, Sat, Sun) or TimeOfWeek (“weekday” and “weekend”).
Although this complexity can take many different forms, one popular defining characteristic is the hierarchical structure of contextual information that can be represented as trees, as is done in most of the context-aware recommender and profiling systems, including [4] and [99]. E.g., Example 1 already mentioned that the standard Date values (i.e., calendar dates) can be hierarchically aggregated to DayOfWeek and then further to TimeOfWeek.
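The Date → DayOfWeek → TimeOfWeek hierarchy described above can be sketched with the standard library. The function names are illustrative assumptions; the value lists follow the ones quoted in the highlights.

```python
from datetime import date

# DayOfWeek values as listed in the text; date.weekday() returns 0 for Monday.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_of_week(d):
    """First aggregation level: calendar date -> DayOfWeek."""
    return DAYS[d.weekday()]


def time_of_week(d):
    """Second aggregation level: DayOfWeek -> TimeOfWeek ("weekday" or "weekend")."""
    return "weekend" if d.weekday() >= 5 else "weekday"


for d in (date(2022, 8, 24), date(2023, 12, 30)):
    print(d.isoformat(), day_of_week(d), time_of_week(d))
```

Storing only the finest level (the date) and deriving the coarser levels on demand keeps the hierarchy consistent, and lets an application pick whichever granularity turns out to be relevant.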
In [4, 10], the authors proposed to treat the contextual information as part of a multidimensional data (MD) model within the framework of Online Analytical Processing (OLAP) used for modeling multidimensional databases deployed in data warehousing applications [44, 74]. Mathematically, the OLAP model is defined with an n-dimensional tensor (see Sect. 4 for subsequent discussion of tensors and their factorization in the context of CARS). In particular, in addition to the classical User and Item dimensions, additional contextual dimensions, such as Time, Location, etc., are also included as part of the tensor.
For example, rating R(101, 7, 1) = 6 in Fig. 2 means that for the user with User ID 101 and the item with Item ID 7, rating 6 was specified during the weekday.
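As a rough illustration of the User × Item × Time rating tensor, the sketch below stores the sparse tensor as a dictionary keyed by (user_id, item_id, time_id). Only the entry R(101, 7, 1) = 6 and the reading of Time ID 1 as "weekday" come from the text; the storage layout and function name are assumptions.

```python
# Sparse 3-dimensional rating tensor: (user_id, item_id, time_id) -> rating.
ratings = {
    (101, 7, 1): 6,  # the example from the text
}

# Only the label for time_id 1 is given in the text; other IDs are not shown here.
TIME_LABELS = {1: "weekday"}


def R(user_id, item_id, time_id):
    """Look up a known rating; return None if the cell is unobserved."""
    return ratings.get((user_id, item_id, time_id))


print(R(101, 7, 1), TIME_LABELS[1])
print(R(101, 7, 2))  # unobserved cell, i.e. a rating the system would have to estimate
```

A dictionary is used rather than a dense array because most cells of such a tensor are unknown, and estimating them is precisely the recommendation task.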
A broader classification of major approaches to modeling contextual information, i.e., classification that goes beyond the standard assumption of the explicit availability of predefined contextual factors with stable (static) structure, is based on the following two aspects of contextual factors [9]: (i) what a recommender system may assume (or know) about the structure of contextual factors, and (ii) how the structure of contextual factors changes over time.
a recommender system can have different levels of knowledge about the contextual factors. This may include knowledge of the list of relevant factors, their structure, and lists of their values.
Explicit: The contextual factors relevant to the application, as well as their structure and lists of their values are known explicitly. For example, in a restaurant application, the recommender system may use only DayOfWeek, TimeOfDay, Company, and Occasion contextual factors. For each of these factors, the system may know the relevant structure and the complete list of their values, such as using values Morning, Afternoon, Evening, Night for the TimeOfDay variable.
Latent: No information about contextual factors is explicitly available to the recommender system, and it makes recommendations by utilizing only the latent knowledge of context in an implicit manner. For example, the recommender system may build a latent predictive model, such as a hierarchical linear or hidden Markov model, to estimate unknown ratings, where context is modeled using latent variables.
Traditional (i.e., Explicit Static) Approach   As discussed in Sect. 2.2, this approach corresponds to the representational view of context [53], which assumes that all the contextual information in a given application can be modeled with a predefined, explicit, finite set of observable factors, where each factor has a well-defined structure and the structure does not change significantly over time (i.e., is static). The vast majority of the first generation of CARS papers focused on this approach, and it still remains popular because of its simplicity and clarity.
Explicit Dynamic Approach   This approach represents recommendation settings where the explicit structure as well as the list of values of the contextual factor can change over time.
consider conversational recommender systems that can iteratively collect contextual information from the user. The list of types and values of such contextual information can also change dynamically over time when new contextual factors extracted from recent conversations are constantly added to the list.
Context Parsing method that extracts all relevant contextual factors along with their values from user reviews. However, not all the reviews contain information about the values of all contextual factors, which makes the observed data incomplete.
The first generation of CARS-related research was mostly focused on the traditional approach of modeling contextual information assuming static and explicitly defined structure.
new wave of research papers on this topic explores the potential of large amounts of data collected through various means, such as user logs or various types of sensors, largely following the latent static approach to modeling contextual factors. As we discuss in Sect. 4, recent work on the use of reinforcement learning in CARS started exploring the latent dynamic approach. More generally, however, the dynamic approaches (both explicit and latent) to modeling contextual factors have been highly under-explored and, thus, represent a strong potential for novel future research. We will discuss these issues further in Sect. 5.
The data values for contextual factors can also be obtained in several ways, including: Directly, i.e., by asking users direct questions or eliciting this information directly from other sources of contextual information. For example, a website may obtain contextual information by asking a user to answer some specific questions before providing a context-aware recommendation. Similarly, a smartphone app may obtain time, location, and motion data from the phone’s clock, GPS sensor, and accelerometer, respectively, and weather information can be obtained from a third-party resource by querying it with a specific time and location.
dynamic information about the user or item, such as information that user is on vacation this week, would often be considered and modeled as context.
for context-aware recommender systems to be useful, system designers must model context and choose contextual factors in a way that encompasses both types of aforementioned interactions.
In order to provide genuine context-aware recommendations, the recommender system must be able to use the contextual information (i.e., context in which the user intends to consume, which can be elicited from the user, observed or imputed directly by the system) at the time of recommendation.
the corresponding contextual information (i.e., context in which the item was consumed by the user) also needs to be elicited from the user or observed directly by the system.
a contextual factor should be relevant for a substantial number of users and/or items. This means that the values of a relevant contextual factor should not be constant across different user-item experiences.
Naturally, not all the available contextual factors might be relevant or useful for recommendation purposes. Consider, for example, a book recommender system. Many types of contextual data could potentially be obtained by such a system from book buyers, including: (a) the purpose of buying the book (possible options include for work, for leisure, etc.); (b) planned reading time (weekday, weekend, etc.); (c) planned reading place (at home, at school, on a plane, etc.); (d) the value of the stock market index at the time of the purchase.
For example, for mobile recommendation applications, the following four general types of contextual information are often considered [9, 55]: physical context (e.g., time, position, activity of the user, weather, light conditions, temperature), social context (e.g., is the user alone or in the group, presence and role of other people around the user), interaction media context (e.g., device characteristics—phone/tablet/laptop/etc., media content type—text/audio/video/etc.), modal context (e.g., user’s state of mind—mood, experience, current goals).
Another approach for assessing relevance of contextual information has been proposed by Baltrunas et al. [23], who developed a survey-based instrument that asks the users to judge what their preferences would be in a wide variety of hypothetical (i.e., imagined) contextual situations. This makes it possible to collect richer contextual preference information in a short timeframe, evaluate the impact of each contextual factor on user preferences based on the collected data, and include in the resulting context-aware system only those factors that were shown to be important.
the collected data includes only hypothetical contextual preferences (i.e., preferences for items that users imagined consuming under certain contextual circumstances), the authors demonstrate that the resulting context-aware recommender system was perceived to be more effective by users as compared to the non-context-aware recommender.
In the first approach, systems typically use contextual information (obtained either directly from the user, e.g., by specifying current mood or interest, or from the environment, e.g., obtaining local time, weather, or current location) to query or search a certain repository of resources (e.g., restaurants) and present the resources that best match a given query (e.g., nearby restaurants that are currently open) to the user.
While both general approaches offer a number of research challenges, in the remainder of this chapter we will focus on the second, more recent trend of the contextual preference elicitation and estimation in recommender systems. To
In its general form, a canonical 2-dimensional (2D) (User × Item) recommender system can be described as a function, which takes partial user preference data as its input and produces a list of recommendations for each user as an output. Accordingly, Fig. 5 presents a general overview of the canonical 2D recommendation process, which includes three components: data (input), 2D recommender system (function), and recommendation list (output).
As mentioned in Sect. 2.2, canonical recommender systems are built based on the knowledge of partial user preferences presented in the form <user, item, rating>. In contrast, context-aware recommender systems are built based on the knowledge of partial contextual user preferences and typically deal with data records of the form <user, item, context, rating>, which also include the contextual information in which the item was consumed by this user (e.g., Context = Saturday). In addition, context-aware recommender systems may also make use of the structures of context attributes, such as context hierarchies (e.g., Saturday → Weekend) mentioned in Sect. 2.2. Based on the presence of this additional contextual data, several important questions arise: How should contextual information be reflected when modeling user preferences? Can we reuse the wealth of knowledge in canonical (non-contextual) recommender systems to generate context-aware recommendations?
2 Context in Recommender Systems
2.1 What Is Context?
Context is a multifaceted concept that has been studied across different research disciplines, including computer science (primarily in artificial intelligence and ubiquitous computing), cognitive science, linguistics, philosophy, psychology, and organizational sciences. In fact, an entire conference, CONTEXT, is dedicated exclusively to studying this topic and incorporating it into various other branches of science, including medicine, law, and business. Since context is a multidisciplinary concept, each discipline tends to take its own idiosyncratic view that is somewhat different from other disciplines and is more specific than the standard generic dictionary definition of context as “interrelated conditions in which something exists or occurs”. Therefore, there exist many definitions of context across various disciplines and even within specific subfields of these disciplines.
The settings where the contextual factors are stable over time are classified as static, whereas the factors changing over time are classified as dynamic
The latent approach is mostly used in order to represent context in an efficient and reduced manner from high-dimensional data, where the relationships between the original contextual features are revealed. Because the contextual factor structure is stable, it can be modeled with latent variables, mostly in the form of a vector containing numeric attributes.
As the CEO of Netflix pointed out in 2012 [63], Netflix can improve the performance of its recommendation algorithms by up to 3% when taking into account such contextual information.
More recently, it has been observed at Netflix that “contextual signals can be as strong as personal preferences; …make them central to your system and infrastructure” [19].
unstructured latent contextual vector from its leaf to
Several popular approaches to train recommender systems from implicit feedback data were presented. An advantage of sampling based approaches is that they can be applied to most recommender models.
In any case, properly tuning and setting up models can be more important than switching to a more complex approach [7, 21, 22].
Video features can also be leveraged to improve recommendations. Usually, videos are converted into a sequence of frames and audio waves. As such, CNN-based models become a desirable option for video analysis.
Xu et al. [17] propose a key frame recommender system to select the key frames from a video for each user.
Explainable recommender systems not only generate personalised recommendations but also produce intuitive explanations for the recommendations.
Entertainment: Covington et al. [23], Chen et al. [12], Van et al. [85], Huang et al. [56], Wang et al. [121], Cheng et al. [19], Ying et al. [138], Yang et al. [135]
Regardless of what approach is chosen to model contextual factors in a specific context-aware recommender system, a common challenge for the system designers is that there are many candidate contextual factors available for consideration. Thus, the question of which information observable by a recommender system should be considered as relevant,
Still another approach toward inferring contextual information, albeit for non-RS related problems, was proposed in [75] where temporal contexts were discovered in web-sessions by decomposing these sessions into non-overlapping segments, each segment relating to one specific context. These contexts were subsequently identified using certain optimization and clustering methods.
Contextual Factors Should Not Be Static Properties of Users and Items   The first guideline represents an observation that, in order to model a certain data attribute as a contextual factor, it should truly be representative of context and not be a static characteristic of users or items.
the user’s dietary restrictions (e.g., vegetarianism or peanut allergy) would be more appropriate to model as user attributes, while the user’s intended company for the meal (e.g., with a significant other vs. with small children vs. alone) would be more appropriate to model as contextual factors.
Recommendation (system-to-user interaction): the system provides recommendations to the users of the items predicted to be most relevant to them; Feedback (user-to-system interaction): the users provide feedback to the system about their preferences for the consumed items.
In particular, for the delayed consumption recommender systems, where the recommendation interactions and the feedback interactions may occur a substantial time apart, the available contextual information can be significantly different. For example, for a restaurant recommendation application, at the recommendation time it may be more useful to ask for the user’s current food-related desires and moods (e.g., “I am in the mood for authentic Italian pizza”). However, at the feedback time, typically more specific contextual details that affected the actual restaurant experience can be collected, e.g., from the user’s restaurant review (such as “I ended up getting spaghetti and meatballs”, “it was too hot—air conditioning was not working properly”, and “the waiter was rude”). Modeling the “common” view of contextual information that combines the recommendation-time and the feedback-time perspectives represents an important challenge for context-aware recommender systems designers, especially in the delayed consumption applications. In contrast, this challenge is typically less pronounced for instant consumption recommender systems, where the recommendation interactions and feedback interactions occur in close temporal proximity, and where the available contextual information can be treated as essentially identical. For example, for a music streaming platform, the recommendation for the next song can be made based on the user’s self-reported current mood, and the user’s immediate consumption of the song can be viewed as occurring directly in that same context.
In the presence of available contextual information, following the diagrams in Fig. 6, we start with the data having the form U × I × C × R, where C is an additional contextual dimension, and end up with a list of contextual recommendations i_1, i_2, i_3, … for each user.
Zheng et al. [141] use a similar approach (called context “relaxation”) for travel recommendations.
The predictions are done using these contextual micro-profiles instead of a single user model.
Amatriain [17] introduce the idea of micro-profiling (or user splitting), which splits the user profile into several (possibly overlapping) sub-profiles, each representing the given user in a particular context.
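A simplified sketch of micro-profiling (user splitting): each <user, item, context, rating> record is assigned to a sub-profile keyed by the user and one contextual factor, so that a standard 2D recommender can treat every (user, context value) pair as its own pseudo-user. The record layout and function name are assumptions, and this version builds disjoint sub-profiles on a single factor, whereas the text notes that sub-profiles may overlap.

```python
from collections import defaultdict


def split_user_profiles(records, context_key):
    """Group ratings into micro-profiles keyed by (user, value of one contextual factor).

    records: iterable of (user, item, context_dict, rating) tuples
    returns: dict mapping (user, context_value) -> {item: rating}
    """
    profiles = defaultdict(dict)
    for user, item, context, rating in records:
        profiles[(user, context[context_key])][item] = rating
    return dict(profiles)


# Hypothetical rating log for one user.
records = [
    ("jane", "comedy1", {"time_of_week": "weekend"}, 5),
    ("jane", "drama1", {"time_of_week": "weekday"}, 4),
    ("jane", "comedy2", {"time_of_week": "weekend"}, 4),
]

micro_profiles = split_user_profiles(records, "time_of_week")
print(micro_profiles[("jane", "weekend")])
print(micro_profiles[("jane", "weekday")])
```

Predictions for Jane on a weekend would then be made from the ("jane", "weekend") micro-profile instead of her single combined profile.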
if a person wants to see a movie on a weekend, and on weekends she only watches comedies, the system can filter out all non-comedies from the recommended movie list. More generally, the basic idea for contextual post-filtering approaches is to analyze the contextual preference data for a given user in a given context to find specific item usage patterns (e.g., user Jane Doe watches only comedies on weekends) and then use these patterns to adjust the item list, resulting in more “contextual” recommendations, as depicted in Fig.
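The weekend-comedy example can be sketched as a simple contextual post-filter. The pattern rule used here (keep only genres the user has consumed in the given context) is one illustrative choice among many, and all names and data are hypothetical.

```python
def post_filter(ranked_items, history, genre_of, context):
    """Adjust a context-free ranked list using the user's usage pattern in a context.

    ranked_items: item ids, best first, from any non-contextual recommender
    history:      list of (item, context) pairs the user consumed in the past
    genre_of:     dict item -> genre
    context:      the context for which recommendations are requested
    """
    genres_in_context = {genre_of[item] for item, ctx in history if ctx == context}
    if not genres_in_context:
        # No evidence for this context: leave the original list untouched.
        return list(ranked_items)
    # Keep the original order, dropping genres never consumed in this context.
    return [item for item in ranked_items if genre_of[item] in genres_in_context]


genre_of = {"a": "comedy", "b": "drama", "c": "comedy", "d": "drama"}
history = [("a", "weekend"), ("b", "weekday")]
ranked = ["d", "c", "b", "a"]

print(post_filter(ranked, history, genre_of, "weekend"))
print(post_filter(ranked, history, genre_of, "holiday"))
```

Hard filtering like this is the strictest variant. A softer alternative would re-rank by lowering the scores of mismatching items instead of removing them.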
Each of these CM methods requires building a contextual profile prof(u, c) for user u in context c, and then using the contextual profiles of all the users to find the N nearest neighbors of user u in terms of these profiles in context c.
Adomavicius and Tuzhilin [5] present a method of extending a regression-based Hierarchical Bayesian (HB) collaborative filtering model of estimating unknown ratings proposed by Ansari et al. [15] in order to incorporate additional contextual dimensions, such as time and location, into the HB model.
Oku et al. [97] propose to incorporate additional contextual factors (such as time, companion, and weather) directly into recommendation space and use machine learning techniques to provide recommendations in a restaurant recommender system.
Furthermore, Oku et al. [97] empirically show that the context-aware SVM significantly outperforms the non-contextual SVM-based recommendation algorithm in terms of predictive accuracy and user’s satisfaction with recommendations.
Recent state-of-the-art context-aware methods represent the relations between users/items and contexts as a tensor, with which it is difficult to distinguish the impacts of different contextual factors and to model complex, non-linear interactions between contexts and users/items. Therefore, several attention-based recommendation models are used to enhance CARS through adaptively capturing the interactions between contexts and users/items and improve the interpretability of recommendations through identifying the most important contexts [52, 66, 92].
[92] proposed a neural model, named Attentive Interaction Network (AIN), to enhance CARS through adaptively capturing the interactions between contexts and users/items, and [52]
Context-aware sequential recommendations have been extensively used to monitor the evolution of user tastes over time, which helps to improve the quality of contextual recommendations [107].
Reinforcement Learning (RL)   Most recommendation models consider the recommendation process as static, which makes it difficult to capture users’ temporal intentions and to respond to them in a timely manner.
[54] proposed a contextual bandit algorithm to decide which content to recommend to users, where the reward function takes into consideration contextual information, and [84] proposed a framework for online learning and adaptation to sequential preferences within a listening session
Multi-armed bandit is the most thoroughly studied RL problem. It is inspired by slot machines in a casino: for a bandit (slot) machine with M arms, pulling arm i will result in a random payoff (reward) r, sampled from an unknown and arm-specific distribution p_i. The objective is to maximize the total reward of the user over a given number of interactions.
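To make the setting concrete, here is a minimal sketch of one common strategy for it, epsilon-greedy. The source does not name a specific strategy, so the choice is an assumption, as are Bernoulli payoffs for each p_i and the names `epsilon_greedy` and `payoff_probs`.

```python
import random

def epsilon_greedy(payoff_probs, rounds, epsilon=0.1, seed=0):
    """Play `rounds` pulls on a Bernoulli bandit; return (total reward, pulls, mean estimates)."""
    rng = random.Random(seed)
    m = len(payoff_probs)      # number of arms M
    pulls = [0] * m            # how often each arm was pulled
    means = [0.0] * m          # running estimate of each arm's expected payoff
    total = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(m)                        # explore a random arm
        else:
            arm = max(range(m), key=lambda i: means[i])   # exploit the best estimate
        # The learner never reads payoff_probs directly; it only observes the reward.
        reward = 1.0 if rng.random() < payoff_probs[arm] else 0.0
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]  # incremental mean update
        total += reward
    return total, pulls, means

total, pulls, means = epsilon_greedy([0.2, 0.5, 0.8], rounds=500)
print(total, pulls)
```

The `epsilon` parameter trades exploration against exploitation: a larger value samples arms more evenly, a smaller value commits sooner to the arm that currently looks best.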
Tripathi et al. [123] combine the potential of reinforcement learning and deep bidirectional recurrent neural networks for automatic personalized video recommendation. They present a context-aware collaborative filtering approach, where the intensity of user’s non-verbal emotional response toward the recommended video is captured through interactions and facial expression analysis.
In [121], a Q-learning-based travel recommender is proposed, where trips are ranked using a linear function of several content and contextual features including trip duration, price, and country, and the weights are updated using user feedback.
Context-awareness is being recognized as an important issue in many recommendation applications, which is evidenced by the increasing number of papers being published on this topic.
researchers have focused primarily on how to take advantage of contextual information in order to improve the quality of recommendations for different recommendation tasks and applications.
the main research issues, challenges, and directions can be broadly classified into the following four general categories [8]: Algorithms, i.e., developing recommendation algorithms that can incorporate contextual information into recommender systems in advantageous ways. Evaluation, i.e., in-depth performance evaluation of various context-aware recommendation approaches and techniques, their benefits and limitations. Engineering, i.e., designing general-purpose architectures, frameworks, and approaches to facilitate the development, implementation, deployment, and use of context-aware recommendation capabilities. Fundamentals, i.e., deeper understanding of the notion of context and of modeling context in recommender systems.
In particular, [101] show that context-aware recommendation techniques outperform canonical (non-contextual) approaches in terms of accuracy, trust, and several economics-based performance metrics across most of their experimental settings.
Another interesting example of user studies with context-aware recommender systems is the work by Braunhofer et al. [31]. In this study, the users, after receiving a recommendation from a context-aware system that recommends points-of-interest, were asked to evaluate the system’s performance on the following two dimensions: “Does this recommendation fit my preference?” (i.e., “personalization” performance) and “Is this recommendation well-chosen for the situation?” (i.e., “contextualization” performance).
One of the major issues that slowed down the progress of the CARS field in the past was the limited availability of large-scale public datasets on which novel CARS-based methods could be evaluated. This situation has improved significantly over the last few years as such datasets became publicly available. For example, DePaulMovie [142] and LDOS-CoMoDa [94] datasets contain movie ratings collected along with contextual information; InCarMusic [21], Frappe [20] and STS [32] datasets provide the apps usage logs.
Despite this progress, the CARS community would benefit significantly from additional publicly available, large-scale, context-oriented datasets, and this should also be an important priority for the research community.
the study by Hussein et al. [67, 68], where the authors introduce a service-oriented architecture that makes it possible to define and implement a variety of different “building blocks” for context-aware recommender systems, such as recommendation algorithms, context sensors, various filters and converters,
Another important “Engineering” aspect is to develop richer interaction capabilities with CARS that make recommendations more flexible. As compared to canonical recommender systems, context-aware recommenders have two important differences. The first is increased complexity, since CARS involve not only users and items in the recommendation process, but also various types of contextual information. Thus, the types of recommendations can be significantly more complex in comparison to the canonical non-contextual cases.
The second difference is increased interactivity, since more information (i.e., context) usually needs to be elicited from the user in the CARS settings. For example, to utilize the available contextual information, a CARS system may need to elicit from the user (Tom) with whom he wants to see a movie (e.g., girlfriend) and when (e.g., over the weekend) before providing any context-specific recommendations.
The combination of these two features calls for the development of more flexible recommendation methods that allow the user to express the types of recommendations that are of interest to them rather than consuming standard recommendations that are “hard-wired” into the recommendation engines provided by many current vendors.
The second requirement of interactivity also calls for the development of tools allowing users to provide inputs into the recommendation process in an interactive and iterative manner, preferably via some well-defined user interfaces (UI).
What are the tradeoffs of different modelling assumptions (e.g., static vs. dynamic context)? The recommender systems community has been moving towards studying some of these questions.
Moving forward, we believe that significantly more research is needed to better understand various aspects of latent and/or dynamic contextual information and how to leverage it in providing better recommendations. Another related and also under-explored research area is how to extract the contextual information from different data sources, such as user reviews, tweets, sensors, IoT devices, etc., that is not explicitly observed and/or recorded as context by the recommender system.
In addition, a user model in context-based RSs, as described in Chapter “Context-Aware Recommender Systems: From Foundations to Recent Developments”, is dynamic and changes based on contextual features such as temporal, location, mood or any other relevant features. Usually in context-aware RSs the users are known to the system and have user models that should be altered according to the contextual information to address the short-term preferences emerging in a specific context.
We should stress that utilizing the right data is fundamental to the performance of an RS. As described above, a variety of data and knowledge sources can be leveraged in various RS techniques. The decision about which data to use for a system, and how to use it, should be made carefully, considering the availability of data, the recommendation algorithm, the required effort, and the available resources.
Explicit feedback is known to be more reliable than implicit feedback and provides a level of preference of the user for an item (e.g., ratings 1–5). However, explicit feedback is often not available, or very sparse, since many users would not bother to provide it. Ratings are the most popular form of explicit feedback data that RSs collect [59].
Users   Users of an RS, as we mentioned above, may have very diverse goals and characteristics. In order to personalize the recommendations and the human-computer interaction, RSs exploit a range of information about the users. This information can be structured in various ways. User data is said to constitute the user model [16, 30]. The user model profiles the user, i.e., encodes her preferences and needs. Various user modeling approaches have been used and, in a certain sense, an RS can be viewed as a tool that generates recommendations by building and exploiting user models [13, 14].
In context aware RS (see Chapter “Context-Aware Recommender Systems: From Foundations to Recent Developments”) the user model incorporates the contextual information to be utilized during the recommendation process in order to recommend items to users under specific contextual situations. For example, by using the temporal context, a travel recommender system would include contextual features describing the weather, and the time of year when the vacation was consumed, so that a vacation recommendation in winter can be very different from the one recommended in summer.
In Chapter “Deep Learning for Recommender Systems” the authors overview some key methods and highlight the impact of deep learning techniques in the recommender systems field. They present a range of challenging tasks in recommendation, such as cold-start problem, explainability, temporal dynamics, and robustness that can be addressed using deep neural networks.
Evidence suggests that people tend to rely more on recommendations from their friends than on recommendations from similar but anonymous individuals [69].
social recommender systems [31]
An essential goal of recommender systems is to help users make better choices [23, 34, 37].
research towards a more user-centric analysis. In this handbook, some already mentioned contributions deal with these novel evaluation dimensions: Chapters “Fairness in Recommender Systems”, “Novelty and Diversity in Recommender Systems”, and “Value and Impact of Recommender Systems”.
Another notable example of RSs that emerged with the diffusion of new communication technologies are recommender systems related to the social web, and specifically those that target the social media domain. With the rise of social networks (e.g., Facebook, LinkedIn, Twitter, Flickr, and others), users are overloaded with information, activities and interactions.
Social recommender systems are RSs that aim at assisting the user in identifying relevant content (e.g., tweets, feeds or images), and engage only in relevant activities and interactions (e.g., discussions, or comments).
Chapter “Social Recommender Systems” describes two main types: recommendations of social media content and recommendations of people. For recommendations of social content, the chapter reviews various social content media domains, and provides a detailed case study and insights learned from a recommender system operated in the enterprise which suggests mixed social media items.
topics that should be considered and should be further investigated. The list includes: the need for explanation, privacy concerns, social relationships, trust and reputation, as well as the need to define special evaluation measures.
“Multimedia Recommender Systems: Algorithms and Challenges” the authors survey the state-of-the-art research related to multimedia RSs, focusing in particular on techniques that integrate item or user side information into a hybrid recommender.
number of open issues are related to the critical stage of acquiring reliable information about the user preferences in order to generate a useful user profile. It is clear that in many real-world applications of RSs, implicit feedback is much more readily available and requires no extra effort on the user’s side.
Hence, also when managing implicit feedback one has to consider the biases of the available data and the implication for the construction of effective and fair systems. These topics are further discussed in Chapters “Novelty and Diversity in Recommender Systems”, “Value and Impact of Recommender Systems”, “Multistakeholder Recommender Systems”, and “Fairness in Recommender Systems”. But more effective solutions for tackling it must be further developed.
in certain application domains explicit feedback is still central. This is clearly illustrated in Chapters “Social Recommender Systems”, “Group Recommender Systems: Beyond preference aggregation”, and “People-to-People Reciprocal Recommenders”. These are application domains or settings where the RS cannot totally operate as a black box; users also interact with one another, not only with items.
In these applications the RS acts as a mediator between users, enabling the users to better understand each other and browse recommendations that are often referring to actions or preferences of other users. This is clearly seen in a group RS where the goal of the system is to support the choices of a group of users, which often requires the reciprocal understanding
Therefore, simplifying the cognitive cost of preference acquisition is of primary importance.
aside from the algorithms, which are used to predict the user preference and behaviour and to compute the recommendations, the mechanism through which users provide their input and the means by which they receive the system’s output play a significant role, and can play an even larger role, in determining the success or failure of a recommender system.
Moreover, users often do not know or do not reflect on their preferences beforehand, especially when users approach an RS for information discovery. In such cases, the system-supported interaction and visualization contribute to the user construction of their preferences within a specific recommendation session.
Similarly, for context-aware recommendation, where the prediction depends on the current situation (e.g., location, mood, time), the context can be added as additional variables to a factorization machine, resulting in a context-aware scoring function.
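As an illustration of adding context variables to a factorization machine, the sketch below scores a one-hot feature vector with the standard second-order form, bias plus linear terms plus pairwise factor interactions. The feature names, weights, and the function name `fm_score` are hypothetical; in practice the parameters would be learned from data rather than set by hand.

```python
def fm_score(x, w0, w, v):
    """Second-order factorization machine score.

    x: dict feature name -> value (sparse input)
    w0: global bias, w: dict feature -> linear weight
    v: dict feature -> list of k latent factors
    """
    linear = sum(w[j] * xj for j, xj in x.items())
    k = len(next(iter(v.values())))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{j<j'} v_jf v_j'f x_j x_j' = 0.5 * ((sum_j v_jf x_j)^2 - sum_j (v_jf x_j)^2)
        s = sum(v[j][f] * xj for j, xj in x.items())
        s_sq = sum((v[j][f] * xj) ** 2 for j, xj in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical parameters for one user, one item, and one context value.
w0 = 0.5
w = {"user=tom": 0.1, "item=7": 0.1, "time=weekend": 0.1}
v = {"user=tom": [1.0, 0.0], "item=7": [1.0, 1.0], "time=weekend": [0.0, 2.0]}

without_context = fm_score({"user=tom": 1.0, "item=7": 1.0}, w0, w, v)
with_context = fm_score({"user=tom": 1.0, "item=7": 1.0, "time=weekend": 1.0}, w0, w, v)
print(without_context, with_context)
```

Adding the `time=weekend` variable changes the score through its own linear weight and through its interactions with the user and item factors, which is what makes the scoring function context-aware.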
A good recommender system is often contextual, considering the situation at which a recommendation should be made, e.g., what product is the user currently browsing on, or time information such as weekday vs weekend.
Item recommendation can be formulated as a context-aware ranking problem where the whole set of items should be ordered given a query context. In traditional collaborative filtering, the context would be the user; in more complex cases, the context might be user and time, a user and location, the sequence of previously selected items by a user, etc.
2  Problem Definition The goal of item recommendation is to retrieve a subset of interesting items for a given context c ∈ C from a set of items I. This chapter uses the general concept of a context as a placeholder to cover many common item recommendation scenarios. In the simplest case, the context could be the user, in more complex cases, the context might be user and time, a user and location, the sequence of previously selected items by a user, etc.
6.3  Dynamic User Model When a user interacts with a recommender system, the system should be responsive and change the recommendation based on a user’s feedback. For example, after watching a video, the system should be able to make better recommendations taking into account this new information.
Many traditional recommendation algorithms ignore the chronological order of users’ historical interactions. Nonetheless, user-item interactions are essentially sequential. A user’s short-term interests have a huge impact on her decisions. Time context (e.g., holidays, Black Friday, etc.) also affects user behavior. Moreover, item popularity is dynamic rather than static over time
The temporal dynamics call for sequence-aware recommender systems. Learning preference representation from the sequence of actions becomes the fundamental task in sequence-aware recommendation.
Wu et al. [124] propose recurrent recommender networks (RRN), which use LSTMs to capture the temporal dependencies for both users and items.
Intuitively, if a user has no activities for a relatively long time, her last action might have less impact on her current decision. Thus, Xu et al. [131] propose a time kernel to learn functional time representations in a self-attention model.
There are many other methods for temporal dynamics modeling. For example, Chen et al. [79] propose a hierarchical gating network to discriminate item importance based on users’ preferences via a gating mechanism.
Chen et al. [16] design a memory-augmented network based sequential recommendation algorithm. It utilizes an external memory matrix to store, access, and manipulate users’ historical records in a more explicit fashion.
The proposed framework DeepPage is able to adaptively optimize a page of items based on the user’s real-time actions. Zheng et al. [154] present DRN, a reinforcement learning based news recommender method which considers: (1) the dynamic changes of news content and user preferences, (2) the return patterns (to the service) of users, and (3) increasing the diversity of recommendations. Xian et al. [128] propose a reinforcement learning approach that finds paths in a knowledge graph to enhance the explainability of recommendations.
Zhao et al. [151] present a spatio-temporal gated network (STGN) for POI recommendation by enhancing the long short-term memory network with gating mechanisms. Specifically, a time gate and a distance gate are proposed to control the updates of short-term and long-term preference representations.
Many existing approaches to recommender systems focus on recommending the most relevant items to individual users and do not take into consideration any contextual information, such as time, place, and the company of other people (e.g., for watching movies or dining out).
it may not be sufficient to consider only users and items—it is also important to incorporate the contextual information into the recommendation process in order to recommend items to users under certain circumstances.
mobile app recommenders [73, 130], and many others. In particular, mobile recommender systems constitute an important special case of context-aware recommenders, where context is often defined by spatial-temporal information such as location and time, and there exists a large body of literature dedicated specifically to mobile recommender systems (e.g., see [69, 124, 130, 132, 134, 137] for a few representative examples).
the CEO of Netflix, pointed out in 2012 [63], Netflix can improve the performance of its recommendation algorithms up to 3% when taking into account such contextual information. More recently, it has been observed at Netflix that “contextual signals can be as strong as personal preferences;
Netflix considers the following types of contexts in their recommendation methods: location (country and region within the country), the type of device being used for watching videos, time, cultural/religious/national festivities, attention (if the user is focused on watching the video or it is being played in the background), companion with whom the video is being watched, outside events occurring at the same time (sport events, elections, etc.), weather, seasonality, and user’s daily patterns (e.g., commuting to work) [19].
For example, when choosing which songs to play for a given user, some streaming platforms infer the mood of the user based on the users’ short-term goals and their recent activities. In addition to the inferred mood, Spotify also relies on the following types of contexts in their recommendation algorithms when providing song recommendations, among others: day of the week, time of the day, user’s region, type of user’s device, and the platform on the device being used for listening [54, 58,
Similarly, LinkedIn uses various types of contextual information, including date, time, location, device/platform and page, to provide career-related recommendations [11]. In fact, context plays a central role in LinkedIn recommendations, the goal for which has been explicitly formulated as “predict probability that a user will respond to an item in a given context” [emphasis added] [11].
user’s personal characteristics and item’s content attributes, can have an effect on user’s preferences for items. Moreover, aside from the user and item features, a number of other factors that reflect the user’s circumstances while consuming the items, may also impact these preferences, such as time, location and weather.
Turning to recommender systems, one of their goals is estimating user’s utility,
We start with the traditional and popular representational approach to modeling contextual information in Sect. 2.2, explore and describe a broader classification of modeling contextual factors in Sect. 2.3, and discuss the ways to design and obtain contextual factors in Sect. 2.4.
It is important to note that Context can be modeled in CARS in a variety of different ways, as will be discussed in Sect. 2.3 in more detail. One popular approach—we refer to it as the traditional (representational) approach to context-aware recommender systems—assumes that the contextual information, such as time, location, and the company of other people, is explicitly described by a set of pre-defined contextual factors (sometimes called contextual dimensions, variables, or attributes), the structure of which does not change over time (i.e., is static).
User: the people to whom movies are recommended; it is defined by UserID, but can have additional user characteristics available (e.g., demographic and socioeconomic attributes).
Further, the contextual information consists of the following three contextual factors: Location: the location from which the user watches the movie that is represented by LocationType (“home”, “theater”, “airplane”, and “other”). Time: the time when the movie can be or has been seen; it is represented by Date. Depending on the relevant granularity for a given application, Date can also be aggregated in different ways, such as DayOfWeek (with values Mon, Tue, Wed, Thu, Fri, Sat, Sun) or TimeOfWeek (“weekday” and “weekend”).
Although this complexity can take many different forms, one popular defining characteristic is the hierarchical structure of contextual information that can be represented as trees, as is done in most of the context-aware recommender and profiling systems, including [4] and [99]. E.g., Example 1 already mentioned that the standard Date values (i.e., calendar dates) can be hierarchically aggregated to DayOfWeek and then further to TimeOfWeek.
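The Date hierarchy mentioned above can be sketched with the standard library. This is a small illustration, assuming that Saturday and Sunday make up the weekend; the helper names `day_of_week` and `time_of_week` are hypothetical.

```python
from datetime import date

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def day_of_week(d):
    """Aggregate a calendar Date to DayOfWeek (date.weekday() returns 0 for Monday)."""
    return DAYS[d.weekday()]

def time_of_week(d):
    """Aggregate a calendar Date one level further, to TimeOfWeek."""
    return "weekend" if d.weekday() >= 5 else "weekday"

for d in (date(2023, 12, 26), date(2023, 12, 30)):
    print(d.isoformat(), day_of_week(d), time_of_week(d))
```

Each level of the tree is a coarser view of the same observation, so a system can store the raw Date and pick the granularity that suits the application at recommendation time.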
In [4, 10], the authors proposed to treat the contextual information as part of a multidimensional data (MD) model within the framework of Online Analytical Processing (OLAP) used for modeling multidimensional databases deployed in data warehousing applications [44, 74]. Mathematically, the OLAP model is defined with an n-dimensional tensor (see Sect. 4 for subsequent discussion of tensors and their factorization in the context of CARS). In particular, in addition to the classical User and Item dimensions, additional contextual dimensions, such as Time, Location, etc., are also included as part of the tensor.
For example, rating R(101, 7, 1) = 6 in Fig. 2 means that for the user with User ID 101 and the item with Item ID 7, rating 6 was specified during the weekday.
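One simple way to hold such a multidimensional rating function is a sparse mapping keyed by (User, Item, Time). The sketch below stores only the single cell quoted above; the helper names are hypothetical, and the only Time encoding used is the one the text states (1 means weekday).

```python
# Sparse User x Item x Time rating tensor: only observed cells are stored.
ratings = {(101, 7, 1): 6}

# Only this Time value is given in the text; other codes are left undefined here.
TIME_LABELS = {1: "weekday"}

def rating(user_id, item_id, time_id):
    """Return R(user, item, time), or None if that cell is unobserved."""
    return ratings.get((user_id, item_id, time_id))

def slice_by_time(time_id):
    """Fix the Time dimension and return the remaining 2D User x Item ratings."""
    return {(u, i): r for (u, i, t), r in ratings.items() if t == time_id}

print(rating(101, 7, 1), TIME_LABELS[1])
print(slice_by_time(1))
```

Fixing one contextual dimension, as `slice_by_time` does, reduces the tensor to an ordinary User × Item rating matrix for that context.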
A broader classification of major approaches to modeling contextual information, i.e., classification that goes beyond the standard assumption of the explicit availability of predefined contextual factors with stable (static) structure, is based on the following two aspects of contextual factors [9]: (i) what a recommender system may assume (or know) about the structure of contextual factors, and (ii) how the structure of contextual factors changes over time.
a recommender system can have different levels of knowledge about the contextual factors. This may include knowledge of the list of relevant factors, their structure, and lists of their values.
Explicit: The contextual factors relevant to the application, as well as their structure and lists of their values are known explicitly. For example, in a restaurant application, the recommender system may use only DayOfWeek, TimeOfDay, Company, and Occasion contextual factors. For each of these factors, the system may know the relevant structure and the complete list of their values, such as using values Morning, Afternoon, Evening, Night for the TimeOfDay variable.
Latent: No information about contextual factors is explicitly available to the recommender system, and it makes recommendations by utilizing only the latent knowledge of context in an implicit manner. For example, the recommender system may build a latent predictive model, such as a hierarchical linear or hidden Markov model, to estimate unknown ratings, where context is modeled using latent variables.
Traditional (i.e., Explicit Static) Approach   As discussed in Sect. 2.2, this approach corresponds to the representational view of context [53], which assumes that all the contextual information in a given application can be modeled with a predefined, explicit, finite set of observable factors, where each factor has a well-defined structure and the structure does not change significantly over time (i.e., is static). The vast majority of the first generation of CARS papers has focused on this approach, and it still remains popular because of its simplicity and clarity.
Explicit Dynamic Approach   This approach represents recommendation settings where the explicit structure as well as the list of values of the contextual factor can change over time.
consider conversational recommender systems that can iteratively collect contextual information from the user. The list of types and values of such contextual information can also change dynamically over time as new contextual factors extracted from recent conversations are constantly added to the list.
Context Parsing method that extracts all relevant contextual factors along with their values from user reviews. However, not all the reviews contain information about the values of all contextual factors, which makes the observed data incomplete.
The first generation of CARS-related research was mostly focused on the traditional approach of modeling contextual information assuming static and explicitly defined structure.
new wave of research papers on this topic explores the potential of large amounts of data collected through various means, such as user logs or various types of sensors, largely following the latent static approach to modeling contextual factors. As we discuss in Sect. 4, recent work on the use of reinforcement learning in CARS started exploring the latent dynamic approach. More generally, however, the dynamic approaches (both explicit and latent) to modeling contextual factors have been highly under-explored and, thus, represent a strong potential for novel future research. We will discuss these issues further in Sect. 5.
The data values for contextual factors can also be obtained in several ways, including: Directly, i.e., by asking users direct questions or eliciting this information directly from other sources of contextual information. For example, a website may obtain contextual information by asking a user to answer some specific questions before providing a context-aware recommendation. Similarly, a smartphone app may obtain time, location, and motion data from the phone’s clock, GPS sensor, and accelerometer, respectively, and weather information can be obtained from a third-party resource by querying it with a specific time and location.
dynamic information about the user or item, such as information that user is on vacation this week, would often be considered and modeled as context.
for context-aware recommender systems to be useful, system designers must model context and choose contextual factors in a way that encompasses both types of aforementioned interactions.
order to provide genuine context-aware recommendations, the recommender system must be able to use the contextual information (i.e., context in which the user intends to consume, which can be elicited from the user, observed or imputed directly by the system) at the time of recommendation.
the corresponding contextual information (i.e., context in which the item was consumed by the user) also needs to be elicited from the user or observed directly by the system.
for a music streaming platform, the recommendation for the next song can be made based on the user’s self-reported current mood, and the user’s immediate consumption of the song can be viewed as occurring directly in that same context.
a contextual factor should be relevant for a substantial number of users and/or items. This means that the values of a relevant contextual factor should not be constant across different user-item experiences.
Naturally, not all the available contextual factors might be relevant or useful for recommendation purposes. Consider, for example, a book recommender system. Many types of contextual data could potentially be obtained by such a system from book buyers, including: (a) the purpose of buying the book (possible options include for work, for leisure, etc.); (b) planned reading time (weekday, weekend, etc.); (c) planned reading place (at home, at school, on a plane, etc.); (d) the value of the stock market index at the time of the purchase.
For example, for mobile recommendation applications, the following four general types of contextual information are often considered [9, 55]: physical context (e.g., time, position, activity of the user, weather, light conditions, temperature), social context (e.g., is the user alone or in the group, presence and role of other people around the user), interaction media context (e.g., device characteristics—phone/tablet/laptop/etc., media content type—text/audio/video/etc.), modal context (e.g., user’s state of mind—mood, experience, current goals).
Another approach for assessing relevance of contextual information has been proposed by Baltrunas et al. [23], who developed a survey-based instrument that asks the users to judge what their preferences would be in a wide variety of hypothetical (i.e., imagined) contextual situations. This makes it possible to collect richer contextual preference information in a short timeframe, evaluate the impact of each contextual factor on user preferences based on the collected data, and include in the resulting context-aware system only those factors that were shown to be important.
Although the collected data includes only hypothetical contextual preferences (i.e., preferences for items that users imagined consuming under certain contextual circumstances), the authors demonstrate that the resulting context-aware recommender system was perceived to be more effective by users as compared to the non-context-aware recommender.
In the first approach, systems typically use contextual information (obtained either directly from the user, e.g., by specifying current mood or interest, or from the environment, e.g., obtaining local time, weather, or current location) to query or search a certain repository of resources (e.g., restaurants) and present the resources that best match a given query (e.g., nearby restaurants that are currently open) to the user.
While both general approaches offer a number of research challenges, in the remainder of this chapter we will focus on the second, more recent trend of contextual preference elicitation and estimation in recommender systems.
In its general form, a canonical 2-dimensional (2D) (User × Item) recommender system can be described as a function, which takes partial user preference data as its input and produces a list of recommendations for each user as an output. Accordingly, Fig. 5 presents a general overview of the canonical 2D recommendation process, which includes three components: data (input), 2D recommender system (function), and recommendation list (output).
As mentioned in Sect. 2.2, canonical recommender systems are built based on the knowledge of partial user preferences presented in the form <user, item, rating>. In contrast, context-aware recommender systems are built based on the knowledge of partial contextual user preferences and typically deal with data records of the form <user, item, context, rating>, which also include the contextual information in which the item was consumed by this user (e.g., Context = Saturday). In addition, context-aware recommender systems may also make use of the structures of context attributes, such as context hierarchies (e.g., Saturday → Weekend) mentioned in Sect. 2.2. Based on the presence of this additional contextual data, several important questions arise: How should contextual information be reflected when modeling user preferences? Can we reuse the wealth of knowledge in canonical (non-contextual) recommender systems to generate context-aware recommendations?
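The record format and the context hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the class name `ContextualRating`, the `CONTEXT_PARENT` mapping, the helper functions, and the sample ratings are all assumptions made for this example and do not come from the handbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextualRating:
    """One <user, item, context, rating> record (hypothetical layout)."""
    user: str
    item: str
    context: str
    rating: float


# Hypothetical context hierarchy: each value maps to its more general parent.
CONTEXT_PARENT = {
    "Saturday": "Weekend",
    "Sunday": "Weekend",
    "Monday": "Weekday",
    "Weekend": "AnyTime",
    "Weekday": "AnyTime",
}


def generalize(context: str, levels: int = 1) -> str:
    """Climb the hierarchy by up to `levels` steps (e.g., Saturday -> Weekend)."""
    for _ in range(levels):
        parent = CONTEXT_PARENT.get(context)
        if parent is None:
            break  # already at the top of the hierarchy
        context = parent
    return context


def matches(record: ContextualRating, target_context: str) -> bool:
    """True if the record's context equals the target or rolls up to it."""
    current = record.context
    while current is not None:
        if current == target_context:
            return True
        current = CONTEXT_PARENT.get(current)
    return False


ratings = [
    ContextualRating("alice", "movie_a", "Saturday", 5.0),
    ContextualRating("alice", "movie_b", "Monday", 2.0),
    ContextualRating("bob", "movie_a", "Sunday", 4.0),
]

# Select every rating given in a context that falls under "Weekend".
weekend_ratings = [r for r in ratings if matches(r, "Weekend")]
```

Rolling a sparse context value up to its parent is one simple way to trade context specificity for more data per context; whether that trade is worthwhile depends on the application.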
2  Context in Recommender Systems 2.1  What Is Context? Context is a multifaceted concept that has been studied across different research disciplines, including computer science (primarily in artificial intelligence and ubiquitous computing), cognitive science, linguistics, philosophy, psychology, and organizational sciences. In fact, an entire conference, CONTEXT, is dedicated exclusively to studying this topic and incorporating it into various other branches of science, including medicine, law, and business. Since context is a multidisciplinary concept, each discipline tends to take its own idiosyncratic view that is somewhat different from other disciplines and is more specific than the standard generic dictionary definition of context as “interrelated conditions in which something exists or occurs”. Therefore, there exist many definitions of context across various disciplines and even within specific subfields of these disciplines.
The settings where the contextual factors are stable over time are classified as static, whereas the factors changing over time are classified as dynamic
The latent approach is mostly used in order to represent context in an efficient and reduced manner from high-dimensional data, where the relationships between the original contextual features are revealed. Because the contextual factor structure is stable, it can be modeled with latent variables, mostly in the form of a vector containing numeric attributes.
the CEO of Netflix, pointed out in 2012 [63], Netflix can improve the performance of its recommendation algorithms up to 3% when taking into account such contextual information.
More recently, it has been observed at Netflix that “contextual signals can be as strong as personal preferences; …make them central to your system and infrastructure” [19].
unstructured latent contextual vector from its leaf to
Several popular approaches to train recommender systems from implicit feedback data were presented. An advantage of sampling based approaches is that they can be applied to most recommender models.
In any case, properly tuning and setting up models can be more important than switching to a more complex approach [7, 21, 22].
Video features can also be leveraged to improve recommendations. Usually, videos are converted into a sequence of frames and audio waves. As such, CNN-based models become a desirable option for video analysis.
Xu et al. [17] propose a key frame recommender system to select the key frames from a video for each user.
Explainable recommender systems not only generate personalised recommendations but also produce intuitive explanations for the recommendations.
Entertainment Covington et al. [23], Chen et al. [12], Van et al. [85], Huang et al. [56], Wang et al. [121], Cheng et al. [19], Ying et al. [138], Yang et al. [135]
Regardless of what approach is chosen to model contextual factors in a specific context-aware recommender system, a common challenge for the system designers is that there are many candidate contextual factors available for consideration. Thus, the question of which information observable by a recommender system should be considered as relevant,
Still another approach toward inferring contextual information, albeit for non-RS related problems, was proposed in [75] where temporal contexts were discovered in web-sessions by decomposing these sessions into non-overlapping segments, each segment relating to one specific context. These contexts were subsequently identified using certain optimization and clustering methods.
Contextual Factors Should Not Be Static Properties of Users and Items   The first guideline represents an observation that, in order to model a certain data attribute as a contextual factor, it should truly be representative of context and not be a static characteristic of users or items.
the user’s dietary restrictions (e.g., vegetarianism or peanut allergy) would be more appropriate to model as user attributes, while the user’s intended company for the meal (e.g., with a significant other vs. with small children vs. alone) would be more appropriate to model as contextual factors.
Recommendation (system-to-user interaction): the system provides recommendations to the users of the items predicted to be most relevant to them; Feedback (user-to-system interaction): the users provide feedback to the system about their preferences for the consumed items.
In particular, for the delayed consumption recommender systems, where the recommendation interactions and the feedback interactions may occur a substantial time apart, the available contextual information can be significantly different. For example, for a restaurant recommendation application, at the recommendation time it may be more useful to ask for the user’s current food-related desires and moods (e.g., “I am in the mood for authentic Italian pizza”). However, at the feedback time, typically more specific contextual details that affected the actual restaurant experience can be collected, e.g., from the user’s restaurant review (such as “I ended up getting spaghetti and meatballs”, “it was too hot—air conditioning was not working properly”, and “the waiter was rude”). Modeling the “common” view of contextual information that combines the recommendation-time and the feedback-time perspectives represents an important challenge for context-aware recommender systems designers, especially in the delayed consumption applications. In contrast, this challenge is typically less pronounced for instant consumption recommender systems, where the recommendation interactions and feedback interactions occur in close temporal proximity, and where the available contextual information can be treated as essentially identical. For example, for a music streaming platform, the recommendation for the next song can be made based on the user’s self-reported current mood, and the user’s immediate consumption of the song can be viewed as occurring directly in that same context.
In the presence of available contextual information, following the diagrams in Fig. 6, we start with the data having the form U × I × C × R, where C is an additional contextual dimension, and end up with a list of contextual recommendations i_1, i_2, i_3, … for each user.
Zheng et al. [141] use a similar approach (called context “relaxation”) for travel recommendations.
The predictions are done using these contextual micro-profiles instead of a single user model.
Amatriain [17] introduce the idea of micro-profiling (or user splitting), which splits the user profile into several (possibly overlapping) sub-profiles, each representing the given user in a particular context.
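A minimal sketch of the user-splitting idea follows, assuming each (user, context) pair is simply treated as a separate pseudo-user whose ratings can then be fed to any canonical 2D recommender. The function names, the `user@context` naming scheme, and the sample data are hypothetical. Note also that this simplified version produces disjoint sub-profiles, whereas the approach described above allows them to overlap.

```python
from collections import defaultdict


def split_users(ratings):
    """Turn <user, item, context, rating> records into 2D
    <pseudo_user, item, rating> triples, one pseudo-user per (user, context)."""
    return [
        (f"{user}@{context}", item, rating)
        for user, item, context, rating in ratings
    ]


def build_micro_profiles(ratings):
    """Group the split ratings into one item -> rating profile per pseudo-user."""
    profiles = defaultdict(dict)
    for pseudo_user, item, rating in split_users(ratings):
        profiles[pseudo_user][item] = rating
    return dict(profiles)


ratings = [
    ("alice", "news_podcast", "morning", 5.0),
    ("alice", "jazz_album", "evening", 4.0),
    ("alice", "rock_album", "evening", 2.0),
]

micro_profiles = build_micro_profiles(ratings)
# A request for alice in the evening would use micro_profiles["alice@evening"]
# rather than one profile that mixes her morning and evening behaviour.
```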
if a person wants to see a movie on a weekend, and on weekends she only watches comedies, the system can filter out all non-comedies from the recommended movie list. More generally, the basic idea for contextual post-filtering approaches is to analyze the contextual preference data for a given user in a given context to find specific item usage patterns (e.g., user Jane Doe watches only comedies on weekends) and then use these patterns to adjust the item list, resulting in more “contextual” recommendations, as depicted in Fig.
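The post-filtering idea in this passage (find a usage pattern for the user in the given context, then adjust the list produced by a context-free recommender) can be sketched as follows. This is an illustrative assumption rather than the method of any cited paper: the genre-share pattern, the `min_share` threshold, and all names and data are made up for the example.

```python
from collections import Counter


def genre_pattern(history, user, context):
    """Share of each genre among the items `user` consumed in `context`.

    `history` is a list of (user, context, genre) consumption events.
    """
    counts = Counter(g for u, c, g in history if u == user and c == context)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()} if total else {}


def post_filter(recommendations, item_genre, pattern, min_share=0.2):
    """Drop recommended items whose genre the user rarely consumes in this
    context. With no observed pattern, the list is returned unchanged."""
    if not pattern:
        return list(recommendations)
    return [
        item for item in recommendations
        if pattern.get(item_genre[item], 0.0) >= min_share
    ]


history = [
    ("jane", "weekend", "comedy"),
    ("jane", "weekend", "comedy"),
    ("jane", "weekend", "comedy"),
    ("jane", "weekday", "drama"),
    ("jane", "weekday", "comedy"),
]
item_genre = {"m1": "comedy", "m2": "drama", "m3": "comedy"}

# Output of some context-free 2D recommender (assumed here).
recommendations = ["m1", "m2", "m3"]

weekend_list = post_filter(
    recommendations, item_genre, genre_pattern(history, "jane", "weekend")
)
```

Filtering is the strictest form of adjustment; a softer variant would rerank the list by multiplying each item's score by the corresponding genre share.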
Each of these CM methods requires building a contextual profile prof(u, c) for user u in context c, and then using the contextual profiles of all the users to find the N nearest neighbors of user u in terms of these profiles in context c.
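As a rough sketch of the neighbor search described here, the snippet below assumes that prof(u, c) is simply the dictionary of ratings that user u gave in context c, and that profiles are compared with cosine similarity. Both choices are assumptions for illustration; the passage does not specify how the profiles are built or compared.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse item -> rating profiles."""
    dot = sum(p[i] * q[i] for i in set(p) & set(q))
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def nearest_neighbors(profiles, user, context, n):
    """Return the n users whose profile in `context` is most similar to
    prof(user, context). `profiles` maps (user, context) -> {item: rating}."""
    target = profiles.get((user, context), {})
    scored = [
        (cosine(target, prof), other)
        for (other, ctx), prof in profiles.items()
        if ctx == context and other != user
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:n]]


profiles = {
    ("alice", "weekend"): {"m1": 5.0, "m2": 1.0},
    ("bob", "weekend"): {"m1": 4.0, "m2": 1.0},
    ("carol", "weekend"): {"m1": 1.0, "m2": 5.0},
    ("bob", "weekday"): {"m1": 1.0},
}

neighbors = nearest_neighbors(profiles, "alice", "weekend", n=2)
```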
Adomavicius and Tuzhilin [5] present a method of extending a regression-based Hierarchical Bayesian (HB) collaborative filtering model of estimating unknown ratings proposed by Ansari et al. [15] in order to incorporate additional contextual dimensions, such as time and location, into the HB model.
Oku et al. [97] propose to incorporate additional contextual factors (such as time, companion, and weather) directly into recommendation space and use machine learning techniques to provide recommendations in a restaurant recommender system.
Furthermore, Oku et al. [97] empirically show that the context-aware SVM significantly outperforms the non-contextual SVM-based recommendation algorithm in terms of predictive accuracy and user’s satisfaction with recommendations.
Recent state-of-the-art context-aware methods represent the relations between users/items and contexts as a tensor, with which it is difficult to distinguish the impacts of different contextual factors and to model complex, non-linear interactions between contexts and users/items. Therefore, several attention-based recommendation models are used to enhance CARS through adaptively capturing the interactions between contexts and users/items and improve the interpretability of recommendations through identifying the most important contexts [52, 66, 92].
[92] proposed a neural model, named Attentive Interaction Network (AIN), to enhance CARS through adaptively capturing the interactions between contexts and users/items, and [52]
Context-aware sequential recommendations have been extensively used to monitor the evolution of user tastes over time, which helps to improve the quality of contextual recommendations [107].
Reinforcement Learning (RL)   Most recommendation models consider the recommendation process as static, which makes it difficult to capture users’ temporal intentions and to respond to them in a timely manner.
[54] proposed a contextual bandit algorithm to decide which content to recommend to users, where the reward function takes into consideration contextual information, and [84] proposed a framework for online learning and adaptation to sequential preferences within a listening session
Multi-armed bandit is the most thoroughly studied RL problem. It is inspired by slot machines in a casino: for a bandit (slot) machine with M arms, pulling arm i will result in a random payoff (reward) r, sampled from an unknown and arm-specific distribution p_i. The objective is to maximize the total reward of the user over a given number of interactions.
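The bandit setting above can be simulated with a short epsilon-greedy sketch. Epsilon-greedy is just one common strategy chosen here for illustration, and the passage does not prescribe it. The Bernoulli payoffs, the initial round in which each arm is pulled once, and the parameter names are likewise assumptions of this example.

```python
import random


def epsilon_greedy(payoff_probs, rounds, epsilon=0.1, seed=0):
    """Simulate an M-armed bandit with Bernoulli payoffs.

    payoff_probs[i] is the success probability of arm i; the learner does
    not see it and only observes sampled rewards.
    """
    rng = random.Random(seed)
    m = len(payoff_probs)
    counts = [0] * m    # number of pulls per arm
    values = [0.0] * m  # running mean reward per arm
    total_reward = 0.0

    for t in range(rounds):
        if t < m:
            arm = t  # pull every arm once to initialize the estimates
        elif rng.random() < epsilon:
            arm = rng.randrange(m)  # explore: pick a random arm
        else:
            arm = max(range(m), key=lambda i: values[i])  # exploit

        reward = 1.0 if rng.random() < payoff_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward

    return total_reward, counts, values


total, counts, values = epsilon_greedy([0.2, 0.5, 0.8], rounds=1000)
print(total, counts, values)
```

In a recommendation setting each arm would correspond to an item (or a recommendation policy), and a contextual bandit would additionally condition the arm choice on the observed context.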
Tripathi et al. [123] combine the potential of reinforcement learning and deep bidirectional recurrent neural networks for automatic personalized video recommendation. They present a context-aware collaborative filtering approach, where the intensity of user’s non-verbal emotional response toward the recommended video is captured through interactions and facial expression analysis.
In [121], a Q-learning-based travel recommender is proposed, where trips are ranked using a linear function of several content and contextual features including trip duration, price, and country, and the weights are updated using user feedback.
Context-awareness is being recognized as an important issue in many recommendation applications, which is evidenced by the increasing number of papers being published on this topic.
researchers have focused primarily on how to take advantage of contextual information in order to improve the quality of recommendations for different recommendation tasks and applications.
the main research issues, challenges, and directions can be broadly classified into the following four general categories [8]: Algorithms, i.e., developing recommendation algorithms that can incorporate contextual information into recommender systems in advantageous ways. Evaluation, i.e., in-depth performance evaluation of various context-aware recommendation approaches and techniques, their benefits and limitations. Engineering, i.e., designing general-purpose architectures, frameworks, and approaches to facilitate the development, implementation, deployment, and use of context-aware recommendation capabilities. Fundamentals, i.e., a deeper understanding of the notion of context and of modeling context in recommender systems.
In particular, [101] show that context-aware recommendation techniques outperform canonical (non-contextual) approaches in terms of accuracy, trust, and several economics-based performance metrics across most of their experimental settings.
Another interesting example of user studies with context-aware recommender systems is the work by Braunhofer et al. [31]. In this study, the users, after receiving a recommendation from a context-aware system that recommends points-of-interest, were asked to evaluate the system’s performance on the following two dimensions: “Does this recommendation fit my preference?” (i.e., “personalization” performance) and “Is this recommendation well-chosen for the situation?” (i.e., “contextualization” performance).
One of the major issues that slowed down the progress of the CARS field in the past was the limited availability of large-scale public datasets on which novel CARS-based methods could be evaluated. This situation has improved significantly over the last few years, as such datasets became publicly available. For example, the DePaulMovie [142] and LDOS-CoMoDa [94] datasets contain movie ratings collected along with contextual information; the InCarMusic [21], Frappe [20] and STS [32] datasets provide app usage logs.
Despite this progress, the CARS community would benefit significantly from additional publicly available, large-scale, context-oriented datasets, and this should also be an important priority for the research community.
the study by Hussein et al. [67, 68], where the authors introduce a service-oriented architecture that makes it possible to define and implement a variety of different “building blocks” for context-aware recommender systems, such as recommendation algorithms, context sensors, various filters and converters,
Another important “Engineering” aspect is to develop richer interaction capabilities with CARS that make recommendations more flexible. As compared to canonical recommender systems, context-aware recommenders have two important differences. The first is increased complexity, since CARS involve not only users and items in the recommendation process, but also various types of contextual information. Thus, the types of recommendations can be significantly more complex in comparison to the canonical non-contextual cases.
The second difference is increased interactivity, since more information (i.e., context) usually needs to be elicited from the user in the CARS settings. For example, to utilize the available contextual information, a CARS system may need to elicit from the user (Tom) with whom he wants to see a movie (e.g., girlfriend) and when (e.g., over the weekend) before providing any context-specific recommendations.
The combination of these two features calls for the development of more flexible recommendation methods that allow the user to express the types of recommendations that are of interest to them rather than consuming standard recommendations that are “hard-wired” into the recommendation engines provided by many current vendors.
The second requirement of interactivity also calls for the development of tools allowing users to provide inputs into the recommendation process in an interactive and iterative manner, preferably via some well-defined user interfaces (UI).
What are the tradeoffs of different modelling assumptions (e.g., static vs. dynamic context)? The recommender systems community has been moving towards studying some of these questions.
Moving forward, we believe that significantly more research is needed to better understand various aspects of latent and/or dynamic contextual information and how to leverage it in providing better recommendations. Another related and also under-explored research area is how to extract the contextual information from different data sources, such as user reviews, tweets, sensors, IoT devices, etc., that is not explicitly observed and/or recorded as context by the recommender system.
In addition, a user model in context-based RSs, as described in Chapter “Context-Aware Recommender Systems: From Foundations to Recent Developments”, is dynamic and changes based on contextual features such as temporal, location, mood or any other relevant features. Usually in context-aware RSs the users are known to the system and have user models that should be altered according to the contextual information to address the short-term preferences emerging in a specific context.
We should stress that utilizing the right data is fundamental to the performance of an RS. As described above, a variety of data and knowledge sources can be leveraged in various RSs techniques. The decision about the data to use for a system, and how to use it should be done carefully while considering availability of data, the recommendation algorithm, the required effort, and the available resources.
Explicit feedback is known to be more reliable than implicit feedback and provides a level of preference of the user for an item (e.g., ratings 1–5). However, explicit feedback is often not available, or very sparse, since many users would not bother to provide it. Ratings are the most popular form of explicit feedback data that RSs collect [59].
Users   Users of an RS, as we mentioned above, may have very diverse goals and characteristics. In order to personalize the recommendations and the human-computer interaction, RSs exploit a range of information about the users. This information can be structured in various ways. User data is said to constitute the user model [16, 30]. The user model profiles the user, i.e., encodes her preferences and needs. Various user modeling approaches have been used and, in a certain sense, an RS can be viewed as a tool that generates recommendations by building and exploiting user models [13, 14].
In context-aware RSs (see Chapter “Context-Aware Recommender Systems: From Foundations to Recent Developments”) the user model incorporates the contextual information to be utilized during the recommendation process in order to recommend items to users under specific contextual situations. For example, by using the temporal context, a travel recommender system would include contextual features describing the weather, and the time of year when the vacation was consumed, so that a vacation recommendation in winter can be very different from the one recommended in summer.
In Chapter “Deep Learning for Recommender Systems” the authors overview some key methods and highlight the impact of deep learning techniques in the recommender systems field. They present a range of challenging tasks in recommendation, such as cold-start problem, explainability, temporal dynamics, and robustness that can be addressed using deep neural networks.
Evidence suggests that people tend to rely more on recommendations from their friends than on recommendations from similar but anonymous individuals [69].
social recommender systems [31]
An essential goal of recommender systems is to help users make better choices [23, 34, 37].
research towards a more user-centric analysis. In this handbook, some already-mentioned contributions deal with these novel evaluation dimensions: Chapters “Fairness in Recommender Systems”, “Novelty and Diversity in Recommender Systems”, and “Value and Impact of Recommender Systems”.
Another notable example of RSs that emerged with the diffusion of new communication technologies are recommender systems related to the social web, and specifically those that target the social media domain. With the rise of social networks (e.g., Facebook, LinkedIn, Twitter, Flickr, and others), users are overloaded with information, activities and interactions.
Social recommender systems are RSs that aim at assisting the user in identifying relevant content (e.g., tweets, feeds or images), and engage only in relevant activities and interactions (e.g., discussions, or comments).
Chapter “Social Recommender Systems” describes two main types: recommendations of social media content and recommendations of people. For recommendations of social content, the chapter reviews various social content media domains, and provides a detailed case study and insights learned from a recommender system operated in the enterprise which suggests mixed social media items.
topics that should be considered and should be further investigated. The list includes: the need for explanation, privacy concerns, social relationships, trust and reputation, as well as the need to define special evaluation measures.
In Chapter “Multimedia Recommender Systems: Algorithms and Challenges” the authors survey the state-of-the-art research related to multimedia RSs, focusing in particular on techniques that integrate item or user side information into a hybrid recommender.
A number of open issues are related to the critical stage of acquiring reliable information about the user preferences in order to generate a useful user profile. It is clear that in many real-world applications of RSs, implicit feedback is much more readily available and requires no extra effort on the user’s side.
Hence, also when managing implicit feedback one has to consider the biases of the available data and the implication for the construction of effective and fair systems. These topics are further discussed in Chapters “Novelty and Diversity in Recommender Systems”, “Value and Impact of Recommender Systems”, “Multistakeholder Recommender Systems”, and “Fairness in Recommender Systems”. But more effective solutions for tackling it must be further developed.
in certain application domains explicit feedback is still central. This is clearly illustrated in Chapters “Social Recommender Systems”, “Group Recommender Systems: Beyond preference aggregation”, and “People-to-People Reciprocal Recommenders”. These are application domains or settings where the RS cannot totally operate as a black box; users also interact with each other, not only with items.
In these applications the RS acts as a mediator between users, enabling the users to better understand each other and browse recommendations that are often referring to actions or preferences of other users. This is clearly seen in a group RS where the goal of the system is to support the choices of a group of users, which often requires the reciprocal understanding
Therefore, simplifying the cognitive cost of preference acquisition is of primary importance.
aside from the algorithms, which are used to predict the user preference and behaviour, and compute the recommendations, the mechanism through which users provide their input and the means by which they receive the system’s output play a significant role and can play an even larger role in determining the success or failure of a recommender system.
Moreover, users often do not know or do not reflect on their preferences beforehand, especially when users approach an RS for information discovery. In such cases, the system-supported interaction and visualization contribute to the users’ construction of their preferences within a specific recommendation session.