
Types of Recommendations

CS630 Representing and Accessing Digital Information
Recommender Systems
Thorsten Joachims Cornell University
Recommender Systems
• Task definition
• Item-to-Item Similarity
• User-to-User Similarity
• Recommendation
– Content-based methods
– Collaborative nearest neighbor methods
– Collaborative model-based methods
• Adaptive Recommendation for Search Engines
Motivation
• Matchmaking between users and items
– Filtering
– Exploration
– Marketing
– etc.
Example: Amazon
Data
• Explicit feedback
– Ratings
– Reviews
– Auctions
– etc.
• Implicit feedback
– Page visits
– Purchase data
– Browsing paths
– etc.
Types of Recommendations
• Item-to-Item associations
– More pages like this
– “Users who bought this book also bought X”
• User-to-User associations
– Which other user has similar interests?
• User-to-Item associations
– Rating history describes user
– Items are described by attributes
– Items are described by ratings of other users

Item-to-Item Recommendation
• Content-based approach
– Item is described by a set of attributes
  • Movies: e.g., director, genre, year, actors
  • Documents: bag-of-words
– Similarity metric defines relationship between items
  • e.g., cosine similarity
– Examples
  • “related pages” in search engine
  • Google News
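As a minimal sketch of the content-based approach, the following Python snippet describes documents as bag-of-words vectors and ranks them by cosine similarity. The tokenizer, the function names, and the three sample documents are illustrative assumptions, not part of the lecture.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Term counts from a crude lowercase whitespace tokenizer (an assumption)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like counts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(item_id, items):
    """All other items, ordered from most to least similar to item_id."""
    target = items[item_id]
    scores = [(cosine(target, vec), other)
              for other, vec in items.items() if other != item_id]
    return [other for _, other in sorted(scores, reverse=True)]

# Hypothetical item descriptions.
docs = {
    "d1": bag_of_words("support vector machine tutorial"),
    "d2": bag_of_words("support vector machine software"),
    "d3": bag_of_words("cooking pasta at home"),
}
related = most_similar("d1", docs)
```

For movies, the same similarity function could be applied to attribute vectors (director, genre, year, actors) instead of word counts.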
Item-to-Item Recommendation
• Collaborative filtering
– Item is described by user interactions
  • Matrix V of n (number of users) rows and m (number of items) columns
  • Elements of matrix V are user feedback
– Examples:
  • Rating given to item by each user
  • Users who viewed this item
– Similarity metric between items
  • e.g., cosine
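In the collaborative setting an item is represented by its column of V rather than by its attributes. The sketch below assumes binary "viewed" feedback and a small hypothetical matrix; the helper names are illustrative.

```python
import math

# Hypothetical feedback matrix V: n = 4 users (rows), m = 3 items (columns).
# 1 means "user viewed the item", 0 means no recorded interaction.
V = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]

def column(V, j):
    """The feedback of all users for item j."""
    return [row[j] for row in V]

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

def item_similarity(V, i, j):
    """Cosine similarity between the user-interaction columns of items i and j."""
    return cosine(column(V, i), column(V, j))

def also_viewed(V, i):
    """Items other than i, ordered by decreasing similarity to item i."""
    others = [j for j in range(len(V[0])) if j != i]
    return sorted(others, key=lambda j: item_similarity(V, i, j), reverse=True)
```

With this data, `also_viewed(V, 0)` places item 1 ahead of item 2, because two of the three users who viewed item 0 also viewed item 1.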
User-to-User Similarity
• User is described by his/her ratings
– Matrix V of n (number of users) rows and m (number of items) columns. Elements of matrix V are user feedback.
• Normalization
– Mean rating of user a: $\bar{v}_a = \frac{1}{|I_a|} \sum_{j \in I_a} v_{aj}$, where $I_a$ is the set of items rated by user a and $|I_a|$ = # of ratings
• Similarity measure between users
– Cosine
– Correlation
• Problems
– Data sparseness
– Unknown vs. unseen
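A hedged sketch of the correlation measure between two users follows. It assumes that missing ratings are stored as `None`, that each user's mean is taken over all items that user rated, and that the sums run only over items both users rated; other variants of these choices exist.

```python
import math

def mean_rating(ratings):
    """Mean over the items the user actually rated (None = no rating)."""
    rated = [r for r in ratings if r is not None]
    return sum(rated) / len(rated) if rated else 0.0

def correlation(u, v):
    """Pearson-style correlation over the items rated by both users."""
    mean_u, mean_v = mean_rating(u), mean_rating(v)
    common = [(a - mean_u, b - mean_v)
              for a, b in zip(u, v) if a is not None and b is not None]
    numerator = sum(a * b for a, b in common)
    denominator = math.sqrt(sum(a * a for a, _ in common) *
                            sum(b * b for _, b in common))
    return numerator / denominator if denominator else 0.0

# Hypothetical rows of V (ratings on a 1-5 scale, None = not rated).
alice = [5, 3, None, 1]
bob = [5, 3, 4, 1]
carol = [1, 3, None, 5]
```

Here `correlation(alice, carol)` is -1 because their mean-centered ratings are exact opposites. When two users share few or no rated items, the function falls back to 0, which is one simple way of coping with sparse data.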

Content-Based Recommendation
• Use the ratings as feedback
– Binary
– Ordinal
• Represent items using a set of features
– Movies: e.g., director, genre, year, actors
– Documents: bag-of-words
• Learn function that predicts the rating for un-rated items
– Learn one function per user
– Can use any machine learning method
• Strengths and Weaknesses?
Collaborative Nearest-Neighbor Methods
• Idea: Recommend items that similar users like
• User is described by his/her ratings
– Matrix V of n (number of users) rows and m (number of items) columns. Elements of matrix V are user feedback.
• Normalization
– Mean rating of user a: $\bar{v}_a = \frac{1}{|I_a|} \sum_{j \in I_a} v_{aj}$, where $I_a$ is the set of items rated by user a and $|I_a|$ = # of ratings
• Similarity measure between users
– Cosine (or Correlation)
• Prediction via linear combination
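The nearest-neighbor prediction via linear combination can be sketched as follows. The specific form used here, the user's own mean plus a similarity-weighted average of the other users' mean-centered ratings, is a common formulation and an assumption about what the slide showed; `None` marks a missing rating.

```python
import math

def mean_rating(ratings):
    rated = [r for r in ratings if r is not None]
    return sum(rated) / len(rated) if rated else 0.0

def correlation(u, v):
    """Correlation over co-rated items, using each user's overall mean."""
    mean_u, mean_v = mean_rating(u), mean_rating(v)
    common = [(a - mean_u, b - mean_v)
              for a, b in zip(u, v) if a is not None and b is not None]
    numerator = sum(a * b for a, b in common)
    denominator = math.sqrt(sum(a * a for a, _ in common) *
                            sum(b * b for _, b in common))
    return numerator / denominator if denominator else 0.0

def predict(V, a, j):
    """Predict user a's rating of item j from the other users who rated j."""
    base = mean_rating(V[a])
    weighted_sum, weight_total = 0.0, 0.0
    for i, row in enumerate(V):
        if i == a or row[j] is None:
            continue
        w = correlation(V[a], row)
        weighted_sum += w * (row[j] - mean_rating(row))
        weight_total += abs(w)
    if weight_total == 0:
        return base  # no usable neighbor: fall back to the user's mean
    return base + weighted_sum / weight_total

# Hypothetical ratings; user 0 has not rated item 2.
V = [
    [5, 3, None, 1],
    [5, 3, 4, 1],
    [1, 3, 2, 5],
]
prediction = predict(V, 0, 2)
```

Note that a negatively correlated neighbor also contributes: user 2 rated item 2 below their own mean, which pushes the prediction for user 0 above user 0's mean.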
Collaborative Model-Based Methods
• Idea
– Learn a model offline
– Use model to make predictions online
• Approach: Model joint density of user ratings
– Cluster users
– Approximate joint density with mixture model
• Approach: Learn conditional model for each item
– Learn prediction rules
– One rule for each item
Joint Density Modeling
• Idea: Estimate distribution of ratings via mixture model
• Assumptions:
– K disjoint user-interest classes
– Each user is in exactly one interest class
– Users within one class behave according to a simple model
• Prediction
– Classify user via mode
– Bayesian classification
• Extensions
– User can be in multiple classes (Hofmann & Puzicha, 1999)
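To make the two prediction options concrete, here is a small sketch under strong assumptions: the mixture model is taken as already estimated, ratings are binary (like / dislike), and within a class the ratings are treated as independent (a naive Bayes style model). None of these parameter values come from the lecture.

```python
# Hypothetical, already-estimated model with K = 2 classes and m = 3 items.
prior = [0.5, 0.5]            # P(class)
p_like = [
    [0.9, 0.8, 0.1],          # P(user likes item | class 0)
    [0.2, 0.1, 0.9],          # P(user likes item | class 1)
]

def posterior(observed):
    """P(class | observed ratings); observed maps item index -> 0 or 1."""
    joint = []
    for k, p_class in enumerate(prior):
        likelihood = p_class
        for item, liked in observed.items():
            p = p_like[k][item]
            likelihood *= p if liked else (1.0 - p)
        joint.append(likelihood)
    total = sum(joint)
    return [x / total for x in joint]

def predict_mode(observed, item):
    """Assign the user to the most probable class, then use that class's model."""
    post = posterior(observed)
    best = max(range(len(post)), key=post.__getitem__)
    return p_like[best][item]

def predict_bayes(observed, item):
    """Average the class models, weighted by the posterior over classes."""
    post = posterior(observed)
    return sum(p_k * p_like[k][item] for k, p_k in enumerate(post))

# A user who liked items 0 and 1; how likely is it that they like item 2?
seen = {0: 1, 1: 1}
```

For this user the posterior is concentrated on class 0, so both predictions for item 2 are low; the Bayesian average is slightly higher than the mode-based one because it keeps a small weight on class 1.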
Conditional Models
• Idea: Learn a prediction rule for each item
• Learning Problem
– Classification: Predict rating class [Heckerman et al., 2000]
– Regression: Predict rating score
– Ordinal Regression: Predict ranking of items [Cohen et al., 1999]
• Challenges:
– Handling missing ratings
– Computational expense for learning m models
– No ratings for new products
Cold-Start Problem
• Problem: new users have too few ratings for effective recommendation
• Idea: Combine ratings with other user attributes
– Demographic attributes
– Attributes from other domains
– Questionnaires
• Challenges:
– Designing combined models
– Trading-off user attributes with rating attributes

Evaluation
• Batch Evaluation
– Use historical data
– Split into training and test part on a per-user basis
– k ratings to describe user, remaining ratings for testing
– Problems?
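The batch protocol can be sketched as below: each test user is described by k randomly chosen ratings and the remaining ratings are held out. Scoring with mean absolute error and the `predict(observed, item)` interface are assumptions made for illustration; the slides do not name a metric.

```python
import random

def split_user(ratings, k, rng):
    """Split one user's {item: rating} dict into k observed and the rest held out."""
    items = sorted(ratings)
    rng.shuffle(items)
    observed = {i: ratings[i] for i in items[:k]}
    held_out = {i: ratings[i] for i in items[k:]}
    return observed, held_out

def mean_absolute_error(predict, data, k, seed=0):
    """Average |prediction - true rating| over all held-out ratings."""
    rng = random.Random(seed)
    errors = []
    for ratings in data.values():
        if len(ratings) <= k:
            continue  # nothing left to test on for this user
        observed, held_out = split_user(ratings, k, rng)
        for item, truth in held_out.items():
            errors.append(abs(predict(observed, item) - truth))
    return sum(errors) / len(errors) if errors else float("nan")

def user_mean(observed, item):
    """Trivial baseline predictor: the mean of the user's observed ratings."""
    return sum(observed.values()) / len(observed)

# Hypothetical historical data: {user: {item: rating}}.
data = {
    "u1": {"a": 4, "b": 4, "c": 4},
    "u2": {"a": 2, "b": 2},
}
score = mean_absolute_error(user_mean, data, k=2)
```

One of the problems hinted at on the slide is visible here: historical data only contains ratings for items the user chose to rate, so the held-out set is not a random sample of all items.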
• Online Evaluation
– Install recommender system in operational system
– Controlled experiment with control group
  • Does the recommender system increase sales?
  • Does the recommender system make users return more often?
  • etc.
Personalizing Search Engines
• Current Search Engines
– One-size-fits-all
– Hand-tuned retrieval function
• Hypothesis
– Different users need different retrieval functions
– Personalized retrieval: big gain
• Explicit Feedback
– Overhead for user
– Only few users give feedback -> not representative
• Implicit Feedback
– No overhead
– More difficult to interpret
Learning an Improved Retrieval Function
Assumption: If a user skips a link a and clicks on a link b ranked lower, then the user preference reflects rank(b) < rank(a).
Example (clicks on links 1, 3, and 7 in the ranking below): (3 < 2) and (7 < 2), (7 < 4), (7 < 5), (7 < 6)
1. Kernel Machines http://svm.first.gmd.de/
2. Support Vector Machine
3. SVM-Light Support Vector Machine http://ais.gmd.de/~thorsten/svm_light/
4. An Introduction to Support Vector Machines
5. Support Vector Machine and Kernel ... References
6. Archives of SUPPORT-VECTOR-MACHINES ...
7. Lucent Technologies: SVM demo applet
8. Royal Holloway Support Vector Machine
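The assumption above translates into a short routine that turns the clicks on one result list into pairwise preferences. The function name and the pair encoding `(b, a)`, read as "link b is preferred over link a", are illustrative choices.

```python
def preference_pairs(clicked):
    """Pairwise preferences implied by clicks on a ranked result list.

    clicked: 1-based ranks of the links the user clicked.
    Returns (b, a) pairs: clicked link b is preferred over every
    higher-ranked link a that the user skipped.
    """
    clicked = set(clicked)
    pairs = []
    for b in sorted(clicked):
        for a in range(1, b):
            if a not in clicked:
                pairs.append((b, a))
    return pairs

pairs = preference_pairs([1, 3, 7])
```

Clicks on links 1, 3, and 7 yield exactly the five pairs listed in the example. A click on link 1 produces no pair, since no higher-ranked link was skipped.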
Approach
• Learn a ranking function f, so that the number of violated pair-wise preferences is minimized.
• Form of ranking function: sort by
rsv(q, di) = w1 * (# of query words in title of di) + w2 * (# of query words in anchor) + … + wn * (page-rank of di) = w * Φ(q, di)
• Select f so that if user prefers di to dj for query q, then rsv(q, di) > rsv(q, dj)
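A linear retrieval function of this form is simple to apply once the weights are known. In the sketch below the three features (query words in title, query words in anchor text, page-rank) follow the slide, while the weight values and documents are made up for illustration.

```python
def rsv(w, phi):
    """Retrieval status value: dot product of weights and features Φ(q, d)."""
    return sum(w_i * x_i for w_i, x_i in zip(w, phi))

def rank(w, candidates):
    """candidates: {doc id: feature vector}; doc ids by decreasing rsv."""
    return sorted(candidates, key=lambda d: rsv(w, candidates[d]), reverse=True)

# Hypothetical weights and feature vectors for one query:
# [# query words in title, # query words in anchor, page-rank]
w = [1.0, 0.5, 2.0]
candidates = {
    "d1": [2, 0, 0.1],
    "d2": [0, 1, 0.9],
    "d3": [1, 1, 0.2],
}
ordering = rank(w, candidates)
```

Learning then amounts to choosing w so that this ordering agrees with as many observed preferences as possible.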
Ranking Support Vector Machine
• Find ranking function with low error and large margin
• Convex quadratic program
• Implemented as part of SVM-light
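For intuition only, the sketch below minimizes a pairwise hinge loss with an L2 penalty on w, which is the usual form of the ranking SVM objective. It uses plain subgradient descent as a stand-in for a quadratic program solver, so it is not how SVM-light works, and the constants `C`, `epochs`, and `lr` are arbitrary assumptions.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def train_ranking(pairs, dim, C=1.0, epochs=200, lr=0.01):
    """pairs: (phi_preferred, phi_other) feature vectors for the same query.

    Minimizes 0.5 * ||w||^2 + C * sum of hinge losses, where a pair incurs
    loss when w . (phi_preferred - phi_other) falls below the margin of 1.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        grad = list(w)  # gradient of the 0.5 * ||w||^2 term
        for better, worse in pairs:
            diff = [b - o for b, o in zip(better, worse)]
            if dot(w, diff) < 1.0:  # margin violated: hinge term is active
                grad = [g - C * d for g, d in zip(grad, diff)]
        w = [w_i - lr * g for w_i, g in zip(w, grad)]
    return w

def violated(w, pairs):
    """Number of preference pairs that w orders incorrectly."""
    return sum(1 for better, worse in pairs if dot(w, better) <= dot(w, worse))

# Hypothetical preference pairs in which the first feature is informative.
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([2.0, 1.0], [1.0, 1.0]),
    ([1.0, 0.5], [0.0, 0.5]),
]
w = train_ranking(pairs, dim=2)
```

On this toy data the learned w puts a positive weight on the first feature and orders all three pairs correctly.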

Experiment
• Meta-Search Engine
– Implemented meta-search engine “Striver” that re-ranks the top 100 results from Google, MSNSearch, Altavista, Hotbot, and Excite
• Experiment Setup
– User study on group of ~20 German machine learning researchers => homogeneous group of users
– Asked users to use the system like any other search engine
– Collected three weeks of clickthrough data for training the ranking SVM
– Tested on 2 further weeks
Which Search Engine is Better?
• Use Clickthrough Data as Implicit Feedback
• Approach: Experiment setup for “unbiased” clickthrough
– [Figure: two result lists, labeled Google and Learned, and the combined ranking presented to the user]
• Theoretical and Empirical Result:
– Clickthrough in combined ranking gives same results as explicit feedback under mild assumptions
Results
Ranking A   Ranking B   A better   B better   Tie   Total
Learned     Google      29         13         27    69
Learned     MSNSearch   18         4          7     29
Learned     Toprank     21         9          11    41
• Toprank: rank by increasing minimum rank over all 5 search engines
• Result:
– Learned > Google
– Learned > MSNSearch
– Learned > Toprank
Learned Weights
Weight   Feature
 0.60    cosine between query and abstract
 0.48    ranked in top 10 from Google
 0.24    cosine between query and the words in the URL
 0.24    doc ranked at rank 1 by exactly one of the 5 engines
 0.22    host has the name “citeseer”
 0.17    country code of URL is ".de"
 0.16    ranked top 1 by HotBot
-0.15    country code of URL is ".fi"
-0.17    length of URL in characters
-0.32    not ranked in top 10 by any of the 5 search engines
-0.38    not ranked top 1 by any of the 5 search engines
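The Toprank baseline used in the results (rank by increasing minimum rank over all engines) can be sketched in a few lines. Two details are assumptions not stated in the slides: a document missing from an engine's list is simply ignored for that engine, and ties are broken by document id.

```python
def toprank(rankings):
    """rankings: {engine: [doc ids in rank order]}.

    Returns all documents ordered by the best (minimum) rank that any
    engine assigned to them.
    """
    best = {}
    for ranked in rankings.values():
        for position, doc in enumerate(ranked, start=1):
            best[doc] = min(position, best.get(doc, position))
    return sorted(best, key=lambda d: (best[d], d))

# Hypothetical result lists from two engines.
merged = toprank({
    "Google": ["a", "b", "c"],
    "Hotbot": ["c", "d", "a"],
})
```

Documents "a" and "c" come first because each is ranked first by one engine, followed by "b" and "d" at minimum rank 2.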
