SLIDE 1

CoFi-points: Collaborative Filtering via Pointwise Preference Learning on User/Item-Set

Lin Li¹,², Weike Pan¹,²,* and Zhong Ming¹,²,*

lilin20171@email.szu.edu.cn, {panweike,mingz}@szu.edu.cn

¹ College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

² National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China

SLIDE 2

Introduction

Problem Definition

One-Class Collaborative Filtering (OCCF)

Input: (user, item) pairs, where an observed pair (u, i) denotes that user u has given relatively positive feedback on item i.

Goal: recommend a personalized ranked list of unobserved items from I\Iu for each user u.

SLIDE 3

Introduction

Challenges

Data: lack of negative feedback in the training data.

Model: how to facilitate effective pointwise preference learning for ranking-oriented tasks without sacrificing accuracy.

SLIDE 4

Introduction

Overview of Our Solution

1. We first propose a new preference assumption for implicit feedback, i.e., the pointwise preference assumption on user/item-set.

2. We then develop a novel recommendation solution with two specific algorithms based on the proposed assumption, i.e., CoFi-points: collaborative filtering via pointwise preference learning on user/item-set.

SLIDE 5

Introduction

Advantages of Our Solution

Our solution inherits the merit of preferences defined on a set of users or items, which have been empirically shown to be more accurate than preferences defined on a single user or item, and which yield substantial improvements in handling the uncertainty of implicit feedback.

Our solution adopts the pointwise preference assumption with different strategies for the observed and unobserved user/item-sets, which turns this limitation into an advantage we can benefit from.

Our pointwise scheme naturally offers better adaptability and extensibility than the pairwise one.

SLIDE 6

Introduction

Notations (1/2)

Table: Notations and explanations

n: number of users
m: number of items
U = {1, 2, ..., n}: the whole set of users
I = {1, 2, ..., m}: the whole set of items
u ∈ U: user ID
i, j ∈ I: item ID
R = {(u, i)}: training data of observed pairs
Rte = {(u, i)}: test data of observed pairs
Iu: the set of items that user u has interacted with
P ⊆ Iu: a randomly sampled observed item-set
A ⊆ I\Iu: a randomly sampled unobserved item-set
r̂ui: estimated preference of u on i
r̂uP: estimated preference of u on P
r̂uA: estimated preference of u on A

SLIDE 7

Introduction

Notations (2/2)

Table: Notations and explanations (cont.)

Ui: the set of users who have interacted with item i
G ⊆ Ui: a randomly sampled observed user-set that has interacted with item i
r̂Gi: estimated preference of user-set G on item i
r̂Gui: estimated fused preference of user-set G and user u on item i
Uu· ∈ R^{1×d}: latent vector of user u
Vi· ∈ R^{1×d}: latent vector of item i
bu ∈ R: bias of user u
bi ∈ R: bias of item i
ρ: sampling ratio for unobserved pairs
γ: learning rate
T: iteration number

SLIDE 8

Method

Existing Preference Assumptions

Pointwise preference assumption on item: rui = 1, ruj = 0, i ∈ Iu, j ∈ I\Iu. (1)

Pairwise preference assumption on item: rui > ruj, i ∈ Iu, j ∈ I\Iu. (2)

Pairwise preference assumption on item-set: ruP > ruA, P ⊆ Iu, A ⊆ I\Iu. (3)

Pairwise preference assumption on user-set: rGui > ruj, i ∈ Iu, G ⊆ Ui, j ∈ I\Iu. (4)

SLIDE 9

Method

Illustration

Figure: Illustration of different preference learning schemes, where John's observed items are Forrest Gump and A Beautiful Mind, Prince of Egypt and The Matrix are unobserved for John, and John, Jacky and Rebecca have all interacted with Forrest Gump.

Pairwise preference learning over items: (John, Forrest Gump) is preferred over (John, Prince of Egypt) and (John, The Matrix).

Pointwise preference learning over items: (John, Forrest Gump) is labeled 1; (John, Prince of Egypt) and (John, The Matrix) are labeled 0.

Pairwise preference learning over item-set: (John, <Forrest Gump, A Beautiful Mind>) is preferred over (John, <Prince of Egypt, The Matrix>).

Pointwise preference learning over item-set: (John, <Forrest Gump, A Beautiful Mind>) is labeled 1; (John, Prince of Egypt) and (John, The Matrix) are labeled 0.

Pairwise preference learning over user-set: (<John, Jacky, Rebecca>, Forrest Gump) is preferred over (John, Prince of Egypt) and (John, The Matrix).

Pointwise preference learning over user-set: (<John, Jacky, Rebecca>, Forrest Gump) is labeled 1; (John, Prince of Egypt) and (John, The Matrix) are labeled 0.

SLIDE 10

Method

Pointwise Preference Assumption on Item-set (1/2)

It is more natural to assume that a randomly sampled observed set is liked by a certain user, because it is more likely for a set to include some items that are liked by the user.

If John is a devoted fan of inspirational movies while Rebecca is an adventurer who enjoys diverse ones, then it would be obvious that Forrest Gump means much more to John compared with Rebecca. But a typical pointwise assumption assumes Forrest Gump contributes the same to John’s and Rebecca’s movie taste, i.e., (John, Forrest Gump) → 1 and (Rebecca, Forrest Gump) → 1.

SLIDE 11

Method

Pointwise Preference Assumption on Item-set (2/2)

Our assumption: ruP = 1, ruj = 0, P ⊆ Iu, j ∈ A, A ⊆ I\Iu, (5)

where P is an observed item-set and A is an unobserved item-set.

For the observed item-set P, we define the predicted preference of user u on item-set P in the same way as CoFiSet [Pan and Chen, 2013a], i.e., r̂uP = Σ_{i∈P} r̂ui / |P|.

For the unobserved item-set A, we define the preference of user u on each item j ∈ A separately.

SLIDE 12

Method

CoFi-points(i): Likelihood

Based on the pointwise preference assumption on item-set, we have the log-likelihood for a specific user u as follows:

ln { p((u, P)|Θ)^{ruP} [ Π_{j∈A} (1 − p((u, j)|Θ))^{1−ruj} ]^{1/|A|} }. (6)

We use the sigmoid function to approximate the probabilities, i.e., σ(r̂uP) for p((u, P)|Θ) and σ(r̂uj) for p((u, j)|Θ). We can then rewrite the log-likelihood as:

ln σ(r̂uP) + (1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)). (7)

SLIDE 13

Method

CoFi-points(i): Objective Function (1/2)

Finally, combining all possible observed item-sets and unobserved item-sets of each user u ∈ U, we reach the overall log-likelihood:

Σ_{u∈U} Σ_{P⊆Iu} Σ_{A⊆I\Iu} [ ln σ(r̂uP) + (1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)) ]. (8)

SLIDE 14

Method

CoFi-points(i): Objective Function (2/2)

Maximizing the overall log-likelihood in Eq.(8) is equivalent to solving the following optimization problem:

min_Θ Σ_{u∈U} Σ_{P⊆Iu} Σ_{A⊆I\Iu} [ fuP + fuA + R(Θ) ], (9)

where fuP = −ln σ(r̂uP) and fuA = −(1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)) are the loss terms for an observed item-set P and an unobserved item-set A, respectively, and Θ = {Uu·, Vi·, bu, bi | u ∈ U, i ∈ I} denotes the model parameters to be learned.
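
To make the per-instance computation concrete, the following is a minimal NumPy sketch (not the authors' implementation) of the loss fuP + fuA in Eq.(9) for one sampled triple (u, P, A), assuming the prediction rule r̂ui = Uu·Vi·^T + bu + bi of Eq.(15); the array names U, V, bu, bi and the function name are illustrative, and the regularization term R(Θ) is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cofi_points_i_loss(U, V, bu, bi, u, P, A):
    """f_uP + f_uA of Eq.(9) for one sampled user u, observed item-set P
    and unobserved item-set A (arrays of item indices)."""
    # Pointwise predictions r_ui for the items in P and in A.
    r_P = V[P] @ U[u] + bu[u] + bi[P]
    r_A = V[A] @ U[u] + bu[u] + bi[A]
    # Item-set preference r_uP is the average of r_ui over i in P.
    r_uP = r_P.mean()
    f_uP = -np.log(sigmoid(r_uP))                # -ln sigma(r_uP)
    f_uA = -np.mean(np.log(1.0 - sigmoid(r_A)))  # -(1/|A|) sum_j ln(1 - sigma(r_uj))
    return f_uP + f_uA
```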

SLIDE 15

Method

Pointwise Preference Assumption on User-set

Our assumption: rGui = 1, ruj = 0, u ∈ G, G ⊆ Ui, j ∈ A, A ⊆ I\Iu, (10)

where G is a randomly sampled set of users that have interacted with item i and A is an unobserved item-set.

For the observed user-set G, we define the preference of user-set G on item i in the same way as GBPR [Pan and Chen, 2013b], i.e., r̂Gi = Σ_{w∈G} r̂wi / |G|. We adopt a fused preference that combines the user-set preference with the individual preference, i.e., r̂Gui = p·r̂Gi + (1 − p)·r̂ui.

For the unobserved item-set A, we define the preference of user u on each item j ∈ A separately.
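
As a small illustration (a sketch under the same assumed factorization model as above, not the authors' code), the user-set preference r̂Gi and the fused preference r̂Gui could be computed as follows; the default value of p is only a placeholder.

```python
import numpy as np

def fused_preference(U, V, bu, bi, u, G, i, p=0.5):
    """r_Gui = p * r_Gi + (1 - p) * r_ui for a sampled user-set G of item i."""
    r_G = U[G] @ V[i] + bu[G] + bi[i]    # r_wi for every user w in G
    r_Gi = r_G.mean()                    # user-set preference r_Gi
    r_ui = U[u] @ V[i] + bu[u] + bi[i]   # individual preference r_ui
    return p * r_Gi + (1.0 - p) * r_ui
```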

SLIDE 16

Method

CoFi-points(u): Likelihood

The corresponding log-likelihood for a specific user u with a related user-set G and an unobserved item-set A ⊆ I\Iu is as follows:

ln { Prob((u, G, i)|Θ)^{rGui} [ Π_{j∈A} (1 − Prob((u, j)|Θ))^{1−ruj} ]^{1/|A|} }. (11)

Similarly, we use σ(r̂Gui) for Prob((u, G, i)|Θ) and σ(r̂uj) for Prob((u, j)|Θ). We can then rewrite the log-likelihood in Eq.(11) as follows:

ln σ(r̂Gui) + (1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)). (12)

SLIDE 17

Method

CoFi-points(u): Objective Function (1/2)

Finally, combining all possible observed user-sets and unobserved item-sets of each user u ∈ U, we reach the overall log-likelihood:

Σ_{u∈U} Σ_{i∈Iu} Σ_{G⊆Ui} Σ_{A⊆I\Iu} [ ln σ(r̂Gui) + (1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)) ]. (13)

SLIDE 18

Method

CoFi-points(u): Objective Function (2/2)

Our objective can be further converted to the following optimization problem:

min_Θ Σ_{u∈U} Σ_{i∈Iu} [ Σ_{G⊆Ui} fGui + Σ_{A⊆I\Iu} fuA ] + R(Θ), (14)

where fGui = −ln σ(r̂Gui) and fuA = −(1/|A|) Σ_{j∈A} ln(1 − σ(r̂uj)) are the loss terms for an observed item i of user u with a related user-set G and for an unobserved item-set A, respectively, and Θ = {Uu·, Vi·, bu, bi | u ∈ U, i ∈ I} denotes the model parameters to be learned.

SLIDE 19

Method

Optimization

We use stochastic gradient descent (SGD) to learn the model parameters in Eq.(9) and Eq.(14). In order to model the preference adequately, we conduct negative sampling on the unobserved items and control its strength with a sampling ratio ρ.
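
Below is a minimal sketch of one SGD update for CoFi-points(i) with explicit gradients of fuP + fuA plus L2 regularization; it is an illustrative reading of Eq.(9), not the authors' code, and the helper names and hyper-parameter defaults are assumptions. The negative sampling ratio ρ would control how many such unobserved item-sets are drawn per observed one over a full training pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step_cofi_points_i(U, V, bu, bi, u, I_u, all_items,
                           set_size=3, gamma=0.01, alpha=0.01, beta=0.01):
    """One SGD step of CoFi-points(i) for user u.

    I_u: items observed for u; gamma: learning rate;
    alpha/beta: tradeoff (regularization) parameters for factors/biases.
    """
    # Sample an observed item-set P and an unobserved item-set A.
    P = rng.choice(I_u, size=min(set_size, len(I_u)), replace=False)
    A = rng.choice(np.setdiff1d(all_items, I_u), size=set_size, replace=False)

    # Predictions and error terms.
    r_P = V[P] @ U[u] + bu[u] + bi[P]
    r_A = V[A] @ U[u] + bu[u] + bi[A]
    e_P = sigmoid(r_P.mean()) - 1.0   # d f_uP / d r_uP
    e_A = sigmoid(r_A) / len(A)       # d f_uA / d r_uj for each j in A

    # Gradient updates (the old U[u] is kept for the item updates).
    Uu_old = U[u].copy()
    U[u] -= gamma * (e_P * V[P].mean(axis=0) + e_A @ V[A] + alpha * U[u])
    V[P] -= gamma * (e_P / len(P) * Uu_old + alpha * V[P])
    V[A] -= gamma * (np.outer(e_A, Uu_old) + alpha * V[A])
    bu[u] -= gamma * (e_P + e_A.sum() + beta * bu[u])
    bi[P] -= gamma * (e_P / len(P) + beta * bi[P])
    bi[A] -= gamma * (e_A + beta * bi[A])
```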

SLIDE 20

Method

Prediction Rule

The predicted preference of user u on item i is:

r̂ui = Uu·Vi·^T + bu + bi. (15)
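
As a small usage sketch (illustrative names, not the authors' code), Eq.(15) can be applied to all items and the observed ones masked out to produce the personalized top-k ranked list described in the problem definition:

```python
import numpy as np

def recommend_top_k(U, V, bu, bi, u, I_u, k=5):
    """Rank the unobserved items of user u by r_ui = U_u V_i^T + b_u + b_i."""
    scores = V @ U[u] + bu[u] + bi             # r_ui for every item i
    scores[np.asarray(list(I_u))] = -np.inf    # exclude observed items I_u
    return np.argsort(-scores)[:k]             # item indices of the top-k list
```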

SLIDE 21

Experiments

Datasets

Table: Statistics of the datasets.

- MovieLens100K: n = 943, m = 1,682, |R| = 27,688, |Rte| = 27,687
- MovieLens1M: n = 6,040, m = 3,952, |R| = 287,641, |Rte| = 287,640
- UserTag: n = 3,000, m = 2,000, |R| = 123,128, |Rte| = 123,218
- Netflix5K5K: n = 5,000, m = 5,000, |R| = 77,936, |Rte| = 77,936

For each dataset we generate three copies of training, validation and test data: we first randomly select half of the observed pairs as training data; we then randomly take one (user, item) pair per user on average from the training data as a validation set for parameter tuning; we finally take the remaining half as test data.
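
A rough sketch of the described splitting procedure (half of the observed pairs for training, about one pair per user moved from training to validation, the remaining half for test); the function and variable names are illustrative, not the authors' preprocessing code.

```python
import random

def split_dataset(pairs, num_users, seed=0):
    """pairs: list of observed (user, item) pairs; returns (train, valid, test)."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]
    # About one (user, item) pair per user on average goes to validation.
    valid, train = train[:num_users], train[num_users:]
    return train, valid, test
```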

SLIDE 22

Experiments

Evaluation Metrics

We evaluate the performance via the following commonly used ranking-oriented top-k evaluation metrics:

- Precision@5
- Recall@5
- F1@5
- NDCG@5
- 1-call@5
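
For reference, a compact sketch of how these top-5 metrics could be computed for a single user from a ranked list and the user's test items; this follows common definitions and is not necessarily the exact evaluation code used in the paper.

```python
import numpy as np

def metrics_at_k(ranked, test_items, k=5):
    """Return Prec@k, Rec@k, F1@k, NDCG@k and 1-call@k for one user."""
    relevant = set(test_items)
    hits = [1.0 if item in relevant else 0.0 for item in list(ranked)[:k]]
    prec = sum(hits) / k
    rec = sum(hits) / len(test_items)
    f1 = 2 * prec * rec / (prec + rec) if prec + rec > 0 else 0.0
    dcg = sum(h / np.log2(pos + 2) for pos, h in enumerate(hits))
    idcg = sum(1.0 / np.log2(pos + 2) for pos in range(min(k, len(test_items))))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    one_call = 1.0 if any(hits) else 0.0
    return prec, rec, f1, ndcg, one_call
```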

SLIDE 23

Experiments

Baselines (1/2)

MF [Pan et al., 2008] is the basic matrix factorization method for modeling one-class feedback, which adopts a pointwise square loss defined on single items.

BPR [Rendle et al., 2009] is the seminal work on pairwise preference learning over two items, which performs very well in most cases.

LogisticMF [Johnson, 2014] is a recent and representative work on pointwise preference learning for recommendation with implicit feedback, which uses a logistic loss defined on single items.

SVD++ [Koren, 2008] is an advanced matrix factorization method with an expanded prediction rule defined on a target (user, item) pair as well as the items interacted with by the corresponding target user, which is then optimized via a pointwise square loss in the same way as the aforementioned MF method.

FISMauc [Kabbur et al., 2013] is a further extended matrix factorization method that represents each user by his/her interacted items for modeling one-class feedback, which is optimized via a pairwise loss.

SLIDE 24

Experiments

Baselines (2/2)

NeuMF [He et al., 2017] is a very recent state-of-the-art method built on a deep neural network that jointly trains a generalized matrix factorization (GMF) model and a multi-layer perceptron (MLP) for item ranking, where the input is the same as that of LogisticMF [Johnson, 2014].

NeuPR [Song et al., 2018] is also a deep learning based method with a similar architecture to that of NeuMF, but it is learned in a pairwise manner instead of the pointwise manner used in NeuMF.

CoFiSet [Pan and Chen, 2013a] relaxes the pairwise preference assumption in BPR by defining the preference on item-sets instead of on single items, which is shown to be more accurate in preference modeling and item recommendation.

GBPR [Pan and Chen, 2013b] is an improved preference learning method that extends the individual preference in BPR [Rendle et al., 2009] to group preference.

SLIDE 25

Experiments

Parameter Setting (1/2)

For the non-deep models, including MF, BPR, LogisticMF, SVD++, FISMauc, CoFiSet, GBPR and our CoFi-points, the best values of the hyper-parameters are searched in a similar and fair way:

- We search the tradeoff parameters (αu, αv, βu and βv) from {0.001, 0.01, 0.1}, the sampling ratio ρ from {0.2, 0.4, 0.6, 0.8}, and the iteration number T from {10^3, 10^4, 10^5}.
- We fix the number of latent dimensions d = 20 and the learning rate γ = 0.01.
- We fix the size of the user/item-sets in our CoFi-points as |P| = |G| = |A| = 3.
- For CoFiSet, we set |P| = 3 and |A| = 1, which is thus denoted as CoFiSet(SO), meaning "Set vs. One".
- For the value of p in the fused preference of GBPR and our CoFi-points(u), we use the best values reported in [Pan and Chen, 2013b].
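
The search described above can be summarized as a simple grid; this is a sketch whose dictionary keys mirror the slide's notation rather than any actual configuration file.

```python
param_grid = {
    "alpha_u": [0.001, 0.01, 0.1],    # tradeoff parameters
    "alpha_v": [0.001, 0.01, 0.1],
    "beta_u":  [0.001, 0.01, 0.1],
    "beta_v":  [0.001, 0.01, 0.1],
    "rho":     [0.2, 0.4, 0.6, 0.8],  # negative sampling ratio
    "T":       [10**3, 10**4, 10**5], # iteration number
}
fixed = {"d": 20, "gamma": 0.01, "set_size": 3}  # d, learning rate, |P| = |G| = |A|
```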

SLIDE 26

Experiments

Parameter Setting (2/2)

For the deep models NeuMF and NeuPR, we implement the methods using TensorFlow1 and keep the structure with the best reported performance in [He et al., 2017] and [Song et al., 2018], i.e., four layers for the MLP component, and tune the performance following the instructions in the original papers.

Notice that pre-training may help deep-learning-based methods achieve better performance, but for a fair comparison, all the compared methods are tested without pre-training in the experiments.

Besides, unlike the evaluation protocol used in [He et al., 2017], where only 100 sampled unobserved items are used to evaluate the final ranking performance, we rank all the unobserved items based on the predicted scores, as adopted in most factorization-based recommendation methods [Rendle et al., 2009, Kabbur et al., 2013].

1 https://www.tensorflow.org/
SLIDE 27

Experiments

Main Results (1/2)

Table: Recommendation performance of all the baselines above and our CoFi-points on MovieLens100K and MovieLens1M.

Metrics per row: Prec@5 | Rec@5 | F1@5 | NDCG@5 | 1-call@5

ML100K:
- MF: 0.3596±0.0078 | 0.0860±0.0040 | 0.1212±0.0054 | 0.3745±0.0087 | 0.7915±0.0194
- BPR: 0.3709±0.0066 | 0.0950±0.0014 | 0.1308±0.0026 | 0.3885±0.0107 | 0.8156±0.0015
- LogisticMF: 0.3629±0.0062 | 0.0944±0.0033 | 0.1294±0.0032 | 0.3832±0.0033 | 0.8124±0.0124
- SVD++: 0.3899±0.0098 | 0.0989±0.0050 | 0.1366±0.0060 | 0.4054±0.0095 | 0.8251±0.0210
- FISMauc: 0.3694±0.0082 | 0.0962±0.0018 | 0.1326±0.0022 | 0.3813±0.0103 | 0.8294±0.0086
- NeuMF: 0.3648±0.0085 | 0.0936±0.0048 | 0.1293±0.0054 | 0.3789±0.0094 | 0.8057±0.0176
- NeuPR: 0.3370±0.0030 | 0.0819±0.0028 | 0.1153±0.0022 | 0.3470±0.0045 | 0.7940±0.0060
- CoFiSet(SO): 0.4090±0.0077 | 0.1054±0.0036 | 0.1453±0.0043 | 0.4300±0.0065 | 0.8418±0.0093
- GBPR: 0.4051±0.0038 | 0.1046±0.0016 | 0.1445±0.0015 | 0.4201±0.0031 | 0.8414±0.0058
- CoFi-points(i): 0.4289±0.0060 | 0.1100±0.0017 | 0.1518±0.0028 | 0.4527±0.0047 | 0.8602±0.0076
- CoFi-points(u): 0.4072±0.0068 | 0.1047±0.0032 | 0.1448±0.0033 | 0.4280±0.0065 | 0.8396±0.0159

ML1M:
- MF: 0.4406±0.0044 | 0.0742±0.0004 | 0.1132±0.0011 | 0.4516±0.0046 | 0.8569±0.0028
- BPR: 0.4410±0.0008 | 0.0744±0.0003 | 0.1135±0.0003 | 0.4540±0.0009 | 0.8496±0.0047
- LogisticMF: 0.4410±0.0041 | 0.0745±0.0008 | 0.1136±0.0012 | 0.4544±0.0044 | 0.8542±0.0023
- SVD++: 0.4143±0.0022 | 0.0695±0.0005 | 0.1061±0.0006 | 0.4219±0.0025 | 0.8366±0.0030
- FISMauc: 0.3657±0.0068 | 0.0591±0.0009 | 0.0908±0.0012 | 0.3771±0.0067 | 0.7952±0.0029
- NeuMF: 0.3995±0.0105 | 0.0658±0.0019 | 0.1011±0.0026 | 0.4143±0.0100 | 0.8176±0.0105
- NeuPR: 0.3493±0.0090 | 0.0533±0.0041 | 0.0830±0.0052 | 0.3622±0.0087 | 0.7630±0.0206
- CoFiSet(SO): 0.4700±0.0023 | 0.0806±0.0007 | 0.1227±0.0006 | 0.4869±0.0024 | 0.8737±0.0023
- GBPR: 0.4494±0.0020 | 0.0781±0.0009 | 0.1188±0.0010 | 0.4636±0.0014 | 0.8670±0.0022
- CoFi-points(i): 0.4725±0.0030 | 0.0810±0.0007 | 0.1233±0.0007 | 0.4915±0.0024 | 0.8780±0.0013
- CoFi-points(u): 0.4747±0.0037 | 0.0813±0.0005 | 0.1238±0.0010 | 0.4926±0.0037 | 0.8766±0.0016

SLIDE 28

Experiments

Main Results (2/2)

Table: Recommendation performance of all the baselines above and our CoFi-points on UserTag and Netflix5K5K.

Metrics per row: Prec@5 | Rec@5 | F1@5 | NDCG@5 | 1-call@5

UserTag:
- MF: 0.2719±0.0037 | 0.0404±0.0009 | 0.0641±0.0013 | 0.2802±0.0036 | 0.5701±0.0042
- BPR: 0.2969±0.0025 | 0.0476±0.0008 | 0.0740±0.0006 | 0.3072±0.0017 | 0.6172±0.0008
- LogisticMF: 0.2776±0.0033 | 0.0444±0.0005 | 0.0692±0.0003 | 0.2857±0.0052 | 0.5764±0.0030
- SVD++: 0.2662±0.0029 | 0.0394±0.0006 | 0.0620±0.0008 | 0.2738±0.0022 | 0.5652±0.0040
- FISMauc: 0.2452±0.0166 | 0.0392±0.0021 | 0.0613±0.0037 | 0.2523±0.0190 | 0.5777±0.0144
- NeuMF: 0.2943±0.0076 | 0.0462±0.0008 | 0.0731±0.0013 | 0.3021±0.0086 | 0.6049±0.0100
- NeuPR: 0.2686±0.0071 | 0.0404±0.0016 | 0.0641±0.0022 | 0.2758±0.0072 | 0.5735±0.0128
- CoFiSet(SO): 0.2778±0.0051 | 0.0435±0.0013 | 0.0680±0.0013 | 0.2875±0.0034 | 0.5821±0.0068
- GBPR: 0.3011±0.0008 | 0.0491±0.0014 | 0.0766±0.0012 | 0.3104±0.0009 | 0.6226±0.0019
- CoFi-points(i): 0.3132±0.0027 | 0.0500±0.0014 | 0.0781±0.0014 | 0.3233±0.0027 | 0.6243±0.0017
- CoFi-points(u): 0.3112±0.0033 | 0.0506±0.0016 | 0.0786±0.0015 | 0.3233±0.0048 | 0.6329±0.0027

NF5K5K:
- MF: 0.2276±0.0003 | 0.0943±0.0015 | 0.1033±0.0009 | 0.2450±0.0017 | 0.5734±0.0042
- BPR: 0.2318±0.0006 | 0.0945±0.0012 | 0.1046±0.0002 | 0.2508±0.0006 | 0.5683±0.0016
- LogisticMF: 0.2220±0.0027 | 0.0930±0.0002 | 0.1013±0.0004 | 0.2420±0.0033 | 0.5653±0.0018
- SVD++: 0.2007±0.0029 | 0.0800±0.0012 | 0.0907±0.0015 | 0.2142±0.0037 | 0.5313±0.0050
- FISMauc: 0.1903±0.0023 | 0.0676±0.0019 | 0.0799±0.0022 | 0.1999±0.0028 | 0.5077±0.0071
- NeuMF: 0.2293±0.0078 | 0.0848±0.0016 | 0.0987±0.0033 | 0.2463±0.0077 | 0.5847±0.0143
- NeuPR: 0.1925±0.0030 | 0.0652±0.0022 | 0.0792±0.0019 | 0.2027±0.0025 | 0.5331±0.0151
- CoFiSet(SO): 0.2476±0.0013 | 0.0992±0.0017 | 0.1106±0.0013 | 0.2699±0.0027 | 0.5878±0.0036
- GBPR: 0.2411±0.0027 | 0.0979±0.0013 | 0.1095±0.0013 | 0.2611±0.0025 | 0.5844±0.0015
- CoFi-points(i): 0.2507±0.0042 | 0.1007±0.0024 | 0.1122±0.0023 | 0.2747±0.0054 | 0.5960±0.0094
- CoFi-points(u): 0.2540±0.0024 | 0.1013±0.0011 | 0.1135±0.0008 | 0.2775±0.0029 | 0.5980±0.0055

SLIDE 29

Experiments

Observations (1/2)

For the algorithms that incorporate the information of user/item-sets, i.e., CoFiSet(SO), CoFi-points(i), GBPR and CoFi-points(u), we can see that they outperform the other seven strong baselines, i.e., MF, BPR, LogisticMF, SVD++, FISMauc, NeuMF and NeuPR, in most cases. This shows the advantage of preference assumptions defined on user-sets or item-sets over those defined on single users or items only.

For the first pair of algorithms with preference defined on single users or items (i.e., BPR and LogisticMF), the second pair with preference defined on item-sets (i.e., CoFiSet(SO) and CoFi-points(i)), and the third pair with preference defined on user/item-sets (i.e., GBPR and CoFi-points(u)), we can see that the performance within each pair is close. This justifies the capacity of pointwise preference learning solutions, i.e., they can perform equally well as long as their flexibility in dealing with set preferences for observed and unobserved feedback is well exploited.

SLIDE 30

Experiments

Observations (2/2)

The very recent state-of-the-art methods NeuMF and NeuPR actually underperform other very competitive methods across the four datasets, which shows that more complex models such as neural methods may not always beat the simple pointwise or pairwise methods, although deep learning based recommendation models may be more competitive when additional information, such as textual or temporal information, is available. Notably, our CoFi-points is a simple and flexible solution.

Across the four datasets, for the algorithms that define the preference on sets of entities, i.e., CoFiSet(SO), GBPR, CoFi-points(i) and CoFi-points(u), the first two are built on a pairwise scheme and the last two on a pointwise assumption. Notice that pointwise preference learning methods are usually considered inferior for OCCF, yet our two proposed methods achieve comparable or even significantly better performance.

SLIDE 31

Experiments

Sensitivity of the Set Size

Figure: Recommendation performance (i.e., NDCG@5) of CoFi-points(i) with different set sizes (1 to 5), where |P| = |A|, on ML100K, ML1M, UserTag and NF5K5K.

Figure: Recommendation performance (i.e., NDCG@5) of CoFi-points(u) with different group sizes (1 to 5), where |G| = |A|, on ML100K, ML1M, UserTag and NF5K5K.

SLIDE 32

Experiments

Observations

Both CoFi-points(i) and CoFi-points(u) with a set/group, i.e., |P| = |G| = |A| ∈ {2, 3, 4, 5}, clearly outperform the variant with a single item/user (i.e., LogisticMF), which shows the effectiveness of setwise/groupwise preference under the pointwise preference learning scheme.

Except on MovieLens1M, the performance of CoFi-points(i) and CoFi-points(u) on the other three datasets shows a similar overall trend as the size increases, which means that the setwise and groupwise preferences share certain characteristics in dealing with one-class feedback.

SLIDE 33

Experiments

Sensitivity of the Number of Latent Dimensions

Figure: Recommendation performance (i.e., NDCG@5) of CoFi-points(i) and CoFi-points(u) with different numbers of latent dimensions d ∈ {10, 20, 30} on ML100K, ML1M, UserTag and NF5K5K.

We can see that the performance is relatively stable across different numbers of latent dimensions, which shows one merit of our CoFi-points, i.e., the final recommendation accuracy is not very sensitive to the parameter d.

SLIDE 34

Experiments

Effectiveness of Treating the Sampled Negative Items Separately

Figure: Recommendation performance (i.e., NDCG@5) of CoFi-points(i) and CoFi-points(u) on ML100K, ML1M, UserTag and NF5K5K when the sampled negative items in A are treated separately vs. not separately.

We can see that treating the sampled items in A separately is helpful across the four datasets. Compared with the items in a randomly sampled positive item-set P, the items in a randomly sampled negative item-set A tend to share much less similarity. It is likely that a user is interested in different aspects of different items in A, which makes the strategy of treating the items in A non-separately inappropriate.

SLIDE 35

Related Work

Related Work

Pairwise Preference Learning on Items, e.g., BPR [Rendle et al., 2009]

Pointwise Preference Learning on Items, e.g., LogisticMF [Johnson, 2014]

Pairwise Preference Learning on Item-Set, e.g., CoFiSet [Pan and Chen, 2013a]

Pairwise Preference Learning on User-Set, e.g., GBPR [Pan and Chen, 2013b]

Pointwise Preference Learning on User-Set or Item-Set: the proposed CoFi-points(u) and CoFi-points(i)

SLIDE 36

Conclusions and Future Work

Conclusions

We study an important recommendation problem, i.e., collaborative filtering with implicit feedback, via pointwise preference learning on user/item-sets.

We propose a new pointwise preference assumption defined on user/item-set, and then develop a novel, simple and flexible recommendation solution called CoFi-points.

We derive two specific recommendation algorithms, i.e., CoFi-points(i) and CoFi-points(u), which represent the preferences by incorporating correlations among items and interactions among users, respectively.

SLIDE 37

Conclusions and Future Work

Future Work

We will study the complementarity of user-set and item-set in modeling users' implicit feedback via pointwise preference learning.

We plan to study sophisticated set construction methods by exploiting user-item subgroups [Bu et al., 2016], as well as other learning paradigms and frameworks.

We are interested in integrating different types of auxiliary data, such as social and cross-domain information [Zheng, 2015, Jiang et al., 2015], into the task of modeling implicit feedback.

We are also interested in exploring the effect of our proposed preference assumption on user/item-sets when it is integrated with listwise ranking-oriented algorithms [Shi et al., 2012, Wu et al., 2018].

SLIDE 38

Thank you

Thank you!

We thank the handling Associate Editor and Reviewers for their efforts and constructive and expert comments, and acknowledge the support of the National Natural Science Foundation of China (Nos. 61872249, 61836005 and 61672358).

SLIDE 39

References

Bu, J., Shen, X., Xu, B., Chen, C., He, X., and Cai, D. (2016). Improving collaborative recommendation via user-item subgroups. IEEE Transactions on Knowledge and Data Engineering, 28(9):2363–2375.

He, X., Liao, L., Zhang, H., Nie, L., Hu, X., and Chua, T.-S. (2017). Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, WWW '17, pages 173–182.

Jiang, M., Cui, P., Chen, X., Wang, F., Zhu, W., and Yang, S. (2015). Social recommendation with cross-domain transferable knowledge. IEEE Transactions on Knowledge and Data Engineering, 27(11):3084–3097.

Johnson, C. C. (2014). Logistic matrix factorization for implicit feedback data. In Proceedings of the Workshop on Distributed Machine Learning and Matrix Computations at NeurIPS 2014.

Kabbur, S., Ning, X., and Karypis, G. (2013). FISM: Factored item similarity models for top-N recommender systems. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '13, pages 659–667.

Koren, Y. (2008). Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '08, pages 426–434.

Pan, R., Zhou, Y., Cao, B., Liu, N. N., Lukose, R. M., Scholz, M., and Yang, Q. (2008). One-class collaborative filtering. In Proceedings of the 8th IEEE International Conference on Data Mining, ICDM '08, pages 502–511.

Pan, W. and Chen, L. (2013a). CoFiSet: Collaborative filtering via learning pairwise preferences over item-sets. In Proceedings of the SIAM International Conference on Data Mining, SDM '13, pages 180–188.

SLIDE 40

References

Pan, W. and Chen, L. (2013b). GBPR: Group preference based Bayesian personalized ranking for one-class collaborative filtering. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence, IJCAI '13, pages 2691–2697.

Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-Thieme, L. (2009). BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, UAI '09, pages 452–461.

Shi, Y., Karatzoglou, A., Baltrunas, L., Larson, M., Oliver, N., and Hanjalic, A. (2012). CLiMF: Learning to maximize reciprocal rank with collaborative less-is-more filtering. In Proceedings of the 6th ACM Conference on Recommender Systems, pages 139–146.

Song, B., Yang, X., Cao, Y., and Xu, C. (2018). Neural collaborative ranking. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM '18, pages 1353–1362.

Wu, L., Hsieh, C.-J., and Sharpnack, J. (2018). SQL-Rank: A listwise approach to collaborative ranking. In Proceedings of the 35th International Conference on Machine Learning, pages 5311–5320.

Zheng, Y. (2015). Methodologies for cross-domain data fusion: An overview. IEEE Transactions on Big Data, 1(1):16–33.