# CUHK-STAT3009 Quiz 2

## Name (Print): ______               Student ID: ________

• This exam contains 4 Problems: Problem 1 (1.1); Problem 2 (2.1, 2.2, 2.3); Problem 3 (3.1, 3.2); Problem 4 (4.1).

• You have 45 minutes to complete this exam. NO LATE Submission!

## Problem 1 (Baseline Methods)

Given the glb_mean and user_mean methods below (identical to the code on GitHub):

```python
import numpy as np


class glb_mean(object):
    def __init__(self):
        self.glb_mean = 0

    def fit(self, train_rating):
        self.glb_mean = np.mean(train_rating)

    def predict(self, test_pair):
        pred = np.ones(len(test_pair))
        pred = pred * self.glb_mean
        return pred


class user_mean(object):
    def __init__(self, n_user):
        self.n_user = n_user
        self.glb_mean = 0.
        self.user_mean = np.zeros(n_user)

    def fit(self, train_pair, train_rating):
        self.glb_mean = train_rating.mean()
        for u in range(self.n_user):
            ind_train = np.where(train_pair[:, 0] == u)[0]
            if len(ind_train) == 0:
                # user u has no training ratings: fall back to the global mean
                self.user_mean[u] = self.glb_mean
            else:
                self.user_mean[u] = train_rating[ind_train].mean()

    def predict(self, test_pair):
        pred = np.ones(len(test_pair)) * self.glb_mean
        j = 0
        for row in test_pair:
            user_tmp, item_tmp = row[0], row[1]
            pred[j] = self.user_mean[user_tmp]
            j = j + 1
        return pred
```


### Given a training dataset, suppose we consider three baseline methods:

```python
### Method A: user-mean
user_ave = user_mean(n_user=n_user)
user_ave.fit(train_pair=train_pair, train_rating=train_rating)
pred_A = user_ave.predict(test_pair)

### Method B: glb mean + user mean
glb_ave = glb_mean()
glb_ave.fit(train_rating)
pred = glb_ave.predict(test_pair)

train_rating_cm = train_rating - glb_ave.predict(train_pair)
user_ave = user_mean(n_user=n_user)
user_ave.fit(train_pair=train_pair, train_rating=train_rating_cm)
pred_B = pred + user_ave.predict(test_pair)

### Method C: user mean + glb mean
user_ave = user_mean(n_user=n_user)
user_ave.fit(train_pair=train_pair, train_rating=train_rating)
pred = user_ave.predict(test_pair)

train_rating_cm = train_rating - user_ave.predict(train_pair)
glb_ave = glb_mean()
glb_ave.fit(train_rating_cm)
pred_C = pred + glb_ave.predict(test_pair)
```
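As a sanity check, Methods A and B can be compared directly in NumPy on a toy dataset (the ratings below are made up for illustration). For every user that appears in training, the global mean cancels out of Method B, so pred_A and pred_B coincide:

```python
import numpy as np

# Toy data (made up): 3 users, 5 training ratings
train_pair = np.array([[0, 0], [0, 1], [1, 0], [1, 2], [2, 1]])
train_rating = np.array([4.0, 2.0, 5.0, 3.0, 1.0])
test_pair = np.array([[0, 2], [1, 1], [2, 0]])

glb = train_rating.mean()  # Method B's first stage
user_means = np.array([train_rating[train_pair[:, 0] == u].mean()
                       for u in range(3)])

# Method A: each test user's training mean
pred_A = user_means[test_pair[:, 0]]

# Method B: global mean plus the user means of the centered ratings
resid = train_rating - glb
user_means_cm = np.array([resid[train_pair[:, 0] == u].mean()
                          for u in range(3)])
pred_B = glb + user_means_cm[test_pair[:, 0]]

print(np.allclose(pred_A, pred_B))  # True
```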


## Problem 2 (LFM)

### 2.1. Consider an LFM including all users (1, …, n) and items (1, …, m):

$(\widehat{\mathbf P}, \widehat{\mathbf{Q}}) = \text{argmin}_{P, Q} \frac{1}{|\Omega|} \sum_{(u,i) \in \Omega} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2 + \lambda \sum_{u=1}^n \| \mathbf{p}_u \|_2^2 + \lambda \sum_{i=1}^m \| \mathbf{q}_i \|_2^2$

### Consider an LFM:

$(1) \quad (\widehat{\mathbf P}, \widehat{\mathbf{Q}}) = \text{argmin}_{P, Q} \frac{1}{|\Omega|} \sum_{(u,i) \in \Omega} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2$

• Step 1. Based on (???), we can update $\mathbf{P}$ and $\mathbf{Q}$ sequentially: fixing $\mathbf{P}$, we solve for $\mathbf{Q}$ via $(2) \quad \min_{Q} \frac{1}{|\Omega|} \sum_{(u,i) \in \Omega} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2;$ then, fixing $\mathbf{Q}$, we solve for $\mathbf{P}$ via $(3) \quad \min_{P} \frac{1}{|\Omega|} \sum_{(u,i) \in \Omega} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2.$
• Step 2. Based on (???), (2)-(3) can be reduced to the item-wise/user-wise parallel updates (4)-(5), that is, $(4) \quad \widehat{\mathbf{q}}_i = \text{argmin}_{\mathbf{q}_i} \frac{1}{|\Omega|} \sum_{u \in \mathcal{U}_i} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2, \quad \text{for } i = 1, \cdots, m; \\ (5) \quad \widehat{\mathbf{p}}_u = \text{argmin}_{\mathbf{p}_u} \frac{1}{|\Omega|} \sum_{i \in \mathcal{I}_u} \big( r_{ui} - \mathbf{p}_u^T \mathbf{q}_i \big)^2, \quad \text{for } u = 1, \cdots, n.$
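One way to realize the alternating updates (4)-(5) is that, with the other factor fixed, each update is an ordinary least-squares problem. A minimal NumPy sketch on a fully observed toy matrix (the sizes, data, and lstsq-based solver are illustrative assumptions, not the course implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, K = 4, 5, 2  # toy sizes (assumed)

# Toy ratings generated from a true rank-K model, fully observed
P_true = rng.normal(size=(n, K))
Q_true = rng.normal(size=(m, K))
pairs = np.array([(u, i) for u in range(n) for i in range(m)])
ratings = np.sum(P_true[pairs[:, 0]] * Q_true[pairs[:, 1]], axis=1)

# Alternating least squares: solve (4) for each item, then (5) for each user
P_hat = rng.normal(size=(n, K))
Q_hat = rng.normal(size=(m, K))
for _ in range(50):
    for i in range(m):  # eq. (4): update q_i with P fixed
        mask = pairs[:, 1] == i
        Q_hat[i] = np.linalg.lstsq(P_hat[pairs[mask, 0]], ratings[mask], rcond=None)[0]
    for u in range(n):  # eq. (5): update p_u with Q fixed
        mask = pairs[:, 0] == u
        P_hat[u] = np.linalg.lstsq(Q_hat[pairs[mask, 1]], ratings[mask], rcond=None)[0]

pred = np.sum(P_hat[pairs[:, 0]] * Q_hat[pairs[:, 1]], axis=1)
rmse = np.sqrt(np.mean((pred - ratings) ** 2))
print(rmse)
```

On this noiseless rank-K toy matrix the training RMSE should shrink toward zero, illustrating that each sweep of (4)-(5) cannot increase the objective.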

## Problem 3 (Cross Validation)

### 3.2. If you get the following feedback from the LFM on a dataset, you may

```
Fitting Reg-LFM: K: 3, lam: 0.00010
Reg-LFM: ite: 0; diff: 0.527 RMSE: 0.939
Reg-LFM: ite: 1; diff: 0.050 RMSE: 0.892
Reg-LFM: ite: 2; diff: 0.042 RMSE: 0.854
Reg-LFM: ite: 3; diff: 0.021 RMSE: 0.836
Reg-LFM: ite: 4; diff: 0.010 RMSE: 0.828
Reg-LFM: ite: 5; diff: 0.006 RMSE: 0.823
Reg-LFM: ite: 6; diff: 0.003 RMSE: 0.820
Reg-LFM: ite: 7; diff: 0.002 RMSE: 0.819
Reg-LFM: ite: 8; diff: 0.001 RMSE: 0.818
Reg-LFM: ite: 9; diff: 0.001 RMSE: 0.817
Validation RMSE for LFM: 1.975
```
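In this log the training RMSE falls to 0.817 while the validation RMSE is 1.975; a gap of this size typically signals overfitting at a small lam, so K and lam should be chosen by validation error rather than training error. A minimal sketch of that selection (the grid and the RMSE values below are invented for illustration):

```python
# Hypothetical grid-search records: (K, lam, validation RMSE); values made up
results = [
    (3, 1e-4, 1.975),
    (3, 1e-2, 1.102),
    (3, 1.0, 0.934),
    (5, 1.0, 0.951),
]

# Select the configuration with the smallest validation (not training) RMSE
best = min(results, key=lambda r: r[2])
print(best)  # (3, 1.0, 0.934)
```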


## Problem 4 (Neural Network)

### Given a SideNCF network as follows:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class SideNCF(keras.Model):
    def __init__(self, numA, numB, numC, embedding_size, **kwargs):
        super(SideNCF, self).__init__(**kwargs)
        self.numA = numA
        self.numB = numB
        self.numC = numC
        self.embedding_size = embedding_size

        self.embeddingA = layers.Embedding(
            numA,
            embedding_size,
            embeddings_initializer="he_normal",
            embeddings_regularizer=keras.regularizers.l2(1e-2),
        )
        self.embeddingB = layers.Embedding(
            numB,
            embedding_size,
            embeddings_initializer="he_normal",
            embeddings_regularizer=keras.regularizers.l2(1e-2),
        )
        self.embeddingC = layers.Embedding(
            numC,
            embedding_size,
            embeddings_initializer="he_normal",
            embeddings_regularizer=keras.regularizers.l2(1e-2),
        )
        self.concatenate = layers.Concatenate()

    def call(self, inputs):
        A_vector = self.embeddingA(inputs[:, 0])
        B_vector = self.embeddingB(inputs[:, 1])
        C_vector = self.embeddingC(inputs[:, 2])
        D_vector = self.embeddingC(inputs[:, 3])
        dot_ = tf.tensordot(A_vector, B_vector, 2) + tf.tensordot(A_vector, C_vector, 2) + tf.tensordot(A_vector, D_vector, 2)
        return dot_
```
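The forward pass of `call` can be traced in plain NumPy (hypothetical table sizes and inputs; plain arrays stand in for the keras Embedding layers). Note that `tensordot` with `axes=2` contracts both the batch axis and the embedding axis, so `dot_` is a single scalar for the whole batch, not one prediction per row; note also that columns 2 and 3 both index `embeddingC`:

```python
import numpy as np

rng = np.random.default_rng(1)
batch, K = 4, 8
numA, numB, numC = 10, 6, 7

# Hypothetical embedding tables standing in for the Embedding layers
EA = rng.normal(size=(numA, K))
EB = rng.normal(size=(numB, K))
EC = rng.normal(size=(numC, K))
inputs = np.array([[0, 1, 2, 3], [1, 0, 4, 5], [2, 3, 1, 0], [3, 2, 0, 6]])

A = EA[inputs[:, 0]]
B = EB[inputs[:, 1]]
C = EC[inputs[:, 2]]
D = EC[inputs[:, 3]]  # same table as C, mirroring embeddingC above

# tensordot(x, y, 2) contracts the last two axes of x with the first two of y,
# i.e. sum(x * y): each term collapses the whole (batch, K) pair to one number
dot_ = np.tensordot(A, B, 2) + np.tensordot(A, C, 2) + np.tensordot(A, D, 2)
print(np.ndim(dot_))  # 0
```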