Trustworthy User Recommendation Using Boosted Vector Similarity Measure

An online social network (OSN) is crowded with people and their huge number of posts, so filtering truthful content and identifying truthful content creators is a great challenge. An online recommender system helps to extract such information from the OSN and suggests valuable items or users. In reality, however, people place more belief in recommendations from people they trust than in those from untrusted sources. A system that draws recommendations from trusted people in a social network is called a Trust-Enhanced Recommender System (TERS). This paper proposes a Trust-Boosted Recommender System (TBRS) to address the challenge of identifying trusted users in a social network. The proposed system is a fuzzy multi-attribute recommender system that uses a boosted vector similarity measure to predict trusted users from social networks with reduced error. The performance of the proposed model is analyzed in terms of the accuracy measures precision@k and recall@k and the error measures MAE, MSE and RMSE. The evaluation shows that the proposed system outperforms other recommender systems, with minimum MAE and RMSE.


Introduction
Social networks are congested with a large number of posts such as blogs, reviews, opinions, images and videos. Extracting the required information from such a congested network is a difficult and time-consuming task. An online recommender system helps to retrieve the desired information from this crowded network. For example, Amazon's recommender system uses an item-to-item collaborative filtering approach for item recommendation. Similarly, Facebook, LinkedIn and other social networking sites examine the network of connections between a user and their friends to suggest new groups based on interest. The downside of such online recommender systems is that recommendations are generated from anonymous people who are similar to the target user, which does not guarantee that a recommendation comes from trusted people. Therefore, people tend to rely more on a trusted person's recommendation than on an online recommendation [21]. A recommender system designed for a trust network is called a trust-based recommender system.
When the trust model becomes vulnerable, the transparency of the trust rating is lost [13]. Critical analysis of content or web resources makes the trust rating transparent, and this is possible only through provenance. Provenance provides meta-information about the creation and processing of content. Thus, in this model, the trust rating is computed using provenance data derived from the W7 model. In addition, the trust ratings derived in the models [1,2,8,9,10,14] are single ratings or single preferences. For example, on a five-point rating scale, the trust value '4' represents high trust while the trust value '1' represents very low trust. A single rating or preference cannot express the multiple aspects of a user or item, which directly or indirectly reduces the recommendation quality. If instead the trust rating is derived using multiple criteria or features, such as 'Originality of the content', 'Timeliness of the post' and 'Relevancy of the content', rated for example as (4,3,2), the quality of recommendation improves.
The proposed recommender system handles the issues discussed above. First, trust is evaluated over multiple dimensions or attributes rather than a single dimension or attribute. If multiple aspects of users are analyzed for trust computation, the impact of the recommendation is stronger and positive. Next, the information gain of each attribute is used as a weight component and a weighted similarity measure is computed. The multiple dimensions are easily represented as a vector, and hence vector similarity measures such as Jaccard, Dice and Cosine are used. This similarity is then boosted by the user's trust degree (trust level) and the recommendation is made. The contributions of the proposed recommender system are as follows.
• Modeling the user
• Formation of the fuzzy vector space
• Finding preferences and recommending the top-k users
The structure of the paper is as follows. Section 2 briefly reviews the related research. The proposed recommender system is discussed in detail in section 3. Performance evaluation is discussed in section 4. Finally, the conclusion and future work are stated in section 5.

Related Work
The trust-based recommendation techniques depend on two important components, namely recommendation techniques and representation of trust models.

Trust-Enhanced Recommendation Techniques
Trust-enhanced recommendation algorithms are generally enhancements of standard recommendation techniques such as simple mean, Pearson weighted mean and Pearson collaborative filtering. The trust-enhanced methods receive recommendations from trusted peers, whereas the standard methods receive recommendations from ordinary users.
The most common trust-enhanced recommender strategy is to ask users to state explicitly their trust in other users. For instance, the Moleskiing recommender system [3] uses FOAF files that contain trust information on a scale ranging from 1 to 9. The trust model proposed by A. Abdul Rahman and S. Hailes [1] for virtual communities is grounded in real-world social trust characteristics, reputation or word-of-mouth. Falcone et al. proposed a fuzzy cognitive map model [8] to derive trust based on the belief values of an agent. This model shows how different components (beliefs) may change and how their impact can change depending on the specific situation and on the agent's personality. The aim of Golbeck's trust model [9] is to determine how much one person in the network should trust another person to whom they are not directly connected. This algorithm accurately analyses the opinions of the people in the system. The TidalTrust algorithm works on a trust-based weighted mean, which uses the trust values of users as weights for the ratings of other users.
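As a minimal sketch of the trust-based weighted mean mentioned above, the following Python function predicts a rating as the average of other users' ratings, weighted by the trust placed in each rater. The function name, the user identifiers and the numeric values are illustrative assumptions, and the sketch leaves out the step in which TidalTrust infers trust for users who are not directly connected.

```python
def trust_weighted_mean(ratings, trust):
    """Predict a rating as the mean of the raters' ratings, weighted by trust.

    ratings: {rater_id: rating given by that rater}
    trust:   {rater_id: trust the target user places in that rater}
    Raters without a trust value contribute nothing (weight 0).
    """
    numerator = 0.0
    denominator = 0.0
    for rater, rating in ratings.items():
        weight = trust.get(rater, 0.0)
        numerator += weight * rating
        denominator += weight
    if denominator == 0.0:
        return None  # no trusted rater, so no prediction can be made
    return numerator / denominator


# Hypothetical data: u3 is not trusted at all and is therefore ignored.
ratings = {"u1": 4.0, "u2": 2.0, "u3": 5.0}
trust = {"u1": 0.9, "u2": 0.1}
prediction = trust_weighted_mean(ratings, trust)
```

The highly trusted rater u1 dominates the prediction, which is the intended effect of using trust values as weights.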
Hang et al. [10] used a graph-based approach to recommend a node in a social network using similarity in trust networks. Massa and Avesani [14] proposed a trust-based recommendation system in which trustable users can be found by exploiting trust propagation over the trust network. Andersen et al. [2] explored an axiomatic approach to trust-based recommendation and proposed several recommendation models, some of which are incentive compatible. In the MoleTrust method, the similarity weight is attributed to ratings by users. O'Donovan and Smyth [4] use a trust-filtered collaborative filtering technique, in which the trust value serves as a filtering mechanism that selects only the item raters who are trusted above a certain threshold. The ensemble trust technique proposed by Victor et al. [17] aims to take into account all possible ways of obtaining a positive weight for a rater of an item while favoring trust over similarity.

Trust Model representation
Trust representations can be classified from three different perspectives, namely (i) probabilistic vs. gradual trust, (ii) single vs. multi-dimensional trust and (iii) trust vs. distrust. Probabilistic representations use probabilities to indicate how much trust one user places in another [17]; stronger trust corresponds to a higher probability.
Gradual representations [17] use continuous values to represent trust. The values are not restricted to a probability range, so they cannot be interpreted as probabilities; instead they directly indicate trust strength. Here, (u, v, t) denotes that the trust value from u to v is t. Trust is a complex concept with multiple dimensions, namely (i) multifaceted trust and (ii) trust evolution. Multi-dimensional trust representations extend single trust representations [11]. Trust is context dependent: trusting someone on one topic does not necessarily mean that he will be trusted on other topics. The trust value is represented as <u, v, f, p>, where u trusts v with probability p in the facet f. The authors also suggest that trust evolves as humans interact over time. Josang's subjective logic [12] explores a probabilistic model that considers both trust and distrust simultaneously. A gradual trust model for both trust and distrust can be found in [5,8,16]. Guha et al. [18] use a pair (t, d) with trust degree t and distrust degree d, and the final suggested trust value is obtained by subtracting d from t, i.e. t-d.

Proposed Recommender System
The proposed recommender system is built to recommend the top-k reviewers in a book-based social network. For this, data about the reviewers and their reviews is collected from Goodreads, Google Books and Amazon using ad hoc APIs and by scraping HTML pages. The fields collected from the social network are given in table 1. The collected data is preprocessed, and from it the trust score of each reviewer is computed using the W7 provenance model [22]. Then the reviewers are classified using a DoT-pruned Fuzzy Decision Tree (FDT) classifier [7] and fuzzy rules are generated. Finally, the fuzzy rules are combined with a target user's request to perform the recommendation. The major components of the proposed recommender system are as follows.
Since the data for PWHERE and PWHICH is not provided by the domain, these two elements cannot be modeled. Therefore, the core provenance elements taken for trust quantification are PWHAT, PWHEN, PWHO, PHOW and PWHY. The trust assessment algorithm quantifies these five provenance elements. The resulting trust value is then given to the learning model to classify the users into various trust levels.

Fuzzy Decision Tree Based Classification
The learning model takes the quantified provenance value obtained using the W7 model as its trust input. This input is fuzzified using a Triangular Membership Function (TMF), and the rule base is constructed using Mamdani's 'If... Then' interpretation. The Fuzzy Decision Tree (FDT) [19] takes the rule base and generates decision trees using the fuzzy ID3 algorithm [6]. To construct the FDT, two criteria need to be evaluated: a splitting criterion and a stopping criterion. The former helps to choose the root node and child nodes. The latter controls the growth of the tree.
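The fuzzification step can be illustrated with a short Python sketch of a triangular membership function. The breakpoints reuse the five grade fuzzy numbers tabulated later in this paper; applying them to the trust input at this stage is an assumption made for illustration, as are the function names and the sample input 0.6.

```python
# Triangular fuzzy numbers (a, b, c) per grade, as tabulated in this paper.
GRADE_FUZZY_NUMBERS = {
    1: (0.0, 0.0, 0.25),
    2: (0.0, 0.25, 0.50),
    3: (0.25, 0.50, 0.75),
    4: (0.50, 0.75, 1.0),
    5: (0.75, 1.0, 1.0),
}


def triangular_membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c), a <= b <= c."""
    if x == b:
        return 1.0  # peak; also covers the shoulder sets where a == b or b == c
    if x <= a or x >= c:
        return 0.0  # outside the support
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)  # falling edge


def fuzzify(x):
    """Return the membership degree of a crisp value x in every grade."""
    return {
        grade: triangular_membership(x, *abc)
        for grade, abc in GRADE_FUZZY_NUMBERS.items()
    }


# Hypothetical quantified trust input.
memberships = fuzzify(0.6)
```

With these breakpoints a crisp value belongs to at most two neighbouring grades, and the two degrees sum to one.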
In the FDT, the provenance element with the highest information gain is assigned as the root node, and each leaf node denotes a trust decision. Each distinct path from the root to a leaf produces a distinct rule. Each generated rule is assigned a Degree of Truth (DoT) [15] that states how much truth it holds. If DoT=1 the rule is absolutely true, and if DoT=0 the rule is absolutely false. Sample fuzzy rules are shown in figure 1. To obtain better accuracy with a minimum number of rules, the stopping criterion (β) is used. The values of β chosen are 0.9 and 1, and the lengths of the rules range from 2 to 5. Table 4 shows the number of rules generated. For example, rules #2 and #3 in figure 1 have length 2 and rule #5 has length 5. These fuzzy rules are taken as input to build the trust-boosted recommender system.
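The splitting criterion described above, where the attribute with the highest information gain becomes the root, can be sketched as follows. This is a simplification under stated assumptions: it uses crisp counts as in classical ID3, whereas fuzzy ID3 accumulates membership degrees instead of counts, and the linguistic values "high", "low", "early" and "late" as well as the four sample rows are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[attribute]].append(row[target])
    remainder = sum(
        len(part) / len(rows) * entropy(part) for part in partitions.values()
    )
    return base - remainder


def choose_root(rows, attributes, target):
    """Pick the attribute with the highest information gain as the root node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical training rows: PWHAT separates the trust levels, PWHEN does not.
rows = [
    {"PWHAT": "high", "PWHEN": "early", "trust": "HT"},
    {"PWHAT": "high", "PWHEN": "late", "trust": "HT"},
    {"PWHAT": "low", "PWHEN": "early", "trust": "LT"},
    {"PWHAT": "low", "PWHEN": "late", "trust": "LT"},
]
root = choose_root(rows, ["PWHAT", "PWHEN"], "trust")
```

The same gain values are what the recommender later reuses as attribute weights (AG) in the similarity measure.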

Trust-Boosted Recommender System
The proposed trust-boosted recommender system, which recommends trustworthy users to the target user UT, is shown in figure 2. The major components of this recommender system model are as follows. The target user or requester (UT) sends a query as a request (Rq) asking for recommendations from the trust network. The query is sent to the trust network, which checks whether UT is a new user. If UT is an existing user, the highly trusted users are recommended. Otherwise, the request is sent to the profile learner, where the profile data (Pdata) is updated based on the query and the existing profile information. The updated Pdata is then sent to the trust network. In the trust network, reviewers are grouped by trust level <VHGT, HT, MT, LT, VLWT>. The conditional attributes and the decision attributes are extracted from the set of fuzzy rules. For each conditional attribute, a fuzzy vector space (FVSP) is generated. The FVSP consists of tuples <Attribute, Preference-based Fuzzy Number>. Vector similarity measures such as Jaccard, Dice and Cosine are then computed to find how similar the target user is to the others in the trust network.

Fig. 2. Proposed Trust Boosted Recommender System
The gain value of each attribute (AG) is assigned as a weight component and applied to the three similarity measures mentioned above to calculate a weighted similarity value. This similarity is then boosted by the trust degree (trust level) of the corresponding decision attribute. Based on the boosted similarity value, the trusted users are ranked from highest to lowest. Finally, the top-k users are recommended to the target user (UT). After a recommendation, the target user's feedback is collected and the profile learner updates the Pdata accordingly. To collect the feedback, a set of feedback queries (FDqry) is formulated based on the five attributes PWHAT, PWHEN, PWHO, PHOW and PWHY. For each FDqry, users are asked to provide a quantitative value on a scale of 0 to 1. This recommendation process is repeated for each user request with the updated profile.

User Profile Learning Phase.
In the user modeling phase, the fuzzy rules are extracted from the rule database derived using the fuzzy decision tree (discussed in section 3.2). Each user may have one or more rules, as shown in figure 1. Using a rule matching algorithm, each user is assigned the matched rule(s). The users are then grouped according to the five trust levels mentioned above. For example, if users U109 and U169 are classified as Low Trust (LT), they are grouped under LT. A similar procedure is carried out for the rest of the users, who are grouped accordingly. If UT is an existing user, the user's details (profile) are already known and the user can access the trust network directly. If UT is a new user, the user's profile needs to be learned before accessing the trust network. The profile learning is depicted in figure 3.
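The rule matching and grouping described above might look like the following Python sketch. The rule format (an "if" part mapping attributes to linguistic terms and a "then" part naming a trust level), the specific rules, and the user U001 are hypothetical; the paper's rule matching algorithm is not spelled out, so exact term equality is assumed here.

```python
TRUST_LEVELS = ["VHGT", "HT", "MT", "LT", "VLWT"]


def match_rules(user_terms, rules):
    """Return the rules whose every condition equals the user's linguistic term."""
    return [
        rule
        for rule in rules
        if all(user_terms.get(attr) == term for attr, term in rule["if"].items())
    ]


def group_by_trust_level(users, rules):
    """Group user ids under the trust level decided by their matched rules."""
    groups = {level: [] for level in TRUST_LEVELS}
    for user_id, terms in users.items():
        for rule in match_rules(terms, rules):
            if user_id not in groups[rule["then"]]:
                groups[rule["then"]].append(user_id)
    return groups


# Hypothetical rules; the attribute/term pairings are invented for illustration.
rules = [
    {"if": {"PWHAT": "HD", "PWHO": "HUTR"}, "then": "LT"},
    {"if": {"PWHAT": "HSM"}, "then": "HT"},
]
users = {
    "U109": {"PWHAT": "HD", "PWHO": "HUTR"},
    "U169": {"PWHAT": "HD", "PWHO": "HUTR"},
    "U001": {"PWHAT": "HSM", "PWHO": "NTR"},
}
groups = group_by_trust_level(users, rules)
```

Here U109 and U169 end up under LT, mirroring the example in the text, while the invented user U001 is placed under HT.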

Recommendation Phase.
The recommendation is carried out in two steps: the creation of the FVSP and the recommendation of users.

Formation of FVSP.
The rules extracted from the trust network are partitioned into a conditional attribute set (CAS) and a decision attribute set (DAS). The CAS consists of all the trust attributes <PWHAT, PWHEN, PWHO, PHOW, PWHY>. The DAS consists of the trust decisions <VLWT, LT, MT, HT, VHGT>.
Step 1: For each attribute in the conditional attribute set, assign an attribute grade. The grade is based on the position of the triangular fuzzy function and is given in table 5. The linguistic space of each attribute is given below.
Step 2: Assign a fuzzy number to each linguistic term based on its grade. Since the model follows triangular fuzzy logic, the fuzzy number assigned to each grade is shown below. For example, the fuzzy numbers for the linguistic terms of the attribute PWHAT are shown in table 6. For the other attributes, the fuzzy numbers are the same as those shown in table 5.
Step 3: The fuzzy number for each attribute is now represented as a vector in the FVSP. The FVSP for each rule is represented as a set of pairs {<AK, FNAK>}, where
• K refers to the number of attributes, here K=5,
• AK represents the current attribute, and
• FNAK refers to the fuzzy number for the specified attribute AK.
That is, FVSP = {<A1, (a11, a12, a13)>, <A2, (a21, a22, a23)>, …, <A5, (a51, a52, a53)>}. Here (a11, a12, a13) is the triplet used in the TMF to define the fuzzy number, where 0 ≤ a11 ≤ a12 ≤ a13 ≤ 1.
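The three steps above can be sketched in Python as a lookup from grades to triangular fuzzy numbers followed by assembly of the <attribute, fuzzy number> pairs. The grade-to-fuzzy-number mapping is the one tabulated in this section; the function name and the example grades assigned to each attribute are assumptions for illustration only.

```python
# Grade -> triangular fuzzy number (a1, a2, a3), as tabulated in this section.
GRADE_TO_FUZZY = {
    1: (0.0, 0.0, 0.25),
    2: (0.0, 0.25, 0.50),
    3: (0.25, 0.50, 0.75),
    4: (0.50, 0.75, 1.0),
    5: (0.75, 1.0, 1.0),
}

# Conditional attribute set (CAS) of the model, K = 5.
ATTRIBUTES = ("PWHAT", "PWHEN", "PWHO", "PHOW", "PWHY")


def build_fvsp(grades):
    """Build the FVSP of one rule as a list of (attribute, fuzzy number) pairs.

    grades: {attribute: grade in 1..5} for every attribute in ATTRIBUTES.
    """
    fvsp = []
    for attribute in ATTRIBUTES:
        a1, a2, a3 = GRADE_TO_FUZZY[grades[attribute]]
        if not 0.0 <= a1 <= a2 <= a3 <= 1.0:
            raise ValueError(f"invalid fuzzy number for {attribute}")
        fvsp.append((attribute, (a1, a2, a3)))
    return fvsp


# Hypothetical grades for one rule.
example_grades = {"PWHAT": 4, "PWHEN": 3, "PWHO": 5, "PHOW": 2, "PWHY": 1}
fvsp = build_fvsp(example_grades)
```

Each rule thus becomes five triplets, one per provenance attribute, which is the form the similarity measures operate on.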

Recommendation of the top-k users.
Grade   Fuzzy Number
1       (0.0, 0.0, 0.25)
2       (0.0, 0.25, 0.50)
3       (0.25, 0.50, 0.75)
4       (0.50, 0.75, 1.0)
5       (0.75, 1.0, 1.0)

In the vector space there are similarity measures between two vectors that have been successfully applied in fields such as pattern recognition, classification of complex objects and other decision-making problems. The vector similarity measure chosen in the proposed recommendation system is the Cosine similarity. Using the FVSP, this vector-based similarity measure is computed to find how similar UT is to the other users in the trust network. The gain value of each attribute (AG) is taken as a weight component and applied to the measure to calculate the similarity value. Let X = UT = (a1, a2, a3) and Y = UN = (b1, b2, b3) be the fuzzy numbers of the target user (UT) and of another user (UN) from the trust network, respectively; then the cosine similarity measure is given in equation (1) as follows.
$$S(X, Y) = AG \cdot \frac{\sum_{i=1}^{f} a_i b_i}{\sqrt{\sum_{i=1}^{f} a_i^2}\,\sqrt{\sum_{i=1}^{f} b_i^2}} \qquad (1)$$
where
• AG represents the attribute gain,
• f represents the number of values in each fuzzy number,
• a1, a3, b1 and b3 are the endpoints of the fuzzy numbers, and
• a2 and b2 are the peak points of the fuzzy numbers.
After finding the similarity (S), this value is boosted by the corresponding trust score of the user UN, as given in equation (2). Boosting is linear, since it is done with the associated trust level. Using the boosted similarity (Sb), the target user's trust score is predicted. The prediction formula is given in equation (3).
where
• Twt refers to the trust weight assigned based on the trust level of user UN (e.g., VHGT has a Twt of 1, MT has a Twt of 0.6 and VLWT has a Twt of 0.2),
• the trust value of the target user UT is presented in fuzzy number format as shown in Table 6,
• Ij represents the items (books) that have not been given any review, and
• NB represents the number of neighbors chosen.
Consider a randomly chosen reviewer, say reviewer 72 (R72), requesting the recommendation of k users (let k=10). The similarity (S) between the requester and the rest of the users is calculated and then boosted using equation (2). Table 7 shows the similarity and boosted similarity (Sb) scores of the top-k reviewers. The reviewers are sorted by similarity from highest to lowest. Though both similarities show the highest scores for the top reviewers, the trust levels differ. The trust level of the reviewer that best matches R72 is 'HT'. Therefore, the top-10 reviewers are expected to have the trust level 'HT'. Without boosting, however, the 4th, 6th and 10th reviewers have a different trust level ('MT') instead of 'HT'. In the boosted case, only the 10th reviewer has a different trust level. Therefore, the prediction error is higher without boosting and lower with boosting. Boosting the similarity ranks the top-k reviewers appropriately, and in this way the proposed system achieves a reduced MAE and RMSE.
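The effect of boosting on the ranking can be illustrated with the following Python sketch. Several points are assumptions rather than the paper's stated method: the boost is taken to be multiplicative (Sb = Twt × S), the weights for HT and LT (0.8 and 0.4) are interpolated because the text only gives VHGT, MT and VLWT, only two attributes with invented gains are used for brevity, and the reviewers R1 and R2 are hypothetical. The prediction step of equation (3) is not implemented.

```python
import math


def cosine(x, y):
    """Cosine similarity between two fuzzy numbers treated as vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def weighted_similarity(target, other, gains):
    """Sum of per-attribute cosine similarities weighted by attribute gain."""
    return sum(gains[a] * cosine(target[a], other[a]) for a in gains)


# VHGT, MT and VLWT are from the text; HT and LT are assumed values.
TRUST_WEIGHT = {"VHGT": 1.0, "HT": 0.8, "MT": 0.6, "LT": 0.4, "VLWT": 0.2}


def top_k(target, candidates, gains, k, boost=True):
    """Rank candidates by (optionally trust-boosted) similarity, best first."""
    scored = []
    for name, (fvsp, level) in candidates.items():
        score = weighted_similarity(target, fvsp, gains)
        if boost:
            score *= TRUST_WEIGHT[level]  # assumed multiplicative boost
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


gains = {"PWHAT": 0.6, "PWHEN": 0.4}  # hypothetical attribute gains
target = {"PWHAT": (0.50, 0.75, 1.0), "PWHEN": (0.25, 0.50, 0.75)}
candidates = {
    "R1": (dict(target), "MT"),  # identical profile, moderate trust
    "R2": ({"PWHAT": (0.75, 1.0, 1.0), "PWHEN": (0.25, 0.50, 0.75)}, "HT"),
}
plain_ranking = top_k(target, candidates, gains, k=2, boost=False)
boosted_ranking = top_k(target, candidates, gains, k=2, boost=True)
```

Without boosting, the identical but only moderately trusted R1 ranks first; with boosting, the slightly less similar but more trusted R2 moves to the top, which is the behaviour the R72 example describes.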

Experiments and Result Analysis
To evaluate the performance of the gain-weighted trust-boosted recommender system, experiments are conducted on a data set from the popular book-based social network Goodreads. The aim of these experiments is to present a comparative study of the proposed recommendation strategy in the fuzzy trust setting. The proposed trust-boosted model is also evaluated against other weight strategies. The performance of the proposed recommendation strategy is measured with respect to the quality of predictions and the quality of recommendations. The quality of prediction is assessed by measuring the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE). Similarly, the quality of recommendation is assessed by measuring precision@k, recall@k and Average Precision (AP). The leave-one-out method is used to evaluate recommendation systems [14]. This technique involves withholding one rating and trying to predict it from the remaining ratings. The predicted rating is then compared with the actual rating, and the difference is taken as the prediction error.
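For reference, the error and ranking measures named above can be computed as in the following Python sketch, which uses their standard definitions. The sample trust scores and reviewer identifiers are invented and are not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error between actual and predicted values."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical withheld trust scores and their predictions (leave-one-out style).
actual = [0.8, 0.6, 0.4]
predicted = [0.7, 0.6, 0.6]
error_mae = mae(actual, predicted)
error_rmse = rmse(actual, predicted)

# Hypothetical ranked recommendations and ground-truth trusted reviewers.
recommended = ["R5", "R9", "R2", "R7"]
relevant = {"R5", "R2", "R8"}
p_at_2 = precision_at_k(recommended, relevant, 2)
r_at_2 = recall_at_k(recommended, relevant, 2)
```

Because RMSE squares each error before averaging, it penalizes the single large miss more heavily than MAE does.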

Evaluation of different weight strategies
The weight strategies considered for evaluation are the expected weight method, the preference-based method [20] and the proposed trust-boosted method. The MAE, RMSE and AP measures are evaluated for these weight strategies. Figure 4 shows the MAE values obtained for the cosine similarity method. The figure shows that the proposed trust-boosted method has a lower prediction error (MAE) than the other two methods. Similarly, figure 5 shows the RMSE values obtained for the cosine similarity measure. The figure shows that the proposed trust-boosted method has a lower prediction error (RMSE) than the expected weight method. The preference-based method shows a higher error rate than the other two methods. The AP values for the same similarity measure are shown in figure 6. The precision of the proposed method is higher than that of the other two methods. The AP is almost the same for the top-5 and top-10 users. Up to the top-20 users, the precision is greater than or equal to 0.90. Beyond that the precision starts to decrease, and for the top-50 users it is very low in the preference-based method.

Comparison With Other Trust-Based Recommender System
The proposed recommender system is compared with other trust-based recommender systems. The evaluation is based on MAE and RMSE. First, the proposed method (boosted similarity) is compared against the same similarity without boosting. The MAE for this comparison is shown in figure 7. The experiment is carried out with the Jaccard, Dice and Cosine similarity measures. All three measures show a lower prediction error with boosting than without. In the Jaccard measure, repetition of a word does not reduce the similarity, whereas in the Cosine measure it does. As with MAE, the RMSE value is checked with a few randomly selected reviewers. The graph shows the reduced prediction error of the proposed method.
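The three measures used in this comparison can be sketched in their common vector forms as follows. The paper's own Jaccard and Dice formulas are not shown in this section, so these vector definitions are an assumption, and the sample fuzzy numbers are illustrative.

```python
import math


def _dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))


def jaccard(x, y):
    """Vector Jaccard similarity: x.y / (|x|^2 + |y|^2 - x.y)."""
    dot = _dot(x, y)
    denom = _dot(x, x) + _dot(y, y) - dot
    return dot / denom if denom else 0.0


def dice(x, y):
    """Vector Dice similarity: 2 x.y / (|x|^2 + |y|^2)."""
    denom = _dot(x, x) + _dot(y, y)
    return 2.0 * _dot(x, y) / denom if denom else 0.0


def cosine(x, y):
    """Cosine similarity: x.y / (|x| |y|)."""
    denom = math.sqrt(_dot(x, x)) * math.sqrt(_dot(y, y))
    return _dot(x, y) / denom if denom else 0.0


# Two triangular fuzzy numbers (grades 4 and 3 of the grade table).
x = (0.50, 0.75, 1.0)
y = (0.25, 0.50, 0.75)
scores = {"jaccard": jaccard(x, y), "dice": dice(x, y), "cosine": cosine(x, y)}

# A vector with the same direction as x but half the magnitude.
half = tuple(v / 2 for v in x)
```

For non-negative vectors these forms satisfy Jaccard ≤ Dice ≤ Cosine. Cosine depends only on direction, so `cosine(x, half)` is 1, while Jaccard and Dice also penalize the difference in magnitude.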

Conclusion and Future Work
In this work, a trust-boosted recommender system is designed to recommend the top-k reviewers of a book-based social network. The use of provenance-based trust computation from multiple aspects has improved the recommendation quality. The performance of the proposed trust-boosted measure (with the gain as weight) is also compared with other weights, namely the expected value and the preference-based method. The analysis shows that precision@k increases by 10.166% compared with the expected weight method and by 2.186% compared with the preference-based weight method. The proposed approach is also compared with other trust-based methods, and the results show that the prediction is achieved with minimum MAE and RMSE. Future work is to recommend the top-k reviewers to a group of users, that is, to develop a group recommender system.
Here, reviewers are classified into 5 different trust levels: VHGT (Very High Trust), HT (High Trust), MT (Moderate Trust), LT (Low Trust) and VLWT (Very Low Trust). The abbreviations for the linguistic terms present in figure 1 are as follows: MSM (Moderately Same), HD (Highly Deviated), HSM (Highly Same), MD (Moderately Deviated), MITM (Moderately Ineffective Time Spent), HITM (Highly Ineffective Time Spent), HUTR (Highly Untruthful), HR (Highly Relevance), NTR (Neutrally Truthful), MIR (Moderately Irrelevance) and MDSML (Moderately Dissimilar).

Table 6. Fuzzy Number for the attribute PWHAT

Table 1. Fields collected from the social network
More than 61,000 reviews and the associated reviewers' data, available from 2007 to 2015, were collected; details of the collected dataset are given in table 2. The number of reviews and the number of reviewers are not always the same, but this dataset has a single review from each reviewer.

Table 3. Description of Provenance Elements

Table 4. Rules Generated

Table 7. Similarity and Boosted similarity score of top-k reviewer.