We address the problem of missing data in recommendation systems using Python 2.7 and the H2O package. To this end, we "re-impute" artificially removed values into a dataframe with the help of two models: (1) Deep Learning with Autoencoder and (2) Generalized Low Rank Model (GLRM). To tune and evaluate both models, we implement a cross-validation procedure that optimizes the imputation accuracy on the artificially removed values. In our results, Deep Learning with Autoencoder consistently performs better in terms of precision, whereas GLRM performs better in terms of execution time. Finally, we present a new ranking formula that is inspired by Lucene Similarity but weighs the cosine similarity according to the percentage of matches.
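The evaluation idea described above (hide known values, impute them, compare against the truth) can be sketched in a few lines. This is a minimal illustration and not the talk's implementation: the helper names `mask_values`, `impute_column_mean`, and `imputation_rmse` are hypothetical, the toy data is made up, and a plain column-mean imputer stands in for the Autoencoder and GLRM models, which in practice would be trained with H2O.

```python
from __future__ import division  # keeps division consistent on Python 2.7
import random


def mask_values(rows, fraction, seed=0):
    """Copy `rows` and set a random `fraction` of the known cells to None.

    Returns the masked copy and the list of (row, col) positions hidden.
    """
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(rows)
             for j, value in enumerate(row) if value is not None]
    n_hide = int(round(fraction * len(known)))
    hidden = rng.sample(known, n_hide)
    masked = [list(row) for row in rows]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def impute_column_mean(rows):
    """Stand-in imputer (assumption): fill each None with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        col = [row[j] for row in rows if row[j] is not None]
        means.append(sum(col) / len(col) if col else 0.0)
    return [[means[j] if value is None else value
             for j, value in enumerate(row)] for row in rows]


def imputation_rmse(original, imputed, hidden):
    """Root mean squared error measured on the hidden cells only."""
    squared = [(original[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return (sum(squared) / len(squared)) ** 0.5


# Toy data, purely illustrative.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
masked, hidden = mask_values(data, fraction=0.25, seed=42)
imputed = impute_column_mean(masked)
rmse = imputation_rmse(data, imputed, hidden)
```

Repeating this with different seeds and model settings, and keeping the setting with the lowest error on the hidden cells, gives a simple form of the cross-validation loop described above.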
We will present an application of the recommendation system to real estate data, including:
- some tuning plots of the models for imputation of missing features of real estate queries;
- a discussion of how the scoring formula is approximated to rank the matching houses;
- an excerpt of the recommendation system front-end implemented in Flask.
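To make the ranking idea concrete, here is one possible reading of "cosine similarity weighted by the percentage of matches". The exact formula is not given in this abstract, so the function `weighted_cosine_score`, the dictionary representation of queries and houses, and the choice to compute the cosine over shared features only are all assumptions made for illustration.

```python
from __future__ import division  # keeps division consistent on Python 2.7
import math


def weighted_cosine_score(query, house):
    """Cosine similarity on shared features, scaled by the match fraction.

    `query` and `house` map feature names to numeric values (assumed
    representation). The match fraction is the share of query features
    that the house also has.
    """
    shared = [name for name in query if name in house]
    if not shared:
        return 0.0
    dot = sum(query[name] * house[name] for name in shared)
    norm_query = math.sqrt(sum(query[name] ** 2 for name in shared))
    norm_house = math.sqrt(sum(house[name] ** 2 for name in shared))
    if norm_query == 0 or norm_house == 0:
        return 0.0
    cosine = dot / (norm_query * norm_house)
    match_fraction = len(shared) / len(query)
    return cosine * match_fraction


# Hypothetical query and candidate houses.
query = {"rooms": 3.0, "area": 90.0}
full_match = {"rooms": 3.0, "area": 90.0, "floor": 2.0}
partial_match = {"rooms": 3.0}
candidates = [partial_match, full_match]
ranked = sorted(candidates,
                key=lambda house: weighted_cosine_score(query, house),
                reverse=True)
```

Under this reading, a house that matches only half of the query features can score at most 0.5, so complete matches are ranked ahead of partial ones with the same direction of similarity.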
Basic knowledge of statistics and Python is helpful for attending this talk.
Scheduled on Saturday 21 April at 17:45.