Newly released: Simblr, a Tumblr blog recommender

I built "Simblr" as an experiment.

What is Simblr?

Simblr is a web application that recommends Tumblr blogs and posts to you based on your recent Tumblr posts.

You can use it when:
- you have finished reblogging every post on your dashboard but want to reblog more.
- you are looking for Tumblr blogs that match your tastes.

Play with Simblr


You can see the source code here.
Simblr fetches the notes information of your recent posts and uses collaborative filtering and other algorithms to recommend Tumblr blogs and posts to you.

Recommended Blogs := the M blogs most similar to yours

Degree-of-Similarity of Blog (sim.) := (number of times the blog occurs in the notes information of the N most recent posts in your blog) / N * 100
The more often a blog occurs in the notes information, the higher its degree of similarity.
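
The two definitions above can be sketched in Ruby roughly as follows. This is only an illustration, not Simblr's actual code: it assumes the notes information has already been fetched and reduced to one list of blog names per post, and the method names `similarity_scores` and `recommended_blogs`, the `own_blog:` option, and the sample blog names are all hypothetical.

```ruby
# Hypothetical sketch; names and input shape are assumptions, not Simblr's code.
# notes_per_post: one entry per recent post (N entries in total); each entry
# lists the names of the blogs that appear in that post's notes.
def similarity_scores(notes_per_post, own_blog: nil)
  n = notes_per_post.size
  return {} if n.zero?

  counts = Hash.new(0)
  notes_per_post.each do |blog_names|
    blog_names.each do |name|
      next if name == own_blog # assumption: your own blog is not a candidate
      # Every occurrence is counted; the post does not say whether repeated
      # occurrences within a single post should count only once.
      counts[name] += 1
    end
  end

  # sim. = occurrences / N * 100
  counts.transform_values { |count| count.to_f / n * 100 }
end

# Recommended Blogs: the m entries with the highest sim., highest first.
def recommended_blogs(scores, m)
  scores.max_by(m) { |_name, sim| sim }.to_h
end

notes = [%w[alice bob], %w[alice], %w[carol alice], %w[bob]]
scores = similarity_scores(notes)
top = recommended_blogs(scores, 2)
puts top.inspect
```

With these four sample posts, "alice" occurs three times (sim. 75), "bob" twice (sim. 50) and "carol" once (sim. 25), so the two recommended blogs are "alice" and "bob".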

Recommended Posts := the L most recommended posts among the recent posts of the recommended blogs
However, the number of recommended posts may be smaller than L because duplicate posts are removed.

Degree-of-Recommendation of Post (point) := (sum of the degrees of similarity of those of the M recommended blogs that occur in the notes information of the post) / (sum of the degrees of similarity of all M recommended blogs) * 100
The more similar blogs reblog a post, the higher its degree of recommendation.
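
As a rough Ruby sketch of the post scoring and the duplicate removal described above (again an illustration under assumptions, not Simblr's actual code): posts are represented as hashes with a hypothetical `:id` key that identifies "the same post" and a hypothetical `:noted_by` key holding the blog names found in the post's notes. The method names and sample values are also made up for the example.

```ruby
# Hypothetical sketch; names and data shapes are assumptions, not Simblr's code.
# top_blogs: { blog_name => sim. } for the M recommended blogs.
# posts: recent posts of the recommended blogs, each a hash with
#        :id (identifies the post) and :noted_by (blog names in its notes).
def recommendation_points(posts, top_blogs)
  total = top_blogs.values.sum.to_f
  return {} if total.zero?

  # Duplicate posts (same :id) are dropped before scoring.
  posts.uniq { |post| post[:id] }.to_h do |post|
    matched = post[:noted_by].uniq.select { |name| top_blogs.key?(name) }
    # point = sum of sim. of matching recommended blogs / sum of all sim. * 100
    [post[:id], matched.sum { |name| top_blogs[name] } / total * 100]
  end
end

# Recommended Posts: up to l [id, point] pairs, highest point first.
def recommended_posts(points, l)
  points.max_by(l) { |_id, point| point }
end

top_blogs = { "alice" => 75.0, "bob" => 25.0 }
posts = [
  { id: 1, noted_by: %w[alice bob dave] },
  { id: 2, noted_by: %w[bob] },
  { id: 1, noted_by: %w[alice bob dave] }, # duplicate of post 1
  { id: 3, noted_by: %w[dave] }
]
points = recommendation_points(posts, top_blogs)
puts recommended_posts(points, 2).inspect
```

Here post 1 is noted by both recommended blogs and scores 100, post 2 is noted only by "bob" and scores 25, and post 3 scores 0. Because the duplicate of post 1 is dropped, fewer than L posts can come back when the candidate list is short.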

With more posts and notes information as input, Simblr could recommend better posts and blogs. The trade-off is that you have to wait longer while the data is fetched through the Tumblr API, so the current specification uses N=100, M=10 and L=100.

It is built with Ruby, Sinatra, Puma, Memcached, Heroku, jQuery, Bootstrap and others.

