How does Area42 Deals work?
TL;DR
Machines do all the hard work for you.
For the geeks out there, the app is built in a fairly simple manner. The steps involved in delivering relevant deals to you are:
- A simple Python script fetches the deals from various deal sites.
- All the irrelevant words (stop words) are removed from the title of each deal.
- Then a script creates a phonetic encoding of each relevant keyword (for example, ipod=APK, android=AADS).
- Then an inverted index is created in Redis with this structure:
Key: APK
Value: set(
{"title" : "Awesome deal on iPod for $10", "link" : "http://myawesomedeal.com/2dh33"},
{"title" : "Another awesome deal for iPod", "link" : "http://anotherawesomedealsite.com/dbkhdb"}
)
Key: AADS
Value: set(
{"title" : "Awesome deal on Android for $5", "link" : "http://myawesomedeal.com/ijkbqsdkjb"},
{"title" : "Another awesome deal for Android", "link" : "http://anotherawesomedealsite.com/wheqhwqw"}
)
- Finally, based on your likes (the keywords you submitted), a cron job aggregates the matching deals and sends out an email!
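
For the curious, the steps above can be sketched roughly as follows. This is only an illustration, not the app's actual code: a plain dict of sets stands in for Redis, a simplified Soundex stands in for the phonetic encoder (so the keys will differ from the APK/AADS examples above), and the stop-word list and the names `keywords`, `build_index` and `deals_for` are all made up for this example.

```python
import json
import re
from collections import defaultdict

# Hypothetical stop-word list; the list the app really uses is not shown.
STOP_WORDS = {"a", "an", "the", "on", "for", "of", "in",
              "deal", "deals", "awesome", "another"}

# Soundex digit for each coded consonant.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Simplified Soundex, used here only as a stand-in phonetic encoder."""
    word = word.lower()
    if not word:
        return ""
    encoded = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            encoded += digit
        prev = digit  # vowels reset the run, so repeated codes count again
    return (encoded + "000")[:4]


def keywords(title):
    """Lowercase the title, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", title.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(deals, encode=soundex):
    """Map each phonetic key to the set of deals (as JSON strings)."""
    index = defaultdict(set)
    for deal in deals:
        member = json.dumps(deal, sort_keys=True)  # set members must be hashable
        for word in keywords(deal["title"]):
            index[encode(word)].add(member)
    return index


def deals_for(index, likes, encode=soundex):
    """Collect every deal whose phonetic key matches one of the likes."""
    matches = set()
    for like in likes:
        matches |= index.get(encode(like), set())
    return [json.loads(m) for m in sorted(matches)]
```

Because the lookup goes through the same encoder as the index, a misspelled like such as "andriod" still lands on the same key as "android". With real Redis, each `index[key].add(member)` would presumably become a set-add on that key, and the cron job would run something like `deals_for` per user before mailing the results.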
Stuff that powers Area42 Deals
- Python: The language.
- Jellyfish: The awesome Python library that does the magic behind constructing the inverted index. For now, I am doing two kinds of matching: 1. string comparison using Levenshtein distance, and 2. phonetic encoding using Double Metaphone.
- Redis: All the data aggregated from the various deal sites is stored in Redis in the form of an inverted index.
- Flask: The website runs on Flask, the awesome Python web framework!
- Fabric: I am using Fabric to deploy the code to EC2.
- EC2: The app is hosted on Amazon's EC2.
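
To give a feel for the string-comparison half of the matching mentioned under Jellyfish, here is a small pure-Python sketch of Levenshtein distance. The app itself relies on Jellyfish for this (the library exposes `jellyfish.levenshtein_distance`); the `closest_keyword` helper and its `max_distance` threshold are hypothetical and only show how such a distance could be used to tolerate typos in a like.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions
    needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_keyword(like, known_keywords, max_distance=2):
    """Return the known keyword nearest to the like, or None if every
    candidate is more than max_distance edits away (hypothetical helper)."""
    like = like.lower()
    best = min(known_keywords,
               key=lambda k: levenshtein(like, k.lower()),
               default=None)
    if best is None or levenshtein(like, best.lower()) > max_distance:
        return None
    return best
```

A small threshold keeps near-misses like "andriod" while rejecting unrelated words; the right value depends on how short the keywords are, so treat the default of 2 as a placeholder.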