Thursday, January 5, 2017

BiasChecker.org in 30 Days for $0

Facebook and Excessive News Bias

During the months preceding the 2016 election, I noticed a disturbing trend in the articles I was seeing posted on Facebook.  Many of my friends (and I'm pretty discriminating about who my friends are) were re-posting news articles, on multiple sides of political issues, which I thought were clearly biased, or, more specifically, had an objective in the news story and made very little attempt to hide it.  Memes are the most distinct form of this behavior that I can recall (as an illustrative example), but articles run the gamut from meme-style blatant bias to bias that is much harder to spot.  Since I subscribe to the belief that people, once made aware of the bias presented to them, can make better decisions and think more critically, I decided to make the bias in news articles more apparent to anyone who chooses to look.  BiasChecker.org was born from this concept.

Mission Accepted!

The mission was to create a website that evaluates news links for biased terminology and presents the resulting bias scores in an easy-to-understand format, so that casual users can determine for themselves how much they should trust an article.

Gathering Functional Requirements

In order to present bias in a meaningful way, I had to figure out a way to evaluate for bias quickly, and then present that evaluation in a way that casual users can easily interpret.  Since my target users are on Facebook, I asked the community what they would like, and presented ideas on Facebook to get reactions.  Some of the requirements we distilled out are as follows:
  1. Need a way to check links from Facebook.
  2. Present the result automatically (if necessary).
  3. Share BiasChecker.org results with other Facebook users.
  4. Ensure that BiasChecker.org results are in fact useful.
  5. Allow individuals a way to check their own news links.
  6. Allow individuals to estimate bias on their own (a useful metric to compare algorithmic performance with perceived bias).
  7. Auto-monitor each user's Facebook feed for links posted between App users so that they don't have to import links manually.

Evaluating for Non-Functional Requirements

Non-functional requirements include the technological requirements and deployment infrastructure.  I needed an environment I could get up and running in quickly, and that could still scale quickly (after all, Facebook has tons of users and it's good to be prepared).  I also wanted to work in a technology that was somewhat different from what I use every day (to keep it interesting), but in which I still had sufficient skill to be productive.  I chose Node.js and CouchDB.  Some of my other technology decisions include:
  • Development
    • I settled on Node.js because it was different enough to keep me interested, and lightweight enough to do what I needed without a lot of overhead.  An additional benefit was ease of deployment and scalability through Heroku.
  • Database
    • I settled on CouchDB because much of what I'd be doing is processing, storing the results, and reporting summaries of those results.  CouchDB views were perfect for this, and I had the added benefit of putting the database in Cloudant, IBM's CouchDB in the cloud.  This was a really awesome experience, because my last CouchDB foray had been 3 years prior, and CouchDB has come a long way since then.
  • Web Server
    • I wanted to be able to present a service very quickly.  With expressjs, I could set up a web service in four lines of code.  Granted, getting it working past that takes a little more work, but it is still doable.
  • User Interface
    • I wanted to present a very simple UX, using an existing framework, to minimize the UX work (since I'm traditionally a platform dev).  I chose materializecss for this, in part, because I would already be using jQuery, and it makes everything very easy to do.  I've worked with bootstrap.js before, and there was a lot of bouncing back-and-forth between css and js involved there.  I didn't evaluate angular since I've seen angular code before, and there seems to be a bit of a learning curve there I didn't feel like taking on yet.
  • Graphing
    • I wanted graphs to display aggregate totals for different tasks.  I've used jsflot in the past, and though I like the robustness of it, I opted for plot.ly instead, because I felt I would only really be doing basic things, and plot.ly has a very simple interface for the kinds of charts (like histograms) I expected to need.
  • Facebook SDK
    • I decided to use the Facebook javascript SDK, specifically opengraph v2.8 (which was the default at the time I started).
So far, I haven't needed any physical infrastructure except my development laptop.  This was something I already had (as a developer, I've got a few).  Everything I've listed here has a low-cost option.  Heroku deploys to dynos, at $0 for personal use.  Cloudant offers the first 30 days for free.  plot.ly, materializecss, and expressjs are all open-source.
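
To illustrate why CouchDB views fit the "store results, then report summaries" workload described above, here is a rough sketch of a design document that summarizes scores per article.  The document fields (type, url, score), the design document name, and the view name are all hypothetical and are not BiasChecker's actual schema; the emit call and the built-in _stats reducer are standard CouchDB features.

```javascript
// Hypothetical design document. Field names (type, url, score) are assumptions
// made for illustration, not the real BiasChecker schema.
const designDoc = {
  _id: "_design/bias",
  views: {
    score_by_url: {
      // CouchDB stores view functions as strings and calls map once per
      // document; emit() is provided by CouchDB's view server.
      map: function (doc) {
        if (doc.type === "evaluation" && typeof doc.score === "number") {
          emit(doc.url, doc.score);
        }
      }.toString(),
      // Built-in reducer that returns sum, count, min, max, and sumsqr.
      reduce: "_stats",
    },
  },
};

// The design document is plain JSON, so it can be uploaded as-is.
console.log(JSON.stringify(designDoc, null, 2));
```

Querying such a view with group=true would return one statistics row per article URL, from which an average follows as sum divided by count.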

Result

The mission was a huge success!  In less than 30 days of coding and no local infrastructure, I was able to field https://www.biaschecker.org, with all of the features listed above.  I will be iterating on it, but here's an example of what kinds of useful information you can discover about an article using biaschecker:

Landing Page

The only allowed login is with Facebook.  This may change, but given that a lot of the functionality is for Facebook, it may not.

Consumable Bias Score

From the ratio of bias to non-bias terms in an article, scaled to magnify differences, I produce a bias score between 1 and 10.  1 is the least biased, and 10 is the most biased.  I also offer you the ability to rate an article yourself, and then produce a consensus average of the ratings from everyone who has rated the same article.
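
The scoring idea above can be sketched roughly as follows.  This is not the site's actual algorithm: the tiny lexicon, the tokenizer, and especially the square-root scaling are assumptions chosen only to show how a ratio can be stretched onto a 1 to 10 scale.

```javascript
// Hypothetical mini-lexicon; the real term list is not reproduced here.
const BIAS_TERMS = new Set(["clearly", "outrageous", "disaster", "so-called"]);

// Split text into lowercase words, keeping internal hyphens and apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z][a-z'-]*/g) || [];
}

// Map the bias/non-bias ratio onto an integer score from 1 to 10.
function biasScore(text) {
  const words = tokenize(text);
  const biased = words.filter((w) => BIAS_TERMS.has(w)).length;
  const unbiased = words.length - biased;
  if (biased === 0) return 1; // no bias terms found (or empty text)
  if (unbiased === 0) return 10; // every word is a bias term
  const ratio = biased / unbiased;
  // Assumed scaling: the square root magnifies differences between small ratios.
  const scaled = 1 + 9 * Math.sqrt(ratio);
  return Math.min(10, Math.max(1, Math.round(scaled)));
}

// Consensus rating: the plain average of all user ratings for one article.
function consensus(ratings) {
  if (ratings.length === 0) return null;
  return ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
}

console.log(biasScore("The report was published on Monday."));
console.log(consensus([2, 4, 6]));
```

Comparing biasScore for an article with the consensus of user ratings is one way to see how far the algorithm and perceived bias diverge.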

Bias Term Histogram

The following is an example of what you can learn about your articles.  We show a histogram of all of the biasing terminology we've detected.  The bias terminology comes from a Cornell study, https://www.cs.cornell.edu/~cristian/Biased_language.html, which conveniently consolidated lexicons from earlier works.
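
A minimal sketch of how such a term histogram could be prepared is shown below.  The sample lexicon and both function names are hypothetical; the only plot.ly-specific part is the shape of the returned object, a bar trace with x and y arrays, which is what Plotly.newPlot accepts in its data array.

```javascript
// Hypothetical sample lexicon standing in for the consolidated term lists.
const SAMPLE_LEXICON = new Set(["clearly", "disaster", "regime"]);

// Count how often each lexicon term occurs in the text.
function countBiasTerms(text, lexicon) {
  const counts = new Map();
  const words = text.toLowerCase().match(/[a-z][a-z'-]*/g) || [];
  for (const word of words) {
    if (lexicon.has(word)) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

// Convert the counts into a bar trace, most frequent term first.
function toBarTrace(counts) {
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  return {
    type: "bar",
    x: sorted.map(([term]) => term),
    y: sorted.map(([, count]) => count),
  };
}

const trace = toBarTrace(
  countBiasTerms("Clearly, this is clearly a disaster.", SAMPLE_LEXICON)
);
console.log(trace);
// In the browser, the trace could then be drawn with:
//   Plotly.newPlot("histogram", [trace]);
// where "histogram" is an assumed id of a div on the page.
```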

Conclusion

In 30 days, I was able to (with a lot of help and guidance from friends) build biaschecker.org from scratch and host it in the cloud, with no up-front cost.  The recurring monthly costs will be determined by usage.  Since I don't ever want to charge folks to use this, I also used the new paypal.me feature to put a donation link on the site.  Any money donated through the site will be used only for biaschecker expenses.  If you are interested in donating, follow this link to PayPal.  Interested in helping out?  Add a comment here and I'll reply, or message me on LinkedIn.  The source code is not yet available, but it will be open-sourced once we have a user base.