Maggie Nelson

databases and code goodness

  • Author: maggie
  • Published: Apr 23rd, 2009
  • Category: entry
  • Comments: 1

Weekend Reading: Fun with Data and Statistics

I know, I know, it’s only Thursday, but a girl can dream, right?

At work, I design a lot of database systems that manage a lot of data. Most of these systems are put in front of real human beings who are expected to find meaningful data in a big big pile of it. The two main approaches are either to use a harsh, editorial-driven, curated system such as a category hierarchy (Rock falls under Music, which falls under Entertainment) or to have a completely free-flowing, user-generated system such as tagging or description search. But in either case, there’s always something missing – you pick tagging, and you wish people didn’t tag things with “boobies” so much. You pick a strict category structure, and it just feels too restrictive. So what can you do? The March/April 2009 issue of IEEE Intelligent Systems Magazine has an article called “The Unreasonable Effectiveness of Data”:

We should stop acting as if our goal is to author extremely elegant theories, and instead embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data.

The article broke my brain a little bit, but go read it, it’s interesting nevertheless.

While we’re talking about representations of data, go read about the Semantic Web – how can we tell computers and teh internets what we humans want?

If you want slightly lighter reading, go read Bill Bryson’s books about language, specifically Mother Tongue and Made in America. Reading anything by Bill Bryson will make you a better person (or your money back).

Once you have your data, someone will inevitably ask you to tell them what’s “popular”. I’m putting it in quotes because it means so many things to so many people. Before you answer, learn a little bit about statistics. I recommend Statistics in a Nutshell from O’Reilly. Hint: “most popular” does not always mean “has most views”.
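To make that hint concrete, here is a minimal Python sketch. The item names, the numbers, and the choice of "views per day" as the alternative measure are all my own assumptions for illustration; it is only one of many reasonable definitions of "popular", not the definition.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    views: int        # total views since publication
    days_online: int  # how long the item has been available (assumed >= 1)


def by_total_views(items):
    """Rank by raw view count, which favors items that have been around longest."""
    return sorted(items, key=lambda item: item.views, reverse=True)


def by_views_per_day(items):
    """Rank by average daily views, which lets newer items compete."""
    return sorted(items, key=lambda item: item.views / item.days_online, reverse=True)


# Made-up catalog, purely for illustration.
catalog = [
    Item("Old Favorite", views=10_000, days_online=1_000),  # 10 views/day
    Item("New Hit", views=3_000, days_online=30),           # 100 views/day
    Item("Steady Seller", views=6_000, days_online=200),    # 30 views/day
]

print([item.title for item in by_total_views(catalog)])
print([item.title for item in by_views_per_day(catalog)])
```

With this made-up data, the two rankings come out in exactly opposite order: the item with the most total views is the one getting the fewest views per day. Which answer is "right" depends on what the person asking actually means by popular.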

For some real-life scenarios of statistics, misuse of statistics, problems with polling, plus a nice dose of politics, read Nate Silver’s FiveThirtyEight.com blog. He’s also a partner and analyst for Baseball Prospectus – you might find baseball boring, but boy, does it lend itself to awesome stats gathering and mangling. Reading the two might not be immediately applicable to software development, but it’ll put your mind in the right context when you’re trying to get meaning out of your giant pile of data.

I will expect your book reports by Monday.

  • Author: maggie
  • Published: Mar 10th, 2009
  • Category: entry
  • Comments: 1

All the Reference You Need…

People borrow my database books and a bunch of them are in circulation at any given time. Also, a lot of database resources are online only (e.g. for Oracle and MySQL). Right now, this is the state of my database bookshelf:

[Image: l2db]

Not bad! You can probably get through a lot with just these three…

© 2010 Maggie Nelson. All Rights Reserved.
