Maggie Nelson

Tag: reference

Weekend Reading: Fun with Data and Statistics

by maggie on Apr.23, 2009, under entry

I know, I know, it’s only Thursday, but a girl can dream, right?

At work, I design a lot of database systems that manage a lot of data. Most of these systems are put in front of real human beings who are expected to find meaningful data in a big, big pile of it. The two main approaches are to use either a harsh, editorially driven, curated system such as a category hierarchy (Rock falls under Music, which falls under Entertainment) or a completely free-flowing, user-generated system such as tagging or description search. But in either case, there’s always something missing – you pick tagging, and you wish people didn’t tag things with “boobies” so much. You pick a strict category structure, and it just feels too restrictive. So what can you do? The March/April 2009 issue of IEEE Intelligent Systems has an article called “The Unreasonable Effectiveness of Data.”

We should stop acting as if our goal is to author extremely elegant theories, and instead embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data.

The article broke my brain a little bit, but go read it anyway; it’s interesting.

While we’re talking about representations of data, go read about the Semantic Web – how can we tell computers and teh internets what we humans want?

If you want slightly lighter reading, go read Bill Bryson’s books about language, specifically The Mother Tongue and Made in America. Reading anything by Bill Bryson will make you a better person (or your money back).

Once you have your data, someone will inevitably ask you to tell them what’s “popular”. I’m putting it in quotes because it means so many things to so many people. Before you answer, learn a little bit about statistics. I recommend Statistics in a Nutshell from O’Reilly. Hint: “most popular” does not always mean “has the most views”.
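To make the hint concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the Item fields, the sample titles and numbers, and the choice of "views per day" as an alternative measure are all assumptions, not something from a real system. It ranks the same three items two ways and shows that the rankings disagree.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str        # hypothetical item name
    views: int        # total views since publication
    days_online: int  # age of the item in days


# Made-up sample data, purely for illustration.
ITEMS = [
    Item("Old favorite", views=9000, days_online=900),
    Item("Steady seller", views=3000, days_online=100),
    Item("New arrival", views=700, days_online=7),
]


def views_per_day(item: Item) -> float:
    """Average daily views; treat an item younger than a day as one day old."""
    return item.views / max(item.days_online, 1)


def rank(items, key):
    """Return titles sorted from highest to lowest by the given key."""
    return [i.title for i in sorted(items, key=key, reverse=True)]


by_total_views = rank(ITEMS, key=lambda i: i.views)
by_daily_rate = rank(ITEMS, key=views_per_day)

print("Most views:   ", by_total_views)
print("Views per day:", by_daily_rate)
```

Views per day is only one possible stand-in for “popular”. Ratings, repeat visits, or a recent time window could be just as defensible, depending on who is asking and why.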

For some real-life scenarios of statistics, misuse of statistics, and problems with polling, plus a nice dose of politics, read Nate Silver’s FiveThirtyEight.com blog. He’s also a partner and analyst for Baseball Prospectus – you might find baseball boring, but boy, does it lend itself to awesome stats gathering and mangling. Reading the two might not be immediately applicable to software developers, but it’ll put your mind in the right context when trying to get meaning out of your giant pile of data.

I will expect your book reports by Monday.


All the Reference You Need…

by maggie on Mar.10, 2009, under entry

People borrow my database books and a bunch of them are in circulation at any given time. Also, a lot of database resources are online only (e.g. for Oracle and MySQL). Right now, this is the state of my database bookshelf:

[Image: l2db]

Not bad! You can probably get through a lot with just these three…

