Archive: Data

March 7, 2007

Mash Up the Hackszine Tag Cloud

Love tag clouds? Check out this page, which displays the top 20 search terms that drive people to each O'Reilly domain, including Hackszine. As noted on that page, here are a few things to keep in mind about these visualizations:

  • The terms are organic: someone typed them into a search engine (e.g., Google) and then followed a resulting link, as opposed to entering them into our own search box.
  • While the keyword frequency does give some idea of what people are looking for, keep in mind that the word had to already be on our site in order for it to appear, and it had to be ranked highly enough for someone to find it.
  • These are raw search terms, so similar but slightly different terms show up as separate entries. For example, "web 2.0" and "web 2" may both appear.

Tired of tag clouds? We'd love to see how you'd process the data. Here's the data for Hackszine, formatted as JSON.
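
If you want a starting point, here's a minimal Python sketch that merges near-duplicate terms and scales the counts into font sizes. The JSON layout (a list of objects with "term" and "count" keys), the alias table, and the pixel range are all assumptions made for illustration; check the actual feed and adjust the field names to match.

```python
import json
from collections import Counter


def load_terms(json_text):
    """Parse the feed into (term, count) pairs.

    Assumes a list of {"term": ..., "count": ...} objects, which is a
    guess at the layout and not the documented format of the real feed.
    """
    return [(item["term"], int(item["count"])) for item in json.loads(json_text)]


def normalize(term):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def merge_terms(pairs, aliases=None):
    """Sum counts for terms that normalize (or alias) to the same key."""
    aliases = aliases or {}
    counts = Counter()
    for term, count in pairs:
        key = normalize(term)
        counts[aliases.get(key, key)] += count
    return counts


def font_sizes(counts, min_px=12, max_px=36):
    """Linearly scale each count into a font size between min_px and max_px."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        term: round(min_px + (count - lo) * (max_px - min_px) / span)
        for term, count in counts.items()
    }
```

Normalizing case and whitespace won't fold "web 2" into "web 2.0" on its own, which is why the sketch takes an explicit alias table, e.g. merge_terms(pairs, aliases={"web 2": "web 2.0"}).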

Posted by | Mar 7, 2007 05:27 AM
Data, Google, Web

February 26, 2007

Translate SQL Syntax Between Databases

SQL::Translator is an interesting Perl module that, among other things, can convert database table definitions to and from several DB platforms. Essentially, this could allow you to write and maintain your table definition code for a single platform, say MySQL, and then use Translator to output table structure into Oracle, Sybase, or PostgreSQL dialects.

Data manipulation statements, such as SELECT, INSERT, UPDATE, and DELETE, are not yet supported, so you're still on your own there if you're writing platform-agnostic code. That said, this is an incredibly useful tool. Just consider this example that Chris Dolan posted on use Perl:

MySQL understands this syntax:

create table book (
id int,
author_id int,
FOREIGN KEY fk_author_id (author_id) REFERENCES author (id)
) TYPE=InnoDB;

but not this nicer syntax (it silently ignores the "references" clause):

create table book (
id int,
author_id int references author (id)
) TYPE=InnoDB;

Perl to the rescue! I can write my schema in the latter syntax and use SQL::Translator to rewrite into the supported syntax.
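
To make the rewrite concrete, here's a toy Python sketch of the same transformation. It is not SQL::Translator and doesn't use its API; it's a regex-based illustration that only handles simple statements with one column per line, and the function name and the fk_ naming scheme are assumptions. A real translator parses the SQL properly.

```python
import re

# Matches a column line such as "author_id int references author (id)".
INLINE_REF = re.compile(
    r"^\s*(?P<col>\w+)\s+(?P<type>\w+)\s+references\s+"
    r"(?P<table>\w+)\s*\((?P<ref>\w+)\)\s*,?\s*$",
    re.IGNORECASE,
)


def expand_inline_references(create_sql):
    """Move inline REFERENCES clauses into separate FOREIGN KEY clauses.

    Assumes the first line opens the table, the last line closes it,
    and every line in between defines exactly one column.
    """
    lines = create_sql.strip().splitlines()
    header, body, footer = lines[0], lines[1:-1], lines[-1]
    columns, constraints = [], []
    for line in body:
        if not line.strip():
            continue  # ignore blank lines inside the definition
        match = INLINE_REF.match(line)
        if match:
            col = match["col"]
            columns.append(f"{col} {match['type']}")
            constraints.append(
                f"FOREIGN KEY fk_{col} ({col}) "
                f"REFERENCES {match['table']} ({match['ref']})"
            )
        else:
            columns.append(line.strip().rstrip(","))
    return "\n".join([header, ",\n".join(columns + constraints), footer])
```

Fed the second form of the book table, this produces the first form, with the foreign key spelled out in the syntax MySQL actually enforces.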

Posted by Jason Striegel | Feb 26, 2007 12:14 AM
Data, MySQL, Perl

February 7, 2007

Sneakernet: The High Bandwidth Wireless

"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." -Andrew S. Tanenbaum

We've got a really fast connection at work, but I still occasionally run into situations where it's faster, and often more economical, to overnight data on an external hard disk instead of transferring it over the wire. Even within the office, if I'm moving a large file from one machine to another, I've found that good ol' sneakernet can save me a lot of time, especially when other people are using the network.

Jeff Atwood posted a great article on the economics of bandwidth the other day. He applies current cost figures to Jim Gray's 2003 ACM interview, in which Jim describes the efficiencies of packing and shipping a whole computer instead of copying a terabyte of data over the net:

It's cheaper to send the machine. The phone bill, at the rate Microsoft pays, is about $1 per gigabyte sent and about $1 per gigabyte received—about $2,000 per terabyte. It's the same hassle for me whether I send it via the Internet or an overnight package with a computer. I have to copy the files to a server in any case. The extra step is putting the SneakerNet in a cardboard box and slapping a UPS label on it. I have gotten fairly good at that. Tape media is about $3,000 a terabyte. This media, in packaged SneakerNet form, is about $1,500 a terabyte.

According to Jeff's calculations, the effective sneakernet transfer rate for a terabyte of data is about 9.1 MBps at $0.06/GB. Only an OC-3 would be faster, which costs roughly $0.15/GB for both the sending and receiving end. Want to send 2 terabytes of data? Factoring in the extra time to copy to and from the disk, it works out to about 14.6 MBps at about the same cost per GB. Sneakernet scales.
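
The arithmetic behind numbers like these is easy to play with. Here's a rough Python sketch: the effective rate is the data size divided by shipping time plus the time to copy the data onto the disk and off again, and the network cost uses the per-gigabyte rates from Jim Gray's quote. The function names are made up for illustration, and any shipping time or disk copy speed you plug in is your own assumption, not one of Jeff's inputs.

```python
def effective_rate_mb_per_s(data_gb, ship_hours, copy_rate_mb_per_s):
    """Effective sneakernet throughput in MB/s.

    Counts two disk copies (onto the drive before shipping, off it
    after arrival) plus the time the package spends in transit.
    Uses 1 GB = 1000 MB.
    """
    data_mb = data_gb * 1000
    copy_seconds = 2 * data_mb / copy_rate_mb_per_s
    total_seconds = ship_hours * 3600 + copy_seconds
    return data_mb / total_seconds


def network_cost_usd(data_gb, per_gb_sent=1.0, per_gb_received=1.0):
    """Cost of moving the data over the wire, billed at both ends.

    The $1/GB defaults come from the Jim Gray quote above.
    """
    return data_gb * (per_gb_sent + per_gb_received)
```

With the quoted rates, network_cost_usd(1000) comes to $2,000 for a terabyte, matching Gray's figure. Because the shipping time is fixed while the copy time grows with the data, the effective rate climbs as you ship more, which is the scaling effect described above.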

Posted by Jason Striegel | Feb 7, 2007 12:50 AM
Data

February 5, 2007

Automate Your Backups

There's a classic horror story that keeps me from sleeping at night sometimes. I've heard it told a few different ways. I've even told the story myself more than once, but Phil's version that he posted yesterday morning was one of the most frightening:

A couple weeks ago a flood hit my apartment/office area and soaked the desktop system, monitors, equipment *and* back up drive (along with a ton of other stuff) - luckily I have a daily back up on a Powerbook. But, of course the Powerbook decided to completely stop working while at our ETSY event before that could be backed up too. Zapping the PRAM revealed the hard drive failed, so the usual steps of Disk Util, TechTool and then finally drive removal and DiskWarrior were attempted - for the most part the drive seems completely dead - there might be a chance to recover some data under linux, or from a data recovery shop, but it's not looking good.

According to his latest update, the backup drives dried out okay and appear to be working fine, so I guess that means he's managed to survive the perfect storm, but it got me thinking - how many of us ever keep a regular, daily backup in the first place? I've suffered several near-misses in the past, and I'm still guilty of not keeping good backups.

Never Again
So, February isn't too late for a new year's resolution. Don't go another day without your important files backed up. Let's sit down for 15 minutes, right now, and set up an automated backup system for ourselves. All you need is an external hard disk or a remote server with sufficient storage for a couple copies of your data. Based on Phil's story, you might want to situate your backup system on an elevated surface and not beneath any water pipes.

We're not focusing on a perfect backup solution here, with off-site, fireproof vault storage. Don't let the naysayers stop you with the long list of things that can go wrong with a simple backup solution, or explanations of how to do it the "right way". In 15 minutes you are going to be significantly more protected from data loss, and this will give you the time you need to relax and find a good price on your fireproof vault.
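
As one possible shape for that 15-minute setup, here's a minimal Python sketch that copies a folder to a timestamped snapshot on the backup disk and keeps only the newest couple of copies. The function name, the paths in the usage comment, and the snapshot naming scheme are all hypothetical; it makes full copies every time, so it's a simple starting point and not an efficient or complete backup tool.

```python
import shutil
import time
from pathlib import Path


def run_backup(source, backup_root, keep=2, stamp=None):
    """Copy `source` into a new timestamped folder under `backup_root`.

    Keeps the `keep` most recent snapshots and deletes older ones.
    `stamp` defaults to the current local time; pass one explicitly
    only when you need a predictable folder name.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source = Path(source)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # fails if the target already exists

    # Timestamped names sort chronologically, so the oldest come first.
    prefix = source.name + "-"
    snapshots = sorted(
        p for p in backup_root.iterdir()
        if p.is_dir() and p.name.startswith(prefix)
    )
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return target


# Example call (hypothetical paths, adjust for your machine):
# run_backup("/home/me/Documents", "/mnt/backup_disk/snapshots")
```

To make it automatic, have your system's scheduler (cron on Linux and Mac, Task Scheduler on Windows) run the script once a day.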

Posted by Jason Striegel | Feb 5, 2007 01:14 AM
Data, Linux, Mac, Productivity, Windows
