Archive: Software Engineering
November 7, 2008
Hand gesture multitouch using only a webcam
Andy Wilson of MS Research, a name you may recognize from yesterday's $1 gesture recognition post, is responsible for a number of pretty unbelievable projects involving image processing and human-computer interfaces. It's the sort of stuff that really blurs the boundaries between real and digital environments.
I was blown away by the video above, in which Andy demonstrates a multitouch-like hand gesture interface. Get this. It uses only a standard webcam.
The webcam is positioned to watch your keyboard and by simply making a pinching gesture with your thumb and index finger, you can grab and move objects on the screen, or rotate them by twisting your hand. Pinching with two hands, you can control two separate points on the screen, allowing you to easily perform more complex zoom and rotation actions by pulling your hands apart or moving them relative to each other.
I haven't seen source for this anywhere, but he does describe the technique, which is quite clever. By subtracting the background and examining the topology of the remaining image (just the solid background and your hands), you can count how many separate shapes the background is split into.
With fingers unpinched, the background is a single shape, albeit with a hand shaped isthmus pushing into it. When you pinch and form a circle with your thumb and forefinger, things change. A little island is created in the middle of your fingers and the background becomes two distinct shapes. The position and rotation of the inner shape provides you enough information to control objects on the screen.
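To make the topology trick concrete, here's a rough sketch of the counting step. This is not Andy Wilson's code (I haven't seen any source). It assumes the background subtraction has already produced a binary mask where 1 is hand and 0 is background, and the function name and the two tiny masks are made up for illustration.

```python
from collections import deque

def count_background_regions(mask):
    """Count 4-connected regions of background (0) pixels in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 0 or seen[r][c]:
                continue
            # Found an unvisited background pixel: flood-fill its whole region.
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions

# Hypothetical masks: an open "U" of fingers versus a closed pinch.
OPEN_HAND = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
PINCHED = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

print(count_background_regions(OPEN_HAND), count_background_regions(PINCHED))
```

When the count goes from one region to two, a pinch has started. A real implementation would also need the centroid and orientation of the enclosed region to drive the cursor, which this sketch leaves out.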
Hand Gesture Multitouch [via Procrastineering]
Andy Wilson
Previously:
Gesture recognition for Javascript and Flash
Posted by Jason Striegel |
Nov 7, 2008 07:05 PM
Design, Software Engineering, User Interface |
November 6, 2008
Gesture recognition for Javascript and Flash

The "$1 Recognizer" is a simple gesture recognition algorithm created by Andy Wilson from Microsoft Research and Jacob Wobbrock and Yang Li from the University of Washington.
By simple, I mean that it's under 100 lines of code that you can quickly add to your application to give it gesture recognition capabilities.
To enable novice programmers to incorporate gestures into their UI prototypes, we present a "$1 recognizer" that is easy, cheap, and usable almost anywhere in about 100 lines of code. In a study comparing our $1 recognizer, Dynamic Time Warping, and the Rubine classifier on user-supplied gestures, we found that $1 obtains over 97% accuracy with only 1 loaded template and 99% accuracy with 3+ loaded templates. These results were nearly identical to DTW and superior to Rubine.
It works by using a simple 4-step process, which basically amounts to:
- Resampling the recorded path into a fixed number of points that are evenly spaced along the path
- Rotating the path so that the first point is directly to the right of the path's center of mass
- Scaling the path (non-uniformly) to a fixed height and width
- For each reference path, calculating the average distance for the corresponding points in the input path. The path with the lowest average point distance is the match.
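The four steps above can be sketched in Python roughly as follows. This is my own loose illustration, not the authors' published source: the function names, the 32-point resampling, and the 100-unit box are assumptions, I also re-center each path on its centroid so paths can be compared, and the sketch skips any fine-tuning of the rotation angle during matching.

```python
import math

def centroid(pts):
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

def resample(pts, n=32):
    """Step 1: n points evenly spaced along the path."""
    pts = list(pts)
    interval = sum(math.dist(a, b) for a, b in zip(pts, pts[1:])) / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)   # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:        # guard against floating-point shortfall
        out.append(pts[-1])
    return out[:n]

def rotate_to_zero(pts):
    """Step 2: rotate about the centroid so the first point sits directly to its right."""
    cx, cy = centroid(pts)
    theta = -math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [((x - cx) * cos_t - (y - cy) * sin_t + cx,
             (x - cx) * sin_t + (y - cy) * cos_t + cy) for x, y in pts]

def scale_and_center(pts, size=100.0):
    """Step 3: non-uniform scale to a size x size box, then move the centroid to the origin."""
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w = (max(xs) - min(xs)) or 1.0
    h = (max(ys) - min(ys)) or 1.0
    scaled = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]

def normalize(pts):
    return scale_and_center(rotate_to_zero(resample(pts)))

def recognize(stroke, templates):
    """Step 4: the template with the lowest average point-to-point distance wins."""
    candidate = normalize(stroke)
    def average_distance(name):
        return sum(math.dist(p, q) for p, q in zip(candidate, templates[name])) / len(candidate)
    return min(templates, key=average_distance)

# Hypothetical templates: each is just the output of steps 1-3.
TEMPLATES = {
    "vee": normalize([(0, 0), (5, 10), (10, 0)]),
    "ell": normalize([(0, 0), (0, 10), (10, 10)]),
}
print(recognize([(100, 50), (115, 80), (130, 50)], TEMPLATES))
```

Note that a perfectly straight stroke has no height to scale, so this sketch (like any non-uniform scaling approach) is best tried with two-dimensional shapes.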
What's great is that the output of steps 1-3 is a reference path that can be added to the array of known gestures. This makes it extremely easy to give your application gesture support and create your own set of custom gestures, as you see fit.
Give the demo a try. I was pretty surprised at how accurate the results were, even with single-template custom gestures that I quickly scribbled out.
$1 Gesture Recognizer - Examples and Source (Javascript, Actionscript, and C#)
Posted by Jason Striegel |
Nov 6, 2008 07:27 PM
Ajax, Flash, Software Engineering, User Interface |
October 31, 2008
The Skein hash function and Threefish block cipher
The National Institute of Standards and Technology is holding a competition to design a new hash function to replace the current SHA family of functions and become SHA-3. The deadline for submissions was today, and the submissions will be evaluated over the coming years until a final proposed standard is made in 2012. Bruce Schneier posted some information about his team's entry, Skein, and the whole selection process:
NIST is holding a competition to replace the SHA family of hash functions, which have been increasingly under attack. (I wrote about an early NIST hash workshop here.)
Skein is our submission (myself and seven others: Niels Ferguson, Stefan Lucks, Doug Whiting, Mihir Bellare, Tadayoshi Kohno, Jon Callas, and Jesse Walker)....
The selection process will take around four years. I've previously called this sort of thing a cryptographic demolition derby -- last one left standing wins -- but that's only half true. Certainly all the groups will spend the next couple of years trying to cryptanalyze each other, but in the end there will be a bunch of unbroken algorithms; NIST will select one based on performance and features.
The Skein hash function is based on the Threefish block cipher, which is also released as part of the submission. The source has been released into the public domain, and you can download it from the Skein website.
Schneier on Security: The Skein Hash Function
Skein Submission Paper - Design, Usage, and Preliminary Cryptanalysis (PDF)
The Skein Hash Function Family Website
Posted by Jason Striegel |
Oct 31, 2008 11:10 PM
Cryptography, Software Engineering |
September 17, 2008
Stanford Engineering Everywhere

Stanford's School of Engineering has released a number of Computer Science and Electrical Engineering courses online, in their entirety, under the name Stanford Engineering Everywhere. The online program includes all course materials (notes, tests, and complete lecture recordings), free for students or educators to use under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
SEE users may pick and choose the materials that best meet their needs and interests. Want a refresher course on a particular programming concept? View a video lecture that covers the basics. Are you a programming novice? Spend several weeks viewing lectures, reading course materials and tackling class assignments. Test your knowledge by taking quizzes and exams.
As an example, here's the first lecture in the Machine Learning course, taught by Professor Andrew Ng:
The ten courses that are available cover a healthy range of topics. It's basically 3 or 4 semesters' worth of EE and Comp Sci education that you can brainload for free. Here's the current selection:
Introduction to Computer Science
Artificial Intelligence
Linear Systems and Optimization
- The Fourier Transform and its Applications
- Introduction to Linear Dynamical Systems
- Convex Optimization I
- Convex Optimization II
If you've ever wanted to go to Stanford, but didn't have the time, money, or grades, here's your chance.
Stanford Engineering Everywhere [via Creative Commons]
Previously
Lecturefox: free university lectures
Bootstrap Education
Posted by Jason Striegel |
Sep 17, 2008 08:36 PM
Education, Electronics, Life, Lifehacker, Software Engineering |
September 6, 2008
Write a Hadoop MapReduce job in any programming language
Hadoop is a Java-based distributed application and storage framework that's designed to run on thousands of commodity machines. You can think of it as an open source approximation of Google's search infrastructure. Yahoo!, in fact, runs many components of its search and ad products on Hadoop, and it's not too surprising that they are a major contributor to the project.
MapReduce is a method for writing software that can be parallelized across thousands of machines to process enormous amounts of data. For instance, let's say you want to count the number of referrals, by domain, in all the world's Apache server logs. Here's the gist of how you'd do it:
- Get all the world to upload their server logs to your gigantor distributed file system. You might automate and approximate this by having every web administrator add some JavaScript code to their site that causes their visitors' browsers to ping your own server, resulting in one giant log file of all the world's server logs. Your filesystem of choice is HDFS, the Hadoop Distributed Filesystem, which handles partitioning and replicating this enormous file between all of your cluster nodes.
- Split the world's largest log file into tiny pieces, and have your thousands of cluster machines parse the pieces, looking for referrers. This is the "Map" phase. Each chunk is processed and the referrers found in that chunk are output back to the system, which stores the output keyed by the referrer hostname. The chunk assignments are optimized so that the cluster nodes will process chunks of data that happen to be stored on their local fragment of the distributed file system.
- Finally, all the outputs from the Map phase are collated. This is called the "Reduce" phase. The cluster nodes are assigned a hostname key that was created during the Map phase. All of the outputs for that key are read in by the node and counted. The node then outputs a single result which is the domain name of the referrer, and the total number of referrals that were produced from that referrer. This is done hundreds of thousands of times, once for each referrer domain, and distributed across the thousands of cluster nodes.
At the end of this hypothetical MapReduce job, you're left with a concise list of each domain that's referred traffic, and a count of how many referrals it's given. What's cool about Hadoop and MapReduce is that it makes writing distributed applications like this surprisingly simple. The two functions to perform the example referrer parsing might only be about 20 lines of code. Hadoop takes care of the immense challenges of distributed storage and processing, letting you focus on your specific task.
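Here's a toy, single-process sketch of those phases, just to show how little logic the two functions need. It is not Hadoop code: the function names, the in-memory shuffle, and the sample log lines (Apache combined format, which is an assumption about your logs) are all made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

def map_referrers(chunk):
    """Map: emit (referrer hostname, 1) for each log line that has a referrer."""
    for line in chunk:
        fields = line.split('"')      # the referrer is the second quoted field
        if len(fields) < 4:
            continue
        host = urlparse(fields[3]).hostname
        if host:                      # "-" (no referrer) parses to no hostname
            yield host, 1

def shuffle(pairs):
    """Group every emitted value under its key; Hadoop does this between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(host, values):
    """Reduce: collapse all the values for one hostname into a single total."""
    return host, sum(values)

def count_referrers(chunks):
    mapped = (pair for chunk in chunks for pair in map_referrers(chunk))
    return dict(reduce_counts(host, values) for host, values in shuffle(mapped).items())

# Two hypothetical chunks of one big log file.
CHUNKS = [
    ['1.2.3.4 - - [06/Sep/2008:21:58:00 -0500] "GET / HTTP/1.1" 200 512 "http://example.com/a" "Mozilla"',
     '1.2.3.5 - - [06/Sep/2008:21:58:01 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla"'],
    ['1.2.3.6 - - [06/Sep/2008:21:58:02 -0500] "GET /x HTTP/1.1" 200 128 "http://example.com/b" "Mozilla"',
     '1.2.3.7 - - [06/Sep/2008:21:58:03 -0500] "GET /y HTTP/1.1" 200 128 "http://example.org/" "Mozilla"'],
]
print(count_referrers(CHUNKS))
```

On a real cluster each chunk would be mapped on a different node and each hostname reduced on whichever node was assigned that key; the logic per function stays about this size.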
Since Hadoop is written in Java, the natural way for you to create distributed jobs is to encapsulate your Map and Reduce functions into a Java class. If you're not a Java junkie, though, don't worry: there's a job wrapper called HadoopStreaming which can communicate with any program you write over the usual STDIN and STDOUT. This lets you write your distributed job in Perl, Python or even a shell script! You create two programs, one for the mapper and one for the reducer, and HadoopStreaming handles uploading them to all of the cluster nodes and passing data to and from your programs.
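As a sketch of what a streaming-style mapper and reducer pair can look like in Python: each side reads lines and writes tab-separated key/value lines, and the reducer relies on its input arriving sorted by key. The script layout, the `map`/`reduce` command-line argument, and the Apache combined log format are my own assumptions for the example.

```python
import sys
from itertools import groupby
from urllib.parse import urlparse

def mapper(lines):
    """Turn raw log lines into 'hostname<TAB>1' lines."""
    for line in lines:
        fields = line.split('"')
        if len(fields) >= 4:
            host = urlparse(fields[3]).hostname
            if host:
                yield f"{host}\t1"

def reducer(sorted_lines):
    """Sum the counts for each hostname; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for host, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{host}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    # Hypothetical convention: run as "referrers.py map" or "referrers.py reduce".
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    step = {"map": mapper, "reduce": reducer}.get(role)
    if step:
        for out in step(sys.stdin):
            print(out)
```

You can try the pair without a cluster by piping a log file through the mapper, then `sort`, then the reducer, which mimics the sorted hand-off between the two phases.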
If you want to play around with this, I really recommend a couple of howtos written by German hacker Michael G. Noll. He put together a walkthrough for getting Hadoop up and running on Ubuntu, and also a nice introduction to writing a MapReduce program using HadoopStreaming (with Python as an example).
Are any Hackszine readers using Hadoop? Let us know what you're doing and point us to more information in the comments.
Hadoop
Running Hadoop On Ubuntu Linux
Writing An Hadoop MapReduce Program In Python
Posted by Jason Striegel |
Sep 6, 2008 09:58 PM
Data, Software Engineering |
September 4, 2008
Objective-J and Cappuccino released

Hackszine reader Math Campbell writes:
As promised when they released their demo application 280 Slides, 280 North, the startup that invented a whole new language (Objective-J) to run their Cocoa-like Javascript framework, Cappuccino on, has released both Objective-J and Cappuccino as open-source under the LGPL. They're also providing documentation, tutorials and forums to help you master this new and exciting way of writing web-apps.
This project came to my attention in June when 280 North released their web-based, Powerpoint-like presentation application, 280 Slides. The team has re-implemented a significant portion of the Cocoa API in Objective-J, so developers who are familiar with writing applications for Cocoa or GNUstep can easily port over their skill set, and possibly their applications, to the web.
Now I've got both iPhone and Cappuccino development giving me a reason to start kicking around a common development platform for web and mobile applications. Have any of you Hackszine readers started playing with Objective-J/Cappuccino? If so, what's been your experience so far?
Cappuccino: open source web application framework
Cappuccino tutorials
Posted by Jason Striegel |
Sep 4, 2008 08:26 PM
Ajax, Software Engineering, Web |
September 2, 2008
Google Chrome's comic-strip technical overview

While I'm waiting for the Mac version of Google's new web browser, wondering what complications this has in store for me as a web developer, I couldn't help but notice how many non-hackers I've bumped into that could speak to the merits of processes versus threads or describe the benefits of Chrome's garbage collection model or security architecture. Two days ago, most of these folks wouldn't know a thread from the underside of their denims, but Scott McCloud's comic changed that.
So why is this a hack? Well, the strip that announced Chrome's release nicely bridged the nerd gap and managed to communicate some fairly technical content to a non-technical, though otherwise savvy potential user base. It's not easy to get people's attention when talking about memory management. It's difficult to communicate an esoteric architecture decision, and to both explain the decision and demonstrate its importance while not boring your audience is even more challenging.
I'm not saying comics are the way to go for all future technical documentation, but there's something to be learned here in terms of expanding your audience without dumbing down the content.
Google Chrome - Behind the Open Source Browser Project
Also worth noting: if you want to participate in the development of Chromium (the platform behind Google Chrome) you can download the source and communicate in the forums on the Chromium project page.
Posted by Jason Striegel |
Sep 2, 2008 08:24 PM
Software Engineering |
August 25, 2008
Wii Physics
Wii Physics is a clever little homebrew app. You use the Wiimote to rotate, size and place objects on a stage. Pulleys, ropes, gears and joints can be used to connect objects together, and when you press the play button, a 2D physics system is turned on, causing the objects to fall and interact with each other.
You can download this for free and run it from the Homebrew Channel. If you're ambitious, you can also download the source, add new features, or base a new game off of it. It's written using libwiisprite, a library you'll want to check out if you're thinking of doing any 2D game dev for the Wii.
Posted by Jason Striegel |
Aug 25, 2008 07:38 PM
Gaming, Retro Gaming, Science, Software Engineering |
August 22, 2008
The smallest program ever
Brian Raiter wrote an article many years ago in which he documented his quest to make the smallest possible Linux ELF executable, a stripped-down program that returns the answer to life, the universe, and everything.
While the standard gcc-compiled version of the application nets out at 3998 bytes, Brian discovered that the smallest possible size for an ELF executable that will still run correctly is 45 bytes:
This forty-five-byte file is less than one-eighth the size of the smallest ELF executable we could create using the standard tools, and is less than one-fiftieth the size of the smallest file we could create using pure C code. We have stripped everything out of the file that we could, and put to dual purpose most of what we couldn't.
Of course, half of the values in this file violate some part of the ELF standard, and it's a wonder that Linux will even consent to sneeze on it, much less give it a process ID. This is not the sort of program to which one would normally be willing to confess authorship.
It's not an easy process creating the smallest possible program. To get there, you need to dissect the inner workings of the operating system and the ELF file format, which is really what the article is about. If you've ever wondered about the mysterious events that happen before main() and after return(), here's your chance to take the red pill.
A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux
Posted by Jason Striegel |
Aug 22, 2008 05:55 PM
Linux, Software Engineering |
August 7, 2008
Ken Schwaber on Scrum
Scrum is a collection of tools for agile software development and project management. It helps to focus small software development teams into delivering a complete, tested, and quality product by breaking the development into small iterative chunks with a concrete output. Scrum doesn't necessarily help a team produce code faster, but it allows a team to find out very early in the process whether the development goals will be completed in the planned timeline.
Ken Schwaber, one of the developers of the Agile process and a major Scrum evangelist, presented Scrum to a group at Google. This video is a must watch for anyone involved in software development, whether you're a programmer or a project manager. In addition to introducing the management concept, he gives some insight into what makes a project fail (and how to know right away when it's going to), how to deliver on a tightened deadline without sacrificing quality (hint: half the features, not half baked), and why "core" functionality in a product tends to become unmanageable after 5 major development initiatives.
Google Tech Talks: Scrum et al.
Posted by Jason Striegel |
Aug 7, 2008 08:55 PM
Software Engineering |
August 6, 2008
Memcached and high performance MySQL
Memcached is a distributed object caching system that was originally developed to improve the performance of LiveJournal and has subsequently been used as a scaling strategy for a number of high-load sites. It serves as a large, extremely fast hash table that can be spread across many servers and accessed simultaneously from multiple processes. It's designed to be used for almost any back-end caching need, and for high performance web applications, it's a great complement to a database like MySQL.
In a typical environment, a web developer might employ a combination of process level caching and the built-in MySQL query caching to eke out that extra bit of performance from an application. The problem is that in-process caching is limited to the web process running on a single server. In a load-balanced configuration, each server is maintaining its own cache, limiting the efficiency and available size of the cache. Similarly, MySQL's query cache is limited to the server that the MySQL process is running on. The query cache is also limited in that it can only cache row results. With memcached you can set up a number of cache servers that can store any type of serialized object, and this data can be shared by all of the load-balanced web servers. Cool, no?
To set up a memcached server, you simply download the daemon and run it with a few parameters. From the memcached web site:
First, you start up the memcached daemon on as many spare machines as you have. The daemon has no configuration file, just a few command line options, only 3 or 4 of which you'll likely use:
# ./memcached -d -m 2048 -l 10.0.0.40 -p 11211
This starts memcached up as a daemon, using 2GB of memory, and listening on IP 10.0.0.40, port 11211. Because a 32-bit process can only address 4GB of virtual memory (usually significantly less, depending on your operating system), if you have a 32-bit server with 4-64GB of memory using PAE you can just run multiple processes on the machine, each using 2 or 3GB of memory.
It's about as simple as it gets. There's no real configuration. No authentication. It's just a gigantor hash table. Obviously, you'd set this up on a private, non-addressable network. From there, the work of querying and updating the cache is completely up to the application designer. You are afforded the basic functions of set, get, and delete. Here's a simple example in PHP:
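// Register every cache server in the pool
$memcache = new Memcache;
$memcache->addServer('10.0.0.40', 11211);
$memcache->addServer('10.0.0.41', 11211);
// Store a value for 60 seconds (third argument is flags, fourth is the expiry)
$value = "Data to cache";
$memcache->set('thekey', $value, 0, 60);
echo "Caching for 60 seconds: $value <br>\n";
// Read it back from whichever server holds the key
$retrieved = $memcache->get('thekey');
echo "Retrieved: $retrieved <br>\n";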
The PHP library takes care of the dirty work of serializing any value you pass to the cache, so you can send and retrieve arrays or even complete data objects.
In your application's data layer, instead of immediately hitting the database, you can now query memcached first. If the item is found, there's no need to hit the database and assemble the data object. If the key is not found, you select the relevant data from the database and store the derived object in the cache. Similarly, you update the cache whenever your data object is altered and updated in the database. Assuming your API is structured well, only a few edits need to be made to dramatically alter the scalability and performance of your application.
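The read-through and write-through flow described above can be sketched like this. To keep it runnable anywhere, the cache here is a plain in-memory stand-in that only mimics get, set, and delete, and the "database" is a dict; `FakeCache`, `UserStore`, the key format, and the 300-second expiry are all hypothetical names and values, not part of any memcached client library.

```python
class FakeCache:
    """In-memory stand-in for a memcached client (get/set/delete only)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, expire=0):
        self._data[key] = value      # expiry is ignored in this stand-in

    def delete(self, key):
        self._data.pop(key, None)


class UserStore:
    """Data layer that checks the cache before touching the database."""
    def __init__(self, cache, db):
        self.cache = cache
        self.db = db                 # assumption: a dict standing in for MySQL
        self.db_reads = 0

    def load(self, user_id):
        key = f"user:{user_id}"
        user = self.cache.get(key)
        if user is None:             # cache miss: hit the database, then cache it
            self.db_reads += 1
            user = self.db[user_id]
            self.cache.set(key, user, expire=300)
        return user

    def save(self, user_id, user):
        self.db[user_id] = user      # the database stays the source of truth
        self.cache.set(f"user:{user_id}", user, expire=300)


store = UserStore(FakeCache(), {1: {"name": "Ada"}})
store.load(1)
store.load(1)                        # second read is served from the cache
print("database reads:", store.db_reads)
```

Because the caching lives entirely inside `load` and `save`, the rest of the application doesn't need to know the cache exists, which is why a well-structured API needs only a few edits.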
I've linked to a few resources below where you can find more information on using memcached in your application. In addition to the documentation on the memcached web site, Todd Hoff has compiled a list of articles on memcached and summarized several memcached performance techniques. It's a pretty versatile tool. For those of you who've used memcached, give us a holler in the comments and share your tips and tricks.
Memcached
Strategies for Using Memcached and MySQL Better Together
Memcached and MySQL tutorial (PDF)
Posted by Jason Striegel |
Aug 6, 2008 10:37 PM
Data, Linux, Linux Server, MySQL, Software Engineering |
July 29, 2008
DJBDNS, DNS exploits, Bernstein, Schneier, and security by design
If you haven't been living under a rock, you've probably heard of the DNS vulnerability that Dan Kaminsky announced about a half year ago. The plan was that Kaminsky would be working with DNS server vendors to provide a patch, giving ample time for administrators to upgrade before the details of the exploit were released later this year. Unfortunately the exploit was leaked prematurely, causing a general freak-out mode amongst people that administer DNS systems.
When I read the article on Slashdot, the "all name servers should be patched as soon as possible" quote dropped a bit of scare on me too. What about my sad little DNS server? I envisioned spending an evening working through a time consuming process of patching and reconfiguring things that I haven't had to touch in years. Much to my pleasant surprise, djbdns, D. J. Bernstein's DNS server, was not vulnerable. My decision to use djbdns a number of years ago was primarily due to his vocal philosophy of engineering security by design instead of by response.
Bruce Schneier's analysis of things is spot on as usual. It's a solid case study for hygienic software engineering practices and the design of secure systems.
The real lesson is that the patch treadmill doesn't work, and it hasn't for years. This cycle of finding security holes and rushing to patch them before the bad guys exploit those vulnerabilities is expensive, inefficient and incomplete. We need to design security into our systems right from the beginning. We need assurance. We need security engineers involved in system design. This process won't prevent every vulnerability, but it's much more secure -- and cheaper -- than the patch treadmill we're all on now.
What a security engineer brings to the problem is a particular mindset. He thinks about systems from a security perspective. It's not that he discovers all possible attacks before the bad guys do; it's more that he anticipates potential types of attacks, and defends against them even if he doesn't know their details. I see this all the time in good cryptographic designs. It's over-engineering based on intuition, but if the security engineer has good intuition, it generally works.
Kaminsky's vulnerability is a perfect example of this. Years ago, cryptographer Daniel J. Bernstein looked at DNS security and decided that Source Port Randomization was a smart design choice. That's exactly the work-around being rolled out now following Kaminsky's discovery. Bernstein didn't discover Kaminsky's attack; instead, he saw a general class of attacks and realized that this enhancement could protect against them. Consequently, the DNS program he wrote in 2000, djbdns, doesn't need to be patched; it's already immune to Kaminsky's attack.
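To get a feel for why a random source port matters, here's a back-of-the-envelope calculation. A forged DNS reply has to match the query's 16-bit transaction ID, and with port randomization it also has to arrive on the right source port. The port range below (everything above 1023) is my assumption for illustration, not a statement about what djbdns or any patched server actually uses.

```python
TXID_BITS = 16  # DNS transaction IDs are 16-bit values

def guess_space(random_ports, port_count=65536 - 1024):
    """Number of (transaction ID, source port) combinations a blind spoofer faces."""
    ids = 2 ** TXID_BITS
    return ids * port_count if random_ports else ids

fixed = guess_space(random_ports=False)
randomized = guess_space(random_ports=True)
print(f"fixed source port:      {fixed:,} combinations")
print(f"randomized source port: {randomized:,} combinations")
print(f"the attacker's search space grows {randomized // fixed:,}x")
```

It doesn't make forgery impossible, but it turns a guess among tens of thousands of values into a guess among billions.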
The djbdns server wasn't pre-installed on the Linux distro I based my poor old server on. DJB's daemontools package, which manages the startup and shutdown of the service, was annoying to deal with when every other application just uses a normal init rc script. The DNS server configuration and setup was also unfamiliar to me, having previously only worked with BIND zone files.
There's one other thing that has really been different with djbdns than any other DNS server I've ever administered: I've never had to patch it. I've only had one other software experience like this, with the qmail mail transfer system. Qmail is also designed by Bernstein. Hmm.
If you're upgrading your DNS server anyway, maybe now is the time to start thinking about your alternatives.
Daniel J. Bernstein's djbdns server
Schneier - The DNS Vulnerability
DJB on DNS forgery
Slashdot - Kaminsky's DNS Attack Disclosed, Then Pulled
Posted by Jason Striegel |
Jul 29, 2008 08:52 PM
Cryptography, Network Security, Software Engineering |
July 27, 2008
Cyber Security Awareness Week

Dan Guido from the Information Systems and Internet Security Lab at the Polytechnic Institute of NYU wrote in about the Institute's 5th annual Cyber Security Awareness Week. If you're in high-school or a college undergraduate program, this is a great opportunity to test your infosec skills against your peers, and hopefully earn a little prize money in the process.
ISIS Lab is organizing NYU-Poly's 5th annual Cyber Security Awareness Week (CSAW) where students can compete and win prizes in a variety of information security challenges. There will be door prizes, raffles for participating, and bonus prizes for undergrad and high school participants. Qualified finalists will receive a travel scholarship to attend the awards ceremony in New York City.
There are a number of events, including an application security "capture the flag" challenge, a security quiz which covers everything from cryptography to risk management, and a 5-day forensics puzzle. There's even an embedded systems challenge where teams are tasked with trying to find hardware and software bugs in a mock control system.
This looks like a lot of fun. Some of the contest materials become available at the beginning of September, so sign up soon if you're interested in participating.
Cyber Security Awareness Week 2008
Posted by Jason Striegel |
Jul 27, 2008 09:28 PM
Cryptography, Electronics, Network Security, Software Engineering |
July 26, 2008
MySQL performance tuning
Jay Pipes, MySQL employee and co-author of Pro MySQL, gave a great presentation to Google employees which covers a number of techniques for tuning performance on MySQL. His examples include debugging and analyzing problems as well as best practices for table and index design, query and join operations, and server variable adjustments.
It's a little over 40 minutes long, but incredibly informative, whether you're a casual querier or a power MySQL user. Though some of this stuff is MySQL (or MyISAM or InnoDB) specific, the majority of the content is essential material for the average database application developer.
If you don't have time to sit through it (shame on you) or you're looking to jump right to a specific topic, there's a nice time-coded dissection of the talk over at Peteris Krumins' blog. There's something so appropriate about adding a search index to a video about MySQL optimization.
Performance Tuning Best Practices for MySQL
Video Index
Posted by Jason Striegel |
Jul 26, 2008 12:11 PM
MySQL, SQL, Software Engineering |
July 15, 2008
When to denormalize
There's been a bit of a database religious war on Dare Obasanjo and Jeff Atwood's blogs, all on the subject of database normalization: when to normalize, when not to, and the performance and data integrity issues that underlie the decision.
Here's the root of the argument. What we've all been taught regarding database design is irrelevant if the design can't deliver the necessary performance results.
The 3rd normal form helps to ensure that the relationships in your DB reflect reality, that you don't have duplicate data, that the zero-to-many relationships in your system can accommodate any potential scenario, and that space isn't wasted and reserved for data that isn't explicitly being used. The downside is that a single object within the system may span many tables and, as your dataset grows large, the joins and/or multiple selects required to extract entities from the system begin to impact the system's performance.
By denormalizing, you can compromise and pull some of those relationships back into the parent table. You might decide, for instance, that a user can have only 3 phone numbers, 1 work address, and 1 home address. In doing so, you've met the requirements of the common scenario and removed the need to join to separate address or contact number tables. This isn't an uncommon compromise. Just look at the contacts table in your average cell phone to see it in action.
Jeff writes:
Both solutions have their pros and cons. So let me put the question to you: which is better -- a normalized database, or a denormalized database?
Trick question! The answer is that it doesn't matter! Until you have millions and millions of rows of data, that is. Everything is fast for small n.
So for large n, what's the solution? In my personal experience, you can usually have it both ways.
Design your database to 3NF from the beginning to ensure data integrity and to allow room for growth, additional relationships, and the sanity of future querying and indexing. Only when you find there are performance problems do you need to think about optimizing. Usually this can be accomplished through smarter querying. When it cannot, you derive a denormalized data set from the normalized source. This can be as simple as an extra field in the parent table that derives sort information on inserts, or it can be a full-blown object cache table that's updated from the official source at some regular interval or when an important event occurs.
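Here's a small sketch of the "derive a denormalized set from the normalized source" idea, using Python's built-in sqlite3 module so it runs anywhere. The table names, columns, and sample rows are invented for the example; the same shape would apply to MySQL or any other relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized source of truth: a user can have any number of phone numbers.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phones (
        user_id INTEGER NOT NULL REFERENCES users(id),
        kind    TEXT NOT NULL,
        number  TEXT NOT NULL
    );
    -- Derived, denormalized copy: one row per user, readable without a join.
    CREATE TABLE user_cache (user_id INTEGER PRIMARY KEY, name TEXT, phones TEXT);
""")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
conn.executemany("INSERT INTO phones VALUES (?, ?, ?)",
                 [(1, "home", "555-0100"), (1, "work", "555-0101")])

def refresh_user_cache(conn):
    """Rebuild the cache table from the normalized tables (run on a schedule or on change)."""
    conn.execute("DELETE FROM user_cache")
    conn.execute("""
        INSERT INTO user_cache (user_id, name, phones)
        SELECT u.id, u.name, group_concat(p.kind || ':' || p.number, ', ')
        FROM users u LEFT JOIN phones p ON p.user_id = u.id
        GROUP BY u.id, u.name
    """)
    conn.commit()

refresh_user_cache(conn)
row = conn.execute("SELECT name, phones FROM user_cache WHERE user_id = 1").fetchone()
print(row)
```

The normalized tables remain the official record, so the cache table can be thrown away and rebuilt at any time without risking data integrity.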
Read the discussions and share your comments. To me, the big takeaway is that there's no one solution that will fit every real world problem. Ultimately, your final design has to reflect the unique needs of the problem that is being solved.
When Not to Normalize your SQL Database
Maybe Normalizing Isn't Normal
Posted by Jason Striegel |
Jul 15, 2008 08:47 PM
Data, Software Engineering |