Simulating Missing Language Features (with Cheap Tricks)
October 1, 2008
If you have ever implemented “real-world” application logic in an uncommon or less advanced programming language, you may have had to deal with the fact that features you take for granted in other languages are not available. When I encounter such obstacles I am often fascinated by the challenge of trying to add the missing functionality to the language in one way or another. Sometimes I will implement the feature “properly”, but often I find it easier and more convenient just to simulate the behavior I want. Over the years I have played around with a fair share of odd programming environments, so here I present a collection of cheap tricks I have used to overcome language limitations in the past.
Simulating Hash Maps with Arrays
These days, any “respectable” language has some kind of support for hash maps or associative arrays, either in the language itself or via standard libraries. In Java there is the Map interface and the HashMap implementation, in C++ there is std::map and std::unordered_map, while in Perl, PHP and Python, associative containers are a part of the language itself. But what do you do if you want the convenience of easy access to key/value pairs in languages that don’t provide any such features?
Of course, you could always set aside a couple of days to implement a real hash map, but unless you are trying to impress your university professor or performance actually matters to you, it will probably just be a waste of time. Implementing one may also prove rather difficult if your language lacks other “essential” features, like data records, structures or object encapsulation (i.e. classes).
However, most languages support arrays, and as it turns out, we can easily simulate a mapping container using two arrays:
function findIndex(array, value):
    for index from 0 to size of array-1:
        if array[index] equals value:
            return index
    return -1 // value not found

function setMapValue(keyArray, valArray, key, value):
    index = findIndex(keyArray, key)
    if index equals -1
        append key to keyArray
        append value to valArray
    else
        valArray[index] = value

function getMapValue(keyArray, valArray, key):
    index = findIndex(keyArray, key)
    if index equals -1
        return nothing // key not found
    return valArray[index]
Using the two utility functions setMapValue() and getMapValue(), we can now store and retrieve key/value pairs from our (simulated) map:
setMapValue(keyArray, valArray, "sortValue", 15)
setMapValue(keyArray, valArray, "refCount", 0)
...
sortValue = getMapValue(keyArray, valArray, "sortValue")
refCount = getMapValue(keyArray, valArray, "refCount")
It may look a bit dirty, but compared to “manually” manipulating array pairs spread around the code, I find this technique very useful when working in “less sophisticated” environments.
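As a concrete illustration, here is a minimal Python sketch of the same two-array technique. Python already has dictionaries, so this is for demonstration only. The function names mirror the pseudocode above, and returning None for a missing key is my assumption about what “nothing” should mean.

```python
def find_index(array, value):
    """Return the position of value in array, or -1 if it is absent."""
    for index in range(len(array)):
        if array[index] == value:
            return index
    return -1


def set_map_value(key_array, val_array, key, value):
    """Insert or overwrite the value stored under key."""
    index = find_index(key_array, key)
    if index == -1:
        # New key: both arrays grow together, so positions stay aligned.
        key_array.append(key)
        val_array.append(value)
    else:
        val_array[index] = value


def get_map_value(key_array, val_array, key):
    """Return the value stored under key, or None if the key is unknown."""
    index = find_index(key_array, key)
    if index == -1:
        return None
    return val_array[index]


keys, values = [], []
set_map_value(keys, values, "sortValue", 15)
set_map_value(keys, values, "refCount", 0)
set_map_value(keys, values, "sortValue", 20)  # overwrites the earlier 15
print(get_map_value(keys, values, "sortValue"))  # prints 20
```

Every lookup is a linear scan over the key array, which is fine for a handful of entries and gets slower as the map grows.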
Simulating Arrays with Variables
Hash maps and associative containers are somewhat advanced data structures, so I don’t expect every home-grown programming language or proprietary third-party scripting engine to support them, but what about arrays? What if your language has no concept of a sequence of (related) values?
The answer is simple. Just simulate arrays using regular variables:
function setArrayValue(index, value)
    if index equals 0
        arrayVariable0 = value
        return
    if index equals 1
        arrayVariable1 = value
        return
    if index equals 2
        arrayVariable2 = value
        return

function getArrayValue(index)
    if index equals 0
        return arrayVariable0
    if index equals 1
        return arrayVariable1
    if index equals 2
        return arrayVariable2
    return nothing // index out of bounds
As the example illustrates, this technique requires a lot of typing and code duplication, and is therefore somewhat tedious to implement for large arrays. Also, your language may require variables to be defined before use, resulting in even more verbose code. Another disadvantage is that since the variable names are hard-coded, you need a separate function for each array and care must be taken to avoid name collisions between distinct arrays.
This may sound like a debugging nightmare waiting to happen, but it is also an excellent opportunity to utilize your favorite—hopefully more advanced—scripting language and write a simple source code generator, saving you all the trouble of extensive manual typing and hard-to-find copy/paste errors.
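Such a generator can be very small. The following Python sketch emits the accessor pseudocode for a variable-backed array of a given size. The function name generate_array_functions and the naming scheme for the generated functions and variables are my own assumptions, so adapt the output to whatever syntax your target language actually uses.

```python
def generate_array_functions(name, size):
    """Return pseudocode for a variable-backed array called name."""
    lines = [f"function set{name}Value(index, value)"]
    for i in range(size):
        lines.append(f"    if index equals {i}")
        lines.append(f"        {name}Variable{i} = value")
        lines.append("        return")
    lines.append(f"function get{name}Value(index)")
    for i in range(size):
        lines.append(f"    if index equals {i}")
        lines.append(f"        return {name}Variable{i}")
    lines.append("    return nothing // index out of bounds")
    return "\n".join(lines)


# Emit the accessors for a three-element array named "Score".
print(generate_array_functions("Score", 3))
```

Because the array name is baked into every generated identifier, generating one pair of functions per array also takes care of the name collision problem.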
An interesting feature of these “manually implemented” arrays is that they will allow you to have arrays with holes in them, or arrays using arbitrary indices of your choosing. If your implementation language supports dynamic variable typing, you can even use this technique to simulate arrays of mixed data types.
Utilizing Dynamic Variable Names
If your language supports run-time generation of variable names (often called dynamic variables or variable variables), the array implementation described above can be rewritten to something much simpler (variable variable syntax borrowed from PHP):
function setArrayValue(array, index, value)
    variableName = array + index
    ${variableName} = value

function getArrayValue(array, index)
    variableName = array + index
    return ${variableName}
In fact, the same trick can also be used to simplify the “hash map” implementation as well, eliminating the need for two array parameters:
function setMapValue(array, key, value):
    keyArray = array + "Key"
    valArray = array + "Value"
    index = findIndex(${keyArray}, key)
    if index equals -1
        append key to ${keyArray}
        append value to ${valArray}
    else
        ${valArray}[index] = value

function getMapValue(array, key):
    keyArray = array + "Key"
    valArray = array + "Value"
    index = findIndex(${keyArray}, key)
    if index equals -1
        return nothing // key not found
    return ${valArray}[index]
Note that dynamic variables are considered by many as one of the most dangerous features of (modern) scripting languages, so be careful not to shoot yourself in the foot. They remind me of the good old days of self-modifying assembly code: great fun for (obscure) hacks and very hard to debug when things go wrong.
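To try the idea in a mainstream language, here is a hedged Python sketch that builds variable names at run time through the module's globals() dictionary. This is a stand-in for PHP-style variable variables, not something I would recommend in real Python code, and returning None for an unset element is my assumption.

```python
def set_array_value(array, index, value):
    """Store value in a module-level variable named <array><index>."""
    variable_name = f"{array}{index}"
    globals()[variable_name] = value


def get_array_value(array, index):
    """Read the variable named <array><index>, or None if it was never set."""
    variable_name = f"{array}{index}"
    return globals().get(variable_name)


set_array_value("score", 0, 10)
set_array_value("score", 7, "seven")  # holes and mixed types are fine
print(get_array_value("score", 7))    # prints seven
print(get_array_value("score", 3))    # prints None (never assigned)
```

The sketch also shows one way to shoot yourself in the foot: element 10 of an array named score and element 0 of an array named score1 both map to the variable score10, so they silently overwrite each other.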
Simulating Random Access to List Elements
Some languages support iteration over lists, but not direct access to elements via indices. If you want to “break the rules”, and index lists directly anyway, you can often achieve this with a custom function or macro—assuming the language provides you with a way to define those—like this:
function getValueByIndex(list, index)
    i = 0
    for each element in list
        if i equals index
            return element
        i = i + 1
    return nothing // index out of bounds
Again, iterating over the entire list to find your element may seem a bit odd, but when your data set is small, or you don’t care about efficiency, this approach works surprisingly well.
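The same idea works in Python for anything that can be iterated but not indexed, such as a generator. This is a minimal sketch; the countdown generator is a hypothetical example, and None stands in for “nothing”.

```python
def get_value_by_index(items, index):
    """Return the element at position index of any iterable, or None."""
    for i, element in enumerate(items):
        if i == index:
            return element
    return None  # index out of bounds


def countdown(start):
    """A generator: it can be iterated but not indexed directly."""
    while start > 0:
        yield start
        start -= 1


print(get_value_by_index(countdown(5), 2))  # prints 3
```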
Simulating Data Records with (Simulated) Hash Maps
Have you ever wanted composite data structures while working in a language that doesn’t support them? Well, I have, and I usually get my will anyway, one way or the other.
Using the map implementation from before, we can simulate a three-field Person record with ease:
setMapValue("person", "firstName", "Johnny")
setMapValue("person", "lastName", "Mnemonic")
setMapValue("person", "age", 37)
...
firstName = getMapValue("person", "firstName")
lastName = getMapValue("person", "lastName")
age = getMapValue("person", "age")
Here, too, we can use dynamic variable names to simulate more complex data structures, like an array of Person records:
function setPerson(array, index, firstName, lastName, age, occupation)
    recordName = array + index
    setMapValue(recordName, "firstName", firstName)
    setMapValue(recordName, "lastName", lastName)
    setMapValue(recordName, "age", age)
    setMapValue(recordName, "occupation", occupation)
Unfortunately, unless your language supports multiple return values or pass-by-reference parameters, retrieving the data from the array will involve a bit more “manual labor”:
function printRecords(prefix, arraySize)
    for index from 0 to arraySize-1
        recordName = prefix + index
        firstName = getMapValue(recordName, "firstName")
        lastName = getMapValue(recordName, "lastName")
        occupation = getMapValue(recordName, "occupation")
        print "Record #" + index
        print "First name: " + firstName
        print "Last name: " + lastName
        print "Occupation: " + occupation
This may not be the best way to represent thousands of records from a collection of database tables, but if you only need to read a few fields from a configuration file or a convenient way to organize an internal data structure, this approach can make things just a bit more readable and maintainable—at least in situations where the alternative would be manually managing separate variables for each field in each record, which can be a real drag, both to implement and maintain.
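For completeness, here is a rough Python sketch of the record technique, with each record stored as a pair of dynamically named lists. It leans on globals() to imitate variable variables, and the helper _arrays_for is a hypothetical name of my own; treat it as an illustration of the idea, not as sensible Python.

```python
def _arrays_for(name):
    """Fetch (or create) the key/value lists backing the map called name."""
    keys = globals().setdefault(name + "Key", [])
    values = globals().setdefault(name + "Value", [])
    return keys, values


def set_map_value(name, key, value):
    """Insert or overwrite key in the map called name."""
    keys, values = _arrays_for(name)
    if key in keys:
        values[keys.index(key)] = value
    else:
        keys.append(key)
        values.append(value)


def get_map_value(name, key):
    """Return the value stored under key, or None if it is unknown."""
    keys, values = _arrays_for(name)
    return values[keys.index(key)] if key in keys else None


def set_person(array, index, first_name, last_name, age):
    """Store one Person record as the map named <array><index>."""
    record_name = f"{array}{index}"
    set_map_value(record_name, "firstName", first_name)
    set_map_value(record_name, "lastName", last_name)
    set_map_value(record_name, "age", age)


set_person("people", 0, "Johnny", "Mnemonic", 37)
print(get_map_value("people0", "lastName"))  # prints Mnemonic
```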
Simulating Variables with …
No, really. You don’t want to do that.
Conclusion
Often when suggesting feature simulations like these to people struggling with language limitations, I get negative reactions. Their first response is usually something like “No, that’s ugly” or “No, we can’t do that. It will be too slow”. Sometimes these concerns are valid, but most of the time the assumptions are just wrong.
When it comes to ugliness, I will be the first to admit that tricks like these are simple and ugly, but as it turns out, they often get the job done, and if you are lucky, they may even make your code more readable and maintainable. As for speed concerns, if you are worried about speed you should not be using a language requiring hacks to support these features anyway. You would probably be better off choosing a different approach, preferably by using a “real” language. In fact, if you regularly encounter problems like these and struggle with language limitations, chances are you are using the wrong language for your tasks.
Please, Ask Stupid Questions
September 3, 2008
Last week I read about Jeff Atwood having some database problems on his new site. Being a popular blogger, he soon had hundreds of comments, most of them telling him how elementary his problem was and how disappointed people were that someone of his skill and reputation did not know how to solve such a trivial task as avoiding database deadlocks. Luckily, one person was wise enough to provide a different view (quoting an earlier comment):
“It’s always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks?”
Wrong. Dead wrong and incredibly dangerous. And egotistical. If I were interviewing you I would immediately flip the bozo bit and thank you for your time.
It’s impossible to know everything. The hallmark of a good programmer is not what they know but their ability to learn what they don’t. If you do not promote an environment where any question can be asked, no matter how naive and trivial, problems become intractable because people are too afraid to ask for help. Just because someone has been in the industry for a while doesn’t mean they know everything.
Chris on August 25, 2008 12:09 PM
As children, we ask “stupid” questions all the time, and (hopefully) nobody gets their head bitten off for doing so. We might laugh and smile, but after all, we expect children to ask about everything, as this is how they learn. But does this method of learning stop being valid once we grow up? I hope not.
I remember once sitting on a public city bus observing a small child asking her father if the buses were on teams. The father, somewhat baffled by the question, did not seem to know what to answer. After a pause of silence, his expression of surprise transformed to a smile, followed by a short “no”, while trying to hide his laughter. The question may have appeared silly to the father, but considering we were on a red bus and a yellow one had just passed us, it made perfect sense to me. Assuming the child was familiar with the tradition of sports teams wearing different colors, it seemed logical that she would infer this also to be valid for buses. The analogy of sports teams might seem strange applied to public transportation, but even though both the yellow and red buses were operated by the same company at the time, the green ones were not.
In my opinion, there are no stupid questions, only stupid answers. Never bite someone’s head off for asking a “stupid” question—it can be very destructive. In these situations I think it is best to follow the good advice of Marge Simpson’s mother. If you don’t have anything nice to say, don’t say anything at all. Belittling someone by lecturing them on how trivial their question is, or how what they are asking about is something they should already know, does not really help anyone. If you don’t want to or don’t have the time to help someone, tell them so, but don’t ridicule them for asking.
Also, if asked about something you think the person should be capable of finding out themselves, you don’t have to give a direct answer. Often I find it better to provide the person with some reference material or relevant search keywords. That way they can do their own research, and hopefully gain a better understanding of the subject than a quick explanation or a direct answer to their question would have given them. When people do make an effort to conduct their own research, you should also be more willing to answer specific questions that may come up while they are studying the topic.
Often, asking a “stupid” question can reveal holes or misconceptions in your mental representation of a problem or knowledge domain that would otherwise go undetected. I much prefer someone to be honest about their uncertainties, and ask a stupid question, than pretending to understand something they don’t. In many cases you may not even be aware of your own lack of understanding until you do ask that “stupid” question, which is what makes asking them so important in the first place.
That being said, there is a huge difference between asking about something because you are unsure if you understand it correctly, or something is unclear to you, and asking because you are simply too lazy to look up the answer yourself or think through the problem properly. Personally, I ask stupid questions all the time, although these days most of them can be answered by Google or Wikipedia. However, if you do indeed make an effort to find an answer and you are not successful, please, don’t be afraid of asking someone.
How I Got Started Programming
August 21, 2008
Browsing through programming blogs at wordpress.com I noticed an interesting post (inspired by a similar post at toxicsoftware.com) asking people how they got started programming. It turned out this has been going on for a while, with various people answering and passing the questions on to new people, and so on. Even if nobody actually asked me, I decided to answer anyway. I don’t know who started it, but I tracked down what most people seem to consider the original source, a blog post by Michael Eaton, and copied the questions from there.
1. How Old Were You When You Started Programming?
I am not exactly sure. My family got our first PC when I was nine (1989), but I can’t remember how long we had it before I started with programming. My guess would be I was about ten. Before then I had also experimented with some BASIC on the C64. I didn’t have one myself, but some of my friends did. We mostly copied the examples from the manuals and modified them, so I wasn’t really doing much programming on my own at that point. It was not until I had regular access to a PC that I started to do some “serious” programming.
2. How Did You Get Started in Programming?
I don’t remember exactly. I learned to use Norton Commander for editing batch files and hacking save games very early. Later I discovered GW-BASIC—and the manual that came bundled with our PC—which I used to write text adventure games. I think one of the main reasons I started with programming was because I enjoyed being able to control the computer and make it do what I wanted.
3. What Was Your First Language?
MS-DOS batch scripting and GW-BASIC. Batch files were my first introduction to variables, conditions and control structures, but GW-BASIC was what I eventually used for “real” programs.
4. What Was the First Real Program You Wrote?
Again, it’s hard for me to remember, since it’s so many years ago, but I do remember writing a rather long text adventure game that could be played with multiple outcomes and various ways to complete the game. Later on I also wrote more graphics-oriented games, usually cloning classics like Space Invaders, Pong and Snake.
5. What Languages Have You Used Since You Started Programming?
In no particular order: MS-DOS and 4DOS batch files, GW-BASIC, QuickBASIC, Turbo Basic, Commodore BASIC, Turbo/Borland Pascal, x86 assembly, M68k assembly, C64 assembly, C, C++, Delphi, Java, JavaScript, JSP, C#, Visual Basic, VBScript, ASP, Cg, HLSL, PHP, Python, Perl, Ruby, mIRC scripting, Bash scripting, Lua, BeanShell, TI-82 BASIC and TI-82 assembly, in addition to various template languages, a few scripting languages I wrote myself, and possibly some I don’t remember.
6. What Was Your First Professional Programming Gig?
I guess that depends on what is meant by “professional programming gig”. The first time I earned money from programming was when I won the 4k intro competition at The Gathering in 1997. I had won some other competitions earlier, but The Gathering was the first event where the prize was actually paid in cash.
I also did some programming and scripting while working as a computer technician for the City of Oslo’s school districts during the summer and fall of 1997, but I was not actually hired to do programming. My main job was building computers, setting up servers and cabling networks.
I would have to say the first professional gig was when I started working full-time as a game programmer in 1999.
7. If You Knew Then What You Know Now, Would You Have Started Programming?
I don’t know, but probably, yes. I never planned on becoming a programmer, it just happened. I don’t think I would have chosen differently, because I only did what I thought was fun and kept on doing it.
8. If There is One Thing You Learned Along the Way that You Would Tell New Developers, What Would It Be?
Programming is about solving problems. To be good at it you have to practice it, a lot. Try and fail, and when you fail, try again. Stay up-to-date with current technologies and always try to educate yourself by reading books, discussing problems with your peers and experimenting, a lot. Technologies always change and (good) programmers never stop learning.
9. What’s the Most Fun You’ve Ever Had … Programming?
Actually, I have so much fun programming that I already wrote an article about it.
The Programming “High”
August 12, 2008
I recently read an interesting blog post asking “what’s the most fun you’ve ever had… programming?” After thinking about it for a while, I realized I wasn’t able to answer. Not because I don’t have fun programming, but because I have so much fun programming, I can’t easily single out one project as the one I enjoyed the most. Of course, not every project is fun all the time, but I find programming to be great fun most of the time.
I think almost any project—be it programming or something else—can be fun if you want it to be. If you tell yourself “this project is going to be boring”, or “this task is boring”, chances are you will be bored. I have the most fun when I achieve something and when I learn something, and in almost any project you can accomplish at least one of the two. Most of the time you can get both. Even a seemingly trivial task, like creating a simple snake game, can be challenging if you want it to be.
When I was a teenager, I thought games and graphics were the most fun things to program, and I couldn’t understand how anyone could possibly enjoy themselves writing “boring” enterprise code in systems with no graphics at all. To me, real-time graphics and interactivity were essential requirements for a “fun project”. As I grew older and learned more about programming, I realized there was a vast landscape of challenges out there, and the opportunities for fun and learning were by no means limited to visual effects and interactive games. I discovered that writing a script parser in a high-level language or implementing a routing algorithm can be just as much fun as pipeline-optimizing rendering loops in assembly or programming an animation engine.
Flow
To me, an essential part of having fun while programming is being able to enter a state of flow. The more often I can work in flow, the more often I will have fun. The flow state can be very consuming, almost intoxicating, giving you a great feeling inside. I like to describe this feeling as the programming “high”. For me, the feeling is especially strong if I am learning new things and discovering new truths while working in flow. The point when I realize I have solved a problem or mastered a new skill can be very exciting and rewarding. When entering flow state as a group, e.g. in a meeting or brainstorming session, the effect can be even more powerful, often giving a major productivity boost. If you are able to enter flow on a regular basis and create challenges for yourself while working, it doesn’t really matter what kind of project you are on, you will have fun anyway.
Personal Accomplishment
Another variety of the programming “high” is the feeling I get when I have achieved something. This is of course not unique to programming, but is common to many areas of life and is a basic human emotion. For me, this “high” comes in two types. One is the feeling I get when I have completed or accomplished something, like solving a complex problem or managing to get a date with a cute girl. The feeling is usually immediate and comes right after completing the task or event that triggers the emotion. The strength of the feeling is often directly related to the complexity or difficulty of the challenge. I can only imagine, but I guess this is also something like the feeling (in a very strong form, I am sure) athletes have when they win an event or break a record. The feeling is not based on any external feedback, and will mostly be determined by what the accomplished challenge means to you, personally, rather than how impressive someone else may think it is.
External Feedback
The other form is sometimes more subtle, but can be even stronger and more overwhelming when it first happens. This is the feeling I get when someone appreciates my work or gives me a compliment. This feeling can come long after the initial accomplishment, and it may even come as a response to something you did not consider a big accomplishment in the first place. It is triggered by external feedback and can be very strong, filling your body with an overwhelming rush. Again, I can only imagine, but I think this is how musicians, actors and performance artists may feel when they are on stage. It is also interesting to note that if the receiver does not think the feedback is justified or honest (i.e. the task for which you are complimented was trivial to you), the feeling may not trigger at all. I think this is one of the reasons why this feeling can be so strong when it does trigger, because it is not directly caused by yourself, like the first variety. However, when the feeling is genuine, it can boost your motivation and self-esteem for days.
Whether it’s working in flow, the rewards of personal accomplishment or feedback on your work that makes a project fun, it’s up to you to find ways to trigger those emotions as often as possible.
New Challenges
In a recent interview, Steve McConnell was asked what had been his toughest challenge in the past. I don’t know McConnell personally, but having read some of his material, the answer did not surprise me:
I believe that if you’re not struggling, you’re not growing. And if you’re not growing you’re probably decaying or dying. So my life has been characterized more by “the challenge of the month” than by any one toughest challenge.
To me, this is as logical as Boolean algebra. If you constantly seek out new challenges, the recent ones will always be the most difficult you have encountered. If not, you are not evolving. And we should all be evolving, as professionals and as human beings. When you have challenges you have learning; when you have learning you have fun. If you have to think back a long time to find “the most fun you have ever had”, you are probably not having fun on a regular basis.
Please share your thoughts.
Best Practices for Version Control
July 28, 2008
Source code version control systems have been around for decades, but sometimes I suspect people are using them just because everybody else is, or because their manager told them to do so, or because it’s company policy. Although most people will agree that using version control is a prerequisite for any serious software project, many programmers only utilize a small percentage of the possibilities and advantages such systems can provide.
Here are some of my thoughts on what I consider best practices for using version control systems. In short, they can be described with seven basic sentences:
- Put everything under version control.
- Create sandbox home folders.
- Use a common project structure and naming convention.
- Commit often and in logical chunks.
- Write meaningful commit messages.
- Do all file operations in the version control system.
- Set up change notifications.
These recommendations are based on my own experience and preferences with using CVS and Subversion over the years, but the principles should easily transfer to other systems as well.
1. Put Everything Under Version Control
Any files associated with any project you are working on that may be of interest to anyone else—or even only to yourself—should be put under version control. Note that this is not limited to source code and files related to the implementation of a project, but also includes documents such as meeting minutes, specifications, architecture and design documents, artwork, configuration files and install scripts. When doing research for a project and gathering information from external resources, I also like to add those to the repository. Some examples are product brochures, protocol specifications, book references and links to company web sites. E-mail correspondence, scans of whiteboard notes or a concept drawing on a napkin are also useful to store for later reference.
Although some people think it’s silly to archive files that never change in a version control system, I find great value in having every document related to a project stored in the same place. It makes finding things so much easier—which can save you a lot of time when you don’t have to dig through hundreds of e-mails to locate that specification you got six months ago but didn’t have time to start implementing until now. Also, in the area of software development, there is no such thing as a document that never changes (or at least, there shouldn’t be, because you always remember to update your documentation, right?). If you are working on a project where many documents are produced by non-technical or non-programming people (i.e. people who don’t use version control), consider setting up automatic synchronization between project file shares and the version control repository.
When documentation is kept in a wiki, things might be a bit different. If the wiki itself keeps track of changes, which any decent wiki will do, there may be no need to store this data in a separate system. If your wiki is backed by a database, you may consider putting the database itself under version control, but some people will view this as redundant (after all, you have automated backups of all your databases, right?). I don’t have any preferences on how this should be solved, as long as all documents related to a project are stored on a central server with associated revision history.
For document formats that require processing before being readable, such as DocBook, LyX and LaTeX files, I prefer also committing them in a more readable form, like PDF or HTML. Some may argue this violates the DRY principle, but it also makes the documents easier to read for people who don’t have the required processing tools installed (or who are just lazy). This can be very useful when distributing documents by linking to them directly in the repository (e.g. via HTTP), but do take care to update both versions when making changes to such files or, even better, automate it.
2. Create Sandbox Home Folders
To encourage developers to use the version control system also for their own documents, (experimental) projects and tools, I recommend creating home folders in the repository, giving each user a sandbox to play with. In my experience, many useful tools have started out as simple scripts in a developer’s home folder and evolved into powerful utilities over time, so why not keep the revision history from day one? This also allows less experienced developers to experiment with branching, tagging and merging, hopefully encouraging them to use those features in “real” projects as well.
3. Use a Common Project Structure and Naming Convention
I recommend a consistent naming convention for all files and folders in a project. Preferably, an effort should be made to maintain the convention between projects throughout the repository. This makes it easier to locate files by partially guessing their name or location. For example, finding the source code for a project with many sub-folders will be much easier if the folder containing source code is named src rather than something totally arbitrary.
Using a common project structure can also be valuable for automated tools. For example, if all projects have a readme.txt or readme.html in their root folder, one can easily implement a script to generate a web page with a brief description of each project in the repository. If you are using an automated build system, such as Apache Maven, some of this structure may already be defined for you. Ideally, the project structure and naming policies should be described in your coding conventions or similar guidelines.
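A script of that kind could look roughly like the Python sketch below. It assumes a checked-out working copy in which each top-level folder is a project, and it uses the first line of readme.txt as the description. The function name build_project_index and the HTML layout are my own choices, not part of any particular tool.

```python
import html
from pathlib import Path


def build_project_index(repo_root):
    """Return an HTML list with the first readme.txt line of each project."""
    items = []
    for project in sorted(Path(repo_root).iterdir()):
        readme = project / "readme.txt"
        if not project.is_dir() or not readme.is_file():
            continue  # plain files and folders without a readme are skipped
        text = readme.read_text(encoding="utf-8").strip()
        summary = text.splitlines()[0] if text else "(empty readme)"
        items.append(
            f"<li><b>{html.escape(project.name)}</b>: {html.escape(summary)}</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Calling build_project_index on the root of a working copy returns a string that can be written to an HTML file or embedded in an existing page.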
4. Commit Often and in Logical Chunks
It’s better to have a broken build in your working repository than a working build on your broken hard drive.
I prefer to follow the basic work cycle described in the Subversion book. This means that you should always update your working copy before doing any changes to files. In general it’s preferred to commit changes in logical chunks. Changes that belong together should be committed together, changes that don’t shouldn’t. This can make the resulting revision history significantly more useful on systems with atomic commits when changes span multiple files.
If you are doing many changes to a project at the same time, split them up into logical parts and commit them in multiple sessions. This makes it much easier to track the history of individual changes, which will save you a lot of time when trying to find and fix bugs later on. For example, if you are implementing feature A, B and C and fixing bug 1, 2 and 3, that should result in a total of at least six commits, one for each feature and one for each bug. If you are working on a big feature or doing extensive refactoring, consider splitting your work up into even smaller parts, and make a commit after each part is completed. Also, when implementing independent changes to multiple logical modules, commit changes to each module separately, even if they are part of a bigger change.
Ideally, you should never leave your office with uncommitted changes on your hard drive. If you are working on projects where changes will affect other people, consider using a branch to implement your changes and merge them back into the trunk when you are done. When committing changes to libraries or projects that other projects—and thus, other people—depend on, make sure you don’t break their builds by committing code that won’t compile. However, having code that doesn’t compile is not an excuse to avoid committing. Use branches instead.
5. Write Meaningful Commit Messages
If you have nothing to say about what you are committing, you have nothing to commit.
Always write a comment when committing something to the repository. Your comment should be brief and to the point, describing what was changed and possibly why. If you made several changes, write one line or sentence about each part. If you find yourself writing a very long list of changes, consider splitting your commit into smaller parts, as described earlier. Prefixing your comments with identifiers like Fix or Add is a good way of indicating what type of change you made. It also makes it easier to filter the content later, either visually, by a human reader, or automatically, by a program.
If you fixed a specific bug or implemented a specific change request, I also recommend referencing the bug or issue number in the commit message. Some tools may process this information and generate a link to the corresponding page in a bug tracking system or automatically update the issue based on the commit.
Here are some examples of good commit messages:
Changed paragraph separation from indentation to vertical space.
...
Fix: Extra image removed.
Fix: CSS patched to give better results when embedded in javadoc.
Add: A javadoc {@link} tag in the lyx, just to show it's possible.
...
- Moved third party projects to ext folder.
- Added lib folder for binary library files.
...
Fix: Fixed bug #1938.
Add: Implemented change request #39381.
Many developers are sloppy about commenting their changes, and some may feel that commit messages are not needed. Either they consider the changes trivial, or they argue that you can just inspect the revision history to see what was changed. However, the revision history only shows what was actually changed, not what the programmer intended to do, or why the change was made. This can be even more problematic when people don’t do fine-grained commits, but rather submit a week’s worth of changes to multiple modules in one large pile. With a fine-grained revision history, comments can be useful to distinguish trivial from non-trivial changes in the repository. In my opinion, if the changes you made are not important enough to comment on, they probably are not worth committing either.
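The prefix and issue number conventions lend themselves to simple automated processing. As a minimal sketch in standard C++, assuming messages follow the "Fix:" / "Add:" and "#number" patterns shown in the examples above (the CommitInfo struct and the parseCommitLine function are hypothetical names made up for this illustration):

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record describing one line of a commit message.
struct CommitInfo {
    std::string type;         // "Fix", "Add", or empty when no prefix is present
    std::string text;         // the message without the prefix
    std::vector<int> issues;  // numbers referenced as "#1234"
};

// Splits an optional "Fix:" / "Add:" prefix from the line and collects
// every issue number written as '#' followed by digits.
CommitInfo parseCommitLine(const std::string &line) {
    static const std::regex prefix(R"((Fix|Add):\s*(.*))");
    static const std::regex issue(R"(#(\d+))");

    CommitInfo info;
    std::smatch m;
    if (std::regex_match(line, m, prefix)) {
        info.type = m[1].str();
        info.text = m[2].str();
    } else {
        info.text = line;
    }
    // Scan the remaining text for issue references.
    for (std::sregex_iterator it(info.text.begin(), info.text.end(), issue), end;
         it != end; ++it) {
        info.issues.push_back(std::stoi((*it)[1].str()));
    }
    return info;
}
```

A notification script or report generator could use something like this to group fixes separately from additions, or to build links into a bug tracker.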
6. Do All File Operations in the Version Control System
Whenever you need to copy, delete, move or rename files or folders in the repository, do so using the corresponding file operations in the version control system.1 If this is done only on the local file system, the history of those changes will be lost forever. I consider structural changes just as important as changes to the files themselves, so there is no reason not to let the version control system keep track of them. Also, when people know all their changes can be undone, the threshold for doing radical restructuring and major refactoring will be lowered, which can have a significant impact on preventing the build-up of technical debt.
7. Set Up Change Notifications
To monitor changes in the repository as they happen, I recommend setting up change notifications to send out an e-mail or update an RSS feed whenever a commit is made. Some systems support notifications directly via event hooks—sometimes with default implementations provided—while others may require external cron jobs, daemons or custom scripts to provide this feature.
My recommendation is that all developers subscribe to change notifications, since they can have many advantages. Obviously, they are useful if you want to see what changes are being made to projects you are working on or have an interest in (e.g. a library your project is using), but they might also encourage (or scare) people into writing more useful commit messages, since they know someone might actually be reading them.
Typically the notifications will also contain extracts of the files that were changed, making them useful for light-weight code reviews. Programmers who monitor source code changes can keep an eye out for code smells or violations of the coding conventions, and if you are lucky, you might even learn something by reading other people’s code.
Here’s an example of what a commit notification e-mail can look like:
From: svn-commit@company.com
Sent: Wednesday, March 05, 2008 11:23 AM
To: svn-commit@company.com
Subject: [SVN:CompanyRepository] r6523 - trunk/documents/templates
Author: anders
Date: 2008-03-05 11:23:08 +0100 (Wed, 05 Mar 2008)
New Revision: 6523
Modified:
trunk/documents/templates/document.lyx
Log:
Changed paragraph separation from indentation to vertical space.
Modified: trunk/documents/templates/document.lyx
===================================================================
--- trunk/documents/templates/document.lyx 2008-03-05 09:22:49 UTC (rev 6522)
+++ trunk/documents/templates/document.lyx 2008-03-05 10:23:08 UTC (rev 6523)
@@ -32,7 +32,7 @@
\footskip 1cm
\secnumdepth 3
\tocdepth 3
-\paragraph_separation indent
+\paragraph_separation skip
\defskip medskip
\quotes_language english
\papercolumns 1
If you are working on a large project or there are many active projects in your repository, you may find it useful to create separate notifications for each module or project. If notifications are sent via e-mail, you can also configure the subject field to indicate which module or repository the notification belongs to, making them possible to process with standard e-mail filtering rules.
Conclusion
If you are already doing all of the above, great for you! If not, adding even a few of these to your work habits can make a difference. Of course, not everyone is in a position to change the structure of their project or the repository configuration, but any programmer can make their life easier with logically grouped commits and meaningful commit messages. Consider giving it a try; you might like it.
Please share your thoughts.
Notes:
How Important Is Your Keyboard?
July 20, 2008
As a programmer I spend a significant amount of my time punching keys on a keyboard while writing code, and even documentation. Over the years I have also accumulated a wide variety of shortcut key combinations that I use for everyday tasks. Because of this, a keyboard’s layout and physical design are very important to me. In fact, I’m so dependent on a decent keyboard that I bring my own keyboard to work.
Although programming and writing in general are possible with almost any input device, I prefer my keyboards to have certain qualities:
- It should have the standard 104/105 key layout. If the keyboard provides additional keys, they should not be located in places where any of the standard keys normally are. It is especially annoying when manufacturers ignore this simple rule, at least for people like me, who use keyboard shortcuts from muscle memory without conscious effort. For example, I once had a keyboard with a “power off” button in the Pause/Break location. After accidentally shutting down Windows several times when trying to access the System Properties dialog (Win+Break), I removed the offending keys permanently.
- The keys should be somewhat durable. I don’t require the stamina of a Model M, but the keys shouldn’t fall off during the first week either.
- When the keys do fall off, it should be possible to put them back on. This is useful for fixing stuck keys or cleaning the keyboard.
- The keys should be fast and agile. When I press a key it should respond immediately, and when released it should pop right back into place. I’m not a fan of the “machine gun” sound of buckling springs, but the keys must feel “real”.
- It must be able to keep up with my typing speed. I find this especially annoying with many RF-based wireless keyboards, as they don’t respond fast enough. As a minimum it should be able to cope with the typematic rate settings of shortest repeat delay and fastest repeat rate.
- It should be as slim and minimalistic as possible. I’m not fond of extra stuff like embedded palm rests and other “ergonomic improvements”. I want the keyboard to take up as little space on my desk as possible.
Of course, these are only my personal preferences. What works for me may not work for you, but if you are looking for a decent keyboard for programming or extensive writing, here are some of my recommendations:
- KeyTronic KT2001
- Das Keyboard (review on Slashdot)
- Unicomp Customizer (based on the old IBM Model M) (review)
- Logitech Deluxe 250 and Logitech Internet 350
- Dell Multimedia Keyboard (also available as wireless Bluetooth kit with mouse and Bluetooth dongle)
- Logitech UltraX Premium
So, how important is your keyboard? Are you comfortable typing on anything from a Happy Hacking to a Microsoft Natural, or do you have some special preferences?
A Tribute to Snake
June 30, 2008
In the history of computer games, I think Snake might be one of the most cloned games ever. At least, it has always been my favorite game to implement when learning a new programming language or trying a new platform. To me, Snake is the “Hello, World” of game programming.
Even though other classics like Pong, Space Invaders and Tetris are also fun to program, I always preferred Snake for its simplicity. The game is so simple, it can be implemented in less than 50 bytes on a PC, yet even the most minimalistic version still has good entertainment value. Also, considering that you can get it on anything from calculators and mobile phones to advanced gaming consoles, I guess I can’t be the only fan.
I think one of the main reasons I like Snake as a programming exercise is because it’s an easy way of applying known algorithms and concepts in a new environment. I already know where I’m going, I just have to find out how to get there.
Most of my Snake implementations require a minimum of capabilities from the programming language or operating environment, like:
- Drawing characters or pixels at given coordinates on screen1.
- Reading characters or pixels at given coordinates on screen (not required, see below).
- Accepting input from keyboard.
- Invoking a real-time rendering loop or frequent events/callbacks.
- Obtaining (pseudo)random numbers (for placing food).
If I’m able to implement Snake in a new programming language, at least I know the language is not completely useless. I may not know if it’s Turing complete, but at least I know it’s “Snake complete”.
Limitations
Another reason I like to program Snake in different languages is the fun of the challenge when there are limitations in your environment. If you are a seasoned programmer, doing Snake in mainstream languages like Assembler, C/C++, Java, Visual Basic, C#, Python or Perl may be a trivial task. However, if you try setting some boundaries for your program, things might change. For example, Snake is an excellent candidate for size optimizing, both for smallest binary and smallest source code. Or how about programming a Snake in Excel, using the cells for “pixels”? Personally, I find the “odd ones” the most fun to write.
For example, the 4DOS batch processing language doesn’t allow reading characters from arbitrary locations on the screen, and the only storage space provided by the language is environment variables (and files, but I didn’t want my program to have “external dependencies”). However, the game only operates in 80×50 resolution, so the screen is small. I was therefore able to encode the entire screen into strings stored in environment variables and update them whenever the snake moved. This allowed me to check for collisions in much the same way as reading a character from the screen would2.
snake.bat, a snake game implemented in the 4DOS batch file processing language.
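The idea of keeping a copy of the screen in strings is not specific to 4DOS. As a rough sketch in standard C++ (this is not the batch file itself; the ScreenBuffer class, its method names and the characters used for food and walls are all assumptions made for the illustration), each row of the screen is stored as one string, and collision checks read from that copy instead of from the screen:

```cpp
#include <string>
#include <vector>

// Hypothetical text-mode screen kept in memory as one string per row,
// mirroring the idea of storing each screen row in a variable.
class ScreenBuffer {
public:
    ScreenBuffer(int width, int height)
        : width_(width), height_(height), rows_(height, std::string(width, ' ')) {}

    // True when (x, y) lies inside the screen.
    bool inside(int x, int y) const {
        return x >= 0 && x < width_ && y >= 0 && y < height_;
    }

    // Records a character; a real game would also draw it on screen here.
    void put(int x, int y, char c) {
        if (inside(x, y)) rows_[y][x] = c;
    }

    // "Reads" a character back from the buffer instead of from the screen.
    // Anything outside the screen is reported as a wall ('#').
    char at(int x, int y) const {
        return inside(x, y) ? rows_[y][x] : '#';
    }

    // The snake collides with anything that is not empty space or food.
    bool collides(int x, int y, char food = '*') const {
        char c = at(x, y);
        return c != ' ' && c != food;
    }

private:
    int width_;
    int height_;
    std::vector<std::string> rows_;
};
```

Every time the snake moves, the game would call put() for the new head position (and for the erased tail), so the buffer always matches what is on screen.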
I also implemented Snake in the mIRC scripting language3, but this provides functionality to read pixels from the screen, so the screen array was not required.
snake.ini, a snake game implemented in the mIRC scripting language.
Extensions
If you want to go beyond the basics, you can easily extend the game with more features as your experience level with the language, toolkit or platform increases. For example, you may add things like:
- computer opponent(s) with AI.
- two-player or multiplayer mode (with networking).
- sound effects.
- improved graphics and visual effects.
- barriers and other obstacles, like walls or enemies (extended collision detection).
- guns and ammunition.
- support for peripheral input devices.
So, do you have any game, algorithm or concept you like to implement to learn a new language, toolkit, platform or system? Is Snake still a usable “Hello, World” exercise for people learning programming today, or are there better ways?
Please share your thoughts on this. All comments are welcome.
Notes:
- OK, I have to admit, I did implement a Snake in PHP and JavaScript that would redraw the entire screen (HTML page) on the server side, but you get the idea.
- The tail array could also have been implemented in this way to remove the current length limit.
- In case you are wondering, yes, I did have a lot of spare time when I wrote these. Both the 4DOS and mIRC snakes were written about ten years ago, when I was still in high school.
Configuration, preferences, settings, options, properties: whatever you call them, all but the simplest applications allow the user to customize some of their functionality (and often appearance). But how do you implement this? Are there best practices for programming application preferences in a clean, easily maintainable and well-structured way?
I recently began working on my latest hobby project, a desktop application for viewing photographs using (hardware-accelerated) 3D rendering, implemented in Qt and OpenGL. It’s been a while since I last used Qt, so I started small with a basic skeleton application, adding a menu bar, a toolbar and a status bar. I also implemented menu items to toggle the toolbar, the status bar and a full screen viewing mode.
The application doesn’t do anything interesting yet, but even with this simple logic there are already several variables that could (and, in my opinion, should) be stored between sessions:
- Show/hide the toolbar?
- Show/hide the status bar?
- Show application in full screen, normal or maximized mode?
- Size and position of main window.
The Configuration Object
Most application frameworks (and some programming languages) provide utility classes for reading and writing persistent variables from and to configuration files or the system registry, like QSettings, wxConfig and java.util.Properties. The simplest way of providing persistence for application settings is to use these classes directly whenever needed. However, using them directly will often lead to duplication of code and possibly troublesome maintenance if access is spread across many modules and classes. Because of this—and other reasons I will mention later—I prefer to collect all variables and the code to read/write them in a separate configuration object.
For my simple Qt application, the code might look something like this:
#include <QSettings>
#include <QString>
class Config {
public:
    bool maximized;
    bool fullScreen;
    bool showToolBar;
    bool showStatusBar;
    QString windowPos;
    // Load all settings, falling back to the defaults given here.
    void read() {
        QSettings settings("phex3d", "phex3d");
        maximized     = settings.value("maximized"     , false).toBool();
        fullScreen    = settings.value("full_screen"   , false).toBool();
        showToolBar   = settings.value("show_toolbar"  , false).toBool();
        showStatusBar = settings.value("show_statusbar", true ).toBool();
        windowPos     = settings.value("window_pos"    , ""   ).toString();
    }
    // Store all settings and flush them to permanent storage.
    void write() {
        QSettings settings("phex3d", "phex3d");
        settings.setValue("maximized"     , maximized);
        settings.setValue("full_screen"   , fullScreen);
        settings.setValue("show_toolbar"  , showToolBar);
        settings.setValue("show_statusbar", showStatusBar);
        settings.setValue("window_pos"    , windowPos);
        settings.sync();
    }
};
By making an instance of the configuration object available as a global variable (or a singleton, if that makes you sleep better), I can now easily reference persistent settings from anywhere in the application. For example, the event handler for toggling the toolbar could be something like this:
void Window::toggleToolbar(void)
{
    bool visible = toolbar->isVisible();
    config.showToolBar = !visible;
    if (visible)
        toolbar->hide();
    else
        toolbar->show();
}
As long as I remember to call Config::read() on startup and Config::write() on exit, the settings will be saved and restored without any extra work needed.
Centralized Information
Although simple, the above solution will get somewhat messy if many variables are involved. For every new variable, extra code must be added to read() and write(). If we also think ahead a little, and take into account that these options will need to be exposed in a configuration dialog, allowing the user to change their values, we can recognize the need to associate some more information with each of them:
- The name of the option, typically used to identify it in a configuration file or the registry.
- A short description of what the option means and what part of the application is affected by changing it. This text will typically be used as a label for the check box, edit field or other widget used to change the variable in the configuration dialog.
- A more elaborate help text, suitable for use as a tool tip or in a separate help dialog.
- The default value, useful if you want to allow the user to reset something to “factory defaults”.
- The data type of the option.
You may wonder why the variable names and data types are relevant in this context—they could stay hardcoded, like before—but if you consider more advanced configuration interfaces, like the Firefox about:config feature, they can be very useful.
To collect all information about an option in one place, you might define an Option class looking something like this:
#include <QString>
// Describes one option and points at the variable that holds its value.
class Option {
public:
    enum OptionType { INT, STRING };
private:
    QString name;
    QString desc;
    QString help;
    int defInt;
    QString defString;
    OptionType type;
    void *value; // points at an int or a QString, depending on type
public:
    Option(int *var, const QString &name, int def, const QString &desc = "", const QString &help = "") {
        this->type = INT;
        this->value = var;
        this->name = name;
        this->defInt = def;
        this->desc = desc;
        this->help = help;
    }
    Option(QString *var, const QString &name, const QString &def, const QString &desc = "", const QString &help = "") {
        this->type = STRING;
        this->value = var;
        this->name = name;
        this->defInt = 0; // unused for string options, but never left uninitialized
        this->defString = def;
        this->desc = desc;
        this->help = help;
    }
    const QString &getName() const { return name; }
    const QString &getDescription() const { return desc; }
    const QString &getHelpText() const { return help; }
    OptionType getType(void) const { return type; }
    // The typed accessors are only valid when they match getType().
    int getInt(void) const { return *static_cast<int *>(value); }
    int getDefaultInt(void) const { return defInt; }
    const QString &getString(void) const { return *static_cast<QString *>(value); }
    const QString &getDefaultString(void) const { return defString; }
    void setInt(int value) { *static_cast<int *>(this->value) = value; }
    void setString(const QString &value) { *static_cast<QString *>(this->value) = value; }
};
If you are wondering why the option value is stored as a pointer and not a local variable inside the Option class, I did this because I wanted them to reference the corresponding class variables in the configuration object, thus allowing me to continue accessing them directly elsewhere in my application. It’s kind of a hack, I know, but it works. If you don’t like it, you can always use the get*() functions instead. Also, if you know a better solution, or how to solve the type info situation with templates, please share.
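One possible direction for the template question, sketched here in standard C++ rather than Qt so that it stands on its own, is to make the option class itself a template. This is only an illustration under assumptions: the TypedOption name and its member functions are hypothetical and are not part of the application described here.

```cpp
#include <string>
#include <utility>

// Hypothetical typed replacement for the void pointer: the compiler
// tracks the value type, so no casts or type tags are needed.
template <typename T>
class TypedOption {
public:
    TypedOption(T *var, std::string name, T def)
        : var_(var), name_(std::move(name)), def_(std::move(def)) {}

    const std::string &getName() const { return name_; }
    const T &get() const { return *var_; }
    const T &getDefault() const { return def_; }
    void set(const T &value) { *var_ = value; }
    void reset() { *var_ = def_; }  // back to the "factory default"

private:
    T *var_;            // points at the variable owned by the configuration object
    std::string name_;
    T def_;
};
```

The trade-off is that TypedOption<int> and TypedOption<std::string> are unrelated types, so keeping them in a single list would still require a common base class or a similar mechanism.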
Now that we have all the information we need about each option wrapped in a class, we can add a list of Option instances in our Config class to simplify and generalize the read() and write() implementations. The new Config class might look something like this:
#include <QList>
#include <QSettings>
#include <QString>
#include <QVariant>
class Config {
public:
    int maximized;
    int fullScreen;
    int showToolBar;
    int showStatusBar;
    QString windowPos;
    Option *_maximized;
    Option *_fullScreen;
    Option *_showToolBar;
    Option *_showStatusBar;
    Option *_windowPos;
    QList<Option *> option_list;
    Config() {
        settings = new QSettings("phex3d", "phex3d");
        // Same names and defaults as in the first version of the class.
        _maximized     = addOption<int>    ("maximized"     , &maximized    , false);
        _fullScreen    = addOption<int>    ("full_screen"   , &fullScreen   , false);
        _showToolBar   = addOption<int>    ("show_toolbar"  , &showToolBar  , false);
        _showStatusBar = addOption<int>    ("show_statusbar", &showStatusBar, true );
        _windowPos     = addOption<QString>("window_pos"    , &windowPos    , ""   );
    }
    ~Config() {
        for (QList<Option *>::Iterator i = option_list.begin(); i != option_list.end(); ++i)
            delete *i;
        delete settings;
    }
    // Creates an option bound to var and registers it in the list.
    template <typename T>
    Option *addOption(const QString &name, T *var, T value, const QString &desc = "", const QString &help = "") {
        Option *option = new Option(var, name, value, desc, help);
        option_list.append(option);
        return option;
    }
    // Load every registered option, using its default when nothing is stored.
    void read() {
        for (QList<Option *>::Iterator i = option_list.begin(); i != option_list.end(); ++i) {
            Option *option = *i;
            if (option->getType() == Option::INT) {
                QVariant value = settings->value(option->getName(), option->getDefaultInt());
                option->setInt(value.toInt());
            }
            else {
                QVariant value = settings->value(option->getName(), option->getDefaultString());
                option->setString(value.toString());
            }
        }
    }
    // Store every registered option and flush to permanent storage.
    void write() {
        for (QList<Option *>::Iterator i = option_list.begin(); i != option_list.end(); ++i) {
            Option *option = *i;
            if (option->getType() == Option::INT)
                settings->setValue(option->getName(), option->getInt());
            else
                settings->setValue(option->getName(), option->getString());
        }
        settings->sync();
    }
private:
    QSettings *settings;
};
Outstanding Issues
The above solution works fine for my current needs in the application, but as we all know, needs change over time. I can already think of several outstanding issues that are not covered by this design, which might be needed in the future.
It would be useful to provide a list of valid values for each option. For example, if an integer option can only be between 1 and 100, this information should also be stored in the Option object. The same with text to display in a drop-down list or for auto-completing commonly used values as they are being entered in an edit field. If advanced validation or many different validation algorithms are used, this might be better solved by adding a reference to a validation object responsible for validating values for a given option.
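As a rough sketch of the validation object idea, here is one way it might look in standard C++ (not Qt, so the example stands alone). The ValidatedIntOption class, the inRange() helper and the "zoom" option used in the usage note are hypothetical and only serve as illustration:

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical integer option carrying its own validation rule.
class ValidatedIntOption {
public:
    using Validator = std::function<bool(int)>;

    ValidatedIntOption(std::string name, int def, Validator valid)
        : name_(std::move(name)), def_(def), value_(def), valid_(std::move(valid)) {}

    // Accepts the value only if the validator approves it;
    // an option without a validator accepts everything.
    bool set(int v) {
        if (valid_ && !valid_(v)) return false;
        value_ = v;
        return true;
    }
    int get() const { return value_; }
    void reset() { value_ = def_; }
    const std::string &name() const { return name_; }

private:
    std::string name_;
    int def_;
    int value_;
    Validator valid_;
};

// Convenience factory for the "between min and max" case.
inline ValidatedIntOption::Validator inRange(int min, int max) {
    return [min, max](int v) { return v >= min && v <= max; };
}
```

With this, an option declared as ValidatedIntOption zoom("zoom", 50, inRange(1, 100)) would reject set(101) and keep its previous value. A configuration dialog could call the same validator before accepting user input.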
As time passes, code is typically rewritten and programs restructured. This can eventually lead to the need, or simply the desire, to also rename options. For example, if the option window_pos describes the main window position, you may want to rename it to main.window_pos when more windows are added to the application. The issue can also arise when option names are automatically generated from class names (which is quite common for Java properties). When the name of the class changes, so will the option name. For this reason, it could be useful if the Option class was extended to provide a list of name aliases. The read() function could then be updated to use the value from an alias if found, but write() would only save it under the new name.
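The alias lookup could be sketched roughly as follows. To keep the example self-contained it uses standard C++ with a std::map standing in for the settings file or registry; the Store alias, the AliasedOption struct and the readOption() and writeOption() functions are hypothetical names, not part of Qt or of the Config class above.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for the settings file or registry: plain name/value pairs.
using Store = std::map<std::string, std::string>;

// Hypothetical string option with a list of older names.
struct AliasedOption {
    std::string name;                  // current name
    std::vector<std::string> aliases;  // previous names, in lookup order
    std::string def;                   // default value
    std::string value;                 // current value
};

// Looks under the current name first, then under each alias,
// and falls back to the default when nothing is stored.
void readOption(const Store &store, AliasedOption &opt) {
    auto it = store.find(opt.name);
    for (const std::string &alias : opt.aliases) {
        if (it != store.end()) break;
        it = store.find(alias);
    }
    opt.value = (it != store.end()) ? it->second : opt.def;
}

// Saves under the current name only.
void writeOption(Store &store, const AliasedOption &opt) {
    store[opt.name] = opt.value;
}
```

After one read and write cycle the value exists under the new name, so the alias is only consulted the first time an old configuration is loaded. Whether stale entries under old names should also be deleted is a separate decision that this sketch leaves open.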
Another disadvantage of my simple design is that the Config class must know about all the options in the application, and therefore could get a tighter coupling with the various application modules than you might prefer. If your application supports custom extensions via plug-ins, it might be useful to implement functionality to add options to the Config object at run-time. For options that are added this way it will also be useful to provide a lookup function for retrieving the Option object based on name, for example by storing the objects in a map keyed by the option name as it was provided when the option was added.
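A minimal sketch of such a run-time registry with lookup by name, again in standard C++ so it can be compiled on its own. The NamedOption struct, the OptionRegistry class and the "plugin.cache_dir" option mentioned in the usage note are hypothetical; a Qt version would probably use QMap or QHash instead of std::map.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified option: a name, a default and a current value.
struct NamedOption {
    std::string name;
    std::string def;
    std::string value;
};

// Hypothetical registry that modules or plug-ins can add options to at run-time.
class OptionRegistry {
public:
    // Registers an option and returns it. If an option with the same
    // name already exists, that one is returned unchanged.
    NamedOption &add(const std::string &name, const std::string &def) {
        auto it = options_.find(name);
        if (it == options_.end())
            it = options_.emplace(name, NamedOption{name, def, def}).first;
        return it->second;
    }

    // Returns the option registered under the name, or nullptr if there is none.
    NamedOption *find(const std::string &name) {
        auto it = options_.find(name);
        return it == options_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, NamedOption> options_;  // keyed by option name
};
```

A plug-in could call add("plugin.cache_dir", "/tmp") when it is loaded, and a generic configuration dialog could later retrieve the same object with find("plugin.cache_dir") without knowing anything else about the plug-in.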
I am likely to discover even more issues once I start implementing the configuration dialogs, but at least now I have an application that remembers where I left the window.
Please share your thoughts on this. Any feedback is appreciated.
Books Every Programmer Should Read
May 11, 2008
After reading Jeff Atwood’s comments on the fact(?) that programmers don’t read books I was a bit upset. Not because I disagree with him; my experience also shows that most programmers, at least of the ones I have met and worked with in the past, neither read nor own books about their profession. I even remember reading the mentioned paragraph in Code Complete years ago and laughing at just how well I recognized this phenomenon.
However, I personally think reading books is one of the best ways to learn new stuff and gain a better understanding of the world, so I feel a bit sad that not more people are doing it. For this reason, I now present you with my must-have list of books for programmers:
Code Complete: A Practical Handbook of Software Construction
by Steve McConnell
If you are only ever going to read a single book about computer programming, this should be the one.
This book opened my eyes to a whole new world of computer literature. Until I found this well of wisdom, I had only read technology-specific books such as the GW-BASIC User’s Guide, Mastering Turbo Assembler and the Beginner’s Guide to You-Name-It. Finally I had found a book that actually gave me practical advice on how to become a better programmer. I was surprised to discover that there were in fact people out there who had the same thoughts about programming as me, and even better, they were writing books about it!
I have only read the first edition of this book (the 1993 one), but the second edition has been updated with example code in Java and probably more neat stuff, so I assume it’s even better now.
The Pragmatic Programmer: From Journeyman to Master
by Andrew Hunt and David Thomas
If you are only ever going to read two books about computer programming, this should be the second one.
This book is a great collection of software development techniques and practical programming advice. Simply put, if you read this book and apply the techniques and principles described here, you will most likely write better programs and have more fun doing so.
Peopleware: Productive Projects and Teams
by Tom DeMarco and Timothy Lister
Software managers don’t manage software. Project managers don’t manage projects. They all manage people!
In order to understand how to write good software you must first understand the people who write it. This book will help you do just that. If you are anything like me, you will probably see yourself in much of what is described here, and even if you don’t, you are likely to learn some new things about programmers, how they act and how they think.
The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd Edition)
by Frederick P. Brooks
There is no silver bullet.
This book is so timeless it’s scary. Even after more than three decades, the topics covered by these essays are just as relevant today as they were in 1975. The book proves, once again, that people never change. We still make the same mistakes over and over again.
Again, I have only read the first anniversary edition, so I’m not sure what the second edition of the anniversary edition (does that even make sense?) is all about, but I’m sure it can’t be anything bad. Go buy it and find out for yourself!
The Psychology of Computer Programming: Silver Anniversary Edition
by Gerald M. Weinberg
Programming is a human activity. Humans are strange creatures who behave in odd ways sometimes. Programmers possibly even more so than others.
This book explains how programmers think and how they act as human beings. Again, the argument that people really don’t change applies, so the book is just as relevant today as it was in the 1970s. It’s fascinating to read how the problems of punch card programmers on ancient mainframes were exactly the same as the challenges facing software developers today. If nothing else, the book will give you an insight into how programming was done in 1971.
The Design of Everyday Things
by Donald A. Norman
If you are ever going to design anything, ever, you should read this book.
And even if you are not, you should read it for a good laugh. It’s hilarious, although also a bit sad, to see how many flaws in the design of everyday objects that we interact with in our daily lives are almost directly transferable to software development and programming. The mistakes are the same, only the implementations differ. Instead of a buggy program with an inherently complicated user interface you have a door that won’t open or a watch you can’t figure out how to use.


