Tag Archives: programming

Elementary Software Design

There is a process to follow when designing or redesigning software systems, even if that process isn’t identical for every system.  Some things simply need to happen to solidify the product and build some stability into the software and into the customer’s confidence.  This is a general overview, if for no other purpose than to remind myself of it.

1. You must have a problem.  If not, what the hell are you doing?  Software tells computers to solve problems in most cases.  In all others, the problem is that the programmer doesn’t know something and is moving through a project to learn an aspect of development.  Those learning projects are the hardest to get through sometimes.  Anyway, either you or your customer has a problem that can be solved with computers, and that’s why you’re here.

2. Gather requirements.  This should always be the next step, and it should always be very early – before you pick a platform, language, development team, or flavor of Jolt, gather the requirements from the customer and start hacking at the root of the problem.  This is the most important step, because it is from here you will begin to draft your agreement with the customer about the scope of the project.  Requirements gathering should be done as diligently as possible to ensure you and your team know the scope or constraints of the solution you’re developing.

3. The agreement.  This is where you hammer out exactly what problem you’ll solve and every aspect of the solution.  This statement of scope will define the limits of the product and the customer should be told that they cannot add anything once an agreement is reached.  Imagine getting to the end of a nine-month project and the customer changes the requirements…  It will waste time and money, and WAY more than you might think at first.  You’ll never really know what it’s like to have an insatiable desire to strangle someone until you have a customer trying to chisel something else into the project during testing.  Make sure everything is in the design to begin with, and work hard until it’s all agreed upon.  It’s much cheaper to work something in at the beginning than to wedge it in at the end.  Just ask Microsoft.

4. The design.  Once the requirements and scope have been determined and agreed upon, it’s time to begin the specification for the software system.  This includes a functional design specification and a software design specification.  The two are closely related, but still distinct.  A functional design specification is a document describing what functions the system performs at a high level.  Since you’re building the software from the ground up, a system that performs these functions likely doesn’t exist.  If it did, we might buy that one instead of building this one.  So – given its uniqueness, we need to define its functions and general workflow.  A software design specification takes those functions defined in the functional spec and describes them in a way the code monkeys can understand.  The software design specification takes into account the limitations and nuances of programming languages and how computers work and applies that to the problem at hand to perform the function necessary in each part of the system.  This is the nitty-gritty spec that should describe everything down to the last detail so that when it’s handed over to a programming team, there will be no questions.

5. Code.  But there will always be questions, so be ready.  During the implementation phase (the part where code is written and actually run on a computer), if the system is large enough there will be many iterations and back-and-forths about minute details ranging from data types to interface colors.  Again, most of the details should be in the software design specification but, well, we’re human.  It always happens.  Only when the machines start writing perfect code can we relax about this phase.

6. Test.  Often the most painful phase of a software project is the testing phase.  If it’s not painful, you’re not doing it right.  In my opinion, most of the time testing teams are needed, not just a single software testing engineer.  You have to throw absolutely everything you have at the code to see how bulletproof it is.  Testing tools, load generators, keyboard monkeys, children, everything.  Also, continual testing is necessary during ongoing maintenance to make sure newly discovered vulnerabilities are being plugged and tested against in each iteration of deployment.  In short, leave no stone unturned and never let a place get dusty.  Be very diligent at all times.  Software testers are not temps.

7. Document.  Every large team has turnover, no matter what the rate is.  People quit, die, get offers from their dream company, have mental breakdowns, etc.  In these cases where you’ve got a new person replacing a project veteran, documentation is essential to reducing wasted time.  First, the functional and software design specs should be up-to-date (and also interesting enough not to put the reader to sleep) and ready for anyone to review.  Each developer team member should also keep a work log of some sort to keep track of those small daily decisions he or she makes during the coding phase; the next guy may not know why you did what you did.

In conclusion, I can’t say that these perspectives on the development process are always necessary with every software development project, but some aspect of these steps will come into play in everything that isn’t a two-line script.  Whether anything is documented during the conception or planning stages is irrelevant; those things happen whether you write them down or not.  Each project has a beginning, a development, and (it is hoped) a use afterwards in the real world.

No matter what development methodology is used, design, testing, and documentation are all of equal importance.  If you’re managing your software project by the seat of your pants, get a grip on things and start documenting.  Stop lying to yourself about that timeline and conjure up some real figures.

I’ll end with some books I recommend:

Joel Spolsky, Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work with Them in Some Capacity

Steve McConnell, Rapid Development: Taming Wild Software Schedules
and Code Complete: A Practical Handbook of Software Construction, Second Edition

Andrew Hunt and David Thomas, The Pragmatic Programmer: From Journeyman to Master


What I Could Do

I’ve written quite often about how I don’t like where I live and that I should get out and go to a big city where people are more diverse and there are more opportunities.  I have an apartment in Atlanta now, and while I really hate admitting this, I was wrong.

There is an overwhelming number of assholes here.  I hadn’t thought about that.  I’m an asshole, but usually only in my head.  Too many of the people here are assholes out loud.  I am a social person and I can’t help that.  But I prefer to be a recluse when everyone around me can’t stop yapping about themselves.  The competition here is so fierce that nothing gets done and people get hung out to dry instead of properly informed and trained.

Today I’m reading my Eclipse IDE: Pocket Guide in preparation for beginning Android application development.  My plan is to have such a grasp on that platform that I could work for anyone, from anywhere – including my house in the country.  I am also working through Hello, Android and then off to two other books on the platform.  The last time I started working on Android I began coding on the first day of study and got locked into an all-night hacking session trying to work out my project while researching the SDK.  Not anymore.  I’m giving myself the fundamental education so I don’t have to do so much hacking and have so many problems at once.  If you’re interested in this progress, keep a watch out at blog.twoleg.com.  I expect to release a simple app for free just to get the hang of it.  It probably won’t be anything groundbreaking – probably an enhanced flashlight application or something.  Not a whole lot of design considerations in that realm.  I will try to post progress at least weekly.


Some Books

I got my courage up Saturday and ordered the books from O’Reilly. This press has long been highly regarded by technologists, whether they are programmers, IT professionals, or just geeks. Go ahead – ask a geek if he/she has a camel book, and chances are they’ll know what you’re talking about (and it will be within reach). Don’t tell them what it is if they don’t know.

I’m posting this to chronicle my efforts to build a web crawler and eventually a search engine. I expect to make further posts about how this project develops, and perhaps what I’ve found in these books that helped.

I have ordered three books. I went there for one, but there’s always a deal to get three for the price of two, plus free shipping. And I can always find another book to get. So:

Perl & LWP. This one I’ve borrowed before, and it opened my eyes to the possibilities of automated web surfing using Perl. I built a small script one time that looked up my SMTP server’s IP at spamcop, then e-mailed me if my mail server was ever blacklisted. It was fun and quite easy, but since I can’t find that script right now I’ll have to post it later.
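Until I dig up the original Perl, here’s a rough reconstruction of the idea in Python. The trick is standard DNS-blacklist practice: reverse the IP’s octets and look the result up under the blacklist’s zone; if the name resolves, you’re listed. The zone and the IP address below are just placeholders, not my real setup.

```python
import socket

def reverse_octets(ip):
    """1.2.3.4 -> 4.3.2.1 (DNSBLs key their records this way)."""
    return ".".join(reversed(ip.split(".")))

def is_blacklisted(ip, zone="bl.spamcop.net"):
    """An IP is listed if <reversed-ip>.<zone> resolves in DNS."""
    try:
        socket.gethostbyname(f"{reverse_octets(ip)}.{zone}")
        return True          # the record exists: listed
    except socket.gaierror:
        return False         # no such record (or no DNS): not listed

if __name__ == "__main__":
    if is_blacklisted("203.0.113.25"):   # placeholder mail-server address
        print("Mail server is blacklisted -- send the alert e-mail")
```

Cron that to run hourly, bolt an e-mail call onto the listed branch, and you’ve got the whole script.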

Spidering Hacks. I ordered this one for obvious reasons. An excerpt from this book is where I found that little bit about needing my spider registered. I expect to learn a lot and become very frustrated with what I find here.

Perl Cookbook. This was the third choice because I needed three. Also because it’s $50 and I could use the discount. There apparently is a series of “cookbooks” that have really cool stuff (recipes) in them. There is also the PHP Cookbook, the C# 3.0 Cookbook, and more. I expect to find shortcuts and things I’d never thought of in this book.

Light Reading

I’m taking a class right now on software requirements engineering (does one actually engineer the requirements, or did they just want to make this class sound hard?) and I came across something I might use with the web crawler project.

In the chapter about “The Software Process,” which talks about the processes necessary for an individual or team to succeed at building a quality piece of software or system, I came across the Personal Software Process, or PSP. The book simply states that every developer has a process, whether anyone can see it or not. Either way, there is a proper way to go about producing software at a personal level, and here is the gist (Pressman, 2005, p. 37):

Planning. This activity isolates requirements and, based on these, develops both size and resource estimates. In addition, a defect estimate (the number of defects projected for the work) is made. All metrics are recorded on worksheets or templates. Finally, development tasks are identified and a project schedule is created.
High-level design. External specifications for each component to be constructed are developed and a component design is created. Prototypes are built when uncertainty exists. All issues are recorded and tracked.
High-level design review. Formal verification methods… are applied to uncover errors in the design. Metrics are maintained for all important tasks and work results.
Development. The component level design is refined and reviewed. Code is generated, reviewed, compiled, and tested. Metrics are maintained for all important tasks and work results.
Postmortem. Using the measures and metrics collected (a substantial amount of data that should be analyzed statistically), the effectiveness of the process is determined. Measures and metrics should provide guidance for modifying the process to improve its effectiveness.

I’m not sure if what I’m doing will fit into this personal model of development, but it’s thought-provoking. Even if I don’t collect data about what my problems might be and then analyze the data about what actually went wrong, I can still hold myself to some kind of process. Even though I don’t have a deadline or an antsy customer to deliver this to, I can possibly eliminate shortfalls if I just think it out before delving into code.

But then what fun would that be?

Reference (in our favorite APA format):

Pressman, R.S. (2005). Software engineering: A practitioner’s approach. New York: McGraw-Hill.

Executive Decision

After toying with C# today, I’ve decided that it is way too process-intensive to run the application on a runtime environment like .NET or Java. What I need is a simple language that can download a page, rip through text like a bandit, write the necessary fields to the database, and move on. I can organize the data when the search engine extracts that data.

I can’t commit to anything yet, but my spidey-sense is telling me that the crawler will be written in Perl with LWP. I suppose I could look at Ruby, too, but I already have my Camel book and have worked with LWP before. I haven’t tied Perl to an RDBMS, but I have done it with PHP and it must be similar. Perl can also do some limited recursion from what I understand, and if it can’t I may be able to use a database back-end to save the stacks of URLs.
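Whatever language wins, the shape of the loop is the same: pull a URL off a queue, download the page, rip out the links, push them back on. The queue plays the role of that database-backed stack of URLs, so no recursion is needed at all. Here’s a sketch of that loop in Python (standard library only, since I haven’t committed to Perl yet; the limits and URLs are placeholders):

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    """Breadth-first crawl: the queue replaces recursion entirely."""
    queue, seen = deque([start_url]), set()
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue                             # skip unreachable pages
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            queue.append(urljoin(url, link))     # resolve relative links
    return seen
```

Swap the in-memory deque for a table of URLs and you have the database-backed version, which is what a crawl bigger than memory will need anyway.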

I was ready to buy books at O’Reilly today (I chickened out of spending the money) and found a book on writing spiders. From the preview I surmised my crawler/spider must be registered. That means I have to go mainstream, doesn’t it?

And now after some more reading, I have discovered that this crawler can be used to build an index for special purposes. I can build my own search engine for this site, for example, and get much better results than I can searching the Google index for benrehberg.com. I have searched for things I know I wrote about, but never found them with Google. Building my own search engine and maintaining my own index of the site can prove useful if I keep writing about programming.

Update: I have created a new label “Web Crawler” for all posts related to this project.

How to Write a Search Engine

It seems a bit strange using the world’s best search engine to find out how to build your own. Google is my first resource in this project, though Google itself provides nothing but the idea. There is a paper at Stanford by Larry and Sergey, and that basically is the starting point. That is Google’s only contribution so far aside from the many searches I will perform.

There are three main parts to the search engine: the crawler, which tirelessly captures data from the web, the database to hold everything, and the actual search engine – the queries that put the data together in a meaningful format for you.

I could write a search engine that actually crawls the web looking for my search criteria, but that is very VERY inefficient. Google (and many others) have solved this inefficiency by effectively downloading the Web (that’s right – as much of it as they can) to their computers so they can search it much faster and have it available in one place. They’ve done a whole lot more to increase the efficiency and effectiveness of searches, but downloading the web was the first thing they did. It turns out they needed a lot of computers.
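The reason downloading everything first wins is that you can then build an inverted index: a map from each word to the set of pages containing it, so answering a query never touches the web at all. Here’s a toy version in Python (the pages and URLs are made up for illustration; a real index would live in the database machine):

```python
from collections import defaultdict

def build_index(pages):
    """pages: {url: text}. Returns word -> set of urls containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """AND-search: only pages containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for word in words[1:]:
        results &= index.get(word, set())
    return results

pages = {
    "http://example.com/a": "perl web crawler notes",
    "http://example.com/b": "building a search engine in perl",
}
index = build_index(pages)
print(sorted(search(index, "perl crawler")))   # -> ['http://example.com/a']
```

Everything past this point (ranking, phrase queries, distributing the index across machines) is refinement of that one data structure.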

I’m going to start with two. I have three desktops that no one wants to buy, and I am really tired of looking at them. I will probably need more if I get this index working soon, but there will be software considerations to make too. You can’t fit the web on one computer, no matter how big. I will learn a lot.

I have always had an interest in distributed systems and cluster computing, so this will be fun. I have a lot to learn about distributed databases and algorithm analysis. But all that is later – I haven’t even really finished thinking out the preliminaries yet. So one development/crawling machine, and one database machine. After I figure out how to crawl the web, I will begin work on performing searches. If this project holds my interest long enough, I might publish statistics at 49times.com, so keep looking. I will be posting here if I come up with anything worth publishing. I’m going to try to journal my progress and decisions without publishing code, but I realize that I very well could lose interest in this. If I get started, I will likely enjoy it and keep going, but no one can say. If you have some confidence that I will continue, you can subscribe to this blog and get the updates. Beware, though, that you’ll get everything else I write too.

And I’m Spent…

It is working. After a long battle all day yesterday (and giving up on Apache), Ruby on Rails is running. The rest of my configuration is yet to be done (no database yet), but all in good time. Take nothing for granted: this is a very powerful server. Here are the specs (and yes, it is 2008):

Fedora Core 8
450MHz Pentium II

Should serve very well for the amount of traffic I expect at 49times.com.