HotDog's Blog

Hotdog (Robert Verpalen) about C# and vb.net

vbCity Blogs moved to:
http://cs.vbcity.com/blogs


This guide collects a few points on what, in my personal view, makes code maintainable. They are only my opinion and therefore by no means actual rules. Some subjects may seem quite obvious, but these points are nonetheless often ignored.
The points are kept as general as possible across programming languages. Most things are the same in a procedural as well as an object-oriented environment. Sometimes references are made to language-specific elements in e.g. vba, vb6, vb.net or C#, but mostly it's just general programming.
The simplest rule: most copy and paste is bad! Cut and paste into a new function or method is much better.

The guide is a work in progress, with things being added as they come along. At this stage it's a looooong way from finished. For the pseudocode a vb'ish approach is chosen, because most programmers can read it.

Index:
Constants
Structures
Loops
Split code
Use objects
Strongly typed
Wrappers
Documentation

 


Constantly use constants (or variables)

It doesn't have to be a constant; it can also be a variable, but reuse that value! If you have to use a value twice, put it in a constant if possible, otherwise in a variable. It's that simple. It may be more work to set up, but it can save you a whole lot of headaches! The same goes for calculations: don't copy and paste them! If you need them twice, put the result in a variable or a function.

A simple example: a tax percentage in calculations. Here's something that's used a lot and might never change, but what if it does? In the Netherlands the percentage is 19% (for normal products and rates; food products and some other exemptions use a different percentage). It's very easy to simply use 1.19 each time to calculate the price including taxes. A previous employer used to do that, and of course everything worked fine. Luckily that value hasn't changed so far, because the consequences would be enormous: it is used in global code, in forms and in reports, and a simple search and replace of 19 would not be possible. It would be safer to define a constant (a double, 0.19) and reuse that. For getting the price including taxes, you could use a function. That way you change the constant and everything keeps working. A hard-coded value that is repeated everywhere gives errors the compiler cannot catch, which means a lot of harm can be done before the user ever finds out that he has calculated profits wrong for an extended period. And you know who'll get the blame... (and rightly so in this case :-/ )
Having the calculation and the value in one place doesn't only help when the value itself changes. I once got the request to make the tax percentage variable per client, since the customer also had Belgian and German customers himself, for whom the percentage was different. He didn't expect this to be a big adaptation, but it certainly was! It would have been an adaptation anyhow, but there was no way I could guarantee that everything was calculated correctly, because the 19 was hard coded all over the place. This particular piece of software was written in Access (hey, not my choice ;-) ), which made it harder: the value could be used in reports, in calculated fields in a form and in macros. Luckily the latter weren't used, but in forms and reports that value could not be found without going through all the separate forms in design mode: no search and replace is possible in those. As a side note, I ended up writing a tool that searched through all fields in all forms and all reports, but all of that could have been avoided simply by using a function. It did have to be a function in this case, because the customer's country matters as well, so the function needed at least 2 parameters. Changing the function's parameters would make the compiler show all the places that need to be altered. Access is a bad example in this case, because the calculated fields in forms and reports are never compiled. That's why I think Access is great for (and intended for) quick projects, but not for resale projects.
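A minimal sketch of such a two-parameter function, written in Python only because it is easy to run. The names VAT_RATES and price_including_tax are made up for this illustration. The 19% for the Netherlands is the figure from the text; the Belgian and German values are placeholders, not rates to rely on.

```python
from decimal import Decimal

# Tax rates per customer country, defined in exactly one place.
# "NL" is the 19% mentioned in the text; the other two are placeholder
# values for illustration only and must be checked before real use.
VAT_RATES = {
    "NL": Decimal("0.19"),
    "BE": Decimal("0.21"),  # placeholder
    "DE": Decimal("0.16"),  # placeholder
}


def price_including_tax(net_price: Decimal, country: str) -> Decimal:
    """Return the price including tax for a customer in the given country."""
    rate = VAT_RATES[country]  # raises KeyError for an unknown country
    return (net_price * (1 + rate)).quantize(Decimal("0.01"))


print(price_including_tax(Decimal("100.00"), "NL"))  # prints 119.00
```

Every form, report or calculation would call this one function, so a changed rate or an extra country means editing the table and nothing else.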
The above is an example with a very rigid value, but I've seen primary keys used in this way -shock- . Yep, using a hard-coded pkey is dangerous in itself, but it was used multiple times in the code, and I don't mean through a variable.... You can imagine the fun that could create when dealing with a development backend versus a production backend... Besides, code is often much easier to read when it uses named variables.
Don't even get me started on repeated calculations :-p Ever seen this: If (somecalculation) > 0 then functionA(somecalculation) else functionB(somecalculation)? A setup like this means that the calculation is always done twice (once for the comparison and once for one of the blocks). Not only does this give overhead, it also means that if you want to alter the calculation, you need to do it in three places. If the blocks are this small, you will notice that the calculation is redone, but with a couple of lines between them you may miss that second calculation. All that can be prevented by simply assigning the result to a temporary variable. Not that much effort and much safer.
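The temporary-variable fix could look like the following Python sketch. All names (some_calculation, handle_surplus, handle_shortage, process) are hypothetical stand-ins for the somecalculation, functionA and functionB of the text.

```python
def some_calculation(quantity: int, reserved: int) -> int:
    """Hypothetical calculation; stands in for any expression used more than once."""
    return quantity - reserved


def handle_surplus(amount: int) -> str:
    return f"surplus of {amount}"


def handle_shortage(amount: int) -> str:
    return f"shortage of {-amount}"


def process(quantity: int, reserved: int) -> str:
    # Calculate once, keep the result, and reuse it for the test and the call.
    result = some_calculation(quantity, reserved)
    if result > 0:
        return handle_surplus(result)
    return handle_shortage(result)
```

If the calculation ever changes, there is now exactly one line to edit.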

wrong:

function PrepareProductA_Order(Q as integer)
    if Q mod 4 <> 0 then
       display message: "the quantity has to be a multiple of 4"
    else
       if CheckNumberOfBoxes(Q/4) then
          CreateLabel(Q/4,4)
          SubtractBoxes(Q/4)
       end if
    end if
end of function

In the example, CreateLabel is a function that creates a barcode label and needs the number of boxes and the number of products per box. SubtractBoxes updates the number of boxes in stock. A logical next step would be to subtract the ProductA quantity or whatever, and more things are missing from the code. No result is returned, for example, but that's outside the focus of what the example is meant for ;-) Better would be:
Constant ProductAperBox = 4

function PrepareProductA_Order(Q as integer)
    if Q mod ProductAperBox <> 0 then
       display message: "the quantity has to be a multiple of " & ProductAperBox
    else
       boxes as integer = Q/ProductAperBox
       if CheckNumberOfBoxes(boxes) then
          CreateLabel(boxes,ProductAperBox)
          SubtractBoxes(boxes)
       end if
    end if
end of function

Lots of variations on the code are possible, but the general line is: don't hard code values unless necessary (retrieving the value from a user-editable backend might be better), and if you do hard code them, do it once.

Another example: I saw this not only at my previous job; it is a very common scenario. Of course it isn't wrong, but if you add or remove a control in this scenario, you have to do it in 2 places. Sounds easy enough, but it's also easily forgotten.
This example only has 4 fictional controls, but real code exists with page-filling blocks like this.
wrong:

if SomeCriteriumStatementHere
      Control1.Enabled = true
      Control2.Enabled = true
      Control3.Enabled = true
      Control4.Enabled = true

else
      Control1.Enabled = false
      Control2.Enabled = false
      Control3.Enabled = false
      Control4.Enabled = false

Better:

booleanVariable b = SomeCriteriumStatementHere
      Control1.Enabled = b
      Control2.Enabled = b
      Control3.Enabled = b
      Control4.Enabled = b

Better still: use loops. The example above still depends on a repetition of code, which can also be 'dangerous'. See the loops section if you want to know why ;-)


Structurize with structures

One frequently occurring scenario is that an address (like in the real world, you know, not one of those pointer thingies) is used. That address takes a couple of variables of different types. But those variables are rarely needed individually; mostly the entire address is used. So can't we pass the address as a whole, as a single element? Of course we can :)

For an example, why don't we take something less everyday than an address or person data? With those we know by now what to expect, and we use structures right away (don't you ;-) ).
Let's take a simple scenario where a program keeps score. All that matters to the procedures at this point is the name and the score.

-- under construction --

In the early days the only choice we had for keeping data indexed was a 'multi-dimensional' array. As long as you're working on that bit of code, your head knows exactly what arr(0,1,3) points to. You may even have it documented (you should anyway ;-) ), but having to change something a couple of months later is a headache-generating business.
And what for? If we make a structure (as it's called in .net; the vb6 and vba equivalent is a Type) or even a class, that array can give a much, much clearer picture of what it contains. Granted, if you want to loop a lot, the multi-dimensional array can be more convenient, but languages such as the .net ones can implement that behaviour with reflection.

I can't remember whether languages such as Turbo Pascal had something like types, but they have been around for quite some time now. Still, they were one of the unknowns to a lot of vb6 programmers. The main reason was probably that many available examples still used multi-dimensional arrays. Don't get me wrong, sometimes multi-dimensional arrays are still needed. Whether a structure or a multi-dimensional array is used will depend on the situation, but often the latter is used unnecessarily.
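As a minimal sketch of the score-keeping scenario mentioned earlier, here is the difference in Python, using a dataclass as the structure. PlayerScore and the sample names and scores are invented for the illustration.

```python
from dataclasses import dataclass

# Array style: the reader has to remember that column 0 is the name
# and column 1 is the score.
scores_as_rows = [["Alice", 120], ["Bob", 95]]


@dataclass
class PlayerScore:
    """Hypothetical structure holding the two values that belong together."""
    name: str
    score: int


# Structure style: every value is reached by name instead of by position.
scores = [PlayerScore(name, score) for name, score in scores_as_rows]

best = max(scores, key=lambda entry: entry.score)
print(best.name, best.score)  # prints Alice 120
```

Compare scores_as_rows[0][1] with scores[0].score a couple of months later: only one of them still explains itself.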

--Example coming later ---


Use loops

Loops are one of the first things that were introduced into programming, because they simply were the only way to ensure, for example, going through an entire collection of unknown length. At the most basic level, this simply means that the 'next' jumps back to the memory address where the first line of the 'for' began, until a criterion is met that jumps to the address after the 'next'.

Loops are mostly used for going through collections, but why limit yourself to existing collections? In the constants section, the following example was used:

booleanVariable b = SomeCriteriumStatementHere
      Control1.Enabled = b
      Control2.Enabled = b
      Control3.Enabled = b
      Control4.Enabled = b

Nothing wrong with that code in itself, but what a shame to have to type .Enabled every time... That in itself may be done quickly enough, but now you decide the controls have to be enabled, but read-only. Search and replace, yeah sure, why not. Then you decide you want to make them read-only and set a tooltip. You get the picture.

Array controls = {Control1,Control2,Control3,Control4}
booleanVariable b = SomeCriteriumStatementHere
foreach LoopControlVariable in controls
     LoopControlVariable.Enabled = b

Adding controls is simply adding to the array. Changing the behaviour for all of them is simply changing the inside of the loop. The array creation itself depends on the language being used, but there's always an easy way to create one (although in vb6/vba I generally had to create in-between functions using a ParamArray :-/ ). The controls in the example are neatly called ControlX, making another loop possible, but actual controls won't be that neatly numbered, of course. If you use the same collection multiple times, it can of course be put in its own function or declared at form initialization.
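A runnable Python sketch of the same idea, including the read-only and tooltip changes mentioned above. The Control class is a hypothetical stand-in for a real form control, and the control names and tooltip text are invented.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical stand-in for a form control."""
    name: str
    enabled: bool = True
    read_only: bool = False
    tooltip: str = ""


def apply_state(controls: list[Control], allowed: bool) -> None:
    # Every change that applies to all controls lives in this one loop body.
    for control in controls:
        control.enabled = allowed
        control.read_only = not allowed
        control.tooltip = "" if allowed else "You have no rights to edit this"


controls = [Control("name_box"), Control("street_box"), Control("save_button")]
apply_state(controls, allowed=False)
```

Adding a fifth control means adding one item to the list; adding a fourth property means adding one line to the loop.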


Split up code

Long blocks of code are often the hardest to debug. It is unclear which portion does what. Even when commented with 'start of this' and 'end of that', what the person debugging is looking at is a long list of text. Splitting that block of code into several sub-methods makes it easier to look at and helps with reusability.

This section will be a somewhat abstract one, since there are no clear guidelines as to when code should be split up. Then again, if you have seen methods that fill a couple of pages, you'll hopefully agree with the need to split up code.

In the 'old' days, blocks of code were all that was possible. Methods did not exist. The code lines were numbered and you could goto/gosub such a line, but no matter how much I liked the old C64, that's not something I'd choose for now ;-)
Even with the procedural programming that was introduced next, you'll still have some scenarios that will call for long blocks of code because of all the variables used, but in the OOP world, that's no excuse either.

For more on the latter, there's the 'Use objects' section, but to start with the more general approach, imagine a form constructor (or load event) which initializes that form. It sets up a new data connection, determines which controls should be disabled for the current user and builds a (context) menu specific to that same user.

You could say they sound like specific steps, and so they can easily be implemented that way: a method for each step. This may sound like overkill at first, but keeping those steps separated will also prevent variables from interfering with each other. The downside could be that more variables have to be declared, because they can no longer be reused, but this also helps with adapting the code at a later point. If a number of variables are reused across the methods, implementing an object instead should be considered.

If the 1st advantage is readability (and thus maintainability), the 2nd advantage would be the reusability. If the current-user changes or his/her rights change, steps 2 and 3 would have to be done a 2nd time. Keeping that in mind, the structure of 3 methods for the 3 steps, is not entirely what we want.

As you might have remarked to yourself already, having steps 2 and 3 completely separated would call for determining the current user twice, and the way the current user is determined may change some time as well. The first step there, of course, would be to use a function.
The 2nd is to have one method that calls the methods for steps 2 and 3. This method can be seen as the method that sets the user options. All in all, instead of one potentially large block, the layout would look more like:

Method SetDataConnection

UserObject function GetCurrentUser()
(this can be a global function )

Method SetUser
       UserObject UserVariable = GetCurrentUser()
       SetControls(UserVariable)
       CreateMenu(UserVariable)

End Method

(for those languages that support overloading, an extra option would be to insert an extra method:
     Method SetUser(UserObject))

Method SetControls(UserObject)

Method CreateMenu(UserObject)

I realize all too well that this would be overkill if the steps themselves are just a couple of lines of code. Unless I needed to reuse them, I would keep those lines together in the load event as well. But still neatly in their own little blocks, so that if they need to be split after all, methods can be created quickly.
All in all, it depends on how large the block is becoming. We won't always be able to anticipate the code growth, but it's something to keep in the back of our heads, just in case we need to split that code up later. And of course, if it's known from the beginning that the code will become a large block, split it up right away.
Finally, the example above would cause double coding if the menu creation and control enabling were entangled in the same loop or used a lot of the same variables. Be sure to think of creating an object in that case!
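The layout above could be sketched in Python as follows. It is only an illustration under assumptions: the User and Form classes, the fixed "guest" user and the control and menu names are all invented, and an optional parameter plays the role of the overloaded SetUser(UserObject).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    """Hypothetical user object; holds only what the sketch needs."""
    name: str
    is_admin: bool = False


def get_current_user() -> User:
    # Assumption: one global lookup; here it simply returns a fixed user.
    return User("guest")


@dataclass
class Form:
    connection: str = ""
    enabled_controls: list[str] = field(default_factory=list)
    menu: list[str] = field(default_factory=list)

    def load(self) -> None:
        # The load event only names the steps; each step has its own method.
        self.set_data_connection()
        self.set_user()

    def set_data_connection(self) -> None:
        self.connection = "connected"

    def set_user(self, user: Optional[User] = None) -> None:
        # The current user is determined once and shared by both steps.
        if user is None:
            user = get_current_user()
        self.set_controls(user)
        self.create_menu(user)

    def set_controls(self, user: User) -> None:
        self.enabled_controls = ["view"] + (["edit"] if user.is_admin else [])

    def create_menu(self, user: User) -> None:
        self.menu = ["Open"] + (["Manage users"] if user.is_admin else [])
```

When the user or his/her rights change, calling set_user again redoes steps 2 and 3 without touching the data connection.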


Use objects

As described in the Structures section, keeping multiple related values inside a structure (= also an object) keeps them nicely together and makes them easier to pass around. But now we want to be able to do all sorts of things with those values: we need to check whether an address is valid or whether a user has specific rights. Or maybe we just have to do a lot of calculations and keep an intermediate result.

What is often seen is that a bunch of methods are added that take the structure as a parameter or, worse yet if no structures are used, a lot of loose variables. But the whole intention of OOP is that a method can be called on the object itself.
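A minimal Python sketch of that difference: the check and the formatting live on the object instead of in loose functions that take it as a parameter. The Address class, its fields and the validation rule are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Hypothetical address object: the values plus the checks that belong to them."""
    street: str
    postal_code: str
    city: str

    def is_valid(self) -> bool:
        # Deliberately simple rule for the sketch: no field may be blank.
        return all(part.strip() for part in (self.street, self.postal_code, self.city))

    def label(self) -> str:
        return f"{self.street}\n{self.postal_code} {self.city}"


home = Address("Main Street 1", "1234 AB", "Utrecht")
if home.is_valid():
    print(home.label())
```

The caller never needs to know which fields make an address valid; that knowledge stays with the address.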

Although this difference no longer exists in this form in a lot of modern OOP languages, for this guide we'll see structures as objects that just contain values (think of vb6 types) and objects as elements that can also contain methods/functions and properties. From this point of view, all structures are objects, but an object doesn't have to be a structure.

For the sake of mentioning it: in modern languages such as the .net ones, structures can have methods and functions as well. The difference is in how they are stored, but that's outside the scope of this guide. There are several articles on structures versus objects on the internet if you look for them.

Objects and OOP are not just something invented for the fancy words. They really can make your programming life a lot easier. More than that, they make some programming that seemed almost impossible much easier. We just mustn't forget to use those options.


Keep things strongly typed


Wrap things up

 


Document your code

'Yeah, yeah, I know' is probably your first reaction. At least it would be mine if I read this :D Still, this very point is the most easily neglected. Especially in the middle of developing: the start goes well, with the important functions documented. Then you delete such a function and all that documentation was for nothing. That's the main reason I skip documentation until there's some form of final draft. Only.... by then you're busy with something else.

I wouldn't be able to tell you at which stage you should document. There may be a lot of opinions on that subject, but the truth is: "it always depends". Many times you will call a function and find it undocumented. Although late, this might be a good time to write some quick documentation (assuming it is your function). Since you're calling it, you'll probably know at that time what it does. The details will come back to you when going through it. But 2 months later, you'll have to dive much deeper to recollect the same, let alone when working with multiple people on the same project. .net especially has great ways to document functions and classes (with the summary tags), making even creating an msdn-style help interface a breeze, but you will still need to write the documentation.
The bottom line: if you skip something as basic as basic documentation, you may have a lot more work laid out for you later on.
NB: I'm absolutely not talking about those annoyingly long 'rules' that some companies impose on each and every function. Commentary should be useful, not obligatory.

eg. some pseudocode (the block above each function is the commentary; it's only pseudocode, so I didn't include any tags or commentary characters):

This is useless

----------------------------------------------------------------------
function: GetCurrentUserName
----------------------------------------------------------------------
description: returns name of current user
returns: name of current user
returntype: string
parameters: none
----------------------------------------------------------------------

string-function GetCurrentUserName
    return CurrentUser.Name
end of function

Some companies expect comment blocks like this, so the programmer makes them, but they add nothing. The only effect the commentary has is a Homer Simpson 'duh' reaction. Imho it only makes finding code harder. I've seen cases like the example above where the comment block is much larger than the function itself, with unhelpful commentary. Even where the function was only one line and the name of the function said everything it did, it had about 6 lines of commentary formatted as in the example above.
If you have a piece of code with a lot of these functions underneath each other, this can be very annoying. Of course this is just an opinion. If you include commentary, make sure it says something (and otherwise keep it short ;-) ), eg:
returns the name of the user as it was validated in the current thread.
For more info see reference to CurrentUser property
---------------------------------------------------------------------

string-function GetCurrentUserName
    return CurrentUser.Name
 end of function

The line (------) is optional and will depend on the IDE being used. In some IDEs the separation between functions will be clear enough. In the .net designer, the summary and other xml tags make lines like this unnecessary.
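In a language with built-in documentation support, the useful comment could be sketched like this. The example is in Python, where a docstring plays a role comparable to the summary tags; the User class and the current_user variable are hypothetical stand-ins for the CurrentUser property.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical user object."""
    name: str


# Hypothetical stand-in for the CurrentUser property of the pseudocode.
current_user = User("guest")


def get_current_user_name() -> str:
    """Return the name of the user as it was validated for the current session.

    See ``current_user`` for how that user is determined.
    """
    return current_user.name


# The documentation travels with the function and can be read by tools.
print(get_current_user_name.__doc__)
```

The docstring says only what the function name does not already say, and no separator line is needed.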

posted on Wednesday, May 11, 2005 8:55 AM