Monday, 29 January 2007

Pointers demystified

Note: The RichText blog has moved to www.ricroberts.com

Over the last few months, my day job has required me to do a bit of C++ development. Before this, the last time I did commercial C++ was in one of my first developer roles back in 2002, so I've recently been rediscovering its joys.

The thought of C++ often scares people, but that needn't be the case. I was lucky enough to come to C# after learning C++, but the transition the other way shouldn't be that difficult either. In fact, I would definitely recommend that C-sharpers get to know C++, if they don't already, as it provides a greater understanding of C#'s inner workings.

Much of the syntax is similar, and one of the few big differences is memory management (in the form of pointers). This post aims to demystify pointers and so, hopefully, remove one of the final barriers stopping people from having a go at C++.

What the heck is a pointer anyway?

A pointer is just a variable that holds a memory address (i.e. the location of a value in memory). Pointers perform a similar function to reference types in C#, where the variable itself just holds the address of an object somewhere on the heap.

Making a pointer

To make a pointer, you just declare a variable of a certain type, but precede the name with an asterisk:

int *pMyPointer = NULL;
(I just said that a pointer is a variable that holds a memory address, so you might be thinking: "Why do you need to give it a specific type?". Well, the type you specify is the type of the value stored at the memory address to which the pointer "points". This lets the compiler know how many bytes to read or write at that memory location, e.g. 4 bytes for a 32-bit integer.)

Anyway, in the example above, I created a pointer which will point to an integer, and I gave it the value NULL, i.e. it points to nothing yet. (Note that NULL in C++ just means 0.)

The opposite of a pointer

The opposite of a pointer, I suppose, is the address-of operator (&). This lets you get the memory address of a variable.
int myCount = 132;
pMyPointer = &myCount;
Here, the address of the myCount variable is stored in the pMyPointer pointer. So now, it points to our integer with value 132.

Assigning values at memory locations

You can assign to the location in memory to which your pointer points by using the syntax:
*pMyPointer = 12;
Here I've asked to store the value 12 at the memory address held in pMyPointer, i.e. where myCount is stored. This is called indirection. (Notice that the asterisk means a different thing here than when you declared your pointer. Here the asterisk means "the value held at...".)

You can also use indirection to get stuff back out of addresses referenced by pointers. e.g. to get the value held at the address in pMyPointer:
int myNumber;
myNumber = *pMyPointer;
Since we've changed the value stored at the location to which pMyPointer points, myNumber will get assigned the value 12.
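Putting the snippets so far together, here is a minimal sketch of the whole round trip. The WriteThroughPointer function is just a made-up name for illustration; the variables are the same ones used above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: writes newValue through a pointer and reads it back.
int WriteThroughPointer(int newValue)
{
    int myCount = 132;               // an ordinary int on the stack
    int *pMyPointer = NULL;          // points to nothing yet
    pMyPointer = &myCount;           // now holds the address of myCount

    *pMyPointer = newValue;          // indirection: store at the pointed-to address

    assert(pMyPointer == &myCount);  // the pointer itself has not moved
    int myNumber = *pMyPointer;      // indirection again: read the value back
    assert(myNumber == myCount);     // both names see the same value
    return myCount;                  // myCount was changed via the pointer
}
```

Calling WriteThroughPointer(12) returns 12, even though myCount is never assigned to directly after its declaration.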

Heaping it up

In C++, when you use the 'new' keyword, it allocates memory on the heap and returns its address, which you need to assign to a pointer. e.g. to allocate enough space on the heap for an int, and put the memory address in a pointer called pYourPointer:
int *pYourPointer = new int;
Then you can assign a value like before.
*pYourPointer = 500;

My memory is leaking!

Now, when you're working with things on the heap, they aren't cleaned up when your method ends, unlike variables on the stack. Pointers are variables like any other, so if one goes out of scope you lose the only thing that told your program where to look in memory for your precious object. That memory is now unavailable until the program quits. Continuing with our example, adding the line:
delete pYourPointer;
deletes/frees up the memory that the pYourPointer pointer points to, so it can be used again. If we forgot to do that, and then pYourPointer went out of scope, we would have 'leaked' memory.

You can also leak memory if you reassign a pointer without calling delete first. e.g. if we left out the delete on the 2nd line of this bit of code:
MyObject *pPointy = new MyObject(1,3);
delete pPointy;
pPointy = new MyObject(4,5);
The final line above sets pPointy to another memory address, and without deleting the contents at the original location, it would be lost forever! (or at least until the program ends). You might think: "My computer's got loads of memory, and the program will end soon enough!"... but what if it was a Windows Service that ran indefinitely? If there were lots of loops leaking memory then it might not be long until you ran out.

After you call delete on an object, for safety's sake you should always null out the pointer to avoid leaving it dangling and accidentally using the deleted object again (which is undefined behaviour and would most likely crash your app). Note that it's always safe to call delete on a null pointer.

pPointy = NULL;
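As a rough sketch of the full allocate, use, delete and null-out cycle described above (UseHeapInt is a hypothetical name, not something from a library):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: allocates an int on the heap, uses it, and cleans up.
int UseHeapInt(int value)
{
    int *pYourPointer = new int;   // reserve space for one int on the heap
    *pYourPointer = value;         // store a value there
    int copy = *pYourPointer;      // copy it out before freeing the memory
    assert(copy == value);

    delete pYourPointer;           // hand the memory back
    pYourPointer = NULL;           // null out so nothing is left dangling

    delete pYourPointer;           // deleting a null pointer is a harmless no-op
    return copy;
}
```

The second delete is only there to show that deleting a null pointer does nothing; without the NULL assignment before it, that line would be a double delete.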

Destructors

The example above illustrates another point (no pun intended). Calling delete on an object calls its destructor. If your class MyObject had member variables which themselves were stored on the heap, then you would delete them in the destructor. A C++ destructor has the same syntax as a C# destructor, i.e.
~MyObject()
{
// do cleaning up here!
}
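but a C++ destructor runs at a time you control (when you call delete or, for an object on the stack, when it goes out of scope), rather than whenever the Garbage Collector gets round to it as in C#.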
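Here is a small sketch of a destructor cleaning up a heap member. The post never shows the body of MyObject, so everything inside the class (the pSum member, the Sum() method and the destroyedCount counter) is assumed purely for illustration.

```cpp
#include <cassert>

// Hypothetical class: owns one heap-allocated int and counts destructor calls.
class MyObject
{
public:
    static int destroyedCount;          // illustrative counter only

    MyObject(int a, int b) : pSum(new int(a + b)) {}

    ~MyObject()
    {
        delete pSum;                    // free the heap member this object owns
        ++destroyedCount;
    }

    int Sum() const { return *pSum; }

private:
    int *pSum;                          // member stored on the heap
    MyObject(const MyObject &);         // copying disabled to avoid a double delete
    MyObject &operator=(const MyObject &);
};

int MyObject::destroyedCount = 0;

// Returns how many destructors ran inside this function.
int CountDestructorCalls()
{
    int before = MyObject::destroyedCount;

    MyObject *pPointy = new MyObject(1, 3);
    delete pPointy;                     // destructor runs here

    {
        MyObject onStack(4, 5);         // object on the stack
        assert(onStack.Sum() == 9);
    }                                   // destructor runs here, at end of scope

    return MyObject::destroyedCount - before;
}
```

The copy constructor and assignment operator are declared private (and left undefined) so that two MyObject instances can never end up deleting the same pSum.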

Getting at your members

One of the weirdest things for me, coming back to C++ after years of programming mainly in C#, was getting used to the way you access members of objects on the heap again. Say you've got a pointer to a Person object, and the Person class has a GetNumberOfToes() method: you'd use the points-to operator (->) to call it. (Note that the familiar dot operator is used for objects on the stack.) e.g.
Person *pMyPerson = new Person();
int toes = pMyPerson->GetNumberOfToes();
delete pMyPerson;
Note: You can actually call the method on the object pointed to by pMyPerson like this:
(*pMyPerson).GetNumberOfToes();
i.e. call GetNumberOfToes() on the object held at the memory location pointed to by pMyPerson... but that just looks naff.
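To see the three spellings side by side, here is a sketch with a stand-in Person class. The class body and the value 10 are assumptions, since the post doesn't define Person, and CountToesThreeWays is a made-up name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical Person class; the original post does not show its definition.
class Person
{
public:
    int GetNumberOfToes() const { return 10; }  // assumed value for illustration
};

// Calls the same method via ->, via (*p). and via plain dot, and sums the results.
int CountToesThreeWays()
{
    Person *pMyPerson = new Person();
    int viaArrow = pMyPerson->GetNumberOfToes();      // points-to operator
    int viaDeref = (*pMyPerson).GetNumberOfToes();    // dereference, then dot
    assert(viaArrow == viaDeref);                     // the two forms are equivalent
    delete pMyPerson;
    pMyPerson = NULL;

    Person stackPerson;                               // object on the stack
    int viaDot = stackPerson.GetNumberOfToes();       // plain dot operator

    return viaArrow + viaDeref + viaDot;
}
```

The brackets in (*pMyPerson) matter: without them the dot binds more tightly than the asterisk, and the line won't compile.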

Another use for the Ampersand... References

Having read all that, you might start poking round some C++ projects that you've got access to, but were previously too scared to open. For completeness, I thought that I would continue to explain another use for the & operator, so you don't end up completely confused when you see it everywhere.

The & operator is also used for references. References are really just a way of giving an object another name. Anything done to the reference is done to the original. e.g.
int myInt;
int &refToMyInt = myInt;
Now assigning a value to either will set the same value at a single memory location. e.g.
refToMyInt = 930; //actually sets myInt to 930
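A quick sketch to back that up (ReferenceSharesStorage is a hypothetical name): it checks that writes through either name are visible through the other, and that both names have the same address.

```cpp
#include <cassert>

// Returns true if the reference and the original behave as one variable.
bool ReferenceSharesStorage()
{
    int myInt = 0;
    int &refToMyInt = myInt;                      // another name for myInt

    refToMyInt = 930;                             // writes to myInt
    bool sameValue = (myInt == 930);

    myInt = 42;                                   // and the reference sees this too
    bool seesChange = (refToMyInt == 42);

    bool sameAddress = (&refToMyInt == &myInt);   // one memory location
    assert(sameAddress);
    return sameValue && seesChange && sameAddress;
}
```

Note that, unlike a pointer, a reference has to be initialised when it is declared and can't later be made to refer to a different variable.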


Passing by reference

This is a familiar concept to C# programmers, and is simple in C++ too. You just declare a function with the parameters that you want to pass by reference with preceding ampersands. e.g.
int DoSomething(int &intOne, int &intTwo)
{
    intOne = intOne + intTwo;
    return intOne;
}
and when you call it, just pass the values in 'as normal':
int anInt = 1, anotherInt = 2;
DoSomething(anInt, anotherInt);
The integers passed in can be changed by the body of the method: here anInt becomes 3, while anotherInt (which the method never assigns to) stays at 2.
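For contrast, here is a sketch that puts the by-reference function next to a by-value twin. DoSomethingByValue and AnIntAfterCall are names invented for this illustration; DoSomething is the function from the post.

```cpp
#include <cassert>

// Same function as in the post: intOne is modified in the caller.
int DoSomething(int &intOne, int &intTwo)
{
    intOne = intOne + intTwo;
    return intOne;
}

// Hypothetical by-value variant for contrast: it only changes its own copies.
int DoSomethingByValue(int intOne, int intTwo)
{
    intOne = intOne + intTwo;
    return intOne;
}

// Hypothetical helper: returns the caller's anInt after one of the two calls.
int AnIntAfterCall(bool byReference)
{
    int anInt = 1, anotherInt = 2;
    if (byReference)
        DoSomething(anInt, anotherInt);
    else
        DoSomethingByValue(anInt, anotherInt);

    assert(anotherInt == 2);   // the second argument is never assigned to
    return anInt;
}
```

Both functions return 3, but only the by-reference version leaves the caller's anInt changed. Nothing at the call site tells you which one you're dealing with, which is worth bearing in mind when reading unfamiliar code.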

Want more?

There's loads more I could talk about here (I'm getting a bit carried away!), but I'd be here all night if I carried on. If you want to know more, buy a book!




Please also visit the Swirrl blog

Friday, 26 January 2007

Setting up rails on OS X (including radrails)


Hello. This post documents how to get rails going on Mac OS X (I'm using 10.4.8, by the way).

Well, basically I followed these brilliant instructions on Hivelogic.com (except I used ruby 1.8.5 instead of 1.8.4) and it just worked! The only little quirk was getting the lightTPD web server to work properly. I had to tweak the paths on the first line of the dispatch.cgi, dispatch.fcgi and dispatch.rb files (in the public folder of my rails project) as they were pointing to a Windows-style path.

Radrails pretty much just worked "out of the box" too, but to get radrails to start the lightTPD server properly, I set up a symbolic link like this...

ln -s /usr/local/sbin/lighttpd /usr/bin/lighttpd
...as radrails was having trouble finding it initially. (Credit for this last bit goes to Marc from the radrails community). Note that even after this link is set up, radrails never reports the lightTPD server as 'started' - it just stays as 'starting...', but it does actually seem to start it properly. I also noticed that the lightTPD server starts on the last port you used for it, and it doesn't pay attention to the one that you specify in radrails.

Remember this post from September where I explained how to set up debugging in radrails on Windows XP? Setting up debugging for radrails in OS X is almost exactly the same, but in the interpreter arguments page of the debug dialog you'll want to put something like:

-I"/usr/local/lib/ruby/gems/1.8/gems/rails-1.2.1/bin"

Also, I had trouble getting debugging to work with lightTPD, so I've put the following in the program arguments to force it to use webrick instead.

webrick -p3001

That's it. Enjoy!

UPDATES (March 07):
There are now updated instructions on Hivelogic.com.
You might also want to try Textmate and ruby-debug




Thursday, 25 January 2007

when is private not private?


If a method within a class is marked as 'private', you might expect that it is only accessible from inside that instance. Well, you'd be wrong (at least you would in C# - and I assume the rest of Microsoft.NET too).

In C#, if you define a member as private, then it is indeed accessible only to that class. However, it isn't restricted to only that instance: any instance of that type can access the private members of another instance of the same type. This is a bit counterintuitive to me, and freaked me out a bit when I first saw it, as it seems to break encapsulation! Try it if you don't believe me.

Ruby has a different approach, and one which goes some way towards placating my OO-purist tendencies. As you'd expect, public methods can be called by anyone. Protected methods can be invoked only by objects of the defining class and its subclasses (ANY instance of the defining class). Private methods can only be called in the current object's context, i.e. you can't invoke another object's private methods, no matter what its type.




ruby instance variables in derived classes


I thought I would tell the world about something in the ruby language which surprised me a bit. Anyone who has used ruby for any length of time will no doubt already know this, but it caught me off guard a little, coming from a C++, Java and C# grounding in OO.

First a bit of background for the uninitiated... In ruby an instance variable is one preceded by an @ sign and, as with all variables in ruby, you can create one simply by assigning to it.

Anyway, in ruby, instance variables are automatically accessible to any derived classes. This is especially pertinent in rails, when dealing with the (base) ApplicationController. If you add a before_filter method to the ApplicationController, which assigns some instance variables, they are magically available in the derived controllers, without having to add accessor attributes to the base class.

Here is a very simple example...

class Parent
  def parent_meth
    @var = 'hello'
  end
end

class Child < Parent
  def child_meth
    @var
  end
end

>> child = Child.new
=> #
>> child.parent_meth
=> "hello"
>> child.child_meth
=> "hello"
As I said, I'm sure this is child's play for most rubyists, but I've read Dave Thomas's Ruby "Pick Axe" book, and don't remember seeing this explicitly said anywhere. In languages like C#, variables are private unless made otherwise (e.g. by use of an access modifier, or exposed by a property/method). It appears that in ruby, instance variables are by default protected. (Don't get me started on access modifiers - that's for another blog entry!).

I hope this post helps prevent some chin-scratching for other ruby newbies.

(If you're interested in reading more about this topic, here's a discussion I started in the ruby forum).




Friday, 5 January 2007

MacBook - Dead Pixel ...Day17


In my last post I described how my new Apple MacBook arrived just before Christmas with a dead pixel in the screen. My replacement MacBook finally arrived last night, the screen is perfect and everything else seems to be fine with it too.

I've managed to get all my data/applications off the old one and onto the new one without much fuss. All I've got to do now is arrange for the defective one to be picked up by the courier.

Apple took their time with the replacement, but overall I'm pretty pleased with the way that this has been sorted out, considering some manufacturers would have just told me to live with the dead pixel.


