Tuesday, December 16, 2008

C++ interview questions/inspiration - part 1

Interviewing candidates is a tough proposition. Every interviewer has a set of favourite questions that they expect to be good differentiators. In this first part I'll give you one of mine. Others will follow.

This is not a test

There is a difference between a test and an interview. I consider tests to be offline - i.e. they can be done by the candidate alone in a room, and the results are compiled later by a grader. Tests are of course very useful, but I think they are even harder to design, especially if you don't want them to last 8 hours.

I always ask the candidate to speak to me while he's writing code. Especially when he's changing his mind and reaching for the eraser. The whole point is to know how he thinks. When he's heading down a wrong path, stop him. Ask questions. Can he understand what's wrong?

Don't do this at home

I mean: if you are trying to raise the bar at your company, don't design the interview questions alone. Ask your fellow programmers to help. The implicit question is: what do you need to know to work here? What is the strict minimum a person can master and still be helpful? I expect that you will be able to reach a consensus that includes many pieces of knowledge not mastered by all the programmers you work with. If used smartly (no humiliation!), the list can be transformed into a wonderful tool of education.

Question 1

The procedure reverse_words takes a string of characters as input and reverses the order of the words. For example, the result of reverse_words( "one two three" ) is "three two one". On the board, code two versions of this procedure: the first in your favourite programming language, the second in C++.

Runtime performance is important, but you should first consider code clarity and maintainability.

All the good programmers I know can give a pretty good answer very fast. Yet, many candidates fail miserably.

Two is better

There is no such thing as a good C++ programmer who only knows C++. And, of course, if the candidate prefers C++ to everything else, he should have a very good excuse :-) Also, some programmers have the knee-jerk reflex of using low-level constructs when they code in C++, yet they are able to come up with quite decent interfaces when they work with a cleaner language. The differences in the implementations they provide can lead to revealing discussions.

Oh -O

The candidate should be able to talk about the big-O runtime complexity of his algorithms. He should also be able to tell you how much memory is necessary. If he comes up with an O(n²) algorithm, now is a good time to ask if he can imagine a cheaper implementation.

The devil is on your side

Because it's easy enough to come up with an algorithm that works, it's fine to focus on the implementation details. If memory is allocated, how is it released? What happens if exceptions are thrown? Should there be an error code returned? If the interface takes a const char*, what happens when it's null?
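
For illustration only - not the one true answer - here is the kind of C++ sketch I would be happy with: split the input into words, then rebuild the string in reverse order. It runs in O(n) time with O(n) extra memory; an in-place variant (reverse the whole buffer, then reverse each word) trades that extra memory for slightly fiddlier code.


#include <sstream>
#include <string>
#include <vector>

// Returns the words of 'text' in reverse order, separated by single spaces.
// Clarity first: split into words, then rebuild the string backwards.
// Note: runs of whitespace collapse to a single space - a simplification
// worth discussing with the candidate.
std::string reverse_words( const std::string& text )
{
    std::istringstream input( text );
    std::vector< std::string > words;
    std::string word;
    while ( input >> word )
        words.push_back( word );

    std::string result;
    for ( std::vector< std::string >::reverse_iterator it = words.rbegin();
          it != words.rend(); ++it )
    {
        if ( !result.empty() )
            result += ' ';
        result += *it;
    }
    return result;
}
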

So, what would be your ideal answer?


goto part 2

Monday, December 15, 2008

object.method() is not the state of the art

Mainstream is mediocre

Mainstream object-oriented programming languages are very lame at the very thing that supposedly defines them. C++, C#, Java and so many other object-oriented programming languages are so far behind the state of the art that it's shameful.

The simple truth is that Object.Method() is not the state of the art. If that's what it looks like when you invoke a member function, it means you don't have multimethods. Do you have access to your metaclasses? Can you customize your object model to increase performance, to help migrate code?

It's making us worse programmers

The lack of power of our programming languages means we produce code that's more verbose. Sadly, we get used to it. We may even find a visitor beautiful ;-) When there is more code, we get better at ctrl-c ctrl-v ctrl-h. That leads to programmers asking for better refactoring tools for their preferred language. Sometimes, those who sell the refactoring tools also have a big say in the evolution of languages. Of course they'll take the easy path: they will give us what we ask for. They won't make the language more powerful - for them, that would be stupid. First, because it would probably make the refactoring tools less reliable and more complex. Also, it would kill a nice market... it's not to their advantage to make the languages less verbose.
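
To make the complaint concrete, here is a minimal sketch (invented shapes, not from any real codebase) of what dispatching on the dynamic types of two arguments costs in C++. In a language with multimethods, the four collide cases below would be four plain method definitions and the forwarding boilerplate would simply not exist.


#include <iostream>

// Manual double dispatch: the classic C++ workaround for missing multimethods.
struct Circle;
struct Square;

struct Shape
{
    virtual ~Shape() {}
    virtual void collide( Shape& other ) = 0;   // first dispatch
    virtual void collide_with( Circle& ) = 0;   // second dispatch
    virtual void collide_with( Square& ) = 0;
};

struct Circle : Shape
{
    void collide( Shape& other ) { other.collide_with( *this ); }
    void collide_with( Circle& ) { std::cout << "circle/circle\n"; }
    void collide_with( Square& ) { std::cout << "circle/square\n"; }
};

struct Square : Shape
{
    void collide( Shape& other ) { other.collide_with( *this ); }
    void collide_with( Circle& ) { std::cout << "square/circle\n"; }
    void collide_with( Square& ) { std::cout << "square/square\n"; }
};

int main()
{
    Circle c;
    Square s;
    Shape& a = c;
    Shape& b = s;
    a.collide( b );   // prints "square/circle" - resolved on both dynamic types
    return 0;
}
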

The solution is our impatience

We don't know what we're missing. There is no buzz about CLOS because it's old and there's no big corporation supporting it with millions in marketing. If we find out what's out there and how good it is, the Microsofts and Suns of the world will understand they have to push for a better - more powerful - language. We'll tell them! The solution is to read. Here is the deal: I give you a few suggestions of reading material, and you do the same.

My suggestions:
  1. Object-Oriented Programming in Common Lisp: A Programmer's Guide to CLOS - but you probably want to read up on Lisp first.
  2. The Art of the Metaobject Protocol


What should I read?

Friday, December 12, 2008

boost::python vs swig::python - a benchmark

Disclaimer: benchmarks are evil, and this one is no different. It probably measures something that is completely irrelevant. But, you know, benchmarks are fun.

What I did: I generated a very simple C++ library that contains only one-liners returning integer constants. Then I used boost::python and swig::python to make Python wrappers for it.
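
To give a rough idea of the setup (the generator itself is not shown; the function name fun0 and the module name fun are taken from the test snippet below), the boost::python side of one such one-liner looks something like this:


// fun.cpp - sketch of one generated one-liner and its boost::python binding.
// The real library contains many near-identical functions like this one.
#include <boost/python.hpp>

int fun0() { return 0; }   // trivial one-liner returning an integer constant

BOOST_PYTHON_MODULE(fun)
{
    boost::python::def( "fun0", &fun0 );
}


The SWIG version wraps the same header through an interface (.i) file instead of a hand-written module definition.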

The runtime test

The test is simple enough: I call one of the trivial functions 1000000 times and measure the time using this:


stmt = """
import fun
s = 0
for i in xrange(1000000):
s += fun.fun0()
"""
import timeit
T = timeit.Timer(stmt)
print min( T.repeat(10,1) )





There you have it : swig generates faster code in this useless, totally artificial, case.

When size matters

In many cases (embedded applications, for example) size matters. In the following graphic you can see the size of the linked pyd file (in kB) with 10, 100, 1000 and 10000 nearly identical one-liners generated.

(chart omitted: pyd size in kB for 10, 100, 1000 and 10000 generated one-liners)

Swig sure has a bigger size overhead when you are binding only a few functions. Then boost grows faster, maybe because of the infamous code bloat associated with template metaprogramming - but take this with a *huge* grain of salt.

The versions

I used boost 1.37 and swig 1.3.36 on Windows Vista with the Microsoft toolchain (32-bit C/C++ compiler version 15.00.30729.01).
The C++ compiler flags were

/FC /O2 /arch:SSE /arch:SSE2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /D_SCL_SECURE_NO_WARNINGS /D_CRT_SECURE_NO_DEPRECATE /FD /EHsc /GR /W3 /nologo /c /TP /errorReport:prompt


and the linker flags were

/NOLOGO /SUBSYSTEM:CONSOLE /MACHINE:X86 /OPT:REF /OPT:ICF


I'm still using Python 2.5.2.

Thursday, December 11, 2008

boost::python iterators on non-standard containers

If you use C++, you have to love boost. For one thing, it helps you get away from C++ by offering a nice bridge to Python :-) Boost has always had great documentation, but from time to time, one or two extra samples would help. Here is one very small contribution.


You have a C++ container type that is accessible to Python using boost::python. Of course, you want to iterate over all the elements from Python code, like this:


my_container_instance = make_a_container()
for x in my_container_instance:
    print x


When we look at the documentation, we find that we need to introduce the binding for __iter__ like this:


.def("__iter__", iterator<vector<int> >())



And it works, as long as your container is a std::vector<int>.
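
For context, that .def line lives inside a class_ binding. A minimal sketch (the module and class names here are made up):


#include <vector>
#include <boost/python.hpp>

BOOST_PYTHON_MODULE(containers)   // hypothetical module name
{
    using namespace boost::python;
    class_< std::vector<int> >( "IntVector" )
        .def( "__iter__", iterator< std::vector<int> >() );
}
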


My container is a polygon, not a std::vector. It does not have a typedef for iterator; it has a typedef for Vertex_iterator. It has no begin() method; it has a vertices_begin() method. You could have a similar problem. If that's the case, you can expect a compilation error like this one:


c:\...\boost\python\iterator.hpp(50) : error C2039: 'begin' :
is not a member of 'Polygon'


Here is the offending code:


// Guts of template class iterators<>, below.
template <bool const_ = false>
struct iterators_impl
{
    template <class T>
    struct apply
    {
        typedef typename T::iterator iterator;
        static iterator begin(T& x) { return x.begin(); }
        static iterator end(T& x) { return x.end(); }
    };
};


And it is used by this:


// An "ordinary function generator" which contains static begin(x) and
// end(x) functions that invoke T::begin() and T::end(), respectively.
template <class T>
struct iterators
: detail::iterators_impl<
boost::is_const<T>::value
>::template apply<T>
{
};


That's not hard to fix, and there is nothing to modify in boost::python itself. What a nice illustration of the Open/Closed Principle.

The only thing to do is to specialize the iterators template to provide our own begin/end functions. Somewhere in the .cpp file, before the line where you added the __iter__ declaration, add something like this:


namespace boost
{
namespace python
{
    template <>
    struct iterators< Polygon >
    {
        typedef Polygon::Vertex_iterator iterator;
        static iterator begin( Polygon& x) { return x.vertices_begin(); }
        static iterator end( Polygon& x) { return x.vertices_end(); }
    };
}
}
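
To see how the pieces fit together, here is a self-contained sketch. The Polygon below is only a stand-in with the same member names as mine (the real class comes from a geometry library), and the module name and factory function are made up:


#include <vector>
#include <boost/python.hpp>

// Stand-in for the real Polygon: just enough members for the binding.
struct Polygon
{
    typedef std::vector< double >::iterator Vertex_iterator;
    std::vector< double > vertices;
    Vertex_iterator vertices_begin() { return vertices.begin(); }
    Vertex_iterator vertices_end() { return vertices.end(); }
};

namespace boost
{
namespace python
{
    // The same specialization as above, repeated so this sketch is complete.
    template <>
    struct iterators< Polygon >
    {
        typedef Polygon::Vertex_iterator iterator;
        static iterator begin( Polygon& x ) { return x.vertices_begin(); }
        static iterator end( Polygon& x ) { return x.vertices_end(); }
    };
}
}

// Made-up factory so the Python side has something to iterate over.
Polygon make_a_container()
{
    Polygon p;
    p.vertices.push_back( 1.0 );
    p.vertices.push_back( 2.0 );
    return p;
}

BOOST_PYTHON_MODULE(geometry)   // hypothetical module name
{
    using namespace boost::python;
    class_< Polygon >( "Polygon" )
        .def( "__iter__", iterator< Polygon >() );
    def( "make_a_container", &make_a_container );
}


From Python, p = geometry.make_a_container() followed by "for x in p: print x" then behaves exactly like the loop at the top of this post.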


And have fun :-)

Wednesday, December 10, 2008

My take on C++ template metaprogramming

Fortunate accident or deliberate mistake, C++ template metaprogramming should be a tool in any C++ programmer's toolbox; or should it? Here is what comes to mind when I think about it (in no particular order):

C++ template metaprogramming...

  • useful
  • painful
  • functional
  • long compile times
  • code bloat
  • optimization
  • verbose
  • typename
  • necessary to avoid duplication

With a little work I can put some order in there.

The Good: useful, optimization, necessary to avoid duplication
The Bad: painful, long compile times, code bloat
The Ugly: Well, yeah. It's ugly :-) Syntax Schmintax.

The Good

You need it - you really do. Object-oriented programming failed to kill code duplication. In C++, the only way to write code whose size stays roughly proportional to the features you implement is generic programming. Pretty soon you will see that genericity is not free: you need to consider the different types of parameters you can deal with. If all you want to do is dispatch to different implementations based on type properties, you will need to know a little about C++ template metaprogramming. If you are up for a challenge and you try to build your algorithms and data structures in a truly generic way, you'll need to go through whole books... not that it's a bad thing.
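
As a tiny, self-contained illustration of the "dispatch on type properties" case (home-made trait and made-up function names; real code would typically use boost::type_traits instead):


#include <iostream>

// A minimal type property: is T a pointer?
template< typename T > struct is_pointer      { enum { value = false }; };
template< typename T > struct is_pointer<T*>  { enum { value = true }; };

// A tag type carrying a compile-time boolean.
template< bool B > struct bool_tag {};

template< typename T >
void print_value( const T& x, bool_tag<false> )   // generic implementation
{
    std::cout << x << std::endl;
}

template< typename T >
void print_value( const T& x, bool_tag<true> )    // pointer implementation
{
    std::cout << *x << " (via pointer)" << std::endl;
}

template< typename T >
void print_value( const T& x )
{
    // Pick an implementation at compile time based on the type property.
    print_value( x, bool_tag< is_pointer<T>::value >() );
}

int main()
{
    int i = 42;
    print_value( i );    // generic version
    print_value( &i );   // pointer version
    return 0;
}
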

The Bad


template< typename src_functor_type, typename kernel_accessor_type >
inline pixel_functor< horizontal_convolution_pixel< pixel_functor< src_functor_type >, kernel_accessor_type > >
convolveh( const pixel_functor< src_functor_type >& src,
           const kernel_accessor_type& kernel,
           typename kernel_accessor_type::value_type norm )
{
    typedef typename horizontal_convolution_pixel< pixel_functor< src_functor_type >, kernel_accessor_type > fun_type;
    return pixel_functor< fun_type >( fun_type( src, kernel, norm ) );
}



That's what happens when it's used in real life. Since that's some of my own code (from scalex), I hope there are worse examples out there.

Can you see what's happening in that function? It's a simple call to a constructor. That's it. This function exists only so that users don't have to explicitly name all the types they use. They should not have to, because those types can obviously be deduced from the context. So there you have it: C++ template metaprogramming means you have functions where the declaration of the return type is longer than the function itself.

There was a time when I wished that one day this would be possible:


template< typename T >
auto my_function( const T& parameter )
{
    return some_valid_cpp_expression;
}



But not anymore. I have lost faith that it will ever happen, and if it ever does, it will be too little, too late. Because the C++ standard moves too slowly, C++ needs to be programmable.

The Ugly


(spp
  (let-syntax ((for-range
                (syntax-rules ()
                  ((_ type var begin end body ...)
                   (for (type var begin) (< var end) (++ var) body ...)))))

    (function int main ()
      (decl float[256] array)
      (for-range int i 0 256
        (= (array-ref array i) 0.0))
      (return 0))))


Lisp! Maybe you did not expect "the visual appeal of oatmeal mixed with nail clippings" but I warned you! Hey, at least it works - the previous "spp" code generates this:


int main ()
{
    float array[256];
    for( int i = 0; (i < 256); (++ i) )
    {
        (array[i] = 0.0);
    };
    return 0;
};


In this little example the code density is no better, but I hope you get the idea. C++ template metaprogramming is in itself the wrong solution to the code verbosity problem; Lisp macros are a better idea. SPP is my take on this. What I'm shooting for with SPP is bottom-up programming in C++.

Thursday, December 4, 2008

Syntax Schmintax

Syntax Schmintax. It's only syntax. Everybody can learn syntax, *everybody*.

When faced with a programming problem there are always multiple choices. We are supposed to sort them out according to multiple criteria. Here are a few:
  • Robustness
  • Maintainability
  • Beauty/Ugliness
  • Originality, as in DRY (don't repeat yourself)
  • Performance
Syntax Schmintax means that the beauty of a solution should be considered only when you know robustness, maintainability, originality and performance are maximal.

One example from C++: static_cast, const_cast, reinterpret_cast and friends versus C-style casts. C++-style casts provide more information to the reader and to the compiler, are easier to find and have no cost in performance or otherwise. But they are ugly. They are syntax that's possible to avoid, that's possible not to learn. Learn it! Use it! Syntax Schmintax says: learning new syntax is easy.
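
A small, made-up illustration of the trade-off:


#include <iostream>

int main()
{
    double ratio = 0.75;

    // C-style cast: terse, but depending on context it can silently mean
    // static_cast, const_cast or even reinterpret_cast.
    int percent_c = (int)(ratio * 100);

    // C++-style cast: uglier and more verbose, but it states the intent,
    // the compiler checks it, and "static_cast" is easy to grep for.
    int percent_cpp = static_cast<int>(ratio * 100);

    const int answer = 42;
    // const_cast documents (and limits itself to) removing constness.
    int* writable = const_cast<int*>( &answer );   // legal cast; writing
                                                   // through it would be
                                                   // undefined behaviour
    (void)writable;

    std::cout << percent_c << " " << percent_cpp << std::endl;
    return 0;
}
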

It's easy to learn a new syntax that does pretty much the same thing as some other syntax you know. There is no new semantics to learn. What you think about and how you think about it does not change. It's not like learning Scheme or Haskell (you should, by the way). Learning Scheme means you should understand what a continuation is, what a Lisp macro is. Haskell is quite simply alien.

Syntax is vocabulary. Syntax is good for you. Learn what's out there. Invent beautiful syntax if you can, but ugly syntax is better than nothing.

Tuesday, December 2, 2008

How I optimize Python code

I Compyle it :-)

I wrote a library that takes a Python string and returns a proxy object that can be used like a Python module. Inside the proxy module, there are proxy functions. When a proxy function is called, it looks at the types of its arguments and generates a dll specific to those types.

So you have something like this: Python -> C++ -> dll -> really fast code.

How fast? Really fast. Thousands of times faster. Even faster than the C++ code I would normally write. The generated code runs really fast because it's threaded, because it is pretty conservative (C++ compilers like conservative code) and because some image processing functions are recognized as intrinsics. The just-in-time compiler does not support the whole Python language, but it supports enough of it to optimize pretty complex algorithms. I use it mainly to optimize scientific applications, but if you'd like to try it out on other kinds of applications (string processing? AI? rendering?), tell me; it may not be a big effort for me to add what's missing.

If you'd like to try out Compyle, send me a message or visit VLAM's website.