
Wrapping MapleSim C code for Python

Previously, I worked through a simple inverse kinematics example with a three link arm. The arm, however, was uninteresting, and in fact had no dynamics of its own. I've also previously (but not in a blog post) built up a number of more interesting arm models using MapleSim and ported them into C. From there I used them in Matlab / Simulink, but I'm on a Python kick now and tired of Matlab, so I've been looking at how to wrap the C code generated by MapleSim using Cython so that I can interact with it from Python.

So the code that simulates my models is exported to C, which is great in that it's super fast, but it's bad because it's a pain in the ass to work with. I don't want to have to write my controllers and learning algorithms all in C; it would be much nicer to do it all in Python. So I was looking at interfacing Python with the C library, and did a little (as seen in my last post), but I already know how to make Cython play nice with C++ so I figured let's shoot for that instead. What follows is a hacky guide to getting your MapleSim model simulations to work with Python. First we'll get the C code to just run, then we'll port it to Python, then we'll get it going graphically.

Get the MapleSim code to run in C++ with classes and input

Step 1: Rename your file from 'whatever.c' to 'whatever.cpp'; in the code here I'm renaming 'c2LinkArm.c' to 'c2LinkArm.cpp'. I know, I know, it's painful, but hopefully that's the worst of it.

Step 2: Add #include "mplshlib.h" to the beginning of the file. The code is meant to be run in a MapleSim environment (or something), so it requires one of the Maple library files. We’ll just add this in here and everything should be happy.

Step 3: For whatever reason, there is a function in here that sets all of the system input to 0. Why, you ask? Good question. The function in question is static void inpfn(double T, double *U), and you can see that all it does is set every element of U = 0. Now, you can either comment out the body of this function, or, several lines down, the first line of the SolverUpdate function (after a long i variable declaration) is the call to this function, so you can comment that out and then everything’s fine. Hurray! We can give input to the system now!
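For reference, here's roughly what inpfn looks like once it's neutered; the exact body of the generated function depends on your model, so treat this as a sketch rather than the literal MapleSim output:

/* Sketch only: the generated inpfn just zeroes out the input vector U,
   which we don't want, so the body gets commented out. */
static void inpfn(double T, double *U)
{
    /*
    U[0] = 0.0;
    U[1] = 0.0;
    */
}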

And that's all the inline code editing we have to do. We're not done with this file though; we still need to append some things so that we can use it easily.

Step 4: The reason I want to use C++ is that with C++ we can make a class, which we'll call the Sim class, that can store all of our simulation details and hold some wrapper functions to hide the full blown interface to the simulator functions. Let's make that class now. Go down to the bottom of the file and paste in the following code:
c2LinkArm.cpp

/*  A class to contain all the information that needs to
    be passed around between these functions, and can
    encapsulate it and hide it from the Python interface.

    Written by Travis DeWolf (May, 2013)
*/
class Sim {
    /* Very simple class, just stores the variables we
    need for simulation, and has 2 functions. Reset
    resets the state of the simulation, and step steps it
    forward. Tautology ftw!*/

    double* params;
    double dt, t0;
    double u0[NINP], other_out[NOUT+1], y[NOUT];
    double w[1 + 2 * NEQ + NPAR + NDFA + NEVT];

    SolverStruct S;

    public:
        Sim(double dt_val, double* params_pointer);
        void reset(double* out, double* ic);
        void step(double* out, double* u);
};

Sim::Sim(double dt_val, double* params_pointer)
{
    t0 = 0.0; // set up start time
    dt = dt_val; // set time step
    for (int i = 0; i < NINP; i++)
        u0[i] = 0.0; // initial control signal

    params = params_pointer; // set up parameters reference

    /* Setup */
    S.w = w;
    S.err = 0;
}

void Sim::reset(double* out, double* ic)
{
    SolverSetup(t0, ic, u0, params, y, dt, &S);

    /* Output */
    out[0] = t0;
    for (int j = 0; j < NOUT; j++)
        out[j + 1] = y[j];
}

void Sim::step(double* out, double* u)
/* u: control signal */
{
    for (int k = 0; k < NOUT; k++)
        out[k] = *dsn_undef; // clear values to nan

    /* Integration loop */
    /* Take a step with states */
    EulerStep(u, &S);

    if (S.err <= 0)
    {
        /* Output */
        SolverOutputs(y, &S);

        out[0] = S.w[0];
        for(long j = 0; j < NOUT; j++)
            out[j + 1] = y[j];
    }
}

int main (void)
{
    FILE *fd;

    double out[NOUT + 1]; // holds [time, output values]
    long i, j;

    double dt = 0.00001;

    int time_steps = 1000000;
    double u[NINP];
    for (int k = 0; k < NINP; k++) u[k] = .1;

    fd = fopen("output.dat", "w");

    Sim sim = Sim(dt, NULL);
    sim.reset(out, NULL); // ic = NULL, use default start state

    for(i=0;i<time_steps;i++)
    {
        sim.step(out, u);
        fprintf(fd,"%lf ",out[0]);
        for(j=0;j<NOUT;j++)
        {
            fprintf(fd,"%lf ",out[j+1]);
        }
        fprintf(fd, "\n");
    }

    fclose(fd);

    return 0;
}

So, this is based on the ParamDriverC() function from the MapleSim generated code. I just separated out the initialization and the simulation into two functions (reset() and step(), respectively), so that it's possible to explicitly control the simulation stepping through time, and change the input control signal at every time step.

NOTE: The dt here must be <= 1e-5, or the constraint solver in the code throws an error. Don’t fear though, this simulator is still crazy fast.

OK. This code then can be compiled (once you have the library files in the same folder) and run with:

g++ c2LinkArm.cpp -o sim
./sim

You may have noticed a bunch of warning messages scrolling by; I'm not about to touch those, and they don't affect the simulation, so as long as nothing saying 'error' pops up let's just leave well enough alone.

Once you've run the executable you should now have a file called 'output.dat' filled with simulation data. The first column is time, and the rest are the output variables from your MapleSim model. Alright, we've got our code up and running! Let's go on and make a Cython wrapper for it.

Making a Cython wrapper for the MapleSim code
The first thing that we need to do is take the main function we just added out of the simulation code above. We're done with running the C code on its own, so we don't need it anymore. Get rid of it already.

Now, this is going to be very similar to the code from the previous Cython posts, but in this one there’s no getting around it: We need to pass back and forth arrays. It turns out this isn’t very complicated, but there’s a template you have to follow and it can be difficult if you don’t know what that is.

I’ll post the code and then we can go through the tricky parts. As always, we create a ‘pyx’ file for our Cython wrapper code, I called this one ‘py2LinkArm.pyx’.
py2LinkArm.pyx

import numpy as np
cimport numpy as np

cdef extern from "c2LinkArm.cpp": 
    cdef cppclass Sim:
        Sim(double dt, double* params)
        void reset(double* out, double* ic)
        void step(double* out, double* u)

cdef class pySim:
    cdef Sim* thisptr
    def __cinit__(self, double dt = .00001, 
                        np.ndarray[double, mode="c"] params=None):
        """
        param float dt: simulation timestep, must be <= 1e-5
        param array params: MapleSim model internal parameters
        """
        if params is not None: self.thisptr = new Sim(dt, &params[0])
        else: self.thisptr = new Sim(dt, NULL)

    def __dealloc__(self):
        del self.thisptr

    def reset(self, np.ndarray[double, mode="c"] out, 
                    np.ndarray[double, mode="c"] ic=None):
        """
        Reset the state of the simulation.
        param np.ndarray out: where to store the system output
            NOTE: output is of form [time, output]
        param np.ndarray ic: the initial conditions of the system
        """
        if ic is not None: self.thisptr.reset(&out[0], &ic[0])
        else: self.thisptr.reset(&out[0], NULL)

    def step(self, np.ndarray[double, mode="c"] out, 
                   np.ndarray[double, mode="c"] u):
        """
        Step the simulation forward one timestep.
        param np.ndarray out: where to store the system output
            NOTE: output is of form [time, output]
        param np.ndarray u: the control signal
        """
        self.thisptr.step(&out[0], &u[0])

This is pretty familiar: We make a reference to a class defined in C++, and then we wrap it in a python class and define functions that we can actually call from Python.

The main difference here is that we have to handle passing in numpy arrays, and having them be in the appropriate double* form that our C++ code is expecting to see. There are three parts involved in this. First, note at the top that we have an import numpy as np followed by a cimport numpy as np; this gives us access to the functions that we'll need. Second, each of the arrays accepted as a parameter gets a special type declaration: np.ndarray[double, mode="c"] var. And finally, we pass the arrays on by dereferencing with &var[0]. If we were passing along 2D arrays we would use &var[0, 0].

In the __cinit__() and reset() methods, I'm allowing the model parameters, params, and initial conditions, ic, arrays to be undefined, in which case NULL is passed to our C code and the default values we defined inside MapleSim are used. The reason for this is that these require some pretty intimate knowledge of the MapleSim model, and if casual users just want to use this business they shouldn't have to know anything about the internal states.

With our wrapper code done, now the only thing left is our setup file! This one is just straight from the previous posts, with the exception that only the 'pyx' file is included in the sources declaration, because our 'py2LinkArm.pyx' file references the 'c2LinkArm.cpp' file directly instead of referencing a header file.
setup.py

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
 
setup(
  cmdclass = {'build_ext': build_ext},
  ext_modules=[Extension("py2LinkArm", 
               sources=["py2LinkArm.pyx"],
               language="c++"),],
)

With our setup file we should now be good to go. Let’s compile! To do this I pull open IPython and punch in

run setup.py build_ext -i

And it will complain and throw some warnings, but it should compile just fine. And now we have access to the pySim class!
To see it do some things, type in

import numpy as np
import py2LinkArm

sim = py2LinkArm.pySim()

u = np.ones(2) # input array
out = np.zeros(3) # output array

sim.reset(out)
print out # initial values

for i in range(10):
    sim.step(out, u)
    print out

And we can see it moving forward! If you set u = np.ones(2) * .1, which is what it was when we generated the 'output.dat' file using the C code alone, and run it forward, you'll see that we get the same results for both. Excellent. OK, now let's display it!

Running and plotting the MapleSim arm model
In a previous post I used Pyglet to plot out a 3 link arm (a very poor, lines only plot, but sufficient), and I’m going to use it again here (for a very poor, lines only plot). The file is called ‘run.py’, and is less than 80 lines. I’ll post it and then walk through it below:
run.py

import numpy as np
import pyglet
import time

import py2LinkArm

def run(): 
    """Runs and plots a 2 link arm simulation generated by MapleSim."""

    # set up input control signal to simulation
    u = np.ones(2) * .1 
    # set up array to receive output from simulation
    out = np.zeros(3) 

    # create MapleSim sim class 
    sim = py2LinkArm.pySim() 
    # set up initial conditions of the system
    sim.reset(out) 

    ###### Plotting things

    # segment lengths = (.31, .27)m (defined in MapleSim)
    L = np.array([31, 27]) * 3

    # make our window for drawin'
    window = pyglet.window.Window()

    # set up some labels to display time 
    label_time = pyglet.text.Label('time', font_name='Times New Roman', 
        font_size=36, x=window.width//2, y=window.height - 50,
        anchor_x='center', anchor_y='center')
    # and joint angles
    label_values = pyglet.text.Label('values', font_name='Times New Roman', 
        font_size=36, x=window.width//2, y=window.height - 100,
        anchor_x='center', anchor_y='center')

    def update_arm(dt):
        """ Simulate the arm ahead a chunk of time, then find the 
        (x,y) coordinates of each joint, and update labels."""
   
        # simulate arm forward 15ms
        for i in range(1500):
            sim.step(out, u)

        # find new joint (x,y) coordinates, offset to center of screen-ish
        x = np.array([ 0, 
            L[0]*np.cos(out[1]),
            L[0]*np.cos(out[1]) + L[1]*np.cos(out[1]+out[2])]) + window.width/2
        y = np.array([ 0, 
            L[0]*np.sin(out[1]),
            L[0]*np.sin(out[1]) + L[1]*np.sin(out[1]+out[2])]) + 100

        # update line endpoint positions for drawing
        window.jps = np.array([x, y]).astype('int')
        label_time.text = 'time: %.3f'%out[0]
        label_values.text = '(01, 02) = (%.2f, %.2f)'%(out[1], out[2])
 
    update_arm(0) # call once to set up

    @window.event
    def on_draw():
        window.clear()
        label_time.draw()
        label_values.draw()
        for i in range(2): 
            pyglet.graphics.draw(2, pyglet.gl.GL_LINES, ('v2i', 
                (window.jps[0][i], window.jps[1][i], 
                 window.jps[0][i+1], window.jps[1][i+1])))

    # set up update_arm function to be called super often
    pyglet.clock.schedule_interval(update_arm, 0.00001)
    pyglet.app.run()

run()

Alright. In the first part, we’re just doing the same as above when we imported our brand new module and set it up for simulation.
What's different is that we've now pushed the actual simulation into a method called update_arm. The idea is that the drawing is significantly slower than the simulation, so we'll simulate a bunch, then update the plot once, then repeat, and hopefully get something that simulates at a decent speed. Once we have the joint angles of the arm stored safely in our out variable, we calculate the (x,y) locations of the elbow and wrist and store them as the line endpoints for drawing. Then the last thing this method does is update the text of the labels in the window.

In the on_draw method we've overridden, we clear the window, then draw our labels and the lines corresponding to the upper and lower arm segments.

Then the last thing is to schedule our update_arm method to be called periodically, in this case extremely often.

When you run this file, you should see something that looks like:
[Screenshot: pyglet window showing the two arm segments drawn as lines, with the time and joint angles displayed above.]
You’ll have to trust me it looks better when it’s moving, and when it’s moving it will look better when the input isn’t constant.

But there you go! Now you are able to take your very own MapleSim model’s optimized C code, append some C++ code to make a Sim class, wrap it in Cython, and access it as a Python module! Why would anyone ever want to go through this trouble? Well, finding good simulations with appropriate dynamics can be a huge pain in the ass, and sometimes you just want something stand-alone. Also, now that this is done, this exact same code should be good for any MapleSim model, so hopefully the process is relatively straightforward. At least that’s my dream.

The code that I’ve used in this post can be found at my github: 2LinkArm.

Addendum

This was all written for compiling on Ubuntu. My labmate and pal Xuan Choo also went through getting this all to work on Windows; here are his notes!

Step 1: Get and install cython

Step 2: Follow the instructions here. Essentially:

  • 2a: Download and install the windows SDK for your python version
  • 2b: Start the windows SDK command shell (start menu->programs->windows sdk->CMD shell)
  • 2c: Type these things into the cmd shell:
    set DISTUTILS_USE_SDK=1
    setenv /x64 /release
    
  • 2d: in the same cmd shell, navigate to the folder with the pyx file you want to compile, and run:
    python setup.py build_ext -i
    
  • 2d*: If it complains about undefined numpy references, add “include_dirs=[numpy.get_include()]” to the Extension class call (also import numpy in setup.py).
    from distutils.core import setup
    from distutils.extension import Extension
    from Cython.Distutils import build_ext
    import numpy
    
    setup(
      cmdclass = {'build_ext': build_ext},
      ext_modules=[Extension("py3LinkArm",
                   sources=["py3LinkArm.pyx"],
                   language="c++",
                   include_dirs=[numpy.get_include()]),],
    )
    

Done!


Cython Journey Part 3: Wrapping C code and passing arrays

Basic C interfacing example

OK, so this code is taken straight from an example here, I’m just going to explicitly work through all the steps so that when I try to do this again later I don’t have to look everything up again.

Let's say we have some C code that we want to work with from Python, in a file called 'square.c':

void square(double* array, int n) {
    int i;
    for (i=0; i<n; ++i) {
        array[i] = array[i] * array[i];
    }
}

The first thing we’re going to need to do is compile this to a shared object file, so we punch the following into our terminal:

gcc -c -fPIC square.c # compile the code to a position-independent object file
gcc square.o -shared -o square.so # compile to shared object file

And so now we have our square.so file. Super.

Here is the code that we can then use to interact with this shared object file from Python:

import numpy as np
import ctypes

square = ctypes.cdll.LoadLibrary("./square.so")

n = 5
a = np.arange(n, dtype=np.double)

aptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
square.square(aptr, n)

print a

OK, so there are a couple of lines with a lot of jargon, but aside from that this looks pretty straightforward. The first noticeable thing is that we're importing the ctypes library, which ships with Python's standard library. Right after importing ctypes, we create a handle into our shared object file using the ctypes.cdll.LoadLibrary() command.

Then it's some standard Python code to create an array, and then there's this funky command that lets us get a handle to the array we just created as a C pointer that we can pass in to our square() function. You can see a list of all the different kinds of data types here.

Once we have that handle to our array, we simply pass it through to the square.square() function, and let it go and do whatever it’s going to do with it. Then we print out our array a, and to our great joy, all of the values have been squared.

So that's that. Rather than returning anything like a double* from the C library, I would recommend just passing in the array as a parameter and modifying it in place. This is all nice and straightforward enough, and the returning business can lead to a lot of memory allocation and ownership headaches.

The code for this example can be found at my github, here: Square.


Using the same random number generator in C++ and Python

Anyone who has converted code from one language to another, where there is a random number generator involved, knows the pain of rigorously checking that both versions of code do the exact same thing. In the past I've written out to file from one language's random number generator (RNG) and then read in from file, but that's still a pain in the ass, as there's some overhead involved and you have to change everything back and forth if you need to do debugging / comparison in the future, etc etc. After some searching, it doesn't seem that there are any RNGs common to both languages, and the closest thing I found was a suggestion to write out the same algorithm in each of the languages and use that.

Well. I happen to know how to use Cython now, and rather than rewrite the same code in C++ and Python, I thought it was a perfect opportunity to put this knowledge to use. There's a program called SimpleRNG for C++ (http://www.codeproject.com/Articles/25172/Simple-Random-Number-Generation), with a whole slew of basic random number generation methods, so the idea was just to take that and throw some Cython at it to get access to the same RNG in both C++ and Python. It turned out to be almost a trivial exercise, but with very useful results!

Since we already have the SimpleRNG.h and SimpleRNG.cpp code all written, taken from the link above, we can start by making a .pyx file that will 1) provide a Cython handle to the C++ code, and 2) define a Python class that can call these functions. Remembering not to name the .pyx file the same thing as the .cpp files, for fear of the code generated by Cython overwriting my .cpp files, I added a "py" prefix. The first part of the .pyx file is just redeclaring all the functionality we want from the header file:
pySimpleRNG.pyx

cdef extern from "SimpleRNG.h":
    cdef cppclass SimpleRNG:
        SimpleRNG()

        # Seed the random number generator 
        void SetState(unsigned int u, unsigned int v)

        # A uniform random sample from the open interval (0, 1) 
        double GetUniform()

        # A uniform random sample from the set of unsigned integers 
        unsigned int GetUint()

        # Normal (Gaussian) random sample 
        double GetNormal(double mean, double standardDeviation)

        # Exponential random sample 
        double GetExponential(double mean)

        # Gamma random sample
        double GetGamma(double shape, double scale)

        # Chi-square sample
        double GetChiSquare(double degreesOfFreedom)

        # Inverse-gamma sample
        double GetInverseGamma(double shape, double scale)

        # Weibull sample
        double GetWeibull(double shape, double scale)

        # Cauchy sample
        double GetCauchy(double median, double scale)

        # Student-t sample
        double GetStudentT(double degreesOfFreedom)

        # The Laplace distribution is also known as the double exponential distribution.
        double GetLaplace(double mean, double scale)

        # Log-normal sample
        double GetLogNormal(double mu, double sigma)

        # Beta sample
        double GetBeta(double a, double b)

        # Poisson sample
        int GetPoisson(double lam)

Look at all those functions! I left out two functions from the redefinition, namely double GetUniform(unsigned int& u, unsigned int& v) and unsigned int GetUint(unsigned int& u, unsigned int& v), for the simple reason that I don’t want to deal with reference operators in Cython, and I don’t have any need for the functionality provided with the overloaded GetUniform() and GetUint() functions.

Alright, the first part is done. Second part! Straightforward again: define a Python class, create a pointer to the cppclass we just defined, and then make a bunch of handle functions to call it. That looks like:
pySimpleRNG.pyx

cdef class pySimpleRNG:
    cdef SimpleRNG* thisptr # hold a C++ instance
    def __cinit__(self):
        self.thisptr = new SimpleRNG()
    def __dealloc__(self):
        del self.thisptr
    
    # Seed the random number generator 
    def SetState(self, unsigned int u, unsigned int v):
        self.thisptr.SetState(u, v)

    # A uniform random sample from the open interval (0, 1) 
    def GetUniform(self): 
        return self.thisptr.GetUniform()

    # A uniform random sample from the set of unsigned integers 
    def GetUint(self):
        return self.thisptr.GetUint()

    # Normal (Gaussian) random sample 
    def GetNormal(self, double mean=0, double std_dev=1):
        return self.thisptr.GetNormal(mean, std_dev)

    # Exponential random sample 
    def GetExponential(self, double mean):
        return self.thisptr.GetExponential(mean)

    # Gamma random sample
    def GetGamma(self, double shape, double scale):
        return self.thisptr.GetGamma(shape, scale)

    # Chi-square sample
    def GetChiSquare(self, double degreesOfFreedom):
        return self.thisptr.GetChiSquare(degreesOfFreedom)

    # Inverse-gamma sample
    def GetInverseGamma(self, double shape, double scale):
        return self.thisptr.GetInverseGamma(shape, scale)

    # Weibull sample
    def GetWeibull(self, double shape, double scale):
        return self.thisptr.GetWeibull(shape, scale)

    # Cauchy sample
    def GetCauchy(self, double median, double scale):
        return self.thisptr.GetCauchy(median, scale)

    # Student-t sample
    def GetStudentT(self, double degreesOfFreedom):
        return self.thisptr.GetStudentT(degreesOfFreedom)

    # The Laplace distribution is also known as the double exponential distribution.
    def GetLaplace(self, double mean, double scale):
        return self.thisptr.GetLaplace(mean, scale)

    # Log-normal sample
    def GetLogNormal(self, double mu, double sigma):
        return self.thisptr.GetLogNormal(mu, sigma)

    # Beta sample
    def GetBeta(self, double a, double b):
        return self.thisptr.GetBeta(a, b)

Again, very simple. The only thing I’ve added in this code is default values for the GetNormal() method, specifying a mean of 0 and standard deviation of 1, since I’ll be using it a lot and it’s nice to have default values.

Now the only thing left is the setup file:
setup.py

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
  name = 'SimpleRNG',
  ext_modules=[ 
    Extension("SimpleRNG", 
              sources=["pySimpleRNG.pyx", "SimpleRNG.cpp"], # Note, you can link against a c++ library instead of including the source
              language="c++"),
    ],
  cmdclass = {'build_ext': build_ext},

)

And now calling it from IPython

run setup.py build_ext -i

And now, pleasantly, you can call the SetState(int, int) function and generate the same set of random numbers in both C++ and Python, which is absolutely wonderful for comparison / debugging. It's been super useful for me; I hope someone else also finds it helpful!

All the code for this post can be found at my github repository: pySimpleRNG. If you're simply looking for what you need to use SimpleRNG from Python, then all you need is the compiled SimpleRNG.so file that the build produces! Drop it into your Python project folder and import away.

Update: It was pointed out that having a Python script example would be useful, which is very true! Here’s a quick script using this shared library.

from SimpleRNG import pySimpleRNG

a = pySimpleRNG()
a.SetState(1,1) # set the seed
a.GetUniform() # remember this number
a.GetUniform()
a.SetState(1,1) # reset seed 
a.GetUniform() # returns the same number as above
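For the comparison on the C++ side, here's a minimal sketch of the kind of check I mean (not from the repository, just an illustration): compile it together with SimpleRNG.cpp, and the printed numbers should line up with what the Python script above gives you.

#include <iostream>
#include "SimpleRNG.h"

int main() {
    SimpleRNG rng;
    rng.SetState(1, 1); // same seed as the Python example above
    std::cout << rng.GetUniform() << std::endl; // matches the first Python GetUniform()
    std::cout << rng.GetUniform() << std::endl; // and the second
    return 0;
}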

C++ debugging with GDB and Valgrind quickstart

As I’ve mentioned before, I’m coming into C++ from Python and Matlab. I’ve programmed C++ before, but it’s been a while, and handling the memory management myself and tracking down seg faults again is taking some getting used to. Here are two things that really helped me out.

Program – GDB
As the scale of the program I’m working on in C++ keeps increasing, debugging it becomes more and more of a pain in the ass to do using the old “add print statements” approach to pinpointing the source of the problem. After one seg fault too many I broke down and started looking at how to use GDB. Turns out, it’s very straightforward to get some basic (and very useful) functionality out of it.

Here are the steps:

  1. Add -g to your compile command, ie g++ -g test.cpp -o test
  2. Type in gdb test to your console
  3. Type run into the gdb console, then when your program breaks, type in backtrace or bt to get the line that your program broke down on, and the trace of function calls leading up to it!

There’s a bunch of other functionality built in to gdb, but I was just looking for a very simple, low overhead introduction to it and this was all I needed to get going. Having that line number can save a ton of time.
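If you want something to practice on before pointing gdb at your real code, a toy program like this (just an illustration, not anything from my project) segfaults on purpose; compile it with -g as in step 1, run it inside gdb, and backtrace should point you straight at the offending line.

#include <iostream>

// deliberately dereference a null pointer so there's a seg fault for gdb to catch
void oops() {
    int* p = NULL;
    std::cout << *p << std::endl; // the program breaks down on this line
}

int main() {
    oops();
    return 0;
}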

Program – Valgrind
Once my program got up and going though, I noticed that for longer simulation runs I got core memory dump errors. After a very short search I came across a program called Valgrind that profiles C++ code to find memory leaks. Turns out I had quite a few.
This program is also extremely easy to use. Installing it on Linux is, of course, just sudo apt-get install valgrind. To use Valgrind to check for memory leaks in your program and get information about where they originated, type the following from the console:

valgrind --leak-check=full --show-reachable=yes --track-origins=yes ./test

If all is going well, then you should see something like this pop up at the end of the printout:

==9422== HEAP SUMMARY:
==9422==     in use at exit: 0 bytes in 0 blocks
==9422==   total heap usage: 31,127 allocs, 31,127 frees, 858,710 bytes allocated
==9422==
==9422== All heap blocks were freed -- no leaks are possible

It’s beautiful.
Valgrind tracks your allocations and freeing of memory, so if it finds that there are unmatched allocations, or you do something fairly tricky with your handling of memory, you’ll get a printout that lets you know there’s unfreed memory and will give you a trace of function calls leading to the memory that isn’t freed. It’s a great program, for more information they have a quickstart guide here: http://valgrind.org/docs/manual/quick-start.html#quick-start.intro.
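If you want to see the unhappy case, here's a deliberately leaky toy program (again just an illustration, not from my project) that you can run through the command above; Valgrind's leak summary should flag the allocation as definitely lost and point at the line where it was made.

int main() {
    // allocate a block and never free it
    double* data = new double[100];
    data[0] = 1.0;
    return 0;
}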

As I said before, I’m getting used to memory management in C++ again; here are a couple of things that really perplexed me while I was trying to track down the memory leaks:

Perplexment 1 – Copy constructor on dereferenced pointers
I had been passing pointers into class constructors, and then assigning them to local variables on the stack (so I thought). The idea was that when the class was destroyed, the stack would have taken control of the memory and would free it automatically. It looked like this:

class classA {
public:
    int a;
    VectorXf b;

    classA(int a, VectorXf* b); // constructor
};

classA::classA(int a, VectorXf *b) {
    this->a = a;

    if (b == NULL)
        b = new(VectorXf)(VectorXf::Ones(a));
    this->b = *b;
}

WRONG. Apparently this won't even work at all most of the time, but the Eigen library has an amazing set of copy constructors that let this slide. What's happening when you dereference this pointer is that a copy of the VectorXf that b points to is created. Then the instance you created to pass into this constructor is left unfreed, floating around in the ether. Terrible. The fix is, of course, to make the member b a pointer to a vector instead, i.e.

class classA {
public:
    int a;
    VectorXf* b;

    classA(int a, VectorXf* b); // constructor
};

classA::classA(int a, VectorXf *b) {
    this->a = a;

    if (b == NULL)
        b = new(VectorXf)(VectorXf::Ones(a));
    this->b = b;
}

Now this also holds when you return pointers! If you have some function

VectorXf* classA::func1();

that returns a pointer to you, you'd better be damned sure you store that result in a pointer. None of this:

VectorXf x = *func1();

or any of its like or ilk, that is bad news. Also this:

VectorXf x = func1()->reverse();

no no no no no no. Memory leak. Same as dereferencing, leaves you with VectorXfs floating around in the wild.
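In case it isn't obvious, the right version is just to hang on to the pointer itself and free it when you're done; a sketch (with the delete placed wherever is appropriate in your code):

VectorXf* x = func1(); // store the returned pointer, don't dereference it into a copy
// ... use (*x) as needed ...
delete x; // and free it when you're done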

And now you know. And knowledge is power.

Perplexment 2 – Putting virtual in front of the base class destructor
The other thing that caught me up was that if you have a base class and inheriting classes set up, the destructor of the base class must be declared virtual, i.e.

virtual ~baseClass();

If you don't have that virtual keyword in front of your base class destructor, then when you delete an inheriting class object through a base class pointer, none of the inheriting class destructors get called! This was another one that tripped me up.
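Here's a minimal sketch of the situation; baseClass is from above, while childClass and its member are just made up for illustration:

class baseClass {
public:
    virtual ~baseClass() {} // without 'virtual' here, deleting through a
                            // baseClass* never runs ~childClass()
};

class childClass : public baseClass {
    double* data;
public:
    childClass() { data = new double[10]; }
    ~childClass() { delete[] data; }
};

int main() {
    baseClass* obj = new childClass();
    delete obj; // calls ~childClass() and then ~baseClass(), but only
                // because the base class destructor is virtual
    return 0;
}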

Perplexment 3 – delete a, b, …
This does not work! It compiles fine, because the comma here is just the comma operator: delete a, b; is parsed as (delete a), b;, so the latter variables never actually get deleted. Perplexing!
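A quick sketch of what this looks like, along with the boring-but-correct fix:

double* a = new double;
double* b = new double;

// delete a, b;  // compiles, but only a is freed; b leaks

delete a; // each pointer needs its own delete statement
delete b;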


Getting function run time in C++ on Linux

I wanted to time how long it took for one of my functions to run yesterday and it ended up taking me a little bit of time to track down how to do this in C++ on a Linux box, so I thought I’d throw up the code here.

#include <sys/time.h>
#include <iostream>
using namespace std; 

int main() { 

    struct timeval start_time; 
    struct timeval end_time; 
    
    gettimeofday(&start_time, NULL);
    aFunction(someParams); // function to be timed
    gettimeofday(&end_time, NULL); 

    // get difference, multiply by 1E-6 to convert to seconds
    float duration = (end_time.tv_sec - start_time.tv_sec) + (end_time.tv_usec - start_time.tv_usec) * 1E-6;
    
    cout << "duration: " << duration << "s" << endl;
}

And that’s it! Not too much to it, but was a little difficult to track down.


N-dimensional Matrices in C++

I’m working on coding up an algorithm in C++, based on a Matlab implementation I wrote, and I’ve been using the Eigen library (http://eigen.tuxfamily.org/) for my linear algebra. It’s been relatively straightforward to use so far, but unfortunately it doesn’t (as far as I can tell) have any built in support for N-dimensional matrices. In fact, it appears that N-dimensional matrices aren’t really addressed by any of the standard C++ linear algebra libraries. So, to get my 4-dimensional matrix, I have to make an array of pointers pointing to an array of pointers pointing to my Eigen MatrixXf objects. I initialize my 4-D matrices using this function:

MatrixXf*** initialize_4dMatrix(MatrixXf*** matrix, int sizes[], float value=0.0) {
    matrix = new MatrixXf** [sizes[0]];
    for (int d1 = 0; d1 < sizes[0]; d1++) {
        matrix[d1] = new MatrixXf* [sizes[1]];
        for (int d2 = 0; d2 < sizes[1]; d2++) 
            matrix[d1][d2] = new (MatrixXf)(MatrixXf::Ones(sizes[2],sizes[3]).array() * value);
    }
    return matrix;
} 

and call it with:

int sizes[4] = {1,2,3,4};
MatrixXf*** m = NULL;
m = initialize_4dMatrix(m, sizes);

To then access the matrices, it's (*m[d1][d2])(d3,d4); which isn't bad for accessing one matrix at a time, but what if I want the jth element of the ith row of every matrix (i.e. element (i,j))? The Python access method m[:,:,i,j] would be ideal. Unfortunately, I don't think there's any way around writing another function to do it:

static VectorXf* get_elements_across_matrix_array(MatrixXf** matrix, int size, int element[]) {
    VectorXf* vector = new VectorXf(size);
    for (int i = 0; i < size; i++)
        (*vector)[i] = (*matrix[i])(element[0],element[1]);
    return vector;
}

And I would need to write another function that returns a MatrixXf pointer if I wanted to access the ith rows of each matrix.
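For what it's worth, here's a sketch of what that row-gathering function might look like, following the same pattern as the element version above (untested, just to show the shape of it):

// Collect row 'row' from each MatrixXf in a 1-D array of matrix pointers,
// stacking them into a new matrix with one row per source matrix.
static MatrixXf* get_rows_across_matrix_array(MatrixXf** matrix, int size, int row) {
    MatrixXf* result = new MatrixXf(size, matrix[0]->cols());
    for (int i = 0; i < size; i++)
        result->row(i) = matrix[i]->row(row);
    return result;
}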

To delete these matrices I also need to pass into my delete function the sizes of the array. It’s a pain that it can’t dynamically discover the array size, but I don’t think there’s any way around it in C++.

static void delete_4dMatrix(MatrixXf*** matrix, int sizes[]) {
    for (int d1 = 0; d1 < sizes[0]; d1++) {
        for (int d2 = 0; d2 < sizes[1]; d2++)
            delete matrix[d1][d2]; // free each MatrixXf
        delete[] matrix[d1]; // free each inner array of MatrixXf pointers
    }
    delete[] matrix; // and the top level array
}

Additionally, the way I have it set up now I need to have a separate function for initializing my 3D matrices, and one for any other dimensionality I might need. Less than ideal. My guess is that there's an elegant way to handle this with void pointers, but at the moment I only need 3D and 4D matrices so it's not worth the effort for me to set it up.
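For example, the 3D version ends up being nearly a copy of the 4D one with a level of pointers stripped off; a sketch (using MatrixXf::Constant, but otherwise the same idea as above):

// 3D version: a 1-D array of pointers to MatrixXf
MatrixXf** initialize_3dMatrix(MatrixXf** matrix, int sizes[], float value=0.0) {
    matrix = new MatrixXf* [sizes[0]];
    for (int d1 = 0; d1 < sizes[0]; d1++)
        matrix[d1] = new MatrixXf(MatrixXf::Constant(sizes[1], sizes[2], value));
    return matrix;
}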

So there are a few obstacles to using N-dimensional matrices with Eigen, but it’s really not a problem once you get a handle on it and how to properly reference different elements and matrices through your (N-2)-dimensional array of pointers.
