Replace Non-Alphanumeric Characters in a C++ String

I needed to replace the non-alphanumeric characters in a std::string. While the Java String class has the replace() and replaceAll() methods for doing this, the std::string class has no such utility methods. Instead you can use the std::replace_if function from the Standard Template Library. Here’s some example code:

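#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>

// Predicate for std::replace_if: true when c is not a letter or digit.
bool isNotAlphaNum(char c)
{
        // std::isalnum requires a value representable as unsigned char
        return !std::isalnum(static_cast<unsigned char>(c));
}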

int main(int argc, char* argv[])
{
        std::string s1 = "some/string/with*/nonalpha/characters+1";
        std::cout << s1 << " : ";
        std::replace_if(s1.begin(), s1.end(), isNotAlphaNum, ' ');
        std::cout << s1 << std::endl;
        return 0;
}

The third parameter of the replace_if() function is a predicate, here a pointer to a function, that returns true when a character should be replaced. The last parameter is the replacement character, in this case a space. This program produces the following:

$ ./replace.exe
some/string/with*/nonalpha/characters+1 : some string with  nonalpha characters 1
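The same replacement can also be written with a lambda instead of a named predicate. The following is only a sketch, assuming a C++11 compiler; replaceNonAlphaNum is a hypothetical helper name, not something from the standard library. It takes the string by value, so the caller's string is left untouched.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `input` in which every character
// that is not a letter or digit is replaced by `replacement`.
std::string replaceNonAlphaNum(std::string input, char replacement)
{
    // Taking the character as unsigned char keeps std::isalnum well defined
    // for characters outside the 7-bit ASCII range.
    std::replace_if(input.begin(), input.end(),
                    [](unsigned char c) { return !std::isalnum(c); },
                    replacement);
    return input;
}
```

Calling replaceNonAlphaNum(s1, ' ') on the string from the example above should give the same result as the in-place version.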

Convert C++ String to Lower Case (or Upper Case)

I don’t usually need to convert string case in C++ so when the need comes up I’ve usually forgotten how to do it and have to Google.

While the Java String class has toLowerCase() and toUpperCase(), C++ std::string does not have such a utility method. Instead, you need to use the std::transform() function. Here’s some example code:

#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>

int main(int argc, char* argv[])
{
        std::string s1 = "lowertoupper";
        std::string s2 = "UPPERTOLOWER";
        std::cout << s1 << " : ";
        std::transform(s1.begin(), s1.end(), s1.begin(), ::toupper);
        std::cout << s1 << std::endl;
        std::cout << s2 << " : ";
        std::transform(s2.begin(), s2.end(), s2.begin(), ::tolower);
        std::cout << s2 << std::endl;
        return 0;
}

Produces the following:

$ ./case.exe
lowertoupper : LOWERTOUPPER
UPPERTOLOWER : uppertolower

Note that while the Java toUpperCase() and toLowerCase() methods do not modify the original string, the std::transform function does.
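If you want the Java-like behavior of leaving the original string alone, one option is to transform into a copy. This is a minimal sketch, assuming a C++11 compiler; toUpperCopy is a hypothetical helper name I made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-cased copy and leaves `input` unchanged.
std::string toUpperCopy(const std::string& input)
{
    std::string result(input.size(), '\0');
    // Write the converted characters into `result` rather than back into `input`.
    std::transform(input.begin(), input.end(), result.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return result;
}
```

A lower-case version would look the same with std::tolower in place of std::toupper.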

Exclude Documentation for Protected Members in Doxygen Output

Doxygen is a tool for generating documentation by extracting code structure and relations, code signatures and specially formatted comments from application source code. It’s a very powerful tool that I’ve used on several personal and work related projects. One of the features it provides is customizability.

For example, it can be configured to generate separate documentation for internal or external use. Internally used documentation may include all public and private methods of classes while external documentation may only provide documentation for public APIs.

To exclude private class members, the EXTRACT_PRIVATE configuration option should be set to NO, which is also its default value in the Doxygen documentation.

If you really want to generate public documentation you likely want to exclude protected members as well. There is no EXTRACT_PROTECTED configuration option but you can work around it by using the following:

ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = YES
EXPAND_ONLY_PREDEF = YES
PREDEFINED = protected=private

This will cause the Doxygen engine to treat the protected keyword as a macro and replace it with private during processing. With EXTRACT_PRIVATE set to NO, protected members will also be excluded.
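To make the effect concrete, here is a small sketch of a header as Doxygen would see it with the settings above. Widget and its members are hypothetical names used only for illustration. The macro is applied by Doxygen's own preprocessor, so the C++ compiler still sees resize() as protected.

```cpp
// Widget.h (hypothetical class, for illustration only)
class Widget
{
public:
    /// Returns the current size. Public, so it stays in the generated docs.
    int size() const { return size_; }

protected:
    /// Resizes the widget. With PREDEFINED = protected=private, Doxygen reads
    /// this section as private and, with EXTRACT_PRIVATE = NO, leaves it out.
    void resize(int newSize) { size_ = newSize; }

private:
    /// Private, so it is excluded whenever EXTRACT_PRIVATE = NO.
    int size_ = 0;
};
```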

Source: https://osdir.com/ml/text.doxygen.general/2004-12/msg00047.html

Setting up MinGW for DirectX

Set up MinGW

This article will show you how to set up a MinGW DirectX development environment. First, you need to get the latest and greatest MinGW development environment. The easiest way to do this is to download and run the latest mingw-get-inst GUI installer. Select the directory to install MinGW, making sure that this path contains no spaces. Select the C++ and MSYS optional components. This will download and run the command line installer. You will see a console window open while it downloads and installs the selected components.

Once the installation completes you must add the MinGW bin/ directory to your path, for example C:\MinGW\bin. Start the MSYS shell by running C:\MinGW\MSYS1.0\msys.bat. In this MinGW shell, run the /postinstall/pi.sh script to establish bindings between the MinGW and MSYS installations. Just say yes. Other packages may be installed manually following the manual install instructions found here.

Set up the DirectX SDK

Download the latest DirectX SDK. Microsoft likes to keep moving things around but as of this writing it can be found at this link https://msdn.microsoft.com/en-us/directx/. Run the installer and install to a directory without spaces, for example C:\DirectXSDK.

The latest DirectX SDK is installed but it is not yet visible from within the MSYS environment. In the root directory /, create an empty /dxsdk subdirectory. This will be used as a mount point for the DX SDK. Edit the file /etc/fstab and add the line “C:/DirectXSDK /dxsdk”. The hard drive path and the mount point may be separated by any number of spaces or tab characters.

Go to /dxsdk and list the directory contents. If it is empty you probably have a typo in fstab. Also, older MinGW versions did not automatically reload the mount points when fstab was modified. You may need to exit your shell and restart it.

Build and run the test application

I borrowed the following test application from DirectX Tutorial.com. If everything was set up properly it should build and run with no modifications: dxtest.cpp

Build the DirectX test application with the following command:
g++ dxtest.cpp -o dxtest.exe -I/dxsdk/include -L/dxsdk/lib -DUNICODE -ld3d9
The options are:
  • -o dxtest.exe – specifies the output file
  • -I/dxsdk/include – specifies additional include directories to search
  • -L/dxsdk/lib – specifies additional library directories to search
  • -DUNICODE – defines the UNICODE preprocessor symbol for Unicode strings
  • -ld3d9 – links against the d3d9 library (the import library for d3d9.dll)
You can run the DirectX test application by typing:
./dxtest.exe
Note that if you try to run this program outside the MinGW environment, for example double clicking it in explorer, you may get a dialog box complaining about not being able to find libgcc_s_dw2-1.dll. This is probably because you didn’t close explorer and open it again after you added MinGW to your path. Ensure that C:\MinGW\bin is in your path. Explorer “should” pick up the change immediately but for some reason it doesn’t always. Close and reopen your explorer window. This brings up an interesting point however. Our application is dependent on a dll shipped with the MinGW environment. If we are to redistribute our executable, we must also redistribute the dll or our users will not be able to run our application. One way to fix this is to link the gcc run time statically:
g++ dxtest.cpp -o dxtest.exe -I/dxsdk/include -L/dxsdk/lib -DUNICODE -ld3d9 \
	-static-libgcc
This does however have the drawback that it has the potential to dramatically increase the size of our executable. Choose what works best for your situation. Also note, now that we are able to run our application outside the MinGW shell, a console window opens up along with our application. This is ugly and not something we really want. To get rid of this console window we need to specify that we are building a windows application using the -mwindows option:
g++ dxtest.cpp -o dxtest.exe -I/dxsdk/include -L/dxsdk/lib -DUNICODE -ld3d9 \
	-static-libgcc -mwindows
That’s it! Have fun with DirectX and MinGW! These are the exact steps I’ve taken and they worked for me today. Tomorrow a new version of something might break things. Let me know what issues you uncover and I’ll try to keep this up to date.

Factory Design Pattern in C++

A question was asked on stackoverflow.com about how to dynamically instantiate classes given a vector of strings of the class names to instantiate. Take for example:

class Base {};
class Child1 : public Base {};
class Child2 : public Base {};
class Child3 : public Child2 {};
int main (int argc, char* argv [])
{
   std::vector<std::string> names = get_names(argv);
   Base* p;
   for (std::vector<std::string>::iterator i = names.begin(); i != names.end(); i++) {
      if (*i == "Child1")
         p = new Child1;
      if (*i == "Child2")
         p = new Child2;
      if (*i == "Child3")
         p = new Child3;
      // do something with p
  }
}

So, this works. Except each time you add another class, you have to remember to update this code and rebuild the entire application.

The factory design pattern, or factory method pattern (or dynamic constructor), is a mechanism for creating objects without knowing exactly what object needs to be created or how to actually create the object. A class factory provides an interface in which subclasses can implement the necessary functionality to create specific objects. A class factory is an object for creating other objects. As classes are added to the application they register their creation routines with the class factory, which can then instantiate them upon request. A class factory could simplify the above example to something like this:

class Base {};
class Child1 : public Base {};
class Child2 : public Base {};
class Child3 : public Child2 {};
int main (int argc, char* argv [])
{
   std::vector<std::string> names = get_names(argv);
   Base* p;
   for (std::vector<std::string>::iterator i = names.begin(); i != names.end(); i++) {
      p = factory.create (*i);
      // do something with p
  }
}

For now, this glosses over the registration details which we’ll discuss shortly. Let’s start with some classes:

class Base
{
public:
   virtual void foo() = 0;
};
class Child1 : public Base
{
public:
   virtual void foo();
};
class Child2 : public Base
{
public:
   virtual void foo();
};
class Child3 : public Child2
{
public:
   virtual void foo();
};
void Child1::foo()
{
   std::cout << "Child1\n";
}
void Child2::foo()
{
   std::cout << "Child2\n";
}
void Child3::foo()
{
   std::cout << "Child3\n";
}

Next, we need some classes to instantiate our classes derived from Base.

class Creator
{
public:
   virtual Base* create() = 0;
};

Every time a new class is defined it must also be accompanied by its own specialized Creator derived class that is responsible for instantiating instances of the new class.

class Child1Creator : public Creator
{
public:
   virtual Base* create() { return new Child1; }
};
class Child2Creator : public Creator
{
public:
   virtual Base* create() { return new Child2; }
};
class Child3Creator : public Creator
{
public:
   virtual Base* create() { return new Child3; }
};

Well, how does this help us? Well, it doesn’t really. It creates a butt load more work. However, given that these creator classes differ only in type, this is screaming loudly for templates.

template <class T>
class CreatorImpl : public Creator
{
public:
   virtual Base* create() { return new T; }
};

With this template, the developer only needs to instantiate an instance of this class with each new Base derived class. So, these creators are like little machines that stamp out new objects. Now we need a class factory to store our machines.

class Factory
{
public:
   Base* create(const std::string& classname);
   void registerit(const std::string& classname, Creator* creator);
private:
   std::map<std::string, Creator*> table;
};

The class factory registers the creators in a lookup table. The creators must be registered with the class factory before the factory can create instances of a class. The registerit() method maps an instance of a creator to a C++ class name:

void Factory::registerit(const std::string& classname, Creator* creator)
{
   table[classname] = creator;
}

The create() method looks up a specific creator based on the given classname to construct the required object. If there is no creator for the class then NULL is returned.

Base* Factory::create(const std::string& classname)
{
   std::map<std::string, Creator*>::iterator i;
   i = table.find(classname);
   if (i != table.end())
      return i->second->create();
   else
      return (Base*)NULL;
}

So now we can simplify our main() example:

int main(int argc, char* argv[])
{
   Factory factory;
   CreatorImpl<Child1> creator1;
   CreatorImpl<Child2> creator2;
   CreatorImpl<Child3> creator3;
   factory.registerit("Child1", &creator1);
   factory.registerit("Child2", &creator2);
   factory.registerit("Child3", &creator3);
   std::vector<std::string> names = get_names(argv);
   Base* p;
   for (std::vector<std::string>::iterator i = names.begin(); i != names.end(); i++) {
      p = factory.create(*i);
      if (p != NULL)
         p->foo();
      else
         std::cout << "Class not found!\n";
  }
   return 0;
}
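The listings call get_names(argv) without ever defining it. As an assumption on my part, a minimal stand-in could treat every command-line argument after the program name as a class name, which matches running the program as "main Child1". This is only a sketch of one possible helper, not the original implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for get_names(): collects argv[1], argv[2], ...
// It relies on argv being terminated by a null pointer, as it is in main().
std::vector<std::string> get_names(char* argv[])
{
    std::vector<std::string> names;
    if (argv == NULL || argv[0] == NULL)
        return names; // no program name, so no arguments either
    for (char** arg = argv + 1; *arg != NULL; ++arg)
        names.push_back(*arg);
    return names;
}
```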

Well, the developer still needs to instantiate and register another creator and rebuild the application every time a new class is added. Now we’ll try to get the creators to register themselves with the class factory so the developer does not need to worry about it. We’ll modify the creator’s constructor to do this. The class factory however now has to be global so it can be seen from within the creator’s constructor. Here is the updated code:

class Creator
{
public:
   Creator(const std::string& classname);
   virtual Base* create() = 0;
};
Factory factory; // factory is global, not in main()
// have the creator's constructor do the registration
Creator::Creator(const std::string& classname)
{
   factory.registerit(classname, this);
}
template <class T>
class CreatorImpl : public Creator
{
public:
    CreatorImpl(const std::string& classname) : Creator(classname) {}
    virtual Base* create() { return new T; }
};

Now the main application can be modified like this:

extern Factory factory;
int main(int argc, char* argv[])
{
   // automatically registers with the factory.
   CreatorImpl<Child1> creator1("Child1");
   CreatorImpl<Child2> creator2("Child2");
   CreatorImpl<Child3> creator3("Child3");
   std::vector<std::string> names = get_names(argv);
   Base* p;
   for (std::vector<std::string>::iterator i = names.begin(); i != names.end(); i++) {
      p = factory.create(*i);
      if (p != NULL)
         p->foo();
      else
         std::cout << "Class not found!\n";
  }
   return 0;
}

So, the developer still needs to instantiate a creator for every new class. How could we remove the burden of the developer to remember to update main() every time a new class is added? The only real way to do this is to have code executed before entry into main() which can only happen with the initialization of global data.

From this point on we are starting to wander into non-standard territory. We are going to count on the fact that an object’s constructor is guaranteed to be called at some point before the object is first accessed. This can be before main() but does not necessarily have to be. We want to give each new class its own creator instantiation without needing to update main(). We can either create a global creator object in the class’s implementation file, or give the class a private static creator member. For this implementation we are going to use the private static member.

This presents us with another problem however. There is no way to specify the order in which objects in global scope are created across different source files. Bad things will happen if the creator tries to register with the class factory before the class factory has been constructed. To solve this problem, we need to modify the factory class:

class Factory
{
public:
   static Base* create(const std::string& classname);
   static void registerit(const std::string& classname, Creator* creator);
private:
   static std::map<std::string, Creator*>& get_table();
};
Base* Factory::create(const std::string& classname)
{
   std::map<std::string, Creator*>::iterator i;
   i = get_table().find(classname);
   if (i != get_table().end())
      return i->second->create();
   else
      return (Base*)NULL;
}
void Factory::registerit(const std::string& classname, Creator* creator)
{
   get_table()[classname] = creator;
}
std::map<std::string, Creator*>& Factory::get_table()
{
   static std::map<std::string, Creator*> table;
   return table;
}

We’ve moved the table member and wrapped it as a static local variable in a member function called get_table(). The registerit() method accesses the lookup table through this static function. This guarantees the lookup table will be created before it is accessed. Also, by making all the member functions static, we don’t have to create a global instance of the class factory. We can instead call the factory methods directly. Now we need to update the creator class to reflect these changes:

Creator::Creator(const std::string& classname)
{
   Factory::registerit(classname, this);
}
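The trick used by get_table() is sometimes called the construct-on-first-use idiom. Stripped of the factory details, it can be sketched as follows; registry() and Registrar are hypothetical names standing in for the lookup table and the creators.

```cpp
#include <string>
#include <vector>

// Hypothetical registry standing in for the factory's lookup table.
std::vector<std::string>& registry()
{
    static std::vector<std::string> names; // constructed on the first call
    return names;
}

// Hypothetical type whose constructor registers a name, like Creator does.
struct Registrar
{
    explicit Registrar(const std::string& name) { registry().push_back(name); }
};

// Global objects: registry() builds its vector the first time it is called,
// so the vector exists by the time either constructor pushes into it.
Registrar first("first");
Registrar second("second");
```

Because the vector lives inside the function, there is no separate global whose construction could come too late.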

To remove the creator instantiation out of main() we update our class definitions:

class Child1 : public Base
{
private:
   static const CreatorImpl<Child1> creator;
public:
   virtual void foo();
};
class Child2 : public Base
{
private:
   static const CreatorImpl<Child2> creator;
public:
   virtual void foo();
};
class Child3 : public Child2
{
private:
   static const CreatorImpl<Child3> creator;
public:
   virtual void foo();
};
const CreatorImpl<Child1> Child1::creator("Child1");
void Child1::foo()
{
   std::cout << "Child1\n";
}
const CreatorImpl<Child2> Child2::creator("Child2");
void Child2::foo()
{
   std::cout << "Child2\n";
}
const CreatorImpl<Child3> Child3::creator("Child3");
void Child3::foo()
{
   std::cout << "Child3\n";
}

Which really simplifies main():

int main(int argc, char* argv[])
{
   std::vector<std::string> names = get_names(argv);
   Base* p;
   for (std::vector<std::string>::iterator i = names.begin(); i != names.end(); i++) {
      p = Factory::create(*i);
      if (p != NULL)
         p->foo();
      else
         std::cout << "Class not found!\n";
  }
   return 0;
}

This is where we hit non-standard territory. If you were to compile and link this as a standalone application or statically linked library, it probably will not work. This is because the compiler will see that none of the static creators we give each class are directly accessed anywhere in the application. Basically, we define them and count on their constructors to do work for us but never need to explicitly call any methods on them afterwards. The compiler will recognize this and will optimize them away along with the registration code in the constructor calls.

There is nothing in the C++ standard that can force constructor generation even if the construction has necessary side effects. There had been some discussions about adding a force keyword but I don’t know if it actually made it into the C++0x standard. Currently the only way to force the constructor to be generated and called is by accessing the object. Since that is abstracted away from the compiler in our factory table, they are all optimized away.

The non-standard solution to this is to put the factory and Base derived classes into a dynamically linked library. The idea is that because the compiler does not know what objects from the library will be accessed by an application at run time, it has to generate all the constructors for the global data. When the library is loaded into the process address space, all the creator constructors are executed, registering their respective classes. I’ve tested this on Windows and Linux platforms and it works as described. Given that it is non-standard, it is not guaranteed to work everywhere.

Finally, we get to abuse the C preprocessor and create some macros for setting up this framework. Setting up the creators manually is monotonous and error prone. The macros allow the developer to define and implement the framework in a way that will cause any errors to be caught at compile time while allowing the framework to be set up with a single line of code.

For class definition files we define a REGISTER() macro which will add the private static member:

#define REGISTER(classname) \
private: \
   static const CreatorImpl<classname> creator;

We would use the macro like this:

// Child1.h
class Child1 : public Base
{
   REGISTER(Child1);
public:
   virtual void foo();
};

Which would produce code like this:

// Child1.h
class Child1 : public Base
{
private:
   static const CreatorImpl<Child1> creator;
public:
   virtual void foo();
};

It is recommended that this macro be placed as the first thing in the class definition. It does not really matter where in the class definition it is placed but it must be understood that C++ lines following the macro will be private. So for instance, if the macro is placed following a bunch of public methods, methods directly following will all of a sudden become private even though you did not specify the private access modifier. If the macro is placed as the very first line in the class definition, because C++ class members are private by default, the macro will not change the expected semantics.

For class implementation files we define a REGISTERIMPL() macro which will instantiate the creator:

#define REGISTERIMPL(classname) \
   const CreatorImpl<classname> classname::creator(#classname);

We would use the macro like this:

// Child1.cpp
REGISTERIMPL(Child1);
void Child1::foo()
{
   std::cout << "Child1\n";
}

The preprocessor substitutes the code for the constructor call, replacing classname with the supplied C++ class name. The “#” symbol tells the preprocessor to substitute the value of classname as a string constant instead of expanding it as code. This would produce code that looks like this:

// Child1.cpp
const CreatorImpl<Child1> Child1::creator("Child1");
void Child1::foo()
{
   std::cout << "Child1\n";
}
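Pulling the pieces together, the whole framework can be sketched in a single file. This is a condensed illustration rather than the layout described above: the header and implementation files are collapsed into one translation unit, only Child1 is shown, and the virtual destructors are my own addition. Because everything sits in one translation unit here, the static creator should be initialized before the factory is first used; the caveat about separately linked object files and libraries still applies to a real multi-file layout.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <string>

class Base
{
public:
    virtual ~Base() {} // added so objects can be deleted through a Base*
    virtual void foo() = 0;
};

class Creator
{
public:
    explicit Creator(const std::string& classname);
    virtual ~Creator() {}
    virtual Base* create() = 0;
};

class Factory
{
public:
    static Base* create(const std::string& classname);
    static void registerit(const std::string& classname, Creator* creator);
private:
    static std::map<std::string, Creator*>& get_table();
};

std::map<std::string, Creator*>& Factory::get_table()
{
    static std::map<std::string, Creator*> table; // built on first use
    return table;
}

void Factory::registerit(const std::string& classname, Creator* creator)
{
    get_table()[classname] = creator;
}

Base* Factory::create(const std::string& classname)
{
    std::map<std::string, Creator*>::iterator i = get_table().find(classname);
    if (i != get_table().end())
        return i->second->create();
    return NULL; // no creator registered under that name
}

// The creator registers itself as soon as it is constructed.
Creator::Creator(const std::string& classname)
{
    Factory::registerit(classname, this);
}

template <class T>
class CreatorImpl : public Creator
{
public:
    explicit CreatorImpl(const std::string& classname) : Creator(classname) {}
    virtual Base* create() { return new T; }
};

#define REGISTER(classname) \
private: \
   static const CreatorImpl<classname> creator;

#define REGISTERIMPL(classname) \
   const CreatorImpl<classname> classname::creator(#classname);

class Child1 : public Base
{
   REGISTER(Child1)
public:
   virtual void foo();
};

REGISTERIMPL(Child1)
void Child1::foo()
{
   std::cout << "Child1\n";
}
```

With this in place, Factory::create("Child1") should return a new Child1, while an unregistered name such as "Child4" should return NULL. The caller owns the returned object and is responsible for deleting it.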

I implemented this in a utility library that I use in various projects. I’m always looking for ways to improve this. If you see any bugs, typos or have any suggestions, feel free to email me or leave a comment below.

Update 4/14/2013

After receiving requests for a working example, I finally threw one together. It can be downloaded here: factorydemo.zip. This demo is built using MinGW. To build, type make from the MinGW shell. This will build main.exe and factory.dll. If you do not have MinGW and just want to try it, there are precompiled binaries in the zip archive. An example run of the demo looks like this:

$ main Child1
Child1
$ main Child2
Child2
$ main Child3
Child3
$ main Child4
Class not found!

I also updated the code listings above. It seems some special characters like < and > and others were stripped at some point making the code actually incorrect. They all should be fixed now. If I missed any let me know.