Weird compiler problem

I wanted to write about a really weird problem I recently ran into while debugging C++ code (technically, it’s all C). Unfortunately, I was doing this in kernel debugging mode, which made life a bit harder, but the same thing would have happened in userland.

I had an .hpp file (we’ll call it process_internal.hpp) that was originally an internal file meant to be included only from one .cpp file (we’ll call it process.cpp), so it contained definitions of global variables. I ended up needing to include process_internal.hpp elsewhere (for testing; we’ll call that file test.cpp). Because of this, the same symbols were defined in multiple files, so the separately built .o files did not interact properly. I ended up using “#ifdef”s to include only the parts I needed in test.cpp, and “extern” declarations of the global variables for it. It looked something like the following:

enum { FT_Inbound, FT_Outbound };
typedef struct FilteringLayer {
	int FilterTypeNum, OriginalID;
	const char *Name;
} FilteringLayer;
const int FT_NumTypes=2;

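#ifndef PROCESS_INTERNAL_EXTERN //Placeholder macro name; test.cpp would define it before including this file
	FilteringLayer FilterTypes[FT_NumTypes]={
		{FT_Inbound,  5, "Inbound"},
		{FT_Outbound, 8, "Outbound"},
	};
#else
	extern "C" FilteringLayer *FilterTypes;
#endif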

So I was accessing this variable in test.cpp and getting a really weird problem. The code looked something like this:

struct foo { int a, b; };
foo Stuff[]={...};
void FunctionBar()
{
	for(int i=0;i<FT_NumTypes;i++)
		Stuff[FilterTypes[i].OriginalID].b=1;
}

This was causing an access exception, which blue screened my debug VM. I tried running the exact same statements in the Visual Studio debugger, and things worked just as they were supposed to! So I decided to go to the assembly level. It looked something like this (descriptions included):

L# | Code | Description | Combined description
   | for(int i=0;i<FT_NumTypes;i++) | |
1  | mov qword ptr [rsp+58h],0 | int i=0 |
2  | jmp MODULENAME!FunctionBar+0xef | JUMP TO #LINE@6 |
3  | mov rax,qword ptr [rsp+58h] | RAX=i |
4  | inc rax | RAX++ | i++
5  | mov qword ptr [rsp+58h],rax | i=RAX |
6  | cmp qword ptr [rsp+58h],02h | CMP=(i-FT_NumTypes) |
7  | jae MODULENAME!FunctionBar+0x11e | IF(CMP>=0) GOTO #LINE@15 | if(i>=FT_NumTypes) GOTO #LINE@15
8  | imul rax,qword ptr [rsp+58h],10h | RAX=i*sizeof(FilteringLayer) |
9  | mov rcx,[MODULENAME!FilterTypes] | RCX=*(void**)&FilterTypes |
10 | movzx eax,word ptr [rcx+rax+4] | RAX=*(UINT16*)(RCX+RAX+4) | RAX=((FilteringLayer*)&FilterTypes)[i].OriginalID
11 | imul rax,rax,30h | RAX*=sizeof(foo) |
12 | lea rcx,[MODULENAME!Stuff] | RCX=(void*)&Stuff |
13 | mov dword ptr [rcx+rax+04h],1 | *(UINT32*)(RCX+RAX+0x4)=1 | Stuff[RAX].b=1
14 | jmp MODULENAME!FunctionBar+0xe2 | GOTO #LINE@3 |

I noticed that line #9 was putting 0x0000000C`00000000 into RCX instead of &FilterTypes. I knew the instruction should have been an “lea” instead of a “mov” to fix this. My first thought was compiler bug, but as many programming mantras say, that is very very rarely the case. If you want to guess now what the problem is, now is the time. I’ve given you all the information (and more) to make the guess.

The answer: extern "C" FilteringLayer *FilterTypes; should have been extern "C" FilteringLayer FilterTypes[];. Oops! The debugger was getting it right because it had the extra information of the real definition of the FilterTypes variable.
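
The difference between the two declarations can be reproduced in a single file. The following is only a sketch under stated assumptions: the helper names BaseAsArray and BaseAsPointer are made up for illustration, and the pointer case is simulated with memcpy, since actually declaring the symbol with the wrong type is undefined behavior.

```cpp
#include <cstdint>
#include <cstring>

struct FilteringLayer {
	int FilterTypeNum, OriginalID;
	const char *Name;
};

FilteringLayer FilterTypes[2]={
	{0, 5, "Inbound"},
	{1, 8, "Outbound"},
};

//With `extern FilteringLayer FilterTypes[];` the symbol IS the storage,
//so the compiler takes its address (the "lea" case).
std::uintptr_t BaseAsArray() //Hypothetical helper
{
	return reinterpret_cast<std::uintptr_t>(&FilterTypes[0]);
}

//With `extern FilteringLayer *FilterTypes;` the compiler believes a pointer is
//stored at the symbol, so it loads the first pointer-sized bytes of the array
//and uses them as the base address (the "mov" case). Simulated here with memcpy.
std::uintptr_t BaseAsPointer() //Hypothetical helper
{
	std::uintptr_t Value=0;
	std::memcpy(&Value, FilterTypes, sizeof(Value));
	return Value;
}
```

BaseAsPointer() returns the raw bytes of the first array element (FilterTypeNum and, on 64-bit targets, OriginalID) reinterpreted as an address, which is the kind of garbage value that ended up in RCX.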

Windows Driver Service Loader

Following is some C++ source code for a Windows kernel-driver service loader. It could be used to load other service types too by changing the dwServiceType flag on the CreateService call. I threw this together for another project I am currently working on. It is also used in the following post (posting soon).

It works in the following way:
  • It is a command line utility which takes 3 arguments:
    1. The service name. Hereafter referred to as SERVICE_NAME
    2. The service display name. Hereafter referred to as DISPLAY_NAME
    3. The driver path (to the .sys file). Hereafter referred to as DRIVER_PATH
  • This program (most likely) requires administrative access. There are also some caveats regarding driver code signing requirements that are thoroughly explored elsewhere.
  • It first checks to see if a service already exists with the given SERVICE_NAME. If it does:
    1. If the DISPLAY_NAME matches, the service is kept as is.
    2. If the DISPLAY_NAME does not match, the user is asked whether they want to delete the current service. If they do not, the program exits.
  • If the service needs to be created (it did not already exist or was deleted), it creates the service with the given SERVICE_NAME, DISPLAY_NAME, and DRIVER_PATH. If the service is not created during this run, the DRIVER_PATH is ignored.
    Note: The DRIVER_PATH must be to a direct local file system file. I have found that network links and symbolic links do not work.
  • The service is started up:
    • If it is already running, the user is asked whether they want to stop the currently running service. If they say no, the program exits.
  • The program then waits for a final user input on whether to stop the service before exiting the program.
  • If there was an error, the program reports the error; otherwise, it reports “Success”.
  • The program pauses at the end until the user presses any key to exit.
  • The program returns 0 on success, and 1 if an error occurred.

//Compiler flags
#define WIN32_LEAN_AND_MEAN  //Include minimum amount of windows stuff
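#ifndef _UNICODE //Everything in this script is unicode
	#define _UNICODE
#endif
#ifndef UNICODE //The Windows headers key off of UNICODE (no underscore) to select the wide-character APIs
	#define UNICODE
#endif

#include <windows.h>
#include <stdio.h>
#include <stdarg.h>
#include <conio.h>
#include <memory>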

//Smart pointers
typedef std::unique_ptr<WCHAR, void(*)(WCHAR*)> SmartWinAlloc;
typedef std::unique_ptr<SC_HANDLE, void(*)(SC_HANDLE*)> SmartCloseService;
void Delete_SmartWinAlloc(WCHAR *p) { if(p) LocalFree(p); }
void Delete_SmartCloseService(SC_HANDLE *h) { if(h && *h) CloseServiceHandle(*h); }

//Function declarations
WCHAR* InitDriver(int argc, WCHAR *argv[]);
WCHAR* FormatError(WCHAR* Format, ...);
SmartWinAlloc GetLastErrorStr();
BOOLEAN AskQuestion(WCHAR* Question); //Returns if user answered yes

int wmain(int argc, WCHAR *argv[])
{
	//Run the init routine
	WCHAR* Ret=InitDriver(argc, argv);

	//If there is an error, report it, or otherwise, report success
	wprintf(L"%s\n", Ret ? Ret : L"Success");
	wprintf(L"%s\n", L"Press any key to exit");
	_getch();

	//Return if successful
	return (Ret ? 1 : 0);
}

WCHAR* InitDriver(int argc, WCHAR *argv[])
{
	//Confirm arguments
	if(argc!=4)
		return FormatError(L"%s", L"3 arguments are required: Service Name, Display Name, Driver Path");
	const WCHAR* Param_ServiceName=argv[1];
	const WCHAR* Param_DisplayName=argv[2];
	const WCHAR* Param_DriverPath =argv[3];

	//Open the service manager
	wprintf(L"%s\n", L"Opening the service manager");
	SC_HANDLE HSCManager=OpenSCManager(nullptr, nullptr, SC_MANAGER_CREATE_SERVICE);
	if(!HSCManager)
		return FormatError(L"%s: %s", L"Error opening service manager", GetLastErrorStr().get());
	SmartCloseService FreeHSCManager(&HSCManager, Delete_SmartCloseService);

	//Check if the service already exists
	wprintf(L"%s\n", L"Checking previously existing service state");
	BOOL ServiceExists=false;
	{
		//Get the service name
		const DWORD NameBufferSize=255;
		WCHAR NameBuffer[NameBufferSize];
		WCHAR *NamePointer=NameBuffer;
		DWORD NamePointerSize=NameBufferSize;
		std::unique_ptr<WCHAR[]> Buf(nullptr); //May be swapped with a real pointer later
		for(INT_PTR i=0;i<2;i++)
		{
			//If we found the service, exit the lookup here
			if(GetServiceDisplayName(HSCManager, Param_ServiceName, NamePointer, &NamePointerSize))
			{
				ServiceExists=true;
				break;
			}

			//If the service does not exist, we can exit the lookup here
			if(GetLastError()==ERROR_SERVICE_DOES_NOT_EXIST)
				break;

			//If error is not insufficient buffer size, return the error
			if(GetLastError()!=ERROR_INSUFFICIENT_BUFFER)
				return FormatError(L"%s: %s", L"Could not query service information", GetLastErrorStr().get());

			//If second pass, error out
			if(i==1)
				return FormatError(L"%s: %s", L"Could not query service information", L"Second buffer pass failed");

			//Create a buffer of appropriate size (and make sure it will later be released)
			NamePointer=new WCHAR[++NamePointerSize];
			std::unique_ptr<WCHAR[]> Buf2(NamePointer);
			Buf.swap(Buf2); //Keep the buffer alive past this loop iteration
		}

		//If the service already exists, confirm the service name matches, and if not, ask if user wants to delete the current service
		if(ServiceExists)
		{
			wprintf(L"%s\n", L"The service already exists");
			if(wcsncmp(NamePointer, Param_DisplayName, NamePointerSize+1))
			{
				//If the service names do not match, ask the user what to do
				wprintf(L"%s:\nCurrent: %s\nRequested: %s\n", L"The service names do not match", NamePointer, Param_DisplayName);

				//Make the request
				if(!AskQuestion(L"Would you like to replace the service? (y/n)")) //If user does not wish to replace the service
					return FormatError(L"%s", L"Cannot continue if service names do not match");

				//Delete the service
				wprintf(L"%s\n", L"Deleting the old service");
				SC_HANDLE TheService=OpenService(HSCManager, Param_ServiceName, DELETE);
				if(!TheService)
					return FormatError(L"%s: %s", L"Could not open the service to delete it", GetLastErrorStr().get());
				SmartCloseService CloseTheService(&TheService, Delete_SmartCloseService); //Close the service handle
				if(!DeleteService(TheService))
					return FormatError(L"%s: %s", L"Could not delete the service", GetLastErrorStr().get());
				wprintf(L"%s\n", L"The service has been deleted");
				ServiceExists=false;
			}
		}
	}

	//Create the service
	SC_HANDLE TheService;
	if(!ServiceExists)
	{
		//Confirm the driver path exists
		wprintf(L"%s\n", L"Checking the driver file");
		DWORD FileAttrs=GetFileAttributes(Param_DriverPath);
		if(FileAttrs==INVALID_FILE_ATTRIBUTES)
			return FormatError(L"%s: %s", L"Given path is invalid", GetLastErrorStr().get());
		if(FileAttrs&FILE_ATTRIBUTE_DIRECTORY)
			return FormatError(L"%s: %s", L"Given path is invalid", L"Path is a folder");

		//Create the service
		wprintf(L"%s\n", L"Creating the service");
		TheService=CreateService(
			HSCManager, Param_ServiceName, Param_DisplayName, 
			SERVICE_START|SERVICE_STOP, SERVICE_KERNEL_DRIVER, SERVICE_DEMAND_START, SERVICE_ERROR_NORMAL, //Access rights, start type, and error control are assumed typical values
			Param_DriverPath, nullptr, nullptr, nullptr, nullptr, nullptr);
		if(!TheService)
			return FormatError(L"%s: %s", L"Could not create the service", GetLastErrorStr().get());

	//Open the service if not creating
	} else {
		TheService=OpenService(HSCManager, Param_ServiceName, SERVICE_START|SERVICE_STOP);
		if(!TheService)
			return FormatError(L"%s: %s", L"Could not open the service", GetLastErrorStr().get());
	}
	SmartCloseService CloseTheService(&TheService, Delete_SmartCloseService); //Close the service on exit

	//Start the service
	wprintf(L"%s\n", L"Starting the service");
	SERVICE_STATUS ss;
	for(INT_PTR i=0;i<2;i++)
	{
		if(StartService(TheService, 0, nullptr))
			break;

		//If not "service already running" error, or user does not want to stop the current service
		if(i==1 || GetLastError()!=ERROR_SERVICE_ALREADY_RUNNING || !AskQuestion(L"The service is already running. Would you like to stop it? (y/n)"))
			return FormatError(L"%s: %s", L"Could not start the service", GetLastErrorStr().get());

		//Stop the service
		wprintf(L"%s\n", L"Stopping the current service");
		if(!ControlService(TheService, SERVICE_CONTROL_STOP, &ss))
			return FormatError(L"%s: %s", L"Could not stop the current service", GetLastErrorStr().get());
	}
	wprintf(L"%s\n", L"Started the service");

	//Ask if the user wants to close the service
	if(!AskQuestion(L"Would you like to stop the service before exit? (y/n)"))
		return nullptr;

	//Stop the service
	if(!ControlService(TheService, SERVICE_CONTROL_STOP, &ss))
		return FormatError(L"%s: %s", L"Could not stop the service", GetLastErrorStr().get());
	if(ss.dwCurrentState!=SERVICE_STOP_PENDING && ss.dwCurrentState!=SERVICE_STOPPED)
		return FormatError(L"%s", L"The service does not appear to be closing");
	wprintf(L"%s\n", L"The service has been stopped");

	//Return success
	return nullptr;
}

WCHAR* FormatError(WCHAR* Format, ...)
{
	static WCHAR Err[255];
	va_list VAList;
	va_start(VAList, Format);
	vswprintf(Err, sizeof(Err)/sizeof(Err[0]), Format, VAList);
	va_end(VAList);
	return Err;
}

SmartWinAlloc GetLastErrorStr()
{
	LPWSTR MessageBuffer=nullptr;
	FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER|FORMAT_MESSAGE_FROM_SYSTEM|FORMAT_MESSAGE_IGNORE_INSERTS,
		nullptr, GetLastError(), MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), (LPWSTR)&MessageBuffer, 0, nullptr);
	return SmartWinAlloc(MessageBuffer, Delete_SmartWinAlloc);
}

BOOLEAN AskQuestion(WCHAR* Question)
{
	//Make the request and wait for an input character
	while(1)
	{
		//Ask the question and get the answer
		wprintf(L"%s:", Question);
		char InputChar=(char)_getch();

		//Check for a valid answer
		if(InputChar=='n' || InputChar=='N')
			return FALSE;
		if(InputChar=='y' || InputChar=='Y')
			return TRUE;
	}
}

Setting the time zone through a numeric offset
They never make it easy

I had the need today to set the current time zone for an application, in multiple programming languages, by its hourly offset from GMT/UTC, which turned out to be a lot harder than I expected. It seems most time zone related functions, at least in Linux, expect you to use full location strings to set the current time zone (e.g. America/Chicago).

After a lot of research and experimenting, I came up with the following results. All of these are confirmed working in Linux, and most or all of them should work in Windows too.

Language | Format note           | Format for GMT+5 | Format for GMT-5
C        | Negate                | GMT-5            | GMT5
Perl     | Negate                | GMT-5            | GMT5
SQL      | Requires sign         | +5:00            | -5:00
PHP      | Negate, requires sign | Etc/GMT-5        | Etc/GMT+5
And here are examples of using this in each language. The “TimeZone” string variable should be a 1-2 digit integer with an optional preceding negative sign:

C:
#include <stdio.h> //snprintf
#include <stdlib.h> //setenv, atoi
#include <time.h> //tzset

char Buffer[10];
snprintf(Buffer, 10, "GMT%i", -atoi(TimeZone));
setenv("TZ", Buffer, 1);
tzset();

Perl:
use POSIX qw/tzset/;
$ENV{TZ}='GMT'.(-$TimeZone);
tzset();

SQL [Query string created via Perl]:
$Query='SET time_zone="'.($TimeZone>=0 ? '+' : '').$TimeZone.':00"';

PHP:
date_default_timezone_set('Etc/GMT'.($TimeZone<=0 ? '+' : '').(-$TimeZone));

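
The sign rules in the format table are easy to get backwards, so here is a small C++ sketch that builds each string from an integer offset. The function names (PosixTZ, SqlOffset, PhpZone) are hypothetical and only mirror the table rows; nothing here actually sets the time zone.

```cpp
#include <string>

//C and Perl row: "GMT" followed by the negated offset
std::string PosixTZ(int Offset)
{
	return "GMT"+std::to_string(-Offset);
}

//SQL row: the offset with an explicit sign, followed by ":00"
std::string SqlOffset(int Offset)
{
	return (Offset>=0 ? "+" : "")+std::to_string(Offset)+":00";
}

//PHP row: "Etc/GMT" followed by the negated offset with an explicit sign
std::string PhpZone(int Offset)
{
	return std::string("Etc/GMT")+(Offset<=0 ? "+" : "")+std::to_string(-Offset);
}
```

For an offset of 5 these give GMT-5, +5:00, and Etc/GMT-5, matching the “Format for GMT+5” column.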
Realtime StdOut pass through to Web Browser
Tying it all together

I had the need to pass a program’s [standard] output to a web browser in real time. The best solution for this is to use a combination of programs made in different languages. The following are all of these individual components to accomplish this task.

Please note the C components are only compatible with gcc and bash (cygwin required for Windows), as MSVC and Windows command prompt are missing vital functionality for this to work.

The first component is a server made in C that receives stdin (as a pipe, or typed by the user after line breaks) and sends that data out to a connected client (buffering the output until the client connects).

PassThruServer source, PassThruServer compiled Windows executable.

Compilation notes:
  • This compiles as C99 under gcc:
    gcc PassThruServer.c -o PassThruServer
  • Define “WINDOWS” when compiling in Windows (pass “-DWINDOWS”)

Source Code:
#include <stdio.h>
#include <stdlib.h> //exit
#include <malloc.h>
#include <fcntl.h>
#include <unistd.h> //close, write
#include <sys/types.h> 
#include <sys/select.h> //select
#include <sys/socket.h>
#include <netinet/in.h>
#include <signal.h>

//The server socket and options
int ServerSocket=0;
const int PortNumber=1234; //The port number to listen in on

//If an error occurs, exit cleanly
int error(char *msg)
{
	//Close the socket if it is still open
	if(ServerSocket>0)
		close(ServerSocket);
	ServerSocket=0;

	//Output the error message, and return the exit status
	fprintf(stderr, "%s\n", msg);
	return 1;
}

//Termination signals
void TerminationSignal(int sig)
{
	exit(error("SIGNAL causing end of process"));
}

int main(int argc, char *argv[])
{
	//Listen for termination signals
	signal(SIGINT, TerminationSignal);
	signal(SIGTERM, TerminationSignal);
	signal(SIGHUP, SIG_IGN); //We want the server to continue running if the environment is closed, so SIGHUP is ignored -- This doesn't work in Windows

	//Create the server
	struct sockaddr_in ServerAddr={AF_INET, htons(PortNumber), INADDR_ANY, 0}; //Address/port to listen on
	if((ServerSocket=socket(AF_INET, SOCK_STREAM, 0))<0) //Attempt to create the socket
		return error("ERROR on 'socket' call");
	if(bind(ServerSocket, (struct sockaddr*)&ServerAddr, sizeof(ServerAddr))<0) //Bind the socket to the requested address/port
		return error("ERROR on 'bind' call");
	if(listen(ServerSocket,5)<0) //Attempt to listen on the requested address/port
		return error("ERROR on 'listen' call");

	//Accept a connection from a client
	struct sockaddr_in ClientAddr;
	socklen_t ClientAddrLen=sizeof(ClientAddr);
	int ClientSocket=accept(ServerSocket, (struct sockaddr*)&ClientAddr, &ClientAddrLen);
	if(ClientSocket<0)
		return error("ERROR on 'accept' call");

	//Prepare to receive info from STDIN
		//Create the buffer
		const int BufferSize=1024*10;
		char *Buffer=malloc(BufferSize); //Allocate a 10k buffer
		//STDIN only needs to be set to binary mode in windows
		const int STDINno=fileno(stdin);
		#ifdef WINDOWS
			_setmode(STDINno, _O_BINARY);
		#endif
		//Prepare for blocked listening (select function)
		fcntl(STDINno, F_SETFL, fcntl(STDINno, F_GETFL, 0)|O_NONBLOCK); //Set STDIN as non-blocking
		fd_set WaitForSTDIN;

	//Receive information from STDIN, and pass directly to the client
	int RetVal=0;
	while(1)
	{
		//Get the next block of data from STDIN
		FD_ZERO(&WaitForSTDIN);
		FD_SET(STDINno, &WaitForSTDIN);
		select(STDINno+1, &WaitForSTDIN, NULL, NULL, NULL); //Wait for data
		size_t AmountRead=fread(Buffer, 1, BufferSize, stdin); //Read the data

		//Send the data to the client
		if(AmountRead>0 && write(ClientSocket,Buffer,AmountRead)<0) //If error in network connection occurred
		{
			RetVal=error("ERROR on 'write' call");
			break;
		}
		if(feof(stdin) || AmountRead==0) //If input is closed, process is complete
			break;
	}
	return RetVal;
}

The next component is a Flash applet as the client to receive data. Flash is needed as it can keep a socket open for realtime communication. The applet receives the data and then passes it through to JavaScript for final processing.

Compiled Flash Client Applet

ActionScript 3.0 Code (This goes in frame 1)
import flash.external.ExternalInterface;
import flash.events.Event;
import flash.events.IOErrorEvent;
import flash.events.ProgressEvent;
import flash.net.Socket;
ExternalInterface.addCallback("OpenSocket", OpenSocket);

function OpenSocket(IP:String, Port:Number):void
{
	SendInfoToJS("Trying to connect");
	var TheSocket:Socket = new Socket();
	TheSocket.addEventListener(Event.CONNECT, function(Success) { SendInfoToJS(Success ? "Connected!" : "Could not connect"); });
	TheSocket.addEventListener(Event.CLOSE, function(event:Event) { SendInfoToJS("Connection Closed"); });
	TheSocket.addEventListener(IOErrorEvent.IO_ERROR, function(event:IOErrorEvent) {SendInfoToJS("Could not connect");});
	TheSocket.addEventListener(ProgressEvent.SOCKET_DATA, function(event:ProgressEvent):void { ExternalInterface.call("GetPacket", TheSocket.readUTFBytes(TheSocket.bytesAvailable)); });
	TheSocket.connect(IP, Port);
}
function SendInfoToJS(str:String) { ExternalInterface.call("GetInfoFromFlash", str); }

Flash sockets can also be implemented in ActionScript 1.0 Code (I did not include hooking up ActionScript 1.0 with JavaScript in this example. “GetPacket” and “SendInfoToJS” need to be implemented separately. “IP” and “Port” need to also be received separately).
var NewSock=new XMLSocket();
NewSock.onData=function(msg) { GetPacket(msg); }
NewSock.onConnect=function(Success) { SendInfoToJS(Success ? "Connected!" : "Could not connect"); }
SendInfoToJS(NewSock.connect(IP, Port) ? "Trying to Connect" : "Could not start connecting");

JavaScript can then receive (and send) information from (and to) the Flash applet through the following functions.

  • FLASH.OpenSocket(String IP, Number Port): Call this from JavaScript to open a connection to a server. Note the IP MIGHT have to be the domain the script is running on for security errors to not be thrown.
  • JAVASCRIPT.GetInfoFromFlash(String): This is called from Flash whenever connection information is updated. At the moment it just passes arbitrary status strings.
  • JAVASCRIPT.GetPacket(String): This is called from Flash whenever data is received through the connection.

This example allows the user to input the IP to connect to that is streaming the output. Connection information is shown in the “ConnectionInfo” DOM object. Received data packets are appended to the document in separate DOM objects.

JavaScript+HTML Source

Source Code: (See JavaScript+HTML Source file for all code)
var isIE=navigator.appName.indexOf("Microsoft")!=-1;
function getFlashMovie(movieName) { return (isIE ? window[movieName] : document[movieName]);  }
function $(s) { return document.getElementById(s); }

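function Connect()
{
	getFlashMovie("client").OpenSocket($('IP').value, 1234);
}

function GetInfoFromFlash(Str)
{
	$('ConnectionInfo').innerHTML=Str; //Show the connection information
}

function GetPacket(Str)
{
	var NewDiv=document.createElement('DIV');
	NewDiv.appendChild(document.createTextNode(Str));
	document.body.appendChild(NewDiv); //Append the packet to the document
}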

Next is an example application that outputs to stdout. It is important that it flushes stdout after every output or the communication may not be real time.

inc source, inc compiled Windows executable.

inc counts from 0 to one less than a number (parameter #1 [default=50]), outputting one number per millisecond interval (parameter #2 [default=500]).

[Bash] Example:
./inc 10 #Counts from 0-9 every half a second

Source Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> //usleep

int main(int argc, char *argv[])
{
	int NumLoops=(argc>1 ? atoi(argv[1]) : 50); //Number of loops to run from passed argument 1. Default is 50 if not specified.
	int LoopWait=(argc>2 ? atoi(argv[2]) : 500); //Number of milliseconds to wait in between each loop from passed argument 2. Default is 500ms if not specified.
	LoopWait*=1000; //Convert to microseconds for usleep

	//Output an incremented number after every wait interval
	int i=0;
	while(i<NumLoops)
	{
		printf("%d\n", i++);
		fflush(stdout); //Force stdout flush
		usleep(LoopWait); //Wait for the requested interval
	}
	return 0;
}

This final component is needed so the Flash applet can connect to a server. Unfortunately, new versions of Flash (at least version 10, might have been before that though) started requiring policies for socket connections >:-(. I don’t think this is a problem if you compile your applet to target an older version of Flash with the ActionScript v1.0 code.

This Perl script creates a server on port 843 to respond to Flash policy requests, telling any Flash applet from any domain to allow connections to go through to any port on the computer (IP). It requires Perl, and root privileges on Linux to bind to a port <1024 (su to root or run with sudo).

Flash Socket Policy Server (Rename extension to .pl)

Source Code:
use warnings;
use strict;

#Listen for kill signals
$SIG{'QUIT'}=$SIG{'INT'}=$SIG{__DIE__} = sub
{
	close Server;
	print "Socket Policy Server Ended: $_[0]\n";
	exit;
};

#Start the server:
use Socket;
use IO::Handle;
my $FlashPolicyPort=843;
socket(Server, PF_INET, SOCK_STREAM, getprotobyname('tcp')) or die "'socket' call: $!"; #Open the socket
setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, 1) or die "'setsockopt' call: $!"; #Allow reusing of port/address if in TIME_WAIT state
bind(Server, sockaddr_in($FlashPolicyPort,INADDR_ANY)) or die "'bind' call: $!"; #Listen on port $FlashPolicyPort for connections from any INET adapter
listen(Server,SOMAXCONN) or die "'listen' call: $!"; #Start listening for connections
Server->autoflush(1); #Do not buffer output

#Infinite loop that accepts connections
$/ = "\0"; #Reset terminator from new line to null char
while(my $paddr=accept(Client,Server))
{
	Client->autoflush(1); #Do not buffer IO
	if(<Client> =~ /.*policy\-file.*/i) { #If client requests policy file...
		print Client '<cross-domain-policy><allow-access-from domain="*" to-ports="*" /></cross-domain-policy>'.$/; #Output policy info: Allow any flash applets from any domain to connect
	}
	close Client; #Close the client
}

This could very easily be converted to another, better [less resource intensive] language too.
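
As a rough idea of what such a port might look like, here is a C++ sketch of just the request-matching step (the socket handling is left out). The function name PolicyReply is hypothetical; the policy string and the null terminator are taken from the Perl script above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

//Returns the null-terminated policy reply if the request mentions "policy-file"
//(case-insensitive, like the Perl regex above), or an empty string otherwise
std::string PolicyReply(std::string Request)
{
	std::transform(Request.begin(), Request.end(), Request.begin(),
		[](unsigned char c) { return static_cast<char>(std::tolower(c)); });
	if(Request.find("policy-file")==std::string::npos)
		return std::string();

	std::string Reply="<cross-domain-policy><allow-access-from domain=\"*\" to-ports=\"*\" /></cross-domain-policy>";
	Reply.push_back('\0'); //The Perl script terminates the reply with a null character
	return Reply;
}
```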

How to tie all of this together
  1. Start the servers
    • In your [bash] command shell, execute the following
      Server/FlashSocketPolicy.pl & #Run the Flash Policy Server as a daemon. Don't forget sudo in Linux
      ./inc | ./PassThruServer #Pipe inc out to the PassThruServer
    • Note that this will immediately start the PassThruServer receiving information from “inc”, so if you don’t get the client up in time, it may already be done counting and send you all the info at once (25 seconds).
    • The PassThruServer will not end until one of the following conditions has been met:
      • The client has connected and the piped process is completed
      • The client has connected and disconnected and the disconnect has been detected (when a packet send failed)
      • It is manually killed through a signal
    • The Flash Policy Server daemon should probably just be left on indefinitely in the background (it only needs to be run once).
  2. To run the client, open client.html through a web server [e.g. Apache’s httpd] in your web browser. Don’t open the local file straight from your file system; it needs to be served by a web server for Flash to work correctly.
  3. Click “connect” (assuming you are running the PassThruServer already on localhost [the same computer]). You can click “connect” again every time a new PassThruServer is run.

Executable Stubs
Win32 Executable Hacking

Executable stubs can be used by a compiler to create the header section (the beginning section) of an outputted executable by adding the “/stub” switch to the linker.

#pragma comment(linker, "/stub:Stub.exe")

The MSDN Library for MSVC6 has the following to say about it:

The MS-DOS Stub File Name (/STUB:filename) option attaches an MS-DOS stub program to a Win32 program.

A stub program is invoked if the file is executed in MS-DOS. It usually displays an appropriate message; however, any valid MS-DOS application can be a stub program.

Specify a filename for the stub program after a colon (:) on the command line. The linker checks filename to be sure that it is a valid MS-DOS executable file, and issues an error message if the file is not valid. The program must be an .EXE file; a .COM file is invalid for a stub program.

If this option is not used, the linker attaches a default stub program that issues the following message:
This program cannot be run in MS-DOS mode.

For the stub to work in XP, the following guidelines must be met:
  • The stub should be at least 64 bytes long
  • The first 2 bytes of the stub (Bytes 0-1) need to be “MZ”
  • Bytes 60-63 (4 bytes) are replaced by the linker with the offset of the PE header (I have a note that you might want to set this to 0x60000000 [big endian] for some reason)

As long as these guidelines are met, the rest of the stub can be whatever you want :-). For small projects, you can even put information here like strings for the executable, which are accessible through the executable’s virtual address space starting at 0x400000 (the default image base).

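
The guidelines above are simple enough to check mechanically before handing a stub to the linker. This is a minimal sketch, not a full MS-DOS header validator; the function names are hypothetical, and reading bytes 60-63 as little-endian is my assumption based on how x86 stores multi-byte values.

```cpp
#include <cstddef>
#include <cstdint>

//True if the buffer is at least 64 bytes long and starts with "MZ"
bool LooksLikeValidStub(const unsigned char *Data, std::size_t Size)
{
	return Size>=64 && Data[0]=='M' && Data[1]=='Z';
}

//Reads bytes 60-63 (the field that gets replaced) as a little-endian 32-bit value
//Only call this on a buffer of at least 64 bytes
std::uint32_t ReadBytes60To63(const unsigned char *Data)
{
	return  static_cast<std::uint32_t>(Data[60])
	     | (static_cast<std::uint32_t>(Data[61])<<8)
	     | (static_cast<std::uint32_t>(Data[62])<<16)
	     | (static_cast<std::uint32_t>(Data[63])<<24);
}
```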
Virtual Functions in DLLs
And Shared Object Visibility

Since I am now working on making sure all my applications work in both Windows and Linux, I have had to work a lot more with GCC recently (which I am basically totally switching to). It’s a bit trying having to learn an entirely different toolset after having used the same formula for many, many years.

Linux shared objects are a bit different from Windows DLLs in that all symbols are exported by default instead of just the ones you specify, among other differences like how the libraries are loaded and unloaded at runtime. The solution to the visibility problem is relatively recent as far as GCC goes: the visibility attribute. Symbol visibility in a shared library is very important for both symbol collision and library load time reasons. This means, however, that every function/symbol needs to be marked as either not visible (for Linux) or exportable (for Windows).
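
For the GCC side, marking symbols might look something like the sketch below. The macro names API_PUBLIC and API_HIDDEN and the two functions are made up for illustration; the visibility attribute itself is GCC’s, and it is usually paired with the -fvisibility=hidden compiler flag so that unmarked symbols are hidden by default.

```cpp
//Hypothetical macro names; only the attribute syntax comes from GCC
#if defined(__GNUC__) && __GNUC__>=4
	#define API_PUBLIC __attribute__((visibility("default")))
	#define API_HIDDEN __attribute__((visibility("hidden")))
#else
	#define API_PUBLIC
	#define API_HIDDEN
#endif

//Stays internal to the shared object
API_HIDDEN int InternalHelper(int x) { return x*2; }

//Exported from the shared object
API_PUBLIC int PublicEntry(int x) { return InternalHelper(x)+1; }
```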

I ran into a rather nasty problem however that I couldn’t find any information on when making a class exportable in a DLL. Basically, I had the following:

A.dll creation
A.h
//Determine whether we are exporting or importing functions from this DLL
#ifdef A_DLL
	#define DLLEXPORT __declspec(dllexport)
#else
	#define DLLEXPORT __declspec(dllimport)
#endif

DLLEXPORT class a
{
	public:
	int x;
	void a_foo();
	virtual void a_bar();
};

A.cpp
#define A_DLL
#include "A.h"
#include "stdio.h"
void a::a_foo() {printf("a_foo");}
void a::a_bar() {printf("a_bar");}

Compiling A.dll
g++ -c A.cpp -o A.o
g++ -shared -Wl,--output-def=libA.def -Wl,--out-implib=./libA.a -Wl,--dll A.o -o ./A.dll

B.dll creation
B.h
//Determine whether we are exporting or importing functions from this DLL
#ifdef B_DLL
	#define DLLEXPORT __declspec(dllexport)
#else
	#define DLLEXPORT __declspec(dllimport)
#endif

DLLEXPORT class b
{
	public:
	int x;
	void b_foo();
	virtual void b_bar();
};

B.cpp
#define B_DLL
#include "B.h"
#include "stdio.h"
void b::b_foo() {printf("b_foo");}
void b::b_bar() {printf("b_bar");}

Compiling B.dll
g++ -c B.cpp -o B.o
g++ -shared -Wl,--output-def=libB.def -Wl,--out-implib=./libB.a -Wl,--dll B.o -o ./B.dll

main.exe creation
main.cpp
#include "A.h"
#include "B.h"

int main()
{
	a a_var;

	b b_var;
	return 1;
}

Compiling main.exe
g++ -c main.cpp -o main.o
g++ main.o -o ./main.exe -L./ -lA -lB

The problem that occurred was when only 1 of the DLL files was included on Windows, everything worked fine, but when both were included, I could only access the virtual functions from one of the 2. I was getting an error about “required class export or error due to vtable not being found”.

After many hours of tinkering and fruitless research, I stumbled upon the solution by accident. It turns out I was using the incorrect syntax. The classes should have been defined as “class DLLEXPORT x {}” instead of “DLLEXPORT class x {}”. Why it worked when only 1 of the DLLs was present, but clobbered with multiple DLLs, I have no idea.
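
For reference, the corrected placement looks like this. It is only a sketch of the exporting side: the macro is defined as empty outside of Windows so the snippet compiles anywhere, and the member functions return strings (instead of printing) purely to keep the example self-contained.

```cpp
#include <string>

#if defined(_WIN32)
	#define DLLEXPORT __declspec(dllexport) //Exporting side only in this sketch
#else
	#define DLLEXPORT //No-op so the example also builds outside of Windows
#endif

//The export macro goes between the "class" keyword and the class name
class DLLEXPORT a
{
	public:
	int x;
	std::string a_foo() { return "a_foo"; }
	virtual std::string a_bar() { return "a_bar"; }
	virtual ~a() {}
};
```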

NULL Pointer for C++
Extending a language for what it’s lacking

I’ve recently been frustrated by the fact that NULL in C++ is actually evaluated as integral 0 instead of a pointer with the value of 0. This problem can be seen in the following example:

class String
{
	public:
	String(int i)  { /* ... */ } //Convert a number to a string
	String(char* i){ /* ... */ } //Copy a char* string directly into the class
};

String Foo(NULL); //This would give the string "Foo" the value "0" instead of a char* to (void*)0

The solution I came up with, which my good friend Will Erickson (aka Sarev0k) helped me revise, is as follows:

#undef NULL //If NULL is already defined, get rid of it
struct NULL_STRUCT { template <typename T> operator T*() { return (T*)0; } }; //NULL_STRUCT will return 0 to any pointer
static NULL_STRUCT NULL; //NULL is of type NULL_STRUCT and static (local to the current file)

After coming up with this way of doing it, I found out this concept is already a part of the new C++0x standard as nullptr, but since it is not really out yet, I still need a solution for the current C++ standard.
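
On a compiler that does support nullptr, both approaches pick the pointer overload. The following is a small sketch assuming C++11 or later; the stand-in String class just records which constructor ran, and the struct is named NullStruct/Null (hypothetical names) so the real NULL macro is left untouched.

```cpp
#include <string>

//Stand-in for the String class above; Source records which constructor was chosen
struct String
{
	std::string Source;
	String(int)         : Source("int") {}
	String(const char*) : Source("pointer") {}
};

//Same idea as NULL_STRUCT, under a different name
struct NullStruct { template <typename T> operator T*() const { return static_cast<T*>(0); } };
static NullStruct Null;
```

With these definitions, String(0) uses the int constructor, while String(Null) and String(nullptr) both use the pointer constructor.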

After getting this to work how I wanted it, I tested it out to make sure it is optimized correctly in compilers. When the compiler knows a value will be 0, it can apply lots of special assembly tricks.

Microsoft Visual C++ got it right by seeing that NULL was just 0 and applying appropriate optimizations, but GCC missed an optimization step and didn’t detect that it was 0 down the whole pipe. GCC, to my knowledge, however, isn’t exactly known for its optimization.

Example code:
BYTE* a=...; //Set a to an arbitrary value (best if brought in via an external method [i.e. stdin] so the compiler doesn’t make assumptions about the variable)
bool b=(a==NULL); //Set b to true if a is 0 (NULL)
What MSVC6 outputs (and what it should be after optimization):
test eax,eax	//logical and a against itself to determine if it is 0 or not
sete al		//Set the lowest byte of eax to 1 if a is 0
What GCC gives
xor edx,edx	//Temporarily store 0 in edx for later comparison. This is a 0 trick, but 1 step higher than it could be used at.
cmp edx,eax	//Compare a against edx (0)
sete al		//Set the lowest byte of eax to 1 if a equals the value in edx

On a side note, it has been quite painful going from using assembly in Microsoft Visual C++ to GCC for 2 reasons:
  • I hate AT&T (as opposed to Intel) assembly syntax. It is rather clunky to use, and every program I’ve ever used is in Intel syntax (including all the Intel reference documentation). I tried turning on Intel syntax through a flag when compiling through GCC, but it broke GCC. :-\
  • Having to list which assembly registers are modified/used in the extended assembly syntax. This interface is also very clunky and, I have found, prone to bugs and problems.
C Jump Tables
The unfortunate reality of different feature sets in different language implementations

I was thinking earlier today how neat it would be for C/C++ to be able to get the address of a jump-to label for use in jump tables, specifically for an emulator. A few seconds into a Google query, I found out it is possible in GCC (the open source native Linux compiler) through the “label value operator” “&&”. I am crushed that MSVC doesn’t have native support for such a concept :-(.

The reason it would be great for an emulator is CPU emulation, where the first byte of each instruction’s opcode [see ASM] usually says what the instruction is supposed to do. An example to explain the usefulness of a jump table is as follows:

void DoOpcode(int OpcodeNumber, ...)
{
	void *Opcodes[]={&&ADD, &&SUB, &&JUMP, &&MUL}; //assuming ADD=opcode 0 and so forth
	goto *Opcodes[OpcodeNumber];
	//The ADD:, SUB:, JUMP:, and MUL: labels and their code follow here
}

Of course, this could still be done with virtual functions, function pointers, or a switch statement, but those are theoretically slower: the first two add call overhead, and a switch may add a range check before its jump. Putting the opcodes in separate functions would also rule out sharing local variables between them.
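To make the label jump table concrete, here is a small sketch of an interpreter loop built on GCC’s “&&” operator. The instruction format (an opcode byte followed by one operand byte), the opcode numbering, and the RunProgram name are all assumptions for the example, not anything from a real emulator. Compilers without the extension get a plain switch loop instead.

```c
enum {OP_ADD=0, OP_SUB, OP_MUL, OP_HALT}; //Opcode numbering is an assumption for this sketch

//Hypothetical mini interpreter: runs opcodes until it hits OP_HALT and returns the accumulator
//No bounds checking: an opcode above OP_HALT is undefined behavior in the jump table version
int RunProgram(const unsigned char *Code)
{
	int Accumulator=0; //Local state shared by every opcode handler
#if defined(__GNUC__)
	static void *Opcodes[]={&&ADD, &&SUB, &&MUL, &&HALT}; //Indexed by opcode number
	#define NEXT() goto *Opcodes[*Code++] //Fetch the next opcode and jump to its label
	NEXT();
	ADD:  Accumulator+=*Code++; NEXT();
	SUB:  Accumulator-=*Code++; NEXT();
	MUL:  Accumulator*=*Code++; NEXT();
	HALT: return Accumulator;
	#undef NEXT
#else
	for(;;) //Portable fallback for compilers without the && operator
		switch(*Code++)
		{
			case OP_ADD: Accumulator+=*Code++; break;
			case OP_SUB: Accumulator-=*Code++; break;
			case OP_MUL: Accumulator*=*Code++; break;
			default: return Accumulator;
		}
#endif
}
```

Note how every handler reads and writes Accumulator directly, which is the local variable sharing that separate functions would lose.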

Then again, it theoretically wouldn’t be too bad to use function pointers with, I believe, the _fastcall calling convention, and modern compilers SHOULD translate a switch into a jump table in an instance like this. Modern compilers are so obfuscated, though, that you never know what they are really doing.
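For comparison, a function pointer version might look something like the following sketch. The CPUState struct, the handler signature, and DoOpcodeViaTable are names I made up for the example; the struct is there because separate functions cannot share locals. The MSVC-only _fastcall keyword is left out so the code builds anywhere.

```c
typedef struct CPUState { int Accumulator; } CPUState; //Hypothetical state passed to every handler

typedef void (*OpcodeHandler)(CPUState *State, int Operand);

static void Opcode_ADD(CPUState *State, int Operand) { State->Accumulator+=Operand; }
static void Opcode_SUB(CPUState *State, int Operand) { State->Accumulator-=Operand; }
static void Opcode_MUL(CPUState *State, int Operand) { State->Accumulator*=Operand; }

//Indexed by opcode number, assuming ADD=opcode 0 and so forth
static const OpcodeHandler OpcodeTable[]={Opcode_ADD, Opcode_SUB, Opcode_MUL};

//Returns 1 if the opcode was run, or 0 if the opcode number is not in the table
int DoOpcodeViaTable(CPUState *State, unsigned int OpcodeNumber, int Operand)
{
	if(OpcodeNumber>=sizeof(OpcodeTable)/sizeof(OpcodeTable[0]))
		return 0;
	OpcodeTable[OpcodeNumber](State, Operand); //Indirect call through the table
	return 1;
}
```

Each opcode here costs an indirect call plus a pointer dereference for the shared state, which is the overhead a label jump table avoids.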

It would probably be best to try and code such an instance so that all 3 methods (function pointers, switch statement, jump table) could be utilized through compiler definitions, and then profile for whichever method is fastest and supported.

//Define the switch for which type of opcode picker we want
#define UseSwitchStatement
//#define UseJumpTable
//#define UseFunctionPointers

//Defines for how each opcode picker acts
#if defined(UseSwitchStatement)
	#define OPCODE(o) case OP_##o:
#elif defined(UseJumpTable)
	#define OPCODE(o) o:
	#define GET_OPCODE(o) &&o
#elif defined(UseFunctionPointers)
	#define OPCODE(o) void Opcode_##o()
	#define GET_OPCODE(o) (void*)&Opcode_##o
	//The above GET_OPCODE is actually a problem since the opcode functions aren't listed until after their ...
	//address is requested, but there are a couple of ways around that I'm not going to worry about going into here.
#endif

enum {OP_ADD=0, OP_SUB}; //assuming ADD=opcode 0 and so forth
void DoOpcode(int OpcodeNumber, ...)
{
	#ifndef UseSwitchStatement //If using JumpTable or FunctionPointers we need an array of the opcode jump locations
		void *Opcodes[]={GET_OPCODE(ADD), GET_OPCODE(SUB)}; //assuming ADD=opcode 0 and so forth
	#endif
	#if defined(UseSwitchStatement)
		switch(OpcodeNumber) { //Normal switch statement
	#elif defined(UseJumpTable)
		goto *Opcodes[OpcodeNumber]; //Jump to the proper label
	#elif defined(UseFunctionPointers)
		((void(*)(void))Opcodes[OpcodeNumber])(); //Call the proper function
		} //End the current function
	#endif

	//For testing under "UseFunctionPointers" (see GET_OPCODE comment under "defined(UseFunctionPointers)")
	//put the following OPCODE sections directly above this "DoOpcode" function
	OPCODE(ADD)
	{
		//...
	}
	OPCODE(SUB)
	{
		//...
	}

	#ifdef UseSwitchStatement //End the switch statement
		}
	#endif

#ifndef UseFunctionPointers //End the function
}
#endif

After some tinkering, I did discover that it is possible to retrieve the offset of a label in MSVC through inline assembly, so with some more tinkering it could be used for jump tables, though it might be a bit messy.
void ExamplePointerRetreival()
{
	void *LabelPointer;
	TheLabel:
	_asm mov LabelPointer, offset TheLabel
}
Inlining Executable Resources
Do you suffer from OPC (Obsessive Perfection Complex)? If not, you aren’t an engineer :-)

I am somewhat obsessive about file cleanliness, and like to have everything I do well organized with any superfluous files removed. This especially translates into my source code, and even more so for released source code.

Before I zip up the source code for any project, I always remove the extraneous workspace compilation files. These usually include:

  • C/C++: Debug & Release directories, *.ncb, *.plg, *.opt, and *.aps
  • VB: *.vbw
  • .NET: *.suo, *.vbproj.user

Unfortunately, a new offender surfaced in the form of the Hyrulean Productions icon and Signature File for about pages. I did not want every source release to have to include those 2 extra files, so I did research into inlining them in the resource script (.rc) file. Resources are just data directly compiled into an executable, and the resource script tells the resource compiler what all of these resources are and how to compile them in. All my C projects include a resource script for at least the file version, author information, and Hyrulean Productions icon. Anyways, this turned out to be way more of a pain in the butt than intended.

There are 2 ways to load “raw data” (not a standard format like an icon, bitmap, string table, version information, etc) into a resource script. The first way is through loading an external file:
RESOURCEID RESOURCETYPE "FileName"
RESOURCEID and RESOURCETYPE are arbitrary and user defined, and it is usually best to write them in caps, as the compilers often seem to be picky about case.

The second way is through inlining the data:
RESOURCEID RESOURCETYPE
BEGIN
	DATA
END
for example:
DakSig	Sig
BEGIN
	0x32DA,0x2ACF,0x0306...
END
Getting the data in the right format for the resource script is a relatively simple task.
  • First, acquire the data in base-16 (hex) encoded format. I suggest WinHex for this job.
    On a side note, I have been using WinHex for ages and highly recommend it. It’s one of the most well built and fully featured application suites I know of.
  • Lastly, convert the straight hex data (“DA32CF2A0603...”) into an array of 16-bit hex values with the 2 bytes of each pair swapped into little-endian order (“0x32DA,0x2ACF,0x0306...”). This can be done with a global replace regular expression of “(..)(..)” to “0x$2$1,”. I recommend Editpad Pro for this kind of work, another of my favorite pieces of software. As a matter of fact, I am writing this post right now in it :-).
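If you would rather script the second step than use a regular expression, the same byte pair swap can be sketched in a few lines of C. HexToRcWords is a name I made up for this example. Unlike the regular expression, which just leaves any odd leftover characters alone, it refuses input whose length is not a multiple of 4.

```c
#include <stddef.h>
#include <string.h>

//Hypothetical helper: rewrites "DA32CF2A" as "0x32DA,0x2ACF," (the 2 bytes of each pair swapped)
//Returns the number of 16-bit words written, or -1 if the input length is not a multiple of 4
//or the output buffer is too small (each word needs 7 characters, plus 1 for the terminator)
int HexToRcWords(const char *Hex, char *Out, size_t OutSize)
{
	size_t Len=strlen(Hex), i, o=0;
	if(Len%4!=0 || OutSize<Len/4*7+1)
		return -1;
	for(i=0;i<Len;i+=4)
	{
		Out[o++]='0'; Out[o++]='x';
		Out[o++]=Hex[i+2]; Out[o++]=Hex[i+3]; //Second byte of the pair goes first ($2)
		Out[o++]=Hex[i+0]; Out[o++]=Hex[i+1]; //First byte of the pair goes second ($1)
		Out[o++]=',';
	}
	Out[o]='\0';
	return (int)(Len/4);
}
```

Calling it on “DA32CF2A0603” gives 3 words, matching the sample values in the step above.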

Here is where the real caveats and problems start falling into place. First, I noticed the resource data was corrupt for a few bytes at a certain location. It turned out to be Visual Studio wanting line lengths in the resource file to be less than ~4175 characters, so I just added a line break at that point.

This idea worked great for the about page signature, which needed to be raw data anyways, but encoding the icon this way turned out to be impossible :-(. Visual Studio apparently requires external files be loaded if you want to use a pre-defined binary resource type (ICON, BITMAP, etc). The simple solution would be to inline the icon as a user defined raw data type, but unfortunately, the Win32 icon loading API functions (LoadIcon, CreateIconFromResource, LoadImage, etc) only seemed to work with properly defined ICONs. I believe the problem here is that when the compiler loads in the icon to include in the executable, it reformats it somewhat, so I would need to know this format. Again, unfortunately, the Win32 APIs failed me. FindResource/FindResourceEx wouldn’t let me load the data for ICON types for direct copying (or reverse engineering) :-(. At this point, it wouldn’t be worth my time to try and get the proper format just to inline my Hyrulean Productions icon into resource scripts. I may come back to it later if I’m ever really bored.

This unfortunately brings back a lot of bad old memories regarding Win32 APIs. A lot of the Windows system is really outdated, not nearly robust enough, or just way too obfuscated, and it has caused, and still causes, me innumerable migraines trying to get things working with their system.

As an example, I just added the first about page to a C project, and getting fonts working on the form was not only a multi-hour knock-down, drag-out fight due to multiple issues, I also ended up having to jury rig the final solution in exasperation due to time constraints. I wanted the C about pages to match the VB ones exactly, but font size numbers just wouldn’t conform between the VB GUI designer and Windows GDI (the Windows graphics API), so I just put in arbitrary font size numbers that matched visually instead of trying to find the right conversion process, as the documented font size conversion process was not yielding proper results. This is the main reason VB (and maybe .NET) are far superior in my book when dealing with GUIs (for ease of use at least, not necessarily ability and power). I know there are libraries out there that supposedly solve this problem, but I have not yet found one that I am completely happy with, which is why I had started my own fully fledged cross operating system GUI library a ways back, but it won’t be completed for a long time.
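For reference, the conversion documented for the LOGFONT structure is lfHeight = -MulDiv(PointSize, GetDeviceCaps(hDC, LOGPIXELSY), 72). The sketch below restates that arithmetic in portable C so the numbers can be checked without a device context. The function name and the DPI parameter are stand-ins of mine, and it says nothing about why the VB designer sizes did not line up.

```c
//Hypothetical stand-in for -MulDiv(PointSize, DPI, 72), rounding to the nearest whole pixel
//Only meant for non-negative PointSize and DPI values
int PointSizeToLogicalHeight(int PointSize, int DPI)
{
	return -((PointSize*DPI+36)/72); //Adding 36 (half of 72) rounds the division to nearest
}
```

At 96 DPI, a 10 point font works out to 13.33 pixels, so this returns -13. That rounding is one place where sizes can drift between two systems.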