RABiD BUNNY FEVER
K.T.K


C Jump Tables
The unfortunate reality of different feature sets in different language implementations

I was thinking earlier today how neat it would be for C/C++ to be able to take the address of a goto label for use in jump tables, specifically for an emulator. A few seconds after running a Google query, I found out it is possible in gcc (the open source compiler native to Linux) through the “label value operator”, “&&”, which is a GNU extension rather than standard C. I am crushed that MSVC doesn’t have native support for such a concept :-(.

The reason it would be great for an emulator is in emulating the CPU, where the first byte of each instruction’s opcode [see ASM] usually determines what the instruction is supposed to do. An example to show the usefulness of a jump table is as follows:

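void DoOpcode(int OpcodeNumber, ...)
{
	//gcc only: && takes the address of a label. Assuming ADD=opcode 0 and so forth
	void *Opcodes[]={&&ADD, &&SUB, &&JUMP, &&MUL};
	goto *Opcodes[OpcodeNumber];
	ADD:
		//...
		return; //Without this, execution would fall through into SUB
	SUB:
		//...
		return;
	JUMP:
		//...
		return;
	MUL:
		//...
		return;
}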

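Of course, this could still be done with virtual functions, function pointers, or a switch statement, but those are theoretically slower: a function call adds call and return overhead, and a switch may add a range check or a chain of comparisons before the jump. Having the opcodes in separate functions would also remove the possibility of sharing local variables between them.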

Then again, theoretically, it wouldn’t be too bad to use function pointers with, I believe, the __fastcall calling convention, and modern compilers SHOULD translate switches to jump tables in an instance like this. But modern compilers are so obfuscated that you never know what they are really doing.

It would probably be best to try and code such an instance so that all 3 methods (function pointers, switch statement, jump table) could be utilized through compiler definitions, and then profile for whichever method is fastest and supported.

//Define the switch for which type of opcode picker we want
#define UseSwitchStatement
//#define UseJumpTable
//#define UseFunctionPointers

//Defines for how each opcode picker acts
#if defined(UseSwitchStatement)
	#define OPCODE(o) case OP_##o:
#elif defined(UseJumpTable)
	#define OPCODE(o) o:
	#define GET_OPCODE(o) &&o
#elif defined(UseFunctionPointers)
	#define OPCODE(o) void Opcode_##o()
	#define GET_OPCODE(o) (void*)&Opcode_##o
	//The above GET_OPCODE is actually a problem since the opcode functions aren't listed until after their ...
	//address is requested, but there are a couple of ways around that I'm not going to worry about going into here.
#endif

enum {OP_ADD=0, OP_SUB}; //assuming ADD=opcode 0 and so forth
void DoOpcode(int OpcodeNumber, ...)
{
	#ifndef UseSwitchStatement //If using JumpTable or FunctionPointers we need an array of the opcode jump locations
		void *Opcodes[]={GET_OPCODE(ADD), GET_OPCODE(SUB)}; //assuming ADD=opcode 0 and so forth
	#endif
	#if defined(UseSwitchStatement)
		switch(OpcodeNumber) { //Normal switch statement
	#elif defined(UseJumpTable)
		goto *Opcodes[OpcodeNumber]; //Jump to the proper label
	#elif defined(UseFunctionPointers)
		((void(*)(void))Opcodes[OpcodeNumber])(); //Call the proper function
		} //End the current function
	#endif

	//For testing under "UseFunctionPointers" (see GET_OPCODE comment under "defined(UseFunctionPointers)")
	//put the following OPCODE sections directly above this "DoOpcode" function
	OPCODE(ADD)
	{
		//...
		return; //Works in all 3 modes and stops fall through into the next opcode
	}
	OPCODE(SUB)
	{
		//...
		return;
	}

	#ifdef UseSwitchStatement //End the switch statement
	}
	#endif

#ifndef UseFunctionPointers //End the function
}
#endif
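
For comparison, here is a minimal self-contained sketch of the same idea written as a full dispatch loop. It is only an illustration under stated assumptions: the three opcode toy machine, the RunProgram name, and the DISPATCH macro are made up for this example and are not part of any real emulator. It uses label addresses when the compiler defines __GNUC__ (gcc and Clang) and falls back to a plain switch everywhere else.

```c
#include <assert.h>
#include <stddef.h>

//Hypothetical 3 instruction machine, for illustration only
enum { OP_INC = 0, OP_DEC, OP_HALT };

//Runs Program on a single accumulator and returns its final value
int RunProgram(const unsigned char *Program, size_t Length)
{
	int Accumulator = 0; //Local state shared by every opcode handler
	size_t PC = 0;       //Index of the next opcode to execute

#if defined(__GNUC__) //gcc and Clang: "labels as values" extension
	static void *const Opcodes[] = { &&Inc, &&Dec, &&Halt };
	#define DISPATCH() do { assert(PC < Length && Program[PC] <= OP_HALT); \
		goto *Opcodes[Program[PC++]]; } while (0)
	DISPATCH();
	Inc:  Accumulator++; DISPATCH();
	Dec:  Accumulator--; DISPATCH();
	Halt: return Accumulator;
	#undef DISPATCH
#else //Portable fallback: an ordinary switch inside a loop
	for (;;) {
		assert(PC < Length && Program[PC] <= OP_HALT);
		switch (Program[PC++]) {
			case OP_INC:  Accumulator++; break;
			case OP_DEC:  Accumulator--; break;
			case OP_HALT: return Accumulator;
		}
	}
#endif
}
```

Calling RunProgram on the bytes OP_INC, OP_INC, OP_INC, OP_DEC, OP_HALT leaves 2 in the accumulator under either build. Which variant is actually faster is something only profiling on the target compiler can answer.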

After some tinkering, I did discover that, through assembly insertion, it is possible to retrieve the offset of a label in MSVC. So with some more tinkering, it could be utilized, though it might be a bit messy.

void ExamplePointerRetrieval()
{
	void *LabelPointer;
	TheLabel:
	_asm mov LabelPointer, offset TheLabel //MSVC inline assembly (32-bit x86 builds only)
}

Inlining Executable Resources
Do you suffer from OPC (Obsessive Perfection Complex)? If not, you aren’t an engineer :-)

I am somewhat obsessive about file cleanliness, and like to have everything I do well organized with any superfluous files removed. This especially translates into my source code, and even more so for released source code.

Before I zip up the source code for any project, I always remove the extraneous workspace compilation files. These usually include:

  • C/C++: Debug & Release directories, *.ncb, *.plg, *.opt, and *.aps
  • VB: *.vbw
  • .NET: *.suo, *.vbproj.user

Unfortunately, a new offender surfaced in the form of the Hyrulean Productions icon and the signature file for about pages. I did not want every source release to have to include those 2 extra files, so I did research into inlining them in the resource script (.rc) file. Resources are just data compiled directly into an executable, and the resource script tells the resource compiler which resources there are and how to compile them in. All my C projects include a resource script for at least the file version, author information, and Hyrulean Productions icon. Anyways, this turned out to be way more of a pain in the butt than intended.


There are 2 ways to load “raw data” (not a standard format like an icon, bitmap, string table, version information, etc.) into a resource script. The first way is through loading an external file:

RESOURCEID RESOURCETYPE DISCARDABLE "ResourceFileName"

for example:

DAKSIG	SIG	DISCARDABLE	"Dakusan.sig"

RESOURCEID and RESOURCETYPE are arbitrary and user defined. It is usually best to write them in caps, as the compilers often seem to be picky about case.

The second way is through inlining the data:

RESOURCEID	RESOURCETYPE
BEGIN
	DATA
END

for example:

DakSig	Sig
BEGIN
	0x32DA,0x2ACF,0x0306,...
END

Getting the data in the right format for the resource script is a relatively simple task.
  • First, acquire the data as hexadecimal (base 16) text. I suggest WinHex for this job.
    On a side note, I have been using WinHex for ages and highly recommend it. It’s one of the most well built and fully featured application suites I know of.
  • Lastly, convert the straight hex data (“DA32CF2A0603...”) into an array of 16-bit hex values with each pair of bytes swapped into the proper endian order (“0x32DA,0x2ACF,0x0306...”). This can be done with a global replace regular expression of “(..)(..)” to “0x$2$1,”. I recommend Editpad Pro for this kind of work, another of my favorite pieces of software. As a matter of fact, I am writing this post right now in it :-).
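
The two steps above can also be sketched in C instead of a hex editor plus a regular expression. This is a minimal sketch, not the method used for this post: BytesToRcWords and its parameters are hypothetical names, padding a trailing odd byte with zero is an assumption, and the WordsPerLine parameter is only there to keep the generated lines short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

//Writes Data into Out as comma separated 16-bit words with each byte pair
//swapped ("0x32DA,0x2ACF,..."), starting a new line every WordsPerLine words.
//Returns the number of characters written, or 0 if Out is too small.
//Assumption of this sketch: a trailing odd byte is padded with a zero byte.
size_t BytesToRcWords(const unsigned char *Data, size_t Length,
                      char *Out, size_t OutSize, size_t WordsPerLine)
{
	size_t Used = 0, Words = 0;
	assert(Out != NULL && WordsPerLine > 0);
	if (OutSize == 0)
		return 0;
	Out[0] = '\0'; //Keep Out a valid string even when Length is 0
	for (size_t i = 0; i < Length; i += 2, Words++) {
		unsigned Low = Data[i];                             //First byte of the pair
		unsigned High = (i + 1 < Length) ? Data[i + 1] : 0; //Second byte, or padding
		const char *Separator = (Words == 0) ? ""
			: (Words % WordsPerLine == 0) ? ",\n" : ",";
		int Written = snprintf(Out + Used, OutSize - Used,
			"%s0x%02X%02X", Separator, High, Low);
		if (Written < 0 || (size_t)Written >= OutSize - Used)
			return 0; //Output would have been truncated
		Used += (size_t)Written;
	}
	return Used;
}
```

Feeding it the bytes DA 32 CF 2A 06 03 produces the same “0x32DA,0x2ACF,0x0306” shown above, minus the trailing comma the regular expression leaves behind.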

Here is where the real caveats and problems start falling into place. First, I noticed the resource data was corrupt for a few bytes at a certain location. It turned out to be Visual Studio wanting line lengths in the resource file to be less than ~4175 characters, so I just added a line break at that point.

This idea worked great for the about page signature, which needed to be raw data anyways, but encoding the icon this way turned out to be impossible :-(. Visual Studio apparently requires external files be loaded if you want to use a pre-defined binary resource type (ICON, BITMAP, etc). The simple solution would be to inline the icon as a user defined raw data type, but unfortunately, the Win32 icon loading API functions (LoadIcon, CreateIconFromResource, LoadImage, etc) only seemed to work with properly defined ICONs. I believe the problem here is that when the compiler loads in the icon to include in the executable, it reformats it somewhat, so I would need to know this format. Again, unfortunately, Win32 APIs failed me. FindResource/FindResourceEx wouldn’t let me load the data for ICON types for direct copying (or reverse engineering) :-(. At this point, it wouldn’t be worth my time to try and get the proper format just to inline my Hyrulean Productions icon into resource scripts. I may come back to it later if I’m ever really bored.


This unfortunately brings back a lot of bad old memories regarding Win32 APIs. A lot of the Windows system is really outdated, not nearly robust enough, or just way too obfuscated, and it has caused me, and still causes me, innumerable migraines trying to get things working with their system.

As an example, I just added the first about page to a C project, and getting fonts working on the form was not only a multi-hour knock-down, drag-out fight due to multiple issues, but I also ended up having to jury rig the final solution in exasperation due to time constraints. I wanted the C about pages to match the VB ones exactly, but font size numbers just wouldn’t conform between the VB GUI designer and Windows GDI (the Windows graphics API), so I just put in arbitrary font size numbers that matched visually instead of trying to find the right conversion process, as the documented font size conversion process was not yielding proper results. This is the main reason VB (and maybe .NET) are far superior in my book when dealing with GUIs (for ease of use at least, not necessarily ability and power). I know there are libraries out there that supposedly solve this problem, but I have not yet found one that I am completely happy with, which is why I had started my own fully fledged cross operating system GUI library a ways back, but it won’t be completed for a long time.
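
For reference, the documented conversion is presumably the one given for the lfHeight field of LOGFONT, -MulDiv(PointSize, GetDeviceCaps(hDC, LOGPIXELSY), 72). The following is only a portable sketch of that arithmetic under stated assumptions: PointSizeToLogFontHeight is a hypothetical name, the DPI is passed in rather than queried from a device context, and only positive inputs are handled.

```c
#include <assert.h>

//Portable stand-in for -MulDiv(PointSize, Dpi, 72): multiply, divide by 72
//(points per inch), round to the nearest integer, and negate.
//A negative height asks GDI to match the character height, not the cell height.
int PointSizeToLogFontHeight(int PointSize, int Dpi)
{
	assert(PointSize > 0 && Dpi > 0);
	long Scaled = (long)PointSize * Dpi;
	return -(int)((Scaled + 36) / 72); //Adding 36 (half of 72) rounds to nearest
}
```

At 96 DPI, a 10 point font comes out as -13 and an 8 point font as -11. The rounding alone shows how two designers can disagree by a pixel for the same point size.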