RABiD BUNNY FEVER
K.T.K

NULL Pointer for C++
Extending a language for what it’s lacking

I’ve recently been frustrated by the fact that NULL in C++ is actually evaluated as integral 0 instead of a pointer with the value of 0. This problem can be seen in the following example:

class String
{
public: //Constructors must be public to be usable outside the class
	String(int i)  { /* ... */ } //Convert a number to a string
	String(char* i){ /* ... */ } //Copy a char* string directly into the class
};

String Foo(NULL); //Calls String(int), giving the string "Foo" the value "0", instead of calling String(char*) with a null pointer

The solution I came up with, which my good friend Will Erickson (aka Sarev0k) helped me revise, is as follows:

#undef NULL //If NULL is already defined, get rid of it
struct NULL_STRUCT { template <typename T> operator T*() { return (T*)0; } }; //NULL_STRUCT will return 0 to any pointer
static NULL_STRUCT NULL; //NULL is of type NULL_STRUCT and static (local to the current file)
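
To see the effect on overload resolution, here is a minimal sketch rather than the exact class from above. DemoString, NullPointer, kNull, and CheckOverloads are hypothetical names chosen for illustration (the object is not called NULL here so that it cannot collide with standard headers), and the conversion operator is marked const so that the object itself can be const.

```cpp
#include <cassert>

// Hypothetical stand-in for String: it only records which constructor ran.
struct DemoString
{
	enum Kind { FromInt, FromPointer };
	Kind kind;
	DemoString(int)         : kind(FromInt)     {} // Chosen for a literal 0
	DemoString(const char*) : kind(FromPointer) {} // Chosen for a real pointer
};

// Same idea as NULL_STRUCT: converts to a null pointer of any pointed-to type.
struct NullPointer
{
	template <typename T> operator T*() const { return static_cast<T*>(0); }
};
static const NullPointer kNull = {};

// Checks which constructor each spelling of "null" selects.
void CheckOverloads()
{
	assert(DemoString(0).kind     == DemoString::FromInt);     // 0 is an int first
	assert(DemoString(kNull).kind == DemoString::FromPointer); // kNull only converts to pointers
}
```

Calling CheckOverloads() from any test function should pass silently if the pointer overload is picked for kNull.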

After coming up with this way of doing it, I found out this concept is already a part of the new C++0x standard as nullptr, but since it is not really out yet, I still need a solution for the current C++ standard.


After getting this to work how I wanted it, I tested it out to make sure it is optimized correctly in compilers. When the compiler knows a value will be 0, it can apply lots of special assembly tricks.

Microsoft Visual C++ got it right by seeing that NULL was just 0 and applying the appropriate optimizations, but GCC missed an optimization step and did not carry the knowledge that the value was 0 all the way through its pipeline. To my knowledge, however, GCC isn’t exactly known for its optimization.


Example code:
BYTE* a=...; //Set a to an arbitrary value (best if brought in via an external method [i.e. stdin] so the compiler doesn’t make assumptions about the variable)
bool b=(a==NULL); //Set b to true if a is 0 (NULL)
What MSVC6 outputs (and what it should be after optimization):
test eax,eax	//logical and a against itself to determine if it is 0 or not
sete al		//Set the lowest byte of eax to 1 if a is 0
What GCC gives:
xor edx,edx	//Temporarily store 0 in edx for later comparison. This is itself a zero trick, but it is applied one step short of where it could be (the test above)
cmp edx,eax	//Compare a against edx (0)
sete al		//Set the lowest byte of eax to 1 if a equals the value in edx
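
For anyone wanting to repeat the comparison, a small function like the following could be compiled with optimizations on and its assembly inspected (for example with g++ -O2 -S). This is only a sketch: NullPointer, kNull, IsNull, and CheckIsNull are hypothetical names, and since the pointer arrives as a parameter of an externally visible function, the compiler cannot assume anything about its value.

```cpp
#include <cassert>

typedef unsigned char BYTE;

// Same idea as NULL_STRUCT, under a name that does not clash with NULL.
struct NullPointer
{
	template <typename T> operator T*() const { return static_cast<T*>(0); }
};
static const NullPointer kNull = {};

// Returns true if a is a null pointer. Ideally the body compiles down to
// a test/sete pair, as in the MSVC6 output.
bool IsNull(BYTE* a)
{
	return a == kNull;
}

// Quick sanity check that the comparison behaves like a comparison against 0.
void CheckIsNull()
{
	BYTE value = 0;
	assert(IsNull(0));
	assert(!IsNull(&value));
}
```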

On a side note, it has been quite painful going from using assembly in Microsoft Visual C++ to GCC for 2 reasons:
  • I hate AT&T (as opposed to Intel) assembly syntax. It is rather clunky to use, and every program I’ve ever used is in Intel syntax (including all the Intel reference documentation). I tried turning on Intel syntax through a flag when compiling through GCC, but it broke GCC. :-\
  • Having to list which assembly registers are modified/used in the extended assembly syntax. This interface is also very clunky and, I have found, prone to bugs and problems.

OllyDbg 2.0
Reverse engineering is fun! :-D

OllyDbg is my favorite assembly editing environment for reverse engineering applications in Windows. I used it for all of my Ragnarok Online projects in 2002, and you can find a tutorial that uses it here (sorry, the writing in it is horrible x.x; ).

Ever since I started using it back then, the author has been talking about his complete rewrite of the program, dubbed version 2.0, that was supposedly going to be much, much better. I have been patiently waiting for it ever since :-). Rather randomly, I decided to check back on the website yesterday, after not having visited it for over a year, and lo and behold, the first beta of version 2.0 [self-mirror] was released yesterday! :-D. Unfortunately, I’m not really doing any reverse engineering or assembly level work right now, so I have no reason or need to test it :-\.


... So yes, just wanted to call attention to this wonderful program being updated, that’s all for today!

C Jump Tables
The unfortunate reality of different feature sets in different language implementations

I was thinking earlier today how neat it would be for C/C++ to be able to get the address of a jump-to label for use in jump tables, specifically for an emulator. A few seconds into a Google search, I found out it is possible in gcc (the open source native Linux compiler) through the “label value operator” “&&”. I am crushed that MSVC doesn’t have native support for such a concept :-(.

The reason it would be great for an emulator is CPU emulation, where the first byte of each CPU instruction’s opcode [see ASM] usually determines what the instruction is supposed to do. The following example shows how a jump table is useful here:

void DoOpcode(int OpcodeNumber, ...)
{
	void *Opcodes[]={&&ADD, &&SUB, &&JUMP, &&MUL}; //assuming ADD=opcode 0 and so forth
	goto *Opcodes[OpcodeNumber];
	ADD:
		//...
	SUB:
		//...
	JUMP:
		//...
	MUL:
		//...
}
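
The skeleton above can be fleshed out into something that actually runs. The following is only a sketch under stated assumptions: the opcode numbering, the two-byte (opcode, operand) instruction format, and the names RunProgram and CheckRunProgram are all made up for illustration, and the code depends on the same gcc label value extension, so it will not build with MSVC.

```cpp
#include <cassert>

// Hypothetical opcode numbering, used only by this sketch.
enum { OP_ADD = 0, OP_SUB, OP_MUL, OP_HALT };

// Runs a program made of (opcode, operand) byte pairs against an accumulator
// and returns the accumulator when OP_HALT (or an unknown opcode) is reached.
int RunProgram(const unsigned char* code, int acc)
{
	static void* const Opcodes[] = { &&ADD, &&SUB, &&MUL, &&HALT }; //Label addresses, indexed by opcode
	void* target = 0;
	int operand = 0;

	NEXT:
		if (code[0] > OP_HALT) return acc; //Refuse opcodes outside the table
		target = Opcodes[code[0]];
		operand = code[1];
		code += 2; //Advance to the next instruction
		goto *target; //Jump straight to the handler
	ADD:
		acc += operand; goto NEXT;
	SUB:
		acc -= operand; goto NEXT;
	MUL:
		acc *= operand; goto NEXT;
	HALT:
		return acc;
}

// Runs ((0 + 5) * 3) - 4 and checks the result.
void CheckRunProgram()
{
	const unsigned char program[] = { OP_ADD, 5, OP_MUL, 3, OP_SUB, 4, OP_HALT, 0 };
	assert(RunProgram(program, 0) == 11);
}
```

All handlers share the local variables acc and operand, which is the advantage over separate functions.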

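Of course, this could still be done with virtual functions, function pointers, or a switch statement, but those are theoretically slower: virtual functions and function pointers add call overhead, and a switch typically adds a range check before its jump. Putting the handlers in separate functions would also prevent them from sharing local variables.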

Then again, in theory, function pointers combined with the _fastcall calling convention shouldn’t be too bad, and modern compilers SHOULD translate a switch into a jump table in a case like this. But modern compilers are so opaque that you never know what they are really doing.

It would probably be best to write such code so that all 3 methods (function pointers, switch statement, jump table) can be selected through preprocessor definitions, and then profile to find whichever method is fastest and supported.

//Define the switch for which type of opcode picker we want
#define UseSwitchStatement
//#define UseJumpTable
//#define UseFunctionPointers

//Defines for how each opcode picker acts
#if defined(UseSwitchStatement)
	#define OPCODE(o) case OP_##o:
#elif defined(UseJumpTable)
	#define OPCODE(o) o:
	#define GET_OPCODE(o) &&o
#elif defined(UseFunctionPointers)
	#define OPCODE(o) void Opcode_##o()
	#define GET_OPCODE(o) (void*)&Opcode_##o
	//The above GET_OPCODE is actually a problem since the opcode functions aren't listed until after their ...
	//address is requested, but there are a couple of ways around that I'm not going to worry about going into here.
#endif

enum {OP_ADD=0, OP_SUB}; //assuming ADD=opcode 0 and so forth
void DoOpcode(int OpcodeNumber, ...)
{
	#ifndef UseSwitchStatement //If using JumpTable or FunctionPointers we need an array of the opcode jump locations
		void *Opcodes[]={GET_OPCODE(ADD), GET_OPCODE(SUB)}; //assuming ADD=opcode 0 and so forth
	#endif
	#if defined(UseSwitchStatement)
		switch(OpcodeNumber) { //Normal switch statement
	#elif defined(UseJumpTable)
		goto *Opcodes[OpcodeNumber]; //Jump to the proper label
	#elif defined(UseFunctionPointers)
		((void(*)(void))Opcodes[OpcodeNumber])(); //Call the proper function
		} //End the current function
	#endif

	//For testing under "UseFunctionPointers" (see GET_OPCODE comment under "defined(UseFunctionPointers)")
	//put the following OPCODE sections directly above this "DoOpcode" function
	OPCODE(ADD)
	{
		//... (finish with "return;" so execution does not fall through into the next opcode)
	}
	OPCODE(SUB)
	{
		//...
	}

	#ifdef UseSwitchStatement //End the switch statement
	}
	#endif

#ifndef UseFunctionPointers //End the function
}
#endif
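
One of the ways around the ordering problem mentioned in the GET_OPCODE comment is to forward-declare the handlers before building the table. The sketch below shows that, next to an equivalent switch, in portable C++. CpuState, OpcodeHandler, OP_COUNT, the operand parameter, and the DoOpcodeVia... and CheckDispatchers names are assumptions made for this example, not part of the code above.

```cpp
#include <cassert>

struct CpuState { int acc; }; //Hypothetical emulator state for this sketch

enum { OP_ADD = 0, OP_SUB, OP_COUNT };

//Forward declarations let the table be built before the handlers are defined
void Opcode_ADD(CpuState& cpu, int operand);
void Opcode_SUB(CpuState& cpu, int operand);

typedef void (*OpcodeHandler)(CpuState&, int);
static const OpcodeHandler Opcodes[OP_COUNT] = { &Opcode_ADD, &Opcode_SUB };

//Function pointer dispatch: one indirect call per opcode
void DoOpcodeViaPointers(CpuState& cpu, int OpcodeNumber, int operand)
{
	if (OpcodeNumber >= 0 && OpcodeNumber < OP_COUNT)
		Opcodes[OpcodeNumber](cpu, operand);
}

//Switch dispatch: the compiler decides whether this becomes a jump table
void DoOpcodeViaSwitch(CpuState& cpu, int OpcodeNumber, int operand)
{
	switch (OpcodeNumber)
	{
		case OP_ADD: cpu.acc += operand; break;
		case OP_SUB: cpu.acc -= operand; break;
		default: break; //Unknown opcodes are ignored
	}
}

void Opcode_ADD(CpuState& cpu, int operand) { cpu.acc += operand; }
void Opcode_SUB(CpuState& cpu, int operand) { cpu.acc -= operand; }

//Both dispatchers should leave the state identical
void CheckDispatchers()
{
	CpuState a = { 10 }, b = { 10 };
	DoOpcodeViaPointers(a, OP_ADD, 5); DoOpcodeViaSwitch(b, OP_ADD, 5);
	assert(a.acc == 15 && b.acc == 15);
	DoOpcodeViaPointers(a, OP_SUB, 7); DoOpcodeViaSwitch(b, OP_SUB, 7);
	assert(a.acc == 8 && b.acc == 8);
}
```

Since both dispatchers share the same signature, a profiling harness could time each one over the same opcode stream.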

After some tinkering, I discovered that inline assembly makes it possible to retrieve the offset of a label in MSVC, so with some more tinkering it could be utilized, though it might be a bit messy.

void ExamplePointerRetrieval()
{
	void *LabelPointer;
	TheLabel:
	_asm mov LabelPointer, offset TheLabel
}