Home Page
Archive > Posts > 2013
Archive > Posts > 2013

Warning: you do not have javascript enabled. This WILL cause layout glitches.

Results from my first high-load scalable system
Putting the cloud on a scale so it’s not so heavy

I’ve wanted to create and test a large-scale application for a very long time but have never really had the chance until recently. The Vintage Experience project I did earlier this year finally gave me the opportunity. As one of many parts of the project, I was tasked to create a voting system that could handle 1 million votes via a web page in a 30 second time span. The final system was deployed successfully without any problems for Gala Artis 2013 (a French Canadian artist/TV awards show). The following are the results of my implementation and testing.

The main front-end was done via a static HTML page (smart-phone optimized) that was hosted by Amazon S3, where handling 33k requests/second is a drop in the bucket. All voting requests were done via AJAX from this web page to backend servers hosted by Amazon EC2.

The backend was programmed in GoLang as a simple web server, optimized for memory and speed, which spawned a new goroutine for each incoming request. The request returned a message to the user stating either the error message, or success if the vote was added to the database. Each server held a single persistent MySQL connection to an Amazon RDS “Large DB Instance” with the minimum IOPS (1000). Votes from a server were sent to the database in batches once a second, or earlier if 10,000 votes had been received.

The servers were Amazon “M1 Standard Extra Large” (m1.xlarge) instances running Linux, of which there were 6 total, handling vote requests delegated by a round-robin DNS on Amazon’s Route 53. During stress testing, each server was found to be able to handle 6800 requests/second, and load was staying under 3, so there were was probably another bottle neck. Running the same tests using php(sapi)+apache(fork), only 4500 requests/second could be executed, and there was a 16+ load.

On the servers, I found it necessary to set the following sysctl setting to increase performance “net.core.somaxconn=1024”. The following commands need to be run to execute this:

sysctl 'net.core.somaxconn=1024' #Store for the current computer session
echo 'net.core.somaxconn=1024' >> /etc/sysctl.conf #Set after a reboot

Stress test client instances were also run on Amazon as m1.xlarge instances, and were found to be able to push 5000-6000 requests/second. The GoLang test clients spawned 200 requests at a time and waited for them to finish before sending the next batch. The client system needed the following sysctl settings for optimal performance:

net.ipv4.ip_local_port_range="15000 65534"
Installing PSLib for PHP
I was setting up a brand new server yesterday running Cent OS 6.4 and had the need to install PSLib (PostScript) for PHP.
The initial setup commands all worked as expected:
cd /src #A good directory to install stuff in. You may need to create it first

#Install intltool (required for pslib)
yum install intltool

#Install pslib
wget http://sourceforge.net/projects/pslib/files/latest/download?source=files
tar -xzvf pslib-*.tar.gz
cd pslib-*
make install
cd ..

#Install pslib wrapper for php
pecl download ps
tar -xzvf ps-*.tgz
cd ps-*

At this point, the make failed with
/src/ps-1.3.6/ps.c:58: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ps_functions’

After a little bit of browsing the code and a really lucky first guess, I found that changing the following fixed the problem:
In ps.c change: “function_entry ps_functions[] = {” to “zend_function_entry ps_functions[] = {

Then to finish the install, just run:
make install
and you’re done!
Using Twitter API v1.1

Twitter recently turned off their v1.0 API which broke a multitude of applications, many of which are still broken today. I had the need to immediately update at least one website to the new Twitter v1.1 API, but I could not find a simple bare bones example on the internet. Pretty much all the code/libraries out there I could find for the new API version were hundreds to thousands of lines long, or didn’t work >.<; . So anywho, here is the simple PHP code I put together to make API calls.

This code requires certain consumer/authentication keys and secrets. You can find how to generate them elsewhere online.

//OAuth Request parameters
$ConsumerKey='FILL ME IN';
$ConsumerSecret='FILL ME IN';
$AccessToken='FILL ME IN';
$AccessTokenSecret='FILL ME IN';

function EncodeParam($input) { return strtr(rawurlencode($input), Array('+'=>' ', '%7E'=>'~')); }
function SendTwitterRequest($RequestURL, $Params=Array())
	//Compile the OAuth parameters
	global $ConsumerKey, $ConsumerSecret, $AccessToken, $AccessTokenSecret;
		Array('oauth_version'=>'1.0', 'oauth_nonce'=>mt_rand(), 'oauth_timestamp'=>time(), 'oauth_consumer_key'=>$ConsumerKey, 'oauth_signature_method'=>'HMAC-SHA1'),
		isset($AccessToken) ? Array('oauth_token'=>$AccessToken) : Array()
	uksort($Params, 'strcmp'); //Must be sorted to determine the signature
	foreach($Params as $Key => &$Val) //Create the url encoded parameter list
	$Params=implode('&', $Params); //Combine the parameter list
	$Params.='&oauth_signature='.EncodeParam(base64_encode(hash_hmac('sha1', 'GET&'.EncodeParam($RequestURL).'&'.EncodeParam($Params), EncodeParam($ConsumerSecret).'&'.EncodeParam($AccessTokenSecret), TRUE)));

	//Do the OAuth request
		curl_setopt($CurlObj, $Key, $CurlVal);
	return $Result;

If you don’t have an AccessToken and AccessSecret yet, you can get them through the following code:
parse_str($OAuthResult, $OauthRet);
	throw new Exception("OAuth error: $OAuthResult");

Here is an example to pull the last 4 tweets from a user:
$Result=json_decode(SendTwitterRequest('https://api.twitter.com/1.1/statuses/user_timeline.json', Array('screen_name'=>$UserName, 'count'=>4)));
	throw new Exception($Result->{'errors'}[0]->{'message'});
foreach($Result as $Tweet)

print implode('<br>', $Tweets);
JavaScript Cookies Functions

Just throwing these up here for reference. Simple JavaScript scripts to get and set cookies. Not particularly foolproof, robust, or fully featured.

function SetCookie(Name, Value, SecondsToExpire) { document.cookie=Name+"="+escape(Value)+"; expires="+new Date(new Date().getTime()+SecondsToExpire*1000).toUTCString(); }
function GetCookie(Name)
	var Match=document.cookie.match(new RegExp('(?:^|; ?)'+Name+'=(.*?)(?:;|$)'));
	return Match ? unescape(Match[1]) : null;
Converting Videos to HTML5 Formats

The following is a Linux script I threw together to convert an mpeg4 (.mp4, .mpv, .mpeg4, .mpg) file into ogg vorbis (.ogg, .ogv) and flash video (.flv) maintaining the same bit rate. It also doesn’t hurt to have a .webm format for some older devices. This assumes you already have the appropriate packages installed, including ffmpeg. The script also wasn’t made to be foolproof, and might not be able to read the bitrate on some videos. The script takes a list of mpeg4 files to convert.

for filename in "$@"

  export BITRATE=`ffmpeg -i $filename.$extension 2>&1 | grep -oP 'Video.*\d+ kb/s' | grep -oP '\d+ kb/s' | grep -oP '\d+'`
  ffmpeg2theora -V $BITRATE -o $filename.ogv $filename.$extension
  ffmpeg -i $filename.$extension -ar 44100 -b ${BITRATE}k -f flv $filename.flv #Add an optional "-threads #" to make this faster
Font size hack for Mac Web Browsers

Mac renders fonts differently than windows, which is a problem when trying to make cross-system compatible web pages. I've read in many places that using "em" instead of pixels fixes this problem, but sometimes using pixel based measurements is required for a project.

I found when testing a recent project's web page out on Apple's OSX that no matter what browser I used, the fonts were always 2px bigger than on windows, which threw off my layouts. So I used the simple JavaScript solution below (jQuery required). Note that this assumes all font sizes are in pixels. It might not hurt to add a check for that if you mix font size types.

$(document).ready(function() {
		$.each(document.styleSheets, function(Indx, SS) {
			var rules=SS.cssRules || SS.rules;
			for(var i=0;i<rules.length;i++)
				if(rules[i].style && rules[i].style.fontSize!='')
					rules[i].style.fontSize=(parseInt(rules[i].style.fontSize, 10)-2)+'px';
Progress on Windows Logon Background Hacking
Since my last post, I’ve made some progress in my spare time on working out how the windows logon screen displays the background. I’ve gotten to try out a lot of new tools, since I haven’t done anything like this in over 10 years, and quite frankly, I’m mostly disappointed. All the tools I already had seem to do the job better than anything new I could find (though, granted, I need to learn WinDBG more). So at this point I can debug the dll, while it’s running live in Windows, via the following process:
  1. Run the Windows Terminal Services hack (WinXP version explanation) so you can have multiple desktops running at once on the virtual machine, which makes things a little easier, but is not necessary.
  2. Make a backup of C:\Windows\System32\LogonUI.exe and made it editable (see previous post for risks and further info).
  3. Add an INT3 interrupt breakpoint near the beginning of LogonUI.exe. I just changed the first conditional jump in the dll startup code that is supposed to fail to an INT3, padded with NOPs.
  4. Set OllyDbg as the JIT debugger, so whenever LoginUI.exe is run and hits the interrupt, it automatically spawns OllyDbg and is attached.
  5. Tell windows to lock itself (Start>Shut Down>Lock [if available] or Win+L).
  6. As soon as OllyDbg is spawned with LoginUI.exe attached, also attach winlogon.exe in another debugger and keep it paused so it doesn’t keep trying to respawn LoginUI.exe when your attached copy doesn’t respond.

It would be nice if I could find an [easy] way to make a spawned process automatically go into my debugger without the need to add an interrupt, especially to a remote debugger, but oh well.

So my plan of action after this is to:
  1. Get the handle or memory location of where the image is stored by monitoring the GDI calls made after the text reference to C:\Windows\System32\oobe\info\backgrounds\backgroundDefault.jpg .
  2. Put a hardware breakpoint on that memory location/handle and find where it is used to draw to the screen background.
  3. At that point, the GDI call could be manipulated to not shrink the image to the primary monitor (easy), or multiple GDI calls could be made to each monitor for all the images (much harder).
The image might actually be shrunk before it is stored in memory too, though from what I’ve gleamed from the disassembly so far, I do not believe this to be the case.

More to come if and when I make progress.
Encoding & decoding HTML in JavaScript with jQuery

Here are a few functions I’ve been finding a lot of use for lately. They are basically the JavaScript equivalent for PHP’s htmlentities and html_entity_decode. These functions are useful for inserting HTML dynamically, and getting values of contentEditable fields. These functions do replace line breaks appropriately, and HTML2Text removes a trailing line break.

var TextTransformer=$('<div></div>');
function Text2HTML(T) { return TextTransformer.text(T).html().replace(/\r?\n/g, '<br>'); }
function HTML2Text(T) { return TextTransformer.html(ReplaceBreaks(T, "\x01br\x01")).text().replace(/\x01br\x01/g, "\n").replace(/\n$/, ''); }
function ReplaceBreaks(TheHTML, ReplaceText) { return TheHTML.replace(/<\s*br\s*\/?\s*>/g, ReplaceText || ' - '); }
Overcoming the 250KB Windows Login Background Cap

I had the need this year to upgrade to a 6+1 monitor setup for some of the work I’ve been doing.

Home Office 1

Home Office 2

It took me a bit to get everything how I wanted, using Display Fusion for multi monitor control, and a customized version of Window Manager for organizing window positioning. I am very happy with the final result.

However, there was one minor annoyance I decided to tackle as a fun get-back-into-reverse-engineering project (it’s been years since I’ve done any real fun programming, which saddens me greatly). When in the lock/logon screen for Windows 7, only one monitor can show a background, and that background must be limited to a filesize of 250KB, which can greatly reduce the quality of the image.

The C:\Windows\System32\authui.dll controls the lock screen behavior, so it is to this file I looked to for the solutions. Before I go on, there are 2 very important notes I should make:

  1. It can be very dangerous to modify system DLLs. This could crash your operating system, or even make it not able to load! Always backup the files you are modifying first, and make sure you are comfortable with restoring them somehow (most likely using a separate operating system like a Linux Boot CD).
  2. You need to make sure you are actually editing the right file when you open it up. While the file you want will always be in c:\Windows\System32, on 64-bit windows machines there is also a directory at c:\Windows\SysWOW64 that contains a 32-bit version of the file. (Brilliant naming scheme Microsoft! 32 bit files in the “64” directory and vice versa). Depending on the software you are using, sometimes when you try to access the authui.dll in the System32 directory (~1.84MB), it actually modifies the file in SysWOW64 (~1.71MB) using obfuscated Windows magic.

After a little bit of playing, so far I’ve solved the 250KB size limitation, and I plan on continuing to tinker with it a bit more until the other is solved too. To start, you will need to give yourself file system access to modify the c:\Windows\System32\authui.dll file. To do so, go into the file’s property page, change the owner to yourself, and then give appropriate user permissions so you can modify it as you see fit.

Open the authui.dll in your favorite hex editor and replace:
41 B9 00 E8 03 00
41 B9 FF FF FF 00
this essentially changes the size cap to ~16MB. However, I haven’t tested anything larger than 280KB yet. There is possibly a size limitation somewhere that may be dangerous to breech, but from what I gleam from the code; I do not think this is the case.

What this change actually does is update the 256,000 value to (2^24)-1 in the following code:
jmp __imp_GetFileSize
41 B9 00 E8 03 00mov r9d, 3E800h
41 3B C1cmp eax, r9d
jnb short loc_xxx

It’s been a bit tedious working on the assembly code of the authui.dll, as my favorite disassembler/debugger (ollydbg) does not work with 64-bit files, and I am not very comfortable with other dissasemblers I have tried. :-\ Alas. Hopefully more coming soon on this topic.