Home Page
Archive > Posts > Tags > Reverse Engineering
Archive > Posts > Tags > Reverse Engineering

Warning: you do not have javascript enabled. This WILL cause layout glitches.

Progress on Windows Logon Background Hacking
Since my last post, I’ve made some progress in my spare time on working out how the windows logon screen displays the background. I’ve gotten to try out a lot of new tools, since I haven’t done anything like this in over 10 years, and quite frankly, I’m mostly disappointed. All the tools I already had seem to do the job better than anything new I could find (though, granted, I need to learn WinDBG more). So at this point I can debug the dll, while it’s running live in Windows, via the following process:
  1. Run the Windows Terminal Services hack (WinXP version explanation) so you can have multiple desktops running at once on the virtual machine, which makes things a little easier, but is not necessary.
  2. Make a backup of C:\Windows\System32\LogonUI.exe and made it editable (see previous post for risks and further info).
  3. Add an INT3 interrupt breakpoint near the beginning of LogonUI.exe. I just changed the first conditional jump in the dll startup code that is supposed to fail to an INT3, padded with NOPs.
  4. Set OllyDbg as the JIT debugger, so whenever LoginUI.exe is run and hits the interrupt, it automatically spawns OllyDbg and is attached.
  5. Tell windows to lock itself (Start>Shut Down>Lock [if available] or Win+L).
  6. As soon as OllyDbg is spawned with LoginUI.exe attached, also attach winlogon.exe in another debugger and keep it paused so it doesn’t keep trying to respawn LoginUI.exe when your attached copy doesn’t respond.

It would be nice if I could find an [easy] way to make a spawned process automatically go into my debugger without the need to add an interrupt, especially to a remote debugger, but oh well.

So my plan of action after this is to:
  1. Get the handle or memory location of where the image is stored by monitoring the GDI calls made after the text reference to C:\Windows\System32\oobe\info\backgrounds\backgroundDefault.jpg .
  2. Put a hardware breakpoint on that memory location/handle and find where it is used to draw to the screen background.
  3. At that point, the GDI call could be manipulated to not shrink the image to the primary monitor (easy), or multiple GDI calls could be made to each monitor for all the images (much harder).
The image might actually be shrunk before it is stored in memory too, though from what I’ve gleamed from the disassembly so far, I do not believe this to be the case.

More to come if and when I make progress.
Overcoming the 250KB Windows Login Background Cap

I had the need this year to upgrade to a 6+1 monitor setup for some of the work I’ve been doing.

Home Office 1

Home Office 2

It took me a bit to get everything how I wanted, using Display Fusion for multi monitor control, and a customized version of Window Manager for organizing window positioning. I am very happy with the final result.

However, there was one minor annoyance I decided to tackle as a fun get-back-into-reverse-engineering project (it’s been years since I’ve done any real fun programming, which saddens me greatly). When in the lock/logon screen for Windows 7, only one monitor can show a background, and that background must be limited to a filesize of 250KB, which can greatly reduce the quality of the image.

The C:\Windows\System32\authui.dll controls the lock screen behavior, so it is to this file I looked to for the solutions. Before I go on, there are 2 very important notes I should make:

  1. It can be very dangerous to modify system DLLs. This could crash your operating system, or even make it not able to load! Always backup the files you are modifying first, and make sure you are comfortable with restoring them somehow (most likely using a separate operating system like a Linux Boot CD).
  2. You need to make sure you are actually editing the right file when you open it up. While the file you want will always be in c:\Windows\System32, on 64-bit windows machines there is also a directory at c:\Windows\SysWOW64 that contains a 32-bit version of the file. (Brilliant naming scheme Microsoft! 32 bit files in the “64” directory and vice versa). Depending on the software you are using, sometimes when you try to access the authui.dll in the System32 directory (~1.84MB), it actually modifies the file in SysWOW64 (~1.71MB) using obfuscated Windows magic.

After a little bit of playing, so far I’ve solved the 250KB size limitation, and I plan on continuing to tinker with it a bit more until the other is solved too. To start, you will need to give yourself file system access to modify the c:\Windows\System32\authui.dll file. To do so, go into the file’s property page, change the owner to yourself, and then give appropriate user permissions so you can modify it as you see fit.

Open the authui.dll in your favorite hex editor and replace:
41 B9 00 E8 03 00
41 B9 FF FF FF 00
this essentially changes the size cap to ~16MB. However, I haven’t tested anything larger than 280KB yet. There is possibly a size limitation somewhere that may be dangerous to breech, but from what I gleam from the code; I do not think this is the case.

What this change actually does is update the 256,000 value to (2^24)-1 in the following code:
jmp __imp_GetFileSize
41 B9 00 E8 03 00mov r9d, 3E800h
41 3B C1cmp eax, r9d
jnb short loc_xxx

It’s been a bit tedious working on the assembly code of the authui.dll, as my favorite disassembler/debugger (ollydbg) does not work with 64-bit files, and I am not very comfortable with other dissasemblers I have tried. :-\ Alas. Hopefully more coming soon on this topic.

Second Life Research
More old research I never got around to releasing

Back in May of 2007 one of my friends got me onto Second Life, the first and only MMORPG I’ve touch since my Ragnarok days. While Second Life had a strong pull for me due to its similarities to The MetaVerse in Snow Crash, my favorite book, I was of course more drawn to playing with the Engine and seeing what I could do with it.

I felt no real need to delve into the code or packet level of the client as it was open source, so I stayed mostly on the scripting level side of things in the world. IIRC I did find at least a dozen major security holes, but I unfortunately cannot seem to find logs of my research :-(.

I do however remember at least 2 of the security holes I found:
  • While an avatar could not pass through solid walls normally, if an object was visible that allowed “sitting” beyond the walls, the user could issue the sit command on that object which transported the avatar past the barriers.
  • While there were optional restrictions on areas pertaining to if/where an object could be placed, once an object was placed somewhere, it could be “pushed” to almost any other location no matter the restrictions. When an object was pushed into another area beyond where it was placed, it was still inventoried as being in the originally placed location, but could interact with the world at the location it was actually at. Objects could even pass through solid barriers if the proper push velocities were given. The only way at the time to combat this was to have whole private islands as blocking anonymous objects. This security hole opened up multiple other security holes including:
    • If a user “sat” on the object, they could get to anywhere the object could.
    • These objects could be used to interact with the immediate world around them, including repeating private conversations in a private area.

I had also at the time planned on writing an application that allowed hijacking and reuploading any encountered texture or construct, which was trivial due to the open nature of the system. I never did get around to it for two reasons. First, I got distracted by other projects, and second, because it could have seriously destabilized the Second Life economy, which was built around selling said textures and constructs. I actually liked what Second Life was trying to accomplish and had no wish of making Linden Lab’s life harder or ruining the experiment in open economy.

I was however able to find a few pieces of my research and scripts that I figured I could post here. First, I do not recall what I did to find this, but the entire list of pre-defined “Last Names” was accessible, and IIRC the proprietary last names could be used for character creation if you knew how to access them (not 100% sure if this latter hack was available). Here was the list as of when I acquired it in 2007. I had the list separated into two columns, and I think they were “open” names and “proprietary” names. Each name is followed by its identifier.

Open Names
Congrejo(339), Spitteler(957), Boucher(1716), Kohime(2315), Korobase(2363), Bingyi(3983), Hyun(3994), Qunhua(4003), Yiyuan(4010), Nikolaidis(4032), Bikcin(4040), Laryukov(4112), Bamaisin(4127), Choche(4136), Ultsch(4140), Coage(4164), Cioc(4173), Barthelmess(4212), Koenkamp(4322), Daviau(4340), Menges(4345), Beaumont(4390), Lubitsch(4392), Taurog(4408), Negulesco(4418), Beresford(4466), Babenco(4468), Catteneo(4483), Dagostino(4509), Ihnen(4511), Basevi(4517), Gausman(4530), Heron(4533), Fegte(4535), Huldschinsky(4539), Juran(4543), Furse(4548), Heckroth(4550), Perfferle(4552), Reifsnider(4553), Hotaling(4559), DeCuir(4560), Carfagno(4561), Mielziner(4573), Bechir(4592), Zehetbauer(4615), Roelofs(4624), Hienrichs(4647), Rau(4654), Oppewall(4657), Bonetto(4659), Forwzy(4677), Repine(4680), Fimicoloud(4685), Bleac(4687), Anatine(4688), Gynoid(4745), Recreant(4748), Hapmouche(4749), Ceawlin(4758), Balut(4760), Peccable(4768), Barzane(4778), Eilde(4783), Whitfield(4806), Carter(4807), Vuckovic(4808), Rehula(4809), Docherty(4810), Riederer(4811), McMahon(4812), Messmer(4813), Allen(4814), Harrop(4815), Lilliehook(4816), Asbrink(4817), Laval(4818), Dyrssen(4819), Runo(4820), Uggla(4822), Mayo(4823), Handrick(4824), Grut(4825), Szondi(4826), Mannonen(4827), Korhonen(4828), Beck(4829), Nagy(4830), Nemeth(4831), Torok(4832), Mokeev(4833), Lednev(4834), Balczo(4835), Starostin(4836), Masala(4837), Rasmuson(4838), Martinek(4839), Mizser(4840), Zenovka(4841), Dovgal(4842), Capalini(4843), Kuhn(4845), Platthy(4846), Uriza(4847), Cortes(4848), Nishi(4849), Rang(4850), Schridde(4851), Dinzeo(4852), Winkler(4853), Broome(4854), Coakes(4855), Fargis(4856), Beerbaum(4857), Pessoa(4858), Mathy(4859), Robbiani(4860), Raymaker(4861), Voom(4862), Kappler(4863), Katscher(4864), Villota(4865), Etchegaray(4866), Waydelich(4867), Johin(4868), Blachere(4869), Despres(4871), Sautereau(4872), Miles(4873), Lytton(4874), Biedermann(4875), Noel(4876), Pennell(4877), Cazalet(4878), Sands(4879), Tatham(4880), Aabye(4881), Soderstrom(4882), Straaf(4883), Collas(4884), Roffo(4885), Sicling(4886), Flanagan(4887), Seiling(4888), Upshaw(4889), Rodenberger(4890), Habercom(4891), Kungler(4892), Theas(4893), Fride(4894), Hirons(4895), Shepherd(4896), Humphreys(4897), Mills(4898), Ireton(4899), Meriman(4900), Philbin(4901), Kidd(4902), Swindlehurst(4903), Lowey(4904), Foden(4905), Greggan(4906), Tammas(4907), Slade(4908), Munro(4909), Ebbage(4910), Homewood(4911), Chaffe(4912), Woodget(4913), Edman(4914), Fredriksson(4915), Larsson(4916), Gustafson(4917), Hynes(4918), Canning(4919), Loon(4920), Bekkers(4921), Ducatillon(4923), Maertens(4924), Piek(4925), Pintens(4926), Jansma(4927), Sewell(4928), Wuyts(4929), Hoorenbeek(4930), Broek(4931), Jacobus(4932), Streeter(4933), Babii(4934), Yifu(4935), Carlberg(4936), Palen(4937), Lane(4938), Bracken(4939), Bailey(4940), Morigi(4941), Hax(4942), Oyen(4943), Takacs(4944), Saenz(4945), Lundquist(4946), Tripsa(4947), Zabelin(4948), McMillan(4950), Rosca(4951), Zapedzki(4952), Falta(4953), Wiefel(4954), Ferraris(4955), Klaar(4956), Kamachi(4957), Schumann(4958), Milev(4959), Paine(4960), Staheli(4961), Decosta(4962), Schnyder(4963), Umarov(4964), Pinion(4965), Yoshikawa(4966), Mertel(4967), Iuga(4968), Vollmar(4969), Dollinger(4970), Hifeng(4971), Oh(4972), Tenk(4973), Snook(4974), Hultcrantz(4975), Barbosa(4976), Heberle(4977), Dagger(4978), Amat(4979), Jie(4980), Qinan(4981), Yalin(4982), Humby(4983), Carnell(4984), Burt(4985), Hird(4986), Lisle(4987), Huet(4988), Ronmark(4989), Sirbu(4990), Tomsen(4991), Karas(4992), Enoch(4993), Boa(4994), Stenvaag(4995), Bury(4996), Auer(4997), Etzel(4998), Klees(4999), Emmons(5000), Lusch(5001), Martynov(5002), Rotaru(5003), Ballinger(5004), Forcella(5005), Kohnke(5006), Kurka(5007), Writer(5008), Debevec(5009), Hirvi(5010), Planer(5011), Koba(5012), Helgerud(5013), Papp(5014), Melnik(5015), Hammerer(5016), Guyot(5017), Clary(5018), Ewing(5019), Beattie(5020), Merlin(5021), Halasy(5022), Rossini(5024), Halderman(5025), Watanabe(5026), Bade(5027), Vella(5028), Garrigus(5029), Faulds(5030), Pera(5031), Bing(5032), Singh(5033), Maktoum(5034), Petrov(5035), Panacek(5036), Dryke(5037), Shan(5038), Giha(5039), Graves(5040), Benelli(5041), Jun(5042), Ling(5043), Janus(5044), Gazov(5045), Pfeffer(5046), Lykin(5047), Forder(5048), Dench(5049), Hykova(5050), Gufler(5051), Binder(5052), Shilova(5053), Jewell(5054), Sperber(5055), Meili(5056), Matova(5057), Holmer(5058), Balogh(5059), Rhode(5060), Igaly(5061), Demina(5062)

Proprietary Names
ACS(1353), FairChang(1512), Teacher(2186), Learner(2213), Maestro(2214), Aprendiz(2215), Millionsofus(2746), Playahead(2833), RiversRunRed(2834), SunMicrosystems(2836), Carr(2917), Dell(3167), Reuters(3168), Hollywood(3173), Sheep(3471), YouTopia(3816), Hillburn(3817), Bradford(3820), CiscoSystems(3958), PhilipsDesign(3959), MadeVirtual(4205), DuranDuran(4210), eBay(4665), Vodafone(4666), Xerox(4667), TGDev(4668), Modesto(4669), Sensei(4670), Ideator(4671), Autodesk(4789), MovieTickets(4790), AvaStar(4791), DiorJoaillerie(4793), AOL(4795), Gabriel(4805), Tequila(5064), Loken(5065), Matlin(5066), GeekSquad(5067), Bradesco(5068), CredicardCiti(5069), PontiacGXP(5070), KAIZEN(5071), McCain(5072), Schomer(5074), Showtime(5075), OzIslander(5076), Meltingdots(5077), Allanson(5083), Sunbelter(5084), SaxoBank(5085), Esslinger(5086), Stengel(5087), Lemeur(5088), Tsujimoto(5089), KaizenGames(5090), Uphantis(5091), OurVirtualHolland(5092), McKinseyandCompany(5093), Lempert(5094), Affuso(5095), Gkguest(5096), Eye4You(5097), OShea(5098), Citibank(5099), Citicard(5100), Citigroup(5101), Citi(5102), Credicard(5103), Diners(5104), Citifinancial(5105), CitiBusiness(5106), BnT(5107), Yensid(5108), Helnwein(5111), Grindstaff(5112), Shirk(5113), SolidWorks(5114), Storm(5115), CarterFinancial(5116), Parkinson(5117), Lear(5118), FiatBrasil(5119), RossiResidencial(5120), Brooklintolive(5121), Calmund(5123), Briegel(5124), Herde(5125), Pfetzing(5126), Triebel(5127), Roemer(5128), Reacher(5129), Thomas(5130), Fraser(5131), Gabaldon(5132), NBA(5133), Accubee(5134), Brindle(5135), Searer(5136), Ukrop(5137), Ponticelli(5138), Belcastro(5139), Glin(5140), Rice(5141), DavidStern(5142), Totti(5144), onrez(5145), DeAnda(5146), Grandi(5147), Pianist(5148), osMoz(5149), PaulGee(5150)

The second piece I was able to find was a script I used to alert me via email whenever one of my friends signed on. I have unfortunately not tested this script before posting it as I no longer have Second Life installed or wish to waste the time testing it, but here it is none the less. ^_^;

//Users to watch
key DetectPersons=[ //List of UIDs of users to watch. (Real UIDs redacted)
    "fdf1fbff-f19f-ffff-ffff-ffffffffffff", //Person 1
    "f0fffaff-f61f-ffff-ffff-ffffffffffff" //Person 2

//Other Global Variables
integer NumUsers;
integer UsersParsed=0;
list UserNames;
list Status;

        NumUsers=llGetListLength(DetectPersons); //Number of users to watch

        //Get User Names
        integer i;
            llListInsertList(UserNames, [''], i);
            llListInsertList(Status, [0], i);
            llRequestAgentData(llList2Key(DetectPersons, i), DATA_NAME);

    dataserver(key requested, string data)
        //Find User Position
        integer i;
            if(llList2Key(DetectPersons, i)==requested)
                llListReplaceList(UserNames, [data], i, 1);

            state Running;

state Running

        llRequestAgentData(DetectPerson, DATA_ONLINE);

    dataserver(key requested, string data)
        string Message="The user you are watching '"+UserName+"' signed on at "+llGetTimestamp();
        llEmail(EMAIL_ADDRESS, "User Signed on", Message);

Of course all this research was from 2007 and I have no idea what is capable now. I do really hope though that they at least updated the client’s interface because it was incredibly clunky. Also, Second Life has always been a neat experiment, and I hope it still is and continues to keep doing well :-).

OllyDbg 2.0
Reverse engineering is fun! :-D

OllyDbg is my favorite assembly editing environment for reverse engineering applications in Windows. I used it for all of my Ragnarok Online projects in 2002, and you can find a tutorial that uses it here (sorry, the writing in it is horrible x.x; ).

Ever since I started using it back then, the author was talking about his complete rewrite of the program, dubbed version 2.0, that was supposedly going to be much, much better. I have been patiently waiting for it ever since :-). Rather randomly, I decided to check back on the website yesterday, after not having visiting there for over a year, and low and behold, the first beta of version 2.0 [self-mirror] was released yesterday! :-D. Unfortunately, I’m not really doing any reverse engineering or assembly level work right now, so I have no reason or need to test it :-\.

... So yes, just wanted to call attention to this wonderful program being updated, that’s all for today!

Client Side Security Fallacies
Never rely solely on information you receive from untrusted sources

One of the most laughable aspects of client/server* systems is client side based security access restrictions. What I mean by this is when credentials and actions are not checked and restricted on the server side of the equation, only on the client side, which can ALWAYS be bypassed.

To briefly explain why it is basically insane to trust a client computer; ANY multimedia, software, data, etc that has touched a person’s computer is essentially now their property. Once something has been on or through a person’s computer, the user can make copies, modify it, and do whatever the heck they want with it. This is how the digital world works. There are ways to help stop copying and modification, like hashes and encryption, but most of the ways in which things are implemented nowadays are quite fallible. There may be, for example, safeguards in place to only allow a user to use a piece of software on one certain computer or for a certain amount of time (DRM [Digital Rights Management]), but these methods are ALWAYS bypassable. The only true security comes by not letting information which people aren’t supposed to have access to cross through their computer, and keeping track of all verifiable factual information on secure servers. A long time ago at an IGDA [International Game Developers Association] meeting (I only ever went to the one unfortunately :-\), I learned an interesting truth that hadn’t occurred to me before from the lecturer. That is, that companies that make games and other software [usually] know it will sooner or later be pirated/cracked**. The true intention of software DRM is to make it hard enough to crack to discourage the crackers into giving up, and to make it take long enough so that hopefully people stop waiting for a free copy and go ahead and buy it. By the time a piece of software is cracked (if it takes as long as they hope), the companies know the majority of the remainder of the people usually wouldn’t have bought it anyways. Now I’m done with the basic explanation of client side insecurities, back to the real reason for this post.

While it is actually proper to program safeguards into client side software, you can never rely on it for true security. Security measures should always be duplicated in both client and server software. There are two reasons off the top of my head for implementing security access restrictions into the client side of software. The first is to help remove strain on servers. There is no point in asking a server if something is valid when the client can immediately confirm that it isn’t. The second reason is for speed. It’s MUCH quicker if a client can detect a problem and instantly inform the user than having to wait for a server to answer, though this time is usually imperceptible to the user, it can really add up.

So I thought I’d give a couple of examples of this to help you understand more where I’m coming from. This is a very big problem in the software industry. I find exploitable instances of this kind of thing on a very regular basis. However, I generally don’t take advantage of such holes, and try to inform the companies/programmers if they’ll listen. The term for this is white hat hacking, as opposed to black hat.

First, a very basic example. Let’s say you have a folder on your website “/PersonalPictures” that you wanted to restrict access to with a password. The proper way to do it would be to restrict access to the whole folder and all files in it on the server side, requiring a password be sent to the server to view the contents of each file. This is normally done through Apache httpd (the most utilized web server software) with an “.htaccess” file and the mod_auth (authentication) module. The improper way to do it would be a page that forwarded to the “hidden” section with a JavaScript script like the following.

if(prompt('Please enter the password')=='SecretPassword')

The problem with this code is two fold (besides the fact it pops up a request window :-) ). First, the password is exposed in plain text to the user. Fortunately, passwords are usually not as easy to find as this, but I have found passwords in web pages and Flash code before with some digging (yes, Flash files (and Java!) are 100% decompilable to their original source code, sans comments). The second problem is that once the person goes to the URL “/PersonalPictures”, they can get back there and to all files inside it without the password, and also give it freely to others (no need to mention the fact that the URL is written in plain text here, as it’s the same as with the password). This specific problem with JavaScript was much more prevalent in the old day when people ran their web pages through free hosting sites like Geocities (now owned and operated by Yahoo) which didn’t allow for proper password protection.

This kind of problem is still around on the web, though it morphed with the times into a new form. Many server side scripts I have found across the Internet assume their client side web pages can take care of security and ignore the necessary checks in the server scripts. For example, very recently I was on a website that only allowed me to add a few items to a list. The way it was done is that there was a form with a textbox that you submitted every time you wanted to add an entry to the list. After submitting, the page was reloaded with the updated list. After you added the maximum allowed number of items to the list, when the page refreshed, the form to add more was gone. This is incredibly easy to bypass however. The normal way to do this would be to just send the modified packets directly to the server with whatever information you want in it. The easier method would be to make your own form submission page and just submit to the proper URL all you want. The Firebug extension for Firefox however makes this kind of thing INCREDIBLY easy. All that needs to be done is to add an attribute to the form to send the requests to a new window “<form action=... method=... target=_blank>”, so the form is never erased/overwritten and you can keep sending requests all you want. Using Firebug, you can also edit the values of hidden input boxes for this kind of thing.

AJAX (Asynchronous JavaScript and XML - A tool used in web programming to send and receive data from a server without having to refresh a page) has often been lampooned as insecure for this kind of reason. In reality, the medium itself is not insecure at all; it’s just how people use it.

As a matter of fact, the majority of my best and most fun Ragnarok hacking was done with these methods. I just monitored the packets that came in and out of the system, reverse engineered how they were all structured, then made modifications and resent them myself to see what I could do. With this, I was able to do things like (These should be most of the exploits; listed in descending order of usefulness & severity):

  • Duplicate items
  • Crash the server (It was never fixed AFAIK, but I stopped playing 5+ years ago. I just put that it was fixed on my site so people wouldn’t look for it ^_^; )
  • Warp to any map from any warp location (warp locations are only supposed to link to 1 other map)
  • Spoof your name during chats (so you could pretend someone else was saying something - Ender’s game, anyone? ^_^)
  • Use certain skills of other classes (I have up pictures of my swordsman using merchant skills to house a selling shop)
  • Add skills points to an item on your skill tree that is not yet available (and use it immediately)
  • Warp back to save point without dying
  • Talk to NPCs on a map from any location on that map, and sometimes from other maps (great for selling items when in a dungeon)
  • Attack with weapons much quicker than was supposed to be allowed
  • Use certain skills on creatures from any location on a map no matter how far they are
  • Equip any item in any spot (so you could equip body armor on your head slot and get much more free armor defense points)
  • Run commands on your party/guild and in chat rooms as if you were the leader/admin
  • Rollback a characters stat’s to when you logged on that session (part of the dupe hack)
  • Bypass text repetition, length, and curse filters
  • Find out user account names

The original list is here; it should contain most of what I found. I took it down very soon after putting it up (replacement here) because I didn’t want to explicitly screw the game over with people finding out about these hacks (I had a lot of bad encounters with the company that ran the game, they refused to acknowledge or fix existing bugs when I reported them). There were so many things the server didn’t check just because the client wasn’t allowed to do them naturally.

Here are some very old news stories I saved up for when I wrote about this subject:

Just because you don’t give someone a way to do something doesn’t mean they won’t find a way.

*A server is a computer you connect to and a client is the connecting computer. So all you people connecting to this website are clients connecting to my web server.
**“Cracked” usually means to make a piece of software usable when it is not supposed to be, bypassing the DRM
FoxPro Table Memo Corruption
Data integrity loss is such a drag :-(

My father’s optometric practice has been using an old DOS database called “Eyecare” since the (I believe) early 80’s. For many years, he has been programming a new, very customized, database up from scratch in Microsoft Access which is backwards compatible with “Eyecare”, which uses a minor variant of FoxPro databases. I’ve been helping him with minor things on it for a number of years, and more recently I’ve been giving a lot more help in getting it secured and migrated from Microsoft Access databases (.mdb) into MySQL.

A recent problem cropped up in that one of the primary tables started crashing Microsoft Access when it was opened (through a FoxPro ODBC driver). Through some tinkering, he discovered that the memo file (.fpt) for the table was corrupted, as trying to view any memo fields is what crashed Access. He asked me to see if I could help in recovering the file, which fortunately I can do at my leisure, as he keeps paper backups of everything for just such circumstances. He keeps daily backups of everything too… but for some reason that’s not an option.

I went about trying to recover it through the easiest means first, namely, trying to open and export the database through FoxPro, which only recovered 187 of the ~9000 memo records. Next, I tried finding a utility online that did the job, and the first one I found that I thought should work was called “FoxFix”, but it failed miserably. There are a number of other Shareware utilities I could try, but I decided to just see how hard it would be to fix myself first.

I opened the memo file up in a HEX editor, and after some very quick perusing and calculations, it was quite easy to determine the format:

So I continued on the path of seeing what I could do to fix the file.
  • First, I had it jump to the header of each record and just get the record data length, and I very quickly found multiple invalid record lengths.
  • Next, I had it attempt to fix each of these by determining the real length of the memo by searching for the first null terminator (“\0”) character, but I quickly discovered an oddity. There are weird sections in many of the memo fields in the format BYTE{0,0,0,1,0,0,0,1,x}, which is 2 little endian DWORDS which equal 1, and a final byte character (usually 0).
  • I added to the algorithm to include these as part of a memo record, and many more original memo lengths then agreed with my calculated memo lengths.
  • The final thing I did was determine how many invalid (non keyboard) characters there were in the memo data fields. There were ~3500 0x8D characters, which were usually always followed by 0xA, so I assume these were supposed to be line breaks (Windows line breaks are denoted by [0xD/new line/\r],[0xA/carriage return/\n]). There were only 5 other invalid characters, so I just changed these to question marks ‘?’.

Unfortunately, Microsoft Access still crashed when I tried to access the comments fields, so I will next try to just recover the data, tie it to its primary keys (which I will need to determine through the table file [.dbf]), and then rebuild the table. I should be making another post when I get around to doing this.

The following code which “fixes” the table’s memo file took about 2 hours to code up.
//Usually included in windows.h
typedef unsigned long DWORD;
typedef unsigned char BYTE;

#include <iostream.h> //cout
#include <stdio.h> //file io
#include <conio.h> //getch
#include <ctype.h> //isprint

//Memo file structure
#pragma warning(disable: 4200) //Remove zero-sized array warning
const MemoFileHeadLength=512;
const RecordBlockLength=32; //This is actually found in the header at (WORD*)(Start+6)
struct MemoRecord //Full structure must be padded at end with \0 to RecordBlockLength
	DWORD Type; //Type in little endian, 1=Memo
	DWORD Length; //Length in little endian
	BYTE Data[0];
#pragma warning(default: 4200)

//Input and output files
const char *InFile="EXAM.Fpt.old", *OutFile="EXAM.Fpt";

//Assembly functions
__forceinline DWORD BSWAP(DWORD n) //Swaps endianness
	_asm mov eax,n
	_asm bswap eax
	_asm mov n, eax
	return n;

//Main function
void main()
	//Read in file
	const FileSize=6966592; //This should actually be found when the file is opened...
	FILE* MyFile=fopen(InFile, "rb");
	BYTE *MyData=new BYTE[FileSize];
	fread(MyData, FileSize, 1, MyFile);

	//Start checking file integrity
	DWORD FilePosition=MemoFileHeadLength; //Where we currently are in the file
	DWORD RecordNum=0, BadRecords=0, BadBreaks=0, BadChars=0; //Data Counters
	const DWORD OneInLE=0x01000000; //One in little endian
	while(FilePosition<FileSize) //Loop until EOF
		FilePosition+=sizeof(((MemoRecord*)NULL)->Type); //Advanced passed record type (1=memo)
		DWORD CurRecordLength=BSWAP(*(DWORD*)(MyData+FilePosition)); //Pull in little endian record size
		cout << "Record #" << RecordNum++ << " reports " << CurRecordLength << " characters long. (Starts at offset " << FilePosition << ")" << endl; //Output record information

		//Determine actual record length
		FilePosition+=sizeof(((MemoRecord*)NULL)->Length); //Advanced passed record length
		DWORD RealRecordLength=0; //Actual record length
			for(;MyData[FilePosition+RealRecordLength]!=0 && FilePosition+RealRecordLength<FileSize;RealRecordLength++) //Loop until \0 is encountered
#if 1 //**Check for valid characters might not be needed
				if(!isprint(MyData[FilePosition+RealRecordLength])) //Makes sure all characters are valid
					if(MyData[FilePosition+RealRecordLength]==0x8D) //**0x8D maybe should be in ValidCharacters string? - If 0x8D is encountered, replace with 0xD
					else //Otherwise, replace with a "?"

			//Check for inner record memo - I'm not really sure why these are here as they don't really fit into the proper memo record format.... Format is DWORD(1), DWORD(1), BYTE(0)
			if(((MemoRecord*)(MyData+FilePosition+RealRecordLength))->Type==OneInLE && ((MemoRecord*)(MyData+FilePosition+RealRecordLength))->Length==OneInLE /*&& ((MemoRecord*)(MyData+FilePosition+RealRecordLength))->Data[0]==0*/) //**The last byte seems to be able to be anything, so I removed its check
			{ //If inner record memo, current memo must continue
				((MemoRecord*)(MyData+FilePosition+RealRecordLength))->Data[0]=0; //**This might need to be taken out - Force last byte back to 0
			else //Otherwise, current memo is finished
		if(RealRecordLength!=CurRecordLength) //If given length != found length
			//Tell the user a bad record was found
			cout << "   Real Length=" << RealRecordLength << endl;

			//Update little endian bad record length

		//Move to next record - Each record, including RecordLength is padded to RecordBlockLength
		DWORD RealRecordSize=sizeof(MemoRecord)+CurRecordLength;
		FilePosition+=CurRecordLength+(RealRecordSize%RecordBlockLength==0 ? 0 : RecordBlockLength-RealRecordSize%RecordBlockLength);

	//Tell the user file statistics
	cout << "Total bad records=" << BadRecords << endl << "Total bad breaks=" << BadBreaks << endl << "Total bad chars=" << BadChars << endl;

	//Output fixed data to new file
	MyFile=fopen(OutFile, "wb");
	fwrite(MyData, FileSize, 1, MyFile);

	//Cleanup and wait for user keystroke to end
	delete[] MyData;