Creating small executables with Qt Creator and MinGW

Starting with an empty Plain C Project in the Qt Creator IDE and gcc from MinGW as the compiler, I will show you how to generate small binaries that do not depend on the MinGW DLLs. At the time of writing I used Qt Creator 2.6.2 with Qt 5.0.1 and gcc 4.7.2.

Replace the code in main.c with the following:

#include <windows.h>

void main()
{
  MessageBoxA(0, "Hello", "world!", MB_OK);
}

Switch the build configuration in the Projects menu from Debug to Release and compile the project (Ctrl + B is the default shortcut for this). This results in an exe file of 9,728 bytes. The file content, split into parts, looks like this:

This is the portable executable header, containing 8 section definitions (.text, .data, .rdata, .eh_fram, .bss, .idata, .CRT, .tls).

The code section is nearly twice as large as the one generated by the Visual Studio compiler and of course way more than we want inside our exe for calling 2 API functions.

In the import section you can see that it includes not only the Microsoft Visual C Run-Time Library but also the MinGW runtime.

Let's pimp the project to remove all the stuff that we did not want included in the exe file:

  • Open the .pro file of your project and replace its content with the following:
CONFIG -= qt
SOURCES += main.c
QMAKE_CFLAGS += -fno-asynchronous-unwind-tables
QMAKE_LFLAGS += -static-libgcc -nostdlib
LIBS = -lkernel32 -lmsvcrt -luser32
  • If you try to build the project now, you will get a linker error about an undefined reference to `__main'. To fix this issue we just rename the main function in our C file:
#include <windows.h>

void __main()
{
  MessageBoxA(0, "Hello", "world!", MB_OK);
}

Recompile the project and your compiled exe will have a size of 2,048 bytes and look so fresh and so clean like this:

Creating the smallest possible Windows executable using assembly language

Using nasm, we can build the smallest possible native exe file (without using a packer, dropper or anything like that) that will work on all Windows versions. This is what one possible solution binary looks like:

The code for this little cutie:

IMAGEBASE equ 400000h

bits 32
org IMAGEBASE

; IMAGE_DOS_HEADER
  dw "MZ"                       ; e_magic
  dw 0                          ; e_cblp

; IMAGE_NT_HEADERS - lowest possible start is at 0x4
  dw 'PE',0                     ; Signature

; IMAGE_FILE_HEADER
  dw 0x14c                      ; Machine = IMAGE_FILE_MACHINE_I386
  dw 0                          ; NumberOfSections
user32.dll:
  dd 'user'                     ; TimeDateStamp
  db '32',0,0                   ; PointerToSymbolTable
  dd 0                          ; NumberOfSymbols
  dw 0                          ; SizeOfOptionalHeader
  dw 2                          ; Characteristics = IMAGE_FILE_EXECUTABLE_IMAGE

; IMAGE_OPTIONAL_HEADER32
  dw 0x10B                      ; Magic = IMAGE_NT_OPTIONAL_HDR32_MAGIC
kernel32.dll:
  db 'k'                        ; MajorLinkerVersion
  db 'e'                        ; MinorLinkerVersion
  dd 'rnel'                     ; SizeOfCode
  db '32',0,0                   ; SizeOfInitializedData
  dd 0                          ; SizeOfUninitializedData
  dd Start - IMAGEBASE          ; AddressOfEntryPoint
  dd 0                          ; BaseOfCode
  dd 0                          ; BaseOfData
  dd IMAGEBASE                  ; ImageBase
  dd 4                          ; SectionAlignment - overlapping address with IMAGE_DOS_HEADER.e_lfanew
  dd 4                          ; FileAlignment
  dw 0                          ; MajorOperatingSystemVersion
  dw 0                          ; MinorOperatingSystemVersion
  dw 0                          ; MajorImageVersion
  dw 0                          ; MinorImageVersion
  dw 4                          ; MajorSubsystemVersion
  dw 0                          ; MinorSubsystemVersion
  dd 0                          ; Win32VersionValue
  dd 0x40                       ; SizeOfImage
  dd 0                          ; SizeOfHeaders
  dd 0                          ; CheckSum
  dw 2                          ; Subsystem = IMAGE_SUBSYSTEM_WINDOWS_GUI
  dw 0                          ; DllCharacteristics
  dd 0                          ; SizeOfStackReserve
  dd 0                          ; SizeOfStackCommit
  dd 0                          ; SizeOfHeapReserve
  dd 0                          ; SizeOfHeapCommit
  dd 0                          ; LoaderFlags
  dd 2                          ; NumberOfRvaAndSizes

; IMAGE_DIRECTORY_ENTRY_EXPORT
  dd 0                          ; VirtualAddress
  dd 0                          ; Size

; IMAGE_DIRECTORY_ENTRY_IMPORT
  dd imports - IMAGEBASE        ; VirtualAddress (label name assumed)

Start:
  push  0                       ; = MB_OK - overlaps with IMAGE_DIRECTORY_ENTRY_IMPORT.Size
  push  world
  push  hello
  push  0
  call  [MessageBoxA]
  push  0
  call  [ExitProcess]

kernel32.dll_hintnames:
  dd impnameExitProcess - IMAGEBASE
  dd 0
kernel32.dll_iat:
ExitProcess:
  dd impnameExitProcess - IMAGEBASE
  dw 0

impnameExitProcess:             ; IMAGE_IMPORT_BY_NAME
  dw 0                          ; Hint, terminate list before
  db 'ExitProcess'              ; Name
impnameMessageBoxA:             ; IMAGE_IMPORT_BY_NAME
  dw 0                          ; Hint, terminate string before
  db 'MessageBoxA', 0           ; Name

user32.dll_hintnames:
  dd impnameMessageBoxA - IMAGEBASE
  dd 0
user32.dll_iat:
MessageBoxA:
  dd impnameMessageBoxA - IMAGEBASE
  dd 0

imports:                        ; start of the import descriptor list (label name assumed)
; IMAGE_IMPORT_DESCRIPTOR for kernel32.dll
  dd kernel32.dll_hintnames - IMAGEBASE ; OriginalFirstThunk / Characteristics
world:
  db 'worl'                     ; TimeDateStamp
  db 'd!',0,0                   ; ForwarderChain
  dd kernel32.dll - IMAGEBASE   ; Name
  dd kernel32.dll_iat - IMAGEBASE ; FirstThunk

; IMAGE_IMPORT_DESCRIPTOR for user32.dll
  dd user32.dll_hintnames - IMAGEBASE ; OriginalFirstThunk / Characteristics
hello:
  db 'Hell'                     ; TimeDateStamp
  db 'o',0,0,0                  ; ForwarderChain
  dd user32.dll - IMAGEBASE     ; Name
  dd user32.dll_iat - IMAGEBASE ; FirstThunk

; IMAGE_IMPORT_DESCRIPTOR empty one to terminate the list all bytes after the end will be zero in memory
times 7 db 0                    ; fill up exe to be 268 byte, smallest working exe for win7 64bit

Save the file as tinyexe.asm and assemble it with:

nasm -f bin -o tinyexe.exe tinyexe.asm

Some short facts about this binary:

  • As Ange Albertini found out, the smallest possible universal exe that works on all Windows versions up to Windows 7 64 bit (it still needs to be tested on Windows 8, though) is 268 bytes.
    There is still room for optimization in this code (like moving code into the header, using smaller opcodes, or exiting the program without the call to ExitProcess), but the resulting binary can't get any smaller anyway
  • Some fields in the header can be abused to store code or data; I use them to store the names of the 2 imported DLLs. Peter Ferrie did some nice work on figuring out the details of which fields can be reused
  • Some lists, like the import descriptor list, use an empty entry to mark their end, so we can reuse the field that defines the length of such a list for other data, as long as the value inside this field is high enough to point past the end of the list
  • The DLLs can be imported without the .dll at the end of the name string
  • We don't need a linker for this project; even the assembler does not have much to do besides resolving symbolic names, calculating the memory locations and translating the push and call instructions to opcodes
  • The binary works when run with Wine; whether the exe works on Win 9x and Win 2k I still need to verify
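
The header overlap mentioned in the listing's comments can be checked with a bit of arithmetic. The following JavaScript sketch (the variable names are mine and are not part of the original listing) adds up the sizes of everything that precedes SectionAlignment, assuming the standard PE32 field layout and the PE signature at file offset 4 as in the listing:

```javascript
// Assumption taken from the listing: the PE signature starts at file offset 4.
var peSignatureOffset = 4;
var signatureSize = 4;                 // 'PE', 0, 0
var fileHeaderSize = 20;               // IMAGE_FILE_HEADER
var bytesBeforeSectionAlignment = 32;  // Magic .. ImageBase in a PE32 optional header

var sectionAlignmentOffset = peSignatureOffset + signatureSize +
                             fileHeaderSize + bytesBeforeSectionAlignment;
var eLfanewOffset = 0x3c;              // position of e_lfanew in IMAGE_DOS_HEADER

console.log(sectionAlignmentOffset.toString(16));      // "3c"
console.log(sectionAlignmentOffset === eLfanewOffset); // true
```

Because the dword stored at that position is 4, it can serve as SectionAlignment and at the same time as e_lfanew, pointing the loader to the PE signature at offset 4.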

Converting a DOS intro to JavaScript/HTML5

One day I had the idea of converting my fr29b DOS intro to JavaScript. Using the canvas element of HTML5, this should be an easy task and offer good performance as well. To make the port as similar as possible, the standard VGA DOS palette should be supported. Drawing the ARGB values into the canvas can be sped up by using JavaScript typed arrays with a 32-bit integer view to write into the image data buffer. The DOS palette can be found in the DOSBox source code and converted to a format usable from JavaScript with this small program:

#include <stdio.h>

static unsigned char vga_palette[256][3]= {
  // put array content from DOSBox source file src\ints\int10_modes.cpp here
};

int main() {
  // print one 0xFFRRGGBB entry per palette color
  for (int i = 0; i < 256; i++)
    printf("0xFF%02X%02X%02X,\n", vga_palette[i][0], vga_palette[i][1], vga_palette[i][2]);
  return 0;
}

The initial JavaScript framework code:

<canvas height="200" id="vga" width="320"></canvas>
<script type="text/javascript">
var vgapalette = [
  // ... full palette that was output by the conversion tool goes here
];

var canvas = document.getElementById('vga');
var ctx = canvas.getContext('2d');
var imageData = ctx.getImageData(0, 0, 320, 200);
var buf = new ArrayBuffer(imageData.data.length);
var buf8 = new Uint8ClampedArray(buf);
var buf32 = new Uint32Array(buf);
var dosvmem = new Uint8Array(320*200); // emulated VGA memory, one palette index per pixel
for (var i = 0; i < 320*200; i++) { // init screen with black
  buf32[i] = 0xFF000000;
  dosvmem[i] = 0;
}

The DOSBox VGA palette is given in 6-bit RGB and needs to be converted to modern RGB 8-bit:

for (var i = 0; i < 256; i++) {
  var color = vgapalette[i];
  var r = (color >> 16) & 0xff;
  r = (r << 2) | (r >> 4);
  var g = (color >> 8) & 0xff;
  g = (g << 2) | (g >> 4);
  var b = (color >> 0) & 0xff;
  b = (b << 2) | (b >> 4);
  var colornew = (r << 16) + (g << 8) + (b << 0) + 0xFF000000;
  vgapalette[i] = colornew;
}
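
To see why the shift-and-or expression is used instead of a plain multiplication by 4, here is the same step for a single color component as a small sketch. The helper name expand6to8 is made up for this illustration:

```javascript
// Expand a 6-bit VGA color component (0..63) to 8 bits (0..255).
// The two highest bits are copied into the two lowest bits,
// so that the maximum value 63 maps to 255 and not to 252.
function expand6to8(c) {
  return ((c << 2) | (c >> 4)) & 0xff;
}

console.log(expand6to8(0));  // 0
console.log(expand6to8(42)); // 170
console.log(expand6to8(63)); // 255
```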

The main loop and the effect itself:

setInterval(function() {
  var offset = Math.floor(Math.random() * 103981) & 0xffff; // generate random offset to screen
  for (var i = 0; i < 140; i++, offset += 180) {
    for (var j = 0; j < 140; j++, offset++) {
      if (offset >= 320*200) { // offset is outside of screen?
        if (offset <= 0xffff) { // no 16 bit overflow?
          continue; // invisible part of the 64 KB segment, nothing to draw
        }
        offset -= 0x10000; // simulate overflow
      }
      var colorindex = (((dosvmem[offset] + 1) & 0xff) | 0x80); // increase palette index
      dosvmem[offset] = colorindex; // store the new index
      var color = vgapalette[colorindex]; // get argb color value from palette
      buf32[offset] = color; // set pixel on screen
    }
  }
  imageData.data.set(buf8); // copy the pixel buffer into the canvas image data
  ctx.putImageData(imageData, 0, 0);
}, 1000 / 35); // 35 fps
</script>

Analysis of the 624 (Six-2-Four) packer

The 624 packer is an executable packer that was released in 1997 by Kim Holviala. It only supports DOS .com files and was targeted at compressing 4kb demoscene intros, but it offers decent compression for files from 1 kb to 20 kb in size.

  • Uses the LZSS (Lempel–Ziv–Storer–Szymanski) compression algorithm
  • The assembly unpacking stub is only 127 bytes in size
  • Fixed-length Huffman codes are used to store the length and offset of a match
  • Packed data is stored as a bit stream. This allows for a shorter unpacker stub and a higher compression ratio but slightly slower decompression
  • 1-byte matches are stored using 6 bits: 1 bit to mark a match, 1 bit to mark the length as 1 byte and the remaining 4 bits to encode the offset of the match, which can therefore reach up to 16 bytes backwards in the output buffer
  • Output bytes are stored using a byte-addressing instruction. This allows the compressor to use RLE to compress a run of a literal by storing offset 1 and the number of bytes to copy as match length
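
The RLE effect of a byte-wise copy in the last point can be illustrated with a short JavaScript sketch. It is not taken from the 624 source; copyMatch is a hypothetical helper that only mimics how an LZSS unpacker copies a match one byte at a time:

```javascript
// Copy `length` bytes, starting `offset` bytes back in the output buffer.
// Because each byte is written before the next one is read, source and
// destination may overlap, and offset 1 simply repeats the last byte.
function copyMatch(output, offset, length) {
  for (var n = 0; n < length; n++) {
    output.push(output[output.length - offset]);
  }
  return output;
}

var out = [0x41];     // one literal 'A' has already been written
copyMatch(out, 1, 5); // match with offset 1 and length 5
console.log(String.fromCharCode.apply(null, out)); // "AAAAAA"
```

A single literal followed by one match is therefore enough to encode a whole run of identical bytes.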

Later someone released a rewrite of this packer in assembly as version 1.1. In this version the unpacking stub had been optimized to 116 bytes and the compression was notably faster.

Ideas to improve the compression ratio:

  • With the current implementation the first byte is always a literal, therefore the bit in front that marks it as a literal is not needed. At the beginning of execution the unpacking stub could jump directly to the code that copies the literal to the output buffer
  • A match will probably be followed by a literal. The bit to mark the literal can be skipped and the unpacker adjusted to expect a literal after a match without changing the size of the unpacker stub
  • A run of literals could be encoded together with its length. The compressed flag of the match following the literal run would then not need to be saved
  • Huffman codes can be replaced by gamma encoding, which would result in a smaller unpacker stub, but whether it improves the overall compression ratio depends on the input file
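
As a rough illustration of the last idea, and assuming that "gamma encoding" refers to Elias gamma coding, the following JavaScript sketch encodes and decodes positive integers as bit strings. The function names are invented for this example, and a real packer would of course work on a bit stream instead of strings:

```javascript
// Elias gamma code for n >= 1:
// (number of binary digits of n, minus 1) zero bits, followed by n in binary.
function gammaEncode(n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  var binary = n.toString(2);
  return "0".repeat(binary.length - 1) + binary;
}

// Decode a single gamma code from the start of a bit string.
function gammaDecode(bits) {
  var zeros = 0;
  while (bits[zeros] === "0") zeros++;
  return parseInt(bits.substring(zeros, 2 * zeros + 1), 2);
}

console.log(gammaEncode(1));         // "1"
console.log(gammaEncode(5));         // "00101"
console.log(gammaDecode("0001001")); // 9
```

Small values get short codes, which is why the gain depends on how often short match lengths and offsets occur in the input file.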

The packer versions to download:

Unpack packed DOS binaries with DOSBox debugger

Set up the DOSBox debugger

First off, you will need a DOSBox version that is compiled with the built-in debugger. If you are using Windows and don't want to compile it yourself, you can grab a prebuilt version of the DOSBox debugger instead.

The unpacking target of my choice is Omniscent, one of the most impressive 4kb DOS intros that exist. It uses a custom packer to compress the executable. I wanted to have the unpacked version of this intro to compare how the common DOS executable packers compress this file. This tutorial works for both .com and .exe files.

Start DOSBox but do not run your target executable yet. Enter DEBUG followed by the name of the executable that you want to unpack and press Return. It seems like the DOSBox debugger has a bug that causes the window to not refresh itself after trapping into the program. Click into the debugger window and press a key like Space to refresh the window. Your screen now looks like this:

DOSBox debugger 1
DOSBox debugger 2

The typical DOS executable unpacking stub first copies itself plus the packed data to a higher memory location. It then starts to unpack the data to the address where execution started. As a result, that memory location will now contain the unpacked code with the Original Entry Point (OEP) of the real program. After the unpacking is done, the stub jumps back to the address of the first executed instruction and starts the execution of the real program. This behaviour gives us a quite easy method to unpack the packed executable.

Debug the target program

Enter the command BP CS:IP to set a breakpoint at the current instruction. Execute the current instruction with the F11 key and press F5 to continue the normal execution. After a short time the debugger breaks again at the starting address 0100. Execution has now stopped at the OEP, before the first instruction of the unpacked program executes.

Dump the program at OEP

Enter the command MEMDUMPBIN CS:IP 60000 and press Return. The unpacked program should have been saved into the current directory now. The last parameter of the command is the number of bytes to dump and can be adjusted for larger programs. Your debugger window should look like this now:

DOSBox debugger 3

Rename the file MEMDUMP.BIN to .com or .exe depending on your program type. You can use a hex-editor to remove zero-bytes at the end of the program. Run the renamed executable to see if the unpacking worked. For my unpacking example, the DOS screen is now full of awesomeness:

DOSBox debugger 4

This unpacking method should work on DOS packers such as 624 and UPX.

How to filter Facebook ads and annoyance using CSS

To apply a user stylesheet to a website, you will need a browser plugin like Stylish, which exists for Chrome and Firefox. The website for this plugin contains tons of user-made stylesheets for websites.

  • Install the plugin
  • Open Stylish extension options
  • Click ‘Add New Style’
  • Enter ‘Facebook’ in the name field and check ‘Enabled’
  • Copy & paste the following stylesheet into 'Code'
  • Click the ‘Specify’ button and switch the drop down to ‘URLs on the domain’ and enter ''
  • Click ‘Save’ and open Facebook to see the difference
/* Custom user styles for by */

/* Apps */

/* Suggested Page */

/* Sponsored */
#pagelet_ego_pane_w, .ego_section, .ego_unit_container,

/* List Suggestions */

/* Like pages */
#pagesNav > ul > li:nth-child(3),

/* Like your favourite Pages in feeds */

/* Like Similar Stories */

/* Suggested Post - profile picture can't be filtered yet */
.uiStreamHeadlineWithLikeButton, .uiStreamHeadlineWithLikeButton~h5, .uiStreamHeadlineWithLikeButton~div, .uiStreamHeadlineWithLikeButton~form {
  display: none !important;
}

.fbx #globalContainer {

.hasLeftCol .homeWiderContent div#contentArea {

The complete filtering of suggested posts is not possible with the current CSS standard due to a missing parent selector. The W3C Working Draft for Selectors Level 4 provides a syntax to define the subject of a selector, which would help with filtering this. Some other styles that I am using:

How to backup and restore a MongoDB Sharded Cluster

The goal of this tutorial is to show a way to back up and restore a MongoDB sharded cluster using only the tools shipped with MongoDB, without the need for file system snapshots. In my case I did not have the option to use LVM snapshots, which would have been my primary choice. The restored cluster will have the same distribution of chunks as it had when the backup was made.

My MongoDB cluster setup looks like this:

  • 2 replicasets
  • Each replicaset added as a shard
  • Each replicaset consists of 1 master, 1 slave and 1 arbiter

You can create a full backup of the MongoDB cluster using:

mongodump --host mongos1 --port 27017

Let's now simulate a worst-case scenario: you lost all hard drives on all servers in your cluster. Starting from scratch with empty servers, configure the basic settings like port, replicaset and config server list. Start up all mongod and config server instances. Make sure that there are no write requests on the MongoDB cluster until the import of the structure data is done and the data import starts. You could use a different port for the single mongos that you now need to start, which will handle most of the restore process.

First restore the settings like chunk size:

mongorestore --host mongos1 --port 27017 --drop -d config -c settings dump/config/settings.bson
mongo --host mongos1 --port 27017 --eval 'sh.setBalancerState(false)'

Reinitialize the replicasets:

Initialize replicaset 1:

mongo --host rs1host1 --port 27017
rs.initiate({
    "_id" : "rs1",
    "members" : [
        { "_id" : 0, "host" : "rs1host1:27017" },
        { "_id" : 1, "host" : "rs1host2:27017" }
    ]
})

Check rs.status() until initialization is done:


Initialize replicaset 2:

mongo --host rs2host1 --port 27017
rs.initiate({
    "_id" : "rs2",
    "members" : [
        { "_id" : 0, "host" : "rs2host1:27017" },
        { "_id" : 1, "host" : "rs2host2:27017" }
    ]
})

Check rs.status() until initialization is done:


Restore the remaining configuration data:

mongorestore --host mongos1 --port 27017 --drop -d admin dump/admin
mongorestore --host mongos1 --port 27017 -d config -c databases dump/config/databases.bson
mongorestore --host mongos1 --port 27017 -d config -c shards dump/config/shards.bson
mongorestore --host mongos1 --port 27017 -d config -c chunks dump/config/chunks.bson
mongorestore --host mongos1 --port 27017 -d config -c collections dump/config/collections.bson
mongorestore --host mongos1 --port 27017 -d config -c tags dump/config/tags.bson

Make sure that all mongos instances get the manually changed new configuration of the Cluster.

mongo --host mongos1 --port 27017 --eval 'db.adminCommand({"flushRouterConfig" : 1})'

The import of the MongoDB internal structures is now done. You can allow inserts into the databases again, but don't enable the balancer yet. Import each database that you wish to restore and enable the balancer afterwards:

mongorestore --host mongos1 --port 27017 -d database1 dump/database1
mongorestore --host mongos1 --port 27017 -d database2 dump/database2
mongo --host mongos1 --port 27017 --eval 'sh.setBalancerState(true)'

The restore process is now done. Your cluster is in the same state as it was at the time the backup was made.

Facts to know about MongoDB

The following list contains some facts and problems that I stumbled across in my MongoDB journey. It was a good lesson on what problems you can face in a production environment when you or your company decide to jump onto the latest Hype Train of technology. Fool me once, shame on you; fool me twice, shame on me.

I will update this post from time to time to mark the fixed problems - the current speed is 1 fixed ticket per year. It has been some years since I had to use this database, but I guess it could only improve from that point. Last update: 2017-09-06

Are you thinking about using MongoDB? Don't give up yourself, there is help in form of a different database out there.


  • TOOLS-106 mongorestore has a hardcoded filename for the oplogreplay option, if you want to replay any other file you have to rename the file to oplog.bson. Fixed in 3.3.1.
  • TOOLS-111 Full restore of a MongoDB Sharded Cluster is now disabled in the MongoDB code. Trying to do so will give you the error message: Cannot do a full restore on a sharded system. You can still do it manually by following my guide How to backup and restore a MongoDB Sharded Cluster.

MongoDB Management Service (MMS)

  • The Profile Data option for each server will contain the same data that you can find in the mongod.log. I expected that the Monitoring Agent will grab data of the system.profile collection, which includes a detailed lockStats field.


  • SERVER-7680 replSetSyncFrom would be a nice feature if it worked the way one would expect. If you add a new member to a replicaset, it will always sync from the master. This feature could have been useful to take some load off the master and sync from a secondary instead. Executing this command while the initial sync from the master is running will not do anything. Update: This ticket was closed as Duplicate, but I could not find the ticket that it duplicates. I guess this is just the MongoDB way of fixing substantial problems.
  • Using mongodump and mongorestore to clone an existing replicaset member for seeding a new member will not work. When you add this seeded member to the replicaset you will get a replSet initial sync drop all databases message, telling you that this member will start with empty databases and do a full sync.


  • SERVER-9275 The official RPM for MongoDB includes a wrong path inside the /etc/init.d/mongod startup script. Fixed in 2.5.3.
  • Some binaries of the mongo package may fail to start, printing the error message what(): locale::facet::_S_create_c_locale name not valid. You can fix that by running the shell command export LC_ALL=C.


  • SERVER-1240 MongoDB uses database locking. If you are writing into a collection, the whole database containing this collection is locked for reads or other writes. This is some really bad stuff that you should consider while you are designing your database layout. You could put each collection into a database that just contains this collection. Fixed in 2.7.8.
  • SERVER-863 To reduce the object size it is recommended to shorten the field names. At the moment, the translation from short names to longer readable names needs to be implemented at the application layer.
  • Dropping a collection will not free disk space nor work when you run out of disk space: exception: Dropping collection failed on the following hosts: { assertion: "Can't take a write lock while out of disk space"


  • If you lose all data of your config server in a sharded environment and try to add the servers that still hold data as shards, you will get: "errmsg" : "can't add shard server1:27017 because a local database 'test' exists in another shard0000:server2:27017"
  • In case you lose the config.chunks collection and try to restart your sharded cluster, all collections will be in a non-sharded state and will only know about the data that resides on the primary shard of every database. When you try to re-enable sharding for a collection and the balancer starts to run, MongoDB will not recognize the data that still resides on the secondary shards but will instead delete all of it while grinning at you with the warning [migrateThread] warning: moveChunkCmd deleted data already in chunk
  • There is no option to exclude databases or collections from balancing.


  • Using find() in the mongo shell will limit the shown results automatically. If you add an .explain() to this command, this automatic limitation will not be used and the whole collection will be scanned. This could block your database if you run this on a large collection (and worse: on a production deployment). If possible, always add a limit() before explain(); even a high number as limit will help in such a case. In case you want to kill such an op, use db.currentOp() to find the op that blocks the database and run db.killOp() with the opid as parameter.

Text search

  • SERVER-9779 The text search scoring can't be influenced by other fields.

TTL Indexes

29 byte DOS intro called fr29b

When going back down memory lane about my programming projects the first memory that comes up is developing software under DOS. The simple way of writing programs as .com files allowed for a lot of fun stuff like coding size competitions. Those executable files did not include any header and got executed as code starting from first byte of the file.

My favorite production of the old days is this 29-byte intro that I developed in 2001 using assembly language. It fit into the 32 byte intro competition of the 0a000h demo party 2002 and won first place.

org 100h                        ; tell nasm that this program will be loaded at 0x100 address
    mov     al, 13h
    int     10h                 ; set 320*200 graphics mode with 256 colors
    lds     bp, [bx]

next_box:
    rdtsc                       ; read time stamp counter
    xchg    ax, di              ; use it as random offset to VGA buffer
    mov     cl, 8ch             ; use a 140*140 box

next_line:
    mov     bl, 8ch

next_pixel:
    inc     byte [bx+di]        ; increase color index in this box
    or      byte [bx+di], 80h   ; ensure upper bit is set
    dec     bx
    jnz     next_pixel
    add     di, 140h            ; add 320 = point to next line
    loop    next_line
    jmp     next_box

If you want to assemble it yourself into a .com file, download nasm, save the code as fr29b.asm and execute this command on a shell:

nasm -f bin -o fr29b.com fr29b.asm

To run DOS programs on a system that doesn't include a DOS subsystem (as Windows versions up to Windows XP 32 bit did) you should get DOSBox. If you run Windows Vista or later, open "Control Panel\Default Programs\Associate a file type or protocol with a program", scroll to the .com file extension and then set the installed DOSBox binary as default program for this file type. Now you are able to execute .com files directly in an emulated and safe environment on your machine.

The algorithm behind the plasma effect is quite simple. At a random offset on the screen I increase the color index of every pixel in a 140 × 140 pixel box that starts at this offset. You can increase or decrease the size of the box and you will get a slightly different looking effect, but in my tests I preferred the one using 140. If the random offset starts near the end of a line, drawing past the end of the line continues at the beginning of the next line. Looking at the default VGA palette you will notice that the first 16 (CGA compatible) colors don't fade nicely into each other when we just increase the palette index in memory for those. To circumvent this problem I make sure that the highest bit of each pixel (which represents an index into the VGA palette) is always 1, so that increasing a pixel just rotates its palette index through the range 128 to 255.
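
The effect of the inc/or pair on a single pixel can be followed in a few lines of JavaScript. This is only a sketch of the byte arithmetic; the function name nextColorIndex is made up for this illustration:

```javascript
// Same update as `inc byte [bx+di]` followed by `or byte [bx+di], 80h`:
// add 1 with 8-bit wraparound, then force the highest bit to 1.
function nextColorIndex(index) {
  return ((index + 1) & 0xff) | 0x80;
}

console.log(nextColorIndex(0));   // 129
console.log(nextColorIndex(200)); // 201
console.log(nextColorIndex(255)); // 128
```

A freshly cleared pixel (index 0) jumps straight into the upper half of the palette, and from there the index cycles from 255 back to 128.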



If you are unfamiliar with the lds trick in the third line of code, let me explain it: The lds instruction loads a far pointer into ds and the register specified as argument. On startup of a DOS program the CPU register bx is 0, therefore the instruction "lds bp, [bx]" will load the 2 bytes at memory address 0x0000 into bp and the following 2 bytes at address 0x0002 into ds. The .com file gets loaded at offset 0x0100 by the program loader; the 0x100 bytes in front of it contain the Program Segment Prefix.

Looking at the meaning of the first values you can see that bp will contain the opcode of the int 20h instruction and ds the memory size in paragraphs. The memory size in paragraphs is usually 0x9fff, which is only 1 paragraph and therefore 16 bytes (1 paragraph represents 16 bytes) short of our desired 0xa000 address, which represents the VGA memory segment.
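
The 16 byte difference follows directly from how real-mode addresses are formed. A minimal JavaScript sketch of the arithmetic, assuming the usual value 0x9fff for the memory size field (linearAddress is a helper name chosen for this example):

```javascript
// Real-mode linear address = segment * 16 + offset.
function linearAddress(segment, offset) {
  return (segment << 4) + offset;
}

var vgaStart = linearAddress(0xa000, 0); // first byte of VGA memory
var dsStart = linearAddress(0x9fff, 0);  // where ds points after the lds trick

console.log(vgaStart - dsStart);                     // 16
console.log(linearAddress(0x9fff, 16) === vgaStart); // true
```

So every offset used by the intro is simply shifted by 16 bytes on screen, which does not matter for an effect that draws at random positions.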
