The "Go" tools
The GoAsm manual
by Jeremy Gordon
In Part 2 you will find information which is not essential reading for assembler programmers but may be of interest.
Stack is in virtual address space
The value in ESP is a virtual address. If, for example, it holds 64FE3Ch at start-up, this is not an address in real physical memory. To obtain the actual physical memory address the system needs to convert (or "map") 64FE3Ch according to its own internal records. For example this address might in reality be 2FE3Ch in physical memory. A virtual address is therefore just a convenient representation of a position in memory. It is often said that each application runs in its own virtual address space. In theory the whole range of 32-bit addresses (zero to 4GB) is available to each application. In practice this is not the case, but it is true that each application running on the system may use the same range of virtual addresses. There is no conflict between them because the system knows which application is addressing memory at any one time and can therefore point the application to the correct place in physical memory. So at any one time it is likely that several applications hold the same value in ESP, but in each case that value points to a different part of physical memory.
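The mapping can be pictured with a toy model. The sketch below is in Python rather than assembler and is an illustration only: the page-table dictionaries and the translate helper are hypothetical, and real translation is carried out by the processor using tables kept by the system. It assumes 4K pages and reuses the example figures above (64FE3Ch mapping to 2FE3Ch); the second application's physical page is made up.

```python
PAGE_SIZE = 0x1000  # 4K pages (assumed)

def translate(page_table, virtual_address):
    """Map a virtual address to a physical one using a per-application table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]          # a KeyError here models an unmapped page
    return frame * PAGE_SIZE + offset

# Hypothetical tables: both applications use virtual page 64Fh for the stack,
# but the system backs each one with a different physical page.
app_a = {0x64F: 0x02F}
app_b = {0x64F: 0x1A7}

esp = 0x64FE3C
print(hex(translate(app_a, esp)))   # 0x2fe3c
print(hex(translate(app_b, esp)))   # 0x1a7e3c
```

Both applications hold the same value in ESP, yet each lookup lands in a different place in physical memory.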
The stack at start-up: contents
In Windows the main thread is allocated its own stack area by the system when it loads. The system itself uses this thread and the stack area for its own purposes prior to calling the program's starting address. You can see this in the debugger. Run your program up to its starting address and look at the value of ESP. Now open an inspector window for the value of ESP. You might expect it to be at the bottom of the memory area, but it is not. If you scroll the inspector to the bottom of the memory area (scroll to the highest address) you see that there has already been a lot of activity in the stack where the system has prepared for its call to the program's starting address. Of interest is that (in W98 at least) the last value on the stack before the application was called is a return address in Kernel32.dll. This indicates that a function within Kernel32.dll called the application. Because of this return address it is possible to use a simple RET to end a process, rather than calling ExitProcess. Of course this only works if the stack is in equilibrium so that code execution continues in the caller function in Kernel32.dll.
A little further down the stack we can see the filename of the application, and much further down we can see that the address of the system's own exception handler for the application's main thread has been put on the stack. These things all show that the application's own stack area (and own thread) is used by the system to prepare for a call to the application.
The stack at start-up: amount
In Windows, when memory is reserved for the use of an application a range of virtual addresses is allocated by the system. This allocation reserves those addresses for the application's use. If the application asks for more memory the same addresses cannot be reused. No physical memory is actually used until the memory is committed. At that point the virtual addresses which have been allocated are mapped to the area or areas of physical memory the system has available.
Obviously for this arrangement to work, the system needs to know the maximum size of contiguous memory which may have to be committed. This is then the allocated range of addresses.
When reserving memory for stack use the same applies. On an application's start-up the system needs to know how much memory to allocate for the stack, and how much to commit in the first instance. These two amounts are contained within the executable at +48h and +4Ch respectively in the optional header (to understand exactly where this is in the executable, you need to know the PE file format). As we see below they apply not only to the application's main thread but also to new threads made by the application.
Most linkers use defaults of 1MB and 4K (the normal page size) respectively for these values. With GoLink you can alter the defaults using the /stacksize and /stackinit switches respectively (see the GoLink manual for how to use these).
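To make the two offsets concrete, here is a minimal sketch in Python (not assembler) which reads them from a 32-bit executable image held in memory. The function names stack_sizes and fake_image are hypothetical. It assumes the usual PE layout: a dword at 3Ch in the DOS header giving the position of the "PE" signature, followed by a 20-byte file header and then the optional header. The offsets +48h and +4Ch are those of the 32-bit (PE32) optional header, so the sketch refuses anything else. The test buffer is synthetic and holds only the fields being read.

```python
import struct

def stack_sizes(image):
    """Return (allocated, committed) stack sizes from a PE32 image."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # position of "PE" signature
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    optional = pe_offset + 4 + 20      # skip the signature and the 20-byte file header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic != 0x10B:                 # 10Bh marks the 32-bit optional header
        raise ValueError("not a PE32 optional header")
    return struct.unpack_from("<II", image, optional + 0x48)

def fake_image(allocated, committed):
    """Build a synthetic buffer containing only the fields read above."""
    pe_offset = 0x80
    image = bytearray(pe_offset + 24 + 0x50)
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", image, pe_offset + 24, 0x10B)
    struct.pack_into("<II", image, pe_offset + 24 + 0x48, allocated, committed)
    return bytes(image)

allocated, committed = stack_sizes(fake_image(0x100000, 0x1000))
print(hex(allocated), hex(committed))   # 0x100000 0x1000
```

For a real executable the same function could be given the bytes read from the file in binary mode.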
Enlargement of the stack at run-time
The system senses whether an application is attempting to read or write outside the committed stack area by using exception handling. Provided (in W9x) the attempt is within the permitted usable stack area, new memory will be committed as required. Even if an attempt is made to enlarge the stack beyond the allocated area, under NT (but not W9x) the system will try to allocate further memory, but this will not be possible if the virtual addresses then required have already been allocated to other memory areas.
Permitted usable stack area
The stack is not considered suitable for keeping large amounts of data, and Windows enforces this view through its exception mechanism. In W9x the permitted usable stack area lies between the current ESP and the next page boundary plus the page size. For example if ESP is 64FE3Ch, then the next page boundary is 64F000h and the extra page (which is usually set at 4K by the system) takes you to 64E000h.
So if ESP is 64FE3Ch you will find that the instruction
    MOV D[ESP-1E40h],0
will cause an exception, because the point on the stack being addressed here is 64DFFCh, which is unavailable because it has not been committed by the system.
And you can't get round this by moving ESP either. In W9x the system allows you to move ESP only up to the next page boundary plus the page size, less four bytes. For example if ESP is 64FE3Ch a single instruction will only be permitted to move ESP by 1E38h (in decimal this is 7736 bytes). This means that the instruction
    SUB ESP,1E38h
causes ESP to become 64E004h and is permitted. But the instruction
    SUB ESP,1E3Ch
will cause an exception. The 4-byte difference in the position which triggers the exception suggests that there are two different protection mechanisms at work here.
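The arithmetic behind these figures can be checked with a short sketch, written in Python rather than assembler. The helper name lowest_permitted is hypothetical, and the rule it encodes is simply the W9x behaviour described above (the page boundary below ESP, plus one further 4K page); it is not taken from any system documentation.

```python
PAGE_SIZE = 0x1000  # 4K, the usual page size

def lowest_permitted(esp, page_size=PAGE_SIZE):
    """Lowest address in the permitted usable stack area, per the W9x rule above."""
    page_boundary = esp & ~(page_size - 1)   # round ESP down to its page boundary
    return page_boundary - page_size         # one further page is allowed

esp = 0x64FE3C
limit = lowest_permitted(esp)
print(hex(limit))                # 0x64e000
print(esp - 0x1E40 >= limit)     # False: 64DFFCh is outside the permitted area
print(hex(esp - limit - 4))      # 0x1e38: the largest single move of ESP seen above
```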
From the above it might appear that the size of data which might be put on the stack is limited to 4K, but this is not true. There are two ways to prevent these exceptions and thereby to use the stack for larger data areas. The first is to move ESP down one page at a time, writing to each page as you go:-
    MOV ECX,10          ;ten pages to commit
    L0:
    SUB ESP,1000h       ;move ESP down by one 4K page
    MOV D[ESP],0        ;write to the page so that the system commits it
    LOOP L0
Here the system is made to commit ten 4K chunks of stack memory. ESP then ends up at the top of this stack area. This will not be particularly quick since the system has to commit memory ten times. A quicker method is to instruct the system to commit a larger than usual area of memory for the stack when the application is loaded. With GoLink you can do this using the /stackinit switch. For example:-
    /stackinit 0A000
will ensure that 40K of memory is committed on the stack at start-up. You will then be able to move ESP safely using the instruction:-
    SUB ESP,0A000h
giving you 40K of memory on the stack to play with.
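As a rough check on the figures, the loop can be modelled in Python (not assembler). The function touch_pages is a hypothetical name and the starting value of ESP is just the example value used earlier; the model only tracks addresses and does not commit any memory.

```python
PAGE_SIZE = 0x1000

def touch_pages(esp, count, page_size=PAGE_SIZE):
    """Model the loop: lower ESP one page at a time, writing a dword at each stop."""
    touched = []
    for _ in range(count):       # MOV ECX,count ... LOOP L0
        esp -= page_size         # SUB ESP,1000h
        touched.append(esp)      # MOV D[ESP],0 writes to the new page
    return esp, touched

final_esp, touched = touch_pages(0x64FE3C, 10)
print(hex(final_esp))                  # 0x645e3c
print(hex(0x64FE3C - final_esp))       # 0xa000, that is 40K
```

Each step moves ESP by exactly one page, so every write falls within the extra page the system permits. Ten steps cover the same 0A000h bytes that the /stackinit example commits in one go.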
Using the stack to keep data streams
Provided precautions are taken, the stack can be used to hold a fairly large stream of data. The things to remember are:-
The stack in multi-thread applications
Each thread in your application has its own registers and its own stack. That is to say, when the system gives processor-time to a thread, it will switch to the register context for the thread. This holds all the values of the registers when processor-time was last removed from the thread. Since the registers include ESP, its value will also be correctly switched so that the correct area of physical memory will be used by the thread as its stack. The result is that a thread can rely on the fact that it can use its stack as a discrete area of memory which will not be interfered with by other threads. You can see this in the debugger. You can see that the ESP always changes substantially when execution changes from thread to thread.
When a thread starts it is allocated its stack area. As a practical example, it was found under W98 that the stack of the main thread of an application ran from 64FE3Ch (downwards) and when a new thread was made its stack ran from 75FF9Ch (downwards). In another test, when six new threads were made their stacks started at 19DEF9Ch, 1AFFF9Ch, 1C1FF9Ch, 1D3FF9Ch, 1E5FF9Ch and 1F7FF9Ch respectively. Here you can see that the system is separating the virtual address of each stack area by 128KB more than the default 1MB area. This is probably to allow room for the system's own use of the stack and also some extra leeway. Changing the allocation stack size to 200000h (2MB) using the /stacksize switch and then creating six new threads resulted in the stack areas being separated by 128KB more than 2MB.
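The separation can be worked out from the six addresses quoted for the default 1MB test. This is a small Python sketch (not assembler) which only subtracts neighbouring values; the variable names are illustrative.

```python
thread_stacks = [0x19DEF9C, 0x1AFFF9C, 0x1C1FF9C, 0x1D3FF9C, 0x1E5FF9C, 0x1F7FF9C]

MB = 0x100000
KB = 0x400

# Distance between the starting points of neighbouring thread stacks
gaps = [high - low for low, high in zip(thread_stacks, thread_stacks[1:])]
for gap in gaps:
    print(hex(gap), "= 1MB +", (gap - MB) // KB, "KB")
```

With the figures as quoted, four of the five gaps come out at 120000h, which is 1MB plus 128KB. The gap between the first two addresses comes out at 121000h, one 4K page more; the text does not say whether this reflects the test itself or the way the first address was recorded.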
The stack frame and local data
A stack frame is a discrete area of the stack which holds a return address of a function and data used by that function without risk of overwrite because the value of ESP has been decreased. The data kept in a stack frame is called "local data". That's because it is intended only for use within the stack frame concerned and is not intended to be addressed by the program generally. Let's take this simple example:-
    PROCEDURE1:
    SUB ESP,20h         ;make space on stack for local data
    ;                   ;use local data area
    CALL PROCEDURE2
    ;                   ;return from PROCEDURE2
    ;                   ;continue to use local data
    ADD ESP,20h         ;restore ESP to equilibrium
    RET
and
    PROCEDURE2:
    PUSH EAX,EBX,ECX
    ;                   ;carry out various calculations
    POP ECX,EBX,EAX
    RET
Here the stack frame is created using the SUB ESP,20h instruction. This decreases the value of ESP by 32 bytes creating space on the stack for 8 dwords. Now because ESP has been moved, whatever happens in PROCEDURE2 will not overwrite these 8 dwords. Let's check this visually assuming that ESP is 64FE38h at the start of PROCEDURE1:-
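One way to check this is to compute the addresses involved. The sketch below is in Python rather than assembler, and frame_layout is a hypothetical helper. It assumes that ESP points at the caller's return address on entry to PROCEDURE1 and that PUSH EAX,EBX,ECX pushes the registers from left to right.

```python
def frame_layout(esp_on_entry, local_bytes, pushes_in_callee):
    """Return labelled stack addresses, highest first, for the example above."""
    layout = [(esp_on_entry, "return address of PROCEDURE1's caller")]
    esp = esp_on_entry - local_bytes                 # SUB ESP,20h
    for address in range(esp_on_entry - 4, esp - 4, -4):
        layout.append((address, "local data"))
    esp -= 4                                         # CALL PROCEDURE2 pushes a return address
    layout.append((esp, "return address into PROCEDURE1"))
    for register in pushes_in_callee:                # PUSH EAX,EBX,ECX
        esp -= 4
        layout.append((esp, "saved " + register))
    return layout

for address, meaning in frame_layout(0x64FE38, 0x20, ["EAX", "EBX", "ECX"]):
    print(f"{address:06X}h  {meaning}")
```

The eight dwords of local data occupy 64FE18h to 64FE34h. Everything PROCEDURE2 puts on the stack lies below them, from 64FE14h downwards.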
Addressing the local data
Note: this is automated in GoAsm using FRAME..ENDF and in MASM using PROC..ENDP.
Since ESP points to the top of the local data area you can address the data using ESP. So in the example above the first dword of local data would be available at [ESP] immediately after the SUB ESP,20h. But using ESP to keep track of the local data on the stack can be difficult because ESP will move on each CALL or PUSH within the procedure. For this reason the EBP register tends to be used for this purpose instead. This is usually set early in the stack frame to the bottom of the local data and it will not be changed until execution leaves the stack frame. In this way you can be confident that the local data can always be addressed using a particular offset from EBP.
So the code for a typical stack frame now looks like this:-
    TypicalStackFrame:
    PUSH EBP            ;save the value of ebp which will be altered      }
    MOV EBP,ESP         ;give current value of stack pointer to ebp       } "prologue"
    SUB ESP,0Ch         ;make space for local data                        }
    ;                   ;POINT "X"
    ;
    ;                   ;code within the procedure
    ;
    MOV ESP,EBP         ;restore the stack pointer to previous value      }
    POP EBP             ;restore the value of ebp                         } "epilogue"
    RET                 ;return to caller adjusting the stack pointer     }
Here we have moved the stack pointer by 12 bytes. At point "X" the stack by reference to EBP actually looks like this:-
    [EBP+4h]     return address to the caller
    [EBP]        saved value of EBP
    [EBP-4h]     local data, first dword
    [EBP-8h]     local data, second dword
    [EBP-0Ch]    local data, third dword (ESP points here)
Now throughout the stack frame, whatever happens to ESP the local data will always be accessible at [EBP-4h], [EBP-8h] and [EBP-0Ch].
Note how ESP is restored to equilibrium automatically by the use of MOV ESP,EBP just before returning to the caller.
You don't have to use EBP for this purpose, any register will do. But EBP is traditionally used for this and your code will be more understandable to others if you stick to this.
Accessing parameters from the stack
We have already seen how to pass parameters on the stack to other procedures. Now we are going to see how to use parameters passed to procedures in your own code. Basically these parameters are further down the stack so they will not be overwritten under any normal circumstances. For this reason it is not necessary to retrieve and save them at all. Upon entry to a procedure ESP will point to the return address of the procedure (inserted by CALL). So the parameters will be at [ESP+4h], [ESP+8h], [ESP+0Ch] and so on, depending on how many parameters there are. But it may be difficult to keep track of exactly where the parameters are using ESP because it will change upon the next PUSH or CALL. So again you can use EBP to point to the parameters.
If you have the prologue code:-
    PUSH EBP            ;save the value of ebp which will be altered      }
    MOV EBP,ESP         ;give current value of stack pointer to ebp       } "prologue"
    SUB ESP,0Ch         ;make space for local data                        }
When ESP is given to EBP it is 4 bytes less in value than it was on entry to the procedure (this is because of the first "PUSH EBP"). Therefore the parameters can now be accessed using [EBP+8h], [EBP+0Ch], [EBP+10h] and so on, depending on how many parameters there are.
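These offsets can be confirmed with a tiny model of the stack, written in Python rather than assembler. The Stack class, the call_with_prologue function and all the values used are hypothetical. The model assumes that the caller pushes the last parameter first, so that the first parameter ends up nearest the return address.

```python
class Stack:
    """Tiny model of a 32-bit stack: a dict of dwords plus ESP and EBP."""
    def __init__(self, esp):
        self.memory, self.esp, self.ebp = {}, esp, 0

    def push(self, value):
        self.esp -= 4
        self.memory[self.esp] = value

def call_with_prologue(stack, params, return_address, local_bytes):
    for value in reversed(params):   # caller pushes the last parameter first (assumed)
        stack.push(value)
    stack.push(return_address)       # CALL
    stack.push(stack.ebp)            # PUSH EBP
    stack.ebp = stack.esp            # MOV EBP,ESP
    stack.esp -= local_bytes         # SUB ESP,0Ch

stack = Stack(0x64FE3C)
call_with_prologue(stack, [0x111, 0x222, 0x333], 0x401000, 0x0C)
print(hex(stack.memory[stack.ebp + 0x08]))   # 0x111, the first parameter
print(hex(stack.memory[stack.ebp + 0x0C]))   # 0x222
print(hex(stack.memory[stack.ebp + 0x10]))   # 0x333
print(hex(stack.memory[stack.ebp + 0x04]))   # 0x401000, the return address
print(hex(stack.ebp - stack.esp))            # 0xc, room for three local dwords
```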
Use of the stack in Windows callback procedures
The two techniques just dealt with (making space for local data and addressing parameters) are required in Windows callback procedures. The callback procedure most frequently found in Windows programs is the window procedure. It is to this procedure that Windows sends "messages" and Windows expects the correct reply. What is happening here is that Windows calls the window procedure using the program's own thread. This usually happens while the program is in the message loop, either waiting for a return from the API GetMessage or whilst executing the API DispatchMessage.
Luckily in GoAsm you can use FRAME..ENDF to retrieve the parameters sent by Windows and to address them by name. You can also easily make local data areas addressable by name. And you can preserve registers and restore the stack to equilibrium automatically too. See the GoAsm manual for a full description of how to do this or go back to understanding the stack (part 1).
Copyright © Jeremy Gordon 2002-2003