A "Go" development tool:
http://www.GoDevTool.com
by Jeremy Gordon -
with assistance from Wayne Radburn
Some assemblers, like NASM, do no type checking at all. Others, like
A386, do only basic type checking based on the byte, word, dword, qword and
tword types. MASM and TASM, like "C", allow you to specify your own types
using TYPEDEF and then type-check based on those specified types.
Parameter checking verifies that the correct number of parameters is
passed to an API and also type-checks those parameters. Most assemblers do
not check parameters, but MASM permits
parameter checking when the INVOKE pseudo mnemonic is used.
The overheads required to achieve full type and parameter checking
like a "C" compiler are enormous. Just
look through a Windows header file and see the long lists of various types
allocated to various structures and to the
parameters of APIs. Then look at the efforts of the programmer which
are required in the source script to ensure that no error is thrown up
by the assembler or compiler.
I decided to follow the NASM example and not even offer basic type checking as A386 provides. I have used A386 over many years and have enjoyed its clean syntax, but I have only found its basic type checking a hindrance when programming for Windows. This is because there are often occasions when you want to write to data, or read from data using a different size of data than used to declare the data in the first place.
As for parameter checking, again I have not even tried to offer this since in my view it unnecessarily complicates things. It again requires enormous lists of APIs and parameters to be provided to the assembler or compiler so that it can check that these match what you are giving the API. Miss one and your program does not compile. Take the example of
    PUSH 40h,EDX,EAX,[hwnd]
    CALL MessageBoxA

Here is a call to an API which takes 4 parameters. Now it is said that you would like the assembler to tell you if you send the wrong number of parameters. But you don't need this warning. Your program would simply crash if you sent the wrong number of parameters, and you are going to test this call aren't you? Yes! There is no hidden, latent, fault here which will not be noticed at testing stage.

Then, it is said that you need type checking in case you send the wrong data size to an API as a parameter. I just can't see this. All parameters to APIs are dwords (with one or two exceptions out of thousands). So you won't be sending the wrong size of data to an API.
Abolishing parameter and type checking not only frees the assembler from a great deal of work, making it faster in operation, but it also frees the programmer from the headache of manipulating header and include files. It provides greater fluidity in memory addressing, since errors will not be thrown up if you want to use data of a size which does not match the size of the data declaration. So in GoAsm even if lParam has been declared as a dword,
    MOV [lParam],AL

is still allowed. And if LOGFONT is a simple structure of dwords, GoAsm is quite happy for example with

    MOV B[LOGFONT+14h],1

which you might want to use to set a font to italic.
By not type and parameter checking I have been able to abolish
EXTRN. GoAsm does not need to know the type of symbols which are declared
outside the source file (ie. to be found during linking). I hope
you will agree this relieves you from a lot of hard work and anguish in
having to add those EXTRNs in your larger programs.
The corollary to the abolition of type and parameter checking is that
you must tell GoAsm the size of the
data to be worked on, if this is not obvious.
So, for example,
MOV [MemThing],23h is an error. To load 23h as a byte into MemThing you
need to code MOV B[MemThing],23h. This is because GoAsm will not know at
assembly time whether the 23h should be loaded as a byte, word, or dword,
all of which are permitted by the MOV instruction.
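As a minimal sketch of this rule (the label MemThing comes from the text above; the section layout and START label simply follow the conventions used elsewhere in this manual and are only for illustration), the size indicators B, W and D select how many bytes are written, and no indicator is needed when a register already makes the size obvious:

```asm
DATA SECTION
MemThing DD 0               ;declared as a dword, but GoAsm does not hold you to that

CODE SECTION
START:
MOV B[MemThing],23h         ;write 23h as a byte (1 byte affected)
MOV W[MemThing],23h         ;write 23h as a word (2 bytes affected)
MOV D[MemThing],23h         ;write 23h as a dword (4 bytes affected)
MOV [MemThing],AL           ;no indicator needed: AL makes the size obvious
MOV [MemThing],EAX          ;likewise for a 32-bit register
XOR EAX,EAX
RET
```

Only the instructions with an immediate value need the indicator, because that is the only case where neither operand tells GoAsm the size.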
In some ways the requirement for a type indicator (when the type is not obvious) is helpful. This is because you can see from the instruction itself how much memory is affected by the instruction. You don't have to look up a particular data declaration to see its type in order to see what the instruction will do. So, for example:-
    MOV B[MemByte],23h      ;comforting to see this is limited to a byte operation
    FLD Q[NUMBER]           ;useful to know real number loaded with double precision
    INC B[COUNT]            ;essential to know this can count only up to 255

Another advantage arising from no parameter checking is that there is no need for GoAsm to decorate the names of calls to other modules or to imports. When using GoLink this is a considerable advantage since there is no need for LIB files at the linking stage. But it does mean that GoAsm object files will differ from object files made using a "C" compiler or with MASM because those files will contain decorated symbols while the GoAsm ones will not. Since Version 0.26.10 GoLink has been able to accept object files from both sets of tools and link them with GoAsm object files (just use the GoLink /mix switch - see the GoLink help file).
    MOV EAX,lParam
    MOV EAX,[lParam]

However, A386 differentiated between labels with and without colons so that the above was only true if lParam was declared as follows

    lParam DD 0

but not if it was declared as:-

    lParam: DD 0

In that case MOV EAX,lParam in A386 would act the same way as MOV EAX,OFFSET lParam. Very confusing!

    MOV EAX,lParam

In NASM this is the same as MOV EAX,OFFSET lParam for other assemblers.

    MOV EAX,[lParam]

or

    MOV EAX,OFFSET lParam

I tend to agree with this approach. The main aim here is to ensure that coding is unambiguous.

    MOV EBX,wParam

is completely outlawed, unless wParam is a defined word. In order to get the offset in GoAsm you must use

    MOV EBX,ADDR wParam

or if you prefer

    MOV EBX,OFFSET wParam

which means the same thing. In order to address memory in GoAsm you must use

    MOV EBX,[wParam]
    CMPS - use CMPSB or CMPSD
    INS  - use INSB or INSD
    LODS - use LODSB or LODSD
    MOVS - use MOVSB or MOVSD
    OUTS - use OUTSB or OUTSD
    SCAS - use SCASB or SCASD
    STOS - use STOSB or STOSD
    XLAT - use XLATB
The asm file is a file which you make and edit using an ordinary text editor,
such as Paws which you can download from my web site,
www.GoDevTool.com, or a program
like Notepad or Wordpad which comes with Windows. If you use Notepad
or Wordpad you should make sure you save the file in a format which
adds no control or formatting characters, other than the usual end of
line characters (carriage return and line-feed). This is because GoAsm
only looks for plain text. You can achieve this by
saving the file as a "text" document. If you don't use an extension
for the file (the extension is the characters after the "dot") then
the editor may give the file a ".txt" extension
but you can change this by renaming the file (you can rename the file
by right-clicking on the name using Windows Explorer or My Computer).
It may be that you cannot see the extension on your computer, because
it may be set that way. To see the extensions of your files
from Windows Explorer, choose the menu item "View", "Folder options",
then the "View" tab and ensure that the "Hide file extensions for known
file types" is not checked. The procedure may differ slightly in different
versions of Windows.
It is traditional amongst programmers to give their source scripts
an extension which matches the language in which the source code
is written. For example you might have an assembler file called
"myprog.asm". Similarly you will usually find source code written
in the "C" language with the extension ".c" or ".cpp" (for "C++"),
".pas" for pascal and so on. However, there is no magic in these
extensions. GoAsm will accept files of any extension or
files which do not have an extension.
The .asm file contains your instructions
to the processor in words and numbers. These are executed by the processor
when the program is run. It is said therefore, that the .asm file
contains your "source code" or your "source script".
As an example let's look at the code and data in a simple 32-bit Windows program which writes "Hello World" to the MS-DOS (command prompt) window (the "console"). This is what you would put into your asm file:-
    DATA SECTION
    ;
    KEEP DD 0                ;temporary place to keep things
    ;
    CODE SECTION
    ;
    START:
    PUSH -11                 ;STD_OUTPUT_HANDLE
    CALL GetStdHandle        ;get, in eax, handle to active screen buffer
    PUSH 0,ADDR KEEP         ;KEEP receives output from API
    PUSH 24,'Hello World (from GoAsm)'    ;24=length of string
    PUSH EAX                 ;handle to active screen buffer
    CALL WriteFile
    XOR EAX,EAX              ;return eax=0 as preferred by Windows
    RET

Note that anything after a semi-colon is ignored, so you can insert comments. See operators for other comment forms. See provide good comments and descriptions for the importance of comments.
Having put in the code and data to your file you are ready to make your program. This is done in two steps. First you need to assemble your file and then you need to link it. In order to do this you need to open an MS-DOS (command prompt) window. See how to do this. In this case you use the command line:-
    GoAsm /fo HelloWorld.obj filename

where filename is the name of your asm file. See starting GoAsm for how to use the command line for GoAsm.

Then link the object file using GoLink:-

    GoLink /console helloworld.obj kernel32.dll

(add "-debug coff" if you want to watch the program in the debugger).
Note that the GetStdHandle and WriteFile calls are to kernel32.dll which is why the name of that Dll appears in the GoLink command line. See for more information about Dlls. See using GoAsm with various linkers for more information about using GoLink and other linkers if you prefer. See the GoLink help file for other GoLink options.
GoLink creates the file HelloWorld.exe. You can then run this program from the MS-DOS (command prompt) window. Type in HelloWorld and press enter. You will see the string you sent to WriteFile is written in the console.
So let's recap by looking back at the lines in your source script.
See that first you asked Windows for a handle to the console window. This
was returned by the API GetStdHandle and held in the EAX register. This
handle and the string to write were
passed to WriteFile. In other words you told Windows to write the specified
string to the console. Information about exactly how to use the APIs and
the parameters which need to be passed to them is available from Microsoft
from the MSDN site (look for the
"Platform SDK").
Finally see suggestions how to
organise your programming work.
The command line syntax is:-
GoAsm [command line switches] filename[.ext]
Where,
filename is the name of the source file
Command-line Switches
/b beep on error
/c always put output file in current directory
/d define a word (eg. /d WINVER=0x400)
/e empty output file allowed
/fo specify output path/file eg. /fo asm\myprog.obj
/gl retain leading underscore in external "C" library calls
/h or /? help
/l create listing output file
/ms decorate for mslinker
/ne no error messages
/ni no information messages
/nw no warning messages
/no no output messages at all
/sh share header files (header files may be opened by other programs during assembly)
/x64 assemble for AMD64 or EM64T
/x86 assemble 64-bit source in 32-bit compatibility mode
If no extension is given for the input file, GoAsm looks for the
file without any extension. If that file is not found then GoAsm looks
for the file with an assumed .asm extension.
If no path is given for the input file it is assumed to be in the
current directory.
If no filename is given for the output file an object file with the same
name as the input file is created. For example MyAsm.asm will create a
file called MyAsm.obj.
The directory which receives the output file is as follows:-
CODE SECTION
DATA SECTION
CONST SECTION or CONSTANT SECTION
The words "code", "data", "const" and "constant" are reserved to section declarations and an error will be signalled if these words are used elsewhere in your source.
GoAsm also allows shortened forms to declare a section as follows:-
CODE
DATA
CONST
You can also use .CODE, .DATA and .CONST if you wish.
GoAsm automatically adds the attributes to suit the processor and
Windows. A code section is given the attributes read, execute, code.
A data section is given the attributes read, write, initialised data.
A const section is given the attributes read, initialised data (you won't
be able to write to a const section). Uninitialised data has the attributes
read, write, uninitialised data.
Except to add the shared attribute, you
can't override these attributes yourself. This is because to do
so is pointless in the Windows system which has control over the attributes
of the section as loaded and running. For example even if you give a code
section the write attribute, Windows will not allow you to write to it.
Also Windows will not permit you to execute code in a data section. You can
change this behaviour however, by calling the API VirtualProtect at run-time.
In GoAsm you can use a code section to hold read-only data, although
there may be a reduction in performance if you do this.
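To illustrate the section types and their attributes, here is a minimal sketch. The labels Greeting, Counter and Buffer are hypothetical names chosen for this example only; the layout follows the section declarations described above.

```asm
CONST SECTION
Greeting DB 'Hello',0       ;read-only initialised data

DATA SECTION
Counter  DD 0               ;initialised data, read/write
Buffer   DB 100h DUP ?      ;uninitialised data, read/write

CODE SECTION
START:
MOV EAX,[Counter]           ;reading from the data section
INC EAX
MOV [Counter],EAX           ;writing to the data section
MOV AL,[Greeting]           ;reading from the const section is fine
;MOV B[Greeting],'h'        ;writing to it would fail at run-time (read-only)
XOR EAX,EAX
RET
```

The write to Greeting is left commented out because, as described above, Windows does not allow writing to a const section unless the protection is changed at run-time.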
Declaring a section also sets certain switches in GoAsm which affect syntax and coding. The rules are as follows:-
See also sections - some advanced use on naming sections, shared sections, section ordering, and section alignment.
    HELLO1 DB 0                   ;one byte with label "HELLO1" set to zero
           DB 0                   ;second byte set to zero
    HELLO2 DW 34h                 ;two bytes (a word) set to 34h
    HELLO3 DD 12345678h           ;four bytes (a dword) set to 12345678h
    HELLO4 DD 12345678D           ;four bytes (a dword) set to 12345678 decimal
    HELLO5 DD 1.1                 ;four bytes (a dword) set to real number 1.1
    HELLO6 DQ 0.0                 ;8 bytes (a qword) set to real number 0.0
    HELLO7 DQ 123456789ABCDEFh    ;8 bytes (a qword) set to 123456789ABCDEFh
    HELLO8 DQ 1234567890123456    ;8 bytes (a qword) set to 1234567890123456 decimal
    HELLO9 DT 1.1E0               ;10 bytes (a tword) set to real number 1.1
    HELLOA DT 123456789ABCDEFh    ;10 bytes (a tword) set to 123456789ABCDEFh

Note that DB, DW, DD and DQ accept numbers in both decimal and hex; DD, DQ and DT accept real numbers too.

    Label DB 0,0,0,0              ;four bytes set to zero
          DW 33h,44h,55h,66h      ;four initialised words
          DD 33h,44h,55h,66h      ;four initialised dwords
          DD 1.1,2.2              ;two DD real numbers
          DQ 1.1,2.2              ;two DQ real numbers
          DQ 3333h,4444h          ;two DQ hex numbers
          DT 1.1,2.2              ;two DT real numbers
          DT 5555h,6666h          ;two DT hex numbers
    HELLO1 DB ?    ;one byte with label "HELLO1" recorded as uninitialised
    HELLO2 DW ?    ;two bytes (a word)
    HELLO3 DD ?    ;four bytes (a dword)
    HELLO4 DQ ?    ;8 bytes (a qword)
    HELLO5 DT ?    ;10 bytes (a tword)

Orphaned uninitialised data is not allowed: you cannot mix initialised and uninitialised data so this is an error:-

    DATA6 DD 5 DUP 0
          DB ?
          DB 0

However this is ok:-

    DATA6 DD 5 DUP ?    ;5 dwords for the customer
          DB ?          ;a byte to hold the main course
          DB ?          ;and a byte to hold the sauces

This is to allow you to separate areas of uninitialised data so that each separate area can have its own comment.
    HELLO1: DB ?    ;one byte with label "HELLO1" recorded as uninitialised
    HELLO2: DW ?    ;two bytes (a word)
    HELLO1  DB 2 DUP 0       ;two bytes with label "HELLO1" both set to zero
    HELLO1A DB 800h DUP ?    ;2K buffer not initialised
    HELLO2  DW 2 DUP 0       ;four bytes all set to zero
    HELLO3  DD 2 DUP ?       ;eight bytes in uninitialised section
    HELLO4  DD 2 DUP 1.1     ;real number 1.1 repeated twice as dwords
    HELLO5  DQ 2 DUP 1.1     ;real number 1.1 repeated twice as qwords
    HELLO6  DQ 2 DUP 333h    ;qword repeated twice
    HELLO7  DT 2 DUP 1.1     ;real number 1.1 repeated twice as twords
    HELLO8  DT 2 DUP 444h    ;tword repeated twice

You can use DUP to declare some data and then initialise each data component individually for example:-

    HELLO300 DB 3 DUP <23,24,25>    ;declare three bytes and set them to 23,24,25

which does the same as:-

    HELLO300 DB 23,24,25            ;declare three bytes and set them to 23,24,25

Although it may seem pointless to do this, the syntax does make it easier to initialise a member of a structure if it contains DUP. See initialising structure members which have DUP data declarations.
    Letters DB 'a'
            DW 'xy'
    Sample  DD 'form'
    ZooDay  DQ 'Saturday'

Unless inserting Unicode strings GoAsm carries out no conversion to the character, so that the actual value inserted in the object file will depend on the current character set at the time of assembly.

    MOV EDI,ADDR BUFFER
    MOV EAX,[Sample]
    STOSD

which inserts into the buffer the string: form.

    DW 'a'     ;first byte is a, second is zero
    DD 'ab'    ;'a' then 'b' then two zero bytes

Repeat character value initialisations are allowed, for example:-

    DD 3 DUP "Hi"

This inserts H then i then two zeroes and this is done three times.
    String1 DB 'This is a string'
            DB 'This is a string with "internal" quotes'
    String2 DB "A string in double quotes"
            DB "I enjoyed the string's contents"
    String3 DB '"A string itself in double quotes"'
            DB "'A string itself in single quotes'"
            DB "'A string's own single quotes'"
    String4 DB """A string's own single and double quotes"""
            DB '''A string itself with "internal" quotes'''

In String4 the doubled-up quotation marks are retained as part of the string itself as one quote. This only occurs for the leading and trailing cases as shown (unlike GoRC, which also does this within the string).

    String1 DB 'This is a string with null terminator',0
            DB 'First string',0,'And another string',0
    String2 DB 22h,"A string's own double quotes",22h

The ASCII values you can use here if you wish are 22h for double quotes and 27h for single quotes.

    LongString1 DB 'His first program looked like it would be a great success '
                DB 'until he ran it for the first time',0
    LongString2 DB 'His fundamental error:',0Dh,0Ah
                DB 'he did not test it as he went along',0

The ASCII values 0Dh and 0Ah are carriage return and line feed respectively, used to start a new line when the string is drawn on the screen.
DB L'Hello how are you?'
    DUS 'I am a Unicode string with new line and null terminator',0Dh,0Ah,0

See also overriding using the STRINGS directive.
The syntax for a DATABLOCK is as follows:-
    MyBlockData DATABLOCK_BEGIN    ;comment
    .
    .
    data is inserted here
    .
    DATABLOCK_END

Here all the material between DATABLOCK_BEGIN and DATABLOCK_END is inserted in the output file, and you can then address the data using the label MyBlockData.
    MS1 DB 'First string to use',0
    MS2 DB 'Second string to use',0
    Strings DD MS1,MS2       ;Strings to hold address of the strings

then to get ready to use the string MS2 instead of coding

    MOV ESI,ADDR MS2

you can code

    MOV ESI,[Strings+4]

Whole tables can be created using this method and addressed by taking advantage of the * index register multiplier (scaling) for example

    MOV ESI,[Strings+EAX*4]

Here eax, which is zero indexed, holds which string to use. When eax is zero the first string will be used, when eax is one the second string and so on if there are more strings.

    PROCEDURE_TO_CALL DD FIRSTPROC,SECONDPROC
    MOV ESI,ADDR PROCEDURE_TO_CALL     ;get address of the table in esi
    MOV ESI,[ESI+EAX*4]                ;get address of correct procedure
    CALL ESI                           ;call the procedure
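A table of procedure addresses can also be called through directly. The following is a minimal sketch only: the procedure bodies are hypothetical placeholders, and it assumes eax holds a zero-based index into the table.

```asm
DATA SECTION
PROCEDURE_TO_CALL DD FIRSTPROC,SECONDPROC   ;table of procedure addresses

CODE SECTION
START:
MOV EAX,1                          ;zero-based index: 1 selects SECONDPROC
CALL [PROCEDURE_TO_CALL+EAX*4]     ;call the address held in that table entry
XOR EAX,EAX
RET

FIRSTPROC:                         ;placeholder procedure
MOV EDX,1
RET

SECONDPROC:                        ;placeholder procedure
MOV EDX,2
RET
```

With eax=1 the call goes to SECONDPROC, so edx holds 2 when control returns to the line after the CALL.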
    START:

This can be upper or lower case or a mixture.

If you don't want to use START, you can specify the starting address using one of these methods in GoLink's command line or command file:-

    -entry STARTINGADDRESS
    /entry STARTINGADDRESS

If you are using ALINK only the first method works.

If you are using the MS linker you need to make a slight change to your label. It must be preceded by an underline character. So your label is _START: in your source script. Then you would use one of these instructions to the linker (without the underline character):-

    -ENTRY START
    /ENTRY START

What is happening here is that the MS linker is designed to work with a "C" compiler which will decorate global labels with the underline character. So the linker looks for the label _START, rather than START. Assembler programmers have had to put up with such quirks in Windows tools for many years but now we have our independence!
    NAMEOFLABEL:

This does not output any code, but sets a bookmark called NAMEOFLABEL at the point in data or code where it appears. If you are in a data section, the colon is not obligatory, nor is it obligatory if the label gives the name of an automated stack frame. Therefore the following lines all create unique labels:-

    (in data section)
    HELLO DB 0        ;label HELLO
    BYE: DB 0         ;label BYE
    MEAGAIN           ;label MEAGAIN
    (in code section)
    RICE:             ;label RICE
    PEAS: FRAME       ;label PEAS
    BEANS FRAME       ;label BEANS

You can see from this that a single word which is not known to GoAsm to be a directive, mnemonic, data declaration, initialisation of data, or a defined word will be regarded as a label. GoAsm expects a colon after a code section label. This is because there are numerous words which must be used in a code section and if they are misspelt, it is important that an error is declared rather than the word being misconstrued as a label.
The scope of a label defines from where it can be accessed using its own unmodified name. Let's look at these two types of re-usable labels in turn.

    .looptop       ;label looptop
    .fin           ;label fin

The boundary of the scope of these labels is defined by the unique code labels in the source script. In other words the label can be jumped to provided there is no unique label in the way. So for example:-

    JZ >.fin
    CALCULATE:
    .fin
    RET

Here the jump instruction will not find .fin because the label CALCULATE is a unique code label in the way.

If you want to jump past a unique code label to a locally scoped re-usable label, you can either use another unique code label as the destination of the jump, or you can use an unscoped re-usable label. Or for advanced use, you can use the locally scoped label within an automated stack frame: see re-usable label scope in automated stack frames.
Locally scoped re-usable labels are sent to the debugger as symbols together with their "owner". Therefore the symbol sent to the debugger in the above example is CALCULATE.fin, and another way to jump past that unique label would be with JZ >CALCULATE.fin.
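The following minimal sketch shows why locally scoped labels are re-usable. The procedure names CALC_ONE and CALC_TWO are hypothetical; each unique label starts a new scope, so both procedures can have their own .fin without a clash.

```asm
CODE SECTION
START:
XOR EAX,EAX
CALL CALC_ONE
CALL CALC_TWO
XOR EAX,EAX
RET

CALC_ONE:                   ;unique label: starts a new scope
OR EAX,EAX
JZ >.fin                    ;finds the .fin belonging to CALC_ONE
INC EAX
.fin
RET

CALC_TWO:                   ;unique label: starts another scope
OR EAX,EAX
JZ >.fin                    ;finds the .fin belonging to CALC_TWO
DEC EAX
.fin
RET
```

Following the naming rule just described, the debugger would be sent these two labels as CALC_ONE.fin and CALC_TWO.fin.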
    L1:
    24:
    24.6:

You can even use a single stand-alone colon. You might use this for those extremely insignificant jump destinations in your code.

    JZ >.fin          ;jump forward to .fin
    JMP >.exit        ;jump forward to .exit
    LOOP .looptop     ;loop backwards to .looptop
    LOOP <.looptop    ;loop backwards to .looptop (alternative form)

Here is an example using unscoped labels:-

    JZ >L10           ;jump forward to L10
    JNC L3            ;jump backwards to L3
    JNC <L3           ;jump backwards to L3 (alternative form)
    JMP 100           ;jump backwards to 100
Instead of

    JZ EXTERNALLABEL

you should code

    JNZ >
    JMP EXTERNALLABEL
    :

This is to help with error checking. GoAsm assumes a conditional jump was meant to be to a place inside the existing source script.
    JMP LABEL               ;look for label in all source scripts
    JMP <INTERNALLABEL1     ;only look for label earlier in source script
    JMP >INTERNALLABEL2     ;only look for label later in source script
    :
    CALL PROCESS
    LOOPZ <

or

    CMP EAX,EDX
    JZ >
    CALL PROCESS
    :
    RET
    JZ >>.fin            ;long forward jump to .fin
    JZ LONG >.fin        ;long forward jump to .fin (alternative form)
    JC <<A1              ;long backward jump to A1
    JC LONG A1           ;long backward jump to A1 (alternative form)
    JC LONG <A1          ;long backward jump to A1 (alternative form)

Note that there is no long form of LOOP and its variations, nor of JECXZ. If you need a long jump for these instructions use this instead:-

    DEC ECX
    JNZ LONG L2          ;long jump replacing LOOP
    OR ECX,ECX           ;test for ecx=0
    JZ LONG >L44         ;long jump replacing JECXZ
Here are examples using unique labels:-
    MOV ESI,ADDR Process_dabs    ;get in esi the address of the code label Process_dabs
    MOV ESI,ADDR Hello2          ;get in esi the address of the string labelled Hello2
    MOV ESI,ADDR HelloX+10h      ;get in esi the address 16 bytes beyond HelloX

Here is an example using a locally scoped re-usable label:-

    MOV ESI,ADDR CALCULATE.fin   ;get in esi the address of the code label .fin in the CALCULATE procedure

Here is an example using a formal structure:-

    MOV ESI,ADDR Lv1.pszText     ;get in esi the address of the pszText member in the formal structure Lv1

For 64-bit code, note that a PUSH, ARG, or MOV to memory of an ADDR or OFFSET for a non-local label (local labels are handled differently) will make use of the R11 register and take advantage of the shorter RIP-relative addressing of the LEA instruction as follows:-

    LEA R11,ADDR Non_Local_Label
    PUSH R11

    LEA R11,ADDR Non_Local_Label
    MOV [MEMORY64],R11

This will also take place with INVOKE when pushing arguments with ADDR, which also includes use of pointers to a string or raw data (ex. 'Hello' or <'H','i',0>).
    MOV ESI,ADDR Hello1      ;get in esi the address of the dword Hello1
    MOV EAX,[ESI]            ;get in eax the value of Hello1

or this which does the same thing:-

    MOV EAX,[Hello1]         ;get in eax the value of Hello1

    MOV ESI,ADDR Hello1      ;get in esi the address of the dword Hello1
    MOV [ESI],EAX            ;write the value in eax to Hello1

or this which does the same thing:-

    MOV [Hello1],EAX         ;write the value in eax to Hello1
    PARAM_DATA DD 0       ;+0h
               DD 0       ;+4h
               DD 55h     ;+8h
               DD 0       ;+0Ch
               DD 0       ;+10h

Then you can use the label to read from and write to a particular part of the structure using a displacement value as follows:-

    MOV ESI,ADDR PARAM_DATA
    MOV EAX,[ESI+8h]             ;get in eax value of third dword
    MOV [ESI+8h],EDX             ;and insert edx instead

or this which does the same thing:-

    MOV EAX,[PARAM_DATA+8h]      ;get in eax value of third dword
    MOV [PARAM_DATA+8h],EDX      ;and insert edx instead

The displacement value can be any value up to 0FFFFFFFFh. It can be positive or negative. Non-numeric elements must be separated by the plus sign.
    PARAM_DATA DD 10h DUP 0

Then you could use indexation (scaling) to multiply the index register to suit:-

    MOV ESI,ADDR PARAM_DATA
    MOV EAX,[ESI+ECX*4]              ;get in eax value of ecx dword
    MOV [ESI+ECX*4],EDX              ;and insert edx instead

or this which does the same thing:-

    MOV EAX,[PARAM_DATA+ECX*4]       ;get in eax value of ecx dword
    MOV [PARAM_DATA+ECX*4],EDX       ;and insert edx instead

You can use indexation (scaling) of 2, 4 or 8, or no scaling at all. The following instructions are all valid:-

    MOVZX EAX,B[PARAM_DATA+ECX]      ;get in eax value of ecx byte
    MOVZX EAX,W[PARAM_DATA+ECX*2]    ;get in eax value of ecx word
    MOV Q[PARAM_DATA+ECX*8],EDX      ;insert edx at ecx qword

Non-numeric elements must be separated by the plus sign.
    PARAM_DATA DD 19h,0,0,22222h
               DD 1Ah,0,0,44444h
               DD 1Bh,0,0,66666h
               DD 1Ch,0,0,88888h
               DD 1Dh,0,0,0AAAAAh
               DD 1Eh,0,0,0CCCCCh

Then you could use indexation (scaling) and displacement as follows:-

    MOV ESI,ADDR PARAM_DATA
    CMP EAX,[ESI+ECX*4]                ;see if there is eax value at ecx dword
    JNZ >L2                            ;no
    MOV EDX,[ESI+ECX*4+0Ch]            ;yes so get the result in edx

or this which does the same thing:-

    CMP EAX,[PARAM_DATA+ECX*4]         ;see if there is eax value at ecx dword
    JNZ >L2                            ;no
    MOV EDX,[PARAM_DATA+ECX*4+0Ch]     ;yes so get the result in edx

You can use indexation (scaling) of 2, 4 or 8, or no scaling at all. The displacement value can be any value up to 0FFFFFFFFh. In your source script it can be positive or negative. Non-numeric elements must be separated by the plus sign.
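Putting scaling and displacement together, a lookup over such a table might be sketched as follows. This is an illustration only: the procedure name FIND_RESULT, the key value 1Ah and the three-record table are assumptions made for the example. Each record is four dwords, so ecx is advanced by 4 to step from one record to the next.

```asm
DATA SECTION
PARAM_DATA DD 19h,0,0,22222h      ;each record is four dwords:
           DD 1Ah,0,0,44444h      ;key, two spare dwords, then result
           DD 1Bh,0,0,66666h

CODE SECTION
START:
MOV EAX,1Ah                       ;hypothetical key to look for
CALL FIND_RESULT                  ;returns edx=44444h for this key
XOR EAX,EAX
RET

FIND_RESULT:
XOR ECX,ECX                       ;ecx counts in dwords, starting at the first record
L1:
CMP EAX,[PARAM_DATA+ECX*4]        ;is the key at the start of this record?
JZ >L2                            ;yes
ADD ECX,4                         ;no, move on four dwords to the next record
CMP ECX,12                        ;3 records x 4 dwords
JB L1                             ;keep looking while records remain
XOR EDX,EDX                       ;not found: return edx=0
RET
L2:
MOV EDX,[PARAM_DATA+ECX*4+0Ch]    ;get the result dword of the matching record
RET
```

The displacement 0Ch reaches the fourth dword of whichever record matched, so the same instruction serves every record in the table.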
    PROCESS_HASH:          ;label to the procedure
    XOR EAX,EAX
    MOV EDX,ESI
    CALL PH23
    MOV EDX,866h           ;return from the procedure with edx=866h
    RET

    PROCESS_HASH:
    XOR EAX,EAX
    MOV EDX,ESI
    CALL PH23              ;transfer execution to the PH23 procedure and return
    MOV EDX,866h           ;return from the procedure with edx=866h
    JMP >SOMEWHERE_ELSE
    ;
    START:                 ;start place for execution
    JMP PROCESS_HASH
    ;

    CALL PROCESS_HASH
    JMP PROCESS_HASH

Sometimes the address of the procedure to go to is held in memory pointed to by a label or a register or even held at a known place in memory in which case you can use for example:-

    CALL [PROCADDRESS]
    CALL [PROCTABLE+20h]
    CALL [ESI]
    CALL [ESI+EDX]
    JMP [4000000h]

Sometimes the address of the procedure to go to is held in a register in which case you can use for example:-

    CALL EAX
    JMP EDI

    #define Hello PROCESS_HASH
    CALL Hello                        ;treated as a call to PROCESS_HASH
    CALL 100h                         ;treated as a call to a relative address
    CALL [HELLO3+ECX+EDX*4]
    CALL [HELLO3+ECX+EDX*4+9000h]
    CALL $$                           ;a call to the start of the current section
    CALL $+20h                        ;a call 20h bytes ahead
So if you want to call a procedure in another source script (which will be producing another object file) just call it in the usual way. Similarly if you have a procedure in another executable (usually a Dll) you can do the same.
For example, suppose you have written My.Dll containing a calculation algorithm you wish to use with the label CALCULATE. You could call it as follows:-
CALL CALCULATE
In your list of Dlls you give to GoLink you will specify My.Dll. GoLink
will first look for the code label CALCULATE in the object files, but will then
look in the specified Dlls. Most other linkers look in library files (.lib files)
for the functions they contain, which means you have to make a lib file.
Either way, in GoAsm syntax there is nothing further for you to do in
your source script. If the linker does not find the destination of the call,
an error will be shown.
This form of the call is a relative call using the opcode E8.
You could also use this form:-
    CALL [CALCULATE]

For this type of call GoAsm uses the opcodes FF15. This is a call to an absolute address. In 32-bit assembly this is a call to a 32-bit address, but in 64-bit assembly it's a call to a 64-bit address.
See also:-
using static code libraries
direct importing by ordinal or specific Dll
using the C Run-time library
Calling Windows APIs (which reside in Windows system Dlls) is very simple where there are no parameters, for example in 32-bit Windows you can use:-
    CALL GetModuleHandle

or its more advanced alternative which can be used either for 32-bit or 64-bit Windows:-

    INVOKE GetModuleHandle

There is nothing else to put in the source script. Since the function being called resides outside the executable you are making, it is the linker's job to find the Dll which contains the GetModuleHandle procedure and it will record the name of the Dll in your executable. GoLink does this from a list of Dlls which you supply.
Most Windows APIs, however, expect to be sent parameters (also known as "arguments") when they are called. It is the programmer's job to ensure that these parameters are sent to the API correctly. The parameters contain the information, or pointers to information, which tell the API what to do. Sometimes the parameters contain addresses of places in memory where the API will insert information.
How you send the parameters depends on whether you are assembling for 32-bit or 64-bit Windows. This is because they each use different calling conventions, and this affects the way parameters are sent and used. 32-bit Windows uses the standard calling convention (STDCALL) and 64-bit Windows uses the so-called fast calling convention (FASTCALL).
GoAsm provides ARG and INVOKE which can be used for both platforms. GoAsm creates the correct code to suit the calling convention to be used. If you are writing only for 32-bits you can use PUSH and CALL to send the parameters, but if you want to port your code to 64-bit Windows later, you will need to change these to ARG and INVOKE. In both 32-bit and 64-bit source code you would use CALL to call procedures in your own executables, unless you are sending parameters to them using one of these calling conventions.
In the STDCALL calling convention used in 32-bit Windows, all the parameters are put on the
stack by the caller, and the stack pointer (ESP) is moved to the top of the parameters on the stack.
Then the API is called. The API uses the parameters on the stack and before returning it restores
the stack to equilibrium by moving the stack pointer to the position it was before the first
parameter was put on the stack.
In the FASTCALL calling convention used in 64-bit Windows, the first
four parameters are put in the RCX,RDX,R8 and R9 registers instead of on the stack. However,
subsequent parameters are put on the stack. The caller needs to ensure that the stack
pointer (in this case RSP) is moved to the top of the parameters as usual, allowing for the
first four parameters which are held in registers (this is to permit the API to keep them on
the stack as if they had been put there in the first place). Another difference is that the
API does not restore the stack into equilibrium before returning from the call (this change makes
it easier for a handful of APIs which do not have a fixed number of parameters).
To enable the same source to be used both for 32-bit and 64-bit programming you would send the parameters using ARG and then call the API using INVOKE, for example:-
    ARG 40h,RDX,RAX,[hwnd]
    INVOKE MessageBoxA
In 32-bit assembly the ARG simply does the same as PUSH, and INVOKE does the same as CALL. GoAsm accepts a PUSH instruction of a 64-bit General Purpose register, so PUSH RDX is treated the same as PUSH EDX. Therefore the above call works on both platforms. In 32-bit assembly it translates as:-
    PUSH 40h,EDX,EAX,[hwnd]
    CALL MessageBoxA
However in 64-bit assembly, the same code translates as:-
    MOV R9,40h
    MOV R8,RDX
    MOV RDX,RAX
    MOV RCX,[hwnd]
    SUB RSP,20h
    CALL MessageBoxA
    ADD RSP,20h
See writing 64-bit programs for more details.
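As a rough illustration of the two translations shown above, here is a small sketch in Python (not GoAsm). It only reproduces the pattern in the text for calls with up to four parameters. The function name translate and the register table are hypothetical, and the sketch assumes the SUB RSP,20h / ADD RSP,20h pair is emitted whatever the number of register parameters. It is not a description of how GoAsm is actually implemented.

```python
# Sketch only: reproduces the 32-bit and 64-bit translations shown in the
# text for a call with at most four parameters. Names are hypothetical.
X64_REGS = ["RCX", "RDX", "R8", "R9"]  # first to fourth API parameter


def translate(api, arg_operands, bits):
    """arg_operands are in ARG/PUSH order, that is, last API parameter first."""
    if len(arg_operands) > 4:
        raise ValueError("this sketch covers register parameters only")
    if bits == 32:
        # ARG behaves like PUSH and INVOKE behaves like CALL
        return ["PUSH " + op for op in arg_operands] + ["CALL " + api]
    lines = []
    count = len(arg_operands)
    for index, operand in enumerate(arg_operands):
        # the first operand in ARG order is the highest-numbered parameter
        lines.append("MOV " + X64_REGS[count - 1 - index] + "," + operand)
    return lines + ["SUB RSP,20h", "CALL " + api, "ADD RSP,20h"]


for line in translate("MessageBoxA", ["40h", "RDX", "RAX", "[hwnd]"], 64):
    print(line)
```

Note that the order of the MOV instructions matters here: R8 takes the value of RDX before RDX itself is overwritten with RAX.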
    int MessageBox(
        HWND hwnd,          // handle of owner window
        LPCTSTR lpText,     // address of text in message box
        LPCTSTR lpCaption,  // address of title of message box
        UINT uType          // style of message box
    );
Using INVOKE you can follow the same order, for example:-
    INVOKE MessageBoxA, [hwnd],EAX,EDX,40h
which is the same as:-
    ARG 40h,RDX,RAX,[hwnd]
    INVOKE MessageBoxA
Note that ARG (like PUSH) takes the parameters in reverse order, with the last parameter of the API given first, whereas parameters after INVOKE are given in the same order as in the API's declaration.
INVOKE lets you straddle two or more lines using the continuation character:-
    INVOKE CreateWindowExA, WS_EX_OVERLAPPEDWINDOW, ADDR szClassName, \
           ADDR szWindowName,\
           WS_OVERLAPPEDWINDOW+THING,\
           100,16,400,0,0,0,[hInstance],0
Since GoAsm looks at the parameters to INVOKE starting from the end, errors near the end will be found first.
When using INVOKE, if you like to tuck away your parameters in a defined word then GoAsm will still get them in the correct order, for example:-
z_function_params=3,2,1 INVOKE z_function, z_function_paramsproduces the same code as:-
ARG 1,2,3 INVOKE z_function
MBTITLE DB 'Hello',0 MBMESSAGE DB 'Click OK',0 PUSH 40h, ADDR MBTITLE, ADDR MBMESSAGE, [hwnd] CALL MessageBoxATo make this easier GoAsm permits the use of PUSH or ARG like this:-
PUSH 40h,'Hello','Click OK',[hwnd] CALL MessageBoxAor, if you were writing source for 32-bit or 64-bit platforms:-
ARG 40h,'Hello','Click OK',[hwnd] INVOKE MessageBoxAor if you prefer to send parameters after INVOKE:-
INVOKE MessageBoxA, [hwnd],'Click OK','Hello',40hYou can also use this with Unicode strings as follows:-
ARG 40h,L'Hello',L'Click OK',[hwnd] INVOKE MessageBoxW INVOKE MessageBoxW, [hwnd],L'Click OK',L'Hello',40hWhen you use any of these forms the string will always be null-terminated. What is happening here is that GoAsm places the string in the const section if there is one (or the data section if there is one, if not, in the code section) and adds a null-terminator. Then GoAsm creates the correct instruction and gives it a pointer to the string. No symbol is made for debugging purposes.
In 64-bit assembly, GoAsm ensures that Unicode strings are aligned on a word boundary as required by the system. Note that this is similar to PUSH ADDR and will make use of the R11 register and take advantage of the shorter RIP-relative addressing of the LEA instruction.
PUSH <23,24,25> ;push a pointer to the bytes 23,24,25or
PUSH <23,6 DUP 20h,23> ;push a pointer to the bytes 23,six spaces then 23or
PUSH <'Hi',0Dh,0Ah,'There',0> ;push a pointer to the null terminated string on two linesYou can also use the < and > operators in this way with ARG and after INVOKE. What is happening here is that GoAsm places the data declaration between the < and > operators in the const section if there is one (or the data section if there is one, if not, in the code section). Then GoAsm creates the correct instruction and gives it a pointer to the data. No symbol is made for debugging purposes.
In 64-bit assembly, GoAsm ensures that data is aligned on a word boundary as would be required by the system if the data contains Unicode strings.
MOV EAX,ADDR 'This is a string' MOV EAX,ADDR <'String',0Dh,0Ah>When GoAsm deals with this code it places a null terminated string or the data between the < and > operators in the const section if there is one (or the data section if there is one, if not, in the code section). Then GoAsm gives the pointer to the data so created to the instruction. No symbol is made for debugging purposes.
This works the same way in 64-bit programming except that GoAsm ensures that a Unicode string or data is word aligned in memory as required by the system. Note that this is similar to PUSH ADDR and will make use of the R11 register and take advantage of the shorter RIP-relative addressing of the LEA instruction.
MOV AL,'1' MOV AX,'12' ;regarded as bytes - 1 first then 2 MOV EAX,'ABCD' ;regarded as bytes - A first, then B then C then DThis makes it much easier to add short strings to memory eg. to add the extension .fil to a filename in memory you can code:-
MOV [EDI],'.fil' ;or MOV EAX,'.fil' MOV [EDI],EAXand not
MOV [EDI],'lif.' ;or MOV EAX,'lif.' MOV [EDI],EAXCMP works in the same way for example:-
CMP AL,'1' CMP EAX,'ABCD' CMP [EDI],'.fil'This does not change the usual reverse order of material not in quotes so for example when you want to add a carriage return and then a linefeed to text you can still use:-
MOV AX,0A0Dh STOSWHere the carriage return (0Dh) which is in AL, is loaded into memory first, then the linefeed (0Ah) in AH is loaded into memory.
MOV EAX,'ABC' ;codes as A then B then C then zeroWhen writing source code for Unicode programs you can ensure that character immediates are Unicode or if necessary, switched between ANSI and Unicode see using the correct string in quoted immediates and switching quoted strings and immediates.
In 64-bit programming you can use the 64-bit registers to contain character immediates which are 8 characters long, for example:-
MOV RAX,'Saturday'However, the CMP instruction is limited to 32-bits, so for example
CMP RAX,'Saturday'would show an error.
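The byte order of character immediates can be checked with a short Python sketch (an illustration only, not GoAsm code). The helper char_immediate is a hypothetical name. It computes the number a register would have to hold so that the first character ends up at the lowest address, which is what the text describes GoAsm arranging for you.

```python
import struct


def char_immediate(text, size):
    """Number whose little-endian bytes spell out text, first character first."""
    data = text.encode("ascii").ljust(size, b"\0")  # short strings end in zero
    return int.from_bytes(data, "little")


# '.fil' as a dword: stored in memory in reading order
print(hex(char_immediate(".fil", 4)), struct.pack("<I", char_immediate(".fil", 4)))

# 'ABC' in a dword register: A then B then C then zero
print(char_immediate("ABC", 4).to_bytes(4, "little"))

# An unquoted number keeps the usual reverse order: 0A0Dh stores 0Dh, then 0Ah
print(struct.pack("<H", 0x0A0D))
```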
MOV [ESI],20hThis puts the number 20h into a place in memory whose address is contained in the register esi. But what is missing from this instruction is whether the number should be loaded as a byte, as a word or as a dword. In other words should one, two or four bytes of memory be altered? All assemblers require a type indicator in instructions of this sort. The syntax in other assemblers is (using dword as an example):-
MOV DWORD PTR [ESI],20h ;MASM MOV DWORD [ESI],20h ;NASM MOV D[ESI],20h ;A386Of course I have used the A386 syntax which requires a lot less typing so that in GoAsm the type indicators you can use are:-
    B  meaning byte
    W  meaning word (two bytes)
    D  meaning dword (four bytes)
    Q  meaning qword (eight bytes)
    T  meaning tword (ten bytes)
You can also use these two switchable type indicators:-
    S  meaning string (default of 1 for ANSI byte, or 2 for Unicode word)
    P  meaning pointer (default of 4 for 32-bit dword, or 8 for 64-bit qword)
See here for more on using the switched type indicator for Unicode/ANSI switching.
INC [COUNT]Here GoAsm does not know (and in fact does not care) whether COUNT is a byte, word or dword. Therefore you must give this a type indicator too for example:-
INC B[COUNT]Although this is a little more work for the programmer, in fact it can be argued that it makes your source script easier to read and understand, since you can always see the size of the operation from the instruction itself, rather than having to go back to see if COUNT was declared as a byte, word or dword.
    AND B[MAINFLAG],0FEh
    ADC W[EAX],66h
    ADD D[MEM_AREA],66h
    BT D[EBX],31D
    CMP D[HELLOWORD],0Dh
    DEC D[ECX]
    DIV B[HELLO]
    INC D[EDX]
    MOV B[MEM_AREA],23h
    MOVSX EDX,B[EDI]
    MUL B[HELLO]
    NEG W[ESI]
    NOT D[HELLO3]
    OR B[MAINFLAG],1h
    SETZ B[BYTETEST]
    SHL W[IAMAWORD],23h
    SHL D[IAMADWORD],CL
    SUB D[EBP+10h],20D
    TEST B[ESP+4h],1h
    XOR D[IMAWORD],11111111h
And in 64-bit programming you might also see, for example
    ADC W[RAX],66h
    BT D[R12],31D
    INC Q[RDX]
    NEG W[R15D]
AND [MAINFLAG],CL CMP [HELLOWORD],EDI MOV [IAMABYTE],AL MOV [IAMADWORD],ESI OR [MAINFLAG],BH XCHG CL,[ESI]Also none of the mmx, xmm or 3DNow! instructions require a type indicator. Several of the x87 floating point instructions do not need a type indicator. Those which do can take more than one operand size. There are also several instructions which can only take one operand size so with these there is no need for a type indicator. For example CALL, JMP, PUSH, and POP always take a dword. See half stack operations for the use of PUSHW and POPW. Also some less common instructions do not need a type indicator, for example ARPL, BOUND, BSF, BSR, CMOV (in all forms), CMPXCHG, and CMPXCHG8B.
PUSH 0,23h,[hwnd],ADDR lParam,EAX POP EAX,[EBP+2Ch],[hwnd] DEC ECX,EDX,[COUNT] INC [EBP+10h],EDI DB 23h,24h,25hThe instructions here are always assembled in left-to-right order.
66ABCDEh ;a hex number 34567789 ;a decimal number 1100011B ;a binary number 1.0 ;a real number 1.0E0 ;a real numberGoAsm accepts these numbers but also supports numbers in these formats:-
9999999D ;a decimal number 0x456789 ;a hex numberA hex number which begins with a letter (that is A to F, being values 10 to 15 decimal) must begin with a zero, for example:-
0A789ABCDh or 0xA789ABCD
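To make the integer formats above concrete, here is a minimal Python sketch of a parser for them (an illustration only). The name parse_number is hypothetical, real numbers such as 1.0E0 are deliberately left out, and the sketch says nothing about how GoAsm itself reads numbers.

```python
def parse_number(text):
    """Parse the integer formats listed above (hypothetical helper)."""
    token = text.strip()
    if token[:2].lower() == "0x":
        return int(token[2:], 16)      # 0x456789
    suffix = token[-1].lower()
    if suffix == "h":
        return int(token[:-1], 16)     # 66ABCDEh, 0A789ABCDh
    if suffix == "b":
        return int(token[:-1], 2)      # 1100011B
    if suffix == "d":
        return int(token[:-1], 10)     # 9999999D
    return int(token, 10)              # 34567789


for sample in ("66ABCDEh", "34567789", "1100011B", "9999999D", "0x456789"):
    print(sample, "=", parse_number(sample))
```

The leading zero in 0A789ABCDh is what lets an assembler tell the number apart from a symbol name; the sketch simply accepts it as part of the hex digits.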
Be careful using the OR, AND and NOT logical operators, since these are actually mnemonics. Although GoAsm recognises them if you use them in places where mnemonics are not expected, you can use instead | for OR, & for AND, and ! for NOT.
Arithmetic in brackets is carried out first, otherwise calculations are carried out in strict left-to-right order. Here are some examples:-
    DB 2*3
    DB (2+30h)/(2+1)
    DD (2000h+40h-20h)/2
    DD SIZEOF HELLO/2
    DD 444444h & 226222h
    DB 20h/2 DUP 44h
    DB 6+2 DUP 0
    #define globule (2*3)/2
    DB globule
    DD globule|100h
    DD 2D00h>>8
    DQ 2D00h<<48
    MOV EAX,globule|100h
    MOV EAX,SIZEOF HELLO*2
    MOV EAX,ADDR HELLO+10h
    MOV EAX,0x68+0x69-0x70
    MOV EAX,[MemName+0x68+0x69-0x70]
    MOV EAX,[ESI*4+45000h]
    MOV EAX,[ESI*4+SIZEOF HELLO/2]
    MOV EAX,8+8*2      ;result is 32
    MOV EAX,8+(8*2)    ;result is 24
Divisions are rounded to the nearest whole number, with an exact half rounded up, eg.
    MOV EAX,32/3       ;puts 11 into eax
    MOV EAX,31/3       ;puts 10 into eax
    MOV EAX,10/4       ;puts 3 into eax
GoAsm assumes that all multiplication and division is carried out using unsigned numbers. MUL and DIV are used at compile-time and not their signed counterparts IMUL and IDIV. See understand signed numbers for more about signed numbers.
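The evaluation rules above can be tried out with a small Python sketch (an illustration only, handling unsigned decimal numbers and the four arithmetic operators). The function evaluate is a hypothetical name, and the rounding rule it uses (nearest whole number, half rounded up) is inferred from the three division examples in the text rather than taken from GoAsm's source.

```python
import re


def evaluate(expression):
    """Evaluate + - * / strictly left to right, with brackets done first."""
    tokens = re.findall(r"\d+|[-+*/()]", expression)
    position = 0

    def operand():
        nonlocal position
        token = tokens[position]
        position += 1
        if token == "(":
            value = sequence()
            position += 1  # step over the closing bracket
            return value
        return int(token)

    def sequence():
        nonlocal position
        value = operand()
        while position < len(tokens) and tokens[position] != ")":
            operator = tokens[position]
            position += 1
            right = operand()
            if operator == "+":
                value += right
            elif operator == "-":
                value -= right
            elif operator == "*":
                value *= right
            else:
                value = (value + right // 2) // right  # nearest, half rounds up
        return value

    return sequence()


print(evaluate("8+8*2"), evaluate("8+(8*2)"), evaluate("32/3"))
```

Because there is no operator precedence, an expression such as 5-1*4 gives 16 rather than 1, which is worth keeping in mind when porting expressions from "C" header files.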
DD 1.6789E3 DQ 1.6789E3 DT 1.6789E3 DD 3 DUP 7.6789E-2 DQ 678.27896435E3 DT 1.2You may also declare PI directly either as a tword, qword or dword as follows:
DD PI ;pi as a dword DQ PI ;pi as a qword DT PI ;pi as a twordGoAsm tries to achieve maximum accuracy in providing pi by writing a known number directly into the mantissa.
You can also declare real numbers as follows:-
PUSH 1.1 MOV EAX,1.1Both of these use a 32-bit format for the real number. The first places that number on the stack and the second moves it into the specified register.
DIRECT_PI DT 4000C90FDAA22168C235hand load it using:-
    FLD T[DIRECT_PI]
The most significant bit (bit 79) in this tword declaration is a sign bit indicating whether the real number is positive or negative. In this case the number is positive because the sign bit is not set. The remainder of the first four hex digits contain the exponent. This is biased by a value of +3FFEh in 80 bit real numbers. This permits exponents of between -3FFEh and +4001h to be handled without using the most significant bit (the exponents become 0 to 7FFFh). The remainder of the hex digits contain the mantissa.
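The layout of the tword can be checked with a short Python sketch (an illustration only, with the hypothetical name decode_tword). It follows the description above: with a bias of 3FFEh the 64-bit mantissa is read as a fraction below 1. The more common description uses a bias of 3FFFh and reads the mantissa as a number between 1 and 2; both readings give the same value. Zero, denormals, infinities and NaNs are not handled.

```python
import math


def decode_tword(hex_digits):
    """Split an 80-bit real into sign, biased exponent and value."""
    raw = int(hex_digits, 16)
    sign = raw >> 79                      # bit 79
    exponent = (raw >> 64) & 0x7FFF       # 15 bits below the sign bit
    mantissa = raw & 0xFFFFFFFFFFFFFFFF   # remaining 64 bits
    value = (mantissa / 2**64) * 2.0 ** (exponent - 0x3FFE)
    return sign, exponent, (-value if sign else value)


sign, exponent, value = decode_tword("4000C90FDAA22168C235")
print(sign, hex(exponent), value, math.pi)
```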
Mess DB 'I am a string of characters',0 PUSH 'This is supposed to be a carat ^' MOV EAX,'£$|@'It must be asked what actual values are loaded by GoAsm when issuing these instructions? At assemble time GoAsm views your source script using Windows file mapping, and then reads it character by character. In other words GoAsm is given the value of the characters in the source script by Windows. When GoAsm loads in the object file strings of the sort shown above, it loads the same value character as given to it by Windows. In the case of conversions from ANSI to Unicode strings, these are passed first through the API MultiByteToWideChar. This means that the value given to GoAsm by Windows will match that in the current character set (code page). Accordingly you need to ensure that the character set used in the computer which runs GoAsm is the character set for which your program is designed to run.
If you are using a source script which is in a Unicode format (UTF-8 or UTF-16) then the codepage issue disappears. The correct characters are given by their Unicode value.
CMP AL,124D ;see if character is an OR as in some character sets JZ >L4 ;yes CMP AL,221D ;see if character is an OR as in some character sets JZ >L4 ;yesHere you have already allowed for a possible variation in the user's own character set. If necessary you can arrange for your code to test the user's character set at run-time, and to test for the correct characters or use the correct strings accordingly. You can also test the language of the user's machine and provide strings in the correct language. The resource APIs provide a way this can be done automatically - see the manual to GoRC, my resource compiler.
    ,               - the instruction is not finished, continue
    ; or //         - a comment line - ignore to end of line
    /*.........*/   - continuous comment - ignore between the marks
    \               - the material is continuing on the next line
    - number        - the number is negative
    ! number        - invert the number (like NOT)
    NOT             - invert the number
    ~ number        - same
    +               - the plus sign
    -               - the minus sign
    *               - the multiply sign
    /               - the divide sign
    |               - bitwise OR
    OR              - bitwise OR
    &               - bitwise AND
    AND             - bitwise AND
    << number       - bit shift left by the number
    >> number       - bit shift right by the number
    (....)          - perform calculation in brackets first
## in a definition has a special meaning see
using double hashes in definitions.
PUSH ADDR LV_COLUMN,EAX,101Bh,hListView CALL SendMessageA ;insert eax columnNow let's look more closely at the LV_COLUMN structure.
LV_COLUMN DD 6 DUP 0However, in the Windows information, each of the six dwords has a name which gives some idea of what it is used for, which is useful. Also the very first dword is a mask which identifies which of the later members of the structure are valid. This mask is important because a later version of the structure has another two members, and the mask needs to be different. So it might be better to declare the structure in data like this so that the mask can be initialised with a value, and so that you can see the names in your source script:-
    LV_COLUMN DD 0Fh     ;+0h   mask
              DD 2h      ;+4h   fmt=LVCFMT_CENTER=2
              DD 0       ;+8h   cx
              DD 0       ;+0Ch  pszText
              DD 0       ;+10h  cchTextMax
              DD 0       ;+14h  iSubItem
Here you can see that whilst declaring the structure in data we have taken the opportunity to initialise two of the members with values which will not change, and have included in the comments the offset details, member names and other information.
MOV EDI,ADDR LV_COLUMN MOV ESI,ADDR ColumnText ;get the column text to use MOV [EDI+0Ch],ESI ;and give it to the structure MOV D[EDI+8h],50D ;and make the width 50 pixelsor you can use:-
MOV ESI,ADDR ColumnText ;get the column text to use MOV [LV_COLUMN+0Ch],ESI ;and give it to the structure MOV D[LV_COLUMN+8h],50D ;and make the width 50 pixels
Here is an example of a structure template made with the name LV_COLUMN:-
LV_COLUMN STRUCT mask DD 0Fh ;mask fmt DD 2h ;LVCFMT_CENTER=2 cx DD 0 pszText DD 0 cchTextMax DD 0 iSubItem DD 0 ENDSI have added some comments here to help understand the initialisation of two members of the structure. Note ENDS (literally END STRUCT) marks the end of the template. If you prefer you can also mark the end of the template by giving the structure name again followed by ENDS eg.
LV_COLUMN ENDSThe second stage is to use the template. You do this by using the template in the data section, usually preceded by a label, for example:-
Lv1 LV_COLUMNHere you have declared six dwords using the LV_COLUMN structure template and you have given the structure declaration the label Lv1.
RECT STRUCT left DD top DD right DD bottom DD ENDS rc RECTcreates the following symbols:-
rc rc.left rc.top rc.right rc.bottom
MOV ESI,ADDR ColumnText ;get the column text to use MOV [Lv1.pszText],ESI ;and give it to the structure MOV D[Lv1.cx],50D ;and make the width 50 pixelsor even
MOV ESI,ADDR ColumnText ;get the column text to use MOV EDX,ADDR Lv1.pszText ;get the psztext member MOV [EDX],ESI ;and load the text to use MOV EDX,ADDR Lv1.cx ;get the cx member MOV D[EDX],50D ;and make the width 50 pixelsBut there is still nothing to stop you from doing this which is the same thing:-
MOV ESI,ADDR ColumnText ;get the column text to use MOV [Lv1+0Ch],ESI ;and give it to the structure MOV D[Lv1+8h],50D ;and make the width 50 pixelsAlthough it is more complex to set up, the advantage of the former method is that when you look at your code in the symbolic debugger the symbols in the structure will appear in full, with both the structure label and the member name appearing which is some advantage. This is because GoAsm creates symbols for all the members of the structure and passes these to the linker. As far as I am aware this is unique to GoAsm and other assemblers do not do this.
POINT STRUCT left DD 0 right DD 0 ENDSThen
MOV EBX,POINT.rightThis loads the value 4 into EBX, which is the distance of the member from the beginning of the structure.
This way of getting an offset is sometimes useful to get information sent by Windows in a structure. As an example, the OFNHookProc callback procedure receives from Windows information in a WM_NOTIFY message. The lParam parameter contains a pointer to an OFNOTIFY structure. This is a nested structure with the following form:-
OFNOTIFY STRUCT hdr NMHDR lpOFN DD pszFile DD ENDSwhere the NMHDR structure is:-
NMHDR STRUCT hwndFrom DD idFrom DD code DD ENDSSo within your window procedure you can get the value of the member idFrom in the NMHDR (identifier of the control sending the message) as follows:-
MOV ESI,[EBP+14h] ;get the pointer to the OFNOTIFY structure MOV EAX,[ESI+OFNOTIFY.hdr.idFrom] MOV EDX,[ESI+OFNOTIFY.pszFile]In fact what is happening here is that OFNOTIFY.hdr.idFrom resolves to a value of 4; OFNOTIFY.pszFile resolves to a value of 10h. These are their correct offsets from the beginning of the OFNOTIFY structure. Of course the structures concerned must be known to GoAsm. This is done by including the structure templates in the assembler source script, somewhere earlier in the file.
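The two offsets quoted above can be confirmed independently with Python's ctypes module (an illustration only; the classes below simply mirror the 32-bit templates in the text, with every DD member modelled as a 32-bit unsigned integer).

```python
import ctypes


class NMHDR(ctypes.Structure):
    _fields_ = [("hwndFrom", ctypes.c_uint32),
                ("idFrom", ctypes.c_uint32),
                ("code", ctypes.c_uint32)]


class OFNOTIFY(ctypes.Structure):
    _fields_ = [("hdr", NMHDR),            # nested structure, three dwords
                ("lpOFN", ctypes.c_uint32),
                ("pszFile", ctypes.c_uint32)]


# Offset of a nested member = offset of the nested structure + offset within it
idfrom_offset = OFNOTIFY.hdr.offset + NMHDR.idFrom.offset
pszfile_offset = OFNOTIFY.pszFile.offset
print(hex(idfrom_offset), hex(pszfile_offset))
```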
RECT STRUCT left DD 10 top DD 10 right DD 120 bottom DD 90 ENDSYou can override the initialisation of the structure using the < and >, { and } operators for example
rc1 RECT <0,20,120,300>sets the dwords in the data structure to 0, 20, 120 and 300 respectively.
rc1 RECT <0,?,?,300> rc1 RECT <0,,,300>here you override only the first and fourth members of the structure.
Using braces you can pick and choose which members to override:-
rc1 RECT {left=2,top=5}or you can mix the two methods:-
rc1 RECT <{left=2,top=5},300h>When using braces you don't need to specify the full symbol name (in the above example this would be "rc1.left" and "rc1.top"). Instead you only specify the ultimate name ("left" and "top"). The override is also carried out into nested structures, so if you use the same names for members within a nested structure it is possible to initialise several members at once using one brace override.
UP STRUCT DB 27 DUP 0 DB 2 DUP 0 ENDS Pent UP <'My cat was born on 23 April',<23h,4h>>So, for example here is the GUID structure and a typical initialisation for COM:-
GUID STRUCT Data1 dd ? Data2 dw ? Data3 dw ? Data4 db 8 dup ? GUID ENDS IID_IShellLink GUID <0000214eeh, 00000h, 00000h, <0c0h, 00h, 00h, 00h, 00h, 00h, 00h, 46h>>
RECT STRUCT left DD top DD right DD bottom DD ENDSand
RECT STRUCT left DD 0 DD 2 DUP 0 bottom DD 0 ENDSand
RECT STRUCT DD 4 DUP 0 ENDSare equally valid structure declarations. However, where members are named they must be on a new line.
RECT STRUCT left DD 0 top DD 0 right DD 0 bottom DD 0 ENDS RECT2 STRUCT left DD 0 top DD 0 right DD 0 bottom DD 0 ENDSIf you use ? in the initialisation of the structure members this has the same effect as using zero. This does not result in the data being recorded as uninitialised, as it would do with an ordinary data declaration, so
RECT STRUCT left DD ? top DD ? right DD ? bottom DD ? ENDS rc1 RECTis perfectly valid, but the data will go in the section initialised to zero as if zeroes had been used.
In a structure template you can make additional data on one line in the usual way so that this would be a structure template of four dwords:-
RECT STRUCT lefttop DD 0,0 rightbottom DD 0,0 ENDS
RECT <>,<>,<>,<>Creates four RECT structures (four dwords in each). Since no label has been used in front of the RECT, no symbols at all will be created and passed to the debugger. In this example:-
Buffer RECT <0,0,10,10>,<5,5,20,20>,<8,8,30,30>an array is made of three RECT structures (four dwords in each) initialised to the values provided. Symbols will only be made for the very first structure. This is to avoid duplication of symbol names.
If you want the members of the array to have unique symbol names you would need to use (for example):-
Buffer1 RECT <0,0,10,10> Buffer2 RECT <5,5,20,20> Buffer3 RECT <8,8,30,30>or
Buffer RECT3 <0,0,10,10, 5,5,20,20, 8,8,30,30>where RECT3 is a structure of 3 RECTS.
If you don't need to initialise the structures you can repeat them using either:-
Buffer RECT <>,<>,<>which creates three RECT structures, or
Buffer RECT,RECT,RECTwhich does the same thing.
You can also use DUP to repeat structures for example:-
ThreeRects RECT 3 DUP <> FiveRects RECT 5 DUP <23,24,25,26>In the second example each RECT is initialised to the same value. Initialisation of duplicated structures in this way can only be done at the top level and not in nested structures.
RECT STRUCT left DD 0 top DD 0 right DD 0 bottom DD 0 ENDS StructTest STRUCT a DD 6 b RECT c DD 7 d DD 8 ENDSThen
Hello StructTestCreates seven dwords. The symbols created (and passed to the debugger) are:-
Hello Hello.a Hello.b Hello.b.left Hello.b.top Hello.b.right Hello.b.bottom Hello.c Hello.dand they can be read from or written to in the usual way, for example
MOV D[Hello.b.left],100h ;make rectangle start at 256 pixelsLike structure members, nested structures need not be named, so that this is perfectly valid:-
StructTest STRUCT DD 6 RECT c DD 7 d DD 8 ENDS
StructTest STRUCT a DD 6 b STRUCT left DD 0 top DD 0 right DD 0 bottom DD 0 ENDS c DD 7 d DD 8 ENDSThen
Hello StructTestproduces the same result as StructTest in the previous example. The only difference is that the RECT structure is not available for use elsewhere.
rc1 StructTest <23,<10,20,120,300>,44,55>will initialise the main structure and also its nested RECT member
rc1 StructTest <,<10,20,120,?>,44,55>will only override the initialisation of some members as will
rc1 StructTest <,<10,20,120,>,44,55>but this will not change the nested RECT member:-
rc1 StructTest <,,44,55>A good way to keep track of the brackets is to visualise a question mark for those members which you do not want to alter. Or you can even insert the mark for easier reading, for example the last example can be written:-
rc1 StructTest <?,?,44,55>
RECT STRUCT left DD 1 top DD 2 right DD 3 bottom DD 4 ENDS StructTest STRUCT a DD 6 b RECT <3333h,4444h,5555h,> c DD 7 d DD 8 ENDSThen
Hello StructTest <,<,0Bh,0Ch,>,,>Then RECT would be initialised to 3333h,0Bh,0Ch,4
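The layering in this example (template values, then the override in StructTest, then the override when Hello is declared) can be modelled in a few lines of Python. This is an illustration only; the helper override is a hypothetical name and None stands for an empty or ? position between the < and > operators.

```python
def override(current, values):
    """Apply a <...> style override; None leaves the existing value alone."""
    return [old if new is None else new for old, new in zip(current, values)]


rect_template = [1, 2, 3, 4]                                      # RECT STRUCT
in_structtest = override(rect_template, [0x3333, 0x4444, 0x5555, None])
in_hello = override(in_structtest, [None, 0x0B, 0x0C, None])      # Hello StructTest
print([hex(value) for value in in_hello])
```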
Overrides naming members using { } braces have a higher priority than overrides using the < and > brackets.
StringStruct STRUCT DB 'I am a lonely string in a struct',0 DB 'I will keep you company',0 ENDS
Rect STRUCT a DB ? DB 0 b DB ? DB 0 ENDS RC1 Rect <'Hello',,'Goodbye'>will set the Rect structure to null terminated strings of 5 and 7 bytes respectively.
Rect STRUCT a DB 'Hello' DB 0 b DB 'Goodbye' DB 0 ENDSThen overriding the initialisation will not change the size of the members, so that eg.
    RC1 Rect <'Goodbye',,'Hello'>
would result in a string at label RC1.a of 'Goodb' and a string at label RC1.b of 'Hello' followed by two nulls, since the rest of the member is padded with nulls.
The initialisation of structure members established using DUP can also be overridden by strings, for example:-
UP STRUCT DB 20 DUP 0 ENDS Pent UP <"Hello">results in the string Hello followed by 15 nulls.
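The truncate-or-pad behaviour described for string overrides of fixed-size members can be sketched in Python as follows (an illustration only; fit is a hypothetical helper and ANSI strings are assumed).

```python
def fit(text, size):
    """Cut the string to the member size, or pad it with nulls to fill it."""
    data = text.encode("ascii")[:size]
    return data.ljust(size, b"\0")


print(fit("Goodbye", 5))   # member initialised as DB 'Hello' (5 bytes)
print(fit("Hello", 7))     # member initialised as DB 'Goodbye' (7 bytes)
print(fit("Hello", 20))    # member declared as DB 20 DUP 0
```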
See also Using Unicode strings in structures.
NMHDR STRUCT hwndFrom DD idFrom DD code DD ENDS ; NMTTDISPINFO STRUCT hdr NMHDR lpszText DD #if STRINGS UNICODE szText DW 80 DUP ? #else szText DB 80 DUP ? #endif hinst DD uFlags DD lParam DD ENDS DATA Use1 NMTTDISPINFO Use2 NMTTDISPINFO <<>,,"Hello",,,,>The second use of the structure will assemble the string "Hello" either in Unicode or in ANSI depending on whether STRINGS are defined as Unicode.
For example, take the union template declared as follows:-
Thing UNION Cat DD 0 Dog DW 0 Rat DB 0 ENDSThen you can use this template as follows:-
Hungry ThingThis then sets aside a data area of 4 bytes (a dword). Why only 4 bytes? Because each member starts in the same place. The end of the union template is marked by ENDS although you can use ENDUNION if you prefer.
The symbols created by this union are:-
Hungry Hungry.Cat Hungry.Dog Hungry.RatAnd you can address these labels in the usual way, for example:-
MOV [Hungry.Cat],EAX MOV AL,[Hungry.Dog] MOV ESI,[Hungry.Rat]Which of course since each member starts in the same place, is the same as:-
MOV [Hungry.Cat],EAX MOV AL,[Hungry.Cat] MOV ESI,[Hungry.Cat]
You can nest unions in structures, or nest structures in unions, or nest unions in unions, for example
Laugh STRUCT Balm DW 0 Ointment DB 0 ENDS ; Zebra UNION Tiger DD 0 Hyaena Laugh ENDS ; Lion STRUCT BagPuss DB 3 DUP 0 Striped Zebra ENDS ; Fierce LionWhich produces the following symbols at the following offsets from Fierce:-
    Fierce                             +0
    Fierce.BagPuss                     +0
    Fierce.Striped                     +3
    Fierce.Striped.Tiger               +3
    Fierce.Striped.Hyaena              +3
    Fierce.Striped.Hyaena.Balm         +3
    Fierce.Striped.Hyaena.Ointment     +5
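The offsets in this list follow from two simple rules: in a structure each member starts after the previous one, and in a union every member starts at the same place. The Python sketch below (an illustration only) applies those rules to the Laugh, Zebra and Lion templates. The table format and the names size_of and symbols are hypothetical, and no padding between members is assumed, as in the list above.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8}

# (kind, [(member name, type, repeat count)]) for each template in the text
TEMPLATES = {
    "Laugh": ("STRUCT", [("Balm", "DW", 1), ("Ointment", "DB", 1)]),
    "Zebra": ("UNION", [("Tiger", "DD", 1), ("Hyaena", "Laugh", 1)]),
    "Lion": ("STRUCT", [("BagPuss", "DB", 3), ("Striped", "Zebra", 1)]),
}


def size_of(type_name):
    if type_name in SIZES:
        return SIZES[type_name]
    kind, members = TEMPLATES[type_name]
    sizes = [size_of(member_type) * count for _, member_type, count in members]
    return max(sizes) if kind == "UNION" else sum(sizes)


def symbols(label, type_name, base=0):
    """Return {symbol name: offset from the start of the outer label}."""
    found = {label: base}
    if type_name in SIZES:
        return found
    kind, members = TEMPLATES[type_name]
    offset = base
    for name, member_type, count in members:
        found.update(symbols(label + "." + name, member_type, offset))
        if kind == "STRUCT":  # union members all stay at the same offset
            offset += size_of(member_type) * count
    return found


for name, offset in symbols("Fierce", "Lion").items():
    print(name, "+" + format(offset, "X"))
```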
Lion STRUCT BagPuss DB 3 DUP 0 Striped UNION Tiger DD 0 Hyaena STRUCT Balm DW 0 Ointment DB 0 ENDS ENDS ENDS ; Fierce Lionproduces the same result as the previous example. The only difference is that Laugh and Zebra are not available for use elsewhere.
Cat UNION Ginger DB Tortie DW Grey DD Tabby DQ ENDS Hungry Cat <"a string for Ginger"> Anxious Cat <,4444h> ;initialises the word Sleepy Cat <,,55555555h> ;initialises the dword Insistent Cat <,,,6666666666666666h> ;initialises the qwordIt is even more difficult when using unions to keep track of the < and > operators, so instead if you prefer, you can specify the name of the member inside { and } operators, or you can initialise them at run time, for example:-
Scaredy Cat {Ginger="a string for Ginger"} GString DB "a string for the Grey cat" MOV [Scaredy.Grey],ADDR GString ;loads a pointer to GStringRemember that since union members are at the same place a later initialisation can rub over an earlier one.
Example:-
Laugh STRUCT Balm DW 6666h Ointment DB ENDS ; Zebra UNION Tiger DD 88888888h Hyaena Laugh ENDS ; Lion STRUCT BagPuss DB DB DB Striped Zebra ENDS ; Fierce Lion <{Ointment=0AAh}22h,33h,44h,<?,<?,55h>>>Which initialises the data area as follows:-
    At Fierce.BagPuss (at offset +0)               22h,33h,44h
    Then at Fierce.Striped.Tiger (at offset +3)    66h,66h,0AAh,88h
What happened here is that the Tiger dword at +3 in the Zebra union was initialised to 88888888h but then overridden by the values of Balm and Ointment (which were within the same union). Only the very last byte survived.
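The way a later initialisation rubs over an earlier one can be reproduced by writing the values into the same four bytes in turn. The Python sketch below is an illustration only; it writes the members of the Zebra union in the order the text describes, using little-endian byte order.

```python
import struct

union = bytearray(4)                           # the 4 bytes at Fierce.Striped
struct.pack_into("<I", union, 0, 0x88888888)   # Tiger DD 88888888h
struct.pack_into("<H", union, 0, 0x6666)       # Hyaena.Balm DW 6666h
struct.pack_into("<B", union, 2, 0xAA)         # Hyaena.Ointment, brace override 0AAh

print(" ".join(format(byte, "02X") + "h" for byte in union))
```

Only the fourth byte of the Tiger dword is left untouched, which matches the 66h,66h,0AAh,88h shown above.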
When using GoAsm, for definitions which can be fitted onto one line, you might like to use EQU or =, or #define as you would in "C". Just use the one you like best. You can use the continuation character ("\") to allow definitions to span more than one line, but it is better if you use MACRO...ENDM instead. This avoids syntax problems.
Since GoAsm is a one-pass assembler, you must ensure that your definitions are not used before they are declared in the source script. Once a word has been defined you can change its definition but GoAsm will warn you of this since it may not be intended.
Here are some examples of how definitions can be used.
WS_CHILD=40000000h WS_CHILD EQU 40000000h #define WS_CHILD 40000000hYou can use arithmetic or strings or even other definitions when you define a word. Here are some examples:-
SKIP_VALUE EQU 20h|40h #define SKIP_VALUE 20h|40h HelloText='Hello world' #define HelloText "Hello world" MANIA=SKIP_VALUE+WS_CHILDIf you don't give a value for the equate it is set to a value of 1 that is, in Windows-speak, TRUE. For example:-
NT_VERSION= NT_VERSION EQU #define NT_VERSIONOnce a word is defined you can use the word in almost any situation where the definition is valid, for example:-
DB HelloText PUSH WS_CHILD|WS_VISIBLE|SS_OWNERDRAW MOV EAX,WS_CHILD MOV EAX,[ESI+SKIP_VALUE] MOV EAX,MANIA+800h
#define lParam [EBP+14h]Then to use the definition you could code as follows:-
MOV lParam,EAX ;same as MOV [EBP+14h],EAX
RECTB(%a,%b,%c,%d) = DD %a,%b,%c,%dThen, you can declare four dwords initialised as specified in the arguments:-
rc1 RECTB (10,10,100,200) ;same as DD 10,10,100,200Here is another example using #define:-
#define DBDATA(%a,%b) DB %a DUP %b DBDATA(3,'x') ;same as DB 3 DUP 'x'There is an important syntax rule when using arguments in definitions. When giving the definition the arguments in brackets must be tight against the word which is defined so that
RECTB(%a,%b,%c,%d)is good but
RECTB (%a,%b,%c,%d)is bad. This rule is to ensure that GoAsm knows the things in brackets are arguments and not something else.
#define WS_POPUP 0x80000000L #define WS_BORDER 0x00800000L #define WS_SYSMENU 0x00080000L #define WS_POPUPWINDOW (WS_POPUP | \ WS_BORDER | \ WS_SYSMENU)This last example was taken straight out of the Windows header file Winuser.h and you can see that it is in typical "C" syntax. GoAsm is quite happy with this. In fact in GoAsm the brackets are optional, so this is also perfectly good syntax:-
#define WS_POPUPWINDOW WS_POPUP | \ WS_BORDER | \ WS_SYSMENUYou may prefer to use MACRO...ENDM instead of the continuation character. The above example can then be re-written as:-
WS_POPUPWINDOW MACRO WS_POPUP | WS_BORDER | WS_SYSMENU ENDM
You can use the multi-line definition method to make a word mean several lines of code instruction, for example:-
OPEN_STACKFRAME(a) = PUSH EBP \ MOV EBP,ESP \ SUB ESP,a*4 \ PUSH EBX,EDI,ESI CLOSE_STACKFRAME = POP ESI,EDI,EBX \ MOV ESP,EBP \ POP EBPUsing MACRO...ENDM this is:-
OPEN_STACKFRAME(a) MACRO PUSH EBP MOV EBP,ESP SUB ESP,a*4 PUSH EBX,EDI,ESI ENDM CLOSE_STACKFRAME MACRO POP ESI,EDI,EBX MOV ESP,EBP POP EBP ENDMIn this example the word OPEN_STACKFRAME is defined to make a stack frame which could typically be used in a windows procedure called by the Windows system. It has an argument which holds the number of dwords in the stack frame to accept local data (the stack pointer is moved by that amount so that the stack can be used to hold local data). The second definition in this example closes the stack frame. Now here is how to use these definitions. In the code section:-
    WndProc:                    ;name of this procedure
    OPEN_STACKFRAME (6)         ;create space for 6 dwords of local data
    ;----------------- insert window procedure code here
    CLOSE_STACKFRAME
    RET 10h                     ;remove from stack 4 parameters sent by windows
Now let's add some refinement so that the stack can be accessed easily:-
    lParam =[EBP+14h]    ;
    wParam =[EBP+10h]    ; get ready to access the parameters which
    uMsg   =[EBP+0Ch]    ; are sent by Windows to the window procedure
    hwnd   =[EBP+8h]     ;
    ;
    hDC    =[EBP-4h]     ; some names of data things
    hBrush =[EBP-8h]     ; often used in different
    hPen   =[EBP-0Ch]    ; window procedures
    DATA1  =[EBP-10h]    ;
    DATA2  =[EBP-14h]    ; space for more local data
    DATA3  =[EBP-18h]    ;
Inside the stack frame which has been made here the parameters sent by Windows (hwnd, uMsg, wParam and lParam) will always be on the stack at EBP+14h to EBP+8h. Then at EBP+4h we find the return address after the call. At EBP we have the previous value of EBP which we pushed when the stack frame was made. Then at EBP-4h to EBP-18h we have the space for our local data, which in this example can be accessed using the definitions hDC, hBrush, hPen, DATA1, DATA2 and DATA3 (or whatever you want to call them). Then at EBP-1Ch we have the value of EBX when it was pushed when the stack frame was made. Likewise EDI is at EBP-20h and ESI is at EBP-24h. All these values are protected while the stack frame remains open (they will not be written over by other functions until the callback is finished).
To access the data within the stack frame you must make sure you don't change ebp (or if you do use it, you restore it to its original value). You don't have to access the data by name. In this example MOV EAX,[hBrush] is the same as MOV EAX,[EBP-8h]. This is a matter of style and taste.
Using these methods you can establish as much local data as you want, and if you stick to a fixed method like this you will always know where your local data is. In this example, the first dword of local data is always at EBP-4h. Subtract 4 from this value to access each additional dword of local data.
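The stack frame layout described above can be worked out mechanically. The Python sketch below (an illustration only, for 32-bit code with dword slots) computes the offset from EBP of each parameter, local and saved register. The function frame_map is a hypothetical name, not anything provided by GoAsm.

```python
def frame_map(parameters, local_names, saved_registers):
    """Offsets from EBP for the frame made by OPEN_STACKFRAME in the text."""
    offsets = {}
    # [EBP] holds the old EBP and [EBP+4h] the return address,
    # so the first parameter is at EBP+8h
    for index, name in enumerate(parameters):
        offsets[name] = 8 + 4 * index
    # SUB ESP,n*4 opens the local data area just below EBP
    for index, name in enumerate(local_names):
        offsets[name] = -4 * (index + 1)
    # PUSH EBX,EDI,ESI then stores the registers below the local data
    lowest_local = -4 * len(local_names)
    for index, register in enumerate(saved_registers):
        offsets[register] = lowest_local - 4 * (index + 1)
    return offsets


frame = frame_map(["hwnd", "uMsg", "wParam", "lParam"],
                  ["hDC", "hBrush", "hPen", "DATA1", "DATA2", "DATA3"],
                  ["EBX", "EDI", "ESI"])
for name, offset in frame.items():
    sign = "+" if offset >= 0 else "-"
    print(name, "EBP" + sign + format(abs(offset), "X") + "h")
```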
There are many other ways of dealing with the stack in callbacks.
See callback stack frames in 32-bits and 64-bits,
and automated stack frames using FRAME...ENDF, LOCALS, and USEDATA.
See also "understand the stack" part 1 and
part 2.
Here is an example:-
    STRINGS UNICODE
    CODE
    ;
    #define REPORT
    ;
    MBMACRO(%lpTextW) MACRO
        #ifdef REPORT
        INVOKE MessageBoxW,0,addr %lpTextW,"Report",40h
        #endif
    ENDM
    ;
    MBMACRO("This code was assembled!")

In the above code the Message Box is displayed. But if the line defining REPORT is commented out, then it is not displayed. The "This code was assembled!" string is sent as an argument to the macro.
ARGCOUNT returns the number of arguments given when the definition is used and this can be used with conditional assembly in definitions. For example, suppose you have macro26 defined as:-
    macro26(%a,%b,%c,%d,%e,%f) = #if ARGCOUNT=6 \
                                 PUSH %f \
                                 #endif \
                                 #if ARGCOUNT >=5 \
                                 PUSH %e \
                                 #endif \
                                 #if ARGCOUNT >=4 \
                                 PUSH %d \
                                 #endif \
                                 #if ARGCOUNT >=3 \
                                 PUSH %c \
                                 #endif \
                                 #if ARGCOUNT >=2 \
                                 PUSH %b \
                                 #endif \
                                 #if ARGCOUNT >=1 \
                                 PUSH %a \
                                 #endif

and then you use macro26 as follows

    macro26(4,3,2,1)

then the value of ARGCOUNT would be four and the code would be the same as if you had coded:-

    PUSH 1,2,3,4

In the above example, the first two pushes are skipped over because ARGCOUNT is neither 6 (in the first test) nor greater than or equal to 5 (in the second test).
This can be enlarged to provide a "C" function call macro where the stack is cleared up after the call (the correct number of bytes is added to the stack pointer ESP after the call):-
    macro26(%x,%a,%b,%c,%d,%e,%f) = #if ARGCOUNT=7 \
                                    PUSH %f \
                                    #endif \
                                    #if ARGCOUNT >=6 \
                                    PUSH %e \
                                    #endif \
                                    #if ARGCOUNT >=5 \
                                    PUSH %d \
                                    #endif \
                                    #if ARGCOUNT >=4 \
                                    PUSH %c \
                                    #endif \
                                    #if ARGCOUNT >=3 \
                                    PUSH %b \
                                    #endif \
                                    #if ARGCOUNT >=2 \
                                    PUSH %a \
                                    #endif \
                                    CALL %x \
                                    ADD ESP,ARGCOUNT-1*4

and here is another way to do it:-

    cinvoke(funcname,%1,%2,%3,%4,%5) = \
        #if ARGCOUNT=1 \
            invoke funcname \
        #elif ARGCOUNT=2 \
            invoke funcname,%1 \
        #elif ARGCOUNT=3 \
            invoke funcname,%1,%2 \
        #elif ARGCOUNT=4 \
            invoke funcname,%1,%2,%3 \
        #elif ARGCOUNT=5 \
            invoke funcname,%1,%2,%3,%4 \
        #elif ARGCOUNT=6 \
            invoke funcname,%1,%2,%3,%4,%5 \
        #endif \
        #if ARGCOUNT>1 \
            ADD ESP,ARGCOUNT-1*4 \
        #endif

These would then be used as follows:-

    cinvoke(_cprintf,23,24,25,26,27)
    macro26(_cprintf,23,24,25,26,27)

If you don't like using the continuation character, you can use MACRO...ENDM instead.
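To see what the seven-argument macro26 is meant to emit for a given call, here is a rough model in Python (not GoAsm source). The function names `expand_c_call` and `hexh` are hypothetical. The sketch assumes that ARGCOUNT counts the function name as the first argument, and that the "correct number of bytes" added to ESP is four bytes per pushed argument, that is (ARGCOUNT-1)*4.

```python
def hexh(value):
    """Format a number as assembler-style hex, for example 14h or 0Ch."""
    digits = f"{value:X}"
    return ("0" + digits if digits[0].isalpha() else digits) + "h"


def expand_c_call(func, *args):
    """Model the expansion of macro26(func, args...) for a "C" call."""
    argcount = 1 + len(args)  # assumed: the function name is counted too
    # The last argument is pushed first, the first argument last.
    lines = [f"PUSH {arg}" for arg in reversed(args)]
    lines.append(f"CALL {func}")
    # The caller removes the arguments: 4 bytes for each one pushed.
    lines.append(f"ADD ESP,{hexh((argcount - 1) * 4)}")
    return lines


for line in expand_c_call("_cprintf", 23, 24, 25, 26, 27):
    print(line)
```

With five arguments after the function name, the model pushes 27 down to 23, calls the function and then adds 14h (20 bytes) to ESP.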
    LVERS=0030
    MVERS=0044h
    VERSION=LVERS##MVERS
    ;
    MOV EAX,VERSION

Here VERSION is defined as the number 00300044h.
    PUSH WS_CHILD|WS_VISIBLE|SS_OWNERDRAW

means a lot more than:-

    PUSH 5000000Dh

although the comment could provide clarity without using a definition:-

    PUSH 5000000Dh          ;WS_CHILD, WS_VISIBLE, SS_OWNERDRAW

This example reduces clarity in your code and should be avoided:-

    #define wParam EBP+10h
    MOV EAX,[wParam]        ;same as MOV EAX,[EBP+10h]

The reason this is bad is that the reference [wParam] makes it appear that wParam is a label. Better is:-

    #define wParam [EBP+10h]
    MOV EAX,wParam

This is clearer because in GoAsm the only thing you can address in this way without using square brackets is a definition.
This is also bad and should be avoided at all costs:-
    #define GET_LPARAM MOV EAX,[EBP+14h]
    GET_LPARAM

Better programming practice is to use

    CALL GET_LPARAM

and call it properly as a function. However when manipulating the stack it is very difficult to use a procedure since the CALL and RET themselves alter the stack. So in this instance it may be convenient to use a definition if source script clarity does not suffer. See for an example the OPEN_STACKFRAME and CLOSE_STACKFRAME examples above.
Also there seems little point in doing this:-
    THOUSAND=1000D
    MOV EAX,THOUSAND

when this would do perfectly well:-

    MOV EAX,1000D

One use for definitions is if you want your source to be understandable to non-English speakers. Then you can translate all mnemonics using equates: see using defined words in Unicode files.
    MOV EBX,[DATA_VALUE]    ;get pointer to DATA_VALUE
    MOV EAX,[EBX]           ;get the value

In the same way as for importing procedures from other programs at link-time you give the linker (GoLink anyway) the name of the executable containing the import.
Using GoAsm and its companion program GoLink, you can import by ordinal using this simple syntax:-
    CALL MyDll:6

This will call procedure number 6 in MyDll.dll. Note that the extension "dll" is assumed if no extension is given. Suppose you want a Dll to call a function in the main executable by ordinal, then you could use:-

    CALL Main.exe:15

This calls the 15th function in Main.exe.

    CALL [Main.exe:15]

You should not include the path of the file in the call. GoLink carries out a wide search for specified files, but if it is necessary to provide a path this should be given to the linker and not incorporated in the call in the assembler script.
Note: the above only applies to GoLink
There is another way to use ordinals using LoadLibrary to load the Dll (or return a handle if it is already loaded) and then calling GetProcAddress passing the ordinal value to get the value of the procedure to call. Finally you call the procedure as returned by GetProcAddress.
    CALL NameOfDll:NameOfAPI

Note: this only applies to GoLink

    EXPORTS CALCULATE, ADJUST_DATA, DATA_VALUE

Here is an example of declaring an export as you go along:-

    EXPORT CALCULATE:
    CMP EAX,EDX             ;code for the
    JZ >4                   ;procedure

This code exports the label to the procedure CALCULATE.
    EXPORT DATA_VALUE DD 0

This exports the data label DATA_VALUE. The recipient would obtain the value of DATA_VALUE in the following way:-

    MOV EBX,[DATA_VALUE]    ;get pointer to DATA_VALUE
    MOV EAX,[EBX]           ;get the value

    EXPORTS CALCULATE:2, DATA_VALUE:6

Here the linker will be instructed to use the ordinals 2 and 6 for the exports. If you are using the alternative method of declaring exports (within a section) you can use for example:-

    EXPORT:2 CALCULATE:

or in the case of data:-

    EXPORT:6 DATA_VALUE DD 0

    EXPORT:2:NONAME CALCULATE:

Here the value of the code label CALCULATE will be exported as ordinal number 2, but the name of the export will not appear in the final executable. This means that if another program tried to call the CALCULATE function by name it would fail. The function can only be called by ordinal.
    ProcX:
    USES EAX,EBX,ECX        ;ready to PUSH the registers on the stack
    CMP EAX,ESI             ;first mnemonic causes the PUSHes to occur
    ;
    ; code for the procedure
    ;
    .finnc
    CLC                     ;ready to return carry flag not set
    RET                     ;POP all registers in reverse followed by RET
    ;
    .finc
    STC                     ;ready to return carry flag set
    RET                     ;POP all registers in reverse followed by RET
    ENDU                    ;end all special POP action when RET found

You can also automatically push and pop the flags, using FLAGS which is a reserved word in GoAsm:-

    USES FLAGS

You cannot change or add to the list of registers from within the procedure. To do that you would need an ENDU followed by a fresh USES list. If you need to RET without popping the registers you can use RETN ("normal" RET).
In 64-bit programming you can use the extended versions of the general purpose registers (RAX to RSP) and also the new 64-bit registers (R8 to R15). You can also use the 32-bit versions of the general purpose registers (EAX to ESP). This is because when PUSHing registers the 32-bit forms and the 64-bit forms are interchangeable (they produce the same opcode). So, whether you are assembling for 32-bits or for 64-bits,
    USES RAX,RBX,RCX

will code the same as

    USES EAX,EBX,ECX

This helps towards transportability of your code between the two platforms.
See also "understand the stack" part 1 and part 2.
In GoAsm the creation and use of stack frames is all automated when you use FRAME...ENDF.
See automated stack frames for how to use this in practice.
Since 32-bit and 64-bit Windows stack frames are different, they need to be treated separately. But the syntax for using FRAME...ENDF and their companion instructions such as LOCALS and USEDATA..ENDU are the same on both platforms. For this reason when you use FRAME...ENDF it is possible to use the same source script for both. See writing 64-bit programs for more information about 64-bit programming generally.
All four of these are equally important.
    hwnd      window handle
    uMsg      message identifier
    wParam    data
    lParam    data

Your window procedure needs to access these parameters. One way is to POP them off the stack into static data, but an easier (and safer) way is to keep them on the stack and reference them directly from the stack. This is better because window procedures sometimes call themselves. This may seem odd, but one example will suffice.

Suppose your window procedure needs to fill the window with the correct material at the right time. This is called "painting" the window. This is done by responding to the Windows message WM_PAINT (message number 0Fh). Now the proper way to respond to this message is first to call the API BeginPaint, then to paint the window, then to call the API EndPaint. One of the things BeginPaint does is to prepare the window for the paint. In doing so it sends another message to your window procedure, this time WM_ERASEBKGND (message number 14h). So while your window procedure is dealing with this second message it has not yet returned from the API BeginPaint. After the second message has been dealt with BeginPaint will return.

So the window procedure is recursive, that is to say, it can come back to itself. For each message the parameters (except for hwnd) will be different. If they are kept on the stack it means that each time the window procedure is called the parameters will be kept on a different part of the stack and cannot be written over.
A typical 32-bit stack frame will be set up as follows:-
    TypicalStackFrame:
    PUSH EBP                ;save the value of ebp which will be altered     } called the
    MOV EBP,ESP             ;give current value of stack pointer to ebp      } "prologue"
    ;
    ;POINT "X"
    ;
    ;code to isolate WM_PAINT message
    PUSH ADDR PAINTSTRUCT
    PUSH [hwnd]
    CALL BeginPaint         ;get ready to paint window
    ;
    ;paint window and call EndPaint
    ;
    MOV ESP,EBP             ;restore the stack pointer to previous value     } called
    POP EBP                 ;restore the value of ebp                        } the
    RET 10h                 ;return to caller adjusting the stack pointer    } "epilogue"

During any recursion esp will be changed since further use of the stack will be made. But ebp is always saved and restored by this procedure so it can always be relied on to access the correct part of the stack for the message being dealt with.
ebp-4h | the next push will go here |
ebp | saved value of ebp |
ebp+4h | caller's return address |
ebp+8h | hwnd |
ebp+0Ch | uMsg |
ebp+10h | wParam |
ebp+14h | lParam |
ebp+18h | ) |
ebp+1Ch | ) other use of stack |
ebp+20h | ) (BeginPaint etc) |
ebp+24h | ) |
ebp+28h | saved value of ebp |
ebp+2Ch | caller's return address |
ebp+30h | hwnd |
ebp+34h | uMsg |
ebp+38h | wParam |
ebp+3Ch | lParam |
    PUSH ADDR PAINTSTRUCT
    PUSH [hwnd]
    CALL BeginPaint         ;get ready to paint window

Here the stack pointer (esp) was made more negative by 8 bytes by the two pushes before the call to the API BeginPaint. But after the return from BeginPaint, the stack pointer is back where it started. This is because within BeginPaint it was made more positive by 8 bytes before returning to your code.
    TypicalStackFrame:
    PUSH EBP                ;save the value of ebp which will be altered     }
    MOV EBP,ESP             ;give current value of stack pointer to ebp      } "prologue"
    SUB ESP,0Ch             ;make space for local data                       }
    ;
    ;POINT "X"
    ;
    ;
    ;window procedure code
    ;
    MOV ESP,EBP             ;restore the stack pointer to previous value     }
    POP EBP                 ;restore the value of ebp                        } "epilogue"
    RET 10h                 ;return to caller adjusting the stack pointer    }

Here we have moved the stack pointer by 12 bytes, which is exactly the same as if we had done 3 PUSHes. This provides an area of the stack which cannot be used for any other purpose.
Using FRAME...ENDF you can create the stack frame automatically.
Here is a typical use of FRAME...ENDF which does the same as the TypicalStackFrame
code above, as well as providing names for the parameters which are sent to the
window procedure and names for each dword of local data:-
    WndProc FRAME hwnd,uMsg,wParam,lParam
    LOCALS hDC,hInst,KEEP
    ;
    ;POINT "X"
    ;
    ;
    ;code goes here
    ;
    RET
    ENDF

The stack actually looks like this (at point "X" again):-
ebp-10h | the next push will go here |
ebp-0Ch | space for local data KEEP ← esp is currently here (top of local data) |
ebp-8h | space for local data hInst |
ebp-4h | space for local data hDC |
ebp | saved value of ebp ← ebp given value of esp when it was here |
ebp+4h | caller's return address |
ebp+8h | hwnd |
ebp+0Ch | uMsg |
ebp+10h | wParam |
ebp+14h | lParam |
ebp+18h | ) |
ebp+1Ch | ) other use of stack |
    WndProc FRAME hwnd,uMsg,wParam,lParam
    USES EBX,ESI,EDI
    LOCALS hDC,hInst,KEEP
    ;
    ;
    ;code goes here
    ;
    RET
    ENDF
To deal with the requirement to record the parameters sent in the registers, when you use FRAME...ENDF in 64-bit code, GoAsm creates instructions like this at the beginning of the stack frame:-
    MOV [RSP+8h],RCX
    MOV [RSP+10h],RDX
    MOV [RSP+18h],R8
    MOV [RSP+20h],R9
    PUSH RBP
    MOV RBP,RSP

This code puts the parameters in their placeholders on the stack. If there are fewer than four parameters not all these instructions are emitted. Note that parameter five (if present) is already on the stack at [RSP+28h], parameter six is at [RSP+30h] etc. The final two instructions set up RBP as a pointer to the data, having saved RBP first so it can be restored later.
In the epilogue you would expect to see something like:-
    LEA RSP,[RBP]
    POP RBP
    RET

The LEA instruction here is used instead of the simpler MOV RSP,RBP to help the Windows exception handler to identify the epilogue.
Taking a typical use of FRAME...ENDF as follows:-
    WndProc FRAME hwnd,uMsg,wParam,lParam
    USES RBX,RSI,RDI
    LOCALS hDC,BUFFER[256]:B
    ;
    ;POINT 'X'
    ;
    ;code goes here
    ;
    RET
    ENDF

Here is how the stack frame turns out in 64-bit assembly, using the RSP and RBP values at the start of code proper at POINT 'X' (note that RBP is 32 bytes less than RSP was when entering the procedure: this is because of the PUSH RBP, RBX, RSI and RDI instructions):-
rbp-118h | next push goes here |
rbp-110h | ) 256 byte buffer ← rsp is currently here (top of local data) |
) | |
) | |
rbp-8h | hDC |
rbp | saved value of rdi ← rbp given value of rsp when it was here |
rbp+8h | saved value of rsi |
rbp+10h | saved value of rbx |
rbp+18h | saved value of rbp |
rbp+20h | caller's return address |
rbp+28h | place holding hwnd (originally in rcx) |
rbp+30h | place holding uMsg (originally in rdx) |
rbp+38h | place holding wParam (originally in r8) |
rbp+40h | place holding lParam (originally in r9) |
rbp+48h | param #5 if present |
rbp+50h | param #6 if present |
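
The positive offsets in the table above can be reproduced with a little arithmetic. The following Python sketch is not GoAsm source and the function name `frame64_layout` is hypothetical. It assumes, as the text states, that RBP is pushed first, then the USES registers in the order listed, and that RBP is then given the value of RSP.

```python
def frame64_layout(params, uses):
    """Return {name: offset from RBP} for the upper part of a 64-bit FRAME."""
    pushed = ["rbp"] + list(uses)  # RBP first, then the USES registers
    layout = {}
    # The register pushed last sits at [RBP]; earlier pushes are higher up.
    for i, reg in enumerate(reversed(pushed)):
        layout[f"saved {reg.lower()}"] = 8 * i
    # The caller's return address lies just above all the pushed registers.
    layout["return address"] = 8 * len(pushed)
    # The parameter placeholders follow the return address, 8 bytes each.
    for i, name in enumerate(params):
        layout[name] = 8 * (len(pushed) + 1 + i)
    return layout


layout = frame64_layout(
    ["hwnd", "uMsg", "wParam", "lParam", "param5", "param6"],
    ["RBX", "RSI", "RDI"],
)
for name, offset in layout.items():
    print(f"rbp+{offset:X}h  {name}")
```

Changing the USES list in the call shows how each extra saved register moves the parameters a further 8 bytes away from RBP.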
Practice
Some practical considerations
Calling procedures within the stack frame - use of RETN
Calling procedures outside the stack frame - use of USEDATA
Advanced use
Declaring message-specific local data - positioning the LOCAL statement
Creating a minimised window procedure
Locally defined words using #localdef or LOCALEQU
Re-usable label scope in automated stack frames
Inheritance and scope when using USEDATA..ENDU
Releasing local data and making new local data - use of LOCALFREE
General
Some syntax points when using FRAME...ENDF
Some syntax points when using USEDATA...ENDU
What you can see in the debugger.
FRAME...ENDF is similar to MASM's PROC...ENDP, but with GoAsm you can do a lot more.
Subroutines can also use data on the stack using USEDATA...ENDU. And you can declare
local data dynamically. This permits you within the window procedure
to declare only the local data which is actually required for a particular
message. You can use locally defined words which only operate within
their own FRAME..ENDF envelopes or USEDATA..ENDU areas associated with them.
Also in GoAsm the syntax is more relaxed and the
source script is much easier to understand because there is no type
or parameter checking.
You use FRAME...ENDF to make an automated stack frame. Here is how you would use it:-
    WndProc FRAME hwnd,uMsg,wParam,lParam
    USES EBX,ESI,EDI
    LOCALS hDC,BUFFER[256]:B
    ;
    ;
    ;code goes here
    ;
    RET
    ENDF

When you use FRAME..ENDF in this way GoAsm creates a stack frame "behind your back". For this reason it may be mistrusted a little. Assembler programmers like to know what is going on, which is why they are assembler programmers in the first place! So I'm going to describe this in some detail. You don't need to know all these details.
In the code above, FRAME tells GoAsm that you want to make an automated stack frame.
ENDF signifies the end of the automated stack frame.
The words after "FRAME" are the parameters. In this case there
are four parameters which are being given the names shown here. There is
no need to add anything else since GoAsm knows the size of the parameters. In
32-bit coding they are always dwords; in 64-bit coding they are always qwords.
USES tells GoAsm which registers need to be preserved in the frame. Here
we use the 32-bit registers, but in 64-bit assembly this is read as USES RBX,RSI,RDI
without having to change the source code (in the case of PUSH register it's the
same opcode which is generated for each platform).
LOCALS permits you to declare and label local data in the frame. GoAsm adds
up the size of this local data and sets aside space for it on the stack. When declaring
local data, in 32-bit assembly dword is the default, and in 64-bit assembly qword
is the default. The default is used if you don't give a size for the data.
So in the example, hDC is a dword. There is also an area on the stack called BUFFER. This
is 256 bytes because of the [256]:B. Instead of B you can use W,D,Q or T
for words, dwords, qwords or twords respectively or you can use the
name of a structure: see
using structures as local data in a stack frame.
GoAsm automatically creates the prologue code as described in the
32-bit or 64-bit sections above.
GoAsm will add the epilogue code each time it sees a RET within the
frame delineated by FRAME...ENDF.
    PUSH [hwnd]             ;equivalent to PUSH [EBP+8h]
    MOV EAX,[uMsg]          ;equivalent to MOV EAX,[EBP+0Ch]
    MOV EBX,ADDR wParam     ;equivalent to LEA EBX,[EBP+10h]
    PUSH ADDR lParam        ;equivalent to PUSH EBP then ADD D[ESP],14h
    MOV EBX,[hDC]           ;equivalent to MOV EBX,[EBP-10h]
    MOV EBX,ADDR BUFFER     ;equivalent to LEA EBX,[EBP-110h]

In this coding GoAsm finds on the stack the correct position of the label you are accessing and codes it appropriately for you. If you use the /l option on the command line you can see this in the list file output. Alternatively you can see this in the debugger.
Note that the address of the buffer is given at its most negative point. This is correct, so if you code:-
    MOV D[BUFFER+10h],44h

you are inserting the value 44h at a dword which is 16 bytes into the buffer.
Note that GoAsm sets the value of EBP after PUSHing the registers named in the USES statement. This arrangement permits the amount of local data to be adjusted dynamically on a message-specific basis. But it also means that if you have a USES statement in a FRAME the offset of the parameters from EBP will be greater than otherwise. So, for example if you PUSH three registers in a FRAME with the USES statement as follows:-
    USES EBX,EDI,ESI

then EBP will be pushed further away from the parameters by 12 bytes. So in this example hwnd would be at [EBP+14h], uMsg at [EBP+18h], wParam at [EBP+1Ch] and lParam at [EBP+20h]. When coding you need not worry about the exact position of the parameters relative to EBP, but you will need to be aware of this when looking at your code in the debugger. See also what you can see in the debugger.
    RECT STRUCT
       left   DD 0
       top    DD 0
       right  DD 0
       bottom DD 0
    ENDS
    ;
    WndProc FRAME hwnd,uMsg,wParam,lParam
    LOCALS hDC,rc1:RECT
    ;
    ;
    ;code goes here
    ;
    RET
    ENDF

Each element of the RECT structure can be accessed in the same way as if it were in static data, for example (again using 32-bit examples):-
    MOV EAX,[rc1.right]     ;equivalent to MOV EAX,[EBP-0Ch]
    MOV EAX,[ESI+RECT.right] ;equivalent to MOV EAX,[ESI+8h]
    MOV EAX,SIZEOF RECT     ;equivalent to MOV EAX,10h
    MOV EAX,ADDR rc1.right  ;equivalent to LEA EAX,[EBP-0Ch]
    PUSH [rc1.right]        ;equivalent to PUSH [EBP-0Ch]
    PUSH ADDR rc1.right     ;equivalent to PUSH EBP then ADD D[ESP],-0Ch
    PUSH ADDR rc1           ;equivalent to PUSH EBP then ADD D[ESP],-14h
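The numbers in the comments above can be derived by hand: hDC takes the first dword below EBP, the 16-byte RECT goes below that, and each member is then reached by adding its offset within the structure. Here is a small Python sketch of that calculation (not GoAsm source). The helper names are hypothetical, and it assumes a frame with no USES statement and no padding between items, which is what the comments in this example imply.

```python
# Members of RECT as (name, size in bytes), in declaration order.
RECT = [("left", 4), ("top", 4), ("right", 4), ("bottom", 4)]


def sizeof(struct):
    """Total size of a structure in bytes."""
    return sum(size for _, size in struct)


def member_offset(struct, member):
    """Offset of a member from the start of its structure."""
    offset = 0
    for name, size in struct:
        if name == member:
            return offset
        offset += size
    raise KeyError(member)


def local_offsets(local_items):
    """Map each local (name, size) to the offset of its lowest byte from EBP."""
    offsets, position = {}, 0
    for name, size in local_items:
        position -= size  # each item is placed below the previous one
        offsets[name] = position
    return offsets


locals_ = local_offsets([("hDC", 4), ("rc1", sizeof(RECT))])
rc1_right = locals_["rc1"] + member_offset(RECT, "right")
print(sizeof(RECT), locals_["rc1"], rc1_right)
```

Here rc1 starts at EBP-14h and rc1.right lands at EBP-0Ch, in agreement with the comments.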
    WndProc FRAME hwnd,uMsg,wParam,lParam
    USES EBX,ESI,EDI
    LOCAL hDC,BUFFER[256]:B
    MOV EAX,[uMsg]          ;get message sent by Windows
    CMP EAX,0Fh             ;see if it is WM_PAINT
    JNZ >L2                 ;no
    CALL WINDOW_PAINT       ;paint the window
    XOR EAX,EAX             ;return zero to show message dealt with
    RET                     ;restore stack and return to Windows
    ;
    L2:
    ARG [lParam],[wParam],[uMsg],[hwnd]
    INVOKE DefWindowProcA   ;allow Windows to deal with the message
    RET                     ;restore stack and return to Windows
    ;
    WINDOW_PAINT:
    ; code to paint the window
    RETN                    ;do ordinary return from paint procedure
    ;
    ENDF                    ;stop all FRAME action from this point onwards
    PAINT:
    USEDATA WndProc
    INVOKE BeginPaint, [hwnd],ADDR lpPaint      ;get in eax the DC to use
    MOV [hDC],EAX
    INVOKE Ellipse, [hDC],[lpPaint.rcPaint.left],  \
                          [lpPaint.rcPaint.top] ,  \
                          [lpPaint.rcPaint.right], \
                          [lpPaint.rcPaint.bottom]
    INVOKE EndPaint, [hwnd],ADDR lpPaint
    XOR EAX,EAX
    RET
    ENDU

Here the procedure PAINT is using local data in the FRAME called WndProc. All the code above is outside the FRAME...ENDF envelope.
    WndProc FRAME hwnd,uMsg,wParam,lParam
    USES EBX,ESI,EDI
    LOCAL hDC               ;declare hDC for frame-wide use
    MOV EAX,[uMsg]          ;get message sent by Windows
    CMP EAX,0Fh             ;see if it is WM_PAINT
    JNZ >L2                 ;no
    CALL WINDOW_PAINT       ;paint the window
    XOR EAX,EAX             ;return zero to show message dealt with
    RET                     ;restore stack and return to Windows
    ;
    L2:
    ARG [lParam],[wParam],[uMsg],[hwnd]
    INVOKE DefWindowProcA   ;allow Windows to deal with the message
    RET                     ;restore stack and return to Windows
    ;
    ENDF                    ;stop all FRAME action from this point onwards
    ;
    WINDOW_PAINT:
    USEDATA WndProc         ;use parameters and local data from WndProc
    LOCAL ps:PAINTSTRUCT    ;make local data areas
    LOCAL BUFFER[1024]:B    ;specifically for this message
    ;
    ARG ADDR ps,[hwnd]
    INVOKE BeginPaint       ;get ready to paint window
    MOV [hDC],EAX           ;keep the device context in hDC local data
    ; code to paint the window
    RET                     ;do ordinary return from paint procedure
    ENDU                    ;end use of WndProc frame data
    WndProc FRAME hwnd,uMsg,wParam,lParam
    MOV EAX,[uMsg]
    MOV ECX,SIZEOF MESSAGES/8
    MOV EDX,OFFSET MESSAGES
    :
    DEC ECX
    JS >.notfound
    CMP [EDX+ECX*8],EAX     ;see if it's the correct message
    JNZ <                   ;no
    CALL [EDX+ECX*8+4]      ;call the correct procedure for the message
    JNC >.exit
    .notfound
    INVOKE DefWindowProcA,[hwnd],[uMsg],[wParam],[lParam]
    .exit
    RET
    ENDF

Somewhere in the data or const section would be the following table for the messages (in practice there would be a lot more messages than this):-
    MESSAGES DD 1h, CREATE  ;the message value then the code address
             DD 2h, DESTROY
             DD 0Fh, PAINT
    NextLabel:

And then in the code section (and below WndProc) you would have the code for these messages, for example:-
    CREATE:
    USEDATA WndProc         ;use stack data in the window procedure frame
    USES EBX,EDI,ESI        ;preserve the registers for Windows
    LOCALS LocalData        ;establish required local data area
    ;
    ; code to execute on the WM_CREATE message
    ;
    XOR EAX,EAX             ;return nc and eax=0 to continue creating the window
    RET                     ;restore the registers and then RET
    ENDU                    ;stop all automated action and access to data

In the minimised window procedure DefWindowProc is not called unless either the message is not found in the message table or the message code returns with the carry flag set. Some messages must call DefWindowProc even if they are processed by the window procedure - see the Windows SDK.
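The control flow of the minimised window procedure can be summarised in a few lines of Python (a model only, not GoAsm source). All the names here are hypothetical stand-ins: each handler returns a pair (eax, carry), and `def_window_proc` simply returns a marker string instead of calling the real API. A dictionary lookup replaces the backwards scan of the MESSAGES table, which gives the same result as long as each message appears only once.

```python
WM_CREATE, WM_DESTROY, WM_PAINT = 0x1, 0x2, 0xF
DEFAULT = "handled by DefWindowProc"  # marker standing in for the API result


def def_window_proc(hwnd, msg, wparam, lparam):
    """Stand-in for DefWindowProcA."""
    return DEFAULT


def on_create(hwnd, wparam, lparam):
    return 0, False  # eax=0, carry clear: message fully dealt with


def on_destroy(hwnd, wparam, lparam):
    return 0, True   # carry set: ask for default processing as well


def on_paint(hwnd, wparam, lparam):
    return 0, False


# The message table: message value, then the code to call.
MESSAGES = {WM_CREATE: on_create, WM_DESTROY: on_destroy, WM_PAINT: on_paint}


def wnd_proc(hwnd, msg, wparam, lparam):
    handler = MESSAGES.get(msg)
    if handler is not None:
        eax, carry = handler(hwnd, wparam, lparam)
        if not carry:            # JNC >.exit
            return eax
    # .notfound: message not in the table, or the handler set the carry flag
    return def_window_proc(hwnd, msg, wparam, lparam)


print(wnd_proc(1, WM_CREATE, 0, 0))   # handled by the table entry
print(wnd_proc(1, 0x200, 0, 0))       # not in the table
```

Adding a message then only means adding one table entry and one handler; the dispatcher itself does not change.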
For example:-
    FrameProc1 FRAME Param
    #localdef THING1 23h
    THING2 LOCALEQU 88h
    ;
    MOV EAX,THING1          ;local define 23h
    MOV EAX,THING2          ;local define 88h
    ;
    RET
    ENDF
    ;
    MyFunction44:
    USEDATA FrameProc1
    #localdef THING3 0CCh
    ;
    MOV EAX,THING1          ;local define should be 23h
    MOV EAX,THING2          ;local define should be 88h
    ;
    RET
    ENDU

In the above example, if THING1 and THING2 are defined globally (using #define or EQU), that definition is ignored (the local definition takes priority).
#undef has local scope priority. If the word to be undefined is found locally, then #undef applies to that. If not, #undef will apply to a global label.
    ExampleProc FRAME Param
    CMP EDX,EAX
    JZ >.fin
    LABEL1:
    XOR EAX,EAX
    .fin
    RET
    ENDF
    LABEL2:

Here the jump to .fin will still work despite the existence of LABEL1. This is because the label .fin has the scope of the whole frame, and not just the code between LABEL1 and LABEL2. In other words using FRAME...ENDF enlarges the scope of the re-usable label to the whole frame.
The usual arrangement is to have each USEDATA area a child of the FRAME:-
    FrameExample FRAME Param
    LOCAL LocalLabel1
    #localdef CONSTANT 23h
    ;
    RET
    ENDF
    ;
    Usedata#1:
    USEDATA FrameExample
    MOV EAX,[LocalLabel1]
    MOV EAX,CONSTANT
    MOV EAX,[Param]
    RET
    ENDU
    ;
    Usedata#2:
    USEDATA FrameExample
    LOCAL Specific
    #localdef SPECIFIC_CONSTANT 444444h
    MOV EAX,[LocalLabel1]
    MOV EAX,CONSTANT
    MOV EAX,[Param]
    MOV EAX,[Specific]
    MOV EAX,SPECIFIC_CONSTANT
    RET
    ENDU

In the next arrangement, the first USEDATA area is a child of the FRAME and the second USEDATA area is its grandchild.
    FrameExample FRAME Param
    LOCAL LocalLabel1
    #localdef CONSTANT 23h
    ;
    RET
    ENDF
    ;
    Usedata#1:
    USEDATA FrameExample
    LOCAL Specific
    #localdef SPECIFIC_CONSTANT 444444h
    RET
    ENDU
    ;
    Usedata#2:
    USEDATA Usedata#1
    MOV EAX,[LocalLabel1]
    MOV EAX,CONSTANT
    MOV EAX,[Param]
    MOV EAX,[Specific]
    MOV EAX,SPECIFIC_CONSTANT
    RET
    ENDU

You can use LOCALFREE to release areas of local data ready to make
new local data. This may help to conserve memory if you use the stack
a lot. LOCALFREE will render existing local data declared in the
FRAME or USEDATA...ENDU area in which it appears
inaccessible to all later code in your source script. It will not affect
local data in other FRAMES or USEDATA areas. When GoAsm sees LOCALFREE in the
source script it causes the value of ESP/RSP to be restored to its value
in the current FRAME or usedata area before any local data was declared.
You can then declare new local data using LOCAL or LOCALS.
Only use LOCALFREE when the stack is in equilibrium. Do not use it
if there are any outstanding PUSHes which need to be POPped. This is
because the change to ESP/RSP will effectively rub over any outstanding PUSHes.
At the end of a procedure you do not need to use LOCALFREE since the
stack is restored automatically on a RET anyway.
Here is an example of how to use LOCALFREE:-
    CREATE:
    USEDATA WndProc         ;use stack data in the window procedure frame
    USES RBX,RDI,RSI        ;preserve the registers for Windows
    LOCALS BUFFER[4000]:B   ;establish large buffer on the stack
    ;
    ; part code to execute on the WM_CREATE message
    ;
    LOCALFREE               ;rub out large buffer
    LOCALS BUFFER[256]:B    ;establish smaller buffer on the stack
    ;
    ; part code to execute on the WM_CREATE message
    ;
    XOR RAX,RAX             ;return nc and rax=0 to continue creating the window
    RET                     ;restore the registers and then RET
    ENDU                    ;stop all automated action and access to data

    CodeLabel: FRAME Parameter List     ;if any parameters are needed
    USES Register List                  ;if registers need to be saved
    LOCAL Local List                    ;if local variables are required
    ;
    ret
    ENDF

    CodeLabel:
    USEDATA SourceData
    USES Register List                  ;if registers need to be saved
    LOCAL Local List                    ;if local variables are required
    ;
    ret
    ENDU

In 32-bit FRAMEs, GoAsm PUSHes EBP and registers specified by USES
first, then keeps the value of the stack pointer ESP in EBP
using MOV EBP,ESP. On a RET this is reversed, so you will see MOV ESP,EBP
followed by one or more register POPs.
Space is made for the local data using SUB ESP,x where x depends on the
amount of space for local data required.
In 64-bit FRAMEs, GoAsm's first job is to store parameters #1 to #4
on the stack using an instruction like MOV [RSP+8h],RCX as
described earlier. Then after PUSHing RBP and registers
specified by USES, MOV RBP,RSP is used to keep the stack pointer. Coming out of the FRAME
LEA RSP,[RBP] is used to restore RSP ready to POP the registers and a RET.
Code such as MOV EAX,[EBP-34h] or LEA,[EBP-56h] or PUSH EBP, ADD D[ESP],-60h (or their 64-bit register equivalents) will be generated when local data is accessed. The offset values will be positive when accessing parameters.
In USEDATA areas, since GoAsm does not know at
assemble-time how much use of the stack there has been before the call
to the USEDATA procedure, it makes a "shield" of 100h bytes (200h bytes in 64-bit assembly)
to ensure that such stack use is protected from over-write. For this reason the
offset numbers used when local data is accessed may be larger than expected.
Advanced users may like to adjust the size of the shield. This can
be done using this syntax, for example:-
    USEDATA WndProc SHIELDSIZE:20h      ;in 32-bit assembly
    USEDATA WndProc SHIELDSIZE:40h      ;in 64-bit assembly

This sets the shield to only 8 push values (8 dwords in 32-bit assembly, 8 qwords in 64-bit assembly), which would be suitable if you were sure that at run-time there would never be more than seven pushes and one call prior to the local data declaration in the USEDATA procedure. (Remember you must count all PUSHes before the CALL, the CALL itself and any sub-CALLs, and also all PUSHes caused by the USES statement within the USEDATA procedure). Once SHIELDSIZE is set it remains at that value for the remainder of the source script until changed.
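The shield size is just the number of stack slots to protect multiplied by the slot size. A tiny Python sketch of the sum (not GoAsm source; the function name `shield_size` is hypothetical) may help when choosing a value:

```python
def shield_size(stack_slots, bits=32):
    """Bytes needed to shield the given number of PUSHes and CALLs."""
    slot = 4 if bits == 32 else 8  # a dword or a qword per slot
    return stack_slots * slot


# Seven pushes plus one call gives eight slots, as in the example above.
print(f"{shield_size(8, 32):X}h", f"{shield_size(8, 64):X}h")

# The default shields (100h and 200h bytes) both cover the same slot count.
print(0x100 // 4, 0x200 // 8)
```

Counting in slots rather than bytes means the same count can be used to work out the value for both the 32-bit and the 64-bit build.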
In USEDATA areas the value of the stack pointer is not kept in the EBP or RBP register because this already holds the stack pointer on entry into the FRAME. Instead, GoAsm keeps the value of the stack pointer at a convenient place on the stack. This is done when the first local data in the USEDATA area is declared. In order to do this safely GoAsm adds several lines of code culminating in MOV EAX,[EAX-4h] (or MOV RAX,[RAX-8h] in 64-bit assembly). GoAsm uses EAX/RAX during this process but restores its value afterwards, so you can still use it to pass information to the procedure. On a RET you will see the value of ESP/RSP being restored using POP ESP/RSP.
LOCALFREE will also cause a restoration of the stack pointer.
USES statements will cause PUSHes of the registers concerned
before ESP/RSP is saved and POPs of the registers after it is restored.
#if condition
text A
#endif
Here if the condition is TRUE text A will be assembled. If, however, the condition is FALSE, the assembler will jump over text A and will continue compiling from the #endif.
You can add something to do if the condition is FALSE as follows:-
#if condition
text A
#else
text B
#endif
Here if the condition is TRUE, text A will be assembled, but text B will not be assembled.
If, however, the condition is FALSE, text A will be jumped over but text B will be assembled.
The #endif indicates the end of the conditional frame, so that all text after that will be assembled.
The #else statement must always be the last statement in the frame before the #endif.
You can add a further condition to the frame:-
#if condition1
text A
#elif condition2
text B
#endif
Here if condition1 is TRUE, text A will be assembled, but text B will be jumped over and assembly will continue from the #endif. If, however, condition1 is FALSE, text A will be jumped over to the #elif when condition2 will be tested. If then condition2 is TRUE, text B will be assembled. "#elif" is the same as "#elseif".
Adding the #else to the above conditional frame produces:-
#if condition1
text A
#elif condition2
text B
#else
text C
#endif
Here if condition1 is TRUE, text A will be assembled, but text B and text C will be jumped over and assembly will continue from the #endif. If, however, condition1 is FALSE, text A will be jumped over to the #elif when condition2 will be tested. If then condition2 is TRUE, text B will be assembled, and text C will be ignored; if, however condition2 is FALSE text B will be jumped over to the #else and text C will be assembled.
You can have as many #elifs as you like in each conditional frame, but there can only be one #else per frame, and each #if must have a corresponding #endif. Some programmers nest the conditional frames, but this can become very confusing and may not be good programming practice. If this is done it is recommended that you label each #endif with a comment so that you can see to which #if it refers.
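The rule for choosing which text is assembled in an #if...#elif...#else...#endif frame can be stated compactly: the first condition that is TRUE wins, and the #else text is used only if none is. The Python sketch below models just that selection rule; it is not GoAsm source and the function name `select_branch` is hypothetical.

```python
def select_branch(branches, else_text=None):
    """Pick the text to assemble from one conditional frame.

    branches is a list of (condition, text) pairs: the #if first,
    then each #elif in order. else_text is the #else text, if any.
    """
    for condition, text in branches:
        if condition:
            return text  # the first TRUE condition wins; the rest are skipped
    return else_text     # no condition was TRUE


HELLO, WINVER = 2, 0x401
chosen = select_branch(
    [(HELLO == 3, "OUTPUT DD 3h"), (WINVER >= 0x400, "OUTPUT DD 4h")],
    else_text="OUTPUT DD 5h",
)
print(chosen)
```

With no #else and no TRUE condition the function returns None, which corresponds to nothing in the frame being assembled.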
#ifdef identifier
the relational operator can be one of the following:-
>= greater than or equals
<= less than or equals
== equals
= equals
!= not equal
> greater than
< less than
value can be a number or a word which is defined elsewhere in the file, in an include file or in the command line, which evaluates to a number.
#define HELLO
;
#ifdef HELLO
BSWAP EAX ;swap the bytes in eax if HELLO is defined
#endif
#if HELLO==3
OUTPUT DD 3h ;if HELLO is defined as 3 declare data label OUTPUT as 3
#elif WINVER>=400h
OUTPUT DD 4h ;alternative data if WINVER is equal to or greater than 400h
#else
OUTPUT DD 5h ;alternative data if neither of the above apply
#endif
You can define a word in the command line in order to trigger the correct parts of your source script for assembly, for example
GoASM /l /d WINVER=401h MyProg.ASM
or simply
GoASM /l /d VERSIONA MyProg.ASM
This means that the word VERSIONA will be defined, and it can be tested using #ifdef.
See also:-
conditional assembly in macros
conditional assembly in structures
As far as GoAsm is concerned there are two types of include file as follows:-
Type | Effect |
---|---|
Files with an "a" or "A" extension, for example MyInclude.asm, or simply MyInclude.a | With this type of file, at the time the include is declared in your source script assembly is diverted into the include file. And if you are making a list file using the /l option, a full list of the contents of the file will be made. Do not use this type of file if your include file contains only definitions, structures and the like. This will slow GoAsm down unnecessarily, because it will look for mnemonics and assembler instructions in the file. |
Files without an "a" or "A" extension, for example MyInclude.inc, or simply MyInclude | With this type of file, no assembly is carried out in the include file and only the definitions and structures in the include file are examined and recorded. And if you are making a list file using the /l option, the contents of the file will not be listed. Use this extension if your include file contains only definitions, structures and the like (commonly called header files). GoAsm will make a record of all these in case they are referred to later in the main source script. For this reason a large include file will slow GoAsm down. Normally GoAsm does not permit any other programs to open include files without an "a" or "A" extension that it has opened. This is to help error checking. But if you want to allow this for example to permit the same header files to be available in a parallel compilation environment, you can specify the /sh ("share header" files) switch in the command line. |
path\filename can be either:-
DATA SECTION
BULKDATA INCBIN MyFile.txt ;load the whole of MyFile.txt into data section with label BULKDATA
INCBIN MyFile.txt, 100 ;miss out the first 100 bytes but load rest of file
INCBIN MyFile.txt, 100, 300 ;miss out the first 100 bytes but load 300 bytes
See also inserting blocks of data.
CALL zlibstat.lib:compress
You can also use equates to shorten the call for example:-
LIB1=c:\prog\libs\zlibstat.lib
CALL LIB1:compress
Examples using INVOKE are:-
INVOKE zlibstat.lib:compress,[pCompHeap],ADDR ComprSize,[pHeap],[DataSize]
INVOKE LIB1:compress,[pCompHeap],ADDR ComprSize,[pHeap],[DataSize]
If your path contains spaces you must put the path and filename in quotes.
JMP MyLib.lib:MainProc
This can be used if, for example, the library code contains a call to ExitProcess to end the program.
LIB calculate.obj
This will make calculate.lib. To add another object file to the library you can then use:-
LIB calculate.lib added.obj
This will add added.obj to the library calculate.lib. This is useful if you want to keep your functions in libraries, so that they can be re-used without having to insert the source code into your source scripts. They are also useful to distribute your functions whilst keeping your source code to yourself. LIB.EXE and its components are part of the MSDN tools which can be downloaded free from the Microsoft MSDN site (part of the SDK). The exact download keeps changing so it may be trial and error getting these files. It is most likely also part of various main compilation tools such as VC++ or MASM.
CALCULATE1:
;some code here
RET
CALL Lib1:DUMMY ;ensure Lib1 is loaded at compile-time
CALCULATE2:
;some code here
RET
So where should it be declared
and what if it is declared more than once? A similar question arises with
functions. The code for these may be in the main source script or in a
library. The code may be duplicated in several places. The answer to the
question lies in the priority rules. They are as follows:-
1. The GoAsm main source script and any "a" include files always
have priority. In other words, any code label or data declaration in
the scripts will always find their way into GoAsm's output file together
with the code and data that they are labelling.
2. Subject to 1 above, the formal library calls (using the format
library:functionname) have priority in the order in which they
are called.
These rules mean that code libraries are able to call functions and to
use data within the source scripts directly (without any assistance from
the linker). They also mean that any label in a library which has
already been used in the source script or in a library which has already
been called will be ignored. For example suppose BUFFER is
declared in the source script to be 256 bytes. If library1 declares it
again as 128 bytes, the label is ignored. BUFFER will be 256 bytes in the
output file. And further, although the data area reserved for BUFFER in
library1 will be loaded in the output file, the label BUFFER will not point
to that area but to the area declared in the source script. Now if in a
later call, library2 declares BUFFER as 1024 bytes, again this data area
will go into the output file, but BUFFER will point to the original data
area. The reason GoAsm deals with same-name labels in this way is twofold.
Firstly, it would be impossible for any assembler (or linker for that matter)
to work out which label has priority from its size. This is because
in GoAsm at assemble time and certainly at link time the size of a particular
area pointed to by a label is not known for certain. This is because
sometimes areas are enlarged by areas of unlabelled data or code, or sometimes
other labels are used as pointers to an intermediate place within the area.
The rules also have significance for the ordering of data. Suppose
your library relies on data being held in BUFFER and also upon data sometimes
overflowing into an enlarged area labelled BUFFER_EXT, declared
immediately after BUFFER in the library. Now in the above example BUFFER
would not actually point to the expected place (just before BUFFER_EXT).
Instead it would point to the first BUFFER declaration somewhere else in
the data section.
So I would suggest the following rules are followed when making lib files:-
1. Only use same-name labels in the source script and in the library
if the labels are intended to point to the same thing at the same place,
ie. to the first declared such label.
2. If data labels of the same-name are to be used in different functions
in the lib file, be aware that the size of the data area which
they identify will be set by the first declared such label. Also don't expect
the data area which the label identifies to be placed in any particular position
in the executable.
See
Calling Windows APIs in 32-bits and 64-bits
Callback stack frames in 32-bits and 64-bits
Writing 64-bit programs.
MOV RSI,ADDR String
MOV [RDI],AL
MOV RAX,[hInst]
These and similar instructions will work in both x64 and x86 compatibility mode.
In addition to this, in x86 compatibility mode,
Note that /x86 should not be used in the command line for Win32 source code (use it only for 32/64-bit switchable source code).
Even in x86 compatibility mode, you cannot use the new AMD64 registers, R8 to R15, XMM8 to XMM15, nor the new register addressing formats SIL,DIL,BPL,SPL,R8W to R15W, or R8D to R15D. This is because they are not available for use by a 32-bit executable.
Any source code which is incompatible with a 32-bit executable should be switched at assemble
time using conditional assembly.
See Calling Windows APIs in 32-bits and 64-bits in the
GoAsm manual for more information about ARG and INVOKE.
See the file Hello64World3 for example source code which can make
either a simple Win32 "Hello World" Window program or a Win64 one.
See also writing 64-bit programs for the detailed differences
between 32-bit and 64-bit programs.
DATA SECTION "MyData" SHARED
Note that a shared section must have a unique name specified in the way shown above. You will probably not want to use one of the default section names, "data", "code" or "const", since these would normally be reserved for non-shared sections.
CODE SECTION 'Asm$b'
;
CODE SECTION 'Asm$a'
;
Here the linker will ensure that the code in the section called Asm$a will appear in the executable earlier than the code in the section called Asm$b. In fact, the linker will combine the code into one section (called Asm) in this correct order. Material after the dollar sign is used only to provide correct ordering and when comparing the section names the linker will only look at the characters in front of the dollar sign. Because, in the executable, sections cannot have names of more than eight characters, in practice you ought to limit the number of characters in front of the dollar sign to eight.
CONST SECTION '.xdata' ALIGN 8
;
;UNWIND_INFO
;
CONST SECTION '.pdata' ALIGN 4
;
;RUNTIME_FUNCTION
;
AdaptAsm [command line switches] inputfile[.ext]
If no input extension is specified, .asm is assumed.
/h=this help
/a=adapt a386 file
/m=adapt masm file
/n=adapt nasm file
/fo=specify output path/file eg. /fo GoAsm\adapted.asm
/l=create log output file
/o=don't ask before overwriting input file
/x64=adapt file for 64-bits
When adapting a TASM file, you can regard it as a MASM file if it is written for masm mode. I have not included a version for TASM's ideal mode.
What AdaptAsm does when adapting various input files.
For what happens using the /x64 switch see using AdaptAsm to help convert to 64-bit programs.
Action | a386 files using /a | masm files using /m | nasm files using /n |
---|---|---|---|
Unless the word is defined (eg. using an equate) square brackets are added to all memory references where they don't already have them eg. MOV EBX,MEM_REFERENCE becomes MOV EBX,[MEM_REFERENCE] but MOV EBX,OFFSET MEM_REFERENCE is left alone | yes | yes | no |
Puts memory references combined with square brackets into correct form eg. MOV DX,sHEXw[ECX*2] becomes MOV DX,[sHEXw+ECX*2] | yes | yes | no |
Adds ADDR to NASM memory references which do not have square brackets | no | no | yes |
Type indicators and overrides BYTE, BYTE PTR, WORD, WORD PTR, DWORD, DWORD PTR, QWORD, QWORD PTR, and TWORD, TWORD PTR are replaced by the shortened equivalents, B,W,D,Q and T | yes | yes | yes |
Swaps all direct quote immediates and word and dword character based data declarations so they read the correct way round eg. MOV [ESI],'exe.' becomes MOV [ESI],'.exe' ; MOV EAX,'morf' becomes MOV EAX,'from' ; DW 'GJ' becomes DW 'JG' ; DD 'dcba' becomes DD 'abcd' | yes | yes | no |
Changes HELLO LABEL to HELLO: | yes | yes | no |
The MASM-type @@: local labels and @F and @B jumps are converted to GoAsm format. They are given numbers in sequence through the file and where necessary the > indicator is added in the jump instructions. | no | yes | no |
The NASM-type local labels preceded by a period eg. (".23") are converted to GoAsm format. The numbers remain unchanged but where necessary the > indicator is added in the jump instruction. | no | no | yes |
In jump instructions NEAR and SHORT are removed (no longer used) | yes | yes | yes |
Changes FPU registers from just a number to ST0 to ST7 eg. FDIV 0,1 becomes FDIV ST0,ST1 | yes | no | no |
Changes FPU registers from ST(0) to ST(7) to read ST0 to ST7 eg. FDIV ST(0),ST(1) becomes FDIV ST0,ST1 | no | yes | no |
Data declarations made using BYTE, ACHAR, SBYTE, are changed to DB. Data declarations made using WORD, SWORD, SHORTINT are changed to DW. Data declarations made using DWORD, HDC, ATOM, BOOL, HDWP, HPEN, HRGN, HSTR, HWND, LONG, LPFN, UINT, HFILE, HFONT, HICON, HHOOK, HMENU, HRSRC, HTASK, LPINT, LPSTR, LPVOID, WCHAR, HACCEL, HANDLE, HBRUSH, HLOCAL, LPARAM, LPBOOL, LPCSTR, LPLONG, LPTSTR, LPVOID, LPWORD, SDWORD, WPARAM, HBITMAP, HCURSOR, HGDIOBJ, HGLOBAL, INTEGER, LONGINT, LPBYTE, LPCTSTR, LPCVOID, LPDWORD, LRESULT, POINTER, WNDPROC, COLORREF, HPALETTE, HINSTANCE, HINTERNET, HMETAFILE, HTREEITEM, HCOLORSPACE, LOCALHANDLE, GLOBALHANDLE, HENHMETAFILE are all changed to DD. Data declarations made using QWORD and DWORDLONG are changed to DQ. Data declarations made using TWORD are changed to DT. | no | yes | no |
TIMES duplicate data syntax used in NASM changed to DUP method of declaring duplicate data. Also RESB/RESW/RESD used in NASM to reserve uninitialised data changed to DUP ? method of declaring uninitialised data. | no | no | yes |
TEXTEQU changed to its "C" type #define version. The equates EQU and = are not changed since GoAsm supports these. | no | yes | no |
INCLUDE directive changed to #INCLUDE | yes | yes | yes |
%INCLUDE directive changed to #INCLUDE | no | no | yes |
Changes the IF/ELSE/ELSEIF/ENDIF/IFDEF series of directives (conditional assembly) to #IF/#ELSE/#ELSEIF/#ENDIF/#IFDEF. The .IF/.ELSE/.ELSEIF/.ENDIF/.IFDEF and .WHILE and .BREAK directives are left untouched. These will have to be changed back to "pure" assembler by hand. | no | yes | no |
Changes the %IF/%ELSE/%ELSEIF/%ENDIF/%IFDEF series of directives (conditional assembly) to #IF/#ELSE/#ELSEIF/#ENDIF/#IFDEF Changes the %DEFINE directive to #DEFINE | no | no | yes |
Comments out all lines beginning with EXTRN or EXTERN, GLOBAL, or PUBLIC. | yes | yes | yes |
PROC is changed to FRAME and ENDP to ENDF. In masm code, the parameters and the USES statement are adjusted to GoAsm syntax. | yes | yes | no |
The size of LOCAL data in an automated stack frame is changed to the GoAsm shorter versions (B,W,Q,T) and D is removed altogether since this is the GoAsm default. | no | yes | no |
Changes EVEN to ALIGN | yes | yes | yes |
Various lines which GoAsm does not support are commented out eg. NAME, TITLE, SUBTITLE, SUBTTL, PROTO lines etc. | yes | yes | yes |
Various lines which GoAsm does not support are removed altogether eg. .ERR, .EXIT, .LIST, .286 etc. | yes | yes | yes |
The word COMMENT is replaced by a semi-colon | yes | yes | yes |
In Win32 data on the stack is held in dwords, and the value of ESP is always on a dword boundary after a PUSH or POP operation. GoAsm does support half stack operations, however, which push onto and pop from the stack only two bytes at a time instead of four. When using these instructions you must push or pop a second time to restore ESP to a dword boundary. To make the syntax obvious, GoAsm requires the use of PUSHW and POPW for these half-stack operations. PUSH and POP cannot be used - they always perform a dword stack operation. As an example, half stack instructions can be used in response to the WM_LBUTTONDOWN message:-
MOUSEX_POS DD 0
MOUSEY_POS DD 0
PUSH [EBP+14h] ;push lParam onto the stack
POPW [MOUSEX_POS] ;take the loword first
POPW [MOUSEY_POS] ;then the hiword
or when using some APIs which receive data from the stack in both the loword and hiword:-
PUSH ADDR lpFileTime
PUSHW [wFatTime]
PUSHW [wFatDate]
CALL DosDateTimeToFileTime
GoAsm also supports the PUSHAW, PUSHFW and POPFW, POPAW instructions, although you would not normally use these because GoAsm is used only in 32-bit programming.
PUSH FLAGS
and
POP FLAGS
This feature can also be used with invoke and with uses.
See also:-
PUSH or ARG pointers to strings and data,
callback stack frames in 32-bits and 64-bits.
In GoAsm segment overrides can be either before or after the mnemonic, for example:-
FS OR D[24h],100h
OR FS D[24h],100h
FS MOV [ESI],EAX
MOV FS[ESI],EAX
But segment overrides cannot be in a position where they can be confused with a segment register, nor can they be inside square brackets, so this is not allowed:-
PUSH FS[0] ;use FS PUSH [0] instead
POP FS[0] ;use FS POP [0] instead
MOV [FS:0],EAX ;use FS MOV [0],EAX instead
@line current line being assembled
@filename main source script being assembled
@filecur current file being assembled
These words are case insensitive. @filecur shows the current file following assembly into "a" include files, whereas @filename shows the name of the very first source script given to GoAsm at start-up.
@line provides a 32-bit integer and can be used as follows:-
MOV EAX,@line ;eax given the line number
PUSH @line ;line number pushed on the stack
DD @line ;line number declared in memory
@filename and @filecur provide pointers to a string containing the name and can be used as follows:-
PUSH @filename ;pointer to null terminated string
DB @filename ;not null terminated string
DB @filename,0 ;null terminated string
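As an illustration only, the forms shown above could be combined to record where in the source a particular piece of code sits, for example to help with error reporting. The labels SRC_NAME, SRC_LINE and START are hypothetical names chosen for this sketch.

```asm
DATA SECTION
SRC_NAME DB @filename,0     ;null terminated name of the main source script
SRC_LINE DD @line           ;line number of this declaration, held in memory
CODE SECTION
START:
MOV EAX,@line               ;eax given the number of this source line
MOV ESI,ADDR SRC_NAME       ;esi points to the name of the source script
RET
```

Here eax and esi could then be passed to your own reporting routine.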
Because both of these operators give positions in memory in the executable as loaded by Windows their values are not known to GoAsm at assemble-time, nor to the linker at link-time. In this respect they act like a code or data label. When you use them they do not have a value but they can be subtracted from each other or from memory references within the same section to produce a value. This is because their relative values are known.
HELLO DB 'He finally got the courage to talk to her',0
LENGTHOF_HELLO DB $-HELLO
Note that the $ location counter refers to the position of the label LENGTHOF_HELLO, so that the length of the string will be contained in data at that place and will be exact.
MESSAGES DD ENDOF_MESSAGES-$
DD MESS1,MESS2,MESS3,MESS4,MESS5
ENDOF_MESSAGES:
Which is the same as this:-
MESSAGES DD ENDOF_MESSAGES-MESSAGES
DD MESS1,MESS2,MESS3,MESS4,MESS5
ENDOF_MESSAGES:
See that the first dword of the table of values contains the value of 24 since the first dword is counted too. With a little bit of arithmetic you can get the number of values in the table, in this case five:-
MESSAGES DD (ENDOF_MESSAGES-$-4)/4
DD MESS1,MESS2,MESS3,MESS4,MESS5
ENDOF_MESSAGES:
In this instruction
LABEL400:
JMP $+20h ;continue execution 20h bytes ahead
the location counter refers to the position of LABEL400, which is the same position as the beginning of the JMP instruction. Therefore the five bytes in the JMP instruction itself (relative jump using the opcode E9) must be allowed for in the calculation.
Here are some other examples of use in the code section:-
CALL $$ ;a call to the start of the current section
MOV EAX,$-$$ ;get distance to current location from start of current section
Here are some other examples of use of $ and $$ in the data section:-
HELLOZ DD $$ ;HELLOZ to hold position at top of section
DB 100-($-$$) DUP 0 ;pad with zeroes to offset 100
When used inside a definition the location counter refers to the location when the definition is used rather than when it is declared. For example, here the $$ refers to the start of the code section since the definition globule is used within the code section:-
#define globule $$+2+3
CODE SECTION
MOV EAX,globule
In assembler, you can control both data and code alignment. We shall look first at data alignment and then briefly look at code alignment.
Alignment to satisfy Windows
For 32-bit Windows (NT/2000/XP and Vista running as Win32) the destinations of many pointers to data given to the APIs need to be dword aligned, and often this is undocumented.
Even under Windows 9x there are several pointer destinations which must be dword aligned, for example the structures DLGITEMTEMPLATES and DLGITEMTEMPLATESEX. Also the Menu, class, title and font data in a DLGTEMPLATE must be word aligned and the structures used in the Network Management APIs must be dword aligned.
Certain members of bitmap structures must be aligned internally. XP requires that the height and width of compatible bitmaps is always divisible by four to ensure that each line is aligned properly.
There are certain SSE and SSE2 instructions which require 16-byte alignment of the memory area which they are dealing with, for example FXSAVE, FXRSTOR, MOVAPD, MOVAPS and MOVDQA.
For 64-bit Windows the alignment requirements are even stricter. It is essential to ensure that structure members are aligned on their "natural boundary". So a word should be on a word boundary, a dword on a dword boundary, qword on a qword boundary etc. This only works if the structure itself is properly aligned on the correct boundary. Basically the structure should be aligned on the natural boundary of its largest member. It is also important for the structure to end on the natural boundary of its largest member, if necessary by adding padding. Also in 64-bit Windows, the stack pointer RSP should also always be aligned to a 16-byte boundary on making an API call. More information about alignment requirements in 64-bit programming.
For both 32-bits and 64-bits, if the alignment is wrong the results are unpredictable, varying from mere non-appearance of controls, to program exit.
Alignment for speed
The alignment which achieves the greatest speed varies from processor to processor but generally it is a good idea to ensure that data is aligned in memory to suit the size in which it operates. For example a table of dwords would best be on a dword boundary. In theory both qwords and twords ought to be qword aligned for best performance.
For Win64 GoAsm automatically aligns structures and structure members
to suit the natural boundary of the structure and its members. GoAsm also pads
the size of the structure to suit. GoAsm also automatically aligns the stack
pointer ready for an API call.
See writing 64-bit programs for more information
about how this works in practice.
ALIGN 4 ;the next data will be dword aligned
ALIGN 16 ;(or ALIGN 10h) will align on the next 16 byte boundary
In order to achieve the alignment, in a code section GoAsm pads with instruction NOP (opcode 90h), which performs no operation. In data or const sections GoAsm pads using zeroes to the correct place.
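The following is a minimal sketch of how ALIGN might be used in a data section to meet the requirements described above. The labels FLAG, TABLE, VECTOR and START are hypothetical, and it is assumed that the data section itself starts on at least a 16-byte boundary in the final executable.

```asm
DATA SECTION
FLAG   DB 1               ;a single byte, so the next data would be misaligned
ALIGN 4                   ;pad with zeroes to the next dword boundary
TABLE  DD 1,2,3,4         ;table of dwords, now dword aligned
ALIGN 16                  ;pad with zeroes to the next 16-byte boundary
VECTOR DD 4 DUP 0         ;16 bytes of data on a 16-byte boundary
CODE SECTION
START:
MOVAPS XMM0,[VECTOR]      ;MOVAPS requires its memory operand to be 16-byte aligned
RET
```

Without the ALIGN 16 line, VECTOR would follow TABLE directly and would only be 16-byte aligned by accident.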
See also sections - some advanced use on section alignment.
MOV EAX,SIZEOF Hello
MOV EAX,SIZEOF(Hello)
MOV EAX,[ESI+SIZEOF Hello]
SUB ESP,SIZEOF Hello
DD SIZEOF Hello
Label DB SIZEOF Hello DUP 0
MOV EAX,SIZEOF Hello+4
MOV EAX,SIZEOF Hello-4
MOV EAX,SIZEOF Hello/2
MOV EAX,SIZEOF Hello*2
When referring to a data label SIZEOF finds the distance in the raw data from the specified label to the next label in the section in which the label was declared or to the end of the section whichever is the earlier so that in:-
Hello DB 0
DD 0
Hello2 DB 0
then
MOV EAX,SIZEOF Hello
loads into eax the value five.
WrongB DB 'You pressed the wrong button!',0
then
MOV EAX,SIZEOF WrongB
returns the length of the string and the null terminator
START:
XOR EAX,EAX
XOR EAX,EAX
XOR EAX,EAX
XOR EAX,EAX
LABEL:
MOV EAX,SIZEOF START
The value eight will be loaded into eax. This is the size of the four xor eax,eax instructions.
Rect STRUCT
left DD
top DD
right DD
bottom DD
ENDS
rc Rect
then both
MOV EAX,SIZEOF Rect
and
MOV EAX,SIZEOF rc
load into eax the value 16.
Rect STRUCT
left DD
DD
right DD
bottom DD
ENDS
rc Rect
then both
MOV EAX,SIZEOF rc.left
and
MOV EAX,SIZEOF Rect.left
return a value of 8.
Sleep STRUCT
DW 2222h
DB 0h
ENDS
Ness UNION
Possums DB L'Balance'
Koalas Sleep
Devils DB 'Roar'
ENDS
Happy Ness
SizeLabel DD SIZEOF Happy
DD SIZEOF Ness
DD SIZEOF Happy.Possums
DD SIZEOF Happy.Koalas
DD SIZEOF Happy.Devils
Each dword in SizeLabel contains 14, which is the size of the largest union member, Happy.Possums, which contains a Unicode string which is 14 bytes long.
When getting the size, any arguments which might otherwise be used to change the size of the structure are ignored for example
StringStruct STRUCT
DB ?
ENDS
then
MOV EAX,SIZEOF StringStruct
would return a size of 1 byte even if the structure has been implemented using
LongString StringStruct <'I hate structures'>
which enlarged the structure to 17 bytes in this implementation. Similarly, if the length of the string relies on resolution of a definition, for example
StringStruct STRUCT
DB LONGSTRING
ENDS
then
MOV EAX,SIZEOF StringStruct
would return 1 byte regardless of the value of LONGSTRING
Rect STRUCT
DB TWELVE DUP 0
ENDS
will be resolved properly.
PARAM_STRUCT STRUCT
DD 0
DD 0
DD 0
ENDS
ps1 PARAM_STRUCT <SIZEOF PARAM_STRUCT,,>
MyWndProc FRAME
LOCALS hDC,DemonFlag:B,Buffer[256]:B,MyRect:RECT
MOV EAX,SIZEOF hDC ;4 in 32-bits, 8 in 64-bits (default size)
MOV EAX,SIZEOF DemonFlag ;1
MOV EAX,SIZEOF Buffer ;256
MOV EAX,SIZEOF MyRect ;16
RET
ENDF
The size which is returned ignores any padding added by GoAsm to align the local data properly on the stack (all local data is dword aligned in 32-bit assembly, and qword aligned in 64-bit assembly).
CMP EDX,EAX ;compare edx and eax
JZ >L1 ;branch to L1 if edx=eax
;lots of other instructions
L1:
The processor cannot be 100% sure in advance whether edx will equal eax when line 1 is executed. It might predict this to be unlikely, however, in which case it will set up the instructions just after the conditional jump to be executed next. If this prediction is right, those instructions will be executed immediately without any time loss. If the prediction is wrong, there will be some time loss in switching to the correct instruction (just after L1) instead. Various branch prediction algorithms are used by the processor, including ones which learn from their mistakes. One of the most basic predictions used as a starting point assumes that:-
All forward conditional jumps will not take place, and
All backwards conditional jumps (loops back) will take place.
It is documented that in normal code, backward conditional jumps tend to take place 80% of the time, whereas forward conditional jumps will be likely not to take place. It is said that overall, the default prediction is correct 65% of the time. So predicting whether or not the conditional jump will take place can speed up the code particularly on a series of loops back. As an assembler programmer, you can produce fast code knowing the default prediction for the processor that you are programming for since you are in complete control over your code.
2Eh - hint that the branch will not occur most of the time.
3Eh - hint that the branch will occur most of the time.
Using branch hint 2Eh would be useful (for example) if you have a backwards branch in a conditional jump which occurs only in the case of an error. The processor would usually predict that the backwards branch is likely to happen, thereby slowing down the code at that particular point. Inserting 2Eh will stop the processor making that prediction.
DB 2Eh ;or DB 3Eh
If you prefer you can insert the branch hint bytes automatically.
hint.nobranch ;insert 2Eh
hint.branch ;insert 3Eh
Going back to the first example, if edx normally does equal eax, then this code will run faster on the P4 by adding a branch hint as follows:-
CMP EDX,EAX ;compare edx and eax
hint.branch
JZ >L1 ;normally does branch to L1
;lots of other instructions
L1:
I have included a speed test in TestBug which proves the speed improvement which can be obtained using branch hints. I found for example that some code ran some 1.5 times faster on a P4 processor using the correct branch hint.
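The opposite case mentioned earlier, a backwards branch which is taken only on an error, might be sketched as follows. TRY_OPERATION is a hypothetical routine invented for this example, which is assumed to return zero in eax only in the rare event of an error. No speed gain is claimed for this sketch, since that depends on the processor.

```asm
CODE SECTION
START:
L1:
CALL TRY_OPERATION      ;hypothetical routine, eax=0 only on a rare error
OR EAX,EAX              ;set the zero flag if eax is zero
hint.nobranch           ;insert 2Eh, the jump back is not expected to be taken
JZ L1                   ;backwards jump, taken only to retry after an error
RET
;
TRY_OPERATION:
MOV EAX,1               ;placeholder body which always reports success
RET
```

Without the hint, the default prediction would treat the backwards jump to L1 as likely to be taken.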
ST0 ST1 ST2 ST3 ST4 ST5 ST6 ST7
the MMX and 3DNow! registers can be addressed using
MM0 MM1 MM2 MM3 MM4 MM5 MM6 MM7
and the XMM registers can be addressed using
XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7
In 64-bit processors there are eight new XMM registers, addressable using
XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
See writing 64-bit programs for the other new registers and register addressing methods.
FCLEX FINIT FWAIT FSAVE FSETPM FSTCW FSTENV FSTSW
GOASM_REPORTTIME
one or more times in your source script. The time which is reported is the time to assemble from the last GOASM_REPORTTIME (or start of assembly) up to that line, and the time to assemble to the next GOASM_REPORTTIME (or end of assembly). Times exclude set-up and clean-up times.
GOASM_ECHO Assembly has reached new heights
The material to write to the console does not have to be in quotes, although it can be.
You can force GoAsm to stop assembly and exit using:-
GOASM_EXIT
GoAsm Test.asm >output.fil
Another way to control GoAsm's error and warning output is by using these switches on the command line:-
/b beep on error
/ne no error messages
/ni no information messages
/nw no warning messages
/no no output messages at all
A warning only will also be given if a word has been defined more than once in the command line or in the source script, but assembly is allowed to continue. This is because it would be unusual to define a word more than once and it may be that this is a programming error. It is perfectly permissible to cancel a previous definition using #undef so that the word can be defined. In that case no warning is given.
In a batch file, you can use the error return with ERRORLEVEL, for example the following will pause if there is an error return:-
GoAsm MyFile.asm
IF ERRORLEVEL 1 PAUSE
See the GoLink help file for full information about GoLink and how to use it. A typical batch file (with an extension .bat) to create a simple executable might be:-
GoAsm MyProg.asm
GoLink MyProg.obj Kernel32.dll User32.dll
This will create a Windows PE file MyProg.exe with imports from the mentioned Dlls. The entry address START is assumed but this may be specified.
You can use a command file with GoLink for example:-
GoLink @command.fil
Instead of (or in addition to) specifying the Dlls in GoLink's command line or file you can use #DYNAMICLINKFILE in GoAsm source code. The syntax is:-
#dynamiclinkfile path/filename, path/filename
The comma is optional. The path/filename can be in quotes. One or more path/filenames can be specified. You don't need to provide the path when specifying system files since GoLink looks inside the system folders automatically. The filename must have its extension which can be .dll, .ocx, .exe or .drv.
There are several other command line switches and options when using GoLink. For more details please see the GoLink help file, GoLink.htm.
GoAsm MyProg.asm
ALINK @Respons.fil >link.opt
You run the batch file by entering its name on the command line in an MS-DOS (command prompt) window and pressing enter.
-m ;produces map file
-oPE ;makes a PE file
-o MyProg.Exe ;gives the output file name
-entry START ;signifies the starting address
-debug ;signifies debug symbols to be made
kernel32.lib ;
COMCTL32.lib ; a lib file for each API
COMDLG32.lib ; which is called in your program,
user32.lib ; made using ALIB which is
gdi32.lib ; part of the ALINK package
shell32.lib ;
MyProg.obj ;the input file
You can organise your work by keeping the files in various folders in which case you would need to include the paths in the instructions given to GoAsm and ALINK.
In every such case the decoration is in this form:-
_CodeLabel@x
where the label is declared as CodeLabel and where x is the number of bytes used by CodeLabel's parameters. Decorated in this way, CodeLabel is available to other object files being linked by the MS linker (and therefore can be called from those other object files). And if CodeLabel resides in a DLL, the MS Linker will recognise it as such from a lib file made from the DLL and given to it at link-time.
At link-time the MS linker expects the value of "@x" in both the caller and the callee to match exactly. This is therefore a limited form of parameter checking. When the /ms switch is used, GoAsm therefore needs to count the number of parameters used by the code label in order to get the value of "@x" correct. To achieve this, in the case of labels to FRAMEs, GoAsm counts the number of parameters declared in the FRAME and adjusts the decoration accordingly. In the case of a call using INVOKE again GoAsm counts the number of parameters used.
However, GoAsm cannot count the number of parameters to a call using CALL and assumes there are none. For this reason if there are any parameters you must use INVOKE for the decoration to work properly (and not an ordinary PUSH xxx, then CALL). Also note that if using ARG before INVOKE, each argument needs to be on its own line (not ARG 1,2,3).
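As a sketch of the last point, a call using ARG with the /ms switch might be laid out as below, with one ARG per line so that GoAsm can count the parameters. The labels MSG_TITLE, MSG_TEXT and START are hypothetical, and it is assumed that, as with PUSH, the arguments are given with the last parameter first.

```asm
;assemble using GoAsm /ms
DATA SECTION
MSG_TITLE DB 'Hello',0
MSG_TEXT  DB 'Click OK',0
CODE SECTION
START:
ARG 40h                 ;uType
ARG ADDR MSG_TITLE      ;pointer to the caption
ARG ADDR MSG_TEXT       ;pointer to the message text
ARG 0                   ;no owner window
INVOKE MessageBoxA      ;four ARGs counted, giving 16 bytes in the decoration
RET
```

Written as ARG 40h,ADDR MSG_TITLE,ADDR MSG_TEXT,0 on a single line, the parameter count (and so the decoration) would not be reliable.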
So, for example if you use the following code with the /ms switch in GoAsm's command line:-
HelloProc FRAME hwnd,arg1,arg2
INVOKE MessageBoxA, [hwnd],'Click OK','Hello',40h
Then GoAsm will insert the symbol HelloProc in the object file as
_HelloProc@12
and the called function in the object file as
_MessageBoxA@16
This is because GoAsm knows that 12 bytes are on the stack in the case of HelloProc and 16 bytes are pushed on the stack before MessageBoxA is called.
From GoAsm Version 0.49, ordinary code labels without parameters are also decorated. This is to enable such code labels to be recognised externally at link-time so that they can be called by other object files created by MS tools. They are given a parameter byte count of zero. This includes the label giving the starting address itself. So suppose your starting address in your source script is START: (no leading underline character). This is now decorated as:-
_START@0
and to link properly using the MS Linker you would include this line in the linker's command line or file:-
-ENTRY START@0
For example:-
PUSH 12h
CALL _GetKeyState@4     ;check for alt-key pressed
There is only one parameter to the API GetKeyState so four bytes are put on the stack.
If you are making a dll, you will need to use a label for the starting address decorated in the same way, for example
_DLLENTRY@12:
This indicates that the label DLLENTRY is called with 12 bytes pushed on the stack, i.e. 3 dwords.
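A minimal sketch of such an entry point is given below. It assumes the usual Windows convention for a DLL entry routine: three dword parameters, a non-zero return value in EAX to signal success, and the callee removing its own parameters from the stack. The body is illustrative only and does no real initialisation.

```asm
_DLLENTRY@12:         ;called by the system with 3 dword parameters
MOV EAX,1             ;return non-zero (TRUE) so that loading continues
RET 0Ch               ;return and remove the 12 bytes of parameters
```

The 0Ch given to RET is the same 12 bytes that appears after the @ in the label.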
After you have made those changes to your source script you are ready to make a batch file with the extension .bat to assemble the source and run the linker. These lines might be in the batch file:-
GoAsm MyProg.asm
LINK @Respons.fil
You run the batch file by entering its name on the command line in an MS-DOS (command prompt) window and pressing enter. The file Respons.fil might contain the following lines:-
/OUT:MyProg.Exe       ;gives name of output file
/MAP                  ;produces map file
/SUBSYSTEM:WINDOWS    ;makes a Windows GUI executable
/ENTRY:START          ;you have it as _START in the source!
/DEBUG:FULL           ;do a debug output
/DEBUGTYPE:COFF       ;do embedded COFF symbols
MyProg.obj            ;the input file
comctl32.lib          ;
user32.lib            ; lib files for each API which
gdi32.lib             ; your program calls, these
kernel32.lib          ; files come with the linker
GetKeyState=_GetKeyState@4
CALL GetKeyState
or, to call the API more directly you can use:-
GetKeyState=__imp__GetKeyState@4
CALL [GetKeyState]
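The same equate approach should extend to APIs with more parameters. As a sketch, take MessageBoxA, which has four dword parameters and so carries a byte count of 16. The register contents are assumed for illustration only (EDX holding the address of the title, EAX the address of the message text).

```asm
MessageBoxA=__imp__MessageBoxA@16   ;plain name equated to the decorated import symbol
PUSH 40h,EDX,EAX,[hwnd]             ;four dword parameters = 16 bytes
CALL [MessageBoxA]                  ;indirect call through the import pointer
```

With the equate in place, the rest of the source script can keep using the undecorated API name.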
Copyright © Jeremy Gordon 2001-2016