Chapter 7 - Arrays and Strings
An array is a contiguous block of storage in memory. Every element of the array has the same size. We access an element of the array using:
- Base address / address of the first element of the array.
- Size of each element of the array.
- Index of the element we want to access.
One Dimensional Arrays
NASM has no array indexing operator like [ ] in C / C++ / Java that selects an element by its index (in NASM, [ ] only dereferences an address). Instead, we compute the address of each element ourselves, usually inside an iterative control structure, and traverse through the elements of the array.
- Declaring / Initializing an array
An array is declared by listing its initial values after a data directive such as DB, DW or DD. We can also use the TIMES prefix to repeat a given value and thus easily create many array elements.
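A minimal sketch of such declarations, assuming 32-bit Linux and the int 0x80 system call interface. The names array2 to array5 are hypothetical and chosen only for illustration.

```nasm
; Build (assumed toolchain): nasm -f elf32 decl.asm && ld -m elf_i386 decl.o -o decl
section .data
array1: db 2, 4, 6, 8, 10        ; five 1-byte elements with initial values
array2: dw 100, 200, 300         ; three 2-byte (word) elements
array3: times 10 dd 0            ; ten 4-byte elements, all initialised to 0

section .bss
array4: resb 50                  ; 50 uninitialised bytes
array5: resd 20                  ; 20 uninitialised doublewords

section .text
global _start
_start:
        movzx ebx, byte [array1 + 2]   ; third element of array1 (value 6)
        mov   eax, 1                   ; sys_exit
        int   0x80                     ; exit status = 6
```

Running the program and then `echo $?` in the shell should show the exit status, which here carries the value of the third element.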
- Reading an n-sized array
Pseudo Code:
i=0
while(i < n)
    read(num)
    *(arr+i)=num
    i++
endwhile
NASM Code:
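The original listing is not reproduced here. The following is a simplified sketch of the pseudo code, assuming 32-bit Linux (int 0x80) and, to keep it short, that every element is a single decimal digit. The names N, arr and digit are hypothetical.

```nasm
N equ 5                           ; number of elements to read (assumed)

section .bss
arr:    resb N                    ; the array, one byte per element
digit:  resb 1                    ; one character of input

section .text
global _start
_start:
        xor   esi, esi            ; i = 0
read_loop:
        cmp   esi, N              ; while (i < n)
        jge   done
        mov   eax, 3              ; sys_read
        mov   ebx, 0              ; from stdin
        mov   ecx, digit
        mov   edx, 1              ; one byte at a time
        int   0x80
        cmp   eax, 1
        jne   done                ; stop on end of input or error
        mov   al, [digit]
        cmp   al, '0'
        jb    read_loop           ; skip anything that is not a digit
        cmp   al, '9'
        ja    read_loop
        sub   al, '0'             ; ASCII digit -> number
        mov   [arr + esi], al     ; *(arr + i) = num
        inc   esi                 ; i++
        jmp   read_loop
done:
        mov   ebx, esi            ; exit status = number of elements stored
        mov   eax, 1              ; sys_exit
        int   0x80
```

Reading multi-digit numbers needs an extra loop that accumulates num = num * 10 + digit, which is left out of this sketch.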
- Printing an n-sized array
Pseudo Code:
i=0
while(i < n)
    print *(arr+i)
    i++
endwhile
NASM Code:
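The original listing is not reproduced here. As a hedged sketch of the printing loop, assuming 32-bit Linux (int 0x80) and single-digit elements so that each one converts to a single ASCII character (the name buf is hypothetical):

```nasm
section .data
array1: db 3, 1, 4, 1, 5          ; sample data, single digits only
N       equ $ - array1            ; number of elements

section .bss
buf:    resb 2                    ; one digit character plus a newline

section .text
global _start
_start:
        xor   esi, esi            ; i = 0
print_loop:
        cmp   esi, N              ; while (i < n)
        jge   done
        mov   al, [array1 + esi]  ; num = *(arr + i)
        add   al, '0'             ; number -> ASCII digit
        mov   [buf], al
        mov   byte [buf + 1], 10  ; newline after each element
        mov   eax, 4              ; sys_write
        mov   ebx, 1              ; to stdout
        mov   ecx, buf
        mov   edx, 2
        int   0x80
        inc   esi                 ; i++
        jmp   print_loop
done:
        mov   eax, 1              ; sys_exit
        xor   ebx, ebx
        int   0x80
```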
The label which we use to create the array (eg: 'array1') acts as a pointer to the base address of the array. We can access each element by adding a suitable offset to this base address and then dereferencing it.
Read an array and find the average of the numbers in the array
The general syntax of an array offset (effective address) is:
[basereg + factor * indexreg + constant]
- basereg: A general purpose register containing the base address of the array.
- factor: It can be 1, 2, 4 or 8, usually the size of one element in bytes.
- indexreg: Any general purpose register except ESP.
- constant: A signed integer displacement.
Example:
byte[ebx+12]
word[ebp + 4 * esi]
dword[ebx - 12]
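To show the scaled form in use, here is a small sketch that sums a doubleword array and computes the average, assuming 32-bit Linux (int 0x80). The names nums and COUNT are hypothetical.

```nasm
section .data
nums:   dd 10, 20, 30, 40
COUNT   equ ($ - nums) / 4        ; number of 4-byte elements

section .text
global _start
_start:
        mov   ebx, nums           ; basereg = base address of the array
        xor   esi, esi            ; indexreg = 0
        xor   eax, eax            ; running sum
sum_loop:
        add   eax, [ebx + 4*esi]  ; factor 4 because each element is a dword
        inc   esi
        cmp   esi, COUNT
        jl    sum_loop
        xor   edx, edx            ; EDX:EAX = sum, ready for division
        mov   ecx, COUNT
        div   ecx                 ; EAX = sum / COUNT = 100 / 4
        mov   ebx, eax            ; exit status = 25
        mov   eax, 1              ; sys_exit
        int   0x80
```

Using the scale factor keeps ESI as a plain element index, so the loop counter and the array index are the same register.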
A sample program that searches an array for an element (traversal) is given in the appendix.
Strings
Strings are stored in memory as arrays of characters. Each character of the English alphabet has a unique numeric code called its ASCII code, a 7-bit value that is stored in one byte. When the user types a string, they press the Enter key at the end. When we read that input using the read system call, the Enter key press arrives as a newline character with ASCII code 10. Thus we can detect the end of the string.
Now let's get started to work with strings.
- Declaring / Initializing a string
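A minimal sketch of string declarations, assuming 32-bit Linux (int 0x80). The names msg, msg_len and inbuf are hypothetical.

```nasm
section .data
msg:     db 'Hello, world', 10    ; characters followed by newline (ASCII 10)
msg_len  equ $ - msg              ; length in bytes, computed by the assembler

section .bss
inbuf:   resb 50                  ; uninitialised space for a string read later

section .text
global _start
_start:
        mov   eax, 4              ; sys_write
        mov   ebx, 1              ; to stdout
        mov   ecx, msg
        mov   edx, msg_len
        int   0x80
        mov   eax, 1              ; sys_exit
        xor   ebx, ebx
        int   0x80
```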
- Reading a string
Pseudo Code:
i=0
do
    read(num)
    *(arr+i)=num
    i++
while(num != '\n')
NASM Code
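The original listing is not reproduced here. One possible sketch, assuming 32-bit Linux (int 0x80), reads one character per system call until the newline arrives. The names MAXLEN and str_buf are hypothetical, and the length check is an addition to keep the buffer from overflowing.

```nasm
MAXLEN equ 100                    ; buffer capacity (assumed)

section .bss
str_buf: resb MAXLEN

section .text
global _start
_start:
        mov   edi, str_buf        ; address of the next free character
        xor   esi, esi            ; i = 0 (characters stored so far)
read_char:
        cmp   esi, MAXLEN
        jge   done                ; buffer full
        mov   eax, 3              ; sys_read
        mov   ebx, 0              ; from stdin
        mov   ecx, edi            ; store directly into the string
        mov   edx, 1
        int   0x80
        cmp   eax, 1
        jne   done                ; end of input or error
        inc   esi                 ; i++
        cmp   byte [edi], 10      ; was it the newline?
        je    done
        inc   edi
        jmp   read_char
done:
        mov   ebx, esi            ; exit status = characters stored (newline included)
        mov   eax, 1              ; sys_exit
        int   0x80
```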
- Printing a string
Pseudo Code:
i=0
while(*(arr+i) != '\n')
    print(*(arr+i))
    i++
endwhile
NASM Code
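The original listing is not reproduced here. A hedged sketch, assuming 32-bit Linux (int 0x80), is shown below. Unlike the pseudo code it also prints the terminating newline, so that the output ends on a fresh line. The name text_msg is hypothetical.

```nasm
section .data
text_msg: db 'NASM strings', 10   ; string terminated by newline (ASCII 10)

section .text
global _start
_start:
        mov   esi, text_msg       ; address of the current character
print_char:
        mov   eax, 4              ; sys_write
        mov   ebx, 1              ; to stdout
        mov   ecx, esi            ; print the character at arr + i
        mov   edx, 1
        int   0x80
        cmp   byte [esi], 10      ; stop once the newline has been printed
        je    done
        inc   esi                 ; i++
        jmp   print_char
done:
        mov   eax, 1              ; sys_exit
        xor   ebx, ebx
        int   0x80
```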
- Sample Program - To count the number of occurrences of each vowel in a string
Two-Dimensional Array / Matrices
Memory / RAM is a linear sequence of storage locations, so we cannot directly
store 2-D arrays / matrices / tables in it. Every programming language therefore
implements 2-D arrays in either row major form or column major form.
In row major form we first store the 1st row, then the 2nd, and so on. In column
major form we store the 1st column, then the 2nd, and so on till the last element
of the last column. For example, suppose we have a 2 x 3 matrix A whose elements
are 1 byte each, and let the starting address of the array be 12340. Then the
array will be stored in memory as:
- Row Major Form
Address Element
12340 A[0][0]
12341 A[0][1]
12342 A[0][2]
12343 A[1][0]
12344 A[1][1]
12345 A[1][2]
- Column Major Form
Address Element
12340 A[0][0]
12341 A[1][0]
12342 A[0][1]
12343 A[1][1]
12344 A[0][2]
12345 A[1][2]
Using this concept we can implement the 2-D array in NASM Programs.
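In row major form the element A[i][j] of a matrix with COLS columns lies at offset (i * COLS + j) * element size from the base address. A small sketch of that calculation for the 2 x 3 byte matrix above, assuming 32-bit Linux (int 0x80) and hypothetical sample values:

```nasm
ROWS equ 2
COLS equ 3

section .data
A:      db 11, 12, 13             ; row 0
        db 21, 22, 23             ; row 1

section .text
global _start
_start:
        mov   eax, 1              ; i = 1
        mov   ecx, 2              ; j = 2
        imul  eax, COLS           ; i * COLS
        add   eax, ecx            ; offset = i * COLS + j = 5
        movzx ebx, byte [A + eax] ; A[1][2], element size is 1 byte
        mov   eax, 1              ; sys_exit
        int   0x80                ; exit status = 23
```

For column major storage the offset would instead be (j * ROWS + i) * element size.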
- Declaration/ Initialization
- Read elements into a matrix
Pseudo Code
read(m)
read(n)
i=0
k=0
while(i < m)
    j=0
    while(j < n)
        read(num)
        *(matrix + k)=num
        ++j
        ++k
    endwhile
    ++i
endwhile
NASM Code
- Read and print the elements of a matrix
Pseudo Code
read(m)
read(n)
i=0
k=0
while(i < m)
    j=0
    while(j < n)
        read(num)
        *(matrix + k)=num
        ++j
        ++k
    endwhile
    ++i
endwhile
i=0
k=0
while(i < m)
    j=0
    while(j < n)
        num=*(matrix+k)
        print(num)
        ++j
        ++k
    endwhile
    ++i
endwhile
NASM Code
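The original listing is not reproduced here. The sketch below covers only the printing half of the pseudo code, assuming 32-bit Linux (int 0x80), a fixed 2 x 3 matrix in place of the values read from the user, and single-digit elements. The names ROWS, COLS and buf are hypothetical.

```nasm
ROWS equ 2                        ; m (fixed here instead of being read)
COLS equ 3                        ; n

section .data
matrix: db 1, 2, 3                ; stored in row major form
        db 4, 5, 6

section .bss
buf:    resb 2                    ; digit plus separator

section .text
global _start
_start:
        xor   esi, esi            ; k = 0
        xor   edi, edi            ; i = 0
row_loop:
        cmp   edi, ROWS           ; while (i < m)
        jge   done
        xor   ebp, ebp            ; j = 0
col_loop:
        cmp   ebp, COLS           ; while (j < n)
        jge   row_end
        mov   al, [matrix + esi]  ; num = *(matrix + k)
        add   al, '0'             ; number -> ASCII digit
        mov   [buf], al
        mov   byte [buf + 1], ' ' ; space between elements
        lea   eax, [ebp + 1]
        cmp   eax, COLS
        jne   emit
        mov   byte [buf + 1], 10  ; newline after the last column
emit:
        mov   eax, 4              ; sys_write
        mov   ebx, 1              ; to stdout
        mov   ecx, buf
        mov   edx, 2
        int   0x80
        inc   ebp                 ; ++j
        inc   esi                 ; ++k
        jmp   col_loop
row_end:
        inc   edi                 ; ++i
        jmp   row_loop
done:
        mov   eax, 1              ; sys_exit
        xor   ebx, ebx
        int   0x80
```

With the sample data this should print the two rows "1 2 3" and "4 5 6" on separate lines.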
Array / String Operations
x86 processors have a set of instructions designed specially to make array / string
operations much easier than the traditional methods demonstrated above. They are
called String Instructions. Even though they are termed string instructions, they
work equally well for general array manipulation. They use the index registers (ESI
and EDI) and increment / decrement one or both of these registers after each
operation, by the size of the element processed. Whether the registers are
incremented or decremented depends on the value of the Direction Flag (DF).
The following instructions are used to set the value of DF manually:
- CLD - Clears the Direction Flag (DF = 0). The string instructions will then
increment the values of the index registers.
- STD - Sets the Direction Flag (DF = 1). The string instructions will then
decrement the values of the index registers.
NB: Always make sure to set the value of the Direction Flag explicitly before
using string instructions, else it may lead to unexpected errors.
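The effect of DF can be seen in a short sketch that walks forward and then backward over the same bytes with LODSB (load the byte at ESI into AL), assuming 32-bit Linux (int 0x80). The name letters is hypothetical.

```nasm
section .data
letters: db 'A', 'B', 'C'

section .text
global _start
_start:
        cld                       ; DF = 0: index registers move forward
        mov   esi, letters
        lodsb                     ; AL = 'A', ESI -> letters + 1
        lodsb                     ; AL = 'B', ESI -> letters + 2
        std                       ; DF = 1: index registers move backward
        lodsb                     ; AL = 'C', ESI -> letters + 1
        lodsb                     ; AL = 'B', ESI -> letters
        cld                       ; restore the usual forward direction
        movzx ebx, al             ; exit status = 66, the ASCII code of 'B'
        mov   eax, 1              ; sys_exit
        int   0x80
```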
For string operations, DS must be the segment base of the source string and ES
the segment base of the destination string. As we are running in protected mode
under an operating system, these registers have already been set up for us and we
need not set them manually. In real mode, however, we have to load them with the
base addresses of the proper segments ourselves.
- Reading an array element to reg(AL/AX/EAX):
LODSx : x = B / W / D - Load String Instruction
This instruction is used to copy one element from an array to the register.
It can transfer an element of size 1 byte / 1 word (2 bytes) / 1 doubleword
(4 bytes) at a time. The operations below are shown for DF = 0.
LODSB
AL = byte[DS:ESI]
ESI = ESI + 1
LODSW
AX = word[DS:ESI]
ESI = ESI + 2
LODSD
EAX= dword[DS:ESI]
ESI = ESI + 4
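As an illustration, the following sketch adds up a byte array by fetching each element with LODSB, assuming 32-bit Linux (int 0x80). The names arr and LEN are hypothetical.

```nasm
section .data
arr:    db 5, 10, 15, 20
LEN     equ $ - arr               ; number of 1-byte elements

section .text
global _start
_start:
        cld                       ; make LODSB move ESI forward
        mov   esi, arr            ; source index = base address
        mov   ecx, LEN            ; loop counter
        xor   ebx, ebx            ; sum = 0
next:
        lodsb                     ; AL = byte [ESI], ESI = ESI + 1
        movzx eax, al             ; widen the byte before adding
        add   ebx, eax
        loop  next                ; ECX = ECX - 1, repeat while ECX != 0
        mov   eax, 1              ; sys_exit
        int   0x80                ; exit status = 50
```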
- Storing a reg(AL/AX/EAX) to an array:
STOSx : x = B / W / D - Store String Instruction
This instruction is used to copy one element from a register to
an array. It can transfer an element of size 1 byte / 1 word (2 bytes) /
1 doubleword (4 bytes) at a time.
STOSB
byte[ES:EDI] = AL
EDI = EDI + 1
STOSW
word[ES:EDI] = AX
EDI = EDI + 2
STOSD
dword[ES:EDI] = EAX
EDI = EDI + 4
NB: ESI - Source Index register is used when the array acts as a source, ie. a
value is copied from it. EDI - Destination Index register is used when the array
acts as a destination, ie. a value is copied to it.
Eg: Program to increment the value of all array elements by 1
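The original listing is not reproduced here. A minimal sketch of the idea, assuming 32-bit Linux (int 0x80), loads each element with LODSB, increments it and writes it back with STOSB. The names arr and LEN are hypothetical.

```nasm
section .data
arr:    db 1, 2, 3, 4
LEN     equ $ - arr

section .text
global _start
_start:
        cld                       ; both index registers move forward
        mov   esi, arr            ; source: read from the array
        mov   edi, arr            ; destination: write back to the same array
        mov   ecx, LEN
inc_loop:
        lodsb                     ; AL = byte [ESI], ESI = ESI + 1
        inc   al                  ; element + 1
        stosb                     ; byte [EDI] = AL, EDI = EDI + 1
        loop  inc_loop
        movzx ebx, byte [arr + 3] ; last element, 4 before and 5 now
        mov   eax, 1              ; sys_exit
        int   0x80                ; exit status = 5
```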
- Memory Move Instructions:
These instructions are used to copy the elements of one array / string to
another.
MOVSx : x = B / W / D - Move String Instruction
MOVSB
byte[ES:EDI] = byte[DS:ESI]
ESI = ESI + 1
EDI = EDI + 1
MOVSW
word[ES:EDI] = word[DS:ESI]
ESI = ESI + 2
EDI = EDI + 2
MOVSD
dword[ES:EDI] = dword[DS:ESI]
ESI = ESI + 4
EDI = EDI + 4
Eg: Program to copy elements of an array to another
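The original listing is not reproduced here. A hedged sketch, assuming 32-bit Linux (int 0x80), copies a string one byte at a time with MOVSB and prints the copy. The names src, dst and LEN are hypothetical.

```nasm
section .data
src:    db 'copy me', 10
LEN     equ $ - src

section .bss
dst:    resb LEN                  ; destination of the same size

section .text
global _start
_start:
        cld                       ; copy in the forward direction
        mov   esi, src            ; source index
        mov   edi, dst            ; destination index
        mov   ecx, LEN            ; number of bytes to copy
copy_loop:
        movsb                     ; byte [EDI] = byte [ESI], both advance by 1
        loop  copy_loop
        mov   eax, 4              ; sys_write
        mov   ebx, 1              ; to stdout
        mov   ecx, dst            ; print the copy, not the original
        mov   edx, LEN
        int   0x80
        mov   eax, 1              ; sys_exit
        xor   ebx, ebx
        int   0x80
```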
- REP - Repeat String Instruction
REP 'string-instruction'
Repeats a string instruction. The number of repetitions is equal to the
value of the ECX register (just like the LOOP instruction). ECX is decremented
after each repetition and the repetition stops when it reaches 0.
Eg: Previous program can also be written as follows using the REP instruction:
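The original listing is not reproduced here. As a sketch of a copy done with REP, assuming 32-bit Linux (int 0x80), the example below moves a doubleword array with a single REP MOVSD. The names src, dst and COUNT are hypothetical.

```nasm
section .data
src:    dd 1, 2, 3, 4
COUNT   equ ($ - src) / 4         ; number of doublewords

section .bss
dst:    resd COUNT

section .text
global _start
_start:
        cld                       ; copy in the forward direction
        mov   esi, src
        mov   edi, dst
        mov   ecx, COUNT          ; REP counts elements, not bytes
        rep   movsd               ; repeat MOVSD until ECX = 0
        mov   ebx, [dst + 12]     ; last copied element
        mov   eax, 1              ; sys_exit
        int   0x80                ; exit status = 4
```

The explicit loop label and LOOP instruction disappear because REP performs the counting itself.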
- Compare Instructions
CMPSx : x = B / W / D - Compare String Instruction
Compares two array elements and affects the
CPU flags just like the CMP instruction.
CMPSB
Compares byte[DS:ESI] with byte[ES:EDI]
ESI = ESI + 1
EDI = EDI + 1
CMPSW
Compares word[DS:ESI] with word[ES:EDI]
ESI = ESI + 2
EDI = EDI + 2
CMPSD
Compares dword[DS:ESI] with dword[ES:EDI]
ESI = ESI + 4
EDI = EDI + 4
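A sketch of comparing two equal-length strings with CMPSB in an explicit loop, assuming 32-bit Linux (int 0x80). The names s1, s2 and LEN are hypothetical. The REPE prefix, which this section does not cover, can fold such a loop into one instruction.

```nasm
section .data
s1:     db 'assembly'
LEN     equ $ - s1
s2:     db 'assemble'             ; same length, last character differs

section .text
global _start
_start:
        cld                       ; compare in the forward direction
        mov   esi, s1
        mov   edi, s2
        mov   ecx, LEN
cmp_loop:
        cmpsb                     ; compare byte [ESI] with byte [EDI], advance both
        jne   differ              ; flags are set exactly as by CMP
        loop  cmp_loop            ; LOOP does not change the flags
        xor   ebx, ebx            ; all bytes matched: result 0
        jmp   exit
differ:
        mov   ebx, 1              ; a mismatch was found: result 1
exit:
        mov   eax, 1              ; sys_exit
        int   0x80                ; exit status = 1 for this data
```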
- Scan Instructions
SCASx : x = B / W / D - Compares a register(AL/AX/EAX) with an array
element and affects the CPU Flags just like CMP instruction.
SCASB
Compares value of AL with byte[ES:EDI]
EDI = EDI + 1
SCASW
Compares value of AX with word[ES:EDI]
EDI = EDI + 2
SCASD
Compares value of EAX with dword[ES:EDI]
EDI = EDI + 4
Eg: Scanning an array for an element
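The original listing is not reproduced here. A minimal sketch, assuming 32-bit Linux (int 0x80), searches a byte array for the value in AL and returns its index as the exit status (255 if it is absent). The names arr and LEN are hypothetical.

```nasm
section .data
arr:    db 7, 3, 9, 4, 6
LEN     equ $ - arr

section .text
global _start
_start:
        cld                       ; scan in the forward direction
        mov   edi, arr            ; SCASB reads through EDI
        mov   ecx, LEN
        mov   al, 4               ; value to search for
scan_loop:
        scasb                     ; compare AL with byte [EDI], EDI = EDI + 1
        je    found
        loop  scan_loop
        mov   ebx, 255            ; not found
        jmp   exit
found:
        mov   ebx, edi            ; EDI is one past the matching element
        sub   ebx, arr
        dec   ebx                 ; index of the match = 3
exit:
        mov   eax, 1              ; sys_exit
        int   0x80
```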