Learn ASM for basic reverse engineering
Hello, welcome back with me, Bayu Aji. This time I will share an ASM (assembly) tutorial, but first a short story. In my opinion ASM is difficult because it is a low-level language. Yep, this language sits close to the machine, and it is as hard to figure out as the contents of your heart, ea.. ea.. hahaha, just a joke. OK, a little more seriously: before we get into the material I will give a few warnings about ASM. What are they? Check it out.
1. Assembly language is a low-level language. It is much harder to read and write than high-level languages.
2. Assembly language is not portable. It depends on the CPU architecture: x86 and ARM are different architectures, so each has its own ASM.
3. Assembly language takes a lot of time. When we write software in this language we instruct the CPU directly, and that is difficult.
So why would anyone use this language? It is complicated, yet people still use it. Why? Let's look again: assembly language has several uses and benefits, including the following.
# Benefits of assembly language
Assembly language is the symbolic form of machine code, that is, a human-readable representation of it. Its benefits:
1. Implementing micro-optimization. This matters in software engineering, where micro-optimization is the optimization done last. First we optimize the algorithms and data structures and eliminate hidden costs; only after all of that do we move on to micro-optimization. Doing micro-optimization at the beginning leads to failed optimization.
2. Gain in-depth knowledge of computer architecture. When we explore ASM, we come to understand how the hardware, memory, branches, and more work.
3. Improve debugging capabilities
4. Basic reverse engineering
# How to use assembly language
The required software includes:
1. Assembler (NASM)
2. GCC
3. build-essential
Install them with:
sudo apt-get install nasm gcc build-essential
# Implementing ASM
OK, after installing, next we learn to write hello world in ASM. Save the code below as test.asm:
[section .data]
my_str:
db "hello world", 10 ; 10 is the ASCII code for the newline (enter) character
[section .text]
global _start
_start:
mov rax, 1
mov rdi, 1
mov rsi, my_str
mov rdx, 12
syscall
mov rax, 60
mov rdi, 0
syscall
How to compile
nasm -felf64 test.asm -o test.o
test.o is a relocatable object file, so we need to link it to build an executable:
ld test.o -o test
After that there will be 3 files:
test.asm is the source code
test.o is the relocatable object file
test is the executable
To run (execute) it:
./test
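Explanation
For a system call, the registers are used like this:
RDI = 1st argument
RSI = 2nd argument
RDX = 3rd argument
R10 = 4th argument (ordinary function calls use RCX here, but the syscall instruction overwrites RCX)
R8 = 5th argument
R9 = 6th argument
RAX = syscall number on entry, return value on exit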
mov rax
rax is the register that holds the syscall number. Here it is 1, and 1 is the write syscall.
mov rdi
rdi holds the first argument, the file descriptor. Here 1 means standard output (stdout).
mov rsi
rsi holds a pointer to the buffer whose contents will be written.
mov rdx
rdx holds the length of the buffer in bytes.
syscall switches into kernel space to run the system call, then returns with the result in rax.
Learn more : https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl
Example
ssize_t write(int fd, const void *buf, size_t count);
The function takes three arguments.
Argument 1, fd, is the file descriptor to write to.
Argument 2, buf, is the buffer that holds the data.
Argument 3, count, is the number of bytes from the buffer to write to fd.
So in the ASM example above:
fd is rdi
buf is rsi
count is rdx
The syscall instruction runs this function in kernel space, and the return type is ssize_t. The "s" in front means signed, so the return value can be negative.
Take a look at the ERRORS section of the manual page (man 2 write):
ERRORS
EAGAIN The file descriptor fd refers to a file other than a socket and has been marked nonblocking (O_NONBLOCK), and the
write would block. See open(2) for further details on the O_NONBLOCK flag.
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the write would block.
POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same
value, so a portable application should check for both possibilities.
EBADF fd is not a valid file descriptor or is not open for writing.
EDESTADDRREQ
fd refers to a datagram socket for which a peer address has not been set using connect(2).
EDQUOT The user's quota of disk blocks on the filesystem containing the file referred to by fd has been exhausted.
EFAULT buf is outside your accessible address space.
EFBIG An attempt was made to write a file that exceeds the implementation-defined maximum file size or the process's file
size limit, or to write at a position past the maximum allowed offset.
EINTR The call was interrupted by a signal before any data was written; see signal(7).
EINVAL fd is attached to an object which is unsuitable for writing; or the file was opened with the O_DIRECT flag, and
either the address specified in buf, the value specified in count, or the file offset is not suitably aligned.
EIO A low-level I/O error occurred while modifying the inode. This error may relate to the write-back of data written by
an earlier write(), which may have been issued to a different file descriptor on the same file. Since Linux 4.13,
errors from write-back come with a promise that they may be reported by subsequent write() requests, and will be
reported by a subsequent fsync(2) (whether or not they were also reported by write()). An alternate cause of EIO on
networked filesystems is when an advisory lock had been taken out on the file descriptor and this lock has been lost.
See the Lost locks section of fcntl(2) for further details.
ENOSPC The device containing the file referred to by fd has no room for the data.
EPERM The operation was prevented by a file seal; see fcntl(2).
EPIPE fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also
receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this
signal.)
Other errors may occur, depending on the object connected to fd.
At the raw syscall level, such an error shows up as a negative number (the negated error code) in the rax register after the syscall.
mov rax, 60
The number 60 is the syscall number of exit (sys_exit).
We can check the manual again with the command
man 2 exit
exit takes only 1 argument, so it needs only rax (the syscall number) and rdi (the argument).
mov rdi, 0
The number 0 is the status, meaning the exit code is 0.
How to prove it? Run the program, then run the command below in bash:
echo $?
Video tutorial