Problem
This problem comes from pwn.college: Assembly Crash Course ⤴.
We will be testing your code multiple times in this level with dynamic values! This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.
In this level, you will be working with functions! This will involve manipulating the instruction pointer (
rip), as well as doing harder tasks than normal. You may be asked to use the stack to store values or call functions that we provide you.In the previous level, you learned how to make your first function and how to call other functions. Now we will work with functions that have a function stack frame.
A function stack frame is a set of pointers and values pushed onto the stack to save things for later use and allocate space on the stack for function variables.
First, let’s talk about the special register
rbp, the Stack Base Pointer.The
rbpregister is used to tell where our stack frame first started. As an example, say we want to construct some list (a contiguous space of memory) that is only used in our function. The list is 5 elements long, and each element is adword. A list of 5 elements would already take 5 registers, so instead, we can make space on the stack!The assembly would look like:
; setup the base of the stack as the current top mov rbp, rsp ; move the stack 0x14 bytes (5 * 4) down ; acts as an allocation sub rsp, 0x14 ; assign list[2] = 1337 mov eax, 1337 mov [rbp-0xc], eax ; do more operations on the list ... ; restore the allocated space mov rsp, rbp retNotice how
rbpis always used to restore the stack to where it originally was. If we don’t restore the stack after use, we will eventually run out. In addition, notice how we subtracted fromrsp, because the stack grows down. To make the stack have more space, we subtract the space we need. Theretandcallstill work the same.Consider the fact that to assign a value to
list[2]we subtract 12 bytes (3 dwords). That is because the stack grows down and when we movedrspour stack contains addresses <rsp,rbp).Once again, please make function(s) that implement the following:
most_common_byte(src_addr, size): i = 0 while i <= size-1: curr_byte = [src_addr + i] [stack_base - curr_byte * 2] += 1 i += 1 b = 0 max_freq = 0 max_freq_byte = 0 while b <= 0xff: if [stack_base - b * 2] > max_freq: max_freq = [stack_base - b * 2] max_freq_byte = b b += 1 return max_freq_byteAssumptions:
- There will never be more than
0xffffof any byte- The size will never be longer than
0xffff- The list will have at least one element
Constraints:
- You must put the “counting list” on the stack
- You must restore the stack like in a normal function
- You cannot modify the data at
src_addr
Solution
Let’s start at what this function does: count the amount of occurrences of each byte value and then return the byte that occurs the most frequently.
For example:
- 4 occurrences of
0x00 - 10 occurrences of
0x33 - 24 occurrences of
0xe4 - 56 occurrences of
0x4d - 11 occurrences of
0xee
The most common byte in this example is 0x4d, which occurs 56 times, so return 0x4d.
Now, most of the comments in the code explain what’s going on, but I’d like to point out what’s happening on the stack.
Since we’re working with bytes, there can only be 256 values and so we have that many counters. Each counter is 2 bytes in size,
since we can have a maximum of 0xffff occurrences of each byte, so we allocate 512 bytes on the stack to occurrences of each byte.
But this presents a problem. Clearly we’re only storing the occurrences on the stack, but not the actual byte value.
How would we know, for example, that 0x33 has 10 occurrences if we’re only storing 10 but not 0x33?
Turns out it’s not really a problem. Consider whaat would happen if we used rbp to store the occurrences for 0x00. Then the offset added to rbp is our byte value.
Using our example, [rbp - 0x00] would return 4, [rbp - 0x33] would return 10, and [rbp - 0x4d] would return 56.
That doesn’t actually occur here, but it’s close. Since our counters are actually 2 bytes long, we have to multiply our base offset by two as well to get to the next counter.
So it’s actually [rbp - 0x33 * 2] which would return 10. But since we know our base offset, we’ll always know what byte we’re actually looking at.
.intel_syntax noprefix
.global most_common_byte
# Counts most common byte in a contiguous memory location
# Arguments: rdi = src_addr, rsi = size
most_common_byte:
# Setup stack frame. We need 512 bytes (256 counters, each 2 bytes).
mov rbp, rsp
sub rsp, 512
# i = 0
xor rcx, rcx
# Counts byte frequency
while_1:
# If i >= size (equivalent to i > size-1)
cmp rcx, rsi
jge next_step
# curr_byte = [src_addr + i]
# Dealing with bytes, so use byte ptr.
# Also use movzx to zero out everything else in the register
movzx rbx, byte ptr [rdi + rcx]
# [stack_base - curr_byte * 2] += 1
# Dealing with words (2 bytes), so use word ptr.
add word ptr [rbp + rbx*2 - 512], 1
# i += 1
add rcx, 1
jmp while_1
next_step:
xor rcx, rcx # b = 0
xor r8, r8 # max_freq = 0
xor rax, rax # max_freq_byte = 0
# Find most common byte
while_2:
# If b > 0xff, jump to end
cmp rcx, 0xff
jg end
# If [stack_base - b * 2] <= max_freq, go to inc_b.
# Dealing with words, so use word ptr.
# Note movzx. r14 is 64-bits but we only use 16-bits,
# so movzx will zero out the rest of the register.
movzx r14, word ptr [rbp + rcx*2 - 512]
cmp r14, r8
jle inc_b
# max_freq = [stack_base - b * 2]
mov r8, r14
# max_freq_byte = b
mov rax, rcx
# b += 1
inc_b:
add rcx, 1
jmp while_2
end:
# Clean up stack and return
mov rsp, rbp
ret