most-common-byte

Assembly Crash Course

February 2, 2026

Read Time: 5 min

Problem

This problem comes from pwn.college: Assembly Crash Course ⤴.

We will be testing your code multiple times in this level with dynamic values! This means we will be running your code in a variety of random ways to verify that the logic is robust enough to survive normal use.

In this level, you will be working with functions! This will involve manipulating the instruction pointer (rip), as well as doing harder tasks than normal. You may be asked to use the stack to store values or call functions that we provide you.

In the previous level, you learned how to make your first function and how to call other functions. Now we will work with functions that have a function stack frame.

A function stack frame is a set of pointers and values pushed onto the stack to save things for later use and allocate space on the stack for function variables.

First, let’s talk about the special register rbp, the Stack Base Pointer.

The rbp register is used to tell where our stack frame first started. As an example, say we want to construct some list (a contiguous space of memory) that is only used in our function. The list is 5 elements long, and each element is a dword. A list of 5 elements would already take 5 registers, so instead, we can make space on the stack!

The assembly would look like:
; setup the base of the stack as the current top
mov rbp, rsp
; move the stack 0x14 bytes (5 * 4) down
; acts as an allocation
sub rsp, 0x14
; assign list[2] = 1337
mov eax, 1337
mov [rbp-0xc], eax
; do more operations on the list ...
; restore the allocated space
mov rsp, rbp
ret
Notice how rbp is always used to restore the stack to where it originally was. If we don’t restore the stack after use, we will eventually run out. In addition, notice how we subtracted from rsp, because the stack grows down. To make the stack have more space, we subtract the space we need. The ret and call still work the same.

Consider the fact that to assign a value to list[2] we subtract 12 bytes (3 dwords). That is because the stack grows down and when we moved rsp our stack contains addresses <rsp, rbp).

Once again, please make function(s) that implement the following:
most_common_byte(src_addr, size):
  i = 0
  while i <= size-1:
    curr_byte = [src_addr + i]
    [stack_base - curr_byte * 2] += 1
    i += 1

  b = 0
  max_freq = 0
  max_freq_byte = 0
  while b <= 0xff:
    if [stack_base - b * 2] > max_freq:
      max_freq = [stack_base - b * 2]
      max_freq_byte = b
    b += 1

  return max_freq_byte
Assumptions:

There will never be more than 0xffff of any byte

The size will never be longer than 0xffff

The list will have at least one element

Constraints:

You must put the “counting list” on the stack

You must restore the stack like in a normal function

You cannot modify the data at src_addr

Solution

Let’s start at what this function does: count the amount of occurrences of each byte value and then return the byte that occurs the most frequently.

For example:

4 occurrences of 0x00
10 occurrences of 0x33
24 occurrences of 0xe4
56 occurrences of 0x4d
11 occurrences of 0xee

The most common byte in this example is 0x4d, which occurs 56 times, so return 0x4d.

Now, most of the comments in the code explain what’s going on, but I’d like to point out what’s happening on the stack. Since we’re working with bytes, there can only be 256 values and so we have that many counters. Each counter is 2 bytes in size, since we can have a maximum of 0xffff occurrences of each byte, so we allocate 512 bytes on the stack to occurrences of each byte.

But this presents a problem. Clearly we’re only storing the occurrences on the stack, but not the actual byte value. How would we know, for example, that 0x33 has 10 occurrences if we’re only storing 10 but not 0x33?

Turns out it’s not really a problem. Consider whaat would happen if we used rbp to store the occurrences for 0x00. Then the offset added to rbp is our byte value. Using our example, [rbp - 0x00] would return 4, [rbp - 0x33] would return 10, and [rbp - 0x4d] would return 56.

That doesn’t actually occur here, but it’s close. Since our counters are actually 2 bytes long, we have to multiply our base offset by two as well to get to the next counter. So it’s actually [rbp - 0x33 * 2] which would return 10. But since we know our base offset, we’ll always know what byte we’re actually looking at.

.intel_syntax noprefix
.global most_common_byte

# Counts most common byte in a contiguous memory location
# Arguments: rdi = src_addr, rsi = size
most_common_byte:
  # Setup stack frame. We need 512 bytes (256 counters, each 2 bytes).
  mov rbp, rsp
  sub rsp, 512
  
  # i = 0
  xor rcx, rcx
  
  # Counts byte frequency
  while_1:
    # If i >= size (equivalent to i > size-1)
    cmp rcx, rsi
    jge next_step
    
    # curr_byte = [src_addr + i]
    # Dealing with bytes, so use byte ptr.
    # Also use movzx to zero out everything else in the register
    movzx rbx, byte ptr [rdi + rcx]
    
    # [stack_base - curr_byte * 2] += 1
    # Dealing with words (2 bytes), so use word ptr.
    add word ptr [rbp + rbx*2 - 512], 1
    
    # i += 1
    add rcx, 1
    
    jmp while_1
    
  next_step:
    xor rcx, rcx    # b = 0
    xor r8, r8      # max_freq = 0
    xor rax, rax    # max_freq_byte = 0
    
  # Find most common byte
  while_2:
    # If b > 0xff, jump to end
    cmp rcx, 0xff
    jg end
    
    # If [stack_base - b * 2] <= max_freq, go to inc_b.
    # Dealing with words, so use word ptr.
    # Note movzx. r14 is 64-bits but we only use 16-bits,
    # so movzx will zero out the rest of the register.
    movzx r14, word ptr [rbp + rcx*2 - 512]
    cmp r14, r8
    jle inc_b
    
    # max_freq = [stack_base - b * 2]
    mov r8, r14
    
    # max_freq_byte = b
    mov rax, rcx
    
    # b += 1
    inc_b:
      add rcx, 1
      
    jmp while_2

  end:
    # Clean up stack and return
    mov rsp, rbp
    ret