2024 Cyber Apocalypse: Crushing
Challenge Information
| Attribute | Details |
|---|---|
| Event | 2024 Cyber Apocalypse |
| Category | Reverse Engineering |
| Challenge | Crushing |
| Difficulty | Easy |
Summary
Crushing is a reverse engineering challenge requiring analysis of a custom compression/serialization algorithm. The binary reads input, builds linked list structures for each byte value, and serializes this structure to a file. By reverse engineering this process, the original message can be recovered.
Analysis
Algorithm Overview
The binary implements a custom serialization scheme:
- Input Processing: Reads character input and tracks positions
- Data Structure: Creates a map (array of 256 buckets) where each bucket contains a linked list
- Linked List Building: For each character at position
pos, a node is created with{pos, next} - Serialization: Writes to file in a specific format
Serialization Format
For each of the 256 possible byte values:
- Write the count of occurrences (8-byte little-endian integer)
- Write each position where that byte appeared (8-byte little-endian integers)
Example:
Byte 0x41 (A) appears at positions [0, 5, 10]:- Write: 0x03 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (count = 3)- Write: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (position 0)- Write: 0x05 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (position 5)- Write: 0x0A 0x00 0x00 0x00 0x00 0x00 0x00 0x00 (position 10)Solution
Step 1: Understand the Reverse Process
To decode, we need to:
- Read the occurrence count for each byte value (0-255)
- Read the positions where each byte appeared
- Reconstruct the original data by placing each byte at its correct position
Step 2: Implement the Decoder
def deserialize_and_reconstruct(encoded_file): with open(encoded_file, 'rb') as f: # Step 1: Read and reconstruct the map byte_occurrences = {} for byte_val in range(256): # Read the length (number of occurrences) for this byte value length = int.from_bytes(f.read(8), 'little') if length > 0: byte_occurrences[byte_val] = [] for _ in range(length): # Read each occurrence order order = int.from_bytes(f.read(8), 'little') byte_occurrences[byte_val].append(order)
# Step 2: Reconstruct the original sequence original_data = [None] * sum(len(orders) for orders in byte_occurrences.values()) for byte_val, orders in byte_occurrences.items(): for order in orders: original_data[order] = byte_val
# Convert byte values back to bytes original_bytes = bytes(original_data) return original_bytesStep 3: Execute the Decoder
def main(): encoded_file = 'message.txt.cz' original_data = deserialize_and_reconstruct(encoded_file)
# Output the original data with open('decoded_message.txt', 'wb') as f: f.write(original_data) print("Decoded message written to decoded_message.txt") print(original_data.decode())
if __name__ == '__main__': main()Complete Decoder Script
from struct import unpack
def deserialize_and_reconstruct(encoded_file): with open(encoded_file, 'rb') as f: byte_occurrences = {} for byte_val in range(256): length = int.from_bytes(f.read(8), 'little') if length > 0: byte_occurrences[byte_val] = [] for _ in range(length): order = int.from_bytes(f.read(8), 'little') byte_occurrences[byte_val].append(order)
# Reconstruct original sequence max_pos = max(max(orders) for orders in byte_occurrences.values() if orders) original_data = [None] * (max_pos + 1)
for byte_val, orders in byte_occurrences.items(): for order in orders: original_data[order] = byte_val
return bytes(original_data)
if __name__ == '__main__': encoded_file = 'message.txt.cz' original_data = deserialize_and_reconstruct(encoded_file)
with open('decoded_message.txt', 'wb') as f: f.write(original_data)
print("Decoded message:") print(original_data.decode())Key Takeaways
- Understanding data structures and their serialization formats is crucial for reverse engineering
- Linked lists and their in-memory representations differ from serialized formats
- Custom compression/serialization schemes must be completely reverse engineered
- Binary file format analysis requires understanding byte ordering (little-endian vs big-endian)
- Reconstructing original data from serialized format requires careful position/index tracking
- Testing the decoder on sample data validates the understanding
Flag: HTB{cr4sh1ng_1nt0_r3v3rs1ng}