THOUGHTS, STORIES AND IDEAS

In the last few days I had a great time unwrapping Tom's Data Onion, which is a programming puzzle in a text file. On my journey to `THE CORE` I learned a few new things about bitwise operations and encryption algorithms. Further I took a look at scripting in Kotlin using the kotlin-main-kts artifact and kscript, which I enjoyed way more.

I highly recommend trying to solve the puzzle on your own before continuing reading this article. Progressing through the layers is a lot of fun and further it includes some nice learnings. In the following lines I will describe the solution for each layer, so be warned that your puzzling experience may suffer from reading about a layer before solving it yourself first.

## Disclaimer

The below described solutions are by far not perfect! Especially in the first few layers I'm doing lots of conversions (especially to `String`) that, as I discovered later, aren't necessary. But hey, I just kept it like it was because it solves the layers and it somehow also shows my progress about working with bytes and bitwise operations.

## Layer 0

### Instructions

Tom's Data Onion consists of multiple layers. Solving one layer, which means decoding its payload, will output another text file to decode. It continues like that until you reach the mysterious core.

The first layer can be solved by decoding its payload with Adobe-flavoured ASCII85. The same applies to all following layers, but to solve those some additional steps are required.

### Solution

Adobe-flavoured ASCII85 is a binary-to-text encoding algorithm. It encodes 4 bytes of binary data to 5 ASCII characters, which then can be transmitted over communication protocols that are designed to transmit human-readable text instead of arbitrary binary data.

I'll first show you the code to decode Adobe-flavoured ASCII85 then I'll walk you through it and explain the steps.

``````fun File.decodeAscii85(): MutableList<UByte> {
.substringAfter("<~")
.substringBefore("~>")
.replace("\\s".toRegex(), "")
.replace("\n".toRegex(), "")
.replace("z", "!!!!!")

val result = ArrayList<UByte>()

encodedString.chunked(5).forEach {
// If the last group of the text to decode is smaller than 5, it needs to be padded with the biggest available character ('u')
if (it.length < 5) {
}

val base85 = asciiCharacters.chars().toArray().map { it - 33 }
val value = base85.withIndex().sumBy { it.value * Math.pow(85.0, 4.0 - it.index).toInt() }

binary
.chunked(8)
.filterIndexed { index, _ -> index < padding }
.map {
Integer.parseInt(it, 2).toUByte()
}
)
}

return result
}``````

So first of all this is an extension function that adds the function `decodeAscii85()` to `File`. I designed it like that because it will be needed in all layers. In the first statement the payload is extracted from the file and processed using the following rules:

• The start and end points of the payload are marked with `<~` and `~>`
• White spaces and new lines must be ignored
• 4 zero bytes would actually be encoded to `!!!!!` but are compressed to `z` as this appears quite often

Then the payload (`encodedString`) is chunked into pieces that contain 5 characters each and those characters are translated/decoded into 4 bytes of binary data. To do so the following steps are applied:

1. Get the ASCII values of the encoded characters
2. Substract those values by 33 (this is necessary because the first of the 85 ASCII characters that is used is `!`, which is represented by the value 33 not 0)
3. Calculate a 32 bit number with those values
4. Translate this 32 bit number into 4 bytes and add them to the result

If the last chunk has less than 5 characters it is padded with the biggest available character `u`. If a padding was applied as many bytes as characters were padded are removed before adding the bytes to the result.

That's basically it. Now the binary data just has to be translated to text and stored in a new file. To map the `ByteArray`,  that is returned by decoding the payload to binary data, to text and to store this text in a file I created the following extension functions.

``````fun ByteArray.toText(): String {
return map { it.toInt().toChar() }.joinToString("")
}

fun String.saveToFile(fileName: String) {
val file = File(fileName)
file.createNewFile()
file.writeText(this)
}``````

The last remaining piece of layer 0 then is to simply call these functions. This as simple as that:

``````fun File.decodeLayer0(resultFileName: String) {
decodeAscii85()
.toText()
.saveToFile(resultFileName)
}``````

## Layer 1

### Instructions

When reaching layer 1 it is time to get into bitwise operations. Now every second bit of the (ASCII85 decoded) payload needs to be flipped and all bits in a byte have to be rotated one position to the right.

### Solution

First of all I moved the functions I wrote in the previous layer to a utils file. As I couldn't find out how to call a function from a separate file with the kotlin-main-kts artifact I started using kscript, which worked just nice.

#### Flip each second bit

To flip a bit you simply need to `XOR` it with `1`. Therefore you need to `XOR` the byte with `11111111` first. Now all bits are flipped. As we just want to flip every second bit we need to mask the result to extract the bits we are interested in. This can be done using `AND` and setting the bits we are interested in to `1`. Thus the flipped byte needs to be masked with `01010101` to get all even bits and the original byte needs to be masked with `10101010` to get all odd bits. Then the results can be combined with a simple `OR`. Below you can see an example for this using the input `00110110`.

``````// Extract the flipped bits we are interested in
00110110
XOR 11111111 //flip all bits
--------
11001001
--------
01000001

// Extract the original bits we are interested in
00110110
--------
00100010

// Combine them
01000001
OR  00100010
--------
01100011``````

#### Rotate one position to the right

To rotate a byte one position to the right first the last bit (that will be shifted out) needs to be extracted. This can be easily done with masking again. Afterwards the byte can be shifted to the right and combined with the last bit using `OR`. Below the above pictured example is continued.

``````// Extract the last bit
01100011
AND 00000001
--------
00000001

// Shift the byte one position to the right and combine with the last bit
01100011
SHR --------
00110001
OR  1
--------
10110001
``````

That's all you need to do in this layer. The code for this layer looks like this:

``````fun File.decodeLayer1(resultFileName: String) {
decodeAscii85()
.map { byte ->
val flippedBits = (byte xor 0b11111111u) and 0b01010101u
val originalBits = byte and 0b10101010u
val flippedByte = flippedBits or originalBits

val lastBit = flippedByte and 0b00000001u
(flippedByte.shiftRight()) or lastBit
}
.toText()
.saveToFile(resultFileName)
}``````

## Layer 2

### Instructions

In this layer the author added a parity bit to each byte. Parity bits are used to check if bytes are corrupted. When the number of ones within the most significant seven bits of a byte is odd, the least significant bit (= the parity bit) should be set to `1`, otherwise it should be `0`. You can probably imagine the task now. It is to check the parity bit of each byte and to omit invalid bytes.

### Solution

This layer was rather easy in my opinion. First all invalid bytes need to be filtered out by checking the parity bit. To get the parity bit I used masking again. So I masked the byte with `00000001` and stored it in a variable. Afterwards I created a counter for the number of ones and simply shifted the byte to the right and masked it with `00000001` again to get the (new) least significant bit. If it was one, the counter was increased by one. This step was repeated seven times. Well, checking if the number of ones is odd or even and filtering out invalid ones is rather self explanatory then. In the following code example you can see that applied.

``````decodeAscii85()
.filter { byte ->
val parity = (byte and 0b00000001u).toInt()

var numberOfOnes = 0
var shiftedByte = byte
for (i in 0 until 7) {
shiftedByte = shiftedByte.shiftRight()
numberOfOnes += (shiftedByte and 0b00000001u).toInt()
}

parity % 2 == numberOfOnes % 2
}``````

After omitting all invalid bytes the resulting bytes need to be processed because the parity bit should not be part of the result that is parsed to text later on. I solved this with String operations, which is probably not the best solution but it worked. Basically I created a long string that joins the seven most significant bits of each valid byte, chunked it into pieces with the size 8 and parsed it back to bytes. Below you can see the code that includes (the already very well known) functions to map the bytes to text and to store the result in a file.

``````        .map { byte ->
}
.joinToString("")
.chunked(8)
.map { binaryByte ->
Integer.parseInt(binaryByte, 2).toUByte()
}
.toText()
.saveToFile(resultFileName)``````

## Layer 3

### Instructions

With layer 3 we start to dive into encryption. In this layer the payload has been encrypted using `XOR` and a 32 byte long cycling key. This means the input was `XOR`ed with the key to encrypt the payload. As the key is shorter than the input it was simply repeated all the time until the input was over. The encrypted payload is given, but neither the key nor the input are known before starting to implement the layer. The author just gives the hint that a `XOR` operation can be reversed without loosing any data. This means if `A XOR B = C` also the following two equations are true: `A XOR C = B` and `B XOR C = A`.

### Solution

This is the layer I definitely spent the most time on because I wanted to get a "mathematical" solution where I could just calculate or maybe even brute force the key or the input. But some hours and lots of papers with dozens of zeros and ones on them later I started to believe that this won't work. At this time I started discussing about this topic with a friend that is into encryption and he gave me the hint to look at the file and use the things I know about the expected result.

Using known parts of the result to calculate the key worked pretty well but still the solution is kinda disappointing to me because it's just not that nice. But yeah, this is how it is supposed to work. So below you can read about which parts of the text I assumed to get the full key.

Each file starts with the text `==[ Layer X/6: `. As I was solving layer 3, I new that the `X` had to be `4`. So I created a `ByteArray` of size 32 for the key and set the first 15 bytes with the following code.

``````val bytes = decodeAscii85()
val key = UByteArray(32)

// Start of file:
"==[ Layer 4/6: " // key[0-14]
.toCharArray()
.map { it.toInt().toUByte() }
.withIndex()
.forEach {
key.set(it.index, it.value xor bytes[it.index])
}``````

Then I gave decryption a first try using the below printed code.

``````bytes
.withIndex()
.map { it.value xor key.get(it.index % 32) }
.toText()
.saveToFile(resultFileName)``````

The first line of the result was `==[ Layer 4/6: ÐÀmBùIR_ñ°gìù===============£\$«BXìÃ<w\$en computers`. So lets take a look about what else we know about the expected result. Each line in the provided files has 60 characters. Therefore we know that everything after the block of equal signs we see in the result until the 60th characters needs to be `=`. In addition we know that this block is always followed by two `\n` characters, which gives us two more bytes of the key. The following code was executed to calculate some missing parts of the key.

``````"=============\n\n" // key[14-29]
.toCharArray()
.map { it.toInt().toUByte() }
.withIndex()
.forEach {
key.set(15 + it.index, it.value xor bytes[32 + 15 + it.index])
}``````

When checking the result below it is readable already. There are only two bytes of the key missing.

``````==[ Layer 4/6: Network Trafficù============================

\$en computers send data over a ·)twork like the internet,
the d¸8a is broken up and placed with°" packets. As well as
containin¾lthe data being sent, packets c¶"tain extra data

The code to calculate the missing bytes is quite similar to the code displayed above, but still I added it below for the sake of completeness.

``````" ]" // key[30-31]
.toCharArray()
.map { it.toInt().toUByte() }
.withIndex()
.forEach {
key.set(30 + it.index, it.value xor bytes[30 + it.index])
}``````

## Layer 4

### Instructions

In this layer the data represents a stream of raw network data. To be precise, a series of IPv4 packets with a UDP (User Datagram Protocol) payload inside. To solve this layer those packets need to be parsed and the invalid ones should be omitted. A valid packet is defined like this:

• Sent from any port of `10.1.1.10`
• Sent to port `42096` of `10.1.1.200`
• IPv4 and UDP header checksums are correct

### Solution

As the size of the packets may vary I wrote a recursive function that gets, parses and validates the current packet and then calls itself again with the start index of the next packet. A packet consists of the IPv4 header (20 bytes), the UDP header (8 bytes) and the actual data that is sent with the packet. The length of this payload is defined in the IPv4 header.

To parse a packet the code that is shown below completes the following steps:

1. Extract the IPv4 and the UDP header
2. Read the data size from the IPv4 header and extract the data
3. Check if the packet is valid (see below for more details) and add the data to the result if yes
4. Parse the next packet
``````val ipHeaderSize = 20

fun parsePacket(allBytes: MutableList<UByte>, startIndex: Int, result: ArrayList<UByte>) {
val udpStartIndex = startIndex + ipHeaderSize
val dataStartIndex = udpStartIndex + udpHeaderSize

if (dataStartIndex >= allBytes.size - 1) {
return
}

val packetSize = Integer.parseInt(
.subList(2, 4)
.joinToString(""),
2
)

val endIndex = dataStartIndex + dataSize
if (endIndex >= allBytes.size - 1) {
return
}
val data = allBytes.subList(dataStartIndex, endIndex)

}

// parse next packet
parsePacket(allBytes, startIndex + packetSize, result)
}``````

To validate a packet first the sender and receiver need to be checked as they must match the specified IP addresses and port. The IPv4 address of the sender is stored in bytes 12 - 15 (both inclusive) of the IPv4 header, while the destination address is stored in bytes 16-19. The destination  port can be extracted from the UDP header where it is stored in bytes 2-3.

``````// === The packet was sent FROM any port of 10.1.1.10 ===

// === The packet was sent TO port 42069 of 10.1.1.200 ===
if (!doesPortMatch(udpHeader.subList(2, 4), 42069)) return false``````

To check an IP address or port the following helper functions can be used. `doesIpAddressMatch()` simply checks the length of the given arguments and compares each byte, while `doesPortMatch()` parses the given bytes to an `Integer` and then compares this number to the given port.

``````fun doesIpAddressMatch(bytes: List<UByte>, ipToMatch: List<Int>): Boolean {
if (bytes.size != 4 || bytes.size != ipToMatch.size) {
return false
}

bytes.withIndex().forEach {
if (it.value.toInt() != ipToMatch[it.index]) {
return false
}
}
return true
}

fun doesPortMatch(bytes: List<UByte>, portToMatch: Int): Boolean {
if (bytes.size != 2) {
return false
}

return bytes
.joinToString("")
}``````

Now the header checksums have to be validated. Before doing that I wrote a helper function that calculates the checksum for a list of bytes. This article on The Geek Stuff provides a great and very detailed explanation about how to calculate the checksum of an IPv4 header. When writing this function I exactly followed the steps the author of that post describes.

``````fun calculateChecksum(bytes: List<UByte>): UInt {
var checksum = 0u

for (i in bytes.indices step 2) {
checksum += (bytes[i].toUInt() shl 8) or bytes[i + 1].toUInt()

val carry = (checksum and 0b10000000000000000u) shr 16
if (carry == 1u) {
checksum = checksum and 0b01111111111111111u
checksum += carry
}
}

return checksum xor 0b1111111111111111u // invert
}``````

With this function implemented validating the IPv4 header checksum is rather easy. First of all it needs to be extracted from the header and stored in a variable as it should be set to zero when calculating the checksum. This is done in the next few lines. In the last line the checksum is calculated and compared to the value that was extracted from the header. If they don't match the packet is invalid.

``````val ipChecksum = (ipHeader.toUInt() shl 8) or ipHeader.toUInt()

// replace the checksum bytes with 0 for the calculation

if (calculateChecksum(ipHeaderWithoutChecksum) != ipChecksum) return false
``````

The process to validate the UDP header checksum is quite similar, but also a bit more complex because the bytes that should be used to calculate the checksum are not simply the bytes of the UDP header, but consist of the following parts:

• A pseudo header containing the following things
• Source and Destination defined in the IPv4 header: 8 bytes
• 0: 1 byte
• Protocol: 1 byte (this is always 17, which stands for UDP)
• Length defined in the UDP header: 2 bytes
• The UDP header without the checksum
• The full data without headers
• A zero byte of padding if the total number of bytes should be odd

After putting this together the checksum can simply be calculated and compared to the value that is set in the header like it was done before for the IPv4 header.

`````` val checksumBase = ArrayList<UByte>()
checksumBase.add(17u.toUByte()) // Protocol - UDP = 17
// data

if (udpChecksum != 0u) { // no checksum was calculated before sending the packet
if (checksumBase.size % 2 != 0) checksumBase.add(0u.toUByte())
if (calculateChecksum(checksumBase) != udpChecksum) return false
}``````

## Layer 5

### Instructions

The payload of this layer has been encrypted using the Advanced Encryption Standard, to be precise `AES-256 in Counter Mode(CTR)`. The key and initialization vector (IV) to decrypt the payload are given in the beginning of the payload. Since the key is encrypted with a key encrypting key (KEK) and another IV there is one step that needs to be completed prior to decrypting the payload. This means the key needs to be unwrapped before the payload itself can be decrypted.

### Solution

The first step to solve this layer is to extract the given values. This is pretty straight forward and you can see the code for that below.

``````val bytes = decodeAscii85()

val keyEncryptingKey = bytes.subList(0, 32).map { it.toByte() }.toByteArray()
val keyInitializationVector = bytes.subList(32, 40).map { it.toByte() }.toByteArray()
val wrappedKey = bytes.subList(40, 80).map { it.toByte() }.toByteArray()
val wrappedInitializationVector = bytes.subList(80, 96).map { it.toByte() }.toByteArray()
val encryptedPayload = bytes.subList(96, bytes.size).map { it.toByte() }.toByteArray()``````

After extracting all necessary values the key needs to be unwrapped. This can be done using the `javax.crypto` library, which supports en- and decryption with `AES` but also wrapping and unwrapping keys. This would actually be very easy, but unfortunately the IV is hardcoded in a `private final` field in the `AESWrapCipher`. Therefore I used reflection to change the IV to unwrap the key.

``````// init the cipher to unwrap the key
val keyCipher = Cipher.getInstance("AESWrap");
keyCipher.init(Cipher.UNWRAP_MODE, SecretKeySpec(keyEncryptingKey, "AES"));

// use reflection to set the IV as it is hardcoded in a private final field
val spiField = keyCipher::class.java.getDeclaredField("firstSpi")
spiField.isAccessible = true

val clazz = Class.forName("com.sun.crypto.provider.AESWrapCipher")
val spiIvField = clazz.getDeclaredField("IV")
spiIvField.isAccessible = true

val modifiersField = Field::class.java.getDeclaredField("modifiers")
modifiersField.isAccessible = true
modifiersField.setInt(spiIvField, spiIvField.getModifiers() and Modifier.FINAL.inv())
spiIvField.set(this, keyInitializationVector)

// unwrap the key
val key = keyCipher.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY)``````

After overcoming this hurdle, the last part was easy. When the key is unwrapped the payload can simply be decrypted like that.

``````// decrypt the payload
cipher.init(Cipher.DECRYPT_MODE, key, IvParameterSpec(wrappedInitializationVector))

## Layer 6

### Instructions

For me layer 6 was definitely the most intimidating layer when reading the instructions for the first time. To solve this layer it is required to build an emulator for the `Tomtel Core i69` that translates machine code written for this machine to text. When I read through the instructions a bit more carefully the solution became quite clear and was by far not that scary anymore. The Tomtel has fixed sets of registers and instructions and also a fixed amount of memory, which you need to use to decrypt the payload.

### Solution

The first step to solve this layer is to store all bytes of the encrypted payload in the memory of the Tomtel. Then I created a loop, that is running until the `HALT` instruction appears which means that the program completed. One rather important register is the `pc` register. It holds the program counter and tells us which command to read next. In the beginning it is set to `0`. So the steps that are repeated in each iteration of the loop are the following:

1. Parse the command from memory at the position where `pc` is pointing to
2. Increase the value of the `pc` register by the size of the command (some commands take the next byte(s) as input and therefore are bigger than 1)
3. Execute the just parsed command

An overview about this can be seen below. Unfortunately the `when` block is rather big, so I decided to not show its full content here, because it is simply too much for this post. But actually the command execution is rather simple as you can see at the example command below. The full source code is (as always) available on GitHub.

``````val memory = decodeAscii85()
val output = ArrayList<UByte>()

val curIndex = getRegisterValue("pc").toInt()

// 1. Reads one instruction from memory, at the address stored in the `pc` register.
val command = Command.fromByte(memory[curIndex])
?: throw RuntimeException("No command for byte \${memory[curIndex].toString(16)} at position \${curIndex} found")

// 2. Adds the byte size of the instruction to the `pc` register.
registers["pc"] = curIndex.toUInt() + command.size.toUInt()

// 3. Executes the instruction.
when (command) {
Command.ADD_A_X_B -> registers["a"] = (getRegisterValue("a") + getRegisterValue("b")) % 256u
// Omitted all other cases here as this would be too much for the blog post
}

}

output
.toText()
.saveToFile(resultFileName)``````

To recognize the commands I created an enum class that holds all commands and translates a byte to a command.

``````enum class Command(val opCode: UByte? = null, val opCodePattern: String? = null, val size: Int) {
ADD_A_X_B(opCode = 0xC2.toUByte(), size = 1), // Sets `a` to the sum of `a` and `b`, modulo 255.
APTR_IMM8(opCode = 0xE1.toUByte(), size = 2), // Sets `ptr` to the sum of `ptr` and `imm8`. Overflow behaviour is undefined.
CMP(opCode = 0xC1.toUByte(), size = 1), // Sets `f` to zero if `a` and `b` are equal, otherwise sets `f` to 0x01.
HALT(opCode = 0x01.toUByte(), size = 1), // Stops the execution of the virtual machine. Indicates that the program has finished successfully.
JEZ_IMM32(opCode = 0x21.toUByte(), size = 5), // If `f` is equal to zero, sets `pc` to `imm32`. Otherwise does nothing.
JNZ_IMM32(opCode = 0x22.toUByte(), size = 5), //  If `f` is not equal to zero, sets `pc` to `imm32`. Otherwise does nothing.
MV_DEST_X_SRC(opCodePattern = "01******", size = 1), // Sets `{dest}` to the value of `{src}`.
MV32_DEST_X_SRC(opCodePattern = "10******", size = 1), // Sets `{dest}` to the value of `{src}`.
MVI_DEST_X_IMM8(opCodePattern = "01***000", size = 2), // Sets `{dest}` to the value of `imm8`.
MVI32_DEST_X_IMM32(opCodePattern = "10***000", size = 5), // Sets `{dest}` to the value of `imm32`.
OUT_A(opCode = 0x02.toUByte(), size = 1), // Appends the value of `a` to the output stream.
SUB_A_X_B(opCode = 0xC3.toUByte(), size = 1), // Sets `a` to the result of subtracting `b` from `a`. If subtraction would result in a negative number, 255 is added to ensure that the result is non-negative.
XOR_A_X_B(opCode = 0xC4.toUByte(), size = 1); // Sets `a` to the bitwise exclusive OR of `a` and `b`.

companion object {
private val opCodeMap = values().associateBy(Command::opCode)

fun fromByte(byte: UByte): Command? {
val command = opCodeMap[byte]
if (command != null) {
return command
}

val startingBits = binaryByte.substring(0, 2)
val lastBits = binaryByte.substring(5)
return when (startingBits) {
"01" -> if (lastBits == "000") MVI_DEST_X_IMM8 else MV_DEST_X_SRC
"10" -> if (lastBits == "000") MVI32_DEST_X_IMM32 else MV32_DEST_X_SRC
else -> null
}
}
}
}``````

## The Core

After solving layer 6 you'll reach the mysterious core. Tom Dalling (the author) says the following about it:

I've hidden something at the core of this puzzle -- something that I probably should not have published -- something that, if you read it, you might wish that you hadn't. This may land me in serious trouble, but I can't sleep easy at night holding on to this secret. You have been warned.

I'm very pleased to belong to the group of privileged people knowing this secret now. I won't tell you more about the secret hidden at the end, but really recommend you to unwrap Tom's Data Onion on your own now.

## Version Information

This post describes the solution for version 1.1.1 of Tom's Data Onion.

## Resources

##### Title picture
You've successfully subscribed to ksick.dev
Welcome back! You've successfully signed in.
Great! You've successfully signed up.