Tiny LFSR writeup

I worked out a problem in BUUOJ tonight. That's my first try in LFSR problem, so I'm writing this post to record it.

And for some reasons, unfortunately, Sogou Pinyin cannot be installed in ubuntu 20.04, while Google Pinyin performs so bad that I decided to write this post in English. Hope it will not disturb you.

BUUOJ: [AFCTF2018]Tiny LFSR
[attachment]

Let's take a look at the source code.

import sys
from binascii import unhexlify

if(len(sys.argv)<4):
	print("Usage: python Encrypt.py keyfile plaintext ciphername")
	exit(1)

def lfsr(R, mask):
	output = (R << 1) & 0xffffffffffffffff
	i=(R&mask)&0xffffffffffffffff
	lastbit=0
	while i!=0:
		lastbit^=(i&1)
		i=i>>1
	output^=lastbit
	return (output,lastbit)

R = 0
key = ""
with open(sys.argv[1],"r") as f:
	key = f.read().strip()
	R = int(key,16)
	f.close
	
mask = 0b1101100000000000000000000000000000000000000000000000000000000000

# print(key)

a = ''.join([chr(int(b, 16)) for b in [key[i:i+2] for i in range(0, len(key), 2)]])

f=open(sys.argv[2],"r")
ff = open(sys.argv[3],"wb")
s = f.read()
f.close()
lent = len(s)

for i in range(0, len(a)):
	ff.write((ord(s[i])^ord(a[i])).to_bytes(1, byteorder='big'))

for i in range(len(a), lent):
    tmp=0
    for j in range(8):
        (R,out)=lfsr(R,mask)
        tmp=(tmp << 1)^out
    ff.write((tmp^ord(s[i])).to_bytes(1, byteorder='big'))
ff.close()
Encrypt.py

That's such a badly written code that is not Pythonic, which makes it difficult to analyze the source code. Skim through the code, it is not difficult to obtain the encryption method:

  1. for the first len(key) chars, output m[x] ^ key[x]
  2. for the following chars, use LFSR to generate a bit b, then output m[x] ^ b.

Notice that this encryption method works as a stream cipher does. This truth suggests that if we get key, we are able to decrypt any cipher text - for the first several chars, xor the key; for the other chars, use the same algorithm as encryption does to generate a bit stream, then xor the cipher text to get plain text.

As an example for encrypion, cipher.txt and Plain.txt is given in attachment. We infer that the key which is used to encode this example, is also used to encode flag to flag_encode.txt. Let's try to recover this specific key from the given example.

cip = open('cipher.txt', 'rb').read()
msg = open('Plain.txt', 'rb').read()

for keyLen in range(len(cip)):
    print(codecs.encode(strxor(cip, msg)[:keyLen], 'hex'))

# b''
# b'01'
# b'0123'
# b'012345'
# b'01234567'
# b'0123456789'
# b'0123456789ab'
# b'0123456789abcd'
# b'0123456789abcdef'
# b'0123456789abcdef18'
# ...

Let's guess that 0x123456789abcdef is the key. If our speculation is correct, the encryption result of Plain.txt using this key, will be corresponding to cipher.txt.

$ cat > key
0123456789abcdef
$ python Encrypt.py key Plain.txt out.txt
$ workspace diff out.txt cipher.txt
$ echo $?
0
Verify our speculation

We have get the correct key! Let's use it to decrypt flag_encode.txt :

from Crypto.Util.strxor import strxor
import codecs

key = '0123456789abcdef'
R = int(key, 16)
mask = 0b1101100000000000000000000000000000000000000000000000000000000000

def lfsr(R, mask):
    # 左移1位:保留末尾 63 位,在最后添加一个0
    output = (R << 1) & 0xffffffffffffffff
    
    # i:保留 R 的前 0、1、3、4位
    i=(R&mask)&0xffffffffffffffff
    
    
    lastbit=0
    while i!=0:
        lastbit^=(i&1)
        i=i>>1
    # lastbit:统计 i 里面有多少个1, 奇数个则为1, 偶数个则为0

    # output: R 左移1位,再添加 lastbit
    output^=lastbit
    return (output,lastbit)

cip = open('flag_encode.txt', 'rb').read()
a = ''.join([chr(int(b, 16)) for b in [key[i:i+2] for i in range(0, len(key), 2)]])

ans = []

for i in range(len(a)):
    ans.append(chr((cip[i] ^ ord(a[i]))))

lent = len(cip)
    
for i in range(len(a), lent):
    tmp=0
    for j in range(8):
        (R,out)=lfsr(R,mask)
        tmp=(tmp << 1)^out
    ans.append(chr(tmp ^ cip[i]))

''.join(ans)
# 'In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state.\n\nThe most commonly used linear function of single bits is exclusive-or (XOR). Thus, an LFSR is most often a shift register whose input bit is driven by the XOR of some bits of the overall shift register value.\n\nThe initial value of the LFSR is called the seed, and because the operation of the register is deterministic, the stream of values produced by the register is completely determined by its current (or previous) state. Likewise, because the register has a finite number of possible states, it must eventually enter a repeating cycle. However, an LFSR with a well-chosen feedback function can produce a sequence of bits that appears random and has a very long cycle.\n\nApplications of LFSRs include generating pseudo-random numbers, pseudo-noise sequences, fast digital counters, and whitening sequences. Both hardware and software implementations of LFSRs are common.\n\nThe mathematics of a cyclic redundancy check, used to provide a quick check against transmission errors, are closely related to those of an LFSR.\n\nCongratulations! flag is afctf{read_is_hard_but_worthy}'

So we have done it. Reviewing the process of solving the problem, we found it unnecessary to grasp every line in lsfr(); instead, just regarding it as a black-box will conduce to solving this problem easily.