Change Python's syntax with the "# coding:" trick
And potentially get fired, but I hear the job market is great these days
Summary
You know how 93% of Paint Splatters are Valid Perl Programs?
Well, technically any bag of junk can be a valid Python program if you declare a compatible decoder in the header.
The # coding: declaration accepts arbitrary names, which can then be mapped using @codecs.register to a codecs.Codec subclass with a custom decode() method. That method can contain any code you like, and therefore lets you turn your wildest fantasies into something Python can execute.
Anything can be a macro system if you are brave enough
Since there is no such thing as a raw text file, Python has the # coding directive (and no, no need for # -*- coding: utf-8 -*-) so you can specify the encoding of your code.
And sure, by default in Python 3 it's UTF-8, so you usually don't need to declare it, but it can be anything.
I don't mean any encoding. I mean... anything.
The decoding mechanism of a Python file is generic and pluggable, and you can use the codecs module to access it from Python itself. Of course, most codecs are just regular character set mappings, but the API doesn't specify it has to be, just that you take some bytes and turn them into text. In fact, it used to be an easter egg that you could write Python in rot13.
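You can poke at that pluggable machinery from the interpreter. The sketch below uses only the standard library; note that in Python 3 rot13 is a text-to-text codec, so it is reached through codecs.decode() rather than bytes.decode().

```python
import codecs

# Every encoding name resolves to a CodecInfo object that bundles the
# encode/decode callables Python will use for that name.
utf8_info = codecs.lookup("utf-8")
print(utf8_info.name)

# The usual case: bytes in, (text, number of bytes consumed) out.
text, consumed = utf8_info.decode(b"caf\xc3\xa9")
print(text, consumed)

# rot13 still ships with the standard library, as a str -> str codec.
decoded = codecs.decode("cevag('Uryyb')", "rot13")
print(decoded)
```

Nothing in that lookup step checks that the codec is an actual character set, which is the loophole the rest of this article exploits.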
This means you can abuse this feature to allow arbitrary content to be valid Python.
Want a JPEG to be Python? It's possible.
Wish inline JSX or SQL were valid in Python? They can't stop you!
Think keywords should be in Italian? Hey, why not?
Let's say you hate significant whitespace, and you are disappointed by this little nugget of humor:
>>> from __future__ import braces
  File "<stdin>", line 1
SyntaxError: not a chance
Fear not, with codecs, we can grow mustaches on the snake!
Let's create a "braces_indent.py" file in our current directory:
import codecs
import encodings
import tokenize
import io
# We create our custom codec. It only needs two methods:
# encode() and decode()
class BracesToIndentCodec(codecs.Codec):

    # This is not important for us, but it is mandatory. Like decode(),
    # it must return a tuple of (result_bytes, consumed input)
    def encode(self, input, errors='strict'):
        return input.encode('utf-8', errors), len(input)

    # "input" is bytes() or memoryview(), and this must return
    # a tuple of (result_text, consumed input)
    def decode(self, input, errors='strict'):
        if isinstance(input, memoryview):
            input = input.tobytes()
        output = input.decode('utf-8', errors)

        # Really, you can put any processing here, so we'll
        # traverse the text, look for lines ending with '{' or '}',
        # and replace the braces with 4-space indentation
        new_lines = []
        level = 0
        indent = '    '  # the urge to troll and put a tab here...

        # Super naive and will break if you sneeze
        # but you get the point...
        for line in output.splitlines():
            stripped = line.strip()
            if stripped.endswith('{'):
                line = indent * level + stripped.rstrip('{') + ":"
                level += 1
            elif stripped.endswith('}'):
                level -= 1
                line = indent * level + stripped.rstrip('}')
            else:
                line = indent * level + stripped
            new_lines.append(line)

        return '\n'.join(new_lines), len(input)
# This function tells Python how to find our new codec.
# You can have many of those, and if it doesn't find anything
# it should return None, otherwise it should return the codec
@codecs.register
def search_function(encoding):
    if encoding != 'braces_indent':
        return None
    codec = BracesToIndentCodec()
    return codecs.CodecInfo(
        name='braces_indent',
        encode=codec.encode,
        decode=codec.decode
    )
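To see what that decode() step actually hands to the compiler, here is the same naive rewrite pulled out into a plain function and applied to a string. This is only a sketch for poking at the transformation: braces_to_indent and the greet sample are names made up for the illustration, and no codec is registered here.

```python
def braces_to_indent(source):
    """Same naive line-by-line rewrite as the codec's decode() step."""
    indent = "    "
    level = 0
    lines = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.endswith("{"):
            # Opening brace: emit a colon and indent what follows.
            lines.append(indent * level + stripped.rstrip("{") + ":")
            level += 1
        elif stripped.endswith("}"):
            # Closing brace: dedent and drop the brace itself.
            level -= 1
            lines.append(indent * level + stripped.rstrip("}"))
        else:
            lines.append(indent * level + stripped)
    return "\n".join(lines)


sample = """
def greet(name) {
if name {
return "Hello, " + name
}
return "Hello, stranger"
}
"""

converted = braces_to_indent(sample)
print(converted)

# The converted text is ordinary Python, so it can be executed as-is.
namespace = {}
exec(converted, namespace)
print(namespace["greet"]("Guido"))
```

The original indentation is thrown away entirely: only the braces decide the nesting, which is why the sample can be written flush left.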
We need @codecs.register to be run on Python startup, and for this we are going to use another trick: imports in "*.pth" files. I won't get into details here; I may write an article about them later on. For now, just know that we are going to create a "braces_indent.pth" file with the following content:
import braces_indent
Yep, that's it, just one line. And we will put this file into the "site-packages" directory of a virtualenv.
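If you'd rather not hunt for that directory by hand, a small script can locate it and write the file. This is a sketch under a few assumptions: write_pth is a helper invented for the example, and it presumes the braces_indent module is importable when the interpreter starts. The demo writes into a throwaway directory, so running it changes nothing in your environment.

```python
import sysconfig
import tempfile
from pathlib import Path


def write_pth(directory, module_name):
    """Create '<module_name>.pth' holding a single import line."""
    pth_path = Path(directory) / f"{module_name}.pth"
    # Lines starting with "import" in a .pth file are executed at startup.
    pth_path.write_text(f"import {module_name}\n", encoding="utf-8")
    return pth_path


# "purelib" is the site-packages directory of the running interpreter,
# i.e. the virtualenv's own one when a virtualenv is active.
site_packages = sysconfig.get_paths()["purelib"]
print("site-packages:", site_packages)

# Dry run in a scratch directory; pass site_packages instead to install it.
with tempfile.TemporaryDirectory() as scratch:
    pth_file = write_pth(scratch, "braces_indent")
    pth_name = pth_file.name
    pth_content = pth_file.read_text(encoding="utf-8")

print(pth_name, repr(pth_content))
```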
We are all set to use our glorious codec.
Now let's use it and create a "hello.py" file:
# coding: braces_indent
def main() {
    print("Hello, World!")
    if True {
        print("Finally, sanity!")
    }
}
This file cannot be run directly (e.g. python hello.py won't work). But if we activate the virtualenv, we can import it like any other Python module:
>>> from hello import main
>>> main()
Hello, World!
Finally, sanity!
This is of course absolutely terrible. It's dark magic, hard to debug and can explode at any moment. But isn't it fun?
It also has practical implications: the ideas project uses that feature to demo some of its experimental Python syntax.
Of course, there are other ways to do this.
You can also use import hooks (ideas also does that for most other features), AST rewriting at import time (pytest uses this technique to allow rich assert), bytecode manipulation, or the ctypes module (it's how forbiddenfruit lets you extend built-in types).
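For comparison, the import hook route can be sketched with importlib alone. This is a minimal illustration and not how ideas implements it: the ".bpy" extension, the "fn" spelling, the salute module and every class name below are made up for the example.

```python
import importlib
import importlib.abc
import importlib.machinery
import importlib.util
import sys
import tempfile
from pathlib import Path

SUFFIX = ".bpy"  # hypothetical extension, made up for this sketch


def rewrite(source):
    # Stand-in transformation: lines starting with "fn " become "def ".
    lines = source.splitlines()
    return "\n".join(
        "def " + line[3:] if line.startswith("fn ") else line for line in lines
    ) + "\n"


class RewritingLoader(importlib.machinery.SourceFileLoader):
    def source_to_code(self, data, path, **kwargs):
        # Decode the raw bytes, rewrite the text, then compile as usual.
        text = importlib.util.decode_source(data)
        return super().source_to_code(rewrite(text), path, **kwargs)


class RewritingFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        name = fullname.rpartition(".")[2]
        for entry in path or sys.path:
            candidate = Path(entry or ".") / (name + SUFFIX)
            if candidate.is_file():
                loader = RewritingLoader(fullname, str(candidate))
                return importlib.util.spec_from_file_location(
                    fullname, str(candidate), loader=loader
                )
        return None  # let the other finders have a go


# Appended last, so regular modules keep being found the regular way.
sys.meta_path.append(RewritingFinder())

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "salute.bpy").write_text(
        'fn hello():\n    return "ciao"\n', encoding="utf-8"
    )
    sys.path.insert(0, folder)
    try:
        salute = importlib.import_module("salute")
        greeting = salute.hello()
    finally:
        sys.path.remove(folder)

print(greeting)
```

Unlike the codec trick, the hook sees the module name and path, and nothing has to be declared in the file header.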
You know the drill: big power, big responsibilities, and all that.
As I was reading through this, I was thinking "yes, but as I have done in my *ideas* project, import hooks are more flexible ..." until I reached the part where you linked to the documentation for my project! :-)