
How Writing Code in Elm Changed How I Want to Write Code in Python

Jul 6, 2023

Preface: This blog post doesn't currently look good in a mobile view, due to the heavy use of code blocks that don't wrap. Maybe view this on your desktop or laptop when you get a chance? Or try turning your phone sideways (landscape orientation)?

Today, I'm going to be talking about programming/coding/robot-speak, and this language called Elm that I won't shut up about.

So first, for those who may know Python and are not familiar with Elm: What is Elm?

What is Elm?

Elm (also sometimes called Elmlang, since we keep naming languages after things that are rather hard to search for; in this case, Elm is also the name of a commonly-referenced tree) is a functional programming language, with a similar style to Haskell or F#. Elm practices very strong typing, and even if you don't annotate the types in your code, the compiler will still infer them, and complain if something doesn't add up. Elm strives for consistency, predictability, and security. Elm compiles to JavaScript, but the language is designed to minimize or remove the "funny business" that JavaScript has.

If you need a small example of what kind of "funny business" I'm talking about

The functional-language style of Haskell, F#, and Elm can be a bit... different ...for developers who aren't expecting it. But after some time with it, it's really not that bad.

Elm itself really only consists of about 6 components:

functions (and arguments)
types (and "type variants")
type aliases ("records")
if/else statements
case statements/pattern matching
let statements

One thing you may notice, as I kinda did in coming up with this list of components: variables are absent from that list. And that's one of the things that leads to Elm's consistency: you will never have a variable that could change into something else accidentally (via a function, method, or otherwise), and no secret pointers or references that modify something you didn't want modified.
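For Python folks, here's a small sketch of the kind of accidental modification I mean. The function and variable names are made up for illustration; the point is only that two names can quietly refer to the same list.

```python
def add_draft_tag(tags):
    # Mutates the list it was handed instead of building a new one.
    tags.append("draft")
    return tags


def add_draft_tag_pure(tags):
    # Builds and returns a new list; the caller's list is left alone.
    return [*tags, "draft"]


original_tags = ["elm", "python"]
tagged = add_draft_tag(original_tags)
print(original_tags)             # ['elm', 'python', 'draft']  (changed too!)
print(tagged is original_tags)   # True: both names point at one list

fresh_tags = ["elm", "python"]
tagged_copy = add_draft_tag_pure(fresh_tags)
print(fresh_tags)                # ['elm', 'python']
print(tagged_copy)               # ['elm', 'python', 'draft']
```

In Elm, only the second style exists, so the first surprise can't happen.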

So, how do you define something like a variable, I hear you ask?

In Elm, you technically make no-argument functions, or "functional values", which are akin to constants. Each one is really a function that will always return the same constant value. It's predictable and consistent!

Even inside let statements, it's all functions. Same syntax and everything.

So, now let's look at some of the other key components, and how I plan to use that in Python now and in the future:

Elm Types

Let's take a look at a basic Elm type:

type Water
    = Steam
    | Boiling
    | Hot
    | Warm
    | RoomTemperature
    | Cool
    | Cold
    | Freezing
    | Ice

This may be a weird example, but it works. A value of type Water can only ever be one of those 9 options (called "type variants"), no more, no less. Anything that takes Water as an argument must be able to account for every type variant: either by accepting them all regardless (such as when building another, more complex type), or by using a case statement (like Python 3.10's match/case) that handles each variant or falls through to the "no match" case. Example:

waterToString : Water -> String
waterToString water =
    case water of
        Steam ->
            "steam"
        Boiling ->
            "boiling"
        Hot ->
            "hot"
        Warm ->
            "warm"
        RoomTemperature ->
            "room_temperature"
        Cool ->
            "cool"
        Cold ->
            "cold"
        Freezing ->
            "freezing"
        Ice ->
            "ice"

        {-

            Doesn't apply here, but here's
            an example of the "no-match case", or default:

        _ ->
            "default_to_this_string"

        -}

Now, we can actually do this in Python! Some people who see and understand the above might immediately think: Enum. And that would be absolutely correct. And as mentioned above, we have match/case statements as of Python 3.10.

Using Enums, we can recreate Elm's Types, and parse them well with match/case:

import enum

class Water(enum.Enum):
    Steam = "steam"
    Boiling = "boiling"
    Hot = "hot"
    Warm = "warm"
    RoomTemperature = "room_temperature"
    Cool = "cool"
    Cold = "cold"
    Freezing = "freezing"
    Ice = "ice"

x = Water("ice")

match x:
    case Water.Steam:
        pass
    case Water.Boiling:
        pass
    case Water.Hot:
        pass
    case Water.Warm:
        pass
    case Water.RoomTemperature:
        pass
    case Water.Cool:
        pass
    case Water.Cold:
        pass
    case Water.Freezing:
        pass
    case Water.Ice:
        pass
    # optionally
    # case _:
    #     pass

Now, we don't have to make a water_to_string function, since Python's Enums have that built in, but only if you give each Enum member a string value, as seen in the Water class definition. To turn a Water into a string, we would just use x.value.

This:

A) Makes Elm and Python work well if we're sending JSON back and forth from the frontend to the backend, or vice versa. We encode an Elm Type into a JSON string, then decode it in Python and instantiate an Enum. Or in reverse: use Enum.value to turn an Enum into a string, send it over JSON, then decode it and use a case to find the right Elm Type (think the waterToString function above, but swap each type variant with its matching string, like "boiling" -> Boiling ).
B) Makes Python more predictable and consistent. Instead of passing around one of 9 strings, which can be modified by any string method or other means, we pass around one of 9 constant Enum members.
C) Helps even if you're not using Elm, or not even running an application with a frontend, or maybe it's just a small Python script: the consistency of expecting certain values helps prevent unforeseen run-time consequences due to string typos or modifications anywhere in the file. IDEs can see that the Water Enum has 9 members with set names. Additionally, IDEs may tell you that you misspelled a string, but they won't tell you that your misspelled string isn't the same as the string you're expecting later on.
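Here's a minimal sketch of the Python half of that JSON round trip. The Water Enum is shortened to three members, and the "water" key and the encode_water / decode_water helpers are assumptions made up for this example, not part of any real API.

```python
import enum
import json


class Water(enum.Enum):
    # Shortened to three members for the example.
    Steam = "steam"
    Boiling = "boiling"
    Ice = "ice"


def encode_water(water: Water) -> str:
    # Enum member -> plain string -> JSON text.
    return json.dumps({"water": water.value})


def decode_water(payload: str) -> Water:
    # JSON text -> plain string -> Enum member (ValueError if unknown).
    return Water(json.loads(payload)["water"])


payload = encode_water(Water.Boiling)
print(payload)                    # {"water": "boiling"}

decoded = decode_water(payload)
print(decoded is Water.Boiling)   # True

try:
    decode_water('{"water": "boilng"}')   # typo on purpose
except ValueError as error:
    typo_message = str(error)
    print(typo_message)
```

The typo fails loudly at the boundary, right where the string comes in, instead of silently travelling through the rest of the program.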

Elm Records

Records in Elm are a tool to store data. You give it parameter names, give each parameter a type, and use it. When you do use it, just fill all (and yes, all) of the parameters with some data and pass it around. Here's an example:

type alias BlogPost =
    { index : Int
    , header : String
    , content : String
    }

Here, we have almost a replica of the blog post model for this blog. There's a bit more to it, but for example's sake, we'll stick with this. To use this BlogPost record:

exampleBlogPost : BlogPost
exampleBlogPost =
    { index = 0
    , header = "Init"
    , content = "Some very long text"
    }

So now, exampleBlogPost will always create a new BlogPost, with exactly those values defined above. If we wanted to make it more dynamic:

blogPostTemplate : Int -> String -> String -> BlogPost
blogPostTemplate idx head cont =
    { index = idx
    , header = head
    , content = cont
    }

With this, we can create BlogPosts whenever we want, provided we have an int and two strings.

(By the way, the first line of this code is just a type annotation. It's a bit weird to look at, but just understand that the last type is the return type, and each one before it is the type of the corresponding argument. To write this in Python, it would be: def blogPostTemplate(idx: int, head: str, cont: str) -> BlogPost: )

Now, to use records, we can access any of its parameters by dot-notation, just like we would a Python class:

getBlogPostHeader : BlogPost -> String
getBlogPostHeader post =
    post.header

So now the question people want: What do we do in Python?

There are many ways, actually! The one I find nice and simple is dataclasses.dataclass:

import dataclasses

@dataclasses.dataclass(kw_only=True)
class BlogPost:
    index: int
    header: str
    content: str

Here, class BlogPost can be instantiated by keyword-arguments only (see: kw_only=True), or we can unpack a dict into it with double-asterisks, provided all the dict's keys match up. But essentially, this is almost the same as an Elm record. However...

Type-Strictness

Admittedly, one of the downsides of all of these Python equivalents is Python's lack of type-enforcement. By default, all type annotations are merely suggestions and guidance. There's no compiler to tell us if something wrong could occur. We can add a type-checking tool alongside our code, to make sure things are used as intended, but:

A) That doesn't mean everyone working on your code will use the type-checker, which makes the type-checker pointless.
B) The type annotations can also just be omitted, which makes the type-checker pointless.

So something I've been working on has been to make tools to enforce Python's type annotations at runtime. Because in most cases, the types are there.

With any function in Python, you can use the __annotations__ magic attribute to see all of the annotated types for each arg, as well as the return value. This also applies to class methods and lambdas, but lambda functions can't be annotated, so __annotations__ is moot there. But for the regular functions, we can do this:
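# A quick look at what __annotations__ holds, before the decorator itself.
# blog_post_title is a made-up function, used only for this illustration.

```python
def blog_post_title(index: int, header: str) -> str:
    return f"{index}: {header}"


# A plain dict: argument names map to types, plus a "return" key.
print(blog_post_title.__annotations__)
# {'index': <class 'int'>, 'header': <class 'str'>, 'return': <class 'str'>}

# Lambdas have the attribute too, but it is always empty.
unannotated = lambda value: value
print(unannotated.__annotations__)   # {}
```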

def strict_function(f):
    def wrapped(*args, **kwargs):

        # Below, type(lambda: None) returns <class 'function'>.
        # This decorator is intended for functions, not classes,
        # so we see whether type(f) is also <class 'function'>.

        if type(f) is not type(lambda: None):
            raise TypeError(
                f"Provided item to wrap is not a function: {f}"
            )
        annotations = f.__annotations__
        annotations_as_list = list(annotations.items())
        annotation_errors = []
        if len(list(args) + list(kwargs.items()))+1 != len(annotations_as_list):
            raise TypeError(
                f"Not enough type annotations provided to function {f.__name__}.\n"
                f"Annotations provided: {len(annotations_as_list)}. "
                f"Total items to annotate: {len(list(args) + list(kwargs.items()))+1}"
            )
        for i, arg in enumerate(args):
            if type(arg) is not annotations_as_list[i][1]:
                annotation_errors.append(
                    f"{annotations_as_list[i][0]} is not of type {annotations_as_list[i][1]}"
                )
        for kwarg in kwargs:
            if type(kwargs[kwarg]) is not annotations[kwarg]:
                annotation_errors.append(
                    f"{kwarg} is not of type {annotations[kwarg]}"
                )
        if annotation_errors:
            raise TypeError(
                "Provided arguments do not match annotated types.\n"
                + "\n".join(annotation_errors)
            )
        return_value = f(*args, **kwargs)
        if (
            (return_value is None and annotations["return"] is None)
            or (type(return_value) is annotations["return"])
        ):
            return return_value
        else:
            raise TypeError(
                f"Returned value {return_value} does not match annotation {annotations['return']}"
            )
    return wrapped

This whole blob makes a function decorator, used with @strict_function. It ensures that every function it wraps annotates every arg and kwarg, as well as the return value. It also makes sure that each arg is of the type it's supposed to be, that every kwarg is of the type it's supposed to be, and that the return value is the type it's supposed to be... in most cases.
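To show what using such a decorator looks like, here's a sketch with a shortened stand-in so the example runs on its own. strict_function_sketch is hypothetical and is not the decorator above: it pairs arguments with their names via inspect.signature, only does exact-type checks, and assumes everything is annotated.

```python
import functools
import inspect


def strict_function_sketch(f):
    """Hypothetical, shortened stand-in: exact-type checks only."""
    signature = inspect.signature(f)
    annotations = f.__annotations__

    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        # Pair every passed argument with its parameter name.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if type(value) is not annotations[name]:
                raise TypeError(f"{name} is not of type {annotations[name]}")
        result = f(*args, **kwargs)
        expected = annotations["return"]
        if expected is None:
            # "-> None" is spelled None, but the value's type is NoneType.
            expected = type(None)
        if type(result) is not expected:
            raise TypeError(
                f"Returned value {result!r} does not match annotation {expected}"
            )
        return result

    return wrapped


@strict_function_sketch
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat("ab", times=2))   # abab

try:
    repeat("ab", "2")
except TypeError as error:
    strict_message = str(error)
    print(strict_message)      # times is not of type <class 'int'>
```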

There are a few caveats. Functions that take *args or **kwargs (or both) do not work: you can't really annotate the individual arguments separately, yet each argument given in the function call still counts towards the total number of annotations expected. Types that aren't straightforward (such as anything involving the typing module, or union types like str | int ) also don't work with how the decorator is currently written. But in most cases, this makes the function strict on typing.

(And it's not like handling things like *args, **kwargs, typing, or | is impossible, I just don't have the implementation down yet.)

On the dataclasses we were using before, we can get a tuple of Field objects, each holding an attribute's name and type, with dataclasses.fields(some_instance_of_a_dataclass); the matching values can then be read with getattr. I'm still working on a comprehensive way to always type dataclass attributes, but it's tricky. There are a lot of types (and ways to check them) to consider.

Regular classes are technically types in their own regards, but one thing remains:

Variables

Variables are quite difficult to enforce typing on. The values they hold (a str, an int, and so on) do not carry an __annotations__ attribute of their own, and annotations on local variables are not stored anywhere at runtime, so there is no easy way to determine whether the value of a variable is what its annotation says it should be.
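A small sketch of the problem, with made-up names. An annotated local variable leaves no trace at runtime, while a class-level annotation is recorded but still not enforced.

```python
def make_counter():
    # Annotated as int, assigned a str; nothing complains.
    count: int = "not a number"
    return count


print(make_counter())                  # not a number
# The local annotation was never stored on the function.
print(make_counter.__annotations__)    # {}


class Settings:
    # Class-level annotations are recorded, but not checked.
    retries: int = "three"


print(Settings.__annotations__)        # {'retries': <class 'int'>}
print(Settings.retries)                # three
```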

There is one solution, and it's everyone's favorite word from other languages: var

Although not the same implementation as in other languages, we could create a class named var, and use __class_getitem__ to define the .type, and both __init__ and __call__ to define the .value. From then on, we can override other magic methods like __setattr__ to ensure that the .value attribute is always whatever the .type says it should be, as well as to set the .type. We can also override __setitem__ to change both the type and the value, should we need to, or just __getitem__ to change the type (although this one may confuse IDEs).

It's a bit hacky, but it works. Let's see the whole implementation:

# purposely lowercase, to mimic the built-in types
class var:
    def __init__(self, value):
        if not hasattr(self, "type"):
            self.type = type(value)
            self.value = value
        else:
            self.value = value

    def __setattr__(self, key, attr):
        match key:
            case "type":
                super().__setattr__("type", attr)
                if hasattr(self, "value"):
                    super().__setattr__("value", attr(self.value))
            case "value":
                if (
                    (attr is None and self.type is None)
                    or (type(attr) is self.type)
                    or isinstance(attr, self.type)
                ):
                    super().__setattr__("value", attr)
                else:
                    raise TypeError(
                        f"Provided value <{attr}> is not of type {self.type}"
                    )
            case a:
                raise AttributeError(
                    f"Cannot change attribute <{a}>. Only changing of 'type' and 'value' are allowed."
                )

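    def __class_getitem__(cls, set_type):
        # Return a fresh subclass carrying the type, so that var[str] and
        # var[int] don't overwrite each other's type on the shared class
        # (and a plain var(...) can still infer its type afterwards).
        return type(cls.__name__, (cls,), {"type": set_type})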

    def __call__(self, set_value):
        self.value = set_value

    def __setitem__(self, set_type, set_value):
        del self.value
        self.type = set_type
        self.value = set_value

    def __getitem__(self, set_type):
        self.type = set_type

    def __str__(self):
        return str(self.value)

    def __repr__(self):
        return f"var[{self.type.__name__}]({self.value})"

And for usage:

x = var[str]("This String")
#     uses __class_getitem__, followed by __init__
print(x)
#     uses __str__
#     >>> This String
print(repr(x))
#     uses __repr__
#     >>> var[str](This String)

x("A New String")
#     uses __call__
print(x)
#     >>> A New String
print(repr(x))
#     >>> var[str](A New String)

x[int] = 16
#     uses __setitem__
print(x)
#     >>> 16
print(repr(x))
#     >>> var[int](16)

x[float]
#     uses __getitem__
#     16 can be converted to float with float(16)
#     this line in __setattr__ succeeds:
#       super().__setattr__("value", attr(self.value))
print(x)
#     >>> 16.0
print(repr(x))
#     >>> var[float](16.0)

from datetime import datetime  # needed for this part of the example
y = var[datetime](datetime.now())
print(y)
#     >>> 2023-07-05 23:06:50.843621
print(repr(y))
#     >>> var[datetime](2023-07-05 23:06:50.843621)
print(x)
#     >>> 16.0
print(repr(x))
#     >>> var[float](16.0)
# The old classes are still intact!

x("a string again")
#     >>> TypeError: Provided value <a string again> is not of type <class 'float'>
#     x is still var[float]. str is not a float.

# A simpler, "inferred type" instantiation
# Just use regular __init__
z = var(False)
#     __init__ says:
#       if not hasattr(self, "type"):
#           self.type = type(value)
#     z.type = bool is inferred.
print(z)
#     >>> False
print(repr(z))
#     >>> var[bool](False)

Now, that just requires using this new var class, and remembering to access its .value if you want the value for assigning to something else. Which can be a challenge if people don't want to use it, but it's the closest we'll get.

Summary

So, Elm is a pretty secure language, thanks to its compiler's type-checking; it can be extended through custom types, and it has secure data modelling with records.

With Python, we can get close to making it as type-aware as possible, but it's at a cost: It raises errors at runtime. But once those are all settled, we theoretically could have something pretty secure, predictable, and consistent. At least, until data from the outside world gets involved. User-data is never consistent, nor predictable.

I'm currently making (alongside a few other projects) a library to make type-aware, type-secure objects, much like we saw above with the @strict_function function decorator and the var class, but (hopefully) better, and with more!

Let me know what you think! Probably on Mastodon (if you have that), since Twitter seems to be self-destructing slowly.