Okay, I mostly figured it out, but for some reason hashmaps don't want to cooperate with me.
It's about as unholy as I expected. The HTML document has a comment (which browsers drop) that provides some general information, and the rest of the parsing leans on the predictable layout of the HTML document (there are line breaks in specific spots to denote each database entry). There are specialized subroutines to break it down even FURTHER, because the returned database actually retains its HTML tags and thus needs further processing, rather than sensibly getting rid of all of that BEFORE get_thread finishes.
This is some serious MIT PhD coding, /JesusTakeTheWheel/.
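
For anyone following along, here's a rough Python sketch of the layout I described: metadata in an HTML comment, one entry per line, and the tags stripped BEFORE anything gets returned. To be clear, this is not the actual get_thread code. The sample document, the `key=value; key=value` comment format, and the name `parse_thread` are all made up for illustration, since the real format may well differ.

```python
import html
import re

# Hypothetical sample; the real page's comment format and markup are assumptions.
SAMPLE = """<html><body>
<!-- board=example; thread=123 -->
<b>alice</b>: first post<br>
<b>bob</b>: second &amp; last post<br>
</body></html>"""

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def parse_thread(doc):
    """Return (metadata dict, list of plain-text entries) for one document."""
    # Pull the general info out of the first HTML comment, if there is one.
    meta = {}
    match = COMMENT_RE.search(doc)
    if match:
        for pair in match.group(1).split(";"):
            key, sep, value = pair.partition("=")
            if sep:
                meta[key.strip()] = value.strip()

    # Drop comments, then treat every remaining non-empty line as one entry.
    body = COMMENT_RE.sub("", doc)
    entries = []
    for line in body.splitlines():
        # Strip tags and decode entities here, so callers never see raw HTML.
        text = html.unescape(TAG_RE.sub("", line)).strip()
        if text:
            entries.append(text)
    return meta, entries


meta, entries = parse_thread(SAMPLE)
```

The regex tag stripping only holds up because the layout is so predictable; anything messier would want a real HTML parser.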