May 2nd, 2020


Can a natural language model be used to predict Bitcoin movement? Is GPT-2 a proto artificial general intelligence capable of predicting binary Bitcoin price movement?

Language transformers are one of the most exciting and mysterious use cases for machine learning neural nets, giving us the ability to essentially talk to artificial intelligence. From OpenAI's Better Language Models and Their Implications: "Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text." To play with GPT-2, check out Talk to Transformer.

GPT-2 is a type of language model "capable of achieving state-of-the-art results on a set of benchmark and unique natural language processing tasks that range from language translation to generating news articles to answering SAT questions."

I think it was on Kevin Scott's podcast Behind the Tech that someone pointed out that, under a lot of the abstraction and mystery of language models, all they are doing is pattern recognition: predicting the next character, which, after enough characters are successfully computed, gives us the perception of speech.

This concept has been further demonstrated with GPT-2's ability, after feeding it enough data, to even cross language barriers or Generate Pokemon!

That being said, we wanted to see if there was a way to study GPT-2's pattern recognition ability: whether it's possible to leverage the neural net to predict the future, and to do this in a simple, binary format. With these criteria in place, it became convenient to test this programmatic hypothesis on the Bitcoin market.

The program primarily uses Node / JavaScript to retrieve the market data and parse the response, with the machine learning portion being a tiny Python script using the gpt-2-simple package.

Jack on BTC: "We have something that is pretty organic in nature and very principled in its original design...It's poetry."


Code - GitHub GPT-2-Crypto



This is not a trading strategy. Since the market has been reduced to a binary up and down, there is no perception of volume or volatility. Because of this, alternative strategies need to be implemented to make this a viable tool.

gpt-2-crypto Dataset 1


Can a language model be used to predict BTC prices with a time-based strategy?

How it works

Every hour, the price of Bitcoin is compared against the previous hour's price. If the price went up, we assign it a 1; if it went down, we assign it a 0. This long block of numbers is then fed into a custom-trained GPT-2 model, which responds with a large block of binary ones and zeros.
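The encoding step can be sketched in a few lines of Python. This is only an illustration, not the code from the repository: the function name `encode_moves` is hypothetical, and treating an unchanged price as 0 is an assumption, since only the up and down cases are defined above.

```python
from typing import Sequence


def encode_moves(prices: Sequence[float]) -> str:
    """Encode consecutive hourly prices as a string of 1s (up) and 0s (not up)."""
    bits = []
    for previous, current in zip(prices, prices[1:]):
        # Assumption: an unchanged price is encoded as 0, because only the
        # "up" and "down" cases are defined in the description.
        bits.append("1" if current > previous else "0")
    return "".join(bits)


# The first two prices come from the sample log; the last two are made up.
hourly_prices = [9488.88532808, 9504.92062577, 9500.0, 9510.5]
print(encode_moves(hourly_prices))  # prints 101
```

A list of N prices produces N - 1 characters, because each character describes the move between two neighbouring hours.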

The Node program then takes the response and parses the next hour's binary character, assuming that this is GPT-2's attempt at predicting the market.
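The parsing step might look roughly like the following. The real program does this in Node; this Python sketch only shows the idea, and the function name `next_hour_prediction` is hypothetical. It assumes the generated text echoes the input as a prefix, and falls back to treating the whole response as the continuation when it does not.

```python
from typing import Optional


def next_hour_prediction(binary_input: str, generated: str) -> Optional[str]:
    """Return the first 0/1 character the model produced after the input."""
    # Assumption: the generated text starts with the input it was given,
    # so the prediction is the first binary character that follows it.
    if generated.startswith(binary_input):
        continuation = generated[len(binary_input):]
    else:
        # Otherwise treat the whole response as the continuation.
        continuation = generated
    for char in continuation:
        if char in "01":
            return char
    return None  # the model produced no usable binary character


print(next_hour_prediction("1011", "10110100"))  # prints 0
```

Returning `None` instead of guessing keeps a malformed response from being logged as a prediction.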


Time Stamp: May 15, 2020 7:10 PM

Unix Stamp: 1589569800

Previous BTC Price: 9488.88532808

Current BTC Price: 9504.92062577

TradeStatus Previous: 1

Binary Current: 1

Input Length Check: 1935

Input will readjust in 66 hours

binaryInput: "10100111010111...101000101011"

Binary Input Length: 1937

ूੂ✧Loading GPT-2...

✧A ूੂI ूੂ✧: Loading checkpoint checkpoint/run1/model-845

✧A ूੂI ूੂ✧: 10100111010111...101000101011

✧A ूੂI ूੂ✧: 00011001001011...1001000100100

✧A ूੂI ूੂ✧: None

GPT2 Output Data: "10100111010111...10100010101110000"

Trade Log Position: 1936

GPT2__Predict: 0

GPT2 Output Data: 0

Market Outcome: 1

Time Stamp: May 15, 2020 7:15 PM

Unix Stamp: 1589570101

First Impressions

It is difficult to tell if GPT-2 is in fact producing consistent, though very slight, positive results. In randomized backlog samplings we have found slight positive tendencies. We have also noticed that the first characters in an input have a major influence on the response, which we have not been able to solve: if the input starts with a 0, GPT-2 has a tendency to respond with a 0.
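One way to put a number on "slight positive tendencies" is to compare the hit rate against a coin flip. The sketch below is not part of the original program: the function name `hit_rate` is hypothetical, and the 50% baseline is an assumption that only holds if up and down hours are roughly equally common in the sample.

```python
import math
from typing import Tuple


def hit_rate(predictions: str, outcomes: str) -> Tuple[float, float]:
    """Return (accuracy, z-score against a 50% coin flip)."""
    if len(predictions) != len(outcomes) or not predictions:
        raise ValueError("need two non-empty strings of equal length")
    n = len(predictions)
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    accuracy = hits / n
    # Under a fair coin, the standard error of the hit rate is sqrt(0.25 / n).
    z_score = (accuracy - 0.5) / math.sqrt(0.25 / n)
    return accuracy, z_score


# Made-up example: two of four predictions match the market outcome.
print(hit_rate("1100", "1001"))  # prints (0.5, 0.0)
```

As a rough rule of thumb, a z-score well below 2 means the sample cannot be distinguished from chance, so small samples with a small edge will usually be inconclusive.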

Further Building

It would be interesting to test alternative methods, such as stacking other transformers with varying datasets, building on top of the already existing binary prediction model and incorporating volume prediction models, or even going as far as sampling other language transformers to hopefully increase prediction accuracy.



Better Language Models and Their Implications

Talk to Transformer

Generating output sequences from input sequences using neural networks