Building a bitcoin dataset

Marcel Burger
4 min read · Jul 31, 2019

Good analysis starts with clean, complete and reliable data. I had been looking for good bitcoin datasets for a while, but couldn’t find anything that matched what I was looking for, so I decided to create my own dataset that I can use for different purposes. Because I wanted the data to be reliable, my journey started with finding the right sources and instruments.

Build in Python

When datasets start to exceed the limits of Google Sheets or Excel, it is time to look for a different solution. I decided to use Python. More specifically, I write and evaluate the code in PyCharm and use the pandas and numpy packages, which provide the tooling to first build the dataset and later analyse it. An easy way to build a dataset is to read the CSV files into a pandas DataFrame. Now that we have a good way to get all the data in one place where we can easily analyse it, it’s time to actually collect the data.
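As a minimal sketch of that idea, the snippet below reads several CSV sources into DataFrames and stacks them into one. The column names (`id`, `time`, `transaction_count`), the sample rows and the helper name `load_parts` are assumptions made for illustration; they are not taken from a real export. In practice you would pass file paths instead of the in-memory stand-ins.

```python
import io

import pandas as pd

# In-memory stand-ins for two CSV files on disk.
# Columns and rows are illustrative assumptions, not a real export.
part_1 = io.StringIO(
    "id,time,transaction_count\n"
    "0,2009-01-03 18:15:05,1\n"
    "1,2009-01-09 02:54:25,1\n"
)
part_2 = io.StringIO(
    "id,time,transaction_count\n"
    "2,2009-01-09 02:55:44,1\n"
)


def load_parts(sources):
    """Read each CSV source into a DataFrame and stack them into one."""
    frames = [pd.read_csv(src, parse_dates=["time"]) for src in sources]
    # ignore_index=True renumbers the rows 0..n-1 across all parts.
    return pd.concat(frames, ignore_index=True)


blocks = load_parts([part_1, part_2])
print(blocks)
```

`pd.read_csv` accepts a file path as well as a file-like object, so `load_parts(["blocks_2009.csv", "blocks_2010.csv"])` would work the same way with files of your own (those file names are hypothetical).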

Getting all bitcoin blockchain data

A great source for bitcoin blockchain data is www.blockchair.com. In the bitcoin/blocks section of the site you can select which data you’d like to export to a CSV file and download it from there.

Blockchair.com: a great place to get your bitcoin blockchain data in CSV format
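Once an export is downloaded, a quick sanity check helps confirm the data is complete. The sketch below loads a block export, indexes it by block height and lists any heights missing from the range. It assumes the export has an `id` column holding the block height and a `time` column; check those names against the columns you actually selected. The helpers `load_blocks` and `missing_heights` and the sample rows are hypothetical.

```python
import io

import pandas as pd

# Stand-in for a downloaded export file; rows and columns are illustrative.
# Block 2 is left out on purpose to show the completeness check.
sample_export = io.StringIO(
    "id,time,size\n"
    "3,2009-01-09 03:02:53,215\n"
    "0,2009-01-03 18:15:05,285\n"
    "1,2009-01-09 02:54:25,215\n"
)


def load_blocks(source, sep=","):
    """Load a block export and index it by block height (assumed column: id)."""
    df = pd.read_csv(source, sep=sep, parse_dates=["time"])
    return df.set_index("id").sort_index()


def missing_heights(df):
    """Return the block heights absent between the lowest and highest height."""
    expected = range(int(df.index.min()), int(df.index.max()) + 1)
    return sorted(set(expected) - set(df.index))


blocks = load_blocks(sample_export)
gaps = missing_heights(blocks)
print(gaps)
```

If your export turns out to be tab-separated rather than comma-separated, pass `sep="\t"` to `load_blocks`.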

Collecting bitcoin price…

Marcel Burger

As CIO, Marcel heads Amdax Asset Management. He holds an MSc in Econometrics. Before he cofounded Amdax, he worked as a trader, portfolio manager and quant.