Given the ephemeral nature of 4chan content, we collected this information with a distributed scraping system, combining 14 nodes organized in 3 clusters and launching 100 scraper instances per minute to scrape the data at high frequency. The scraper followed a breadth-first strategy, and the collected documents were stored in MongoDB, configured as a replica set consisting of one primary, three secondaries, and one arbiter.
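To make the pipeline concrete, the sketch below shows one possible scraping worker under these assumptions: it uses the public read-only 4chan JSON API (a.4cdn.org), visits each board's catalog before descending into individual threads (the breadth-first order), and upserts thread documents into the replica set via pymongo. The board list, host names, database name, and the "rs0" replica-set name are illustrative placeholders, not details from our deployment.

```python
# Minimal sketch of one scraping worker (assumed setup, not the exact
# production code). Requires: requests, pymongo.
import time
from collections import deque

import requests
from pymongo import MongoClient

API = "https://a.4cdn.org"
BOARDS = ["pol", "b", "v"]  # hypothetical subset of boards

# Connect to the replica set (one primary, three secondaries, one arbiter);
# writes go to the primary, reads can be routed to secondaries.
client = MongoClient("mongodb://mongo1,mongo2,mongo3,mongo4/?replicaSet=rs0")
threads = client["chan"]["threads"]

# Breadth-first order: enqueue every board catalog first, then visit the
# individual threads discovered on each catalog page.
queue = deque(("catalog", b) for b in BOARDS)
while queue:
    kind, *args = queue.popleft()
    if kind == "catalog":
        board = args[0]
        pages = requests.get(f"{API}/{board}/catalog.json", timeout=10).json()
        for page in pages:
            for t in page["threads"]:
                queue.append(("thread", board, t["no"]))
    else:
        board, no = args
        r = requests.get(f"{API}/{board}/thread/{no}.json", timeout=10)
        if r.status_code == 200:  # thread may already have been pruned
            threads.update_one(
                {"board": board, "no": no},
                {"$set": {"board": board, "no": no,
                          "posts": r.json()["posts"]}},
                upsert=True,
            )
    time.sleep(1)  # the 4chan API asks for at most one request per second
```

In practice, each of the 100 concurrent instances would process a disjoint slice of this queue; the breadth-first order ensures that thread IDs across all boards are discovered quickly, before individual ephemeral threads expire.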