4chan /pol board as a temporary evolution of live threads and posts.
Description
This dataset contains a three-month extract of data scraped from 4chan between 1 April 2021 and 1 July 2021.
Collection Method
The scraping system was configured as a breadth-first scraping system, and MongoDB was used as the document store, configured as a replica set made up of one primary, three secondaries and one arbiter.
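The record does not publish the scraper itself, so the following is only a minimal sketch of what a breadth-first traversal looks like, assuming a board → threads → posts hierarchy. The function name `breadth_first_crawl`, the `fetch_children` callback and the `SAMPLE` data are hypothetical stand-ins, not part of the original system; a real scraper would replace the callback with network requests and write each visited item to the document store.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List


def breadth_first_crawl(
    roots: Iterable[Hashable],
    fetch_children: Callable[[Hashable], Iterable[Hashable]],
    max_items: int = 1000,
) -> List[Hashable]:
    """Visit items level by level and return them in visit order."""
    queue = deque(roots)   # FIFO queue is what makes the traversal breadth-first
    seen = set(queue)      # avoid queueing the same item twice
    visited = []
    while queue and len(visited) < max_items:
        item = queue.popleft()
        visited.append(item)
        for child in fetch_children(item):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return visited


# Hypothetical in-memory stand-in for the board -> threads -> posts hierarchy.
SAMPLE = {
    "board:pol": ["thread:1", "thread:2"],
    "thread:1": ["post:1a", "post:1b"],
    "thread:2": ["post:2a"],
}

order = breadth_first_crawl(["board:pol"], lambda item: SAMPLE.get(item, []))
print(order)
```

In this sketch every thread on the board is visited before any individual post, which is the property that matters when threads can expire between scraping passes.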
Data Objects
Offline / Analogue Data Records
There are no offline / analogue datasets associated with this record.
External Data Records
There are no external datasets associated with this record.
Digital Data Downloads
To download any items from this dataset, you must agree to abide by the licence attached to the individual items. If you make use of any item you download, you must also cite it in any publications or outputs of your own.
If you have any questions or would like additional information, please contact us at researchdata@bbk.ac.uk.
Metadata
Dataset Title: | 4chan /pol board as a temporary evolution of live threads and posts. |
---|---|
Creators: | Prifti, Ylli |
School/Department: | |
Keywords: | 4chan, live board, threads, posts |
Data collection method: | We used a distributed scraping system to collect this information, pointing a combination of 14 nodes, 3 clusters and 100 running instances per minute at the site to scrape the data at high frequency, owing to the ephemeral nature of 4chan. |
Collection period: | |
Temporal coverage: | |
Statement on legal, ethical, and access issues: | This is a collection of publicly available, anonymous at source and otherwise ephemeral data. |
Depositing User: | Ylli Prifti |
Date Deposited: | 29 Jul 2021 16:05 |
Last Modified: | 01 Jul 2022 15:16 |
Publisher: | Birkbeck College, University of London |