BiRD - Birkbeck Research Data

    flexiMAP: A regression-based method for discovering differential alternative polyadenylation events in standard RNA-seq data

    Cite as: Szkop, Krzysztof J. and Moss, David S. and Nobeli, Irene (2019): flexiMAP: A regression-based method for discovering differential alternative polyadenylation events in standard RNA-seq data. Birkbeck College, University of London. doi: https://doi.org/10.18743/DATA.00035

    Description

    An “idealized” dataset of RNA-seq reads was created using the polyester R package (Frazee et al., 2015). This simulated dataset is free of technical biases, and the fold changes between isoforms are known, which allows the sensitivity limits of the method to be tested in the absence of external factors. The simulation experiment comprises 20 samples, 10 in each of two conditions.
    For the "main" dataset, polyadenylation sites splitting each transcript into two isoforms (short and long) were obtained from the poly(A) site atlas (Gruber et al., 2016) for 11000 human transcripts. Each isoform (“short” and “short + long”) was simulated as a different transcript. The expression of the “short + long” isoform was unchanged between conditions, whereas eleven different fold changes were applied between conditions to the “short” isoform in order to produce a range of different ratios, R. Hence, each fold change is represented by ~1000 transcripts in the dataset. Additionally, within each fold change category we assigned 100 different mean expression levels (from 100 to 1000), with the aim of sampling the effect of expression level on the ability of the method to detect alternative polyadenylation events.
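    The layout of the "main" dataset can be sketched as a design table with one row per transcript. This is an illustrative sketch only, not the authors' simulation code (the reads themselves were generated with the polyester R package). It assumes exactly 1000 transcripts per fold change category (the description says ~1000), evenly spaced mean expression levels between 100 and 1000, and an even spread of transcripts over those levels. Fold change categories are represented by an index because the actual fold change values are not stated here. All names in the block are hypothetical.

```python
# Illustrative sketch of the "main" dataset layout (not the authors' code).
N_TRANSCRIPTS = 11000
N_FOLD_CHANGES = 11          # eleven fold change categories
N_EXPRESSION_LEVELS = 100    # mean expression levels per category

# ASSUMPTION: the 100 mean expression levels are evenly spaced from 100 to 1000.
expression_levels = [
    100 + 900 * i / (N_EXPRESSION_LEVELS - 1) for i in range(N_EXPRESSION_LEVELS)
]


def build_design():
    """Return one (transcript_index, fold_change_category, mean_expression) row per transcript."""
    # ASSUMPTION: transcripts are split evenly, giving exactly 1000 per category.
    per_category = N_TRANSCRIPTS // N_FOLD_CHANGES
    design = []
    for t in range(N_TRANSCRIPTS):
        category = t // per_category  # index 0..10, not an actual fold change value
        # ASSUMPTION: transcripts cycle through the expression levels within a category.
        level = expression_levels[(t % per_category) % N_EXPRESSION_LEVELS]
        design.append((t, category, level))
    return design


design = build_design()
```

    Under these assumptions, each combination of fold change category and mean expression level is covered by 10 transcripts.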
    For the "biased" dataset, the aim was to create a scenario where fold changes between two conditions are confounded by the presence of an additional factor. In this specific example, we created an imbalanced dataset of 1000 transcripts in which male- and female-origin samples are present in unequal numbers in the control group (7 males and 3 females) and the condition group (3 males and 7 females). Although group membership for the factor of interest (condition) plays no role in the choice of polyadenylation site for these transcripts, membership of the male or female group does, confounding the outcome of methods that do not take additional covariates into account.
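    The confounding in the "biased" dataset can be illustrated with a small worked example. This is a sketch under stated assumptions, not the authors' code. The sample layout (7 males and 3 females in control, 3 males and 7 females in condition) comes from the description above. The short-isoform fractions of 0.3 for males and 0.7 for females are hypothetical values chosen only for illustration, as are all function and variable names.

```python
from statistics import mean

# Sample layout taken from the dataset description.
samples = (
    [("control", "male")] * 7
    + [("control", "female")] * 3
    + [("condition", "male")] * 3
    + [("condition", "female")] * 7
)

# HYPOTHETICAL values: fraction of expression coming from the short isoform.
SHORT_FRACTION_BY_SEX = {"male": 0.3, "female": 0.7}


def short_fraction(group, sex):
    """Short-isoform fraction for one sample; depends on sex only, never on group."""
    return SHORT_FRACTION_BY_SEX[sex]


def group_mean(group, sex=None):
    """Mean short-isoform fraction in a group, optionally restricted to one sex."""
    values = [
        short_fraction(g, s)
        for g, s in samples
        if g == group and (sex is None or s == sex)
    ]
    return mean(values)


# Ignoring sex: the groups appear to differ although condition has no effect.
naive_difference = group_mean("condition") - group_mean("control")

# Comparing within each sex: the apparent difference disappears.
stratified_differences = {
    sex: group_mean("condition", sex) - group_mean("control", sex)
    for sex in ("male", "female")
}

print(naive_difference, stratified_differences)
```

    With these hypothetical values, the naive comparison shows a difference of about 0.16 between groups, while the within-sex comparisons show none. A method that accepts sex as an additional covariate is therefore needed to avoid calling such transcripts differential.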

    Collection Method

    Simulated RNA-seq data generated using the polyester package (Frazee et al., 2015)

    Data Objects

    Offline / Analogue Data Records

    There are no offline / analogue datasets associated with this record

    External Data Records

    There are no external datasets associated with this record

    Digital Data Downloads

    To download any items from this dataset, you must agree to abide by the licence attached to the individual items. If you make use of any item you download, you must also cite it in any publications or outputs of your own.

    If you have any questions or would like additional information, please contact us at researchdata@bbk.ac.uk.

    Additional Metadata


    Dataset Title:

    flexiMAP: A regression-based method for discovering differential alternative polyadenylation events in standard RNA-seq data

    Creators:

    Szkop, Krzysztof J. and Moss, David S. and Nobeli, Irene

    School/Department:

    Birkbeck Schools and Research Centres > School of Science > Biological Sciences

    Data collection method:

    Simulated RNA-seq data generated using the polyester package (Frazee et al., 2015)

    Statement on legal, ethical, and access issues:

    Not applicable

