[PDF] fastp: an ultra-fast all-in-one FASTQ preprocessor | Semantic Scholar (2024)

@article{Chen2018fastpAU,
  title={fastp: an ultra-fast all-in-one FASTQ preprocessor},
  author={Shifu Chen and Yanqing Zhou and Yaru Chen and Jia Gu},
  journal={Bioinformatics},
  year={2018},
  volume={34},
  pages={i884 - i890},
  url={https://api.semanticscholar.org/CorpusID:52196534}
}
  • Shifu Chen, Yanqing Zhou, Yaru Chen, Jia Gu
  • Published in bioRxiv 1 March 2018
  • Computer Science

fastp is an ultra-fast FASTQ preprocessor with built-in quality control and data-filtering features. It performs quality control, adapter trimming, quality filtering, per-read quality cutting, and many other operations in a single scan of the FASTQ data.
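The "single scan" idea can be illustrated with a minimal Python sketch that gathers QC statistics and applies a mean-quality filter in one pass over the records. This is not fastp's implementation (fastp is a multi-threaded C++ tool); the function names, the Phred+33 offset, and the threshold of 20 are assumptions chosen only for illustration.

```python
import io

def parse_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)

def single_pass(handle, min_mean_q=20.0):
    """Collect QC counters and filter reads in the same loop over the input."""
    stats = {"total": 0, "passed": 0, "bases": 0}
    kept = []
    for name, seq, qual in parse_fastq(handle):
        stats["total"] += 1
        stats["bases"] += len(seq)
        # Hypothetical filter: keep non-empty reads whose mean quality is high enough.
        if seq and mean_phred(qual) >= min_mean_q:
            stats["passed"] += 1
            kept.append((name, seq, qual))
    return stats, kept

# Two toy reads: 'I' encodes Phred 40, '!' encodes Phred 0.
demo = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n"
stats, kept = single_pass(io.StringIO(demo))
```

Because statistics and filtering share one loop, each record is read and decoded only once, which is the property the summary above attributes to fastp.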

10,599 Citations

  • Highly Influential Citations: 1,379
  • Background Citations: 562
  • Methods Citations: 2,831
  • Results Citations: 8

Topics

Fastp, Adapter Trimming, SOAPnuke, AfterQC, Cutadapt, Adapter Trimmer, FastQC, Adapter Contamination, Base Correction, Adapter Sequences

10,599 Citations

Atria: an ultra-fast and accurate trimmer for adapter and quality trimming
    Jiacheng Chuan, Aiguo Zhou, L. Hale, Miao He, Xiang Li

    Computer Science, Biology

    bioRxiv

  • 2021

Atria matches the adapters in paired reads and finds possible overlapped regions with a super-fast and carefully designed byte-based matching algorithm (O(n) time with O(1) space) that can be used in a broad range of short-sequence matching applications.

Ktrim: an extra-fast and accurate adapter- and quality-trimmer for sequencing data
    Kun Sun

    Computer Science, Biology

    Bioinform.

  • 2020

Ktrim was ∼2-18 times faster than current tools and also showed high accuracy when applied on the testing datasets and could serve as a valuable and efficient tool for short-read NGS data preprocessing.

  • 24

RabbitFX: Efficient Framework for FASTA/Q File Parsing on Modern Multi-Core Platforms
    Hao Zhang, Honglei Song, ..., Weiguo Liu

    Computer Science, Biology

    IEEE/ACM Transactions on Computational Biology…

  • 2023

RabbitFX is a fast, efficient, and easy-to-use framework for processing biological sequencing data on modern multi-core platforms that can efficiently read FASTA and FASTQ files by combining a lightweight parsing method by means of an optimized formatting implementation.

  • 2
  • Highly Influenced

RabbitQCPlus 2.0: More Efficient and Versatile Quality Control for Sequencing Data.
    Lifeng Yan, Zekun Yin, ..., Weiguo Liu

    Computer Science, Biology

    Methods

  • 2023

FastProNGS: fast preprocessing of next-generation sequencing reads
    Xiaoshuang Liu, Zhenhe Yan, Chao Wu, Yang Yang, Xiaoming Li, Guangxin Zhang

    Computer Science

    BMC Bioinformatics

  • 2019

FastProNGS is a rapid, standardized, and user-friendly tool for preprocessing next-generation sequencing data within minutes and is an all-in-one software that is convenient for bulk data analysis.

  • 13

RabbitQCPlus: More Efficient Quality Control for Sequencing Data
    Lifeng Yan, Zekun Yin, ..., Weiguo Liu

    Computer Science

    2022 IEEE International Conference on…

  • 2022

RabbitQCPlus is an ultra-efficient quality control tool for modern multi-core systems that uses vectorization, memory copy reduction, parallel (de)compression, and optimized data structures to achieve substantial performance gains.

SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files
    Andrea Telatin, P. Fariselli, G. Birolo

    Computer Science, Biology

    Bioengineering

  • 2021

A suite of tools, called SeqFu (Sequence Fastx utilities), that provides a broad range of commands to perform both common and specialist operations with ease and is designed to be easily implemented in high-performance analytical pipelines.

FAST: FPGA-based Acceleration of Genomic Sequence Trimming
    Behnam Khaleghi, Tianqi Zhang, ..., Tajana Rosing

    Computer Science, Biology

    2022 IEEE Biomedical Circuits and Systems…

  • 2022

This work proposes the first FPGA-based framework dubbed FAST to accelerate the stages that deal with sequence trimming, in particular adapter and primer removal, which supports a comprehensive set of functionalities and is convenient to use by operating on standard genomics data formats.

  • 1
  • Highly Influenced

EARRINGS: an efficient and accurate adapter trimmer entails no a priori adapter sequences
    Ting-Hsuan Wang, Cheng-Ching Huang, Jui-Hung Hung

    Computer Science, Biology

    Bioinform.

  • 2021

A set of fast and accurate adapter detection and trimming algorithms that entail no a priori adapter sequences are introduced that are particularly useful in meta-analyses of a large batch of datasets and can be incorporated in any sequence analysis pipelines in all scales.

  • 3

Falco: high-speed FastQC emulation for quality control of sequencing data
    Guilherme de Sena Brandine, Andrew D. Smith

    Computer Science, Biology

    F1000Research

  • 2019

Falco is presented, an emulation of the popular FastQC tool that runs on average three times faster while generating equivalent results and requires less memory to run and provides more flexible visualization of HTML reports.

20 References

SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data
    Yuxin Chen, Yongsheng Chen, ..., Qiang Chen

    Computer Science, Biology

    GigaScience

  • 2018

SOAPnuke is demonstrated as a tool with abundant functions for a “QC-Preprocess-QC” workflow and MapReduce acceleration framework that enables large scalability to distribute all the processing works to an entire compute cluster.

Trimmomatic: a flexible trimmer for Illumina sequence data
    Anthony M. Bolger, M. Lohse, B. Usadel

    Computer Science, Biology

    Bioinform.

  • 2014

Trimmomatic is developed as a more flexible and efficient preprocessing tool, which can correctly handle paired-end data and is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

AfterQC: automatic filtering, trimming, error removing and quality control for fastq data
    Shifu Chen, Tanxiao Huang, Yanqing Zhou, Yue Han, Mingyan Xu, Jia Gu

    Computer Science

    BMC Bioinformatics

  • 2017

Experimental results show that AfterQC can help eliminate sequencing errors in paired-end sequencing data to provide much cleaner outputs, and consequently help reduce false-positive variants, especially for low-frequency somatic mutations.

  • 258

Cutadapt removes adapter sequences from high-throughput sequencing reads
    Marcel Martin

    Computer Science, Biology

  • 2011

The command-line tool cutadapt is developed, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features.

  • 22,238

Fast gapped-read alignment with Bowtie 2
    Ben Langmead, S. Salzberg

    Computer Science, Biology

    Nature Methods

  • 2012

Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

  • 39,888

SpeedSeq: Ultra-fast personal genome analysis and interpretation
    Colby Chiang, Ryan M. Layer, ..., Ira M. Hall

    Computer Science, Biology

    Nature Methods

  • 2015

The SpeedSeq platform accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 h on a low-cost server and alleviates a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement.

  • 452

The Sequence Alignment/Map format and SAMtools
    Heng Li, R. Handsaker, ..., R. Durbin

    Computer Science, Biology

    Bioinform.

  • 2009

Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by

UMI-tools: Modelling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy
    Tom S. Smith, A. Heger, I. Sudbery

    Computer Science, Biology

    bioRxiv

  • 2016

It is shown that errors in the UMI sequence are common and network-based methods to account for these errors when identifying PCR duplicates are introduced, demonstrating the value of properly accounting for errors in UMIs.

  • 1,219

Detecting ultralow-frequency mutations by Duplex Sequencing
    Scott R. Kennedy, Michael W. Schmitt, ..., L. Loeb

    Biology

    Nature Protocols

  • 2014

A detailed protocol for efficient DS adapter synthesis, library preparation and target enrichment, as well as an overview of the data analysis workflow are provided.

  • 360

Theoretical and practical advances in genome halving
    F. Collyn, L. Guy, M. Marceau, M. Simonet, Claude-Alain H. Roten

    Biology

  • 2004

The authors' tighter bounds on genome halving distance yield a new algorithm for reconstructing an ancestral duplicated genome, and a software package GenomeHalving is created based on this new algorithm, identifying a sequence of translocations for halving the yeast genome that is shorter than previously conjectured possible.

  • 28,326
