
云之南


Quorum error corrector for reads  

2015-07-05 19:32:24 | Category: bioinformatics analysis software


http://www.genome.umd.edu/quorum.html

Installation
============

There are two ways to install Quorum: the easy automated way and the manual way.

Easy
****

The quorum_easy_installation script will download Jellyfish and
Quorum, compile both of them and install them in the current directory
(or in the path given by PREFIX on the command line). It is the
easiest way to compile Quorum.

After downloading quorum_easy_installation, just do:

$ sh quorum_easy_installation

or

$ sh quorum_easy_installation PREFIX=/path/where/to/install

Manual
******

Quorum requires Jellyfish to be installed. For Quorum to compile and
run properly, pkg-config must find Jellyfish and the library loader
must find the shared library. See the README of Jellyfish for details.

Provided that Jellyfish is installed and accessible, install the usual
way:

$ ./configure --prefix=/path/where/to/install
$ make
$ make install

Note that 'make install' is necessary or the paths coded in the quorum
scripts will not be valid.

Usage
=====

Only one switch (-s) is required to run Quorum. This switch specifies
the size of the Jellyfish hash, which must be large enough for all
k-mers to fit into memory. With Illumina reads, a good estimate for
this size is:

  (G + k * n) / 0.8

where G is the estimated genome size, k is the k-mer length (24 by
default) and n is the number of reads. If the chosen size is too
small, quorum will stop with the error message: "Failed: Increase the
size parameter".
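
The estimate above is simple arithmetic, and a small helper can make it
less error-prone. The following is a minimal sketch, not part of Quorum:
the function name and the sample inputs (a 100 Mb genome and 10 million
reads) are assumptions chosen purely for illustration.

```python
def estimate_hash_size(genome_size: int, num_reads: int, kmer_len: int = 24) -> int:
    """Return (G + k * n) / 0.8, rounded up to a whole number."""
    total = genome_size + kmer_len * num_reads
    # Dividing by 0.8 is the same as multiplying by 5/4. Integer
    # arithmetic avoids floating-point rounding; -(-x // 4) rounds up.
    return -(-(total * 5) // 4)


# Hypothetical inputs: a 100 Mb genome sequenced with 10 million reads.
size = estimate_hash_size(genome_size=100_000_000, num_reads=10_000_000)
print(size)  # 425000000
```

The result would then be passed to -s, rounded up to a convenient value.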

For example, for a bacterium with 2 million Illumina reads in files
read1.fastq and read2.fastq, the command would be:

$ quorum -s 50M read1.fastq read2.fastq

By default, the corrected output file is named 'quorum_corrected.fa'.

Output format
=============

The corrections made are appended to the header line in the fasta
format. For example, the following 101-base read:

@1204
GACCGGGCATGGGCTGAGCCTGTTCGGGAAGCTGACGGAGCCGGAAGAGGCCGGGATCGACCCTTCCGCCCCGCCCGCCGACTGGGTCGACCGGCCGGGCG

is corrected to:

>1204 86:sub:T-C 91:3_trunc 62:5_trunc
CTTCCGCCCCGCCCGCCGACTGGGCCGAC

The coordinate system is 0-based in the original reads (like a C or
Perl array). Here, at base 86 a substitution was made from T to C. The
5_trunc is the index of the first base (0 if not specified) and the
3_trunc is the index after the last base (read length if not
specified). Hence, the length of the corrected read is computed as
3_trunc - 5_trunc (91 - 62 = 29 in this example). The uncorrected and
corrected reads align as follows:

0                                                            62                      86   91        101
|                                                             |                       |    |         |
GACCGGGCATGGGCTGAGCCTGTTCGGGAAGCTGACGGAGCCGGAAGAGGCCGGGATCGACCCTTCCGCCCCGCCCGCCGACTGGGTCGACCGGCCGGGCG
                                                              CTTCCGCCCCGCCCGCCGACTGGGCCGAC
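
The header annotations can be read back programmatically. The sketch
below is not part of Quorum; the function names are hypothetical, and it
handles only the three annotation forms shown in the example above
(POS:sub:X-Y, POS:3_trunc and POS:5_trunc), raising an error on anything
else. It parses the header and replays the corrections on the original
read.

```python
from typing import List, Tuple


def parse_corrections(header: str, read_length: int):
    """Parse a header such as '>1204 86:sub:T-C 91:3_trunc 62:5_trunc'."""
    fields = header.lstrip(">").split()
    name, annotations = fields[0], fields[1:]
    start, end = 0, read_length  # defaults when no truncation is reported
    subs: List[Tuple[int, str, str]] = []
    for item in annotations:
        parts = item.split(":")
        pos = int(parts[0])  # 0-based coordinate in the original read
        if parts[1] == "5_trunc":
            start = pos
        elif parts[1] == "3_trunc":
            end = pos
        elif parts[1] == "sub":
            old, new = parts[2].split("-")
            subs.append((pos, old, new))
        else:
            raise ValueError(f"unhandled annotation: {item}")
    return name, start, end, subs


def apply_corrections(read: str, start: int, end: int, subs) -> str:
    """Apply the substitutions, then keep bases start..end-1."""
    bases = list(read)
    for pos, old, new in subs:
        if bases[pos] != old:
            raise ValueError(f"expected {old} at position {pos}")
        bases[pos] = new
    return "".join(bases[start:end])


original = (
    "GACCGGGCATGGGCTGAGCCTGTTCGGGAAGCTGACGGAGCCGGAAGAGGCCGGGATCGA"
    "CCCTTCCGCCCCGCCCGCCGACTGGGTCGACCGGCCGGGCG"
)
name, start, end, subs = parse_corrections(
    ">1204 86:sub:T-C 91:3_trunc 62:5_trunc", len(original)
)
corrected = apply_corrections(original, start, end, subs)
print(end - start)  # 29
print(corrected)    # CTTCCGCCCCGCCCGCCGACTGGGCCGAC
```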


Switches
========

Other useful switches include the following (see 'quorum --help' for a
short description of all of them):

* --threads NUMBER

Number of threads to use.

* --kmer-len LENGTH

Length of k-mer to use. Defaults to 24. The maximum is 31.

* --contaminant FILE

Pass in a fasta or fastq file of contaminant sequences. The error
correction program will truncate any read that contains a k-mer
present in the contaminant sequences.

* --prefix NAME

By default, all output files have the form 'quorum_*'. This can be
changed with this switch.

* --min-q-char ASCII

This is the ASCII value of the base (offset) of the quality encoding.
If not specified, it is auto-detected: the first 1,000 reads of the
first file are read and the minimum quality value seen in these reads
is used for min-q-char. An error is raised if this auto-detected base
is not one of the standard values (33, 59 or 64).
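
The auto-detection described for --min-q-char can be illustrated in a
few lines. This is only a sketch of the behaviour as described above,
not Quorum's actual implementation; the function name is hypothetical,
and it assumes plain four-line fastq records (no wrapped sequence
lines).

```python
from itertools import islice
from typing import Iterable

STANDARD_BASES = (33, 59, 64)  # values listed in the description above


def detect_min_q_char(fastq_lines: Iterable[str], max_reads: int = 1000) -> int:
    """Return the lowest quality character code in the first max_reads reads."""
    lowest = None
    # With four-line records, quality strings sit at line indices 3, 7, 11, ...
    for qual in islice(fastq_lines, 3, 4 * max_reads, 4):
        qual = qual.rstrip("\r\n")
        if qual:
            current = min(ord(c) for c in qual)
            lowest = current if lowest is None else min(lowest, current)
    if lowest not in STANDARD_BASES:
        raise ValueError(f"non-standard quality base detected: {lowest}")
    return lowest


# Tiny in-memory example; with a real file, pass an open file handle instead.
records = ["@r1\n", "ACGT\n", "+\n", "IIH!\n", "@r2\n", "ACGT\n", "+\n", "5555\n"]
print(detect_min_q_char(records))  # 33
```

When the detected value is rejected, passing --min-q-char explicitly
bypasses the auto-detection.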