云之南

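The sounds of wind, of rain, and of reading aloud all reach my ears; the affairs of family, of state, and of the world are all my concern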

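Background: computer science. Research interests: JavaEE web software development, bioinformatics, data mining and machine learning, intelligent information systems. Current work: genomics, transcriptomics, NGS high-throughput data analysis, biological data mining, plant phylogenetics and comparative evolutionary genomics.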

blastclust clustering

2011-05-03 15:30:36 | Category: Bioinformatics

http://szypanther.blog.hexun.com/44600967_d.html

Clustering example:

blastclust -a 4 -i proteins.fsa -o cluster_60_80_complete.ssv -S 60 -L 0.80 -e F

-a 4: use 4 CPUs
-i proteins.fsa: input file
-o cluster_60_80_complete.ssv: output file
-S 60: protein identity above 60%
-L 0.80: length coverage above 80%
-e F: ID parsing turned off

To cluster nucleotide sequences instead of proteins, add -p F:

blastclust -a 4 -i proteins.fsa -o cluster_60_80_complete.ssv -S 60 -L 0.80 -e F -p F
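
The command line above can also be assembled from a script, which helps when the same thresholds are applied to several input files. The following is only a sketch: the function name blastclust_args and its keyword arguments are made up for illustration, and it builds the argument list without running blastclust.

```python
def blastclust_args(fasta, out, identity=60, coverage=0.80, cpus=4, protein=True):
    """Build the blastclust argument list used in the example above.

    The function and parameter names are hypothetical; only the flags
    (-a, -i, -o, -S, -L, -e, -p) come from the post.
    """
    args = [
        "blastclust",
        "-a", str(cpus),          # number of CPUs
        "-i", fasta,              # FASTA input file
        "-o", out,                # cluster output file
        "-S", str(identity),      # identity threshold (percent)
        "-L", f"{coverage:.2f}",  # length coverage threshold (fraction)
        "-e", "F",                # ID parsing off
    ]
    if not protein:
        args += ["-p", "F"]       # input is nucleotide, not protein
    return args


command = blastclust_args("proteins.fsa", "cluster_60_80_complete.ssv")
print(" ".join(command))
```

If blastclust is installed, the returned list can be handed to subprocess.run(command, check=True).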

blastclust Parameters
2008-12-02 20:02

blastclust clusters a database of protein or nucleotide sequences. It outputs rows of sequence identifiers, with the members of each cluster on the same row and the clusters sorted from largest to smallest. The list of clusters can serve as input to another program (e.g., an assembly program such as PHRAP). However, blastclust should be used only on a relatively small number of sequences (10-1000): it runs on a single computer, and its RAM requirements quickly exceed what most machines have.
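
Since each output row holds one cluster, the file is easy to post-process. Here is a minimal sketch of a parser, assuming the identifiers on a row are separated by whitespace; the sequence IDs in the sample are invented for illustration.

```python
def parse_clusters(text):
    """Return one list of sequence IDs per non-empty output row.

    Assumes IDs on a row are whitespace-separated.
    """
    return [line.split() for line in text.splitlines() if line.strip()]


# Made-up sample in the shape described above: largest cluster first.
sample = """\
seqA seqD seqF
seqB seqC
seqE
"""

clusters = parse_clusters(sample)
sizes = [len(c) for c in clusters]                      # members per cluster
singletons = [c[0] for c in clusters if len(c) == 1]    # unclustered sequences
print(sizes, singletons)
```

For a real run, read the file given to -o and pass its contents to parse_clusters.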

Here are a few sample command lines:

blastclust -i my_nucdb -p F -o my_nucdb.clusters 
blastclust -i my_pepdb -o my_pepdb.clusters -L 0.7 -S 90

The following reference describes parameters used with blastclust.

-a [integer]

Default: 1

Specifies the number of CPUs to use on a multiprocessor machine.

-b [T/F]

Default: T

Requires coverage on both sequences. If set to T, the program requires both sequences to pass the coverage criteria set with -L before they are called neighbors and clustered together.
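
The effect of -b is easiest to see with a short fragment aligned against a longer sequence. The sketch below is a simplified model, not blastclust's actual code: it assumes coverage is the aligned length divided by the sequence length, that with -b F it is enough for one sequence to pass, and that the comparison is inclusive. It ignores the -S score threshold entirely.

```python
def are_neighbors(aligned_a, len_a, aligned_b, len_b, min_cov=0.9, both=True):
    """Simplified coverage test for a pair of aligned sequences.

    min_cov plays the role of -L and both the role of -b.
    All names here are hypothetical.
    """
    cov_a = aligned_a / len_a   # fraction of sequence A inside the alignment
    cov_b = aligned_b / len_b   # fraction of sequence B inside the alignment
    if both:
        return cov_a >= min_cov and cov_b >= min_cov
    return cov_a >= min_cov or cov_b >= min_cov


# A 100-residue fragment aligned end to end against a 300-residue protein:
# the fragment is fully covered, the long protein only by a third.
strict = are_neighbors(100, 100, 100, 300, min_cov=0.9, both=True)
relaxed = are_neighbors(100, 100, 100, 300, min_cov=0.9, both=False)
print(strict, relaxed)
```

Under these assumptions the pair is rejected with -b T and accepted with -b F, which is why fragments tend to be pulled into clusters only when -b is turned off.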

-c [file]

Default: Optional

Specifies a configuration file with advanced options. The configuration file is simply a list of the options that you commonly use.

-C [T/F]

Default: F

The crash recovery option. Set it to T to complete an unfinished clustering, together with the -r option and a file from which to restore the clustering. Use the same command line as the crashed run, including the same -s, and add only -C T and -r. This restarts the run from the hit list file specified by -r and then appends to it (as specified by -s).

-d [file]

Default: Optional

The input file is a BLAST database, not a FASTA file.

-e [T/F]

Default: F

Enables parsing of sequence IDs when the input is formatted as a database.

-i [file]

Default: stdin

Specifies the FASTA input file for clustering.

-l [file]

Default: Optional

Restricts the reclustering to the IDs specified in [file]. It can be useful when you have a very large FASTA database and wish to cluster a subset of sequences.

-L [real number]

Default: 0.9

Specifies the length coverage threshold.

-p [T/F]

Default: T

Input sequences are proteins. Set to F for nucleotides.

-r [file]

Default: Optional

Specifies the file used to restore neighbors for reclustering. Requires -C T. This file is created by the -s option of a previous run. Use it if the program crashed during a run.

-s [file]

Default: Optional

Specifies the file in which to save the hit list. This file can restore a crashed run and is the input file specified by -r.

-v [file]

Default: stdout

Prints progress messages. Progress is reported to standard output if no file is specified.

-W [integer]

Default: Protein 3, Nucleotide 32

Specifies the word size.
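
The -s, -r, and -C entries above work together, so it may help to see the two command lines side by side. This is a sketch that only builds the argument lists. The helper names and the hit list file name my_pepdb.hits are hypothetical; the flags and their combination follow the descriptions of -C, -r, and -s.

```python
def first_run(fasta, out, hits):
    """Normal run that also saves the hit list with -s."""
    return ["blastclust", "-i", fasta, "-o", out, "-s", hits]


def recovery_run(fasta, out, hits):
    """Same command line as the crashed run, with only -C T and -r added."""
    return first_run(fasta, out, hits) + ["-C", "T", "-r", hits]


# my_pepdb.hits is an invented name for the saved hit list.
original = first_run("my_pepdb", "my_pepdb.clusters", "my_pepdb.hits")
restart = recovery_run("my_pepdb", "my_pepdb.clusters", "my_pepdb.hits")
print(" ".join(original))
print(" ".join(restart))
```

Deriving the recovery command from the original one keeps the two in sync, which is what the -C description asks for.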
