云之南

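Background: computer science. Research interests: JavaEE web software development, bioinformatics, data mining and machine learning, intelligent information systems. Current work: genomes, transcriptomes, NGS high-throughput data analysis, biological data mining, plant phylogenetics and comparative evolutionary genomics.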

MegaBlast, Discontiguous MegaBlast, and BlastN in nucleotide BLAST: differences and how to choose

2011-05-03 09:52:44 | Category: Bioinformatics

http://fhqdddddd.blog.163.com/blog/static/1869915420110132106373/

http://fhqdddddd.blog.163.com/blog/static/1869915420110128425843/

http://fhqdddddd.blog.163.com/blog/static/18699154201071231047454/

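According to the brief help on the blastn page, Highly similar sequences (megablast) is meant for comparing sequences that are very similar to each other (identity of 95% or more), and it is fast. More dissimilar sequences (discontiguous megablast) is for alignments with somewhat lower similarity than megablast handles; it is more sensitive and is mostly used for homology searches between different species. Somewhat similar sequences (blastn) is for sequences with low similarity; because it can seed alignments from matches as short as 7 bases, it is the most sensitive of the three, returns the most hits, and is the slowest.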

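So choose according to the sequence you submit and the purpose of the search. To check whether a sequence is already in the database, use megablast. To annotate a sequence from an unannotated species with gene annotations from other species, choose discontiguous megablast. To get the most complete set of results, choose blastn.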
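As a rough illustration of the choice described above, the sketch below maps a search goal to an algorithm and builds a command line for the blastn program from the BLAST+ suite, whose -task option accepts megablast, dc-megablast, and blastn. The goal names, the function build_blastn_command, and the file and database names are hypothetical and made up for this example. The command is only printed, not run.

```python
import shlex

# Hypothetical mapping from a search goal to a blastn algorithm.
# The goal names are invented for this sketch; the task names are
# values accepted by the -task option of the BLAST+ blastn program.
TASK_FOR_GOAL = {
    "is_it_in_the_database": "megablast",        # highly similar sequences, fastest
    "cross_species_annotation": "dc-megablast",  # more dissimilar sequences
    "most_complete_results": "blastn",           # somewhat similar sequences, slowest
}


def build_blastn_command(goal, query_fasta, database):
    """Return the blastn argument list for the given search goal."""
    task = TASK_FOR_GOAL[goal]
    return ["blastn", "-task", task, "-query", query_fasta, "-db", database]


if __name__ == "__main__":
    # query.fa and nt are placeholder names for a query file and a database.
    cmd = build_blastn_command("cross_species_annotation", "query.fa", "nt")
    print(shlex.join(cmd))
```

Keeping the mapping in one dictionary makes the decision explicit and easy to adjust if your own criteria differ.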

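That covers blastn; next, blastp. The protein BLAST page also offers three algorithms to choose from:

blastp (protein-protein BLAST) simply compares a protein against proteins to find similar protein sequences;

PSI-BLAST (Position-Specific Iterated BLAST) searches the protein database with the query protein repeatedly: all statistically significant protein sequences found in the previous round are combined into a new scoring matrix, and the search is iterated until no new statistically significant proteins are found;

PHI-BLAST (Pattern Hit Initiated BLAST) lets you specify a protein pattern for the search and reports only alignments that contain that pattern.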
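The iterate-until-nothing-new idea behind PSI-BLAST can be sketched in a few lines. This is only a schematic of the control flow, not PSI-BLAST's implementation: the real program builds a position-specific scoring matrix from the hits, whereas here the "profile" is simply the set of sequences found so far. The function iterative_search, the search_once callback, and the toy database are all hypothetical.

```python
def iterative_search(query, search_once, max_rounds=10):
    """Repeat a search until a round finds no new significant hits.

    search_once(profile) is a hypothetical callable that takes the set of
    sequences found so far and returns the set of significant hits.
    Returns the final set and the number of rounds that were run.
    """
    profile = {query}
    for round_number in range(1, max_rounds + 1):
        hits = set(search_once(profile))
        new_hits = hits - profile
        if not new_hits:
            return profile, round_number  # converged: nothing new this round
        profile |= new_hits               # fold the new hits into the profile
    return profile, max_rounds


if __name__ == "__main__":
    # Toy "database": each sequence name lists the names it is similar to.
    neighbours = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}

    def toy_search(profile):
        found = set()
        for name in profile:
            found |= neighbours[name]
        return found

    members, rounds = iterative_search("A", toy_search)
    print(sorted(members), rounds)
```

In the toy data, C is not similar to the query A directly and is only reached through B in a later round, which is the reason iteration can find distant relatives that a single search misses.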

http://liucheng.name/1010/

Blastp, PSI-BLAST, and PHI-BLAST are all BLAST comparisons of a protein sequence against protein sequences.

1. Blastp: standard protein-to-protein sequence comparison

Standard protein BLAST is designed for protein searches.

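Blastp finds sequences in a protein database that are similar to the query amino acid sequence. Like the other BLAST programs, its goal is to find regions of similarity.

2. PSI-BLAST: a more sensitive protein-to-protein sequence comparison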

PSI-BLAST is designed for more sensitive protein-protein similarity searches.

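Position-Specific Iterated (PSI)-BLAST is a more sensitive variant of Blastp that is very effective for finding similar proteins in distantly related species or new members of a protein family. When a standard Blastp search fails, or returns only hits described as hypothetical or putative ("hypothetical protein" or "similar to..."), try PSI-BLAST.

3. PHI-BLAST: Pattern-Hit Initiated BLAST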

PHI-BLAST can do a restricted protein pattern search.

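PHI-BLAST (Pattern-Hit Initiated BLAST) searches a protein database with a protein query and reports only alignments that contain a specific pattern present in the query sequence.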
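To make the idea of a pattern concrete, here is a minimal sketch that translates a PROSITE-style pattern, the notation PHI-BLAST patterns are based on, into a Python regular expression and checks whether a sequence contains it. It handles only a subset of the notation (single residues, [..] sets, {..} exclusions, x, and repeat counts) and assumes a well-formed pattern. The function names, the example pattern, and the sequences are made up for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        # Split off an optional repeat count such as x(2) or x(5,11).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        body, repeat = match.group(1), match.group(2)
        if body.lower() == "x":
            body = "."                      # x matches any residue
        elif body.startswith("{") and body.endswith("}"):
            body = "[^" + body[1:-1] + "]"  # {..} excludes the listed residues
        # [..] sets and single letters are already valid regex syntax.
        parts.append(body + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


def contains_pattern(sequence, pattern):
    """Return True if the sequence contains a match for the pattern."""
    return re.search(prosite_to_regex(pattern), sequence) is not None


if __name__ == "__main__":
    pattern = "C-x(2)-C-{P}-[ST]"              # invented example pattern
    print(prosite_to_regex(pattern))
    print(contains_pattern("MKCAACGSL", pattern))
    print(contains_pattern("MKCAACPSL", pattern))
```

A pattern check like this only tells you whether the motif is present; PHI-BLAST additionally builds alignments around the matched positions.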
For details of the PHI pattern syntax, see: http://www.ncbi.nlm.nih.gov/blast/html/PHIsyntax.html

megablast parameter settings

megablast 2.2.11 arguments:

-d Database [String]
default = nr
-i Query File [File In]
-e Expectation value [Real]
default = 10.0
-m alignment view options:
0 = pairwise,
1 = query-anchored showing identities,
2 = query-anchored no identities,
3 = flat query-anchored, show identities,
4 = flat query-anchored, no identities,
5 = query-anchored no identities and blunt ends,
6 = flat query-anchored, no identities and blunt ends,
7 = XML Blast output,
8 = tabular,
9 tabular with comment lines,
10 ASN, text
11 ASN, binary [Integer]
default = 0
-o BLAST report Output File [File Out] Optional
default = stdout
-F Filter query sequence [String]
default = T
-X X dropoff value for gapped alignment (in bits) [Integer]
default = 20
-I Show GI's in deflines [T/F]
default = F
-q Penalty for a nucleotide mismatch [Integer]
default = -3
-r Reward for a nucleotide match [Integer]
default = 1
-v Number of database sequences to show one-line descriptions for (V) [Integer]
default = 500
-b Number of database sequence to show alignments for (B) [Integer]
default = 250
-D Type of output:
0 - alignment endpoints and score,
1 - all ungapped segments endpoints,
2 - traditional BLAST output,
3 - tab-delimited one line format [Integer]
default = 2
-a Number of processors to use [Integer]
default = 1
-O ASN.1 SeqAlign file; must be used in conjunction with -D2 option [File Out] Optional
-J Believe the query defline [T/F] Optional
default = F
-M Maximal total length of queries for a single search [Integer]
default = 20000000
-W Word size (length of best perfect match) [Integer]
default = 28
-z Effective length of the database (use zero for the real size) [Real]
default = 0
-P Maximal number of positions for a hash value (set to 0 to ignore) [Integer]
default = 0
-S Query strands to search against database: 3 is both, 1 is top, 2 is bottom [Integer]
default = 3
-T Produce HTML output [T/F]
default = F
-l Restrict search of database to list of GI's [String] Optional
-G Cost to open a gap (zero invokes default behavior) [Integer]
default = 0
-E Cost to extend a gap (zero invokes default behavior) [Integer]
default = 0
-s Minimal hit score to report (0 for default behavior) [Integer]
default = 0
-Q Masked query output, must be used in conjunction with -D 2 option [File Out] Optional
-f Show full IDs in the output (default - only GIs or accessions) [T/F]
default = F
-U Use lower case filtering of FASTA sequence [T/F] Optional
default = F
-R Report the log information at the end of output [T/F] Optional
default = F
-p Identity percentage cut-off [Real]
default = 0
-L Location on query sequence [String] Optional
-A Multiple Hits window size [Integer]
default = 0
-y X dropoff value for ungapped extension [Integer]
default = 10
-Z X dropoff value for dynamic programming gapped extension [Integer]
default = 50
-t Length of a discontiguous word template (contiguous word if 0) [Integer]
default = 0
-g Generate words for every base of the database (default is every 4th base; may only be used with discontiguous words) [T/F] Optional
default = F
-n Use non-greedy (dynamic programming) extension for affine gap scores [T/F] Optional
default = F
-N Type of a discontiguous word template (0 - coding, 1 - optimal, 2 - two simultaneous [Integer]
default = 0
-H Maximal number of HSPs to save per database sequence (0 = unlimited) [Integer]
default = 0
-V Force use of the legacy BLAST engine [T/F] Optional
default = F
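
To show how the flags in the listing above fit together, the following sketch assembles an argument list for the legacy megablast program. The helper build_megablast_command, the file names, and the specific option values are assumptions chosen for illustration; only the flag letters come from the listing. Nothing is executed, the command is just printed.

```python
def build_megablast_command(query, database="nr", **options):
    """Build an argument list for the legacy megablast program.

    Keyword names are the single-letter flags from the help listing,
    for example m=8 becomes "-m 8" and T=True becomes "-T T".
    """
    command = ["megablast", "-i", query, "-d", database]
    for flag, value in options.items():
        if isinstance(value, bool):
            value = "T" if value else "F"  # megablast switches take T or F
        command += ["-" + flag, str(value)]
    return command


if __name__ == "__main__":
    # Contiguous megablast: tabular output, 95% identity cut-off.
    # query.fa, nt and hits.tsv are placeholder names.
    print(" ".join(build_megablast_command(
        "query.fa", database="nt", e="1e-5", m=8, p=95, W=28, o="hits.tsv")))

    # Discontiguous variant: -t sets the template length and -N its type.
    # The values 21 and 11 are assumed here, not taken from the listing.
    print(" ".join(build_megablast_command(
        "query.fa", database="nt", t=21, W=11, N=0, m=8)))
```

Building the list programmatically keeps every flag next to its value, which helps when the same search has to be repeated with different cut-offs.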

For a more detailed explanation, see: http://www.ncbi.nlm.nih.gov/blast/producttable.shtml#tab31
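
The tabular output selected with -m 8 (or -m 9, which adds comment lines) is convenient for post-processing. Below is a minimal sketch of reading it and keeping only hits above an identity threshold, similar in spirit to the -p cut-off. It assumes the usual 12 tab-separated columns; the column labels, the function read_hits, and the sample lines are made up for this example and are not real search results.

```python
import csv
import io

# Column labels are names chosen for this sketch; the order follows the
# 12-column tabular layout assumed for -m 8 output.
COLUMNS = ["query", "subject", "identity", "length", "mismatches", "gap_opens",
           "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score"]


def read_hits(handle, min_identity=0.0):
    """Yield one dict per hit whose percent identity is at least min_identity."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the comment lines written by -m 9
        hit = dict(zip(COLUMNS, row))
        hit["identity"] = float(hit["identity"])
        hit["evalue"] = float(hit["evalue"])
        hit["bit_score"] = float(hit["bit_score"])
        if hit["identity"] >= min_identity:
            yield hit


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = (
        "# Fields: Query id, Subject id, % identity, ...\n"
        "seq1\tsubjA\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-100\t370\n"
        "seq1\tsubjB\t81.00\t150\t28\t1\t10\t159\t40\t188\t2e-30\t120\n"
    )
    for hit in read_hits(io.StringIO(sample), min_identity=95):
        print(hit["query"], hit["subject"], hit["identity"])
```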
