How to Run TGICL  

2012-04-27 09:55:02 | Category: Bioinformatics


http://sourceforge.net/projects/tgicl/
A sample LSF script file, jobfile, for clustering an EST dataset with TGICL on lewis may include the following lines:

tgicl fastdb
where fastdb is the multi-FASTA file containing all the sequences to be clustered.
For more information about the usage of TGICL, type "tgicl -h" on the command line or see the "README" file.
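
As a rough illustration of what such a jobfile could look like, the sketch below writes one to disk. The #BSUB lines are standard LSF directives, but the job name, the CPU count, and the single-host resource request are assumptions made for this example and are not taken from the lewis documentation. The -c option is the one listed in the usage text (number of CPUs on the local machine).

```shell
#!/bin/bash
# Hypothetical values; adjust them for your own cluster and dataset.
FASTA_DB="fastdb"      # multi-FASTA file with the sequences to cluster
NUM_CPUS=4             # CPUs requested from LSF and passed to tgicl -c
JOB_NAME="tgicl_est"   # illustrative job name, not from the original post

# Write the job file. %J is expanded by LSF to the job ID at run time.
cat > jobfile <<EOF
#!/bin/bash
#BSUB -J ${JOB_NAME}
#BSUB -n ${NUM_CPUS}
#BSUB -R "span[hosts=1]"
#BSUB -o ${JOB_NAME}.%J.out
#BSUB -e ${JOB_NAME}.%J.err

tgicl ${FASTA_DB} -c ${NUM_CPUS}
EOF

echo "Wrote jobfile; submit it with: bsub < jobfile"
```

The span[hosts=1] request keeps all requested slots on one host, which seems the safer choice here because -c with a number refers to CPUs on the local machine. Queue names and other site-specific settings are left out on purpose.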

Usage:

    tgicl <fasta_db> [-q <qualdb>] [-d <refDb>] [-c {<num_CPUs>|<PVM_nodefile>}]
          [-m <user>] [-O 'cap3_options'] [-l <min_overlap>] [-v <max_overhang>]
          [-p <pid>] [-n slicesize] [-s <maxsize>] [-a <cluster_file>] [-M] [-K]
          [-L] [-X] [-I] [-C] [-G] [-W <pairwise_script.psx>] [-A <asm_program.psx>]
          [-P <param_file>] [-u <seq_list>] [-f <prefix_filter>] [-D]

Options:

    -c : use the specified number of CPUs on local machine
         (default 1) or a list of PVM nodes in <PVM_nodefile>

Clustering phase options:

    -d  do not perform all-vs-all search, but search <fasta_db> against
        <refDb> instead; exit after the pairwise hits are generated
    -n  number of sequences in a clustering search slice (default 1000)
    -p  minimum percent identity for overlaps <PID> (default 94)
    -l  minimum overlap length (default 40)
    -G  store gap information for all pairwise alignments
    -v  maximum length of unmatched overhangs (default 30)
    -M  ignore lower-case masking in <fasta_db> sequences
    -W  use custom script <pairwise_script.psx> for the distributed
        pairwise searches instead of the default: tgicl_cluster.psx
    -Z  only run the distributed pairwise searches and exit
        (no sorting of the pairwise overlaps and no clusters generated)
    -Y  only run the distributed pairwise searches
        and the sorted & compressed *_hits.Z file
    -L  performs more restrictive, layout-based clustering
        instead of simple transitive closure

General options:

    -I  do not rebuild database indices
    -s  attempt to split clusters larger than <maxsize> based on
        seeded clustering (only works if there are 'et|'
        or 'np|'-prefixed entries provided in the input file)
    -O  use given 'cap3_options' instead of the default ones
        (-p 93)
    -u  skip the mgblast searches (assumed done) but restrict
        further clustering analysis to only the sequences in <seq_list>
    -C  (TIGR sequences only) always put in the same cluster all reads
        from the same clone
    -t  use <clone_list> file to put in the same cluster all sequence names
        on the same line
    -a  assemble clusters from file <cluster_file>
        (do not perform any pairwise clustering)
    -f  keep only sequence names with prefix <prefix>
    -K  skip the pairwise searches, only recreate the clusters
        by reprocessing the previously obtained overlaps
    -X  do not perform assembly, only generate the cluster file
    -A  use custom script as the slice assembly script
        (instead of tgicl_asm.psx)
    -P  pass the <param_file> as the custom parameter file
        to the assembly program <asmprog.psx>
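
To make the clustering options more concrete, here is a small sketch that assembles a tgicl command line from a few of them. The helper name tgicl_cmd and the file name est.fasta are hypothetical; the defaults (-c 1, -p 94, -l 40, -v 30) are the ones quoted in the usage text. The function only prints the command, so it can be checked before anything is actually run.

```shell
#!/bin/bash
# Print (do not run) a tgicl command line built from a few clustering options.
# Arguments: <fasta_db> [num_CPUs] [min_percent_identity] [min_overlap] [max_overhang]
tgicl_cmd() {
    local fasta_db="$1"
    local cpus="${2:-1}"            # -c, default 1
    local pid="${3:-94}"            # -p, default 94
    local min_overlap="${4:-40}"    # -l, default 40
    local max_overhang="${5:-30}"   # -v, default 30
    echo "tgicl ${fasta_db} -c ${cpus} -p ${pid} -l ${min_overlap} -v ${max_overhang}"
}

# Stricter than the defaults: 95% identity, 50 bp overlap, 20 bp overhang, 4 CPUs.
tgicl_cmd est.fasta 4 95 50 20
# prints: tgicl est.fasta -c 4 -p 95 -l 50 -v 20
```

Raising -p and -l while lowering -v makes the overlap criteria stricter, which would be expected to yield smaller, more conservative clusters.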