From now on I plan to do all of my BLAST-related work with BLAST+.

As with the old BLAST, we start by formatting a database and then move on to the alignment. Usually we have a FASTA file that we want to turn into a database. The old command for this was formatdb; the new one is makeblastdb.

The form I typically use is:

    makeblastdb -in input_file -dbtype molecule_type -title database_title -parse_seqids -out database_name -logfile File_Name

Note: with BLAST+ 2.2.24, do not add -parse_seqids, otherwise the program goes into an infinite loop.

-in            The input file: the FASTA sequences you want to format.
-dbtype        The sequence type: nucl for nucleotide, prot for protein.
-title         A title for the database. It is purely cosmetic and cannot be used as the -db argument in later searches.
-parse_seqids  Recommended (except on 2.2.24, see the note above), although I have not yet worked out exactly why.
-out           The database name. Pick something meaningful, because this is the value you pass to -db in later BLAST+ searches.
-logfile       The log file. If omitted, the log is printed to the screen.

This is quite different from the old formatdb.

Running makeblastdb with -help prints the following:

    makeblastdb -help

    USAGE
      makeblastdb [-h] [-help] [-in input_file] [-dbtype molecule_type]
        [-title database_title] [-parse_seqids] [-hash_index]
        [-mask_data mask_data_files] [-out database_name]
        [-max_file_sz number_of_bytes] [-taxid TaxID]
        [-taxid_map TaxIDMapFile] [-logfile File_Name] [-version]

    DESCRIPTION
       Application to create BLAST databases, version 2.2.23+

    OPTIONAL ARGUMENTS
     -h
       Print USAGE and DESCRIPTION; ignore other arguments
     -help
       Print USAGE, DESCRIPTION and ARGUMENTS description; ignore other arguments
     -version
       Print version number; ignore other arguments

     *** Input options
     -in <File_In>
       Input file/database name; the data type is automatically detected, it may
       be any of the following:
            FASTA file(s) and/or
            BLAST database(s)
       Default = `-'
     -dbtype <String, `nucl', `prot'>
       Molecule type of input
       Default = `prot'

     *** Configuration options
     -title <String>
       Title for BLAST database
       Default = input file name provided to -in argument
     -parse_seqids
       Parse Seq-ids in FASTA input
     -hash_index
       Create index of sequence hash values.

     *** Sequence masking options
     -mask_data <String>
       Comma-separated list of input files containing masking data as produced by
       NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)

     *** Output options
     -out <String>
       Name of BLAST database to be created
       Default = input file name provided to -in argument
       Required if multiple file(s)/database(s) are provided as input
     -max_file_sz <String>
       Maximum file size for BLAST database files
       Default = `1GB'

     *** Taxonomy options
     -taxid <Integer, >=0>
       Taxonomy ID to assign to all sequences
        * Incompatible with:  taxid_map
     -taxid_map <File_In>
       Text file mapping sequence IDs to taxonomy IDs.
       Format:<SequenceId> <TaxonomyId><newline>
        * Incompatible with:  taxid
     -logfile <File_Out>
       File to which the program log should be redirected
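To tie the options together, here is a minimal sketch of how the command might be assembled in a shell script. The file name example.fa and the database name example_db are hypothetical placeholders, not names from this post. The sketch leaves out -parse_seqids because of the 2.2.24 problem mentioned above, and it only runs makeblastdb if the program and the input file are actually present; otherwise it just prints the command.

```shell
#!/bin/sh
# Hypothetical names for illustration only; replace with your own files.
fasta="example.fa"      # FASTA file to format
dbtype="nucl"           # nucl for nucleotide, prot for protein
dbname="example_db"     # name later passed to -db when searching

# Assemble the command as positional parameters so each argument stays intact.
# -parse_seqids is left out on purpose (see the note about BLAST+ 2.2.24).
set -- makeblastdb -in "$fasta" -dbtype "$dbtype" \
    -title "$dbname" -out "$dbname" -logfile "$dbname.log"

# Keep a printable copy of the command for logging or a dry run.
makeblastdb_cmd="$*"
echo "$makeblastdb_cmd"

# Run it only when makeblastdb is on PATH and the input file exists.
if command -v makeblastdb >/dev/null 2>&1 && [ -f "$fasta" ]; then
    "$@"
else
    echo "skipped: makeblastdb or $fasta not found"
fi
```

Once the database has been built, the value given to -out is what a later search would take as -db, for example blastn -db example_db -query query.fa, where query.fa is again a hypothetical file name.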