今天接着昨天的任务继续。昨天已经实现了文件读取了表达差异基因的筛选工作,今天的任务就是把这些基因在代谢途径中显示出来。原代码如下:
#!/usr/bin/perl
use SOAP::Lite;
$wsdl = ‘http://soap.genome.jp/KEGG.wsdl’;
$soap_ser = SOAP::Lite->service($wsdl);
$filename = “C:\\TPS1NS.xls”;
open(MICROARRAY, $filename)||die (“Could not open file”);
$up_reg = 0;
$down_reg = 0;
@map_hash = ();
@path_hash = ();
$j = 0;
$times = 1;
print “gene_name\texp_ratio\tpath\n”;
while($line = <MICROARRAY>){
$pattern = “\x09″;
@words = split(/$pattern/, $line);
if ($words[2] eq “P” && $words[4] eq “P”){
$ratio=log($words[1])-log($words[3]);
if(abs($ratio)>=$times){
if($ratio>=$times){
$up_reg += 1;
}
if($ratio<=-$times){
$down_reg +=1;
}
if($words[5]=~/DB_XREF=(\D+)(\d+)\s/){
$db_pre=$1;
$db_id=$2;
if($db_pre=~/GI/){
@kegg_id = split(/\s+/,$soap_ser->bconv(“ncbi-gi:$db_id”));
for($i=0;$i<3;$i++){
$map_hash[$j][$i] = $kegg_id[$i];
}
$map_hash[$j][3] = $ratio;
#通过KEGG API获取基因所在的代谢途径,并存贮起来。
$map_hash[$j][4]=$soap_ser->get_pathways_by_genes([$map_hash[$j][1]]);
#如果返回的代谢途径不为空,则将其打印出来。
if(@{$map_hash[$j][4]}){
foreach $hit(@{$map_hash[$j][4]}){
my %path = (“gene_name”=>$map_hash[$j][1],
“exp_ratio”=>$map_hash[$j][3],
“path”=>$hit);
push @path_hash, \%path;
print “$path{gene_name}\t$path{exp_ratio}\t$path{path}\n”;
}
}
$j++;
}
}
}
}
}
close(MICROARRAY);
#将相关的代谢途径打印一遍
print “\nrelated pathways\n”;
for($i=0;$i<scalar(@path_hash);$i++){
$tmp = $path_hash[$i];
%path_t = %$tmp;
print $path_t{path}.”\n”;
}
#以代谢途径名排序后再打印一遍
print “\nsorted related pathways\n”;
@path_sort = sort {$a->{path} cmp $b->{path}} @path_hash;
for($i=0;$i<scalar(@path_sort);$i++){
$tmp = $path_sort[$i];
%path_t = %$tmp;
print “$path_t{gene_name}\t$path_t{exp_ratio}\t$path_t{path}\n”;
}
#通过KEGG API获取代谢图,并在图上以颜色标注差异表达的基因。
@url_list = ();
$path_tmp = “none”;
@obj_list = ();
@fg_list = ();
@bg_list = ();
for($i=0;$i<scalar(@path_sort);$i++){
$tmp = $path_sort[$i];
%path_t = %$tmp;
if($path_tmp eq “none”){
$path_tmp = $path_t{path};
push @obj_list, $path_t{gene_name};
push @fg_list, “#ff0000″;
push @bg_list, “#ffff00″;
next;
}
elsif($path_t{path} eq $path_tmp){
push @obj_list, $path_t{gene_name};
push @fg_list, “#ff0000″;
push @bg_list, “#ffff00″;
next;
}
else{
$object_url=$soap_ser->color_pathway_by_objects($path_tmp,
\@obj_list,\@fg_list,\@bg_list);
# print “$path_tmp,@obj_list,@fg_list,@bg_list\n” ;
push @url_list, $object_url;
$path_tmp = $path_t{path};
@obj_list=($path_t{gene_name});
@fg_list=(“#ff0000″);
@bg_list=(“#ffff00″);
if($i==scalar(@path_sort)-1){
$object_url=$soap_ser->color_pathway_by_objects($path_tmp,
\@obj_list,\@fg_list,\@bg_list);
push @url_list, $object_url;
}
}
}
#用于显示图片
foreach $url(@url_list){
print “<img src=$url>\n”;
}
print $up_reg.” genes up regulated. And “.$down_reg.” genes down regulated.”;
print “\nThis is the end!”;
#将Perl数组转换成SOAP数组
sub SOAP::Serializer::as_ArrayOfstring{
my ($self, $value, $name, $type, $attr) = @_;
return [$name, {'xsi:type' => 'array', %$attr}, $value];
}
sub SOAP::Serializer::as_ArrayOfint{
my ($self, $value, $name, $type, $attr) = @_;
return [$name, {'xsi:type' => 'array', %$attr}, $value];
}
运行结果如下:
gene_name exp_ratio path
sce:YDR468C 1.349167127091 path:sce04130
sce:YGL201C 1.06268290373857 path:sce03030
sce:YGL201C 1.06268290373857 path:sce04111
sce:YLR340W 2.44280064588198 path:sce03010
related pathways
path:sce04130
path:sce03030
path:sce04111
path:sce03010
sorted related pathways
sce:YLR340W 2.44280064588198 path:sce03010
sce:YGL201C 1.06268290373857 path:sce03030
sce:YGL201C 1.06268290373857 path:sce04111
sce:YDR468C 1.349167127091 path:sce04130
<img src=http://soap.genome.jp/tmp/mark_pathway_www_api.125119056924788/sce03010.png>
<img src=http://soap.genome.jp/tmp/mark_pathway_www_api.125119057124812/sce03030.png>
<img src=http://soap.genome.jp/tmp/mark_pathway_www_api.125119057324832/sce04111.png>
<img src=http://soap.genome.jp/tmp/mark_pathway_www_api.125119057524854/sce04130.png>
31 genes up regulated. And 0 genes down regulated.
This is the end!
作业完成。下一步,应该探索如何对基因进行功能分类。
评论