Practical tips:
* Overview
Illumina
FASTQ format:
@HWI-EAS59:1:1:0:899#0/1
NAGTAAATCCCTACTTGAATTCGAGCACTGCAACAAACTA
+HWI-EAS59:1:1:0:899#0/1
DNWWZYPYZMKNYS[SBBBBBBBBBBBBBBBBBBBBBBBB
@HWI-EAS59:1:1:1:449#0/1
NGGGAATGATTAATTCCACAAAACAAAAGAAAGAAGTCGG
+HWI-EAS59:1:1:1:449#0/1
DIZZZ[YXS[U[XTWYTXXXRQLGY[YWPPZSXY[BBBBB
@HWI-EAS59:1:1:1:1018#0/1
NGTAGATAAAAAAATAAACTCAAATAAATTAAAGAGAATC
+HWI-EAS59:1:1:1:1018#0/1
DMPTQS[XYYZ[QQQOMJQLRTSWBBBBBBBBBBBBBBBB
@HWI-EAS59:1:1:1:805#0/1
NAGTATGCTATATTATGATATGTTATGAGATGTTATGTTT
+HWI-EAS59:1:1:1:805#0/1
DNUOMJTTXRRLYOBBBBBBBBBBBBBBBBBBBBBBBBBB
@HWI-EAS59:1:1:1:1371#0/1
AGAGATAGTAAAATCTCATAAATTACTATCAATTCATTCA
+HWI-EAS59:1:1:1:1371#0/1
a\aVb`bZYZ_a`aaP[`TGYTYUaZF_Y`a_W]X[HK^P
@HWI-EAS59:1:1:1:1278#0/1
ATAATAATAAAATATAACTGGTATGTTTATTTATTTATTA
+HWI-EAS59:1:1:1:1278#0/1
``aaaa\a`]]JZaXV`a^VR_^YFFN`a_aGQV_STBBB
@HWI-EAS59:1:1:1:907#0/1
ATAATATCAATAAAAAGAAACAACGACAACCTATAAGCAC
+HWI-EAS59:1:1:1:907#0/1
aba`a_a``]Raaa]ZR^aaa_aTQ_aa\J_WQZKRHK_\
@HWI-EAS59:1:1:1:154#0/1
TGATTAAATGCAAATTTAATTTAAAGAACAGCTGAATAAT
+HWI-EAS59:1:1:1:154#0/1
_^^bb`[U___[aa]`\`VX]a[SFODYWGZbBBBBBBBB
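The records above all follow the same four-line layout: an `@` header, the base calls, a `+` line, and one quality character per base. A minimal Python sketch of a reader for that layout follows. It assumes exactly four lines per record with no wrapped sequences, and the names `FastqRecord` and `parse_fastq` are made up for this illustration, not taken from any library.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str      # read identifier, without the leading '@'
    sequence: str  # base calls
    quality: str   # one quality character per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from an iterable of lines (assumes 4 lines per record)."""
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        seq, plus, qual = next(it, None), next(it, None), next(it, None)
        if qual is None:
            raise ValueError(f"truncated record near {header!r}")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], seq, qual)
```

Passing an open file handle works because file objects iterate line by line, e.g. `for rec in parse_fastq(open("reads.fastq"))`, where `reads.fastq` is a placeholder file name.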
The syntax of the Solexa/Illumina read format is almost identical to the FASTQ format, but the qualities are scaled differently. Given a character $sq, the following Perl code gives the Phred quality $Q:
$Q = 10 * log(1 + 10 ** ((ord($sq) - 64) / 10.0)) / log(10);
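For readers who prefer Python, the same conversion can be sketched as below. It first recovers the Solexa quality as the character's ASCII code minus 64, then maps it to the Phred scale with Q = 10 * log10(1 + 10^(Q_solexa / 10)). The function name `solexa_char_to_phred` is invented for this example.

```python
import math


def solexa_char_to_phred(sq: str) -> float:
    """Convert one Solexa quality character to a (fractional) Phred quality."""
    solexa_q = ord(sq) - 64  # Solexa qualities use an ASCII offset of 64
    return 10 * math.log10(1 + 10 ** (solexa_q / 10.0))
```

At high qualities the two scales nearly coincide (`'h'`, Solexa 40, gives a Phred quality just above 40), while at the low end they diverge: `'@'`, Solexa 0, gives a Phred quality of about 3.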
The ASCII characters in Solexa FASTQ mean:
CHAR           ASCII code    Solexa quality (code - 64)
; < = > ? @    59 to 64      -5 to 0
A to Z         65 to 90      1 to 26
[ \ ] ^ _ `    91 to 96      27 to 32
a to h         97 to 104     33 to 40
In contrast to Solexa FASTQ qualities, standard (Sanger) FASTQ uses the following ASCII characters to denote the same qualities:
Solexa quality    Sanger CHAR
-64 to -10        !
-9 to -4          "
-3 to -2          #
-1 to 0           $
1 to 2            %
3 to 4            &
5                 '
6                 (
7                 )
8                 *
9 to 10           +
11 to 64          , to a (ASCII code = Solexa quality + 33)
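Reading a Sanger quality character back is simpler than the Solexa case, because the character is just the Phred quality plus 33. A small Python sketch follows. The helper names are made up here, and the error-probability formula is the standard Phred definition rather than something stated in this post.

```python
def sanger_char_to_phred(c: str) -> int:
    """Phred quality encoded by one Sanger FASTQ character (ASCII offset 33)."""
    return ord(c) - 33


def phred_to_error_prob(q: float) -> float:
    """Estimated probability that the base call is wrong, by the Phred definition."""
    return 10 ** (-q / 10.0)
```

For example, `'!'` decodes to quality 0 and `'I'` to quality 40, and a quality of 20 corresponds to an error probability of about 1 in 100.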
So converting Solexa qualities to Sanger qualities is easy: you just need to build a conversion table in a Perl script, like this:
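# Solexa->Sanger quality conversion table
# $conv_table[$q + 64] holds the Sanger character for Solexa quality $q
my @conv_table;
for (-64..64) {
    # Solexa quality -> Phred quality, rounded, then shifted by the Sanger offset of 33
    $conv_table[$_+64] = chr(int(33 + 10*log(1+10**($_/10.0))/log(10)+.499));
}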
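The same table can be built in Python, together with a helper that applies it to a whole quality string. This is only a sketch that mirrors the Perl loop above. `CONV_TABLE` and `solexa_to_sanger` are names chosen for this example, and the helper assumes every input character has an ASCII code between 0 and 128.

```python
import math

# CONV_TABLE[i] is the Sanger character for Solexa quality i - 64,
# i.e. it is indexed directly by the ASCII code of the Solexa character.
CONV_TABLE = [
    chr(int(33 + 10 * math.log10(1 + 10 ** (q / 10.0)) + 0.499))
    for q in range(-64, 65)
]


def solexa_to_sanger(quality: str) -> str:
    """Re-encode a Solexa quality string as a Sanger quality string."""
    return "".join(CONV_TABLE[ord(c)] for c in quality)
```

With this table, `solexa_to_sanger("hhB;")` returns `'II%"'`: Solexa 40 becomes Phred 40, while the low qualities 2 and -5 become Phred 4 and 1.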
So you first need to determine which sequencing center and which sequencing technology produced your data, and then consider which quality-check and trimming methods to use.
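When the origin of a file is unknown, the range of quality characters gives a rough hint about the encoding. The Python sketch below is a heuristic, not a reliable detector. It relies on two assumptions of mine: characters below `;` (ASCII 59) never occur in offset-64 Solexa data, and characters above `J` (ASCII 74) are unlikely in typical offset-33 short-read data. The function name is made up for this example.

```python
from typing import Iterable, Optional


def guess_quality_offset(quality_lines: Iterable[str]) -> Optional[int]:
    """Guess the ASCII offset (33 or 64) from quality strings, or None if unclear."""
    codes = [ord(c) for line in quality_lines for c in line.strip()]
    if not codes:
        return None
    if min(codes) < 59:   # below ';' : cannot be an offset-64 encoding
        return 33
    if max(codes) > 74:   # above 'J' : assumed too high for offset-33 short reads
        return 64
    return None           # range fits both encodings; inspect more reads
```

It is worth running this over many reads rather than a handful, because a small sample of high-quality reads can fall entirely inside the ambiguous range.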