- Prof. André de Carvalho
Prof. André de Carvalho, ICMC-Universidade de São Paulo
Year  Base pairs      Sequences
2000  11,101,066,288  10,106,023
2001  15,849,921,438  14,976,310
2002  28,507,990,166  22,318,883
2003  36,553,368,485  30,968,418
2004  44,575,745,176  40,604,319
2005  56,037,734,462  52,016,762
2006  69,019,290,705  64,893,747
2007  83,874,179,730  80,388,382
2008  99,116,431,942  98,868,465
- Emphasis is progressively moving from data accumulation to data interpretation
Data resulting from sequencing projects
These data need to be analysed
Analysis in laboratories is difficult and expensive
Sophisticated computational tools are needed
Data mining
Nucleotides, amino acids, gene expression
- They are only assumptions
Recent discoveries contradict this dogma:
RNA can replicate in some viruses and plants
Viral RNA can be transcribed into DNA by an enzyme named reverse transcriptase
DNA can directly produce specific proteins without going through the transcription process
A C G U C G A G G C C U G A G G U A . . .
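As an illustration of how such an RNA sequence is read, the sketch below splits it into three-base codons and looks each one up. The names `CODON_TABLE`, `split_codons` and `translate` are hypothetical, and the table is only the subset of the standard genetic code needed for the bases shown on the slide, not a complete table.

```python
# Subset of the standard genetic code covering only the codons in the
# slide's sequence (an assumption made for this sketch; the full code
# has 64 entries).
CODON_TABLE = {
    "ACG": "Thr",
    "UCG": "Ser",
    "AGG": "Arg",
    "CCU": "Pro",
    "GAG": "Glu",
    "GUA": "Val",
}

def split_codons(rna):
    """Split an RNA string into consecutive three-base codons."""
    rna = rna.replace(" ", "")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return [rna[i:i + 3] for i in range(0, usable, 3)]

def translate(rna):
    """Map each codon to its amino acid using the partial table above."""
    return [CODON_TABLE[codon] for codon in split_codons(rna)]

sequence = "A C G U C G A G G C C U G A G G U A"
codons = split_codons(sequence)
amino_acids = translate(sequence)
print(codons)
print(amino_acids)
```

A codon that is missing from the partial table raises `KeyError`, which is deliberate here so that the sketch never guesses an amino acid.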
[Figure: matrix with rows Gene 1, Gene 2, ..., Gene n and columns Condition 1, ..., Condition m]
Examples may belong to more than one class simultaneously
[Figure: regions labelled Cough; Aches; Fever; Cough and Aches; Cough and Fever; Aches and Fever; Cough, Aches and Fever. Axes: Temp. and Pres.]
Multi-label problem
Instance  Classes
1         A and B
2         A
3         A and B
4         C
5         B
6         A

Single-label problem
Classifier  Positive    Negative
A           1, 2, 3, 6  4, 5
B           1, 3, 5     2, 4, 6
C           4           1, 2, 3, 5, 6
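The decomposition in this slide, one binary classifier per class with the remaining instances as negatives, can be sketched as follows. The function name `one_against_all` and the dictionary representation are assumptions made for illustration; only the six instances and their classes come from the slide.

```python
# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def one_against_all(dataset):
    """Return {class: (positive ids, negative ids)}, one binary problem per class."""
    classes = sorted(set().union(*dataset.values()))
    problems = {}
    for c in classes:
        positive = sorted(i for i, labels in dataset.items() if c in labels)
        negative = sorted(i for i, labels in dataset.items() if c not in labels)
        problems[c] = (positive, negative)
    return problems

binary_problems = one_against_all(multi_label)
for c, (positive, negative) in binary_problems.items():
    print(c, "positive:", positive, "negative:", negative)
```

Each instance appears in every binary problem, so a multi-label instance such as instance 1 is a positive example for both A and B.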
Multi-label problem
Instance  Classes
1         A and B
2         A
3         A and B
4         C
5         B
6         A

Single-label problem
Instance  Class
2         A
4         C
5         B
6         A
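In this slide the single-label problem keeps only the instances that already have exactly one class. A minimal sketch of that filtering step, with the hypothetical name `drop_multi_label_instances`:

```python
# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def drop_multi_label_instances(dataset):
    """Keep only instances with exactly one class; map each to that class."""
    return {
        i: next(iter(labels))
        for i, labels in dataset.items()
        if len(labels) == 1
    }

single_label = drop_multi_label_instances(multi_label)
print(single_label)
```

The transformation is simple, but it discards instances 1 and 3 entirely, so the resulting classifier never sees an example of A and B occurring together.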
Instance  Classes (multi-label problem)  Class (single-label problem)
1         A and B                        D
2         A                              A
3         A and B                        D
4         C                              C
5         B                              B
6         A                              A
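Here every distinct combination of classes is replaced by one new class, so "A and B" becomes D. The sketch below reproduces that mapping; the function name `create_combined_classes` and the rule of taking the next unused capital letter are assumptions chosen to match the slide's D.

```python
from string import ascii_uppercase

# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def create_combined_classes(dataset):
    """Replace each multi-class combination with a single new class name."""
    existing = set().union(*dataset.values())
    # Naming rule assumed for this sketch: next capital letter not yet in use.
    unused = (name for name in ascii_uppercase if name not in existing)
    new_names = {}
    single = {}
    for i in sorted(dataset):
        labels = frozenset(dataset[i])
        if len(labels) == 1:
            single[i] = next(iter(labels))
        else:
            if labels not in new_names:
                new_names[labels] = next(unused)
            single[i] = new_names[labels]
    return single, new_names

single_label, new_names = create_combined_classes(multi_label)
print(single_label)
print(new_names)
```

The number of classes can grow quickly with this approach, since each distinct combination that occurs in the data adds a class.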
Instance  Classes (multi-label problem)  Class (single-label problem)
1         A and B                        A
2         A                              A
3         A and B                        B
4         C                              C
5         B                              B
6         A                              A
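In this slide each multi-label instance keeps just one of its classes (A for instance 1, B for instance 3). The slide's selection rule is not legible, so the sketch below assumes a random choice; `select_one_class` and the seed are hypothetical.

```python
import random

# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def select_one_class(dataset, rng):
    """Keep one class per instance, chosen at random among its classes."""
    # Sorting first makes the choice reproducible for a given seed.
    return {i: rng.choice(sorted(labels)) for i, labels in sorted(dataset.items())}

single_label = select_one_class(multi_label, random.Random(0))
print(single_label)
```

A deterministic rule, such as always keeping the most frequent class, could be swapped in by replacing `rng.choice`.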
Multi-label problem and the two single-label problems derived from it

Instance  Classes  Class (problem 1)  Class (problem 2)
1         A, B     B                  A
2         A        A                  A
3         A, B     B                  A
4         C        C                  C
5         B        B                  B
6         A        A                  A
Multi-label problem and the four single-label problems derived from it

Instance  Classes  Problem 1  Problem 2  Problem 3  Problem 4
1         A, B     B          B          A          A
2         A        A          A          A          A
3         A, B     A          B          A          B
4         C        C          C          C          C
5         B        B          B          B          B
6         A        A          A          A          A
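The four single-label problems in this slide cover every way of keeping one class for instance 1 and one class for instance 3. A minimal sketch that enumerates those combinations with `itertools.product`; the function name `all_single_label_problems` is hypothetical.

```python
from itertools import product

# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def all_single_label_problems(dataset):
    """Enumerate every dataset obtained by keeping one class per instance."""
    ids = sorted(dataset)
    choices = [sorted(dataset[i]) for i in ids]
    return [dict(zip(ids, combo)) for combo in product(*choices)]

problems = all_single_label_problems(multi_label)
for number, problem in enumerate(problems, start=1):
    print(number, problem)
```

The number of generated problems is the product of the class counts of all instances (2 x 2 = 4 here), so it grows quickly as more instances carry several classes.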
Class code

Instance  Classes  A  B  C
1         A and B  1  1  0
2         A        1  0  0
3         A and B  1  1  0
4         C        0  0  1
5         B        0  1  0
6         A        1  0  0
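The class code assigns each instance a binary vector with one position per class. A short sketch of how such codes could be computed, with the hypothetical function name `class_codes`:

```python
# Multi-label training set from the slide: instance id -> set of classes.
multi_label = {
    1: {"A", "B"},
    2: {"A"},
    3: {"A", "B"},
    4: {"C"},
    5: {"B"},
    6: {"A"},
}

def class_codes(dataset):
    """Return the class order and a 0/1 vector per instance in that order."""
    classes = sorted(set().union(*dataset.values()))
    codes = {
        i: [int(c in dataset[i]) for c in classes]
        for i in sorted(dataset)
    }
    return classes, codes

classes, codes = class_codes(multi_label)
print(classes)
for i, code in codes.items():
    print(i, code)
```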
[Figure: Accuracy (axis from 5 to 60) of KNN, SVM, C4.5, Ripper and BN, one panel each for OAA, Cross-Training and Label-Powerset]
[Figure: Accuracy (axis from 70 to 100) of KNN, SVM, C4.5, Ripper and BN, one panel each for OAA, Cross-Training and Label-Powerset]
[Figure labels: Sick, Spanish, SARS, Common, Cough, Aches, Fever]
Level 1 Level 2