ANNOVAR input with new prefix and new directory for output

To change the prefix of the output files, use option -p or --prefix. To change the directory where the output is generated, use option --outputdir. The syntax is the same for other input formats, such as VCF and BED.

[cocodong@biocluster ~/]$ head input.txt 
1   12919840    12919840    T   C
1   35332717    35332717    C   A
1   55148456    55148456    G   T
1   70504789    70504789    C   T
1   167059520   167059520   A   T
1   182496864   182496864   A   T
1   197073351   197073351   C   T
1   216373211   216373211   G   T
10  37490170    37490170    G   A
10  56089432    56089432    A   C
[cocodong@biocluster ~/]$ icages.pl input.txt -p newname --outputdir newoutputdir
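
Before running iCAGES on a large file, it can help to confirm that every line follows the five-column ANNOVAR layout shown above (chromosome, start, end, reference allele, alternative allele). The following is a minimal Python sketch of such a check. The function name parse_annovar_line is hypothetical and is not part of iCAGES or ANNOVAR; it only tests the column layout, not whether the alleles match the reference genome.

```python
# Illustrative pre-flight check; NOT part of iCAGES or ANNOVAR.
# Assumes whitespace-separated columns: chr, start, end, ref, alt.

def parse_annovar_line(line):
    """Parse one ANNOVAR-style input line and return (chrom, start, end, ref, alt)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}: {line!r}")
    chrom, start, end, ref, alt = fields[:5]
    start, end = int(start), int(end)  # raises ValueError on non-numeric positions
    if end < start:
        raise ValueError(f"end position precedes start position: {line!r}")
    return chrom, start, end, ref, alt


# The first line of input.txt from the example above.
record = parse_annovar_line("1   12919840    12919840    T   C")
print(record)
```

Running the function over every line of input.txt and stopping at the first ValueError gives a quick way to locate a malformed line.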

ANNOVAR input annotated with hg38

To change the database version used for annotation, use option --buildver. The syntax is the same for other input formats, such as VCF and BED.

[cocodong@biocluster ~/]$ head input.txt 
1   12919840    12919840    T   C
1   35332717    35332717    C   A
1   55148456    55148456    G   T
1   70504789    70504789    C   T
1   167059520   167059520   A   T
1   182496864   182496864   A   T
1   197073351   197073351   C   T
1   216373211   216373211   G   T
10  37490170    37490170    G   A
10  56089432    56089432    A   C
[cocodong@biocluster ~/]$ icages.pl input.txt --buildver hg38

VCF input with one sample which contains both germline mutations and tumor mutations

If you do not have a VCF file of somatic mutations for your sample, but instead have a VCF file that contains both germline mutations and tumor mutations for this sample, you can specify the header for tumor mutations using option -t or --tumor and the header for germline mutations using option -g or --germline. iCAGES will extract somatic mutations from this VCF file and carry out the downstream analysis for you. In this example, the input file is a VCF file that contains tumor mutations with header "tumor" and germline mutations with header "germline", all annotated with reference genome version hg19.

[cocodong@biocluster ~/]$ cat input.vcf
##fileformat=VCFv4.1
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,length=249250621,assembly=b37>
##contig=<ID=2,length=243199373,assembly=b37>
##contig=<ID=3,length=198022430,assembly=b37>
##contig=<ID=4,length=191154276,assembly=b37>
##contig=<ID=5,length=180915260,assembly=b37>
##contig=<ID=6,length=171115067,assembly=b37>
##contig=<ID=7,length=159138663,assembly=b37>
##contig=<ID=8,length=146364022,assembly=b37>
##contig=<ID=9,length=141213431,assembly=b37>
##contig=<ID=10,length=135534747,assembly=b37>
##contig=<ID=11,length=135006516,assembly=b37>
##contig=<ID=12,length=133851895,assembly=b37>
##contig=<ID=13,length=115169878,assembly=b37>
##contig=<ID=14,length=107349540,assembly=b37>
##contig=<ID=15,length=102531392,assembly=b37>
##contig=<ID=16,length=90354753,assembly=b37>
##contig=<ID=17,length=81195210,assembly=b37>
##contig=<ID=18,length=78077248,assembly=b37>
##contig=<ID=19,length=59128983,assembly=b37>
##contig=<ID=20,length=63025520,assembly=b37>
##contig=<ID=21,length=48129895,assembly=b37>
##contig=<ID=22,length=51304566,assembly=b37>
##contig=<ID=X,length=155270560,assembly=b37>
##contig=<ID=Y,length=59373566,assembly=b37>
##contig=<ID=MT,length=16569,assembly=b37>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  tumor   germline
1   12919840    .   T   C   .   .   .   GT  1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 0|0
1   55148456    .   G   T   .   .   .   GT  1|1 0|0
1   70504789    .   C   T   .   .   .   GT  1|1 0|0
1   167059520   .   A   T   .   .   .   GT  1|1 0|0
1   182496864   .   A   T   .   .   .   GT  1|1 0|0
1   197073351   .   C   T   .   .   .   GT  1|1 0|0
1   216373211   .   G   T   .   .   .   GT  1|1 0|0
10  37490170    .   G   A   .   .   .   GT  1|1 0|0
10  56089432    .   A   C   .   .   .   GT  1|1 0|0
...
[cocodong@biocluster ~/]$ icages.pl input.vcf -t tumor -g germline
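
To make the idea of extracting somatic mutations from paired columns concrete, here is a minimal Python sketch. It is not iCAGES's internal code: it assumes that a call counts as somatic when the tumor genotype carries an alternative allele and the germline genotype does not, and the function names has_alt and somatic_variants are hypothetical. Columns are split on any whitespace to match the listing above; real VCF files are tab-delimited.

```python
# Illustrative sketch only; NOT the iCAGES implementation.
# Assumption: somatic = ALT allele present in tumor, absent in germline.

def has_alt(genotype):
    """Return True if a GT value such as '1|1' or '0/1' carries a non-reference allele."""
    alleles = genotype.split(":")[0].replace("|", "/").split("/")
    return any(a not in ("0", ".") for a in alleles)


def somatic_variants(vcf_lines, tumor, germline):
    """Yield (chrom, pos, ref, alt) for calls seen in `tumor` but not in `germline`."""
    t_idx = g_idx = None
    for line in vcf_lines:
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split()
        if line.startswith("#CHROM"):
            t_idx = fields.index(tumor)     # column holding the tumor genotypes
            g_idx = fields.index(germline)  # column holding the germline genotypes
            continue
        if t_idx is None:
            raise ValueError("data line found before the #CHROM header")
        if has_alt(fields[t_idx]) and not has_alt(fields[g_idx]):
            yield fields[0], int(fields[1]), fields[3], fields[4]


# Two rows from the example above; the germline genotype of the second row
# is changed to 1|1 here purely to show a call being filtered out.
example = """\
##fileformat=VCFv4.1
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  tumor   germline
1   12919840    .   T   C   .   .   .   GT  1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 1|1
""".splitlines()

calls = list(somatic_variants(example, "tumor", "germline"))
print(calls)
```

Only the first row survives, because the second row carries the alternative allele in both columns and is therefore treated as germline under the stated assumption.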

VCF input with multiple samples which contains both germline mutations and tumor mutations

iCAGES is a personalized cancer driver analysis pipeline, so it only does analysis for ONE single patient. But if what you have is a VCF file that contains both germline mutations and tumor mutations for multiple individuals, you can specify the header for tumor mutations for the patient of your interest using option -t or --tumor and the header for germline mutations for the same patient using option -g or --germline. iCAGES will extract somatic mutations for this particular individual from the VCF file and carry out the downstream analysis for you. In this example, the input file is a VCF file that contains mutations from two individuals, Sample1 and Sample2, each of whom has both tumor mutations and germline mutations under slightly different headers, all annotated with reference genome version hg19. By specifying the headers for Sample1, you tell iCAGES to analyze this sample.

[cocodong@biocluster ~/]$ cat input.vcf
##fileformat=VCFv4.1
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,length=249250621,assembly=b37>
##contig=<ID=2,length=243199373,assembly=b37>
##contig=<ID=3,length=198022430,assembly=b37>
##contig=<ID=4,length=191154276,assembly=b37>
##contig=<ID=5,length=180915260,assembly=b37>
##contig=<ID=6,length=171115067,assembly=b37>
##contig=<ID=7,length=159138663,assembly=b37>
##contig=<ID=8,length=146364022,assembly=b37>
##contig=<ID=9,length=141213431,assembly=b37>
##contig=<ID=10,length=135534747,assembly=b37>
##contig=<ID=11,length=135006516,assembly=b37>
##contig=<ID=12,length=133851895,assembly=b37>
##contig=<ID=13,length=115169878,assembly=b37>
##contig=<ID=14,length=107349540,assembly=b37>
##contig=<ID=15,length=102531392,assembly=b37>
##contig=<ID=16,length=90354753,assembly=b37>
##contig=<ID=17,length=81195210,assembly=b37>
##contig=<ID=18,length=78077248,assembly=b37>
##contig=<ID=19,length=59128983,assembly=b37>
##contig=<ID=20,length=63025520,assembly=b37>
##contig=<ID=21,length=48129895,assembly=b37>
##contig=<ID=22,length=51304566,assembly=b37>
##contig=<ID=X,length=155270560,assembly=b37>
##contig=<ID=Y,length=59373566,assembly=b37>
##contig=<ID=MT,length=16569,assembly=b37>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1Tumor    Sample1Germline Sample2Tumor    Sample2Germline
1   12919840    .   T   C   .   .   .   GT  1|1 0|0 1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 0|0 1|1 0|0
1   55148456    .   G   T   .   .   .   GT  1|1 0|0 1|1 0|0
1   70504789    .   C   T   .   .   .   GT  1|1 0|0 1|1 0|0
1   167059520   .   A   T   .   .   .   GT  1|1 0|0 1|1 0|0
1   182496864   .   A   T   .   .   .   GT  1|1 0|0 1|1 0|0
1   197073351   .   C   T   .   .   .   GT  1|1 0|0 1|1 0|0
1   216373211   .   G   T   .   .   .   GT  1|1 0|0 1|1 0|0
10  37490170    .   G   A   .   .   .   GT  1|1 0|0 1|1 0|0
10  56089432    .   A   C   .   .   .   GT  1|1 0|0 1|1 0|0
...
[cocodong@biocluster ~/]$ icages.pl input.vcf -t Sample1Tumor -g Sample1Germline

VCF input with multiple samples which contains only somatic mutations

Again, iCAGES is a personalized cancer driver analysis pipeline, so it only does analysis for ONE single patient. But if what you have is a VCF file that contains somatic mutations for multiple individuals, you can specify the header for the patient of your interest using option -i or --id. iCAGES will extract somatic mutations for this particular individual from the VCF file and carry out the downstream analysis for you. In this example, the input file is a VCF file that contains somatic mutations from two individuals, Sample1 and Sample2, all annotated with reference genome version hg19. By specifying the header for Sample1, you tell iCAGES to analyze this sample.

[cocodong@biocluster ~/]$ cat input.vcf
##fileformat=VCFv4.1
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,length=249250621,assembly=b37>
##contig=<ID=2,length=243199373,assembly=b37>
##contig=<ID=3,length=198022430,assembly=b37>
##contig=<ID=4,length=191154276,assembly=b37>
##contig=<ID=5,length=180915260,assembly=b37>
##contig=<ID=6,length=171115067,assembly=b37>
##contig=<ID=7,length=159138663,assembly=b37>
##contig=<ID=8,length=146364022,assembly=b37>
##contig=<ID=9,length=141213431,assembly=b37>
##contig=<ID=10,length=135534747,assembly=b37>
##contig=<ID=11,length=135006516,assembly=b37>
##contig=<ID=12,length=133851895,assembly=b37>
##contig=<ID=13,length=115169878,assembly=b37>
##contig=<ID=14,length=107349540,assembly=b37>
##contig=<ID=15,length=102531392,assembly=b37>
##contig=<ID=16,length=90354753,assembly=b37>
##contig=<ID=17,length=81195210,assembly=b37>
##contig=<ID=18,length=78077248,assembly=b37>
##contig=<ID=19,length=59128983,assembly=b37>
##contig=<ID=20,length=63025520,assembly=b37>
##contig=<ID=21,length=48129895,assembly=b37>
##contig=<ID=22,length=51304566,assembly=b37>
##contig=<ID=X,length=155270560,assembly=b37>
##contig=<ID=Y,length=59373566,assembly=b37>
##contig=<ID=MT,length=16569,assembly=b37>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1 Sample2
1   12919840    .   T   C   .   .   .   GT  1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 1|1
1   55148456    .   G   T   .   .   .   GT  1|1 0|0
1   70504789    .   C   T   .   .   .   GT  1|1 1|1
1   167059520   .   A   T   .   .   .   GT  1|1 0|0
1   182496864   .   A   T   .   .   .   GT  0|0 0|0
1   197073351   .   C   T   .   .   .   GT  1|1 1|1
1   216373211   .   G   T   .   .   .   GT  1|1 0|0
10  37490170    .   G   A   .   .   .   GT  1|1 0|0
10  56089432    .   A   C   .   .   .   GT  1|1 0|0
...
[cocodong@biocluster ~/]$ icages.pl input.vcf -i Sample1
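
The selection of one individual from a multi-sample VCF file can be sketched in Python as follows. This is an illustration, not iCAGES's own code: it assumes that a row belongs to the chosen individual when that individual's genotype carries an alternative allele, and the function name variants_for_sample is hypothetical.

```python
# Illustrative sketch only; NOT the iCAGES implementation.
# Assumption: a row is kept when the chosen sample's GT has a non-reference allele.

def variants_for_sample(vcf_lines, sample_id):
    """Return (chrom, pos, ref, alt) tuples for rows where `sample_id` carries ALT."""
    col = None
    kept = []
    for line in vcf_lines:
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split()
        if fields[0] == "#CHROM":
            col = fields.index(sample_id)  # raises ValueError if the ID is absent
            continue
        if col is None:
            raise ValueError("data line found before the #CHROM header")
        alleles = fields[col].split(":")[0].replace("|", "/").split("/")
        if any(a not in ("0", ".") for a in alleles):
            kept.append((fields[0], int(fields[1]), fields[3], fields[4]))
    return kept


# Three rows taken from the example above.
example = """\
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1 Sample2
1   12919840    .   T   C   .   .   .   GT  1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 1|1
1   182496864   .   A   T   .   .   .   GT  0|0 0|0
""".splitlines()

sample1 = variants_for_sample(example, "Sample1")
sample2 = variants_for_sample(example, "Sample2")
print(sample1)
print(sample2)
```

With these rows, Sample1 keeps the first two positions and Sample2 keeps only the second; the third row is dropped for both because neither genotype carries the alternative allele.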

VCF input with multiple samples and BED files with additional structural variations

The VCF format has only immature support for annotating structural variations. To better annotate personal cancer mutation profiles, iCAGES supports an additional BED file input that describes structural variations, given with option -b or --bed. iCAGES combines the information from the VCF file and the BED file for the downstream analysis. In this example, the input files are a VCF file that contains somatic mutations from two individuals, Sample1 and Sample2, all annotated with reference genome version hg19, and a BED file that contains the coordinates of structural variations. This example BED file is also provided in the package.

[cocodong@biocluster ~/]$ cat input.vcf
##fileformat=VCFv4.1
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,length=249250621,assembly=b37>
##contig=<ID=2,length=243199373,assembly=b37>
##contig=<ID=3,length=198022430,assembly=b37>
##contig=<ID=4,length=191154276,assembly=b37>
##contig=<ID=5,length=180915260,assembly=b37>
##contig=<ID=6,length=171115067,assembly=b37>
##contig=<ID=7,length=159138663,assembly=b37>
##contig=<ID=8,length=146364022,assembly=b37>
##contig=<ID=9,length=141213431,assembly=b37>
##contig=<ID=10,length=135534747,assembly=b37>
##contig=<ID=11,length=135006516,assembly=b37>
##contig=<ID=12,length=133851895,assembly=b37>
##contig=<ID=13,length=115169878,assembly=b37>
##contig=<ID=14,length=107349540,assembly=b37>
##contig=<ID=15,length=102531392,assembly=b37>
##contig=<ID=16,length=90354753,assembly=b37>
##contig=<ID=17,length=81195210,assembly=b37>
##contig=<ID=18,length=78077248,assembly=b37>
##contig=<ID=19,length=59128983,assembly=b37>
##contig=<ID=20,length=63025520,assembly=b37>
##contig=<ID=21,length=48129895,assembly=b37>
##contig=<ID=22,length=51304566,assembly=b37>
##contig=<ID=X,length=155270560,assembly=b37>
##contig=<ID=Y,length=59373566,assembly=b37>
##contig=<ID=MT,length=16569,assembly=b37>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample1 Sample2
1   12919840    .   T   C   .   .   .   GT  1|1 0|0
1   35332717    .   C   A   .   .   .   GT  1|1 1|1
1   55148456    .   G   T   .   .   .   GT  1|1 0|0
1   70504789    .   C   T   .   .   .   GT  1|1 1|1
1   167059520   .   A   T   .   .   .   GT  1|1 0|0
1   182496864   .   A   T   .   .   .   GT  0|0 0|0
1   197073351   .   C   T   .   .   .   GT  1|1 1|1
1   216373211   .   G   T   .   .   .   GT  1|1 0|0
10  37490170    .   G   A   .   .   .   GT  1|1 0|0
10  56089432    .   A   C   .   .   .   GT  1|1 0|0
...
[cocodong@biocluster ~/]$ cat input.bed
chr10   89677000    89690000
chr8    38336000    38353000
[cocodong@biocluster ~/]$ icages.pl input.vcf -i Sample1 -b input.bed
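
A BED file of structural variations can be checked before it is passed to -b. The Python sketch below reads the three-column layout shown above (chromosome, start, end) and rejects rows with missing columns or with a start that is not smaller than the end. The function name parse_bed is hypothetical, and the check is an assumption about what a well-formed input looks like, not a description of how iCAGES reads the file.

```python
# Illustrative pre-flight check; NOT part of iCAGES.
# Assumes whitespace-separated columns: chrom, start, end (extra columns ignored).

def parse_bed(lines):
    """Return a list of (chrom, start, end) tuples from BED-style lines."""
    regions = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments and UCSC header lines
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"line {number}: expected at least 3 columns")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start >= end:
            raise ValueError(f"line {number}: start must be smaller than end")
        regions.append((chrom, start, end))
    return regions


# The two regions from input.bed in the example above.
bed_lines = [
    "chr10   89677000    89690000",
    "chr8    38336000    38353000",
]
regions = parse_bed(bed_lines)
lengths = [end - start for _, start, end in regions]
print(regions)
print(lengths)
```

For the two example regions the computed lengths are 13000 and 17000 bases, which is a quick way to spot a region whose coordinates were entered in the wrong order or with a missing digit.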