TIPseqHunter uses genome assembly GRCh37 (hg19) and can be run either from a Docker image or as individual programs. TIPseqHunter was developed in Java (version 7) and R (version 3.2), was tested under the Linux operating system, and is available for download at: https://github.com/fenyolab/TIPseqHunter

The Docker image for TIPseqHunter was developed with the stable version of Docker Community Edition (CE) and may work under any operating system capable of running Docker. However, we recommend Unix-like operating systems, such as Linux and Mac OS X. The Docker image is an alternative to the conventional TIPseqHunter program mentioned above. It is available from the Docker Hub registry (https://hub.docker.com/) and can be downloaded with the Docker client command:

docker pull galantelab/tipseqhunter

For further details, see https://github.com/galantelab/tipseq_hunter/blob/master/README.md

Test data and the masked, bowtie-built reference genome are available for download at: http://openslice.fenyolab.org/data/tipseqhunter/test_data

Docker Prerequisite:
The Docker image works as a container and runs exactly the same TIPseqHunter program. No dependencies need to be downloaded, and none of the software used by TIPseqHunter needs to be set up manually. To run this container, you only need to install Docker.
For OS X: https://docs.docker.com/mac/started/
For Linux: https://docs.docker.com/linux/started/
For Windows: https://docs.docker.com/docker-for-windows/

TIPseqHunter Prerequisites:
1. At least 10GB of memory is needed if the number of sequenced read-pairs is greater than 20M.
2. Bowtie 2 alignment software (version 2.2.3 used for testing): http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
3. Samtools software (latest version): http://samtools.sourceforge.net/
4. Trimmomatic software (version 0.32 used for testing): http://www.usadellab.org/cms/?page=trimmomatic
5. Java packages: sam-1.112.jar, commons-math3-3.4.1.jar, jfreechart-1.0.14.jar, jcommon-1.0.17.jar, itextpdf-5.2.1.jar, biojava3-core-3.0.1.jar
6. R packages: pROC, ggplot2, caret, e1071
Critical: The BAM file has to be generated by a bowtie2 alignment with the "XM" tag.

Running TIPseqHunter:
(1) For quality control, alignment, feature selection, modeling, and prediction:
./TIPseqHunterPipelineJar.sh fastq_path output_path fastq_r1 key_r1 key_r2 num_rp
Critical: Detailed information is provided in the TIPseqHunterPipelineJar.sh file. Some parameters need to be pre-set.
Parameters:
fastq_path: path of the fastq files (Note: this is the path only; the file name is not included)
output_path: path of the output files (Note: this is the path only; the file name is not included)
fastq_r1: read-1 file name of the paired fastq files
key_r1: keyword that identifies the read-1 fastq file (for example, "_1" is the keyword for the CAGATC_1.fastq file) Critical: the key has to be unique within the file name.
key_r2: keyword that identifies the read-2 fastq file; swapping it with the read-1 keyword gives the matching read-1 file name (for example, "_2" is the keyword for the CAGATC_2.fastq file) Critical: the key has to be unique within the file name.
num_rp: the total number of read pairs in the paired fastq files (Note: this is the number of read pairs, i.e. the number of read-1 reads or of read-2 reads, not their sum.) (This number is used for normalization.)

(2) For somatic insertions:
TIPseqHunterPipelineJarSomatic.sh repred_path control_path repred_file control_file
Critical: Detailed information is provided in the TIPseqHunterPipelineJarSomatic.sh file. Some parameters need to be pre-set.
Parameters:
repred_path: path of the "model" folder under the output folder
control_path: path of the "TRLocator" folder under the output folder
repred_file: file with the suffix ".repred", generated from P11 in repred_path (Note: the file name should end with ".repred".)
(such as 302_T_GTCCGC.wsize100.regwsize1.minreads1.clip1.clipflk5.mindis150.FP.uniqgs.bed.csinfo.lm.l1hs.pred.txt.repred)
control_file: file with the suffix ".bed" in control_path (Note: the file name should end with ".bed".) (such as 302_N_GTGAAA.fastq.cleaned.fastq.pcsort.bam.w100.minreg1.mintag1 |
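
The num_rp argument of step (1) can be derived from the read-1 fastq file rather than counted by hand. The following is a minimal sketch, assuming an uncompressed fastq file with the standard four lines per read. The helper name count_read_pairs and the /data paths are hypothetical and are not part of TIPseqHunter; the file name and keys are the examples used above.

```shell
#!/usr/bin/env bash
# Sketch only: derive num_rp (total number of read pairs) from the read-1 file.
# Assumption: uncompressed fastq with exactly 4 lines per read record.
# count_read_pairs is a hypothetical helper, not shipped with TIPseqHunter.

count_read_pairs() {
    # Count the lines of the file and divide by 4 (one record = 4 lines).
    awk 'END { print int(NR / 4) }' "$1"
}

# Example usage (placeholder paths; uncomment and adapt before running):
# fastq_path=/data/fastq
# output_path=/data/output
# fastq_r1=CAGATC_1.fastq
# num_rp=$(count_read_pairs "$fastq_path/$fastq_r1")
# ./TIPseqHunterPipelineJar.sh "$fastq_path" "$output_path" "$fastq_r1" _1 _2 "$num_rp"
```

Because num_rp is the number of pairs and not the sum of both files, counting the read-1 file alone is sufficient when the two files are properly paired.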