ITS Health Informatics - BioHPC (High Performance Computing)

BioHPC provides High Performance Computing resources to researchers whose scientific work requires large amounts of data storage and computing power. Solving the world's biggest problems requires serious solutions built on an advanced computing system.

Hardware:

Biohpc.utmb.edu – A small cluster of machines running Rocks 6.2. It is intended for smaller, less parallel or serial jobs as well as for parallel code development.

This resource is housed on campus and maintained by the Institute of Translational Sciences. It consists of 12 dual-socket nodes (12 cores per node) with Intel Xeon X5650 2.67 GHz processors. Each node has either 132 or 24 GB of RAM and is connected to the cluster storage via a Mellanox InfiniBand switch.
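
As a rough illustration of what these figures add up to, the short Python sketch below computes the total core count and the possible range of aggregate RAM. The function names are hypothetical, and because the page does not say how many nodes have 24 GB versus 132 GB, only lower and upper bounds on memory can be given.

```python
# Figures taken from the hardware description above.
NODES = 12                   # nodes in the cluster
CORES_PER_NODE = 12          # dual socket, 12 cores per node
RAM_OPTIONS_GB = (24, 132)   # each node has one of these two sizes


def total_cores(nodes=NODES, cores_per_node=CORES_PER_NODE):
    """Return the total number of CPU cores across all nodes."""
    return nodes * cores_per_node


def ram_bounds_gb(nodes=NODES, options=RAM_OPTIONS_GB):
    """Return (lower, upper) bounds on aggregate RAM in GB.

    The split between small- and large-memory nodes is not stated,
    so the true total lies somewhere between these two values.
    """
    return nodes * min(options), nodes * max(options)


if __name__ == "__main__":
    low, high = ram_bounds_gb()
    print(f"Total cores: {total_cores()}")
    print(f"Aggregate RAM: between {low} and {high} GB")
```

A job that needs more memory than the smaller nodes offer would have to run on one of the larger-memory nodes, so these bounds are only a starting point for sizing work.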

Applications:

  • i2b2 (Informatics for Integrating Biology and the Bedside)
  • REDCap (A resource to build and maintain online surveys and databases for data collection)
  • Amber14 and AmberTools14 – A software suite that can be used to carry out complete molecular dynamics simulations, with either explicit water or generalized Born solvent models.
  • Anaconda-2.2 - A high performance distribution of Python and R that includes over 100 of the most popular Python, R and Scala packages for data science.
  • Bowtie2-2.2.5 - An ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
  • Cufflinks-2.2.1 - Assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.
  • FastQC - A Java-based quality control tool for high throughput sequence data.
  • GATK-3.3.0 - This toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping.
  • GenomeBrowse – A free GUI tool that delivers stunning visualizations of your genomic data that give you the power to see what is occurring at each base pair in your samples.
  • Grace-5.1.25 - A WYSIWYG tool to make two-dimensional plots of numerical data. Its capabilities are roughly similar to GUI-based programs like Sigmaplot or Microcal Origin plus script-based tools like Gnuplot or Genplot.
  • Htslib-1.2.1 - A unified C library for accessing common file formats used for high-throughput sequencing data, such as SAM, CRAM and VCF. It is the core library used by samtools and bcftools.
  • Lofreq_star-2.1.2 - A fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or base/indel alignment uncertainty), which are usually ignored by other methods or only used for filtering.
  • Ozagordi-shorah-1930ed8 (ShoRAH versions 0.5.1 and 0.8.2) - An open source project for the analysis of next generation sequencing data. It is designed to analyse genetically heterogeneous samples. Its tools are written in different programming languages and provide error correction, haplotype reconstruction and estimation of the frequency of the different genetic variants present in a mixed sample.
  • Picard-tools-1.130 - A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
  • R - A free software environment for statistical computing and graphics.
  • Segemehl_0_2_0 - A tool for mapping short sequencer reads to reference genomes. Unlike other methods, segemehl is able to detect not only mismatches but also insertions and deletions.
  • Subread-1.4.6-p3 – A suite of software programs that provide high-performance read alignment, quantification and mutation discovery.
  • Tophat-2.0.14 – A fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.


External software and applications that ITS manages:

  • BaseCAMP (Tools and methods for better communication and collaboration between projects)
  • Clarity LIMS Gold (Built for clinical or research, genomics and mass spec laboratories to provide end-to-end workflow tracking and integration)