Pittsburgh Supercomputing Center 

Advancing the state-of-the-art in high-performance computing,
communications and data analytics.




Glimmer3 is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.

Installed on blacklight and biou.

Other resources that may be helpful include:

The Glimmer3 web site (http://ccb.jhu.edu/software/glimmer/)
The Glimmer3 release notes (http://ccb.jhu.edu/software/glimmer/glim302notes.pdf)

Running Glimmer3

1) Make Glimmer3 commands available for use
a) blacklight:
The Glimmer3 programs are made available through the module command. To load the Glimmer3 module, enter:

module load glimmer

b) biou:

The Glimmer3 programs are available through the Galaxy instance on biou.

To make the Glimmer3 programs available on the command line, csh users should enter the following command:

% source /packages/bin/SETUP_BIO_SOFTWARE

To make the Glimmer3 programs available on the command line, bash users should enter the following command:

% source /packages/bin/SETUP_BIO_SOFTWARE

2) Glimmer3 Command line usage:

USAGE: glimmer3 [options] <sequence-file> <icm-file> <tag>

Read DNA sequences in <sequence-file> and predict genes in them using the Interpolated Context Model in <icm-file>. Output details go to file <tag>.detail and predictions go to file <tag>.predict.
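
After a run completes, the predictions file can be summarized with standard shell tools. The sketch below assumes the .predict layout described in the Glimmer3 release notes: header lines beginning with ">" followed by one line per prediction, with columns for identifier, start, end, signed reading frame, and score. The function name summarize_predict and the tag run1 are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Count predicted genes per strand in a Glimmer3 .predict file.
# Assumed layout: '>' header lines, then one line per prediction:
#   id  start  end  frame  score   (frame is signed, e.g. +1 or -3)
summarize_predict() {
    awk '
        /^>/ { next }                 # skip sequence header lines
        NF >= 5 {
            if ($4 ~ /^-/) rev++      # negative frame: reverse strand
            else fwd++                # otherwise: forward strand
        }
        END { printf "forward=%d reverse=%d total=%d\n", fwd, rev, fwd + rev }
    ' "$1"
}

# Usage: "run1" is a hypothetical <tag>; nothing happens if the file is absent.
if [ -f run1.predict ]; then
    summarize_predict run1.predict
fi
```

If the layout of your .predict file differs, adjust the column number used for the reading frame.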


-A <codon-list>
--start_codons <codon-list>
  Use comma-separated list of codons as start codons
  Sample format: -A atg,gtg
  Use -P option to specify relative proportions of use.
  If -P not used, then proportions will be equal
-b <filename>
--rbs_pwm <filename>
  Read a position weight matrix (PWM) from <filename> to identify the
  ribosome binding site to help choose start sites
-C <p>
--gc_percent <p>
  Use <p> as GC percentage of independent model
  Note: <p> should be a percentage, e.g., -C 45.2
-E <filename>
--entropy <filename>
  Read entropy profiles from <filename>. Format is one header line, then 
  20 lines of 3 columns each. Columns are amino acid, positive entropy, 
  negative entropy. Rows must be in order by amino acid code letter
-f
--first_start
  Use first codon in orf as start codon
-g <n>
--gene_len <n>
  Set minimum gene length to <n>
-h
--help
  Print this message
-i <filename>
--ignore <filename>
  <filename> specifies regions of bases that are off 
  limits, so that no bases within that area will be examined
-l
--linear
  Assume linear rather than circular genome, i.e., no wraparound
-L <filename>
--orf_coords <filename>
  Use <filename> to specify a list of orfs that should be scored 
  separately, with no overlap rules
-M
--separate_genes
  <sequence-file> is a multifasta file of separate genes to be scored 
  separately, with no overlap rules
-o <n>
--max_olap <n>
  Set maximum overlap length to <n>. Overlaps this short or shorter 
  are ignored.
-P <number-list>
--start_probs <number-list>
  Specify probability of different start codons (same number & order 
  as in -A option). If no -A option, then 3 values for atg, gtg and ttg 
  in that order. Sample format: -P 0.6,0.35,0.05 
  If -A is specified without -P, then starts are equally likely.
-q <n>
--ignore_score_len <n>
  Do not use the initial score filter on any gene <n> or more bases long
-r
--no_indep
  Don't use independent probability score column
-t <n>
--threshold <n>
  Set threshold score for calling as gene to n. If the in-frame 
  score >= <n>, then the region is given a number and considered a
  potential gene.
-X
--extend
  Allow orfs extending off ends of sequence to be scored
-z <n>
--trans_table <n>
  Use Genbank translation table number <n> for stop codons
-Z <codon-list>
--stop_codons <codon-list>
  Use comma-separated list of codons as stop codons
  Sample format: -Z tag,tga,taa
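
As a rough illustration of how several of these options combine on one command line, the sketch below sets two start codons with relative proportions (-A with -P), a minimum gene length (-g), a maximum overlap (-o), and a score threshold (-t). The file names, the tag, and all numeric values are assumptions chosen for the example, not settings recommended on this page. If glimmer3 or the input files are not found, the script only prints the command it would have run.

```shell
#!/bin/sh
# Hypothetical input and output names; substitute your own.
SEQ=genome.fasta    # DNA sequences to search
ICM=genome.icm      # Interpolated Context Model built beforehand
TAG=run1            # output goes to run1.detail and run1.predict

# Illustrative values only:
#   -A / -P  two start codons with relative proportions 0.7 and 0.3
#   -g 110   minimum gene length
#   -o 50    maximum overlap length
#   -t 30    threshold score
OPTS="-A atg,gtg -P 0.7,0.3 -g 110 -o 50 -t 30"

if command -v glimmer3 >/dev/null 2>&1 && [ -f "$SEQ" ] && [ -f "$ICM" ]; then
    # $OPTS is left unquoted on purpose so it splits into separate arguments.
    glimmer3 $OPTS "$SEQ" "$ICM" "$TAG"
else
    # Dry run when glimmer3 or the input files are not available.
    echo "would run: glimmer3 $OPTS $SEQ $ICM $TAG"
fi
```

The number of values given to -P must match the number of codons given to -A, in the same order.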

3) PBS example files are available for running Glimmer3 on blacklight
