Built-in and custom alphabets

Alphabet overview

LogoJS combines sets of individual glyphs into alphabets. An alphabet is an array of symbols, each of which should have a unique color and glyph combination. Symbols are not restricted to just one glyph; if you are rendering a dinucleotide logo, for example, you might use symbols with two glyphs each. Alphabets can also have multiple symbols that use the same glyph with different colors (for example one red A and one gray A).

When you are building your PWM, the order of the columns corresponds to the order of the symbols in your alphabet array.

The symbol objects in an alphabet have the following fields:

  • color an array of colors to use for each letter in the symbol. If the symbol is only one letter, a single value may be used in place of an array.
  • component an array of components used to render the glyphs in this symbol. Should be the same length as color. If your alphabet uses custom glyphs, import your custom glyph components and use them here. If the symbol is only one letter, a single component may be used in place of an array. If you are only using letters and digits which are built in to LogoJS, you can use the regex field instead, and LogoJS will populate this field for you using the loadGlyphComponents function.
  • regex a string representing the sequence of letters in this symbol. This field is not required, but may be used as a shorthand rather than explicitly including components in the component field. LogoJS will populate the components field for you automatically with matching built-in letters and digits if you leave it empty.

Built-in alphabets

LogoJS provides built-in alphabets for common use cases for convenience. If you need to render a custom logo with these symbol sets, you can import these alphabets rather than build them yourself. In React, these can be imported directly from the logojs package; without React, they are accessible under the logojs namespace (i.e. logojs.DNAAlphabet).

The DNAAlphabet renders logos with a DNA symbol set. A is red, C is blue, G is gold, and T is green; columns in the PWM are in that order.

The RNAAlphabet renders logos with an RNA symbol set. A is red, C is blue, G is gold, and U is green; columns in the PWM are in that order.

The ProteinAlphabet renders logos with a protein symbol set. Acidic amino acids are red, basic amino acids are blue, and non-polar amino acids are black. B is used for D or N and Z is used for E or Q; both are gold.

The CompleteAlphabet includes the capital letters A-Z, then the lower case letters a-z, then the digits 0-9, all in that order and with custom colors. This can be used for experimentation with different symbols.

Custom alphabets

To make a custom alphabet, simply create a custom array of symbol objects as described above. If you have custom React components for custom glyphs not built-in to LogoJS, you can include them in the components field. The following is an example of a custom alphabet with M and W representing methylated CpG on the plus and minus strands (the syntax below first includes the core of the DNAAlphabet, then extends it with M and W):

import { DNAAlphabet, loadGlyphComponents } from 'logojs';

export const METHYL_ALPHABET = loadGlyphComponents([
   ...DNAAlphabet,
   { color: "#880088", regex: "M" },
   { color: "#888800", regex: "W" }
]);

Alphabet utilities

LogoJS provides two utility functions to make generating custom alphabets easier in particular use cases.

The loadGlyphComponents function reads the optional regex field of each symbol in a custom alphabet and automatically populates the corresponding component field with built-in glyphs from LogoJS. If a symbol has a component field already but has no regex field, it will be left unchanged; however, if it has a regex field and a components field the contents of the components field will be overwritten. The regex field must only contain the letters A-Z and a-z and the digits 0-9. The function takes the following argument:

  • alphabet the custom alphabet, containing regex fields for each symbol; this parameter remains unchanged, and a copy with component fields is returned.

The disymbolAlphabet function takes a custom alphabet and generates a new custom alphabet with every possible pairing of symbols from the original. Colors for individual letters are retained. For example, given the DNAAlphabet as input, this function would generate a new alphabet with the symbols AA, AC, AG, AT, CA, … TT. The function takes a single argument:

  • alphabet the input alphabet; remains unchanged. A new disymbol alphabet is returned.