Eur. J. Biochem. 74, 1 - 5 (1977)

Abbreviations and Symbols

Abbreviations are hindrances to readers in fields other than that of the author, to abstractors and to scientists in foreign countries. Therefore their use should be restricted to a minimum. On the other hand, it is sometimes convenient to use abbreviations or symbols for the names of chemical substances, particularly in equations, tables or figures. The limited use of abbreviations and symbols of specified meaning is therefore accepted. However, clarity and unambiguity are more important than brevity.

One of the most important areas of biochemistry for which special symbols are essential is that of biopolymers. It is almost impossible to represent the name of even a simple protein, polynucleotide, or polysaccharide except by the use of logical and universally accepted abbreviations. The name of, e.g., one of the chains of insulin, expressed in terms of 30 amino-acid-radical names in order, is so unwieldy as to be useless. The symbolic representation gives the structure in two lines of print.

For some of the most important biochemical reagents, coenzymes, etc., even shorter abbreviations are universally employed, e.g., ATP, NAD, RNA. These abbreviations do not represent a chemical structure in the way that symbols do. The creation of such new abbreviations should therefore be restricted to an absolute minimum.

Other symbols or abbreviations than those listed in the IUPAC-IUB Rules should be used only in those situations where an objective case may be made for necessity; none should be used when pronouns and similar short terms may replace a long word or phrase. They should always be defined in each paper. Such ad hoc abbreviations and symbols should not conflict with known ones, or with the general principles. None should be introduced except when repeated use is required. If, in exceptional circumstances, symbols or abbreviations are used in the Summary, they should be defined in the Summary, as well as in the body of the paper.

There are three main series of symbols for monomeric units: those for amino acids, monosaccharides, and mononucleosides, of which the amino-acid series is the oldest. The monomeric units are generally designated by three-letter symbols, a capital followed by two lower-case letters. The abbreviations should not be used for the free monomers in the text of papers.

A standard treatment has been devised for the three groups of macromolecules which are built up from these units. Where the sequence of residues is known, the symbols are written in order and joined by short lines (dashes, hyphens). Where the sequence is not known, the group of symbols, separated by commas, is enclosed in parentheses. Example: Ala-Gly-(Met,Pro)-Lys means that the sequence of methionine and proline is unknown.
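The convention above is regular enough to be read mechanically. The following is a minimal sketch in Python, not part of the recommendations: the function name `parse_sequence` and the use of a `frozenset` to stand for a group of unknown order are illustrative assumptions.

```python
import re


def parse_sequence(notation):
    """Split a residue sequence into its segments.

    A residue in known position (e.g. 'Ala') is returned as a string.
    A parenthesised, comma-separated group (e.g. '(Met,Pro)') is returned
    as a frozenset, because the order of its residues is unknown.
    """
    segments = []
    # A token is either a parenthesised group or a run of characters
    # containing no hyphen and no parenthesis.
    for token in re.findall(r"\([^()]*\)|[^-()]+", notation):
        if token.startswith("("):
            members = token[1:-1].split(",")
            segments.append(frozenset(m.strip() for m in members))
        else:
            segments.append(token)
    return segments


# The example from the text: Met and Pro are present, order unknown.
print(parse_sequence("Ala-Gly-(Met,Pro)-Lys"))
```

A set is used for the unordered group so that (Met,Pro) and (Pro,Met) compare as equal, which matches the stated meaning of the parentheses.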

Macromolecules composed of repeating sequences may be represented by the prefix 'poly' or the subscript n, both indicating 'polymer of'. The symbols for the monomeric units of the sequence are enclosed in parentheses. Thus, poly(Lys) or (Lys)n is polylysine, poly(Ala-Lys) or (Ala-Lys)n is a linear polymer consisting of alanine and lysine in regular alternating sequence, and poly(Ala,Lys) is the irregular random copolymer of equal amounts of these amino acids. Between poly and the parenthesis there is no intervening space or hyphen. The n may be replaced by a definite number, an average (e.g. ), or a range (e.g. 8-12), as appropriate. 'Oligo' may replace 'poly' for short chains.
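As an illustration of the two equivalent forms, here is a short Python sketch. The function name `polymer_symbols` and its parameters are hypothetical, and the subscript is written on the line because plain text has no subscripts.

```python
def polymer_symbols(units, regular=True, n="n", prefix="poly"):
    """Return (prefix form, subscript form) for a repeating sequence.

    units   -- monomer symbols in order, e.g. ["Ala", "Lys"]
    regular -- True for a regular sequence (hyphens between symbols),
               False for a random copolymer (commas between symbols)
    n       -- the subscript: "n", a definite number, or a range ("8-12")
    prefix  -- "poly", or "oligo" for short chains
    """
    if prefix not in ("poly", "oligo"):
        raise ValueError("prefix must be 'poly' or 'oligo'")
    inner = ("-" if regular else ",").join(units)
    # No space or hyphen is placed between the prefix and the parenthesis.
    return f"{prefix}({inner})", f"({inner}){n}"


print(polymer_symbols(["Lys"]))                        # polylysine
print(polymer_symbols(["Ala", "Lys"]))                 # regular alternating
print(polymer_symbols(["Ala", "Lys"], regular=False))  # random copolymer
```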

When other abbreviations for chemical compounds are needed, the maximum use should be made of standard chemical symbols (C, H, O, N, P, S, Na, Cl, etc.), numerical multiples (subscripts 2 and 3, not di or D or T etc., as in Me2SO, Me3Si-) and of trivial names and their symbols (e.g. folate, P, Me, Pr, Bu, Ph, Ac).

Symbols may be combined to represent more complex symbols, such as Tos-Arg-OMe, in which the basic structure (arginine) remains recognisable.

Names of enzymes are not to be abbreviated except in terms of substrates for which accepted abbreviations exist (hence ATPase and RNase, but not LDH, GPDH, ACE, etc.).

Peptide Hormones. The IUPAC-IUB Commission on Biochemical Nomenclature (CBN) has recommended trivial names short enough to make abbreviations unnecessary, e.g. corticotropin (for ACTH), follitropin (for FSH), folliberin (for FSH-RF), etc. (ref 1).

Class names, such as fatty acids, protein, virus, etc., or short terms (poly, furan, folate, etc.) are not to be abbreviated even when an associated term is abbreviated or symbolised (e.g. poly(X), not PX; H4folate, not THF).

No abbreviations should be used for terms such as 'central nervous system', 'red blood cells', or 'extra-cellular fluid'.

The following tables have been compiled to aid authors and readers. They list the symbols and abbreviations proposed in the various CBN documents already published. The biochemical journals accept most of the CBN recommendations.

Table 1. Symbols for amino acids

The symbols preceded by a plus sign may be used without definition. The use of the one-letter abbreviations (in brackets) should be restricted to comparisons of long sequences in tables, lists, or figures, and for such special use as tagging three-dimensional models of proteins. They should not be used in papers where the single-letter system for nucleoside sequences is employed, as in repeating codons. Di(α-amino acids) are listed in appendix B of reference 2.

Name                            Symbol
Alanine                       + Ala (A)
Allohydroxylysine               aHyl
Alloisoleucine                  aIle
Amino acid residue              AA
2-Aminoadipic acid              Aad
3-Aminoadipic acid              βAad
2-Aminobutyric acid             Abu
ε-Aminohexanoic acid            εAhx
3-Aminopropionic acid           βAla
Arginine                      + Arg (R)
Asparagine                    + Asn (N)
Aspartic acid                 + Asp (D)
Aspartic acid or asparagine   + Asx (B)
Cysteine (cf. half-cystine)   + Cys (C)
2,4-Diaminobutyric acid         A2bu
2,2-Diaminopimelic acid         A2pm
Glutamic acid                 + Glu (E)
Glutamine                     + Gln (Q)
Glutamic acid or glutamine    + Glx (Z)
Glycine                       + Gly (G)
Half-cystine (cf. cysteine)   + Cys
Histidine                     + His (H)
Homocysteine                    Hcy
Homoserine                      Hse
Homoserine lactone              Hse >
Hydroxylysine                 + Hyl
Hydroxyproline                + Hyp
Isoleucine                    + Ile (I)
Leucine                       + Leu (L)
Lysine                        + Lys (K)
Methionine                    + Met (M)
Norleucine                      Nle
Norvaline                       Nva
Ornithine                     + Orn
Phenylalanine                 + Phe (F)
Proline                       + Pro (P)
5-Pyrrolidone-2-carboxylic acid (pyroglutamic acid; oxoproline)    <Glu
Sarcosine                       Sar
Serine                        + Ser (S)
Threonine                     + Thr (T)
Tryptophan                    + Trp (W)
Tyrosine                      + Tyr (Y)
Valine                        + Val (V)
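For the restricted uses named above (comparison of long sequences in tables, lists, or figures), a three-letter sequence can be converted to the one-letter form with a simple lookup. The sketch below is illustrative only: the mapping is copied from the bracketed entries of Table 1, while the names `ONE_LETTER` and `to_one_letter` are hypothetical.

```python
# One-letter symbols as listed in brackets in Table 1.
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Asx": "B",
    "Cys": "C", "Glu": "E", "Gln": "Q", "Glx": "Z", "Gly": "G",
    "His": "H", "Ile": "I", "Leu": "L", "Lys": "K", "Met": "M",
    "Phe": "F", "Pro": "P", "Ser": "S", "Thr": "T", "Trp": "W",
    "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert a hyphen-joined three-letter sequence to one-letter form.

    Raises ValueError for a symbol that has no one-letter entry in
    Table 1 (e.g. 'Orn' or 'Hyp').
    """
    letters = []
    for symbol in sequence.split("-"):
        if symbol not in ONE_LETTER:
            raise ValueError(f"no one-letter symbol listed for {symbol!r}")
        letters.append(ONE_LETTER[symbol])
    return "".join(letters)


print(to_one_letter("Ala-Gly-Met-Pro-Lys"))
```

Symbols without a bracketed entry are rejected rather than guessed, since Table 1 assigns one-letter symbols only to the listed residues.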

Table 2. Symbols for substituents of amino acids and of reagents used for their modification

These symbols should be defined

Name                                  Symbol
Acetyl-                               Ac-
Aminoethyl-                           Aet- or -(CH2)2NH2
Benzhydryl-                           Bzh- or Ph2CH-
Benzimidazolyl-                       Bza-
Benzoyl-                              Bz- or PhCO-
Benzyl-                               Bzl- or PhCH2-
Benzyloxy-                            -OBzl or -OCH2Ph
Benzyloxycarbonyl-                    Cbz- or Z-
Benzylthiomethyl-                     Btm- or PhSCH2-
p-Bromobenzyloxycarbonyl-             Z(Br)-
t-Butoxy-                             -OBut
Butoxycarbonyl-                       Boc- or ButOCO-
Butyl-                                Bu-
Carbamoyl-                            Cbm- or NH2CO-
Carbamoylmethyl-                      Cam- or -CH2CONH2
Carboxymethyl-                        Cm- or -CH2CO2H
1-Carboxy-2-nitrophenylthio-          Nbs-
p-Carboxyphenylmercuri-               -HgBzOH
3-Carboxypropionyl- (cf. succinyl-)   Suc-
p-Chloromercuribenzoate               pCl-HgBzO-
Cyanomethoxy-                         -OCH2CN
Cyclohexyl-                           cHx-
Cyclopentyl-                          cPe-
Cyclopentyloxycarbonyl-               Poc- or cPeOCO
Diazoacetyl-                          N2Ac- or N2CHCO
Dihydro                               H2
Diisopropyl fluorophosphate           (PriO)2PO-F; PriP-F; iPr2P-F, or Dip-F
5-Dimethylaminonaphthalenesulfonyl-   Dns- or dansyl-
Dimethyl sulfoxide                    Me2SO
Dinitrophenyl-                        N2ph- or Dnp-
Diphenylmethoxy-                      -OBzh or -OCHPh2
Diphenylmethyl-                       Bzh- or Ph2CH-
5,5'-Dithio-bis(2-nitrobenzoic acid)  Nbs2
Ethoxy-                               -OEt or EtO-
Ethyl-                                Et
N-Ethylmaleimide                      MalNEt
Fluorodinitrobenzene                  N2ph-F
Hydroxyethyl-                         -(CH2)2OH
p-Iodophenylsulfonyl-                 Ips
Isopropylidene-                       Me2C<
Maleoyl-                              Mal< or -Mal-
Maleyl-                               Mal-
Methoxy-                              -OMe
p-Methoxybenzyloxycarbonyl-           Z(OMe)-
p-Methoxyphenylazobenzyloxycarbonyl-  Mz-
Methyl-                               Me-
Methylthiocarbamoyl-                  Mtc- or MeNHCS
p-Nitrobenzyloxycarbonyl-             Z(NO2)-
p-Nitrophenoxy-                       -ONp
o-Nitrophenylthio-                    Nps-
p-Nitrophenylthio-                    Snp-
Pentyl-                               Pe-
Phenyl-                               Ph-
p-Phenylazobenzyloxycarbonyl-         Pz-
Phenylisothiocyanate                  PhNCS
Phenylthiocarbamoyl-                  PhNCS- or Ptc-
Phenylthio-                           -SPh
Phenylthiohydantoin                   >PhNCS
Phosphoric residue                    P- or -P
Phthaloyl-                            Pht< or -Pht-
Phthalyl-                             Pht-
1-Piperidino-oxy-                     -OPip
Pipsyl- (p-iodophenylsulfonyl-)       Ips
Propyl-                               Pr-
8-Quinolyloxy-                        -OQu
Succinimido-oxy-                      -ONSu
Succinyl- (cf. 3-carboxypropionyl-)   Suc< or -Suc-
Tetrahydro                            H4
Tetrahydrofuran(yl-)                  H4furan(-)
Tetrahydropyran(yl-)                  H4pyran(-)
Tosyl- (p-toluenesulfonyl-)           Tos-
Trifluoroacetyl-                      CF3CO-
Trimethylsilyl-                       Me3Si-
Triphenylmethyl-                      Ph3C- or Trt-

Table 3. Symbols for carbohydrates

This table lists the most commonly used symbols for carbohydrates; those preceded by a plus sign may be used without definition. Pyranose and furanose forms are designated where necessary by the suffixes p and f. Configurational symbols D and L (small Roman capital letters) and anomeric prefixes are shown where necessary as prefixes.

Carbohydrate                Symbol
N-Acetylneuraminic acid     AcNeu
Fructose                  + Fru
Galactose                 + Gal
Glucose                   + Glc
Mannose                   + Man
Muramic acid                Mur
Neuraminic acid             Neu
Ribose                    + Rib
Sialic acid                 Sia
Derivatives of, e.g., glucose
N-Acetylglucosamine         GlcNAc
Glucosamine                 GlcN
2-Deoxyglucose              dGlc
Gluconic acid               GlcA
Glucuronic acid             GlcUA

Note. The prefix 'd' indicates a 2-deoxysugar. Other deoxysugars may be designated similarly with a positional numeral, e.g., 3-deoxyglucose: 3-dGlc.

Table 4. Symbols for bases

These symbols should be defined

Base                                 Symbol
Adenine                              Ade
'a base'                             Base
Cytosine                             Cyt
Guanine                              Gua
Hypoxanthine                         Hyp
6-Mercaptopurine (thiohypoxanthine)  Shy
Orotate                              Oro
'a purine'                           Pur
'a pyrimidine'                       Pyr
Thiouracil                           Sur
Thymine                              Thy
Uracil                               Ura
Xanthine                             Xan

Table 5. Symbols for nucleosides and nucleotides

The symbols preceded by a plus sign may be used without definition.

Two systems are recognised, one using three-letter symbols for the common nucleosides and a capital italic P for the phosphoric residue, the other using single capital letters for the common nucleosides and a lower-case p for the phosphoric residue. The three-letter symbols should be used whenever chemical changes involving nucleosides or nucleotides are being discussed. The one-letter symbols are intended for the nucleoside residues in sequences or partial sequences only; in these they should always be connected by hyphens (for internal phosphodiester 3'-5' linkages) and the terminal phosphoric residue should be indicated by p. The 2'-deoxyribonucleosides are indicated by the prefix 'd'.

Nucleoside                            Three-letter    One-letter
Adenosine                           + Ado           + A
Bromouridine                          BrUrd           B
Cytidine                            + Cyd           + C
Dihydrouridine                                        D or hU
Guanosine                           + Guo           + G
Inosine                             + Ino           + I
6-Mercaptopurine ribonucleoside       Sno             M or sI
   (6-thioinosine)
'a nucleoside'                        Nuc             N
Orotidine                             Ord             O
Pseudouridine                       + ψrd           + ψ or Q (for computer work)
'a purine nucleoside'                 Puo             R
'a pyrimidine nucleoside'             Pyd             Y
Ribosylnicotinamide                   Nir
Ribosylthymine                      + Thd           + T
Thiouridine                           Srd             S or sU
Thymidine (2'-deoxyribosylthymine)  + dThd          + dT
Uridine                             + Urd           + U
Xanthosine                          + Xao           + X
Phosphoric residue                    -P              p or - (for internal phosphodiester bonds)
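The one-letter system for sequences can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed procedure: the function name `one_letter_chain` and its parameters are hypothetical, the placement of p before the first letter for a 5'-terminal residue and after the last letter for a 3'-terminal residue follows the entry for the phosphoric residue in Table 6, and the prefix 'd' is applied to each residue individually.

```python
def one_letter_chain(residues, p5=False, p3=False, deoxy=False):
    """Write a nucleoside sequence in the one-letter system.

    residues -- one-letter nucleoside symbols in 5' to 3' order
    p5, p3   -- whether a terminal phosphoric residue is present at the
                5' end or the 3' end
    deoxy    -- prefix every residue with 'd' (2'-deoxyribonucleosides)
    """
    prefix = "d" if deoxy else ""
    # Hyphens stand for the internal 3'-5' phosphodiester linkages.
    body = "-".join(prefix + r for r in residues)
    return ("p" if p5 else "") + body + ("p" if p3 else "")


print(one_letter_chain(["A", "C", "G"], p5=True))
print(one_letter_chain(["A", "T"], p3=True, deoxy=True))
```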

Table 6. Symbols for modified bases, sugars, or phosphoric acid residues in polynucleotides

a) Substituents on bases and internal sugars. These symbols, all in lower-case letters, generally precede the nucleoside letter for base substitution and follow the nucleoside letter for sugar substitution. Locants are given as superscripts, multipliers as subscripts.

Name                Symbol
Acetyl-             ac
Amino-              n
Aminoacyl-          aa
Anisoyl-            an
Arabinose           a (Precedes the nucleoside letter.)
Aza-                z
Benzhydryl-         bh
Benzoyl-            bz
Benzyl-             bzl
Bromo-              br
Chloro-             cl
N-Cyclohexyl-N'-[β-(4-methylmorpholino)amidino]-  cmc
Dansyl-             dns
Deamino-            o
Deaza-              c
Deoxyribose         d (May precede the nucleoside letter
                    or the whole chain, as appropriate.)
Dihydro-            h (not h2)
Dimethoxytrityl-    dmt
Ethyl-              e
Fluoro-             fl
Formyl-             f
Formylaminoacyl-    fa
Hydroxy-            ho or oh
Hydroxymethyl-      hm
Iodo-               io
Isopentyl-          i
Lyxose              l (Precedes the nucleoside letter.)
Methyl-             m
Monomethoxytrityl-  mmt
Tetrahydropyranyl-  thp
Thio-               s
Tosyl-              tos
Phosphoric residue  p (Precedes the nucleoside letter for 5';
                    follows the letter for 3'; > or >p for
                    2',3'-cyclic phosphoric acid residue;
                    replaced by hyphen for internal
                    phosphodiester bond.)
Trityl-             tr
Xylose              x (Precedes the nucleoside letter.)

b) Substituents on terminal sugar hydroxyl groups, and phosphoric acid protecting groups. These symbols, generally placed in parentheses, follow the appropriate nucleoside symbol or adjoin the appropriate symbol for the phosphoric acid residue.

Name                  Symbol
Aminoacyl-            (AA)
Anisyl-               (MeOPh)
Benzhydryl-           (Ph2CH)
Benzyl-               (Bzl)
Borate                (>BOH)
Carbonyl              (>CO)
5'-Cyanoethyl-; 3' (or 5')-cyanoethyl-  (CNEt)-; -(CNEt)
Dimethoxytrityl-      [(MeO)2Tr]
1-Ethoxyethyl-        (EtOEt)
Ethoxymethyl-         (EtOMe)
Ethyl-                (Et)
Glycyl-               (Gly)
Leucyl-               (Leu)
Isopropylidene-       (>CMe2)
Methyl-               (Me)
Monomethoxytrityl-    (MeOTr)
Phenyl-               (Ph)
Tetrahydropyranyl-    (Thp)
Tosyl-                (Tos)
Trifluoroacetyl-      F3CCO
Trityl-               (Tr)

Table 7. Symbols for specific preparations of nucleic acids

These symbols may be used without definition

Name                                           Symbol
Complementary DNA                              cDNA
Complementary RNA                              cRNA
Messenger RNA                                  mRNA
Mitochondrial DNA                              mtDNA
Mitochondrial RNA                              mtRNA
Nuclear DNA                                    nDNA
Nuclear RNA                                    nRNA
Ribosomal RNA                                  rRNA
Transfer RNA                                   tRNA
Specific transfer RNA species
Alanine-accepting tRNA                         tRNA^Ala
Aminoacylated alanine-accepting tRNA           Ala-tRNA^Ala
Isoacceptor species of alanine-accepting tRNA  tRNA_1^Ala, tRNA_2^Ala etc.
Methionine-accepting tRNA                      tRNA^Met
Formylatable methionine-accepting tRNA         tRNA^fMet or tRNA_f^Met
Formylaminoacylated formylatable               fMet-tRNA^fMet or fMet-tRNA_f^Met
   methionine-accepting tRNA

Table 8. Miscellaneous symbols

These symbols should be defined

Name                                       Symbol
Cobalamin                                  Cbl
Cobamide                                   Cba
Cobinamide                                 Cbi
Corrin                                     Crn
Ferredoxin                                 Fd
Menaquinone                                MKᵃ
Plastoquinone                              PQᵃ
Phosphoric residue                         P- or -P
Phylloquinone                              Kᵃ
Pteroic acid (pteroyl-)                    Pte
Pteroylglutamic acidᵇ                      PteGlu
Pyridoxyl-                                 Pxy
Pyridoxylidene                             Pxd=
Tocopherol                                 Tᵃ
Tocopherolquinone                          TQᵃ
Nα-Tosylarginine methyl ester              Tos-Arg-OMe
N-Tosylphenylalanine chloromethyl ketoneᶜ  Tos-PheCH2Cl
Ubiquinone                                 Q

a See ref 3 for the special application of these symbols.

b Folate and folyl- are not abbreviated.

c Correctly (2-phenyl-1-tosylamido)ethyl chloromethyl ketone, or chloro-(N-tosylphenylalanyl)methane.

Table 9. Abbreviations for semisystematic or trivial names

Those abbreviations preceded by a plus sign may be used without definition. The preceding tables list alternative symbols that may be preferred by some journals. Trivial names for peptide hormones have been recommended (ref 1).

Name                                               Abbreviation
Acetyl-coenzyme A                                + CoASAc
Adenosine 5'-mono-, di-, and triphosphates       + AMP, ADP, and ATPᵃ
O-(Carboxymethyl)-cellulose                        CM-cellulose
Coenzyme A                                       + CoA (or CoASH)
Corticotropin (adrenocorticotropin,                ACTH
   adrenocorticotropic hormone)
Cytidine 5'-mono-, di-, and triphosphates        + CMP, CDP, and CTPᵃ
Deoxyribonucleic acid, or deoxyribonucleate      + DNA
O-(Diethylaminoethyl)-cellulose                  + DEAE-cellulose
3,4-Dihydroxyphenylalanine                         DOPA
Di-isopropyl fluorophosphate                       DFP
2,3-Dimercaptopropanol                             BAL
2,4-Dinitrophenyl-                                 DNP-
Diphosphopyridine nucleotide                       DPN
Diphosphothiamin (thiamin pyrophosphate)           DPT
Ethylenediamine tetraacetate                     + EDTA
Flavin-adenine dinucleotide                      + FAD
Fluoro-2,4-dinitrobenzene                          FDNB
Glutathione and its oxidised form                + GSH, GSSG
Guanosine 5'-mono-, di-, and triphosphates       + GMP, GDP, and GTPᵃ
Haemoglobin, carbon monoxide haemoglobin,          Hb, HbCO, HbO2
   oxyhaemoglobin
Inorganic orthophosphate                         + Pi
Inorganic pyrophosphate                          + PPi
Inosine 5'-mono-, di-, and triphosphates         + IMP, IDP, and ITPᵃ
Melanotropin (melanocyte-stimulating hormone)      MSH
Methaemoglobin, metmyoglobin                       MetHb, MetMb
Myoglobin, carbon monoxide myoglobin,              Mb, MbCO, MbO2
   oxymyoglobin
Nicotinamide-adenine dinucleotide and its        + NAD, NAD+, and NADH
   oxidised and reduced forms
Nicotinamide-adenine dinucleotide phosphate      + NADP, NADP+, and NADPH
   and its oxidised and reduced forms
Nicotinamide mononucleotide                      + NMN
Riboflavin 5'-phosphate                          + FMN
Ribonucleic acid or ribonucleate                 + RNA
Ribosylthymine 5'-mono-, di-, and triphosphates  + TMP, TDP, and TTPᵃ
Thymidine 5'-mono-, di-, and triphosphates       + dTMP, dTDP, and dTTPᵃ
1,1,1-Trichloro-2,2-bis(p-chlorophenyl)ethane      DDT
O-(Triethylaminoethyl)-cellulose                   TEAE-cellulose
Triphosphopyridine nucleotide                      TPN
Tris(hydroxymethyl)aminomethane                  + Tris
Uridine 5'-mono-, di-, and triphosphates         + UMP, UDP, and UTPᵃ
Uridinediphosphoglucose                          + UDPG

a The d prefix may be used to represent the corresponding deoxyribonucleoside phosphates, e.g. dADP. The various isomers of adenosine monophosphate may be written 2'-AMP, 3'-AMP, or 5'-AMP (in case of possible ambiguity). A similar procedure may be applied to other nucleoside or deoxyribonucleoside monophosphates.
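The composition of these abbreviations (optional locant, optional d prefix, nucleoside letter, phosphate suffix) can be illustrated with a short Python sketch. The function name `nucleotide` and its parameters are hypothetical, and restricting the locant to monophosphates is an assumption based on the footnote, which describes positional isomers only for monophosphates.

```python
# Suffixes for mono-, di-, and triphosphates, as used in Table 9.
SUFFIX = {1: "MP", 2: "DP", 3: "TP"}


def nucleotide(letter, phosphates, deoxy=False, position=None):
    """Compose a nucleoside phosphate abbreviation such as 'dADP'.

    letter     -- one-letter nucleoside symbol, e.g. 'A'
    phosphates -- 1, 2, or 3
    deoxy      -- add the 'd' prefix for the deoxyribonucleoside phosphate
    position   -- optional locant (2, 3, or 5) for monophosphate isomers
    """
    if phosphates not in SUFFIX:
        raise ValueError("phosphates must be 1, 2, or 3")
    if position is not None and phosphates != 1:
        # Assumption: locants are written for monophosphates only.
        raise ValueError("a locant is given only for monophosphates")
    locant = f"{position}'-" if position is not None else ""
    return f"{locant}{'d' if deoxy else ''}{letter}{SUFFIX[phosphates]}"


print(nucleotide("A", 2, deoxy=True))
print(nucleotide("A", 1, position=3))
```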

References

1. IUPAC-IUB Commission on Biochemical Nomenclature (1975) Nomenclature of Peptide Hormones, Recommendations, 1974, Eur. J. Biochem. 55, 485-486.

2. IUPAC Commission on the Nomenclature of Organic Chemistry and IUPAC-IUB Commission on Biochemical Nomenclature (1975) Nomenclature of α-Amino Acids, Recommendations, 1974, Eur. J. Biochem. 53, 1-14.

3. IUPAC-IUB Commission on Biochemical Nomenclature (1975) Nomenclature of Quinones with Isoprenoid Side-chains, Recommendations, 1973, Eur. J. Biochem. 53, 15-18.