Insulin is a linear chain of amino acids.

Structurally, insulin is a protein. Proteins are linear chains where each link in the chain is an amino acid. The body, like all living things, has twenty chemically different amino acids to choose from. By convention, each amino acid is represented by one letter of the alphabet. (B, J, O, U, X, and Z are the six unused letters.) So, a protein can be accurately represented by a string of these letters. The active form of insulin has 51 amino acids, but a larger form is initially synthesized.
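To make the "string of letters" idea concrete, here is a minimal Python sketch that checks whether a string uses only the twenty one-letter codes. The names AMINO_ACIDS, UNUSED, and is_valid_protein are made up for this illustration and do not come from any standard library.

```python
import string

# The twenty one-letter amino acid codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Letters of the alphabet that are not assigned to any of the twenty.
UNUSED = sorted(set(string.ascii_uppercase) - AMINO_ACIDS)


def is_valid_protein(sequence: str) -> bool:
    """Return True if the string is non-empty and uses only the twenty codes.

    Spaces are ignored, so sequences printed in blocks of ten are accepted.
    """
    letters = sequence.replace(" ", "").upper()
    return len(letters) > 0 and set(letters) <= AMINO_ACIDS


print(UNUSED)                          # ['B', 'J', 'O', 'U', 'X', 'Z']
print(is_valid_protein("MALWMRLLPL"))  # True
print(is_valid_protein("BJOUXZ"))      # False
```

Storing the codes in a set keeps the check to a single subset comparison.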

Insulin is initially synthesized in beta cells as a protein 110 amino acids long. Its initial sequence is

MALWMRLLPL LALLALWGPD PAAAFVNQHL CGSHLVEALY LVCGERGFFY TPKTRREAED LQVGQVELGG GPGAGSLQPL ALEGSLQKRG IVEQCCTSIC SLYQLENYCN

This form is called preproinsulin. This 110 amino acid long protein then undergoes processing. Three cuts are made inside the chain, splitting it into four different chains. These are called chain A (shown in yellow), chain B (shown in green), chain C (shown in grey) and the signal peptide. Chain A has 21 amino acids, chain B has 30 amino acids, chain C has 31 amino acids and the signal peptide has 24 amino acids. An additional 4 amino acids are lost during the split.

The A chain has the sequence:

GIVEQCCTSI CSLYQLENYC N

The B chain has the sequence:

FVNQHLCGSH LVEALYLVCG ERGFFYTPKT
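The bookkeeping can be checked with a short Python sketch that slices the 110-letter string into its pieces. The text gives only the lengths. The order of the pieces along the chain, and the assumption that the 4 lost amino acids sit as two pairs on either side of chain C, are inferred here by locating the A and B sequences inside the full string. The names SEGMENTS, split_chain, "linker 1", and "linker 2" are made up for this illustration.

```python
PREPROINSULIN = (
    "MALWMRLLPL LALLALWGPD PAAAFVNQHL CGSHLVEALY LVCGERGFFY "
    "TPKTRREAED LQVGQVELGG GPGAGSLQPL ALEGSLQKRG IVEQCCTSIC "
    "SLYQLENYCN"
).replace(" ", "")

# (name, length) in order along the chain.
# Assumption: this ordering and the two 2-letter linkers are inferred
# from the sequences, not stated in the text.
SEGMENTS = [
    ("signal peptide", 24),
    ("B chain", 30),
    ("linker 1", 2),
    ("C chain", 31),
    ("linker 2", 2),
    ("A chain", 21),
]


def split_chain(sequence, segments):
    """Cut the sequence into consecutive named pieces of the given lengths."""
    pieces = {}
    start = 0
    for name, length in segments:
        pieces[name] = sequence[start:start + length]
        start += length
    if start != len(sequence):
        raise ValueError("segment lengths do not add up to the sequence length")
    return pieces


pieces = split_chain(PREPROINSULIN, SEGMENTS)
for name, piece in pieces.items():
    print(f"{name:15s} {len(piece):3d}  {piece}")
```

Under these assumptions the lengths add up to 24 + 30 + 2 + 31 + 2 + 21 = 110, and the slices labeled A chain and B chain match the two sequences shown above.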

The final form of insulin is made by linking the A and B chains through special bridges called disulfide bonds. The C chain, also called the C-peptide, is released into the bloodstream and is itself a signaling molecule. Its exact function is not known, but one hypothesis is that the lack of C-peptide in patients who take daily insulin may contribute to the long-term complications of diabetes.

Insulin’s particular sequence of amino acids was first worked out by Dr Frederick Sanger, a British biochemist. When he began working to understand the structure of insulin, little was known about how proteins were formed. It was known that proteins were very large molecules composed of amino acids, but their structure was unknown. One possibility was that a protein had a tree-like structure, with a central trunk and amino acids hooked together like branches from the trunk. Sanger showed that all proteins had a precise linear sequence of amino acids, and he determined the exact sequence of amino acids for the A and B chains of insulin. Insulin is one of the smallest proteins of major significance. For this work, he was awarded the 1958 Nobel Prize in Chemistry. (Sanger went on to work out a chemical method of determining the sequence of bases in DNA. He was awarded a second Nobel Prize in Chemistry, in 1980, for his work on DNA.)

Under normal conditions, the linear chain of a protein folds up into a distinctive three-dimensional shape. This three-dimensional shape is determined by the precise sequence of amino acids. All proteins with the same sequence of amino acids will have just about the same three-dimensional shape. This shape is thought to be essential to the protein's function. Recall that insulin is a messenger. It functions by communicating a message to a receptor. The receptor recognizes insulin by its shape.

 

Insulin is far too small for its shape to be seen with a light microscope or even an electron microscope. An insulin molecule is only a few nanometers across. The shape of an insulin protein was determined instead by exposing crystals of insulin to a beam of x-rays and studying the scattering pattern of the crystal. This technique, called x-ray crystallography, allows one to infer the position of the individual atoms within the insulin protein.

Insulin is an ancient molecule, evolutionarily speaking. Insulin is found in birds, fishes, alligators, snakes, newts, and all mammals. One can compare the amino acid sequences of different species to gauge how closely related the two species are. For instance, the insulin sequences of pigs and dogs are exactly the same. Both differ from human insulin at no positions on the A chain and at one position on the B chain. Bovine insulin, from cows, differs from human insulin by two amino acids on the A chain and one amino acid on the B chain. These small changes are not enough to make significant physiological differences. Bovine and pork insulin were used in humans for over 50 years, and many diabetic patients relied on these forms of insulin with no apparent problems.
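The species comparison amounts to counting the positions at which two equal-length strings differ. The Python sketch below illustrates this. The human sequences are the ones given earlier in this article. The pig and cow sequences are not given in the text. They are filled in here from commonly cited values as an assumption and should be checked against a sequence database before being relied on. The function name count_differences is made up for this example.

```python
def count_differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Human chains, as given in the article.
HUMAN = {
    "A": "GIVEQCCTSICSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",
}

# Assumed sequences (not from the article): commonly cited values.
PIG = {
    "A": "GIVEQCCTSICSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKA",
}
COW = {
    "A": "GIVEQCCASVCSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKA",
}

for name, species in (("pig", PIG), ("cow", COW)):
    a_diff = count_differences(HUMAN["A"], species["A"])
    b_diff = count_differences(HUMAN["B"], species["B"])
    print(f"{name}: A chain differs at {a_diff}, B chain differs at {b_diff}")
# pig: A chain differs at 0, B chain differs at 1
# cow: A chain differs at 2, B chain differs at 1
```

A position-by-position count works here because the chains have the same length in all three species. Sequences of different lengths would need an alignment step first.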