Automatically Calculating Masses of Biomolecules
ProMass can automatically calculate the theoretical molecular masses of peptide, protein, and oligonucleotide sequences and display them in the web-based results report. ProMass evaluates the mass of the BioSequence string and determines whether the target mass was found in the data set. This is useful when you want to confirm the masses of known components. There are two ways to supply biomolecule sequence information so that ProMass calculates the molecular weight automatically: enter either the sequence itself or the name of a sequence file in the BioSequence field of the Xcalibur Sequence Setup window, as described below:
First make sure that the BioSequence and Target Info fields are present in your Xcalibur Sequence Setup window. If they are not, read this first.
Specifying a text string: If your sequence is less than 255 characters in length, you can paste it directly into the BioSequence field of the Xcalibur sample list in the Sequence Setup window. The sequence string should use single-letter codes for either amino acids or nucleotides. Copy your complete sequence to the clipboard, double-click the BioSequence field, and then right-click to paste the sequence into the Sequence Setup window.
Specifying a sequence file: ProMass can also read a text-based sequence file. This is particularly useful if your sequence is longer than 255 characters. Simply specify a text file name in the BioSequence field, including the path, which contains your sequence (e.g., C:\Xcalibur\sequence\myoglobin.pep). In order for ProMass to recognize the file, the text file should have the extension .txt, .pep, or .fasta. ProMass will automatically ignore lines in the file that begin with the characters: >, #, or '. This feature allows you to read FASTA-format sequences.
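The file-reading rules above can be illustrated with a minimal Python sketch. This is not ProMass code: the function name read_sequence_file is hypothetical, and treating the extension check as case-insensitive and discarding whitespace inside sequence lines are assumptions made only for the illustration.

```python
import os

# Extensions and comment characters taken from the description above.
ALLOWED_EXTENSIONS = {".txt", ".pep", ".fasta"}
SKIP_PREFIXES = (">", "#", "'")


def read_sequence_file(path):
    """Return the sequence in a text file as one string.

    Hypothetical helper: lines beginning with >, #, or ' are ignored,
    and the remaining lines are joined with whitespace removed.
    """
    extension = os.path.splitext(path)[1].lower()  # assumption: case-insensitive
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("unsupported sequence file extension: " + extension)

    pieces = []
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(SKIP_PREFIXES):
                continue  # FASTA header or comment line
            # Remove line endings and any embedded whitespace.
            pieces.append("".join(line.split()))
    return "".join(pieces)
```

A FASTA file whose first line is a > header followed by several sequence lines would therefore come back as a single uninterrupted string of single-letter codes.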
Once you have specified a sequence or file, you also need to instruct ProMass what type of sequence is represented. The sequence type is entered in the Target Info field of the Xcalibur Sequence Setup program. To specify a peptide or protein sequence enter one of the following:
sequence = peptide
sequence = protein
sequence = amino acid
To specify an oligonucleotide DNA sequence (single-stranded only), enter one of the following in the Target Info field:
sequence = nucleotide
sequence = oligo
sequence = oligonucleotide
sequence = dna
To specify an oligonucleotide RNA sequence (single-stranded only), enter one of the following in the Target Info field:
sequence = rna
You must specify one of the sequence types shown above, or ProMass will not calculate masses. You may also specify termini in the Target Info field. By default, ProMass assumes that H and OH must be added to the sequence to calculate the mass of either a polypeptide or an oligonucleotide. You therefore do not have to specify termini if, for example, your peptide has a free amino N-terminus and a free acid at the C-terminus, or if your oligonucleotide has hydroxyl groups at the 5' and 3' ends. Entering no termini options is the same as entering:
termini = H, OH
However, if you are analyzing oligonucleotides and you have a phosphate at the 3' end, you could enter:
termini = H, H2PO4
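To make the Target Info entries above concrete, here is a minimal Python sketch of how a single entry of the form "key = value" might be interpreted. It is an illustration only, not the ProMass parser: the function name parse_target_info_entry is hypothetical, and ignoring letter case and extra spaces is an assumption. The sketch handles one entry at a time and does not assume how ProMass separates multiple entries in the same field.

```python
# Keywords copied from the lists above, grouped by the sequence type they select.
PEPTIDE_KEYWORDS = {"peptide", "protein", "amino acid"}
DNA_KEYWORDS = {"nucleotide", "oligo", "oligonucleotide", "dna"}
RNA_KEYWORDS = {"rna"}

# Termini assumed when none are specified, as described above.
DEFAULT_TERMINI = ("H", "OH")


def parse_target_info_entry(entry):
    """Interpret one 'key = value' entry (hypothetical helper).

    Returns ("sequence", kind) with kind in {"peptide", "dna", "rna"},
    or ("termini", (first, second)) for a termini entry.
    """
    key, separator, value = entry.partition("=")
    if not separator:
        raise ValueError("expected 'key = value': " + entry)
    key = key.strip().lower()   # assumption: keys are case-insensitive
    value = value.strip()

    if key == "sequence":
        kind = value.lower()
        if kind in PEPTIDE_KEYWORDS:
            return ("sequence", "peptide")
        if kind in DNA_KEYWORDS:
            return ("sequence", "dna")
        if kind in RNA_KEYWORDS:
            return ("sequence", "rna")
        raise ValueError("unknown sequence type: " + value)

    if key == "termini":
        parts = tuple(part.strip() for part in value.split(","))
        if len(parts) != 2 or not all(parts):
            raise ValueError("termini needs two groups, e.g. 'H, OH'")
        return ("termini", parts)

    raise ValueError("unknown Target Info key: " + key)
```

For example, parse_target_info_entry("termini = H, H2PO4") yields ("termini", ("H", "H2PO4")), and when no termini entry is given the sketch's DEFAULT_TERMINI of ("H", "OH") would apply.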
ProMass uses a text file to store the masses of the amino acid, nucleotide, termini, and custom groups. The text file is called znova_masses.ini and it can be found in your ZNova install directory (e.g., C:\Program Files\ProMassXcali\ZNova\znova_masses.ini). In our example above, the text strings 'H' and 'H2PO4' have been pre-defined in the znova_masses.ini file to be equal to the masses of a proton and H2PO4, respectively. You can create your own amino acid or nucleotide groups by editing the znova_masses.ini file. You may use single letters (upper or lower case) or numbers to represent amino acids or nucleotides. Termini groups may contain more than one letter or number. Before specifying termini, make sure they have been defined in the znova_masses.ini file.
As an example, we'll use the myoglobin LC/MS file from the Getting Started example. Set up the Xcalibur sequence as described previously in the Getting Started section with the myolcmsdata.raw data file and test.pmd processing method from the ProMass TestData directory. The amino acid sequence of horse myoglobin is shown below:
GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG
Paste this entire sequence in the BioSequence field in the Xcalibur Sequence Setup program. (Note: Xcalibur terminates the input of any field at a carriage return, so check that the whole sequence was pasted in. If it was not, paste the above sequence into a text editor such as Notepad, remove any line breaks, and then copy and paste it into the Xcalibur Sequence Setup.) Also enter:
sequence = protein
in the Target Info field. Entry of the termini = H, OH in the Target Info field is optional. When you're done, your Xcalibur Sequence Setup should look something like this:
Select Row 1 with the mouse, and hit the Batch Reprocess button in Xcalibur Sequence Setup. Make sure the following options are checked in the batch reprocess dialog box: Qual - Peak detection and integration, Programs, and Replace Sample Info.
When the file has finished processing, display the resulting ProMass summary file in your browser. It should look something like the web page shown below. In the ProMass Browser summary you should see an entry for the myolcmsdata raw file and the first 50 amino acids of the myoglobin sequence in the Sample Comments field of the summary report. Note also that ProMass has automatically treated the mass calculated from the BioSequence string as a Target Mass. The green Result Status confirms that a mass within the user-specified Mass Tolerance has been found in the data set.
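The target-mass check described above amounts to asking whether any mass found in the data lies within the tolerance of the calculated mass. The following Python sketch shows that comparison under stated assumptions: the function name target_found is hypothetical, the tolerance is taken to be an absolute value in daltons, and the observed masses in the usage note are made-up numbers rather than ProMass output.

```python
def target_found(observed_masses, target_mass, tolerance):
    """Return True if any observed mass is within +/- tolerance of the target.

    Hypothetical helper; all values are assumed to be in daltons.
    """
    return any(abs(mass - target_mass) <= tolerance for mass in observed_masses)
```

With a target of 16951.5 and a tolerance of 2.0, the made-up list [8475.2, 16952.1] would count as a match, while [8475.2, 16990.0] would not.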
Click on the sample row to navigate to the chromatogram-level summary for this data file. Note how the myoglobin sequence appears in the report along with the calculated average and monoisotopic masses.
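As a cross-check on the average mass reported for this example, the calculation can be sketched in Python by summing residue masses and adding the default H and OH termini (together, one water). The residue values below are commonly tabulated average residue masses and are an assumption of this sketch; they are not read from znova_masses.ini, so the value ProMass reports may differ slightly. The function name average_mass is hypothetical.

```python
# Average residue masses in daltons (commonly tabulated values; assumption:
# not taken from znova_masses.ini).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

# Default termini H + OH add up to one water molecule.
WATER_AVERAGE_MASS = 18.01528

HORSE_MYOGLOBIN = (
    "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLT"
    "ALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHPGDFGADAQGAMTKALELFRN"
    "DIAAKYKELGFQG"
)


def average_mass(sequence):
    """Average mass of an unmodified peptide with free N- and C-termini."""
    total = WATER_AVERAGE_MASS
    for residue in sequence.upper():
        total += AVERAGE_RESIDUE_MASS[residue]  # KeyError on unknown letters
    return total


print(round(average_mass(HORSE_MYOGLOBIN), 1))
```

For the 153-residue horse myoglobin sequence this comes out near 16951.5 Da, which is the kind of value to expect for the calculated average mass in the report.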
Because the BioSequence string is limited to 255 characters, confirming the masses of larger biomolecules (more than 255 residues) requires that you either specify a text-based sequence file containing your sequence or explicitly define your Target Mass, as described in Configuring ProMass to Search for Target Masses.