Biological Magnetic Resonance Data Bank


A Repository for Data from NMR Spectroscopy on Proteins, Peptides, Nucleic Acids, and other Biomolecules
Member of WWPDB

This page made possible by: CHTC, OSG, and CS-Rosetta.

Chemical shifts:

Chemical shifts may be deposited in STAR or TALOS format.

TALOS format:

REMARK Text on this line is optional, but include at least one newline before the data sequence.

DATA SEQUENCE PGARQE

VARS   RESID RESNAME ATOMNAME SHIFT
FORMAT %4d   %1s     %4s      %8.3f

1 P    C  175.432
1 P    H    8.239
1 P    N  120.543
1 P   CA   57.946
1 P   CB   67.178
2 G    C  172.203
2 G    H    8.000
...
6 E    C  170.123
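
If your shifts live in a spreadsheet or script rather than in a TALOS file, a few lines of Python can produce the layout shown above. The following is only a minimal sketch: the function name `format_talos` and its arguments are hypothetical, and the spacing simply mirrors the FORMAT line of the example.

```python
def format_talos(sequence, shifts, remark="Chemical shifts for CS-Rosetta"):
    """Return TALOS-format text (hypothetical helper, not part of any tool).

    sequence: one-letter residue string, e.g. "PGARQE"
    shifts:   iterable of (resid, atom_name, shift_ppm), resid is 1-based
    """
    lines = [
        f"REMARK {remark}",
        "",  # at least one newline before the data sequence
        f"DATA SEQUENCE {sequence}",
        "",
        "VARS   RESID RESNAME ATOMNAME SHIFT",
        "FORMAT %4d   %1s     %4s      %8.3f",
        "",
    ]
    for resid, atom, shift in shifts:
        if not 1 <= resid <= len(sequence):
            raise ValueError(f"residue {resid} is outside the sequence")
        # The residue name is taken from the sequence so the two cannot disagree.
        resname = sequence[resid - 1]
        lines.append("%4d %1s %4s %8.3f" % (resid, resname, atom, shift))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_talos("PGARQE", [(1, "C", 175.432), (1, "CA", 57.946)]))
```

Deriving the residue name from the sequence, rather than passing it separately, keeps the data rows consistent with the DATA SEQUENCE line.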

NMR-STAR format:

Chemical shifts deposited in NMR-STAR format should use version 3.1. You can submit either a full NMR-STAR file or just the chemical shift saveframe. If you submit only the chemical shift saveframe, be sure to include the residue sequence using the "_Assigned_chem_shift_list.Polymer_seq_one_letter_code" tag. Here is an example of the minimum needed to submit in NMR-STAR format.


_Assigned_chem_shift_list.Polymer_seq_one_letter_code AQQSPY

loop_
    _Atom_chem_shift.ID
    _Atom_chem_shift.Comp_index_ID
    _Atom_chem_shift.Comp_ID
    _Atom_chem_shift.Atom_ID
    _Atom_chem_shift.Val
    _Atom_chem_shift.Entity_ID

    1     1    ALA    HA    4.02   1
    2     1    ALA    HB1   1.48   1
    3     1    ALA    HB2   1.48   1
    4     1    ALA    HB3   1.48   1
    5     2    GLN    HA    4.36   1
    6     2    GLN    HB2   2.1    1
    7     2    GLN    HB3   1.98   1
    8     2    GLN    HG2   2.37   1
    9     2    GLN    HG3   2.37   1
    10    3    GLN    HA    4.36   1
    11    3    GLN    HB2   1.98   1
    12    3    GLN    HB3   2.1    1
    13    3    GLN    HG2   2.37   1
    14    3    GLN    HG3   2.37   1
    15    4    SER    HA    4.78   1
    16    4    SER    HB2   3.92   1
    17    4    SER    HB3   3.85   1
    18    5    PRO    HA    4.4    1
    19    5    PRO    HB2   1.72   1
    20    5    PRO    HB3   2.2    1
    21    5    PRO    HG2   1.82   1
    22    5    PRO    HG3   1.95   1
    23    5    PRO    HD2   3.64   1
    24    5    PRO    HD3   3.75   1
    25    6    TYR    HA    4.56   1
    26    6    TYR    HB2   2.94   1
    27    6    TYR    HB3   3.07   1
    28    6    TYR    HD1   7.12   1
    29    6    TYR    HD2   7.12   1
    30    6    TYR    HE1   6.84   1
    31    6    TYR    HE2   6.84   1
stop_
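
The minimal fragment above can also be generated from a list of assignments. The sketch below assumes a protein built from the 20 standard residues; `format_star_shifts` and `ONE_TO_THREE` are hypothetical names, and the helper checks each Comp_ID against the sequence before writing a row.

```python
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def format_star_shifts(sequence, rows, entity_id=1):
    """Return the minimal NMR-STAR text (hypothetical helper).

    sequence: one-letter residue string, e.g. "AQQSPY"
    rows:     iterable of (comp_index, comp_id, atom_id, value)
    """
    tags = ["ID", "Comp_index_ID", "Comp_ID", "Atom_ID", "Val", "Entity_ID"]
    out = [
        f"_Assigned_chem_shift_list.Polymer_seq_one_letter_code {sequence}",
        "",
        "loop_",
    ]
    out += [f"    _Atom_chem_shift.{tag}" for tag in tags]
    out.append("")
    for shift_id, (index, comp, atom, val) in enumerate(rows, start=1):
        if not 1 <= index <= len(sequence):
            raise ValueError(f"residue {index} is outside the sequence")
        expected = ONE_TO_THREE[sequence[index - 1]]
        if comp != expected:
            raise ValueError(f"residue {index}: {comp} given, {expected} expected")
        out.append(
            f"    {shift_id:<5} {index:<4} {comp:<6} {atom:<5} {val:<6} {entity_id}"
        )
    out.append("stop_")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(format_star_shifts("AQQSPY", [(1, "ALA", "HA", 4.02)]))
```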

Fragment generation failures:

If you receive a "fragment generation failed" error, it is most likely due to one or more of the following:

  • The protein may be too small for meaningful fragments to be selected.
  • The protein may have an unusual structure that does not have enough matching fragments.
    The CS-Rosetta FAQ has more details on this type of failure.
  • The structure contains a non-standard residue.
  • The submitted data file is in an unusable format. Please refer to the format help on this page.
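
Of the causes listed above, the non-standard residue case is the easiest to check before submitting. The sketch below assumes that "standard" means the 20 common amino acid one-letter codes; the function name is hypothetical and the server may apply different rules.

```python
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: the 20 common amino acids


def nonstandard_residues(sequence):
    """Return (position, letter) for every letter outside STANDARD (1-based)."""
    return [
        (position, letter)
        for position, letter in enumerate(sequence.upper(), start=1)
        if letter not in STANDARD
    ]


if __name__ == "__main__":
    print(nonstandard_residues("PGARQE"))
    print(nonstandard_residues("AXQB"))
```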

Flexible tail exclusion

By default, flexible tails are trimmed before structure generation. Sometimes this automatic trimming fails, and the run proceeds with all original residues. You will receive an email if this occurs.

Constraints:

Constraints may be submitted using the Rosetta constraint format. Below are examples of NOE distance and dihedral angle constraints in the Rosetta format. You can mix constraints of different types in one file.

Caveats to be aware of:

  • Any line in the file that isn't a valid constraint will make the file invalid. Any comments in the file must begin with a #.
  • You cannot include distance restraints involving atom 1 H. Rosetta does not recognize an "H" atom in residue 1 because it is the N-terminus.
  • The residue numbering in the constraint file must match that in the chemical shift file.
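
These caveats can be checked mechanically before you upload a file. The following is a rough sketch, not the server's own validation: `check_constraints` is a hypothetical helper that knows only the AtomPair and Dihedral layouts described on this page, skips blank lines (an assumption), and uses the sequence length as a crude stand-in for "numbering matches the chemical shift file".

```python
ATOM_PAIRS = {"AtomPair": 2, "Dihedral": 4}  # (atom, residue) pairs per type


def check_constraints(text, n_residues):
    """Return a list of (line_number, message) problems found in text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # comments are allowed; blank lines are assumed harmless
        fields = stripped.split()
        n_atoms = ATOM_PAIRS.get(fields[0])
        if n_atoms is None:
            problems.append((number, f"unknown constraint type {fields[0]!r}"))
            continue
        # Keyword, atom/residue pairs, and at least a function name.
        if len(fields) < 1 + 2 * n_atoms + 1:
            problems.append((number, "too few columns"))
            continue
        for i in range(n_atoms):
            atom, resid = fields[1 + 2 * i], fields[2 + 2 * i]
            if not resid.isdigit():
                problems.append((number, f"residue number {resid!r} is not an integer"))
            elif not 1 <= int(resid) <= n_residues:
                problems.append((number, f"residue {resid} is outside 1..{n_residues}"))
            elif fields[0] == "AtomPair" and int(resid) == 1 and atom == "H":
                problems.append((number, "atom H of residue 1 is not recognized"))
    return problems


if __name__ == "__main__":
    sample = "# NOEs\nAtomPair H 1 HA 2 BOUNDED 1.500 2.700 .500 NOE\n"
    for number, message in check_constraints(sample, n_residues=100):
        print(f"line {number}: {message}")
```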

NOE distance constraints in Rosetta format

Here is an example:

AtomPair HB3 153 HA 153 BOUNDED 1.700 3.300 .500 NOE
AtomPair HB2 153 HA 153 BOUNDED 1.800 3.800 .500 NOE
AtomPair HB3 153 HB2 153 BOUNDED 1.200 2.200 .500 NOE
AtomPair MB 154 H 154 BOUNDED 2.000 6.000 .500 NOE
AtomPair HA 154 H 155 BOUNDED 1.500 2.700 .500 NOE
AtomPair MB 154 H 155 BOUNDED 1.900 3.900 .500 NOE
AtomPair HA 154 MB 154 BOUNDED 1.600 2.800 .500 NOE

Column descriptions:

  • The first column should always be "AtomPair".
  • The second column is the atom name of the first atom.
  • The third column is the residue number of the first atom.
  • The fourth column is the atom name of the second atom.
  • The fifth column is the residue number of the second atom.
  • The sixth column is the Rosetta function type. You can use any of the Rosetta functions, but the simplest to use is BOUNDED.
  • The seventh column is the distance lower bound.
  • The eighth column is the distance upper bound.
  • The ninth column should always be .5
  • The tenth column should always be NOE
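
The ten columns above map directly onto a one-line formatter. This is a minimal sketch with a hypothetical function name; it fixes the function type to BOUNDED, writes the ninth and tenth columns as ".500" and "NOE", and rejects the residue 1 "H" atom mentioned in the caveats.

```python
def format_noe(atom1, res1, atom2, res2, lower, upper):
    """Return one AtomPair line using the BOUNDED function (hypothetical helper)."""
    if (atom1, res1) == ("H", 1) or (atom2, res2) == ("H", 1):
        raise ValueError("atom H of residue 1 cannot be used in a distance restraint")
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return (
        f"AtomPair {atom1} {res1} {atom2} {res2} "
        f"BOUNDED {lower:.3f} {upper:.3f} .500 NOE"
    )


if __name__ == "__main__":
    print(format_noe("HB3", 153, "HA", 153, 1.7, 3.3))
```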

Dihedral angle constraints in Rosetta format

Here is an example:

Dihedral CG 13 CD2 13 NE2 13 ZN 32 CIRCULARHARMONIC 3.14 0.35

Column descriptions:

  • The first column should always be "Dihedral".
  • The second through ninth columns are the four atoms and their residue numbers.
  • The tenth column is the score function. You will probably want to use CIRCULARHARMONIC. Refer to the documentation regarding function types.
  • The eleventh column is the value of x0 in the CIRCULARHARMONIC function.
  • The twelfth column is the value of sd in the CIRCULARHARMONIC function.
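
If your angles are tabulated in degrees, the twelve columns can be written as follows. This sketch assumes that x0 and sd are expressed in radians, which the value 3.14 in the example suggests; confirm this against the Rosetta documentation on function types. The function name is hypothetical.

```python
import math


def format_dihedral(atoms, angle_deg, sd_deg):
    """Return one Dihedral line using CIRCULARHARMONIC (hypothetical helper).

    atoms: four (atom_name, residue_number) pairs, in order
    angle_deg, sd_deg: target angle and standard deviation in degrees
    """
    if len(atoms) != 4:
        raise ValueError("a dihedral needs exactly four atoms")
    atom_columns = " ".join(f"{name} {resid}" for name, resid in atoms)
    x0 = math.radians(angle_deg)  # assumption: Rosetta expects radians
    sd = math.radians(sd_deg)
    return f"Dihedral {atom_columns} CIRCULARHARMONIC {x0:.2f} {sd:.2f}"


if __name__ == "__main__":
    atoms = [("CG", 13), ("CD2", 13), ("NE2", 13), ("ZN", 32)]
    print(format_dihedral(atoms, 180, 20))
```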

You can use PdbStat to convert dihedral angles in XPLOR and other formats into the Rosetta constraint format.

RDCs:

Only backbone H-N RDCs can be used and they must be in the following format:

2 N 2 H 4.800
3 N 3 H 10.220
5 N 5 H 27.130
6 N 6 H 21.608
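
A table of backbone H-N couplings can be written in this format with a short helper. The sketch assumes one coupling per line and three decimal places, as in the example; `format_rdcs` is a hypothetical name.

```python
def format_rdcs(rdcs):
    """Return RDC lines from a mapping of residue number -> H-N coupling."""
    lines = [
        f"{resid} N {resid} H {value:.3f}"
        for resid, value in sorted(rdcs.items())
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_rdcs({2: 4.8, 3: 10.22, 5: 27.13, 6: 21.608}), end="")
```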

Please refer to the CS-Rosetta documentation if needed.

Disulfide bond linkages:

To specify disulfide linkages, simply include one line per linkage with the residue numbers of the two involved residues:

6 88
38 68
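
The linkage file is simple enough to write by hand, but a short check can catch a residue listed twice or a residue that is not a cysteine. The sketch below is illustrative only: `format_disulfides` is a hypothetical helper, and the optional sequence check assumes the numbering matches your chemical shift file.

```python
def format_disulfides(pairs, sequence=None):
    """Return one 'i j' line per linkage; pairs is a list of (i, j) tuples."""
    seen = set()
    lines = []
    for first, second in pairs:
        for resid in (first, second):
            if resid in seen:
                raise ValueError(f"residue {resid} appears in more than one linkage")
            if sequence is not None:
                if not 1 <= resid <= len(sequence):
                    raise ValueError(f"residue {resid} is outside the sequence")
                if sequence[resid - 1] != "C":
                    raise ValueError(f"residue {resid} is not a cysteine")
            seen.add(resid)
        lines.append(f"{min(first, second)} {max(first, second)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_disulfides([(6, 88), (38, 68)]), end="")
```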

Please refer to the CS-Rosetta documentation if needed.

Please contact us if you encounter any issues.

Citation information:

"Determination of solution structures of proteins up to 40 kDa using CS-Rosetta with sparse NMR data from deuterated samples" Oliver F. Lange; Paolo Rossi; Nikolaos G. Sgourakis; Yifan Song; Hsiau-Wei Lee; James M. Aramini; Asli Ertekin; Rong Xiao; Thomas B. Acton; Gaetano T. Montelione; David Baker; Proceedings of the National Academy of Sciences 109(27) 10873-10878 (2012) doi: 10.1073/pnas.1203013109

"De novo structure generation using chemical shifts for proteins with high-sequence identity but different folds," Yang Shen; Philip N. Bryan; Yanan He; John Orban; David Baker; Ad Bax; Protein Science 19, 349-356 (2010) doi: 10.1002/pro.303

"De novo protein structure generation from incomplete chemical shift assignments," Yang Shen; Robert Vernon; David Baker; Ad Bax; J. Biomol. NMR 43, 63-78 (2009) doi: 10.1007/s10858-008-9288-5

"Consistent blind protein structure generation from NMR chemical shift data," Yang Shen; Oliver Lange; Frank Delaglio; Paolo Rossi; James M. Aramini; Gaohua Liu; Alexander Eletsky; Yibing Wu; Kiran K. Singarapu; Alexander Lemak; Alexandr Ignatchenko; Cheryl H. Arrowsmith; Thomas Szyperski; Gaetano T. Montelione; David Baker; Ad Bax; Proceedings of the National Academy of Sciences 105(12) 4685-4690 (2008) doi: 10.1073/pnas.0800256105