
pdb_extract - Workstation Version Manual

Extract information from each step of X-ray crystallographic and NMR software applications

Table of Contents
pdb_extract Template Files
  * Data template file: (data_template.text)
  * Script file: (log_script.inp)
  * Data template file for NMR: (data_template.text)
  * Contact author template file: (author-infor.text)

Data Template Files

Data template file: (data_template.text)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
		    THE DATA_TEMPLATE.TEXT FILE	FOR X-RAY		
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

			  NOTES AND REMINDER
The data template file contains data entries for unique chemical sequences 
present in the structure and other non-electronically captured information. 

PLEASE CHECK CATEGORIES 1 & 2: Before proceeding any further, make any 
necessary corrections here so that all information in these categories is 
complete and correct.

You may choose to fill in CATEGORIES (3-19) either here or later in ADIT.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

			GUIDELINES FOR USING THIS FILE
  1. Only strings enclosed between the 'less than' and 'greater than' 
     signs (<.....>) are parsed by the program. Therefore, DO NOT enter 
     data to the left of the 'less than' sign or to the right of the 
     'greater than' sign.

  2. All alphanumeric values or strings that you include in the different 
     categories should be within double-quotes. Blank spaces or carriage 
     returns within a pair of double quotes are ignored by the program. 
     DO NOT use double quotes (") within strings that you enter.
   
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   
~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~

================CATEGORY 1:   Crystallographic Data=======================
Enter crystallographic data

<space_group = "P 1 21 1"> (use International Table conventions)
<space_group_number = "? ">

<unit_cell_a     = "  56.800 " >
<unit_cell_b     = "  69.950 " >
<unit_cell_c     = "  60.530 " >
<unit_cell_alpha = " 90.00 " >
<unit_cell_beta  = "114.50 " >
<unit_cell_gamma = " 90.00 " >

 
================CATEGORY 2:   Sequence Information =======================
Enter the one-letter sequence for each polymeric entity in the asymmetric unit

--------------------------------------------------------------------------
			  SOME DEFINITIONS
     An ENTITY is defined as any unique molecule present in the asymmetric 
     unit. Each unique biological polymer (protein or nucleic acids) in the 
     structure is considered an entity. Thus, if there are five copies of 
     a single protein in the asymmetric unit, the molecular entity is still 
     only one. Water and non-polymers like ions, ligands and sugars are 
     also entities. 

     Here we only consider the sequences of polymeric entities (protein or 
     nucleic acid).  

	        GUIDELINES FOR COMPLETING THIS CATEGORY
      * In a PDB or mmCIF format file, all residues of a single polymeric 
      entity should have one chain ID. Multiple copies of the same entity 
      should each be assigned a unique chain ID. The multiple chain IDs 
      should be separated by commas as 'A,B,C,...'. If incorrect chain IDs 
      are used the entity groups extracted by this program will not be 
      correct. To avoid this, make necessary corrections in the PDB or mmCIF 
      file used to generate the data_template file and regenerate the 
      data_template.text file. Alternatively, edit the extracted sequence 
      in this file to correctly represent the sequence and chain IDs of each 
      polymeric entity.  

      * In addition to chain IDs, this program uses distance geometry to 
      assess whether there are any breaks in the polymer sequence. These breaks 
      may occur due to missing residues (not included in the model due to 
      missing electron density) or due to poor geometry. Four question marks 
      '????' are used to denote these chain breaks. Replace these question 
      marks with the sequence of residues missing from the coordinates. Also 
      add any residues missing from the N- and/or C-termini here.

      * If there are non-standard residues in the coordinates, this program 
      lists them according to the three letter code used in the coordinate
      file as (ABC). If all the residues in your sequence are nonstandard, 
      check and edit the sequence manually to represent it correctly in this 
      file. 

      * If any residue was modeled as Ala or Gly due to lack of the side-chain 
      density, the sequence extracted here will represent them as A or G 
      respectively. Correct this to the original sequence that was present in 
      the crystal.
----------------------------------------------------------------------------

	Below is the one letter chemical sequence extracted from your PDB 
	coordinate file. The molecular entities are grouped and listed 
	together. 

PLEASE CHECK THE SEQUENCE of each entity carefully and modify it, as necessary.
Make sure that you REVIEW THE FOLLOWING:  
   * chain breaks due to missing residues, 
   * missing residues in the N- and/or C-termini, 
   * non-standard residues and 
   * cases of residues modeled as Ala or Gly due to missing side-chain density.

<molecule_entity_id="1" >
<molecule_entity_type="polypeptide(L)" >
<molecule_one_letter_sequence=" 
MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVPSTAIREISLLKELNHPNIVKLLDVI
HTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGA
IKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEID
QLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAA
LAHPFFQDVTKPVPHLRL" >
< molecule_chain_id="A" >
< target_DB_id=" " > (if known) 

<molecule_entity_id="2" >
<molecule_entity_type="polypeptide(L)" >
<molecule_one_letter_sequence=" 
MSHKQIYYSDKYDDEEFEYRHVMLPKDIAKLVPKTHLMSESEWRNLGVQQSQGWVHYMIHEPEPHILLFR
RPLPKKPKK" >
< molecule_chain_id="B" >
< target_DB_id=" " > (if known) 

<molecule_entity_id=" " >
<molecule_entity_type=" " >
<molecule_one_letter_sequence="  " >
<molecule_chain_id=" " >

<target_DB_id=" " >  (if known)


================CATEGORY 3:   Contact Authors=============================
Enter information about the contact authors.
    Note: items marked by (e.g. ) are mandatory. 
          PI information should be always given.
   
1.  Information about the Principal investigator (PI) should be given. 

<contact_author_PI_id = "1 ">           (must be 1)
<contact_author_PI_salutation = " ">     ( Dr./Prof./Mr./Mrs./Ms.)
<contact_author_PI_first_name = " ">      (e.g. John)
<contact_author_PI_last_name = " ">        (e.g. Rodgers)
<contact_author_PI_middle_name = " ">         
<contact_author_PI_role = " ">   (e.g. investigator/responsible scientist)
<contact_author_PI_organization_type = " ">  (e.g. academic/commercial/government/other)
<contact_author_PI_email = " ">        (e.g.   name@host.domain.country)      
<contact_author_PI_address = " ">            (e.g. 610 Taylor road)
<contact_author_PI_city = " ">               (e.g. Piscataway)
<contact_author_PI_State_or_Province = " ">   (e.g.  New Jersey)
<contact_author_PI_Zip_Code = " ">           (e.g.  08864)
<contact_author_PI_Country = " ">          (e.g.  UNITED STATES)
<contact_author_PI_fax_number = " ">
<contact_author_PI_phone_numer = " ">

2. Information about other contact authors

<contact_author_id = "2 ">       (e.g. 2,3,4..)
<contact_author_salutation = " ">   
<contact_author_first_name = " ">      
<contact_author_last_name = " ">       
<contact_author_middle_name = " ">         
<contact_author_role = " ">    
<contact_author_organization_type = " ">  
<contact_author_email = " ">             
<contact_author_address = " ">            
<contact_author_city = " ">              
<contact_author_State_or_Province = " ">   
<contact_author_Zip_Code = " ">           
<contact_author_Country = " ">          
<contact_author_fax_number = " ">
<contact_author_phone_numer = " ">


...(add more if needed)...

================CATEGORY 4:   Structural Genomics========================
If this is a structural genomics project, enter the project information

<SG_project_id = " 1">  
<SG_project_name = " ">        (e.g. NPPSFA/PSI, Protein Structure Initiative)
<full_name_of_SG_center = " ">   (e.g. Berkeley Structural Genomics Center)


================CATEGORY 5:   Release Status==============================
Enter the release status for the coordinates, structure factors, and sequence

   Status for sequence should be chosen from one of the following:
   (release now, hold for release)

   Status for others should be chosen from one of the following:
  (release now, hold for publication,  hold for 4 weeks, hold for 6 weeks, 
   hold for 6 months, hold for 1 year)

<Release_status_for_coordinates = " ">      (e.g. release now)
<Release_status_for_structure_factor = " ">
<Release_status_for_sequence = " ">     

================CATEGORY 6:   Title=======================================
Enter the title for the structure

<structure_title = " ">     (e.g. Crystal Structure Analysis of the B-DNA)
<structure_details = " ">  


================CATEGORY 7: Authors of Structure============================
Enter authors of the deposited structures (e.g. Surname, F.M.) 

<structure_author_name = " ">
<structure_author_name = " ">
<structure_author_name = " ">
<structure_author_name = " ">
...add more if needed...


================CATEGORY 8:   Citation Authors============================
Enter author names for the publications associated with this deposition.

      The primary citation is the article in which the deposited coordinates 
      were first reported. Other related citations may also be provided.

1. For the primary citation
<primary_citation_author_name = " ">    (e.g. Surname, F.M.) 
<primary_citation_author_name = " ">
<primary_citation_author_name = " ">
<primary_citation_author_name = " ">
...add more if needed...

2. For other related citations  (if applicable)
<citation_author_id = " ">    (e.g. 1, 2 ..)
<citation_author_name = " ">
<citation_author_name = " ">
<citation_author_name = " ">
<citation_author_name = " ">
...add more if needed...


...(add more other citations if needed)...

================CATEGORY 9:   Citation Article============================
Enter citation article (journal, title, year, volume, page)  

      If the citation has not yet been published, use 'To be published' 
      for the category 'journal_abbrev' and leave pages and volume blank. 

1. For primary citation
<primary_citation_id = "primary">     
<primary_citation_journal_abbrev = " ">     (e.g. to be published)
<primary_citation_title = " ">   
<primary_citation_year = " ">
<primary_citation_journal_volume = " "> 
<primary_citation_page_first = " ">
<primary_citation_page_last = " ">

2. For other related citation (if applicable)
<citation_id = "1 ">               (e.g. 1, 2, 3 ...)
<citation_journal_abbrev = " ">
<citation_title = " ">
<citation_year = " ">
<citation_journal_volume = " "> 
<citation_page_first = " ">
<citation_page_last = " ">


...(add more citations if needed)...


================CATEGORY 10:   Molecule Names==============================
Enter the names of the molecules (entities) that are in the asymmetric unit
 
NOTE: The number of molecule names should match the number of entities in 
      CATEGORY 2!
      The name of molecule should be obtained from the appropriate 
      sequence database reference, if available. Otherwise the gene name or
      other common name of the entity may be used. 
      e.g. HIV-1 integrase for protein 
           RNA Hammerhead Ribozyme for RNA 

<molecule_name = " ">    (entity 1)
<molecule_name = " ">    (entity 2)

...(add more if needed)...


================CATEGORY 11:   Molecule Details============================
Enter additional information about each entity, if known. (optional)

      Additional information would include details such as fragment name 
      (if applicable), mutation, and E.C.number.

1. For entity 1
<Molecular_entity_id = "1 ">       (e.g. 1, 2, ...)
<Fragment_name = " ">             (e.g. ligand binding domain, hairpin)
<Specific_mutation = " ">         (e.g. C280S)
<Enzyme_Comission_number = " ">   (if known: e.g. 2.7.7.7)

2. For entity 2
<Molecular_entity_id = "2 ">       
<Fragment_name = " ">   
<Specific_mutation = " ">      
<Enzyme_Comission_number = " "> 

...(add more if needed)...

================CATEGORY 12:   Genetically Manipulated Source=============
Enter data in the genetically manipulated source category 

      If the biomolecule has been genetically manipulated, describe its 
      source and expression system here. 

1. For entity 1
<Manipulated_entity_id = "1 ">               (e.g. 1, 2, ...)
<Source_organism_scientific_name = " ">      (e.g. Homo sapiens)
<Source_organism_gene = " ">                 (e.g. RPOD, ALKA...)
<Source_organism_strain = " ">               (e.g. BH10 ISOLATE, K-12...)
<Expression_system_scientific_name = " ">    (e.g. Escherichia coli)
<Expression_system_strain = " ">	     (e.g. BL21(DE3))
<Expression_system_vector_type = " ">	     (e.g. plasmid)
<Expression_system_plasmid_name = " ">       (e.g. pET26)
<Manipulated_source_details = " ">           (any other relevant information)

2. For entity 2
<Manipulated_entity_id = "2 ">       
<Source_organism_scientific_name = " ">    
<Source_organism_gene = " ">     
<Source_organism_strain = " ">               
<Expression_system_scientific_name = " ">  
<Expression_system_strain = " ">	     
<Expression_system_vector_type = " ">	     
<Expression_system_plasmid_name = " ">     
<Manipulated_source_details = " ">        


...(add more if needed)...

================CATEGORY 13:   Natural Source=============================
Enter data in the natural source category  (if applicable)

    If the biomolecule was derived from a natural source, describe it here.
      

1. For entity 1
<natural_source_entity_id = " ">          (e.g. 1, 2, ...)
<natural_source_scientific_name = " ">    (e.g. Homo sapiens)
<natural_source_organism_strain = " ">    (e.g. DH5a , BMH 71-18)
<natural_source_details = " ">            (e.g. organ, tissue, cell ..)


2. For entity 2
<natural_source_entity_id = " ">    
<natural_source_scientific_name = " "> 
<natural_source_organism_strain = " ">    
<natural_source_details = " ">   


...(add more if needed)...

================CATEGORY 14:  Synthetic Source=============================
If the biomolecule was chemically synthesized, describe its source here. 

1. For entity 1
<synthetic_source_entity_id = " ">          (e.g. 1, 2, ...)
<synthetic_source_description = " ">      (if known)

2. For entity 2
<synthetic_source_entity_id = " ">    
<synthetic_source_description = " ">     

...(add more if needed)...


================CATEGORY 15:   Keywords===================================
Enter a list of keywords that describe important features of the deposited
structure.  

      For example, beta barrel, protein-DNA complex, double helix, 
      hydrolase, structural genomics etc. 

<structure_keywords = " ">  

================CATEGORY 16:   Biological Assembly========================
Enter data in the biological assembly category (if applicable)

      Biological assembly describes the functional unit(s) present in the
      structure. The asymmetric unit may contain part of a biological 
      assembly, exactly one assembly, or more than one assembly.
      Case 1
      * If the asymmetric unit is the same as the biological assembly,
	nothing special needs to be noted here.
      Case 2
      * If the asymmetric unit does not contain a complete biological unit,
	please provide the symmetry operations, including translations, 
	required to build the biological unit.
	(example:
	The biological assembly is a hexamer generated from the dimer
	in the asymmetric unit by the operations:  -y, x-y-1, z-1 and 
	-x+y, -x-1, z-1.)
      Case 3
      * If the asymmetric unit has multiple biological units,
	please specify how to group the contents of the asymmetric unit into 
	biological units.
	(example:
	The biological unit is a dimer. There are 2 biological units in the 
	asymmetric unit (chains A & B and chains C & D).)

<biological_assembly = " ">     (biological unit 1)
<biological_assembly = " ">     (biological unit 2)

....(add more if needed)....

================CATEGORY 17:   Methods and Conditions=====================
Enter the crystallization conditions for each crystal

1. For crystal 1:				
<crystal_number = "1 ">	           (e.g. 1, 2, ...)
<crystallization_method = " ">      (e.g. vapor diffusion, hanging drop) 
<crystallization_pH = " ">          (e.g. 7.5 ...)
<crystallization_temperature = " "> (e.g. 298) (in Kelvin) 
<crystallization_details = " ">  (e.g. PEG 4000, NaCl etc.)

2. For crystal 2:
<crystal_number = " ">                 
<crystallization_method = " ">
<crystallization_pH = " ">
<crystallization_temperature = " ">
<crystallization_details = " ">

...(add more if needed)...

================CATEGORY 18:   Crystal Property===========================
Enter solvent content, Matthews coefficient
      These values were calculated based on the sequence as shown in 
      CATEGORY 2. If there are missing residues, you need to add the
      missing residues and re-run the program to get accurate values.
      (The command to re-run is 'extract -sol data_template.text')

1. For crystal 1:
<crystals_number = " 1 ">                  (e.g. 1, 2, ...)
<crystals_solvent_content = "50.6 ">
<crystals_matthews_coefficient = "2.5 ">
<crystals_mosaicity = " ">    (e.g. 0.5 ...)


2. For crystal 2:
<crystals_number = "  ">                 
<crystals_solvent_content = "50.6 ">
<crystals_matthews_coefficient = "2.5 ">
<crystals_mosaicity = " ">    



...(add more if needed)...



================CATEGORY 19:   Radiation Source (experiment)============
Enter the details of the source of radiation, the X-ray generator, 
and the wavelength for each diffraction.

1. For experiment 1:
<radiation_experiment = "1 ">      (e.g. 1, 2, ...)
<radiation_source = " ">           (e.g. SYNCHROTRON, ROTATING ANODE ...)
<radiation_source_type = " ">      (e.g. NSLS BEAMLINE X8C ...)
<radiation_wavelengths= " ">       (e.g. 1.502 ...)
<radiation_detector = " ">         (e.g. CCD/AREA DETECTOR/IMAGE PLATE ...)
<radiation_detector_type= " ">     (e.g. SIEMENS-NICOLET/RIGAKU RAXIS ...)
<radiation_detector_details = " ">    (e.g. mirrors...)
<data_collection_date = " ">             (e.g. 2004-11-27)
<data_collection_temperature = " ">      (e.g. 100 for crystal 1)
<data_collection_protocol= " ">          (e.g. SINGLE WAVELENGTH, MAD, ...)
<data_collection_monochromator= " ">     (e.g. GRAPHITE, Ni FILTER ...)

2. For experiment 2:

<radiation_experiment = "2 ">      
<radiation_source = " ">      
<radiation_source_type = " ">      
<radiation_wavelengths= " ">       
<radiation_detector = " ">     
<radiation_detector_type= " ">     
<radiation_detector_details = " ">    
<data_collection_date = " ">           
<data_collection_temperature = " ">      
<data_collection_protocol= " ">          
<data_collection_monochromator= " ">          


....(add more if needed)....

=====================================END==================================
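
The two guidelines at the top of this template (only text inside <.....> is parsed, and values sit inside double quotes where blanks and carriage returns are ignored) can be illustrated with a small reader. The Python below is a minimal sketch under those stated rules, not the parser that pdb_extract itself uses. The function name parse_template, the returned list of (key, value) pairs, and the choice to drop line breaks and trim only the surrounding blanks are assumptions made for illustration.

```python
import re

# Matches <key = "value"> with optional blanks around the key, the '=' sign,
# and before the closing '>'. The value may span several lines but, per
# guideline 2, never contains a double quote.
FIELD = re.compile(r'<\s*([A-Za-z_0-9]+)\s*=\s*"([^"]*)"\s*>')


def parse_template(text):
    """Return (key, value) pairs in file order; repeated keys are kept."""
    pairs = []
    for key, raw in FIELD.findall(text):
        # Assumption: line breaks are dropped and surrounding blanks trimmed.
        value = raw.replace("\r", "").replace("\n", "").strip()
        pairs.append((key, value))
    return pairs


sample = '''
<space_group = "P 1 21 1"> (use International Table conventions)
<unit_cell_a     = "  56.800 " >
<molecule_one_letter_sequence="
MSHKQIYYSD
KYDDEEFEYR" >
< molecule_chain_id="A" >
'''

fields = parse_template(sample)
for key, value in fields:
    print(key, "=", value)
```

Text outside the angle brackets, such as the "(use International Table conventions)" hint, never reaches the result, which is why notes may sit beside a field but data may not. A list of pairs is used instead of a dictionary because keys such as structure_author_name repeat.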

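CATEGORY 18 reports a solvent content and a Matthews coefficient derived from the CATEGORY 1 cell and the CATEGORY 2 sequences. The sketch below shows the usual textbook calculation, V_M = V_cell / (Z x M), with the solvent fraction estimated as 1 - 1.23 / V_M. It is an illustration only and not necessarily how pdb_extract computes these values. The average residue mass of 110 Da, Z = 2 for space group P 1 21 1, one copy each of chains A and B per asymmetric unit, and the function names are assumptions, so the result will differ somewhat from the values printed in the template.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms; angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews(volume, n_asym_units, mass_da):
    """Return (V_M in cubic angstroms per dalton, estimated solvent fraction)."""
    vm = volume / (n_asym_units * mass_da)
    return vm, 1.0 - 1.23 / vm


# Cell parameters from CATEGORY 1 of the example template.
volume = cell_volume(56.800, 69.950, 60.530, 90.0, 114.5, 90.0)

# Residue counts of the two example entities (298 for chain A, 79 for chain B).
n_residues = 298 + 79
mass = n_residues * 110.0  # assumption: roughly 110 Da per residue

vm, solvent = matthews(volume, n_asym_units=2, mass_da=mass)
print(f"V_M = {vm:.2f} A^3/Da, solvent = {100 * solvent:.1f} %")
```

Because the mass enters the denominator, residues missing from the model lower the apparent mass and inflate both numbers, which is why the template asks for the sequence to be completed before the values are recalculated.
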
Script file: (log_script.inp)


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
		      	THE LOG_SCRIPT.INP FILE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

			 NOTES AND REMINDER 
This script file is used to enter the names of the crystallographic 
software used for structure determination and the log, PDB, mmCIF or 
text files generated by them.

PLEASE COMPLETE the ENTRY FIELDS according to the type of your experiment 
and use the command 'extract -ext log_script.inp' to obtain the completed 
structure data ready for validation and deposition.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 
			GUIDELINES FOR USING THIS FILE
  1. Only strings enclosed between the 'less than' and 'greater than' 
     signs (<.....>) are parsed by the program. Therefore, DO NOT enter 
     data to the left of the 'less than' sign or to the right of the 
     'greater than' sign.

  2. All alphanumeric values or strings that you include in the different 
     categories should be within double-quotes. Blank spaces or carriage 
     returns within a pair of double quotes are ignored by the program. 
     DO NOT use double quotes (") within strings that you enter.
   
  3. Log files used for generating the deposition should be generated from 
     the best (usually the last) trial for each crystallographic software.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~

===============PART 1: Structure Factor for Final Refinement==============
Enter reflection data file used for final structure refinement
	 
     NOTE:
     * 	Usually the highest resolution or best data set is used for the 
	refinement. Use that structure factor file here.
	
     *  In some cases, it may not be possible to collect a complete dataset 
	from a single crystal. Thus, multiple data sets have to be scaled 
	and merged together for refinement. Use the merged reflection file 
	here. 

     * 	If the reflection data format is not one of those listed below, 
	please use OTHER for the data format, and provide an ASCII file 
	that has at least five values [H, K, L, I (or F), sigmaI (or sigmaF)] 
	for each reflection, and separate the items by one or more spaces.   
	Include the test flags as the sixth column in the file (if available).

     * 	If the reflection file is in mtz format (e.g. using REFMAC5), convert 
	it to mmCIF format using the mtz2various application provided by CCP4. 

	Reflection data format:
  	CNS|SHELX|TNT|REFMAC5|HKL|SCALEPACK|DTREK|SAINT|SCALA|3DSCALE

<reflection_data_type = "F" >      [enter I (intensity) or F (amplitude)]
<reflection_data_format = "CNS" >
<reflection_data_file_name = " " >

==============PART 2: Structure Factors for Protein Phasing================
Enter reflection data files used for heavy atom or MAD phasing

     NOTE:
     *  Complete this part if you have more than one complete reflection 
	file (e.g. in the case of MAD, SIRAS, MIR). The LOG files generated 
	by the data scaling software for all these data sets are also needed.

     *  If the scaling program is not one of those listed below 
	(HKL|SCALEPACK|DTREK|SAINT|3DSCALE), enter OTHER for the program 
	name and provide an ASCII file with five values 
	[H, K, L, I (or F), sigmaI (or sigmaF)] for each reflection and 
        separate the items by a space.

     *  If the same crystal was used for collecting multiple data sets, the
	crystal number will remain '1' as the wavelength numbers change. 
	However, if multiple crystals were used for the data collections, 
	the corresponding crystal number should be used for each data set.

     *  IT IS IMPORTANT THAT THE LOG FILE AND DATA FILE COME FROM THE 
	SAME PROGRAM.

<scale_data_type = "I" >          [enter I (intensity) or F (amplitude)]
<scale_program_name = "HKL" >

For data set 1:
<crystal_number  = "1" >
<diffract_number = "1" >
<scale_data_file_name  = " " >
<scale_log_file_name   = " " >

For data set 2:
<crystal_number  = "1" >
<diffract_number = "2" >
<scale_data_file_name  = " " >
<scale_log_file_name   = " " >

For data set 3:
<crystal_number  = "1" >
<diffract_number = "3" >
<scale_data_file_name  = " " >
<scale_log_file_name   = " " >

==================PART 3: Statistics for Indexing=====================
Enter log file and software name for data indexing  

     NOTE: 
     * 	This applies only to the data used for final structure refinement.

	Software for indexing is one of the following: 
	(HKL|DENZO|DTREK|MOSFLM)

<data_indexing_software = "HKL" >
<data_indexing_LOG_file_name = " " >
<data_indexing_CIF_file_name = " " >  (if mmCIF format)

==================PART 4: Statistics for Data Scaling=====================
Enter log file and software name for data scaling 

     NOTE: 
     * 	The log file included here should have scaling statistics of 
	the file used for the final structure refinement. If multiple data 
	sets were scaled and merged for refinement (as described in Part 1
	above) use the log file generated during merging of the data sets. 

	Software for scaling is one of the following: 
	(HKL|SCALEPACK|DTREK|SAINT|3DSCALE|SCALA)

<data_scaling_software = "HKL" >
<data_scaling_LOG_file_name = " " >
<data_scaling_CIF_file_name = " " >  (if  mmCIF format)

==============PART 5: Statistics for Molecular Replacement================
Enter log files and software name for molecular replacement

     NOTE: 
	Software is one of the following:
	(CNS|AMORE|MOLREP|EPMR|PHASER)
	The log file should be from the best trial of MR.

<mr_software = " " >
<mr_log_file_LOG_1 = " " >
<mr_log_file_LOG_2 = " " >

=================PART 6: Statistics for Protein Phasing===================
Enter log files and software name for heavy atom phasing

     NOTE: 
        The phasing method should be one of (SAD|MAD|SIR|SIRAS|MIR|MIRAS).
	Software is one of the following:
	(CNS|MLPHARE|SOLVE|SHELXS|SHELXD|SNB|BNP|SHARP|PHASES)
	The log file should be from the best trial of phasing.

<phasing_method = "MAD" >        
<phasing_software = "SOLVE" >

<phasing_log_file_LOG_1 = " " >    
<phasing_log_file_PDB_1 = " " >    (if PDB format (heavy atom coordinates))
<phasing_log_file_CIF_1 = " " >    (if mmCIF format)

<phasing_log_file_LOG_2 = " " >
<phasing_log_file_PDB_2 = " " >
<phasing_log_file_CIF_2 = " " >

... add more if needed ...

===============PART 7: Statistics for Density Modification================
Enter log files and software name for density modification

     NOTE: 
	Software is one of the following:
	(CNS|DM|RESOLVE|SOLOMON|SHELXE)
	The log file should be from the best trial of density modification.

<dm_software = "RESOLVE " >
<dm_log_file_LOG_1 = " " >
<dm_log_file_CIF_1 = " " >         (if mmCIF format)

===============PART 8: Statistics for Structure Refinement================
Enter log files and software name used for final structure refinement

     NOTE: 
        
	Software is one of the following:
	(CNS|REFMAC5|SHELXL|TNT|PROLSQ|NUCLSQ|RESTRAIN)
	The log file should be from the final trial of structure refinement.

<refine_software = "REFMAC5" >

<refine_log_file_PDB_1 = " " >  (coordinate file in PDB format)
<refine_log_file_CIF_1 = " " >  (mmCIF file containing refinement statistics)
<refine_log_file_LOG_1 = " " >



=======================PART 9: Data Template File=========================
Enter file name of the data template file

     NOTE: 
	This file 'data_template.text' was generated by using the
	command 'extract -pdb pdb_file' or 'extract -cif cif_file'. It 
	contains the sequences of all unique polymers (protein or nucleic 
	acid) present in the structure. It also contains other 
	non-electronically captured information. Please complete the 
	data template file before running pdb_extract.

<data_template_file = "data_template.text" >


==========================PART 10: Output Files============================
Enter the output file names 

     NOTE: 
        If you do not give the output file names, the default names
        pdb_extract_sf.mmcif containing structure factors and 
        pdb_extract.mmcif containing coordinates will be assigned 
        by the program


<sf_output= " " >            (for structure factors)
<statistics_output= " " >    (for coordinates and statistics)

=====================================END==================================
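
Part 1 describes the fallback OTHER format: a plain ASCII file with at least five values per reflection (H, K, L, I or F, and the matching sigma) separated by one or more spaces, plus an optional sixth column for the test flag. A minimal sketch of a checker for that layout follows. The function name read_other_reflections, the tuple it returns, the integer test flag, and the made-up reflection values are all hypothetical, and the program's own reader may be stricter or more lenient.

```python
def read_other_reflections(lines):
    """Parse 'H K L value sigma [flag]' records separated by blanks."""
    reflections = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        if len(fields) < 5:
            raise ValueError(
                f"line {lineno}: expected at least 5 values, got {len(fields)}"
            )
        h, k, l = (int(x) for x in fields[:3])
        value, sigma = float(fields[3]), float(fields[4])
        # Assumption: the optional sixth column is an integer test flag.
        flag = int(fields[5]) if len(fields) > 5 else None
        reflections.append((h, k, l, value, sigma, flag))
    return reflections


# Made-up records, for illustration only.
example = [
    "  1   0   2   1532.40   21.70  0",
    "  1   0   3    287.15    9.32  1",
    " -2   1   4     45.90    6.10",
]
data = read_other_reflections(example)
print(len(data), "reflections read")
```

Running a check like this before filling in reflection_data_file_name catches short or non-numeric records early, instead of discovering them during extraction.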

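Parts 3 through 8 of the script file each restrict the software name (and, in Part 6, the phasing method) to a listed set. A quick pre-flight check of a filled-in script against those lists might look like the sketch below. The lists are copied from the notes in the script file. The names ALLOWED and check_software, the case-insensitive comparison, and the decision to treat a blank value as "part not used" are assumptions for illustration, not documented behaviour of pdb_extract.

```python
# Allowed names as listed in the NOTE of each part of log_script.inp.
ALLOWED = {
    "data_indexing_software": {"HKL", "DENZO", "DTREK", "MOSFLM"},
    "data_scaling_software": {"HKL", "SCALEPACK", "DTREK", "SAINT", "3DSCALE", "SCALA"},
    "mr_software": {"CNS", "AMORE", "MOLREP", "EPMR", "PHASER"},
    "phasing_method": {"SAD", "MAD", "SIR", "SIRAS", "MIR", "MIRAS"},
    "phasing_software": {"CNS", "MLPHARE", "SOLVE", "SHELXS", "SHELXD",
                         "SNB", "BNP", "SHARP", "PHASES"},
    "dm_software": {"CNS", "DM", "RESOLVE", "SOLOMON", "SHELXE"},
    "refine_software": {"CNS", "REFMAC5", "SHELXL", "TNT", "PROLSQ",
                        "NUCLSQ", "RESTRAIN"},
}


def check_software(fields):
    """Return one message per value that is not in the documented list."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = fields.get(key, "").strip()
        # Assumption: a blank value means the part is not used.
        if value and value.upper() not in allowed:
            problems.append(f"{key}: '{value}' is not one of {sorted(allowed)}")
    return problems


script = {
    "data_indexing_software": "HKL",
    "phasing_method": "MAD",
    "phasing_software": "SOLVE",
    "dm_software": "RESOLVE ",
    "refine_software": "REFMAC",  # deliberately misspelled
    "mr_software": " ",
}
problems = check_software(script)
for message in problems:
    print(message)
```

The dictionary passed in is expected to map field names to the quoted values taken from the script file.
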
Data template file for NMR: (data_template.text)




++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
		    THE DATA_TEMPLATE.TEXT FILE FOR NMR	
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

			  NOTES AND REMINDER
The data template file contains data entries for unique chemical sequences 
present in the structure and other non-electronically captured information. 

PLEASE CHECK CATEGORY 1: Before proceeding any further, make any necessary 
corrections here so that all information in this category is complete 
and correct.

You may choose to fill in CATEGORIES (2-21) either here or later in ADIT.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

			GUIDELINES FOR USING THIS FILE
  1. Only strings enclosed between the 'less than' and 'greater than' 
     signs (<.....>) are parsed by the program. Therefore, DO NOT enter 
     data to the left of the 'less than' sign or to the right of the 
     'greater than' sign.

  2. All alphanumeric values or strings that you include in the different 
     categories should be within double-quotes. Blank spaces or carriage 
     returns within a pair of double quotes are ignored by the program. 
     DO NOT use double quotes (") within strings that you enter.
   
   
~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~
 
================CATEGORY 1:   Molecular Entity Sequence===================
Enter one letter code sequence for each molecular entity

A molecular entity is defined as a unique monomer in each model. The
molecular entities are calculated and grouped together. 
Please check each entity carefully and modify it, if necessary. 

If a chain is broken, four question marks (????) are placed at the break 
point. Please REPLACE the question marks with the missing sequence, and add 
any residues missing from the N- and C-termini. If a residue has no standard 
one-letter code (due to modification), its three-letter name should be 
given in parentheses.

NOTE: If all the residues are modified, the sequence may not be extracted.
      In that case, please add the sequence manually.

<molecule_entity_id="1" >
<molecule_entity_type="polypeptide(L)" >
<molecule_one_letter_sequence=" 
MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDT????TAIREISLLKELNHPNIVKLLDVIHTENKLY
LVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFG
LARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFR
TLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQ
DVTKPVP" >
< molecule_chain_id="A" >
< target_DB_id=" " > (if known) 

<molecule_entity_id="2" >
<molecule_entity_type="polypeptide(L)" >
<molecule_one_letter_sequence=" 
QIYYSDKYDDEEFEYRHVMLPKDIAKLVPKTHLMSESEWRNLGVQQSQGWVHYMIHEPEPHILLFRRPLP
" >
< molecule_chain_id="B" >
< target_DB_id=" " > (if known) 

<molecule_entity_id=" " >
<molecule_entity_type=" " >
<molecule_one_letter_sequence="  " >
<molecule_chain_id=" " >

<target_DB_id=" " >  (if known)


================CATEGORY 2:   Contact Authors=============================
Enter information about the contact authors.
    Note: items marked by (e.g. ) are mandatory. 
          PI information should be always given.
   
1.  Information about the Principal investigator (PI) should be given. 

<contact_author_PI_id = "1 ">           (must be 1)
<contact_author_PI_salutation = " ">     ( Dr./Prof./Mr./Mrs./Ms.)
<contact_author_PI_first_name = " ">      (e.g. John)
<contact_author_PI_last_name = " ">        (e.g. Rodgers)
<contact_author_PI_middle_name = " ">         
<contact_author_PI_role = " ">   (e.g. investigator/responsible scientist)
<contact_author_PI_organization_type = " ">  (e.g. academic/commercial/government/other)
<contact_author_PI_email = " ">        (e.g.   name@host.domain.country)      
<contact_author_PI_address = " ">            (e.g. 610 Taylor Road)
<contact_author_PI_city = " ">               (e.g. Piscataway)
<contact_author_PI_State_or_Province = " ">   (e.g.  New Jersey)
<contact_author_PI_Zip_Code = " ">           (e.g.  08864)
<contact_author_PI_Country = " ">          (e.g.  UNITED STATES)
<contact_author_PI_fax_number = " ">
<contact_author_PI_phone_numer = " ">

2. Information about other contact authors

<contact_author_id = "2 ">       (e.g. 2,3,4..)
<contact_author_salutation = " ">   
<contact_author_first_name = " ">      
<contact_author_last_name = " ">       
<contact_author_middle_name = " ">         
<contact_author_role = " ">    
<contact_author_organization_type = " ">  
<contact_author_email = " ">             
<contact_author_address = " ">            
<contact_author_city = " ">              
<contact_author_State_or_Province = " ">   
<contact_author_Zip_Code = " ">           
<contact_author_Country = " ">          
<contact_author_fax_number = " ">
<contact_author_phone_numer = " ">


...(add more if needed)...
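
As the guidelines at the top of this file state, only the text between 
< and > is parsed, and every value is enclosed in double quotes. The Python 
sketch below shows one way such entries could be read back for a quick 
check of a filled-in template. It is not the parser used by pdb_extract. 
The function name read_entries is hypothetical, and trimming and collapsing 
whitespace inside the quotes is an assumption based on the guideline that 
blanks and carriage returns within quotes are ignored.

```python
import re

# Matches <name = "value">; the value may span several lines but,
# per the guidelines, must not itself contain a double quote.
ENTRY = re.compile(r'<\s*(?P<name>\w+)\s*=\s*"(?P<value>[^"]*)"\s*>')


def read_entries(text):
    """Return (name, value) pairs in file order.

    Text outside the angle brackets, such as the '(e.g. ...)' hints,
    is skipped. Runs of whitespace inside a value are collapsed to one
    space and leading/trailing whitespace is removed (an assumption).
    """
    entries = []
    for match in ENTRY.finditer(text):
        value = " ".join(match.group("value").split())
        entries.append((match.group("name"), value))
    return entries


sample = '''
<contact_author_PI_id = "1 ">           (must be 1)
<contact_author_PI_first_name = " John ">      (e.g. John)
<contact_author_PI_State_or_Province = "New
 Jersey">
'''
for name, value in read_entries(sample):
    print(name, "=", repr(value))
```

Names may repeat (for example, several contact authors), so the sketch 
returns a list of pairs rather than a dictionary.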

================CATEGORY 3:   Structural Genomics=========================
If this structure comes from a structural genomics project, give the 
information below.

<SG_project_id = " 1">  
<SG_project_name = " ">        (e.g. NPPSFA/PSI, Protein Structure Initiative)
<full_name_of_SG_center = " ">   (e.g. Berkeley Structural Genomics Center)


================CATEGORY 4:   Release Status==============================
Enter the release status for coordinates, constraints, and sequence.

   Status for sequence should be chosen from one of the following:
   (release now, hold for release)

   Status for others should be chosen from one of the following:
  (release now, hold for publication,  hold for 4 weeks, hold for 6 weeks, 
   hold for 6 months, hold for 1 year)

<Release_status_for_coordinates = " ">
<Release_status_for_NMR_constraints = " ">
<Release_status_for_sequence = " ">
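
The allowed values listed above can be checked mechanically. The Python 
sketch below is an illustration under stated assumptions, not part of 
pdb_extract: the function name is_valid_release_status is hypothetical, and 
comparing values case-insensitively after trimming blanks is an assumption.

```python
# Allowed values, copied from the lists in this category.
SEQUENCE_STATUSES = {"release now", "hold for release"}
OTHER_STATUSES = {
    "release now",
    "hold for publication",
    "hold for 4 weeks",
    "hold for 6 weeks",
    "hold for 6 months",
    "hold for 1 year",
}


def is_valid_release_status(item_name, value):
    """Check a release status against the list that applies to the item.

    The sequence item has its own, shorter list; coordinates and NMR
    constraints share the longer one.
    """
    if item_name == "Release_status_for_sequence":
        allowed = SEQUENCE_STATUSES
    else:
        allowed = OTHER_STATUSES
    # Assumption: extra blanks and letter case do not matter.
    normalized = " ".join(value.split()).lower()
    return normalized in allowed


print(is_valid_release_status("Release_status_for_coordinates", "hold for 6 months"))
print(is_valid_release_status("Release_status_for_sequence", "hold for 1 year"))
```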

================CATEGORY 5:   Title=======================================
Enter a title for the structure

<structure_title = " ">     (e.g. Crystal Structure Analysis of the B-DNA)
<structure_details = " ">  

================CATEGORY 6: Authors of Structure============================
Enter authors of the deposited structures (e.g. Surname, F.M.) 

<structure_author_name = " ">
<structure_author_name = " ">
<structure_author_name = " ">
<structure_author_name = " ">
...add more if needed...


================CATEGORY 7:   Citation Authors============================
Enter author names for the publications associated with this deposition.

      The primary citation is the article in which the deposited coordinates 
      were first reported. Other related citations may also be provided.

1. For the primary citation
<primary_citation_author_name = " ">    (e.g. Surname, F.M.) 
<primary_citation_author_name = " ">
<primary_citation_author_name = " ">
<primary_citation_author_name = " ">
...add more if needed...

2. For other related citations  (if applicable)
<citation_author_id = " ">    (e.g. 1, 2 ..)
<citation_author_name = " ">
<citation_author_name = " ">
<citation_author_name = " ">
<citation_author_name = " ">
...add more if needed...


...(add more citations if needed)...

================CATEGORY 8:   Citation Article============================
Enter citation article (journal, title, year, volume, page)  

      If the citation has not yet been published, use 'To be published' 
      for the category 'journal_abbrev' and leave pages and volume blank. 

1. For primary citation
<primary_citation_id = "primary">     
<primary_citation_journal_abbrev = " ">     (e.g. To be published)
<primary_citation_title = " ">   
<primary_citation_year = " ">
<primary_citation_journal_volume = " "> 
<primary_citation_page_first = " ">
<primary_citation_page_last = " ">

2. For other related citations (if applicable)
<citation_id = "1 ">               (e.g. 1, 2, 3 ...)
<citation_journal_abbrev = " ">
<citation_title = " ">
<citation_year = " ">
<citation_journal_volume = " "> 
<citation_page_first = " ">
<citation_page_last = " ">


...(add more citations if needed)...

================CATEGORY 9:   Molecule Names==============================
Enter the name of the molecule for each entity

      The name of the molecule should be obtained from the appropriate 
      sequence database reference, if available. Otherwise the gene name or
      other common name of the entity may be used. 
      e.g. HIV-1 integrase for protein 
           RNA Hammerhead Ribozyme for RNA 
      The number of entities should be the same as in CATEGORY 1.

<molecule_name = " ">    (entity 1)
<molecule_name = " ">    (entity 2)

...(add more if needed)...

================CATEGORY 10:  Molecule Details============================
Enter additional information about each entity, if known. (optional)

      Additional information would include details such as fragment name 
      (if applicable), mutation, and E.C. number.

1. For entity 1
<Molecular_entity_id = "1 ">       (e.g. 1, 2, ...)
<Fragment_name = " ">             (e.g. ligand binding domain, hairpin)
<Specific_mutation = " ">         (e.g. C280S)
<Enzyme_Comission_number = " ">   (if known: e.g. 2.7.7.7)

2. For entity 2
<Molecular_entity_id = "2 ">       
<Fragment_name = " ">   
<Specific_mutation = " ">      
<Enzyme_Comission_number = " "> 

...(add more if needed)...

================CATEGORY 11:   Genetically Manipulated Source==============
Enter data in the genetically manipulated source category 

      If the biomolecule has been genetically manipulated, describe its 
      source and expression system here. 

1. For entity 1
<Manipulated_entity_id = "1 ">               (e.g. 1, 2, ...)
<Source_organism_scientific_name = " ">      (e.g. Homo sapiens)
<Source_organism_gene = " ">                 (e.g. RPOD, ALKA...)
<Expression_system_scientific_name = " ">    (e.g. Escherichia coli)
<Expression_system_strain = " ">	     (e.g. BL21(DE3))
<Expression_system_vector_type = " ">	     (e.g. plasmid)
<Expression_system_plasmid_name = " ">       (e.g. pET26)
<Manipulated_source_details = " ">           (any other relevant information)

2. For entity 2
<Manipulated_entity_id = "2 ">       
<Source_organism_scientific_name = " ">    
<Source_organism_gene = " ">     
<Expression_system_scientific_name = " ">  
<Expression_system_strain = " ">	     
<Expression_system_vector_type = " ">	     
<Expression_system_plasmid_name = " ">     
<Manipulated_source_details = " ">        


...(add more if needed)...

================CATEGORY 12:   Natural Source=============================
Enter data in the natural source category  (if applicable)

    If the biomolecule was derived from a natural source, describe it here.
      

1. For entity 1
<natural_source_entity_id = " ">          (e.g. 1, 2, ...)
<natural_source_scientific_name = " ">    (e.g. Homo sapiens)
<natural_source_organism_strain = " ">    (e.g. DH5a , BMH 71-18)
<natural_source_details = " ">            (e.g. organ, tissue, cell ..)


2. For entity 2
<natural_source_entity_id = " ">    
<natural_source_scientific_name = " "> 
<natural_source_organism_strain = " ">    
<natural_source_details = " ">   


...(add more if needed)...

================CATEGORY 13:  Synthetic Source=============================
If the biomolecule was chemically synthesized rather than genetically 
manipulated or isolated from a natural source, describe its source here. 

1. For entity 1
<synthetic_source_entity_id = " ">          (e.g. 1, 2, ...)
<synthetic_source_description = " ">      (if known)

2. For entity 2
<synthetic_source_entity_id = " ">    
<synthetic_source_description = " ">     

...(add more if needed)...

================CATEGORY 14:   Keywords===================================
Enter a list of keywords that describe important features of the deposited
structure.  

      For example, beta barrel, protein-DNA complex, double helix, 
      hydrolase, structural genomics etc. 

<structure_keywords = " ">  

================CATEGORY 15:   Ensemble===================================
Enter data for the ensemble category.
   
  Skip this section if only one average structure has been deposited.

<conformers_calculated_total_number = " ">   (e.g. 200)
<conformers_submitted_total_number = " ">    (e.g. 20)
<conformers_selection_criteria = " ">  (e.g. 20 structures with the lowest energy)

================CATEGORY 16:   Representative Conformers==================
Enter data for the representative conformers category.

  Normally, only one member of the ensemble is selected as a representative
  structure.

<conformer_id = " ">      (e.g. 1,2..)
<conformer_selection_criteria = " ">  (e.g. lowest energy, fewest violations)

================CATEGORY 17:   Sample Details=============================
Enter a description of each NMR sample, including the solvent system used. 

1. for sample 1.
<solution_id_1= "1 ">       (e.g. 1, 2.. )
<solution_content_1= " ">  (e.g. 50mM phosphate buffer NA; 90% H2O, 10% D2O)
<solvent_system_1= " ">    (e.g. 90% H2O, 10% D2O )

2. for sample 2.
<solution_id_2= " ">  
<solution_content_2= " "> 
<solvent_system_2= " ">   

....add more if needed....

================CATEGORY 18:   Sample Conditions==========================
Enter experimental conditions used for each sample. 

  Each set of conditions is identified by a numerical code. 

1. for sample 1.
<Conditions_id_1 = "1 ">    (e.g. 1, 2..)
<Temperature_1 = " ">      (e.g. 298)  (in Kelvin) 
<Pressure_1 = " ">         (e.g. ambient, 1 atm)
<pH_value_1 = " ">         (e.g. 7.2)
<Ionic_strength_1 = " ">   (e.g. 100 mM KCl)

2. for sample 2.
<Conditions_id_2 = " ">  
<Temperature_2 = " ">   
<Pressure_2 = " ">   
<pH_value_2 = " ">     
<Ionic_strength_2 = " ">  

....add more if needed....

================CATEGORY 19:   Spectrometer===============================
Enter the details about each spectrometer used to collect data. 

1. for experiment 1:
<spectrometer_id_1 = "1 ">              (e.g. 1, 2..)
<spectrometer_manufacturer_1 = " ">    (e.g. Bruker ..) 
<spectrometer_model_1 = " ">           (e.g. DRX)
<spectrometer_field_strength_1 = " ">  (e.g. 500, 700)

2. for experiment 2:
<spectrometer_id_2 = " ">    
<spectrometer_manufacturer_2 = " ">    
<spectrometer_model_2 = " ">    
<spectrometer_field_strength_2 = " ">    

....add more if needed....

================CATEGORY 20:   Experiment Type============================
Enter information for those experiments that were used to generate
constraint data. For each NMR experiment, indicate which sample and 
which sample conditions were used for the experiment. 

1. for experiment type 1:
<experiment_type_id_1 = "1 ">    (e.g. 1, 2..)
<solution_type_id_1= " 1">       (same ID as solution_id_1 in CATEGORY 17)
<conditions_type_id_1 = "1 ">    (same ID as conditions_id_1 in CATEGORY 18)
<Experiment_type_1= " ">        (e.g. 3D_15N-separated_NOESY)

2. for experiment type 2:
<experiment_type_id_2 = " ">    (e.g. 1, 2..)
<solution_type_id_2= " ">       (a solution ID defined in CATEGORY 17)
<conditions_type_id_2 = " ">    (a conditions ID defined in CATEGORY 18)
<Experiment_type_2= " ">     

....add more if needed....
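
Because each experiment type must point to a sample from CATEGORY 17 and a 
set of conditions from CATEGORY 18, the IDs can be cross-checked once the 
template is filled in. The Python sketch below is a minimal illustration 
and is not part of pdb_extract. The function name 
check_experiment_references and the dictionary keys (the item names with 
their numeric suffix dropped) are assumptions made for this example.

```python
def check_experiment_references(solution_ids, condition_ids, experiments):
    """Return a list of problems found; an empty list means all IDs resolve.

    solution_ids  -- values entered for solution_id_N in CATEGORY 17
    condition_ids -- values entered for Conditions_id_N in CATEGORY 18
    experiments   -- one dict per experiment type in CATEGORY 20
    """
    # Values in the template often carry stray blanks ("1 ", " 1"),
    # so compare the stripped form and skip entries left empty.
    known_solutions = {s.strip() for s in solution_ids if s.strip()}
    known_conditions = {c.strip() for c in condition_ids if c.strip()}

    problems = []
    for exp in experiments:
        exp_id = exp["experiment_type_id"].strip()
        solution = exp["solution_type_id"].strip()
        conditions = exp["conditions_type_id"].strip()
        if solution not in known_solutions:
            problems.append(
                f"experiment {exp_id}: solution ID {solution!r} "
                "is not defined in CATEGORY 17"
            )
        if conditions not in known_conditions:
            problems.append(
                f"experiment {exp_id}: conditions ID {conditions!r} "
                "is not defined in CATEGORY 18"
            )
    return problems


# Hypothetical filled-in values: experiment 2 refers to a missing sample.
issues = check_experiment_references(
    solution_ids=["1 "],
    condition_ids=["1 "],
    experiments=[
        {"experiment_type_id": "1 ", "solution_type_id": " 1", "conditions_type_id": "1 "},
        {"experiment_type_id": "2", "solution_type_id": "2", "conditions_type_id": "1"},
    ],
)
for issue in issues:
    print(issue)
```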

================CATEGORY 21:   Method and Details=========================
Enter the method and details of the refinement for the deposited structure. 

<NMR_method = " ">   (e.g. simulated annealing)
<NMR_details = " ">  (enter details about the NMR refinement)


=====================================END==================================