
pdb_extract - Workstation Version Manual

Extract information from each step of X-ray crystallographic and NMR software applications

Table of Contents
  • pdb_extract Template Files
  • Data Template Files
    Data template file: (data_template.text)
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    		    THE DATA_TEMPLATE.TEXT FILE	FOR X-RAY		
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    			  NOTES AND REMINDER
    The data template file contains data entries for unique chemical sequences 
    present in the structure and other non-electronically captured information. 
    
    PLEASE CHECK CATEGORIES 1 & 2: Before proceeding any further, make necessary 
    corrections here so that all information in these categories is complete 
    and correct.
    
    You may choose to fill in CATEGORIES (3-19) either here or later in ADIT.
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    			GUIDELINES FOR USING THIS FILE
      1. Only strings enclosed between the 'less than' and 'greater than' 
         signs (<.....>) will be parsed for evaluation by the program. Therefore, 
         DO NOT enter data to the left of the 'less than' sign or to the right 
         of the 'greater than' sign.
    
      2. All alphanumeric values or strings that you include in the different 
         categories should be within double-quotes. Blank spaces or carriage 
         returns within a pair of double quotes are ignored by the program. 
         DO NOT use double quotes (") within strings that you enter.
       
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~
    
    ================CATEGORY 1:   Crystallographic Data=======================
    Enter crystallographic data
    
    <space_group = "P 1 21 1"> (use International Table conventions)
    <space_group_number = "? ">
    
    <unit_cell_a     = "  56.800 " >
    <unit_cell_b     = "  69.950 " >
    <unit_cell_c     = "  60.530 " >
    <unit_cell_alpha = " 90.00 " >
    <unit_cell_beta  = "114.50 " >
    <unit_cell_gamma = " 90.00 " >
    
     
    ================CATEGORY 2:   Sequence Information =======================
    Enter the one-letter sequence for each polymeric entity in the asymmetric unit
    
    --------------------------------------------------------------------------
    			  SOME DEFINITIONS
         An ENTITY is defined as any unique molecule present in the asymmetric 
         unit. Each unique biological polymer (protein or nucleic acids) in the 
         structure is considered an entity. Thus, if there are five copies of 
         a single protein in the asymmetric unit, the molecular entity is still 
         only one. Water and non-polymers like ions, ligands and sugars are 
         also entities. 
    
         Here we only consider the sequences of polymeric entities (protein or 
         nucleic acid).  
    
    	        GUIDELINES FOR COMPLETING THIS CATEGORY
          * In a PDB or mmCIF format file, all residues of a single polymeric 
          entity should have one chain ID. Multiple copies of the same entity 
          should each be assigned a unique chain ID. The multiple chain IDs 
          should be separated by commas as 'A,B,C,...'. If incorrect chain IDs 
          are used the entity groups extracted by this program will not be 
          correct. To avoid this, make necessary corrections in the PDB or mmCIF 
          file used to generate the data_template file and regenerate the 
          data_template.text file. Alternatively, edit the extracted sequence 
          in this file to correctly represent the sequence and chain IDs of each 
          polymeric entity.  
    
          * In addition to chain IDs, this program uses distance geometry to 
          assess if there are any breaks in the polymer sequence. These breaks 
          may occur due to missing residues (not included in the model due to 
          missing electron density) or due to poor geometry. Four question marks 
          '????' are used to denote these chain breaks. Replace these question 
          marks with the sequence of residues missing from the coordinates. Also 
          add any residues missing from the N- and/or C-termini here.
    
          * If there are non-standard residues in the coordinates, this program 
          lists them according to the three letter code used in the coordinate
          file as (ABC). If all the residues in your sequence are nonstandard, 
          check and edit the sequence manually to represent it correctly in this 
          file. 
    
          * If any residue was modeled as Ala or Gly due to lack of the side-chain 
          density, the sequence extracted here will represent them as A or G 
          respectively. Correct this to the original sequence that was present in 
          the crystal.
    ----------------------------------------------------------------------------
    
    	Below is the one letter chemical sequence extracted from your PDB 
    	coordinate file. The molecular entities are grouped and listed 
    	together. 
    
    PLEASE CHECK THE SEQUENCE of each entity carefully and modify it, as necessary.
    Make sure that you REVIEW THE FOLLOWING:  
       * chain breaks due to missing residues, 
       * missing residues in the N- and/or C-termini, 
       * non-standard residues and 
       * cases of residues modeled as Ala or Gly due to missing side-chain density.
    
    <molecule_entity_id="1" >
    <molecule_entity_type="polypeptide(L)" >
    <molecule_one_letter_sequence=" 
    MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVPSTAIREISLLKELNHPNIVKLLDVI
    HTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGA
    IKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEID
    QLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAA
    LAHPFFQDVTKPVPHLRL" >
    < molecule_chain_id="A" >
    < target_DB_id=" " > (if known) 
    
    <molecule_entity_id="2" >
    <molecule_entity_type="polypeptide(L)" >
    <molecule_one_letter_sequence=" 
    MSHKQIYYSDKYDDEEFEYRHVMLPKDIAKLVPKTHLMSESEWRNLGVQQSQGWVHYMIHEPEPHILLFR
    RPLPKKPKK" >
    < molecule_chain_id="B" >
    < target_DB_id=" " > (if known) 
    
    <molecule_entity_id=" " >
    <molecule_entity_type=" " >
    <molecule_one_letter_sequence="  " >
    <molecule_chain_id=" " >
    
    <target_DB_id=" " >  (if known)
    
    
    ================CATEGORY 3:   Contact Authors=============================
    Enter information about the contact authors.
        Note: items marked with an example (e.g. ) are mandatory. 
              PI information should always be given.
       
    1.  Information about the Principal investigator (PI) should be given. 
    
    <contact_author_PI_id = "1 ">           (must be given 1)
    <contact_author_PI_salutation = " ">     ( Dr./Prof./Mr./Mrs./Ms.)
    <contact_author_PI_first_name = " ">      (e.g. John)
    <contact_author_PI_last_name = " ">        (e.g. Rodgers)
    <contact_author_PI_middle_name = " ">         
    <contact_author_PI_role = " ">   (e.g. investigator/responsible scientist)
    <contact_author_PI_organization_type = " ">  (e.g. academic/commercial/government/other)
    <contact_author_PI_email = " ">        (e.g.   name@host.domain.country)      
    <contact_author_PI_address = " ">            (e.g. 610 Taylor road)
    <contact_author_PI_city = " ">               (e.g. Piscataway)
    <contact_author_PI_State_or_Province = " ">   (e.g.  New Jersey)
    <contact_author_PI_Zip_Code = " ">           (e.g.  08864)
    <contact_author_PI_Country = " ">          (e.g.  UNITED STATES)
    <contact_author_PI_fax_number = " ">
    <contact_author_PI_phone_numer = " ">
    
    2. Information about other contact authors
    
    <contact_author_id = "2 ">       (e.g. 2,3,4..)
    <contact_author_salutation = " ">   
    <contact_author_first_name = " ">      
    <contact_author_last_name = " ">       
    <contact_author_middle_name = " ">         
    <contact_author_role = " ">    
    <contact_author_organization_type = " ">  
    <contact_author_email = " ">             
    <contact_author_address = " ">            
    <contact_author_city = " ">              
    <contact_author_State_or_Province = " ">   
    <contact_author_Zip_Code = " ">           
    <contact_author_Country = " ">          
    <contact_author_fax_number = " ">
    <contact_author_phone_numer = " ">
    
    
    ...(add more if needed)...
    
    ================CATEGORY 4:   Structural Genomics========================
    If this structure comes from a structural genomics project, provide the 
    following information
    
    <SG_project_id = " 1">  
    <SG_project_name = " ">        (e.g. NPPSFA/PSI, Protein Structure Initiative)
    <full_name_of_SG_center = " ">   (e.g. Berkeley Structural Genomics Center)
    
    
    ================CATEGORY 5:   Release Status==============================
    Enter the release status for the coordinates, structure factors, and sequence
    
       Status for sequence should be chosen from one of the following:
       (release now, hold for release)
    
       Status for others should be chosen from one of the following:
      (release now, hold for publication,  hold for 4 weeks, hold for 6 weeks, 
       hold for 6 months, hold for 1 year)
    
    <Release_status_for_coordinates = " ">      (e.g. release now)
    <Release_status_for_structure_factor = " ">
    <Release_status_for_sequence = " ">     
    
    ================CATEGORY 6:   Title=======================================
    Enter the title for the structure
    
    <structure_title = " ">     (e.g. Crystal Structure Analysis of the B-DNA)
    <structure_details = " ">  
    
    
    ================CATEGORY 7: Authors of Structure============================
    Enter authors of the deposited structures (e.g. Surname, F.M.) 
    
    <structure_author_name = " ">
    <structure_author_name = " ">
    <structure_author_name = " ">
    <structure_author_name = " ">
    ...add more if needed...
    
    
    ================CATEGORY 8:   Citation Authors============================
    Enter author names for the publications associated with this deposition.
    
          The primary citation is the article in which the deposited coordinates 
          were first reported. Other related citations may also be provided.
    
    1. For the primary citation
    <primary_citation_author_name = " ">    (e.g. Surname, F.M.) 
    <primary_citation_author_name = " ">
    <primary_citation_author_name = " ">
    <primary_citation_author_name = " ">
    ...add more if needed...
    
    2. For other related citations  (if applicable)
    <citation_author_id = " ">    (e.g. 1, 2 ..)
    <citation_author_name = " ">
    <citation_author_name = " ">
    <citation_author_name = " ">
    <citation_author_name = " ">
    ...add more if needed...
    
    
    ...(add more other citations if needed)...
    
    ================CATEGORY 9:   Citation Article============================
    Enter citation article (journal, title, year, volume, page)  
    
          If the citation has not yet been published, use 'To be published' 
          for the category 'journal_abbrev' and leave pages and volume blank. 
    
    1. For primary citation
    <primary_citation_id = "primary">     
    <primary_citation_journal_abbrev = " ">     (e.g. to be published)
    <primary_citation_title = " ">   
    <primary_citation_year = " ">
    <primary_citation_journal_volume = " "> 
    <primary_citation_page_first = " ">
    <primary_citation_page_last = " ">
    
    2. For other related citation (if applicable)
    <citation_id = "1 ">               (e.g. 1, 2, 3 ...)
    <citation_journal_abbrev = " ">
    <citation_title = " ">
    <citation_year = " ">
    <citation_journal_volume = " "> 
    <citation_page_first = " ">
    <citation_page_last = " ">
    
    
    ...(add more citations if needed)...
    
    
    ================CATEGORY 10:   Molecule Names==============================
    Enter the names of the molecules (entities) that are in the asymmetric unit
     
    NOTE: The number of molecule names should match the number of entities in 
          CATEGORY 2. The name of each molecule should be obtained from the appropriate 
          sequence database reference, if available. Otherwise the gene name or
          other common name of the entity may be used. 
          e.g. HIV-1 integrase for protein 
               RNA Hammerhead Ribozyme for RNA 
    
    <molecule_name = " ">    (entity 1)
    <molecule_name = " ">    (entity 2)
    
    ...(add more if needed)...
    
    
    ================CATEGORY 11:   Molecule Details============================
    Enter additional information about each entity, if known. (optional)
    
          Additional information would include details such as fragment name 
          (if applicable), mutation, and E.C.number.
    
    1. For entity 1
    <Molecular_entity_id = "1 ">       (e.g. 1, 2, ...)
    <Fragment_name = " ">             (e.g. ligand binding domain, hairpin)
    <Specific_mutation = " ">         (e.g. C280S)
    <Enzyme_Comission_number = " ">   (if known: e.g. 2.7.7.7)
    
    2. For entity 2
    <Molecular_entity_id = "2 ">       
    <Fragment_name = " ">   
    <Specific_mutation = " ">      
    <Enzyme_Comission_number = " "> 
    
    ...(add more if needed)...
    
    ================CATEGORY 12:   Genetically Manipulated Source=============
    Enter data in the genetically manipulated source category 
    
          If the biomolecule has been genetically manipulated, describe its 
          source and expression system here. 
    
    1. For entity 1
    <Manipulated_entity_id = "1 ">               (e.g. 1, 2, ...)
    <Source_organism_scientific_name = " ">      (e.g. Homo sapiens)
    <Source_organism_gene = " ">                 (e.g. RPOD, ALKA...)
    <Source_organism_strain = " ">               (e.g. BH10 ISOLATE, K-12...)
    <Expression_system_scientific_name = " ">    (e.g. Escherichia coli)
    <Expression_system_strain = " ">	     (e.g. BL21(DE3))
    <Expression_system_vector_type = " ">	     (e.g. plasmid)
    <Expression_system_plasmid_name = " ">       (e.g. pET26)
    <Manipulated_source_details = " ">           (any other relevant information)
    
    2. For entity 2
    <Manipulated_entity_id = "2 ">       
    <Source_organism_scientific_name = " ">    
    <Source_organism_gene = " ">     
    <Source_organism_strain = " ">               
    <Expression_system_scientific_name = " ">  
    <Expression_system_strain = " ">	     
    <Expression_system_vector_type = " ">	     
    <Expression_system_plasmid_name = " ">     
    <Manipulated_source_details = " ">        
    
    
    ...(add more if needed)...
    
    ================CATEGORY 13:   Natural Source=============================
    Enter data in the natural source category  (if applicable)
    
        If the biomolecule was derived from a natural source, describe it here.
          
    
    1. For entity 1
    <natural_source_entity_id = " ">          (e.g. 1, 2, ...)
    <natural_source_scientific_name = " ">    (e.g. Homo sapiens)
    <natural_source_organism_strain = " ">    (e.g. DH5a , BMH 71-18)
    <natural_source_details = " ">            (e.g. organ, tissue, cell ..)
    
    
    2. For entity 2
    <natural_source_entity_id = " ">    
    <natural_source_scientific_name = " "> 
    <natural_source_organism_strain = " ">    
    <natural_source_details = " ">   
    
    
    ...(add more if needed)...
    
    ================CATEGORY 14:  Synthetic Source=============================
    If the biomolecule was chemically synthesized rather than genetically 
    manipulated or isolated from a natural source, describe its source here. 
    
    1. For entity 1
    <synthetic_source_entity_id = " ">          (e.g. 1, 2, ...)
    <synthetic_source_description = " ">      (if known)
    
    2. For entity 2
    <synthetic_source_entity_id = " ">    
    <synthetic_source_description = " ">     
    
    ...(add more if needed)...
    
    
    ================CATEGORY 15:   Keywords===================================
    Enter a list of keywords that describe important features of the deposited
    structure.  
    
          For example, beta barrel, protein-DNA complex, double helix, 
          hydrolase, structural genomics etc. 
    
    <structure_keywords = " ">  
    
    ================CATEGORY 16:   Biological Assembly========================
    Enter data in the biological assembly category (if applicable)
    
          Biological assembly describes the functional unit(s) present in the
          structure. The asymmetric unit may contain part of a biological 
          assembly, exactly one assembly, or more than one assembly.
          Case 1
          * If the asymmetric unit is the same as the biological assembly
    	nothing special needs to be noted here.
          Case 2
          * If the asymmetric unit does not contain a complete biological unit, 
    	please provide symmetry operations including translations required 
    	to build the biological unit.
    	(example:
    	The biological assembly is a hexamer generated from the dimer
    	in the asymmetric unit by the operations:  -y, x-y-1, z-1 and 
    	-x+y, -x-1, z-1.)
          Case 3
          * If the asymmetric unit has multiple biological units, 
    	please specify how to group the contents of the asymmetric unit into 
    	biological units.
    	(example:
    	The biological unit is a dimer. There are 2 biological units in the 
    	asymmetric unit (chains A & B and chains C & D).)
    
    <biological_assembly = " ">     (biological unit 1)
    <biological_assembly = " ">     (biological unit 2)
    
    ....(add more if needed)....
    
    ================CATEGORY 17:   Methods and Conditions=====================
    Enter the crystallization conditions for each crystal
    
    1. For crystal 1:				
    <crystal_number = "1 ">	           (e.g. 1, 2, ...)
    <crystallization_method = " ">      (e.g. vapor diffusion, hanging drop) 
    <crystallization_pH = " ">          (e.g. 7.5 ...)
    <crystallization_temperature = " "> (e.g. 298) (in Kelvin) 
    <crystallization_details = " ">  (e.g. PEG 4000, NaCl etc.)
    
    2. For crystal 2:
    <crystal_number = " ">                 
    <crystallization_method = " ">
    <crystallization_pH = " ">
    <crystallization_temperature = " ">
    <crystallization_details = " ">
    
    ...(add more if needed)...
    
    ================CATEGORY 18:   Crystal Property===========================
    Enter solvent content, Matthews coefficient
          These values were calculated based on the sequence as shown in 
          CATEGORY 2. If there are missing residues, you need to add the
          missing residues and re-run the program to get accurate values.
          (The command to re-run is 'extract -sol data_template.text')
    
    1. For crystal 1:
    <crystals_number = " 1 ">                  (e.g. 1, 2, ...)
    <crystals_solvent_content = "50.6 ">
    <crystals_matthews_coefficient = "2.5 ">
    <crystals_mosaicity = " ">    (e.g. 0.5 ...)
    
    
    2. For crystal 2:
    <crystals_number = "  ">                 
    <crystals_solvent_content = "50.6 ">
    <crystals_matthews_coefficient = "2.5 ">
    <crystals_mosaicity = " ">    
    
    
    
    ...(add more if needed)...
    
    
    
    ================CATEGORY 19:   Radiation Source (experiment)============
    Enter the details of the source of radiation, the X-ray generator, 
    and the wavelength for each diffraction.
    
    1. For experiment 1:
    <radiation_experiment = "1 ">      (e.g. 1, 2, ...)
    <radiation_source = " ">           (e.g. SYNCHROTRON, ROTATING ANODE ...)
    <radiation_source_type = " ">      (e.g. NSLS BEAMLINE X8C ...)
    <radiation_wavelengths= " ">       (e.g. 1.502 ...)
    <radiation_detector = " ">         (e.g. CCD/AREA DETECTOR/IMAGE PLATE ...)
    <radiation_detector_type= " ">     (e.g. SIEMENS-NICOLET/RIGAKU RAXIS ...)
    <radiation_detector_details = " ">    (e.g. mirrors...)
    <data_collection_date = " ">             (e.g. 2004-11-27)
    <data_collection_temperature = " ">      (e.g. 100 for crystal 1)
    <data_collection_protocol= " ">          (e.g. SINGLE WAVELENGTH, MAD, ...)
    <data_collection_monochromator= " ">     (e.g. GRAPHITE, Ni FILTER ...)
    
    2. For experiment 2:
    
    <radiation_experiment = "2 ">      
    <radiation_source = " ">      
    <radiation_source_type = " ">      
    <radiation_wavelengths= " ">       
    <radiation_detector = " ">     
    <radiation_detector_type= " ">     
    <radiation_detector_details = " ">    
    <data_collection_date = " ">           
    <data_collection_temperature = " ">      
    <data_collection_protocol= " ">          
    <data_collection_monochromator= " ">          
    
    
    ....(add more if needed)....
    
    =====================================END==================================
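The two guidelines at the top of this template describe a simple entry syntax: only text between '<' and '>' is read, and each value sits inside double quotes. As a rough illustration of how such a file could be read outside pdb_extract, here is a minimal Python sketch. It is not the parser used by the program. The function name parse_template and the strip_all_whitespace option are hypothetical, and the whitespace handling (collapsing runs of blanks, or removing them entirely for sequences) is an assumption based on guideline 2.

```python
import re

# One <name = "value"> entry. The value may span several lines; guideline 2
# forbids double quotes inside a value, so [^"]* is enough to capture it.
ENTRY = re.compile(r'<\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"([^"]*)"\s*>')


def parse_template(text, strip_all_whitespace=False):
    """Return a list of (name, value) pairs in file order.

    A list is used instead of a dict because names such as
    structure_author_name may legitimately appear several times.
    Anything outside the angle brackets (hints, headings) is skipped.
    """
    entries = []
    for name, raw in ENTRY.findall(text):
        if strip_all_whitespace:
            value = "".join(raw.split())   # useful for multi-line sequences
        else:
            value = " ".join(raw.split())  # collapse blanks and line breaks
        entries.append((name, value))
    return entries


sample = (
    '<space_group = "P 1 21 1"> (use International Table conventions)\n'
    '<unit_cell_a     = "  56.800 " >\n'
    '<molecule_one_letter_sequence=" \nMENFQ\nKVEK" >\n'
    '< molecule_chain_id="A" >\n'
)

for name, value in parse_template(sample):
    print(name, "=", repr(value))
```

Returning pairs rather than a dictionary keeps repeated entries (several authors, several entities) in the order they were written, which matters when entries are grouped by position.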
    
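CATEGORY 18 reports a solvent content and a Matthews coefficient derived from the unit cell (CATEGORY 1) and the sequences (CATEGORY 2). The sketch below shows the textbook form of that calculation: the cell volume divided by the total molecular mass in the cell gives the Matthews coefficient Vm, and the solvent fraction is estimated as 1 - 1.23/Vm. It is not taken from pdb_extract. The function names, the parameter z (number of asymmetric units per unit cell) and the rough figure of 110 Da per residue are assumptions made for illustration, so the result will only approximate the values printed in the template.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews(volume, z, molecular_weight):
    """Return (Vm in cubic angstroms per dalton, estimated solvent fraction).

    z is the number of asymmetric units in the unit cell and
    molecular_weight is the mass (Da) of one asymmetric unit.
    """
    vm = volume / (z * molecular_weight)
    solvent_fraction = 1.0 - 1.23 / vm
    return vm, solvent_fraction


# Unit cell from CATEGORY 1 of the example template (space group P 1 21 1).
volume = cell_volume(56.800, 69.950, 60.530, 90.00, 114.50, 90.00)

# Assumption: one copy of each chain (298 + 79 residues) at about 110 Da
# per residue. A real calculation would use exact residue masses.
approx_mass = (298 + 79) * 110.0

vm, solvent = matthews(volume, z=2, molecular_weight=approx_mass)
print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent:.1f} %")
```

Because the result depends on the sequence length, adding missing residues to CATEGORY 2 changes the mass and therefore both values, which is why the template asks for the program to be re-run after the sequence is corrected.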
    
    
    Script file: (log_script.inp)
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    		      	THE LOG_SCRIPT.INP FILE
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    			 NOTES AND REMINDER 
    This script file is used to enter the names of the crystallographic 
    software used for structure determination and the log, PDB, mmCIF or 
    text files generated by them.
    
    PLEASE COMPLETE the ENTRY FIELDS according to the type of your experiment 
    and use the command 'extract -ext log_script.inp' to obtain the completed 
    structure data ready for validation and deposition.
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     
    			GUIDELINES FOR USING THIS FILE
      1. Only strings enclosed between the 'less than' and 'greater than' 
         signs (<.....>) will be parsed for evaluation by the program. Therefore, 
         DO NOT enter data to the left of the 'less than' sign or to the right 
         of the 'greater than' sign.
    
      2. All alphanumeric values or strings that you include in the different 
         categories should be within double-quotes. Blank spaces or carriage 
         returns within a pair of double quotes are ignored by the program. 
         DO NOT use double quotes (") within strings that you enter.
       
      3. Log files used for generating the deposition should be generated from 
         the best (usually the last) trial for each crystallographic software.
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~
    
    ===============PART 1: Structure Factor for Final Refinement==============
    Enter reflection data file used for final structure refinement
    	 
         NOTE:
         * 	Usually the highest resolution or best data set is used for the 
    	refinement. Use that structure factor file here.
    	
         *  In some cases, it may not be possible to collect a complete dataset 
    	from a single crystal. Thus, multiple data sets have to be scaled 
    	and merged together for refinement. Use the merged reflection file 
    	here. 
    
         * 	If the reflection data format is not one of those listed below, 
    	please use OTHER for the data format, and provide an ASCII file 
    	that has at least five values [H, K, L, I (or F), sigmaI (or sigmaF)] 
    	for each reflection and separate each item by one or more spaces.   
    	Include the test flags as the sixth column in the file (if available).
    
         * 	If the reflection file is in mtz format (e.g. using REFMAC5), convert 
    	it to mmCIF format using the mtz2various application provided by CCP4. 
    
    	Reflection data format:
      	CNS|SHELX|TNT|REFMAC5|HKL|SCALEPACK|DTREK|SAINT|SCALA|3DSCALE
    
    <reflection_data_type = "F" >      [enter I (intensity) or F (amplitude)]
    <reflection_data_format = "CNS" >
    <reflection_data_file_name = " " >
    
    ==============PART 2: Structure Factors for Protein Phasing================
    Enter reflection data files used for heavy atom or MAD phasing
    
         NOTE:
         *  Complete this part if you have more than one complete reflection 
    	file (e.g. in the case of MAD, SIRAS, MIR). The LOG files generated 
    	from data scaling software for all these data sets are also needed.
    
         *  If the scaling program is not one of those listed below 
    	(HKL|SCALEPACK|DTREK|SAINT|3DSCALE), enter OTHER for the program 
    	name and provide an ASCII file with five values 
    	[H, K, L, I (or F), sigmaI (or sigmaF)] for each reflection and 
            separate each item by a space.
    
         *  If the same crystal was used for collecting multiple data sets, the
    	crystal number will remain '1' as the wavelength numbers change. 
    	However, if multiple crystals were used, for the data collections, 
    	the corresponding crystal numbers should be used for each data set.
    
         *  IT IS IMPORTANT THAT THE LOG FILE AND DATA FILE COME FROM THE 
    	SAME PROGRAM.
    
    <scale_data_type = "I" >          [enter I (intensity) or F (amplitude)]
    <scale_program_name = "HKL" >
    
    For data set 1:
    <crystal_number  = "1" >
    <diffract_number = "1" >
    <scale_data_file_name  = " " >
    <scale_log_file_name   = " " >
    
    For data set 2:
    <crystal_number  = "1" >
    <diffract_number = "2" >
    <scale_data_file_name  = " " >
    <scale_log_file_name   = " " >
    
    For data set 3:
    <crystal_number  = "1" >
    <diffract_number = "3" >
    <scale_data_file_name  = " " >
    <scale_log_file_name   = " " >
    
    ==================PART 3: Statistics for Indexing=====================
    Enter log file and software name for data indexing  
    
         NOTE: 
         * 	This applies only to the data used for final structure refinement.
    
    	Software for indexing is one of the following: 
    	(HKL|DENZO|DTREK|MOSFLM)
    
    <data_indexing_software = "HKL" >
    <data_indexing_LOG_file_name = " " >
    <data_indexing_CIF_file_name = " " >  (if mmCIF format)
    
    ==================PART 4: Statistics for Data Scaling=====================
    Enter log file and software name for data scaling 
    
         NOTE: 
         * 	The log file included here should have scaling statistics of 
    	the file used for the final structure refinement. If multiple data 
    	sets were scaled and merged for refinement (as described in Part 1
    	above) use the log file generated during merging of the data sets. 
    
    	Software for scaling is one of the following: 
    	(HKL|SCALEPACK|DTREK|SAINT|3DSCALE|SCALA)
    
    <data_scaling_software = "HKL" >
    <data_scaling_LOG_file_name = " " >
    <data_scaling_CIF_file_name = " " >  (if  mmCIF format)
    
    ==============PART 5: Statistics for Molecular Replacement================
    Enter log files and software name for molecular replacement
    
         NOTE: 
    	Software is one of the following:
    	(CNS|AMORE|MOLREP|EPMR|PHASER)
    	The log file should be from the best trial of MR.
    
    <mr_software = " " >
    <mr_log_file_LOG_1 = " " >
    <mr_log_file_LOG_2 = " " >
    
    =================PART 6: Statistics for Protein Phasing===================
    Enter log files and software name for heavy atom phasing
    
         NOTE: 
            The phasing method should be one of (SAD|MAD|SIR|SIRAS|MIR|MIRAS).
    	Software is one of the following:
    	(CNS|MLPHARE|SOLVE|SHELXS|SHELXD|SNB|BNP|SHARP|PHASES)
    	The log file should be from the best trial of phasing.
    
    <phasing_method = "MAD" >        
    <phasing_software = "SOLVE" >
    
    <phasing_log_file_LOG_1 = " " >    
    <phasing_log_file_PDB_1 = " " >    (if PDB format (heavy atom coordinates))
    <phasing_log_file_CIF_1 = " " >    (if mmCIF format)
    
    <phasing_log_file_LOG_2 = " " >
    <phasing_log_file_PDB_2 = " " >
    <phasing_log_file_CIF_2 = " " >
    
    ... add more if needed ...
    
    ===============PART 7: Statistics for Density Modification================
    Enter log files and software name for density modification
    
         NOTE: 
    	Software is one of the following:
    	(CNS|DM|RESOLVE|SOLOMON|SHELXE)
    	The log file should be from the best trial of density modification.
    
    <dm_software = "RESOLVE " >
    <dm_log_file_LOG_1 = " " >
    <dm_log_file_CIF_1 = " " >         (if mmCIF format)
    
    ===============PART 8: Statistics for Structure Refinement================
    Enter log files and software name used for final structure refinement
    
         NOTE: 
            
    	Software is one of the following:
    	(CNS|REFMAC5|SHELXL|TNT|PROLSQ|NUCLSQ|RESTRAIN)
    	The log file should be from the final trial of structure refinement.
    
    <refine_software = "REFMAC5" >
    
    <refine_log_file_PDB_1 = " " >  (coordinate file in PDB format)
    <refine_log_file_CIF_1 = " " >  (mmCIF file containing refinement statistics)
    <refine_log_file_LOG_1 = " " >
    
    
    
    =======================PART 9: Data Template File=========================
    Enter file name of the data template file
    
         NOTE: 
    	This file 'data_template.text' was generated by using the
    	command 'extract -pdb pdb_file' or 'extract -cif cif_file'. It 
    	contains the sequences of all unique polymers (protein or nucleic 
    	acid) present in the structure. It also contains other 
    	non-electronically captured information. Please complete the 
    	data template file before running pdb_extract.
    
    <data_template_file = "data_template.text" >
    
    
    ==========================PART 10: Output Files============================
    Enter the output file names 
    
         NOTE: 
            If you do not give the output file names, the default names
            pdb_extract_sf.mmcif containing structure factors and 
            pdb_extract.mmcif containing coordinates will be assigned 
            by the program
    
    
    <sf_output= " " >            (for structure factors)
    <statistics_output= " " >    (for coordinates and statistics)
    
    =====================================END==================================
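Several parts of this script file accept only one of a fixed list of program or method names. A small pre-check, such as the Python sketch below, can catch a misspelled name before the file is passed to the program. This is a hypothetical helper, not part of pdb_extract. The option lists are copied from the parts above; treating names as case-insensitive and skipping blank entries are assumptions.

```python
# Allowed values, copied from the NOTE sections of the script file.
# OTHER is included only where the notes mention it as a fallback.
ALLOWED = {
    "reflection_data_type": "I|F",
    "reflection_data_format":
        "CNS|SHELX|TNT|REFMAC5|HKL|SCALEPACK|DTREK|SAINT|SCALA|3DSCALE|OTHER",
    "scale_data_type": "I|F",
    "scale_program_name": "HKL|SCALEPACK|DTREK|SAINT|3DSCALE|OTHER",
    "data_indexing_software": "HKL|DENZO|DTREK|MOSFLM",
    "data_scaling_software": "HKL|SCALEPACK|DTREK|SAINT|3DSCALE|SCALA",
    "mr_software": "CNS|AMORE|MOLREP|EPMR|PHASER",
    "phasing_method": "SAD|MAD|SIR|SIRAS|MIR|MIRAS",
    "phasing_software":
        "CNS|MLPHARE|SOLVE|SHELXS|SHELXD|SNB|BNP|SHARP|PHASES",
    "dm_software": "CNS|DM|RESOLVE|SOLOMON|SHELXE",
    "refine_software": "CNS|REFMAC5|SHELXL|TNT|PROLSQ|NUCLSQ|RESTRAIN",
}


def check_choices(entries):
    """Check (name, value) pairs against ALLOWED; return a list of messages."""
    problems = []
    for name, value in entries:
        choices = ALLOWED.get(name)
        value = value.strip()
        if choices is None or not value:
            continue  # free-text entry, or an entry left blank
        if value.upper() not in choices.split("|"):
            problems.append(f"{name}: {value!r} is not one of {choices}")
    return problems


# Example: the defaults shown in the script file pass, a typo does not.
print(check_choices([("refine_software", "REFMAC5"), ("dm_software", "RESOLVE ")]))
print(check_choices([("phasing_software", "SOLVE2")]))
```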
    
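Part 1 allows reflection data in a format marked OTHER: a plain ASCII file with H, K, L, I (or F) and sigmaI (or sigmaF) on each line, separated by spaces, with an optional test flag in the sixth column. The following Python sketch writes such lines from a list of tuples. The function name format_reflections and the two-decimal precision are assumptions; the notes above specify only the column order and the space separator.

```python
def format_reflections(reflections):
    """Format reflections as space-separated text, one reflection per line.

    Each reflection is (h, k, l, value, sigma) or
    (h, k, l, value, sigma, test_flag); value is I or F.
    """
    lines = []
    for refl in reflections:
        h, k, l, value, sigma = refl[:5]
        fields = [f"{h:d}", f"{k:d}", f"{l:d}", f"{value:.2f}", f"{sigma:.2f}"]
        if len(refl) > 5:
            fields.append(f"{refl[5]:d}")  # optional sixth column: test flag
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Made-up numbers, used only to show the layout.
text = format_reflections([
    (1, 0, -2, 1523.4, 35.12, 0),
    (0, 0, 4, 88.0, 4.5, 1),
])
print(text, end="")
```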
    
    
    
    Data template file for NMR: (data_template.text)
    
    
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    		    THE DATA_TEMPLATE.TEXT FILE FOR NMR	
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    			  NOTES AND REMINDER
    The data template file contains data entries for unique chemical sequences 
    present in the structure and other non-electronically captured information. 
    
    PLEASE CHECK CATEGORY 1: Before proceeding any further, make necessary 
    corrections here so that all information in this category is complete 
    and correct.
    
    You may choose to fill in CATEGORIES (2-21) either here or later in ADIT.
    
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    
    			GUIDELINES FOR USING THIS FILE
      1. Only strings enclosed between the 'less than' and 'greater than' 
         signs (<.....>) will be parsed for evaluation by the program. Therefore, 
         DO NOT enter data to the left of the 'less than' sign or to the right 
         of the 'greater than' sign.
    
      2. All alphanumeric values or strings that you include in the different 
         categories should be within double-quotes. Blank spaces or carriage 
         returns within a pair of double quotes are ignored by the program. 
         DO NOT use double quotes (") within strings that you enter.
       
       
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~~
     
    ================CATEGORY 1:   Molecular Entity Sequence===================
    Enter the one-letter code sequence for each molecular entity.
    
    A molecular entity is defined as a unique monomer in each model. The
    molecular entities are calculated and grouped together. 
    Please check each entity carefully and modify it if necessary. 
    
    If a chain is broken, four question marks (????) are given at the break
    point. Please REPLACE the question marks with the missing sequence, 
    including the N and C termini. If a residue has no standard one-letter 
    code (due to modification), give its full three-letter name in 
    parentheses.
    
    NOTE: If all the residues are modified, the sequence may not be extracted.
          Please add the sequence manually.
    
    <molecule_entity_id="1" >
    <molecule_entity_type="polypeptide(L)" >
    <molecule_one_letter_sequence=" 
    MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDT????TAIREISLLKELNHPNIVKLLDVIHTENKLY
    LVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFG
    LARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFR
    TLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQ
    DVTKPVP" >
    < molecule_chain_id="A" >
    < target_DB_id=" " > (if known) 
    
    <molecule_entity_id="2" >
    <molecule_entity_type="polypeptide(L)" >
    <molecule_one_letter_sequence=" 
    QIYYSDKYDDEEFEYRHVMLPKDIAKLVPKTHLMSESEWRNLGVQQSQGWVHYMIHEPEPHILLFRRPLP
    " >
    < molecule_chain_id="B" >
    < target_DB_id=" " > (if known) 
    
    <molecule_entity_id=" " >
    <molecule_entity_type=" " >
    <molecule_one_letter_sequence="  " >
    <molecule_chain_id=" " >
    
    <target_DB_id=" " >  (if known)
    
    
    ================CATEGORY 2:   Contact Authors=============================
    Enter information about the contact authors.
        Note: items marked by (e.g. ) are mandatory. 
              PI information must always be given.
       
    1.  Information about the Principal investigator (PI) should be given. 
    
    <contact_author_PI_id = "1 ">           (must be 1)
    <contact_author_PI_salutation = " ">     ( Dr./Prof./Mr./Mrs./Ms.)
    <contact_author_PI_first_name = " ">      (e.g. John)
    <contact_author_PI_last_name = " ">        (e.g. Rodgers)
    <contact_author_PI_middle_name = " ">         
    <contact_author_PI_role = " ">   (e.g. investigator/responsible scientist)
    <contact_author_PI_organization_type = " ">  (e.g. academic/commercial/government/other)
    <contact_author_PI_email = " ">        (e.g.   name@host.domain.country)      
    <contact_author_PI_address = " ">            (e.g. 610 Taylor Road)
    <contact_author_PI_city = " ">               (e.g. Piscataway)
    <contact_author_PI_State_or_Province = " ">   (e.g.  New Jersey)
    <contact_author_PI_Zip_Code = " ">           (e.g.  08864)
    <contact_author_PI_Country = " ">          (e.g.  UNITED STATES)
    <contact_author_PI_fax_number = " ">
    <contact_author_PI_phone_numer = " ">
    
    2. Information about other contact authors
    
    <contact_author_id = "2 ">       (e.g. 2,3,4..)
    <contact_author_salutation = " ">   
    <contact_author_first_name = " ">      
    <contact_author_last_name = " ">       
    <contact_author_middle_name = " ">         
    <contact_author_role = " ">    
    <contact_author_organization_type = " ">  
    <contact_author_email = " ">             
    <contact_author_address = " ">            
    <contact_author_city = " ">              
    <contact_author_State_or_Province = " ">   
    <contact_author_Zip_Code = " ">           
    <contact_author_Country = " ">          
    <contact_author_fax_number = " ">
    <contact_author_phone_numer = " ">
    
    
    ...(add more if needed)...
    
    ================CATEGORY 3:   Structural Genomics========================
    If this is a structural genomics project, give the following information.
    
    <SG_project_id = " 1">  
    <SG_project_name = " ">        (e.g. NPPSFA/PSI, Protein Structure Initiative)
    <full_name_of_SG_center = " ">   (e.g. Berkeley Structural Genomics Center)
    
    
    ================CATEGORY 4:   Release Status==============================
    Enter the release status for coordinates, constraints, and sequence
    
       Status for sequence should be chosen from one of the following:
       (release now, hold for release)
    
       Status for others should be chosen from one of the following:
      (release now, hold for publication,  hold for 4 weeks, hold for 6 weeks, 
       hold for 6 months, hold for 1 year)
    
    <Release_status_for_coordinates = " ">
    <Release_status_for_NMR_constraints = " ">
    <Release_status_for_sequence = " ">
    
    ================CATEGORY 5:   Title=======================================
    Enter a title for the structure
    
    <structure_title = " ">     (e.g. Crystal Structure Analysis of the B-DNA)
    <structure_details = " ">  
    
    ================CATEGORY 6: Authors of Structure============================
    Enter authors of the deposited structures (e.g. Surname, F.M.) 
    
    <structure_author_name = " ">
    <structure_author_name = " ">
    <structure_author_name = " ">
    <structure_author_name = " ">
    ...add more if needed...
    
    
    ================CATEGORY 7:   Citation Authors============================
    Enter author names for the publications associated with this deposition.
    
          The primary citation is the article in which the deposited coordinates 
          were first reported. Other related citations may also be provided.
    
    1. For the primary citation
    <primary_citation_author_name = " ">    (e.g. Surname, F.M.) 
    <primary_citation_author_name = " ">
    <primary_citation_author_name = " ">
    <primary_citation_author_name = " ">
    ...add more if needed...
    
    2. For other related citations  (if applicable)
    <citation_author_id = " ">    (e.g. 1, 2 ..)
    <citation_author_name = " ">
    <citation_author_name = " ">
    <citation_author_name = " ">
    <citation_author_name = " ">
    ...add more if needed...
    
    
    ...(add more related citations if needed)...
    
    ================CATEGORY 8:   Citation Article============================
    Enter citation article (journal, title, year, volume, page)  
    
          If the citation has not yet been published, use 'To be published' 
          for the journal abbreviation and leave the pages and volume blank. 
    
    1. For primary citation
    <primary_citation_id = "primary">     
    <primary_citation_journal_abbrev = " ">     (e.g. to be published)
    <primary_citation_title = " ">   
    <primary_citation_year = " ">
    <primary_citation_journal_volume = " "> 
    <primary_citation_page_first = " ">
    <primary_citation_page_last = " ">
    
    2. For other related citations (if applicable)
    <citation_id = "1 ">               (e.g. 1, 2, 3 ...)
    <citation_journal_abbrev = " ">
    <citation_title = " ">
    <citation_year = " ">
    <citation_journal_volume = " "> 
    <citation_page_first = " ">
    <citation_page_last = " ">
    
    
    ...(add more citations if needed)...
    
    ================CATEGORY 9:   Molecule Names==============================
    Enter the name of the molecule for each entity
    
          The name of the molecule should be obtained from the appropriate 
          sequence database reference, if available. Otherwise, the gene name 
          or another common name of the entity may be used. 
          e.g. HIV-1 integrase for protein 
               RNA Hammerhead Ribozyme for RNA 
          The number of entities should be the same as in CATEGORY 1.
    
    <molecule_name = " ">    (entity 1)
    <molecule_name = " ">    (entity 2)
    
    ...(add more if needed)...
    
    ================CATEGORY 10:  Molecule Details============================
    Enter additional information about each entity, if known. (optional)
    
          Additional information includes details such as the fragment name 
          (if applicable), mutation, and E.C. number.
    
    1. For entity 1
    <Molecular_entity_id = "1 ">       (e.g. 1, 2, ...)
    <Fragment_name = " ">             (e.g. ligand binding domain, hairpin)
    <Specific_mutation = " ">         (e.g. C280S)
    <Enzyme_Comission_number = " ">   (if known: e.g. 2.7.7.7)
    
    2. For entity 2
    <Molecular_entity_id = "2 ">       
    <Fragment_name = " ">   
    <Specific_mutation = " ">      
    <Enzyme_Comission_number = " "> 
    
    ...(add more if needed)...
    
    ================CATEGORY 11:   Genetically Manipulated Source==============
    Enter data in the genetically manipulated source category 
    
          If the biomolecule has been genetically manipulated, describe its 
          source and expression system here. 
    
    1. For entity 1
    <Manipulated_entity_id = "1 ">               (e.g. 1, 2, ...)
    <Source_organism_scientific_name = " ">      (e.g. Homo sapiens)
    <Source_organism_gene = " ">                 (e.g. RPOD, ALKA...)
    <Expression_system_scientific_name = " ">    (e.g. Escherichia coli)
    <Expression_system_strain = " ">	     (e.g. BL21(DE3))
    <Expression_system_vector_type = " ">	     (e.g. plasmid)
    <Expression_system_plasmid_name = " ">       (e.g. pET26)
    <Manipulated_source_details = " ">           (any other relevant information)
    
    2. For entity 2
    <Manipulated_entity_id = "2 ">       
    <Source_organism_scientific_name = " ">    
    <Source_organism_gene = " ">     
    <Expression_system_scientific_name = " ">  
    <Expression_system_strain = " ">	     
    <Expression_system_vector_type = " ">	     
    <Expression_system_plasmid_name = " ">     
    <Manipulated_source_details = " ">        
    
    
    ...(add more if needed)...
    
    ================CATEGORY 12:   Natural Source=============================
    Enter data in the natural source category  (if applicable)
    
        If the biomolecule was derived from a natural source, describe it here.
          
    
    1. For entity 1
    <natural_source_entity_id = " ">          (e.g. 1, 2, ...)
    <natural_source_scientific_name = " ">    (e.g. Homo sapiens)
    <natural_source_organism_strain = " ">    (e.g. DH5a , BMH 71-18)
    <natural_source_details = " ">            (e.g. organ, tissue, cell ..)
    
    
    2. For entity 2
    <natural_source_entity_id = " ">    
    <natural_source_scientific_name = " "> 
    <natural_source_organism_strain = " ">    
    <natural_source_details = " ">   
    
    
    ...(add more if needed)...
    
    ================CATEGORY 13:  Synthetic Source=============================
    If the biomolecule was synthesized rather than genetically manipulated 
    or derived from a natural source, describe its source here. 
    
    1. For entity 1
    <synthetic_source_entity_id = " ">          (e.g. 1, 2, ...)
    <synthetic_source_description = " ">      (if known)
    
    2. For entity 2
    <synthetic_source_entity_id = " ">    
    <synthetic_source_description = " ">     
    
    ...(add more if needed)...
    
    ================CATEGORY 14:   Keywords===================================
    Enter a list of keywords that describe important features of the deposited
    structure.  
    
          For example, beta barrel, protein-DNA complex, double helix, 
          hydrolase, structural genomics etc. 
    
    <structure_keywords = " ">  
    
    ================CATEGORY 15:   Ensemble===================================
    Enter data in category ensemble
       
      Skip this section if only one average structure has been deposited.
    
    <conformers_calculated_total_number = " ">   (e.g. 200)
    <conformers_submitted_total_number = " ">    (e.g. 20)
    <conformers_selection_criteria = " ">  (e.g. 20 structures with the lowest energy)
    
    ================CATEGORY 16:   Representative Conformers==================
    Enter data in category representative conformers
    
      Normally, only one member of the ensemble is selected as the 
      representative structure.
    
    <conformer_id = " ">      (e.g. 1,2..)
    <conformer_selection_criteria = " ">  (e.g. lowest energy, fewest violations)
    
    ================CATEGORY 17:   Sample Details=============================
    Enter a description of each NMR sample, including the solvent system used. 
    
    1. for sample 1.
    <solution_id_1= "1 ">       (e.g. 1, 2.. )
    <solution_content_1= " ">  (e.g. 50mM phosphate buffer NA; 90% H2O, 10% D2O)
    <solvent_system_1= " ">    (e.g. 90% H2O, 10% D2O )
    
    2. for sample 2.
    <solution_id_2= " ">  
    <solution_content_2= " "> 
    <solvent_system_2= " ">   
    
    ....add more if needed....
    
    ================CATEGORY 18:   Sample Conditions==========================
    Enter experimental conditions used for each sample. 
    
      Each set of conditions is identified by a numerical code. 
    
    1. for sample 1.
    <Conditions_id_1 = "1 ">    (e.g. 1, 2..)
    <Temperature_1 = " ">      (e.g. 298)  (in Kelvin) 
    <Pressure_1 = " ">         (e.g. ambient, 1 atm)
    <pH_value_1 = " ">         (e.g. 7.2)
    <Ionic_strength_1 = " ">   (e.g.  100 mM KCl)
    
    2. for sample 2.
    <Conditions_id_2 = " ">  
    <Temperature_2 = " ">   
    <Pressure_2 = " ">   
    <pH_value_2 = " ">     
    <Ionic_strength_2 = " ">  
    
    ....add more if needed....
    
    ================CATEGORY 19:   Spectrometer===============================
    Enter the details about each spectrometer used to collect data. 
    
    1. for experiment 1:
    <spectrometer_id_1 = "1 ">              (e.g. 1, 2..)
    <spectrometer_manufacturer_1 = " ">    (e.g. Bruker ..) 
    <spectrometer_model_1 = " ">           (e.g. DRX)
    <spectrometer_field_strength_1 = " ">  (e.g. 500, 700)
    
    2. for experiment 2:
    <spectrometer_id_2 = " ">    
    <spectrometer_manufacturer_2 = " ">    
    <spectrometer_model_2 = " ">    
    <spectrometer_field_strength_2 = " ">    
    
    ....add more if needed....
    
    ================CATEGORY 20:   Experiment Type============================
    Enter information for those experiments that were used to generate
    constraint data. For each NMR experiment, indicate which sample and 
    which sample conditions were used for the experiment. 
    
    1. for experiment type 1:
    <experiment_type_id_1 = "1 ">    (e.g. 1, 2..)
    <solution_type_id_1= " 1">       (same ID as solution_id_1 in CATEGORY 17)
    <conditions_type_id_1 = "1 ">    (same ID as conditions_id_1 in CATEGORY 18)
    <Experiment_type_1= " ">        (e.g. 3D_15N-separated_NOESY)
    
    2. for experiment type 2:
    <experiment_type_id_2 = " ">    (e.g. 1, 2..)
    <solution_type_id_2= " ">       (same ID as one of the solution IDs in CATEGORY 17)
    <conditions_type_id_2 = " ">    (same ID as one of the conditions IDs in CATEGORY 18)
    <Experiment_type_2= " ">     
    
    ....add more if needed....
    
    ================CATEGORY 21:   Method and Details=========================
    Enter the method and details of the refinement for the deposited structure. 
    
    <NMR_method = " ">   (e.g. simulated annealing)
    <NMR_details = " ">  (enter details about the NMR refinement)
    
    
    =====================================END==================================
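
Before running pdb_extract, a completed template can be checked against the rules stated above: entries live between < and >, values sit in double quotes, CATEGORY 1 sequences mix one-letter codes with parenthesized three-letter names and ???? break markers, and the IDs in CATEGORY 20 must match IDs given in CATEGORIES 17 and 18. The following is a minimal sketch in Python, not the parser used by pdb_extract. All function names are hypothetical, and the sketch assumes that surrounding whitespace in a value is insignificant, that whitespace inside a sequence is dropped, and that a parenthesized residue name has exactly three alphanumeric characters.

```python
import re

# One template entry: <name = "value">; the value may span several lines.
ENTRY = re.compile(r'<\s*(\w+)\s*=\s*"([^"]*)"\s*>')

# A sequence token: a parenthesized three-letter name, a ???? chain
# break, or a single one-letter code (assumed forms, see lead-in).
TOKEN = re.compile(r'\(([A-Za-z0-9]{3})\)|(\?{4})|([A-Za-z])')

def parse_entries(text):
    """Return (name, value) pairs in file order; names may repeat."""
    return [(name, value.strip()) for name, value in ENTRY.findall(text)]

def tokenize_sequence(value):
    """Split a CATEGORY 1 sequence into residue tokens."""
    compact = "".join(value.split())  # drop spaces and line breaks
    tokens, pos = [], 0
    while pos < len(compact):
        match = TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(
                f"unexpected character at offset {pos}: {compact[pos]!r}")
        tokens.append(match.group(0))
        pos = match.end()
    return tokens

def unresolved_breaks(tokens):
    """Count ???? markers that still need to be replaced."""
    return tokens.count("????")

def dangling_references(entries):
    """Find CATEGORY 20 IDs with no matching CATEGORY 17 or 18 ID."""
    def ids(prefix):
        return {v for k, v in entries if k.lower().startswith(prefix) and v}
    solutions = ids("solution_id_")
    conditions = ids("conditions_id_")
    problems = []
    for key, value in entries:
        low = key.lower()
        if not value:
            continue  # blank entries are treated as not filled in
        if low.startswith("solution_type_id_") and value not in solutions:
            problems.append((key, value))
        if low.startswith("conditions_type_id_") and value not in conditions:
            problems.append((key, value))
    return problems
```

For entity 1 in the example above, unresolved_breaks would report one remaining ???? marker, which must be replaced with the missing residues before deposition. Entries are kept as an ordered list rather than a dictionary because names such as structure_author_name legitimately repeat.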