sf_convert - Workstation Version Manual

Converting structure factors from one format to another

(Created Nov. 12, 2007; last modified Oct. 10, 2008) | (Latest version 1.050)

Introduction

The program sf_convert helps depositors convert structure factor (SF) data from various formats to the mmCIF (macromolecular Crystallographic Information File) format for deposition. It can also convert deposited mmCIF SF data to other formats so that the data can be used in other programs.

What does sf_convert do?

  • Convert various formats (mmCIF, CIF, MTZ, CNS, Xplor, HKL2000, Scalepack, Dtrek, TNT, SHELX, SAINT, EPMR, XSCALE, XPREP, XTALVIEW, X-GEN, XENGEN, MULTAN, MAIN, OTHER) to the mmCIF format.

  • Convert from the mmCIF format to other formats (MTZ, CNS, TNT, SHELX, EPMR, XTALVIEW, HKL2000, Dtrek, XSCALE, MULTAN, MAIN, OTHER), so that deposited mmCIF files can be used in other programs.

  • Convert directly from one format to another without losing information.

  • Add a test flag to the SF data for the free R calculation.

  • Convert multiple data sets. Multiple data sets can be merged into one mmCIF file.

  • Label data items manually for the CNS and MTZ formats, for a safer conversion if the automatic conversion fails.

For details, see the 'Run the program' section below.

Program access

The source and binary versions of sf_convert can be downloaded from http://deposit.pdb.org/software. The source is available under an Open Source license. The binary distributions are available for Intel-Linux, SGI-IRIX, DEC-Alpha, and Sun-Solaris.

Installation

System Requirements:

  • Platforms: Intel-Linux, SGI-IRIX, DEC-Alpha, or Sun-Solaris
  • C/C++ compilers

Installation of binary distribution

Installing the binary distribution is recommended, since it is quick to install. The binary distributions are available for Intel-Linux, SGI-IRIX, DEC-Alpha, and Sun-Solaris.

Uncompress and unbundle the distribution using the following command:

    zcat sf-convert-vX.XXX-XXX.tar.gz | tar -xf -

The executable file sf_convert is in the directory sf-convert-vX.XXX-XXX/bin.
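
To call the program by name from any directory, the bin directory can be added to the shell search path. The lines below are only a sketch: SF_HOME is a hypothetical variable, and the unpack location (the home directory) and the version string are placeholders to be replaced with the real ones.

```shell
# Placeholder unpack location; substitute the real directory name.
SF_HOME="$HOME/sf-convert-vX.XXX-XXX"

# Prepend the bin directory so the shell can find sf_convert by name.
PATH="$SF_HOME/bin:$PATH"
export PATH

# Report where the shell now finds the program, if anywhere.
command -v sf_convert || echo "sf_convert not found; check SF_HOME"
```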

Installation of source code distribution
    1.  Installation   
     
        Uncompress and unbundle the distribution using the following command:
    
            zcat sf-convert-vX.XXX-XXX.tar.gz | tar -xf - 
    
    2.  Building the Application
    
        Change to the sf-convert-vX.XXX-XXX directory and run the "make" command:
    
            cd sf-convert-vX.XXX-XXX 
            make
    
        The executable file sf_convert is in the sf-convert-vX.XXX-XXX/bin
        subdirectory.
    
        NOTE: Users working on Sun platforms are advised to check the compiler
        flags in the etc/make.platform.sunos5 file. Depending on the compiler
        version, those flags may need to be modified.
    
    
Run the program

There is a test example included in this distribution. To run the example, change to the sf-convert-vX.XXX-XXX directory and run the "make test" command. The output files are in the sf-convert-vX.XXX-XXX/sf-convert-vX.X/test directory.

Type sf_convert or sf_convert -h for help.

    
    
    Usage: sf_convert  -i input_format -o output_format -sf data_file
    -i     input format: 
           mmCIF, CIF, MTZ, CNS, Xplor, HKL2000, Scalepack, Dtrek, TNT, 
           SHELX, SAINT, EPMR, XSCALE, XPREP, XTALVIEW, X-GEN, XENGEN, 
           MULTAN, MAIN, OTHER.
    -o     output format:
           mmCIF, MTZ, CNS, TNT, SHELX, EPMR, XTALVIEW, HKL2000, Dtrek,
           XSCALE, MULTAN, MAIN, OTHER.
    -sf    input structure factor file name.
    
    Note: If the input option '-i input_format' is omitted, the program will
    guess the input format. The simplest command line is then
    'sf_convert   -o output_format -sf data_file'. 
    
    
    Optional arguments:
    -out   followed by output file name (if not given, the program chooses a default)
    -pdb   followed by PDB file (must contain symmetry and cell parameters)
    -man   manually type in cell & symmetry (a,b,c,alpha,beta,gamma,p21)
           (separate each item by a comma ',' without spaces).
    -flag  followed by a number for the Rfree test.
           (e.g. '-flag 8' means 8% of reflections are selected for the free R test)
    -sf_type  followed by I or F. (If not given, guessed by program)
    -label   followed by label name for CNS & MTZ (see 'sf_convert -h')
    -format  guess the format of the SF file (sf_convert -sf sffile -format)
    
    Example of CNS to mmCIF conversion:  
        sf_convert  -i cns -o mmcif -sf sf_file_name
    
    Example of Scalepack to MTZ conversion:  
        sf_convert  -i HKL -o MTZ  -sf sf_file 
        sf_convert  -i HKL -o MTZ  -sf sf_file -man  a,b,c,alpha,beta,gamma,P21
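
The -man value has to be one comma-separated string with no spaces between items. As a rough sketch, a small shell helper can assemble it from separate values. make_man_arg is a hypothetical name, and the cell values below are placeholders, not data from a real crystal.

```shell
# Hypothetical helper: join all arguments with commas and no spaces.
# The body runs in a subshell, so the IFS change does not leak out.
make_man_arg() (
    IFS=','
    printf '%s\n' "$*"
)

# Placeholder cell (a b c alpha beta gamma) followed by the space group.
man_arg=$(make_man_arg 50.1 60.2 70.3 90 101.5 90 P21)
echo "$man_arg"
```

The resulting string could then be passed as -man "$man_arg", in the same position as in the Scalepack example above.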
    
    Default data type: F for EPMR, TNT, OTHER, MULTAN, MAIN, XTALVIEW;
                       I for SHELX, HKL2000, DTREK, XSCALE, XPREP, SAINT.
    
    ==============================================================================
    Note:  
     1. If MTZ format is involved, CCP4 must be installed!
     2. -pdb or -man must be given for conversion to an MTZ file
        if the input SF file does not contain symmetry and cell parameters.
     3. If -man is used, separate each item by a comma ',' without spaces.
     4. If the input format is OTHER, you should provide an ASCII file
        with H, K, L, F, SigmaF separated by spaces.
     5. CIF is the small-molecule format; mmCIF is the macromolecular format.
    ==============================================================================
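
For the OTHER input format described in note 4, the layout is five space-separated columns (H, K, L, F, SigmaF), taken here to be one reflection per line. The sketch below writes a tiny file with made-up values (example_other.hkl is a hypothetical file name) and uses awk to count the lines that do not have exactly five fields.

```shell
# Write a tiny placeholder reflection file: H K L F SigmaF, space-separated.
# The numeric values are invented purely to show the layout.
cat > example_other.hkl <<'EOF'
1 0 0 125.40 3.21
0 1 0 98.75 2.64
0 0 1 210.08 5.10
EOF

# Count lines that do not have exactly five whitespace-separated fields.
bad_lines=$(awk 'NF != 5 { n++ } END { print n + 0 }' example_other.hkl)
echo "lines with a wrong field count: $bad_lines"
```

A file that passes this check could then be given to the program following the usage line above, for example: sf_convert -i OTHER -o mmcif -sf example_other.hkl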
    
    For CNS/Xplor/MTZ to mmCIF conversion, if automatic conversion cannot be
    performed, or there are multiple data sets in the SF file, the option '-label'
    should be provided.
    Labels are separated by a comma ',' and data sets are separated by a colon ':'.
    
    All the data labels follow those of CCP4, as listed below.
    THE LABELS ARE CASE SENSITIVE!
    
    If a label contains '(' or ')', the item must be enclosed in single quotes,
    for example 'F(+)=?'.
    
    Examples:
     1. convert CNS to mmcif for one data set (if auto conversion failed)
        sf_convert -i cns -o mmcif -sf data_file -label FP=? , SIGFP=?, FREE=?
    
     2. convert CNS to mmcif for two data sets
        sf_convert -i cns -o mmcif -sf data_file -label 
                   FP=? ,SIGFP=?, I=?, SIGI=? : FP=? , SIGFP=?, I=?, SIGI=? 
    
     3. convert MTZ to mmcif for one data set
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=?, FREE=?
    
     4. convert MTZ to mmcif for two data sets
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=?,I=?,SIGI=?,
                  FREE=? : FP=? , SIGFP=?, I=?, SIGI=?
    
     5. convert MTZ to mmcif (one data set with anomalous)
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=? , 
                  FREE=? , 'F(+)=?' ,'SIGF(+)=?','F(-)=?' ,'SIGF(-)=?'
    
    Note: The question marks '?'  correspond to the labels in the SF file.
    
     mmCIF token                          type        data label

     _refln.F_meas_au                     F           FP
     _refln.F_meas_sigma_au               Q           SIGFP
     _refln.intensity_meas                J           I
     _refln.intensity_sigma               Q           SIGI

     _refln.F_calc                        F           FC
     _refln.phase_calc                    P           PHIC
     _refln.phase_meas                    P           PHIB
     _refln.fom                           W           FOM

     _refln.pdbx_HL_A_iso                 A           HLA
     _refln.pdbx_HL_B_iso                 A           HLB
     _refln.pdbx_HL_C_iso                 A           HLC
     _refln.pdbx_HL_D_iso                 A           HLD

     _refln.pdbx_F_plus                   G           F(+)
     _refln.pdbx_F_plus_sigma             L           SIGF(+)
     _refln.pdbx_F_minus                  G           F(-)
     _refln.pdbx_F_minus_sigma            L           SIGF(-)
     _refln.pdbx_anom_difference          D           DP
     _refln.pdbx_anom_difference_sigma    Q           SIGDP
     _refln.pdbx_I_plus                   K           I(+)
     _refln.pdbx_I_plus_sigma             M           SIGI(+)
     _refln.pdbx_I_minus                  K           I(-)
     _refln.pdbx_I_minus_sigma            M           SIGI(-)

     _refln.status                        I           FREE
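
When several data sets are involved, the -label value gets long. The sketch below builds it from pieces, joining labels with ',' and data sets with ':' as described above. join_by is a hypothetical helper, and the right-hand names (F_nat, SIGF_nat, and so on) are invented stand-ins for the labels in a real SF file. It produces the list without spaces. The examples above show spaces around some separators, so treat the compact form as an assumption to check against the program.

```shell
# Hypothetical helper: join_by SEP ITEM...
# Joins the items with the single-character separator SEP.
# The body runs in a subshell, so the IFS change stays local.
join_by() (
    IFS="$1"
    shift
    printf '%s\n' "$*"
)

# One comma-separated label list per data set (file labels are invented).
set1=$(join_by , FP=F_nat SIGFP=SIGF_nat FREE=FreeR_flag)
set2=$(join_by , FP=F_der SIGFP=SIGF_der)

# Data sets are separated by a colon.
label_arg=$(join_by : "$set1" "$set2")
echo "$label_arg"
```

Items whose labels contain parentheses, such as F(+), would still need the quoting shown in example 5.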