sf_convert - Workstation Version Manual

Converting structure factors from one format to another

(Created Nov. 12, 2007; Latest version: 1.204 : 2014-07-08)
Introduction

The sf_convert program helps depositors convert structure factor (SF) data in various formats to the mmCIF format (macromolecular Crystallographic Information File) for deposition. It also helps users work with deposited SF data: for example, deposited mmCIF files can be converted to various other formats.

Changes from previous release
The following changes have been made since the previous version (1.200) of sf_convert:

  • Removed the CCP4.setup dependency.
  • Added new options:

    -rescut : add '-rescut value' to cut the high resolution to the given value in the SF file SF_4_validate.cif.

    -sigcut : add '-sigcut value' to cut I/SigI to the given value in the SF file SF_4_validate.cif.

    -diags : add '-diags diagfile' to export warning/error messages to diagfile.

What does sf_convert do?

  • Convert various formats (mmCIF, CIF, MTZ, CNS, Xplor, HKL2000, Scalepack, Dtrek, TNT, SHELX, SAINT, EPMR, XSCALE, XPREP, XTALVIEW, X-GEN, XENGEN, MULTAN, MAIN, OTHER) to the mmCIF format.

  • Utilize the mmCIF file: convert from the mmCIF format to other formats (MTZ, CNS, TNT, SHELX, EPMR, XTALVIEW, HKL2000, Dtrek, XSCALE, MULTAN, MAIN, OTHER).

  • Direct conversion from one format to another without losing information.

  • Allow adding the test flag to the SF for the free R calculation.

  • Allow multiple data set conversion. Multiple data sets can be merged into one mmCIF file.

  • Allow manual labeling of data items for the CNS and MTZ formats, for a safer conversion when the automatic conversion fails.

  • For details, please read 'Run the program (Xray data)' below.
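
The free R test flag mentioned in the list above marks a subset of reflections that is set aside for the free R calculation. The following is a minimal Python sketch of that idea only. It is not sf_convert's own selection algorithm: the function name `assign_free_flags`, the seeded random selection, and the boolean flag representation are all assumptions made for illustration. The value 8 mirrors the '-flag 8' example in the program help (8% of reflections selected).

```python
import random


def assign_free_flags(n_reflections, percent, seed=0):
    """Return a list of booleans; True marks a reflection in the test set.

    Hypothetical helper: picks exactly round(n * percent / 100) reflections
    at random. sf_convert may select its test set differently.
    """
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100")
    n_free = round(n_reflections * percent / 100)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    free_indices = set(rng.sample(range(n_reflections), n_free))
    return [i in free_indices for i in range(n_reflections)]


flags = assign_free_flags(1000, 8)
print(sum(flags), "of", len(flags), "reflections flagged for the test set")
```

Keeping the seed fixed means the same reflections are flagged on every run, which matters because a test set should stay the same throughout refinement.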

Program access
The source and binary versions of sf_convert can be downloaded from http://deposit.pdb.org/software . The source is available under an Open Source license. The binary distributions are available for Intel-Linux/Unix.

Installation

System Requirements:

  • Platforms: Intel-Linux/Unix
  • C/C++ compilers

Installation of binary distribution
Installing the binary distribution is recommended because it is the quickest option. The binary distributions are available for Intel-Linux/Unix.

Step 1. Uncompress and unbundle the distribution using the following command:

    zcat sf-convert-vX.XXX-XXX.tar.gz | tar -xf -

The executable file sf_convert is in the directory sf-convert-vX.XXX-XXX/bin.
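
After unpacking, it can be useful to confirm that the executable is where a script expects it. The sketch below is one way to do that from Python using the standard library. The helper name `find_sf_convert` is hypothetical, and the directory passed in the usage line is the placeholder name from this manual, not a real path.

```python
import os
import shutil


def find_sf_convert(bin_dir=None):
    """Return the path to an executable named sf_convert, or None.

    Hypothetical helper: searches bin_dir if given, otherwise the
    directories listed in the PATH environment variable.
    """
    search_path = bin_dir if bin_dir is not None else os.environ.get("PATH", "")
    return shutil.which("sf_convert", path=search_path)


# Placeholder directory from the manual; substitute the unpacked location.
print(find_sf_convert("sf-convert-vX.XXX-XXX/bin"))
```

The function returns None when no executable is found, so a calling script can stop with a clear message before attempting a conversion.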

Installation of source code distribution
    1.  Installation   
     
        Uncompress and unbundle the distribution using the following command:
    
            zcat sf-convert-vX.XXX-XXX.tar.gz | tar -xf - 
    
    2.  Building the Application
    
        Position in the sf-convert-vX.XXX-XXX directory and run "make" command:
    
            cd sf-convert-vX.XXX-XXX 
            make
    
        The executable file sf_convert is in the sf-convert-vX.XXX-XXX/bin
        subdirectory.
    
    
    
Run the program
A test example is included in this distribution. To run it, run the "make test" command from the sf-convert-vX.XXX-XXX directory. The output files are in the sf-convert-vX.XXX-XXX/sf-convert-vX.X/test directory.

Type sf_convert or sf_convert -h for help.

    ===================================================================
               sf_convert (version: 1.204 : 2014-07-08 )
    ===================================================================
    Usage: 'sf_convert  -i input_format -o output_format -sf data_file'
       or simply type:  'sf_convert  -o output_format -sf data_file'
    ===================================================================
    -i     input format:
           mmCIF, CIF, MTZ, CNS, Xplor, HKL2000, Scalepack, Dtrek, TNT,
           SHELX, SAINT, EPMR, XSCALE, XPREP, XTALVIEW, X-GEN, XENGEN,
           MULTAN, MAIN, OTHER.
    -o     output format:
           mmCIF, MTZ, CNS, TNT, SHELX, EPMR, XTALVIEW, HKL2000, Dtrek,
           XSCALE, MULTAN, MAIN, OTHER.
    -sf    input structure factor file name.
    
    Optional options:
    -out   followed by output file name (If not given, default by program)
    -pdb   followed by PDB file (must contain symmetry and cell parameters)
    -cif   followed by cif file (must contain symmetry and cell parameters)
    -man   manually type in cells & symmetry (a,b,c,alpha,beta,gamma,p21)
           (separate each item by a comma ',' without space).
           This option can be replaced by -pdb!
    -flag  followed by a number for Rfree test set.
           (e.g. '-flag 8' means 8% of reflection selected for free R test)
    -sf_type  followed by I or F. (If not given, guessed by program)
    -label   followed by label name for CNS & MTZ (see 'sf_convert -h')
    -freer   followed by the free test value in CNS & MTZ (see 'sf_convert -h')
    -format  guess the format of the SF file (sf_convert -sf sffile -format)
    -reformat  reformat header part of sf file (sf_convert -reformat ciffile)
    -audit   update the audit record (ADD: -audit date :reason@ )
        If only given '-audit date', add auto-corrected info. (yyyy-mm-dd)
        If given '-audit  date :reason@', add reason (start ':' & end '@')
    -valid   (sf_convert -valid sffile) show various SF errors, and correct!
    -cif_check  Add this option to check pdbx dictionary. (local use only)
    -add_map   get map coefficients (FoFc & 2FoFc) adding this option.
    -rescut  (add '-rescut value', cut high resolution to value in SF_4_validate.cif
    -sigcut  (add '-sigcut value', cut I/SigI to value in SF_4_validate.cif
    -diags  (if add '-diags diagfile', export diagfile for warning/error message.
    Example: convert any supported formats to mmCIF:
        sf_convert  -o mmcif -sf sf_file_name
    
    Example: convert any supported formats to mtz:(cell in sf file)
        sf_convert  -o MTZ  -sf sf_file
    
    Example: convert any supported formats to mtz:(cell NOT in sf file)
        sf_convert  -o MTZ  -sf sf_file -pdb pdbfile
        sf_convert  -o MTZ  -sf sf_file -man  a,b,c,alpha,beta,gamma,P21
    
    Default data type: F for EPMR, TNT, OTHER, MULTAN, MAIN,XTALVIEW;
                       I for SHELX,HKL2000,DTREK,XSCALE,XPREP,SAINT;
    
    NOTE: If input SF file name is rxxxxsf.org, the xxxx is used as pdbid.
    
    ==============================================================================
    Note:
     1. If MTZ format is involved, newer version of CCP4 must be installed!
     2. -pdb or -cif or -man must be given for conversion to MTZ file,
        if input SF file does not contain symmetry and cells.
     3. If -man is used, separate each item by a comma ',' without space.
     4. If input format is OTHER, you should provide a ASCII file
        with H, K, L, F, SigmaF separated by space.
     5. CIF and mmCIF are for small molecule and macro-molecule format.
    ==============================================================================
    
    For CNS/Xplor/MTZ to mmCIF, if automatic conversion failed,
    or there are multiple datasets in the SF file, use option '-label'
    Each label be separated by a comma ',' and each data set be separated by a ':'.
    Default freeR set (1 for CNS; 0 for mtz). If not, specify by '-freer ?'.
    
    All the data labels follow those of CCP4 as below.
    THE LABELS ARE CASE SENSITIVE!!.
    
    If the labels involved in the '(' or ')', they must be quoted by ' '
    
    Examples for manual conversion:
     1. convert CNS to mmcif for one data set (with free set as 1)
        sf_convert -i cns -o mmcif -sf data_file
        -label FP=? , SIGFP=?, FREE=? -freer 1
    
     2. convert CNS to mmcif for two data sets
        sf_convert -i cns -o mmcif -sf data_file -label
              FP=? ,SIGFP=?, I=?, SIGI=? , FREE=? : FP=? , SIGFP=?, I=?, SIGI=?
    
     3. convert MTZ to mmcif for one data set
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=?, FREE=?
    
     4. convert MTZ to mmcif for two data set
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=?,I=?,SIGI=?
                  FREE=? : FP=? , SIGFP=?, I=?, SIGI=?
    
     5. convert MTZ to mmcif (one data set with anomalous)
        sf_convert -i mtz -o mmcif -sf data_file -label FP=? ,SIGFP=? ,
                  FREE=? , 'F(+)=?' ,'SIGF(+)=?','F(-)=?' ,'SIGF(-)=?'
    
     6. convert MTZ to mmcif (specify free set to 2)
        sf_convert -i mtz -o mmcif -sf data_file -freer 2 -label FP=? ,SIGFP=? ,
                  FREE=? , 'F(+)=?' ,'SIGF(+)=?','F(-)=?' ,'SIGF(-)=?'
    
    Note: The question marks '?' correspond to the labels in the SF file.
    
          mmCIF token                    type     data label
    
     _refln.F_meas_au                     F           FP
     _refln.F_meas_sigma_au               Q           SIGFP
     _refln.intensity_meas                J           I
     _refln.intensity_sigma               Q           SIGI
    
     _refln.F_calc                        F           FC
     _refln.phase_calc                    P           PHIC
     _refln.phase_meas                    P           PHIB
     _refln.fom                           W           FOM
    
    _refln.pdbx_FWT                       F
    _refln.pdbx_PHWT                      P
    _refln.pdbx_DELFWT                    F
    _refln.pdbx_DELPHWT                   P
    
     _refln.pdbx_HL_A_iso                 A           HLA
     _refln.pdbx_HL_B_iso                 A           HLB
     _refln.pdbx_HL_C_iso                 A           HLC
     _refln.pdbx_HL_D_iso                 A           HLD
    
     _refln.pdbx_F_plus                   G           F(+)
     _refln.pdbx_F_plus_sigma             L           SIGF(+)
     _refln.pdbx_F_minus                  G           F(-)
     _refln.pdbx_F_minus_sigma            L           SIGF(-)
     _refln.pdbx_anom_difference          D           DP
     _refln.pdbx_anom_difference_sigma    Q           SIGDP
     _refln.pdbx_I_plus                   K           I(+)
     _refln.pdbx_I_plus_sigma             M           SIGI(+)
     _refln.pdbx_I_minus                  K           I(-)
     _refln.pdbx_I_minus_sigma            M           SIGI(-)
    
    _refln.status                         I           FREE
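
The '-label' syntax described above (labels separated by ',' and data sets separated by ':') is easy to mistype by hand. Below is a minimal Python sketch that assembles such a command line without running it. The helper names `build_label_arg` and `build_command` and the column names F, SIGF, and FreeR_flag are hypothetical; each '?' in the help examples stands for a label in your own SF file. The sketch passes the whole label string as one argument with no spaces, which is an assumption, since the help examples show spaces around some commas.

```python
import shlex


def build_label_arg(datasets):
    """Join label=column pairs with ',' and data sets with ':'.

    datasets is a list of dicts mapping a data label from the table
    above (e.g. FP, SIGFP, FREE) to a column name in the SF file.
    """
    return ":".join(
        ",".join(f"{label}={column}" for label, column in ds.items())
        for ds in datasets
    )


def build_command(input_format, output_format, sf_file, datasets=None, freer=None):
    """Return the sf_convert argument list; the command is not executed."""
    cmd = ["sf_convert", "-i", input_format, "-o", output_format, "-sf", sf_file]
    if datasets:
        cmd += ["-label", build_label_arg(datasets)]
    if freer is not None:
        cmd += ["-freer", str(freer)]
    return cmd


# Hypothetical column names for a single MTZ data set.
cmd = build_command(
    "mtz", "mmcif", "data_file",
    datasets=[{"FP": "F", "SIGFP": "SIGF", "FREE": "FreeR_flag"}],
    freer=0,
)
print(shlex.join(cmd))
```

Printing with `shlex.join` also puts single quotes around labels that contain '(' or ')', such as F(+), which matches the quoting requirement stated in the help text.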