Installation and Usage Notes for the Validation Application
(Binary Distribution)

Note: The binary distribution contains an additional submodule, PROCHECK,
which performs further structure checking. This submodule is not available
in the source distribution.

1. Installation

   a. Uncompress and unbundle the distribution using the following command:

         zcat validation-vX.XXX-XXX.tar.gz | tar -xf -

      This command creates a subdirectory validation-vX.XXX-XXX in the
      current directory, which contains the following:

         bin      - subdirectory that contains the application executable
                    "validation-v8"
         data     - subdirectory that contains data files needed by the
                    application
         etc      - subdirectory that contains utility scripts and the
                    application software license agreement
         procheck - subdirectory that contains the executables for
                    "procheck"

   b. Set up the environment variables.

      - Define the RCSBROOT environment variable to point to the
        installation directory. Note that the RCSBROOT environment variable
        is also used by other RCSB applications such as ADIT and
        PDB_EXTRACT. If several of these applications are installed on the
        same computer, the most recently executed setenv command determines
        the value of RCSBROOT. Therefore, set the variable at the command
        line as follows, just before running the application. Assuming that
        the installation directory is /home/username/validation-vX.XXX-XXX,
        execute in the shell:

        For C shell users:

           setenv RCSBROOT /home/username/validation-vX.XXX-XXX

        For Bourne shell users:

           RCSBROOT=/home/username/validation-vX.XXX-XXX; export RCSBROOT

      - Add the "bin" subdirectory to the PATH environment variable.
        Execute in the shell:

        For C shell users:

           setenv PATH "$RCSBROOT/bin:"$PATH

        For Bourne shell users:

           PATH="$RCSBROOT/bin:"$PATH; export PATH

   c. Make binary data from ASCII data.

      - Change to the validation-vX.XXX-XXX/etc directory and run the
        script binary.sh:

           cd validation-vX.XXX-XXX/etc
           ./binary.sh

        This command creates binary data files from the ASCII data files in
        the data/ascii directory. The resulting files are stored in the
        data/binary directory. Note that this step may take several minutes
        to complete. It must be executed before the tool can be used.

2. Application Usage Notes

   Usage:

      For mmCIF files (please note that only mmCIF format files downloaded
      from the PDB or generated by PDB_EXTRACT should be used):

         validation-v8 -f file_name -o 2 -adit -exchange -public

      For PDB files:

         validation-v8 -f file_name -o 0 -adit

      For example, to create reports for a file in mmCIF format named
      1xyz.cif, type:

         validation-v8 -f 1xyz.cif -o 2 -adit -exchange -public

   Output:

      The names of the output files begin with the root identifier, which
      is followed by an extension that indicates the file type. For a PDB
      format file, the program converts the file name without its extension
      into uppercase and uses the result as the root identifier. For an
      mmCIF format file, the program uses the data block identifier as the
      root identifier.

      The application creates the following files (each name listed below
      is appended to the root identifier):

      A. .letter: a text file that contains a summary validation letter.

      B. .ps: a PostScript file that contains molecular graphics of the
         structure.

         For crystal structures, this includes a view of the asymmetric
         unit and crystal packing. If an mmCIF file was validated, the
         biological unit of the entry is either larger or smaller than the
         asymmetric unit, and the struct_biol_gen category was
         appropriately completed in the mmCIF file, then a view of the
         biological unit(s) is included.

         For NMR ensemble structures, a view of the first model and a view
         of the ensemble of all models are included. If the NMR entry
         contains one model, a view of that model is included.

         NUCHECK output: If the structure contains nucleic acids, the .ps
         file also includes plots, generated by the program NUCHECK, that
         describe the geometry, torsion, and base morphology of the nucleic
         acids.

      C. PROCHECK output:

         For crystal structures containing protein, there are ten
         PostScript files from PROCHECK:

            File name   File contains
            _01.ps      Ramachandran plot
            _02.ps      Ramachandran plots by residue
            _03.ps      Chi1-Chi2 plots
            _04.ps      Main-chain parameters
            _05.ps      Side-chain parameters
            _06.ps      Residue properties
            _07.ps      Main-chain bond distance comparisons
            _08.ps      Main-chain bond angle comparisons
            _09.ps      RMS deviations from planarity
            _10.ps      Summary of geometrical distortions

         For NMR structures containing protein, there are nine PostScript
         files from PROCHECK:

            File name   File contains
            _01.ps      Ramachandran plot
            _02.ps      Ramachandran plots for all residue types
            _03.ps      Chi1-Chi2 plots
            _04.ps      Chi1 frequency distributions
            _05.ps      Chi2 frequency distributions
            _06.ps      Ensemble Ramachandran plots
            _07.ps      Residue properties
            _08.ps      Equivalent resolution
            _09.ps      Model secondary structures

      D. .html: an HTML file with an Atlas summary containing the
         following:

         For all structures:

            - The sequence of the residues in each chain (from entity_poly
              for an mmCIF file, from SEQRES for a PDB file, or from the
              coordinates if entity_poly or SEQRES is not provided).
            - Citation information (if provided).
            - Refinement information (if provided).

         For crystal structures, additional information is listed:

            - Space group and cell constants.
            - Crystallization conditions (if provided).
            - Refinement information (if provided).

3. References

   PROCHECK:

      Roman A. Laskowski, Malcolm W. MacArthur, David S. Moss and
      Janet M. Thornton (1993). PROCHECK: a program to check the
      stereochemical quality of protein structures. J. Appl. Cryst., 26,
      283-291.

      A. Louise Morris, Malcolm W. MacArthur, E. Gail Hutchinson and
      Janet M. Thornton (1992). Stereochemical quality of protein structure
      coordinates. Proteins, 12, 345-364.

   NUCHECK:

      Zukang Feng, John Westbrook, Helen M. Berman (1998). NUCheck.
      Rutgers University, New Brunswick, NJ. Report No.: NDB-407.
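
Example: a Bourne shell sketch of the setup and usage steps

The environment setup in step 1b and the two command forms in section 2 can be collected into a small helper script. What follows is only a sketch and is not part of the distribution. The function names setup_rcsb_env and validation_cmd are hypothetical. The sketch also assumes that mmCIF file names end in .cif and that every other file is in PDB format, which the notes above do not state. It prints the command line instead of running the program, so the result can be inspected first.

```shell
#!/bin/sh
# Hypothetical helpers; the function names and the .cif rule are assumptions.

# Point RCSBROOT at the installation directory given as $1 and put its
# bin subdirectory at the front of PATH (Bourne shell form of step 1b).
setup_rcsb_env() {
    RCSBROOT=$1
    export RCSBROOT
    PATH="$RCSBROOT/bin:$PATH"
    export PATH
}

# Print the validation-v8 command line for the input file given as $1.
# Assumption: names ending in .cif are mmCIF; anything else is treated as PDB.
validation_cmd() {
    case "$1" in
        *.cif) echo "validation-v8 -f $1 -o 2 -adit -exchange -public" ;;
        *)     echo "validation-v8 -f $1 -o 0 -adit" ;;
    esac
}
```

After sourcing the script and calling setup_rcsb_env with the installation directory, validation_cmd 1xyz.cif prints the mmCIF command line shown in the usage example, which can then be run by hand. Because RCSBROOT is shared with other RCSB applications, calling setup_rcsb_env immediately before each run keeps the variable pointing at this installation.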