Steps for obtaining a protein 3D structure in CADD
1. First, get the protein sequences from the MEROPS database (merops.sanger.ac.uk).
a. On the main page, click Searches and search for ClpP.
b. From the results, click S14.001 and then click Sequences.
c. Click the link labelled MERXXXXXX; the window will show the protein sequence for that MEROPS ID.
d. Copy the whole entry, from >MERXXXXXX to the end of the sequence. For example:
>MERXXXXXX - peptidase Clp (type 1) [S14.001] peptidase unit: X-X ( active site residue(s): X,X ) (Burkholderia cenocepacia) (Source: ProtID)
X MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDA XX
XX KAYGLIDQVLLKRP XX
2. Make a table from the information you have collected, with these columns:
a. MEROPS ID (the MERXXXXXX identifier)
b. Organism (the name of the organism)
c. Peptidase Unit (PU) (taken from the header line copied earlier)
d. Active Site (AS) (also from the header line)
e. Protein Sequence (delete everything except the sequence letters, including the position numbers and spaces)
The example above becomes:
MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDAKAYGLIDQVLLKRP
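The clean-up in step 2 can also be scripted. Below is a minimal Python sketch, not part of the original class workflow, that pulls the five table fields out of a copied MEROPS entry. It assumes the header follows the layout of the example above. The ID MER000000, the peptidase unit 1-74 and the active-site positions 10,35 are made-up placeholders, and parse_merops_entry is a hypothetical helper name.

```python
import re

def parse_merops_entry(text):
    """Split a copied MEROPS entry into the five table fields of step 2."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header, seq_lines = lines[0], lines[1:]

    def grab(pattern):
        # Return the first captured group from the header, or None if absent.
        m = re.search(pattern, header)
        return m.group(1).strip() if m else None

    return {
        "merops_id": grab(r"^>(\S+)"),
        "organism": grab(r"\(([^()]+)\)\s*\(Source:"),
        "peptidase_unit": grab(r"peptidase unit:\s*(\S+)"),
        "active_site": grab(r"active site residue\(s\):\s*([^)]*?)\s*\)"),
        # Keep letters only: this drops the position numbers and the spaces.
        "sequence": "".join(re.sub(r"[^A-Za-z]", "", ln) for ln in seq_lines).upper(),
    }

# Placeholder entry: the ID, unit and active-site numbers are invented for illustration.
entry = """>MER000000 - peptidase Clp (type 1) [S14.001] peptidase unit: 1-74 ( active site residue(s): 10,35 ) (Burkholderia cenocepacia) (Source: ProtID)
1 MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDA 60
61 KAYGLIDQVLLKRP 74
"""

row = parse_merops_entry(entry)
for field, value in row.items():
    print(f"{field}: {value}")
```

If a real header is laid out differently, a field simply comes back as None, which is a cue to fill that table cell in by hand.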
3. Next, open the NCBI website (ncbi.nlm.nih.gov). To run a protein BLAST, the sequence must first be put into FASTA format: a header line with the MEROPS ID, followed by the protein sequence. It should look like this:
>MERXXXXXX
MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDAKAYGLIDQVLLKRP
PS: Save it as a plain-text file in Notepad.
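If you would rather not type the record by hand, here is a small Python sketch of the same idea. The helper name to_fasta and the output file name clpp_fasta.txt are assumptions for illustration, and the file is written to the system temporary folder only so the example runs anywhere.

```python
import tempfile
from pathlib import Path

def to_fasta(merops_id, sequence):
    """Return a two-line FASTA record: a '>ID' header, then the sequence."""
    return f">{merops_id}\n{sequence}\n"

record = to_fasta(
    "MERXXXXXX",
    "MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDAKAYGLIDQVLLKRP",
)

# Hypothetical file name; change the folder to wherever you keep your notes.
out_path = Path(tempfile.gettempdir()) / "clpp_fasta.txt"
out_path.write_text(record)
print(record, end="")
```

The resulting text file opens in Notepad and can be pasted straight into the BLAST query box.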
4. Once that is done, click the BLAST link, which takes you to the BLAST site, and then click Protein BLAST.
a. Copy and paste the FASTA sequence into the query box. If it is formatted correctly, the MEROPS ID will appear in the Job Title field when you click on it.
b. Under Program Selection, choose PSI-BLAST.
c. Then click BLAST.
d. When the results have loaded in the new window, click the first hit in the alignment scores to see its function and percent identity.
5. Repeat this with a few other protein sequences. Save all of the FASTA sequences together in one Notepad file.
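Before using the combined file any further, it can be worth checking that every record has its own ID and that no stray position numbers were left in the sequences. The following Python sketch shows one way to do that. The IDs MER000001 to MER000003 are placeholders, the second and third sequences are just fragments reused for illustration, and read_fasta and check_records are hypothetical helper names.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYX")  # 20 standard residues plus X (unknown)

def read_fasta(text):
    """Parse FASTA text into a list of (id, sequence) pairs."""
    records, current_id, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current_id is not None:
                records.append((current_id, "".join(chunks)))
            # The ID is the first word after '>'.
            current_id, chunks = (line[1:].split() or [""])[0], []
        elif current_id is not None:
            chunks.append(line.upper())
    if current_id is not None:
        records.append((current_id, "".join(chunks)))
    return records

def check_records(records):
    """Return a list of human-readable problems (empty if all looks fine)."""
    problems, seen = [], set()
    for rec_id, seq in records:
        if rec_id in seen:
            problems.append(f"{rec_id}: duplicate ID")
        seen.add(rec_id)
        if not seq:
            problems.append(f"{rec_id}: empty sequence")
        bad = sorted(set(seq) - AMINO_ACIDS)
        if bad:
            problems.append(f"{rec_id}: unexpected characters {bad}")
    return problems

combined = """>MER000001
MIHQPLGGARSQASYIEIQASEIVYLKERLSCLLAQYRPQEVKSIARDTDRDNFMSSEDAKAYGLIDQVLLKRP
>MER000002
KAYGLIDQVLLKRP
>MER000003
61 KAYGLIDQVLLKRP 74
"""

records = read_fasta(combined)
problems = check_records(records)
print(problems)
```

Here the third record is deliberately left uncleaned, so it is the only one reported.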
6. Next, run ClustalX, which performs multiple sequence alignment. It needs a Notepad file containing the FASTA sequences.
a. Once you have the file, double-click the clustal.exe icon, then click File and Load Sequences. Open the FASTA sequence file.
b. Once all the sequences are loaded in ClustalX, click Alignment and choose Do Complete Alignment. Click Align.
c. Two new files will be created with the same name but different extensions: .dnd (the guide tree) and .aln (the alignment).
7. To use Artemis (a DNA sequence viewer), first look up another organism, Burkholderia pseudomallei, and get the protein sequences of its ClpP and Lon-A.
a. Go to the MEROPS website, click Organism and then B to find B. pseudomallei. Click on the link.
b. For Lon-A, search the page for clan SJ, family S16. Click on its MEROPS number (MERXXXXXX).
c. A pop-up will show its protein sequence. Copy the sequence into Notepad and put it into FASTA format, as you did before.
d. Do the same for ClpP, which is found under clan SK, family S14 (peptidase S14.001). Click on its MERXXXXXX number.
8. Now, open artemis_v5.jar.
a. Go to File > Open and select the BPS.dbs file.
b. Click Goto and then Navigator. Copy a portion of the Lon-A protein sequence, paste it into ‘Find Amino Acid Strings’, and click Goto. Do the same with the ClpP sequence.
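To get a feel for what searching a DNA sequence for an amino acid string involves, here is a conceptual Python sketch: translate the DNA in all six reading frames and look for the peptide in each translation. This is only an illustration under the standard genetic code, not a description of how Artemis does it internally. The toy DNA is generated from a fragment of the example sequence, and find_peptide is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def find_peptide(dna, peptide):
    """Return (strand, frame, offset) for every hit in the six reading frames.

    Offsets are 0-based and, for the '-' strand, are measured along the
    reverse complement rather than the original sequence.
    """
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, frame, frame + 3 * pos))
                pos = protein.find(peptide, pos + 1)
    return hits

# Toy example: back-translate a short fragment using one codon per residue,
# then pad it so that it sits in the third reading frame.
fragment = "SEIVYLKERL"
first_codon = {}
for codon, aa in CODON_TABLE.items():
    first_codon.setdefault(aa, codon)
toy_dna = "GG" + "".join(first_codon[aa] for aa in fragment) + "TT"

print(find_peptide(toy_dna, fragment))
```

A fragment of about ten residues, as used here, is usually specific enough to land on a single place, which is why only a portion of the protein sequence needs to be pasted.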
9. The last step is to view the protein in 3D using the RasWin programme (the Windows version of RasMol).
a. First, open the RCSB Protein Data Bank and search for Lon-A. Find 1RR9 in the list (page 2) and download the PDB file (the first icon under 1RR9).
b. After saving the file, open the RasWin programme.
c. Then simply open the downloaded file in RasWin.
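A PDB file is plain text, so you can also peek inside it before or after viewing it. The Python sketch below counts the alpha-carbon (CA) atoms per chain, which gives a rough residue count for each chain. It relies on the fixed-column layout of ATOM records in the PDB format. The sample lines are synthetic ones built by the hypothetical helper fake_atom, not taken from 1RR9.

```python
from collections import Counter

def residues_per_chain(pdb_lines):
    """Count alpha-carbon (CA) atoms per chain in PDB ATOM records."""
    counts = Counter()
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        alt_loc = line[16]               # column 17
        chain_id = line[21]              # column 22
        # Count only the first alternate location so residues are not doubled.
        if atom_name == "CA" and alt_loc in " A":
            counts[chain_id] += 1
    return dict(counts)

def fake_atom(serial, name, res, chain, resseq):
    """Build a synthetic ATOM line (identifier columns only) for the demo."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}"

sample = [
    fake_atom(1, " N  ", "MET", "A", 1),
    fake_atom(2, " CA ", "MET", "A", 1),
    fake_atom(3, " CA ", "ILE", "A", 2),
    fake_atom(4, " CA ", "HIS", "B", 1),
    "HETATM    5 CA    CA A 101",  # a calcium ion, ignored because it is not ATOM
]

chain_counts = residues_per_chain(sample)
print(chain_counts)
```

For a downloaded structure, the same function can be given an open file, for example residues_per_chain(open("1RR9.pdb")), assuming that is the name you saved the file under.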
Thank you for reading. Forgive me for any shortcomings in this post. Hopefully it helps you understand the CADD class.