Transition State Analysis
We have identified transition state (TS) ensembles from over 1300
high-temperature unfolding simulations of 183 structurally diverse proteins from our
Dynameomics database [1]. To determine whether certain types of contacts were
preferentially lost or gained in the TS ensemble relative to the native state,
the pairwise TS and native state contact maps were compared. For each protein,
we calculated the fraction of time each pairwise contact was present in the
native state simulation. The pairwise contact maps from all proteins in our set
were then combined (Figure 1). The same analysis was repeated over just the TS
ensemble time points identified from the unfolding simulations (Figure 2). A
difference map of the TS minus native state values was then calculated
(Figure 3). The data for all proteins
showed a loss of contacts between hydrophobic residues in the TS compared to the
native state. In contrast, there was some gain in contacts between charged
residues [1].
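The normalization and averaging steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the published analysis: the function names and the dictionary layout (residue-type pair mapped to a contact count) are assumptions, and the toy counts are invented purely to show the arithmetic.

```python
from collections import defaultdict

def normalize_counts(counts, n_structures):
    """Divide raw contact counts for one protein by the number of structures."""
    return {pair: n / n_structures for pair, n in counts.items()}

def combine_maps(per_protein_maps):
    """Average per-protein maps; a pair absent from a protein contributes 0."""
    totals = defaultdict(float)
    for protein_map in per_protein_maps:
        for pair, value in protein_map.items():
            totals[pair] += value
    n_proteins = len(per_protein_maps)
    return {pair: total / n_proteins for pair, total in totals.items()}

# Hypothetical counts for two proteins, each sampled over 100 structures
protein_a = normalize_counts({("LEU", "VAL"): 80, ("LYS", "GLU"): 10}, 100)
protein_b = normalize_counts({("LEU", "VAL"): 60}, 100)
combined = combine_maps([protein_a, protein_b])
```

Treating a missing pair as zero matters here: a protein with no LYS-GLU contacts still counts toward the number of proteins in the average.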
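Query to generate the pairwise contact map for each protein. As listed, the
cursor reads each simulation's step range from the transition_states table and
the results are written to res_res_cont_TS_<sim_id>, so this version covers the
TS ensemble time points: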
DECLARE @pdb4 VARCHAR(4)
DECLARE @table_prefix VARCHAR(64)
DECLARE @server_name VARCHAR(32)
DECLARE @database_name VARCHAR(32)
DECLARE @sim_id INT
DECLARE @start_step INT
DECLARE @end_step INT
DECLARE @first BIT
DECLARE @view_cmd VARCHAR(MAX)
DECLARE @struct_id INT
DECLARE @residues INT
SET @first = 1
-- One row per simulation that has an identified TS step range and a coord entry
DECLARE sim_id_cursor CURSOR FOR
SELECT b.server_name, b.database_name, b.struct_id, a.sim_id, a.start_step,
a.end_step
FROM [transition_states].dbo.[transition_states] AS a
JOIN directory.dbo.[master_property_v] AS b
ON a.sim_id = b.sim_id
WHERE a.start_step IS NOT NULL AND b.property_abbrev = 'coord'
OPEN sim_id_cursor
FETCH NEXT FROM sim_id_cursor INTO @server_name, @database_name, @struct_id,
@sim_id, @start_step, @end_step
WHILE @@FETCH_STATUS = 0
BEGIN
SET @table_prefix = '[' + @server_name + '].[' + @database_name + '].'
SET @view_cmd =
'
SELECT '+ CAST(@sim_id AS VARCHAR)+' as sim_id, '+ CAST(@struct_id AS
VARCHAR) +' as struct_id, sa1.three_letter as residue_x, sa2.three_letter as
residue_y, contacts.contact_count
INTO
[res_res_cont_TS_'+ CAST(@sim_id AS VARCHAR) +']
FROM Prep_Support.dbo.Standard_Amino_Acid as sa1
JOIN Prep_Support.dbo.Standard_Amino_Acid as sa2
ON (1=1)
LEFT OUTER JOIN
(
Select k.residue_x,k.residue_y, count (*) as [contact_count] from
(Select distinct j.step, j.residue_number_x, j.residue_x,
j.residue_number_y,j.residue_y from
(SELECT
a.sim_id,
a.step,
a.struct_id as struct_id1,
a.struct_inst as struct_inst1,
c.residue_number as residue_number_x,
c.residue as residue_x,
a.atom_number as atom_number_1,
c.atom_name as atom_name_x,
c.atom_type as atom_type_x,
b.struct_id as struct_id2,
b.struct_inst as struct_inst2,
d.residue_number as residue_number_y,
d.residue as residue_y,
d.atom_number as atom_number_2,
d.atom_name as atom_name_y,
d.atom_type as atom_type_y,
(b.x_coord - a.x_coord) * (b.x_coord - a.x_coord)
+(b.y_coord - a.y_coord) * (b.y_coord - a.y_coord)
+(b.z_coord - a.z_coord) * (b.z_coord - a.z_coord)
as distance
FROM
' + @table_prefix + 'dbo.[coord_'+ CAST(@sim_id AS VARCHAR) +'] a
inner join ' + @table_prefix + 'dbo.[coord_'+ CAST(@sim_id AS VARCHAR) +'] b
on a.[step] = b.[step]
join
' + @table_prefix + 'dbo.[id] as c
on a.atom_number=c.atom_number and a.struct_id=c.struct_id
join
' + @table_prefix + 'dbo.[id] as d
on b.atom_number=d.atom_number and b.struct_id=d.struct_id
WHERE
a.[step] between '+ CAST(@start_step AS VARCHAR) +' and '+ CAST(@end_step AS
VARCHAR) +' and
c.residue_number < d.residue_number-1 and
a.atom_number < b.atom_Number and
c.heavy_atom=1 and d.heavy_atom=1) as j
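-- distance holds the squared separation, so the cutoffs are squared as well:
-- 29.16 (5.4 squared) for carbon-carbon atom pairs, 21.16 (4.6 squared) otherwise
where j.distance < (case when atom_type_x = ''C'' and atom_type_y = ''C''
then 29.16 else 21.16 end)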
group by j.step, j.residue_number_x, j.residue_x, j.residue_number_y,
j.residue_y) as k
group by k.residue_x, k.residue_y
) as contacts
ON (sa1.three_letter = contacts.residue_x and
sa2.three_letter = contacts.residue_y)
'
-- Run the statement built for this simulation, then advance the cursor
EXEC (@view_cmd)
FETCH NEXT FROM sim_id_cursor INTO @server_name, @database_name, @struct_id,
@sim_id, @start_step, @end_step
END
CLOSE sim_id_cursor
DEALLOCATE sim_id_cursor
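For readers less familiar with T-SQL, the contact criterion encoded in the query can be restated in Python. This is a sketch for a single structure, not a replacement for the query: the Atom record is a simplified, hypothetical stand-in for the coord and id tables, and coordinates are assumed to be in Å. Only the squared cutoffs (29.16 for carbon-carbon atom pairs, 21.16 for other heavy-atom pairs) and the exclusion of i, i and i, i + 1 residue pairs are taken from the query.

```python
from collections import Counter
from itertools import combinations
from typing import NamedTuple

class Atom(NamedTuple):
    residue_number: int
    residue: str   # three-letter residue name
    element: str   # heavy atoms only; hydrogens assumed already filtered out
    x: float
    y: float
    z: float

CC_CUTOFF_SQ = 29.16      # 5.4 squared, carbon-carbon pairs
OTHER_CUTOFF_SQ = 21.16   # 4.6 squared, all other heavy-atom pairs

def residue_contacts(atoms):
    """Return distinct (number_x, name_x, number_y, name_y) residue contacts."""
    contacts = set()
    for a, b in combinations(atoms, 2):
        if a.residue_number > b.residue_number:
            a, b = b, a  # keep the lower-numbered residue first
        if b.residue_number - a.residue_number < 2:
            continue     # skip same-residue and sequence-neighbour pairs
        dist_sq = (b.x - a.x) ** 2 + (b.y - a.y) ** 2 + (b.z - a.z) ** 2
        both_carbon = a.element == "C" and b.element == "C"
        if dist_sq < (CC_CUTOFF_SQ if both_carbon else OTHER_CUTOFF_SQ):
            contacts.add((a.residue_number, a.residue, b.residue_number, b.residue))
    return contacts

# Hypothetical five-atom structure
atoms = [
    Atom(1, "LEU", "C", 0.0, 0.0, 0.0),
    Atom(2, "ALA", "C", 1.0, 0.0, 0.0),
    Atom(3, "VAL", "C", 5.0, 0.0, 0.0),
    Atom(5, "LYS", "N", 0.0, 5.0, 0.0),
    Atom(7, "GLU", "O", 0.0, 5.0, 4.0),
]
contacts = residue_contacts(atoms)
# Collapse to residue-type pairs, as the outer GROUP BY in the query does
type_counts = Counter((rx, ry) for _, rx, _, ry in contacts)
```

Comparing squared distances against squared cutoffs avoids a square root per atom pair, which is the same shortcut the query takes.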
Figure 1: Pairwise Contact Map of the 188 Native States. The number of times
each pairwise contact is present in the native state simulation for one protein
is calculated and then divided by the number of structures used in the
calculation. The pairwise contact maps for each protein are combined and divided
by the number of proteins.
Figure 2: Pairwise Contact Map of the 188 Transition State Ensembles. The number
of times each pairwise contact is present in the TS ensemble for one protein is
calculated and then divided by the number of structures used in the calculation.
The pairwise contact maps for each protein are combined and divided by the
number of proteins.
Figure 3: Pairwise Contact Difference Maps of the TS minus the Native State. The
native value is subtracted from that for the TS and the appropriate bin colored
according to the magnitude of the difference. Shades of blue represent a
reduction in the number of a contact type in the TS, shades of pink an increase.
Contacts within a residue and between sequence neighbours (i, i and i, i + 1)
are excluded in all plots.
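The subtraction behind Figure 3 is simple enough to show directly. The sketch below is illustrative only: difference_map and change_label are hypothetical helper names, the maps are plain dictionaries keyed by residue-type pair, and the values are made up rather than taken from the figures.

```python
def difference_map(ts_map, native_map):
    """TS value minus native value for every residue-type pair in either map."""
    pairs = set(ts_map) | set(native_map)
    return {p: ts_map.get(p, 0.0) - native_map.get(p, 0.0) for p in pairs}

def change_label(delta):
    """Negative differences are losses in the TS, positive differences gains."""
    if delta < 0:
        return "loss"
    if delta > 0:
        return "gain"
    return "unchanged"

# Hypothetical combined maps
native_map = {("LEU", "VAL"): 0.70, ("LYS", "GLU"): 0.05}
ts_map = {("LEU", "VAL"): 0.30, ("LYS", "GLU"): 0.20}

diff = difference_map(ts_map, native_map)
labels = {pair: change_label(delta) for pair, delta in diff.items()}
```

In the figure's color scheme a "loss" bin would be drawn in a shade of blue and a "gain" bin in a shade of pink, with the shade set by the size of the difference.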
References
[1] Jonsson AL, Scott KA, Daggett V. Dynameomics: A consensus view of the protein unfolding/folding transition state ensemble across a diverse set of protein folds. Biophysical Journal, 97: 2958-2966, 2009.