The ct file contains data about the base pairs in an RNA secondary structure.
The following is the structure for TRNA12 that was generated by mFold
and the ct file that is associated with that structure:
[Figures: RNA Secondary Structure; ct File]
In order for the program to run properly,
the ct file that you download from the web (Zuker's mFold)
or that you generate must have exactly the same format as the file above. The first line
contains the total number of nucleotides in the structure, the energy associated
with the fold, and the name of the file. Each line after the first describes
one nucleotide in six columns, whose meanings are as follows:
Column 1: The number of each nucleotide, from 1 to N (N = total
number of nucleotides).
Column 2: The type of each nucleotide (A, G, U, or C).
Column 3: The number of the preceding nucleotide, increasing from zero to N - 1.
Column 4: The number of the following nucleotide, from 2 to N, with a zero
filling any remaining space at the end of the column.
Column 5: The number of the nucleotide that is paired with the nucleotide
listed in column 1. A zero in the fifth column indicates that the particular
nucleotide is unpaired.
Column 6: A repeat of column 1.
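The column layout described above can be read with a few lines of code. The following is a minimal sketch, not the program's own reader: the function name `parse_ct` and the 9-nucleotide hairpin in `SAMPLE_CT` (including its energy value and name) are invented for illustration and are not the TRNA12 data. The header is assumed to start with the nucleotide count; the rest of the header line is kept as raw text.

```python
# Hypothetical ct-format text for a small hairpin (not TRNA12).
SAMPLE_CT = """\
9 ENERGY = -1.0 example_hairpin
1 G 0 2 9 1
2 G 1 3 8 2
3 G 2 4 7 3
4 A 3 5 0 4
5 A 4 6 0 5
6 A 5 7 0 6
7 C 6 8 3 7
8 C 7 9 2 8
9 C 8 0 1 9
"""


def parse_ct(text):
    """Return (n, header_rest, sequence, pairs) from ct-format text."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split(None, 1)
    n = int(header[0])                      # total number of nucleotides
    header_rest = header[1] if len(header) > 1 else ""
    sequence = []
    partner = {}
    for line in lines[1:n + 1]:
        cols = line.split()
        index, base, pair = int(cols[0]), cols[1], int(cols[4])
        sequence.append(base)               # column 2: nucleotide type
        partner[index] = pair               # column 5: 0 means unpaired
    # Column 5 must be mutual: if i pairs with j, then j pairs with i.
    for i, j in partner.items():
        if j != 0 and partner.get(j) != i:
            raise ValueError(f"inconsistent pairing for nucleotide {i}")
    # Report each base pair once, with the smaller number first.
    pairs = sorted((i, j) for i, j in partner.items() if i < j)
    return n, header_rest, "".join(sequence), pairs


n, header_rest, sequence, pairs = parse_ct(SAMPLE_CT)
print(n, sequence, pairs)
```

Only columns 1, 2, and 5 are needed to recover the sequence and the base pairs; columns 3, 4, and 6 are ignored in this sketch.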
Click on the following ct file if you would like to view the sample
file displayed above as it actually appears in file form: TRNA12
Laplacian Matrix:
The Laplacian
matrix (L) is a mathematical representation of the connectivity between
the vertices in an RNA graph or topology. It is built from a diagonal
matrix (D) and an adjacency matrix (A).
The diagonal matrix lists, along its diagonal, the number of connections each
vertex makes with the other vertices. The adjacency matrix
specifies to which vertices each vertex is connected. In a graphical representation
of an RNA structure, the vertices may be labeled in any order. For example, the tRNA (NDB: TRNA12)
structure shown above has the following tree graph structure with the vertices
arbitrarily labeled:
The corresponding D and A values are
as follows:
D =
4 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1

A =
0 1 1 1 1
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
Each row and column in the above matrices
corresponds to one of the graph's vertices. The diagonal of the diagonal
matrix shows that vertex 1 is connected to 4 other vertices, vertex
2 is connected to 1 other vertex, and so on. The adjacency
matrix specifies these connections explicitly: a 1 in element i,j means
that vertices i and j are connected.
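Both matrices follow directly from a list of connected vertex pairs. The sketch below assumes the labeling used in the example, with vertex 1 connected to vertices 2, 3, 4, and 5; the names `EDGES` and `adjacency_and_degree` are illustrative only.

```python
# Connections of the example tree graph (vertex labels start at 1).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5)]


def adjacency_and_degree(n, edges):
    """Build the diagonal matrix D and adjacency matrix A of an n-vertex graph."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        # A connection is symmetric, so set both A[i][j] and A[j][i].
        A[i - 1][j - 1] = 1
        A[j - 1][i - 1] = 1
    D = [[0] * n for _ in range(n)]
    for i in range(n):
        # The number of connections of a vertex is the sum of its row in A.
        D[i][i] = sum(A[i])
    return D, A


D, A = adjacency_and_degree(5, EDGES)
for row_d, row_a in zip(D, A):
    print(row_d, "  ", row_a)
```

Relabeling the vertices permutes the rows and columns of both matrices but does not change the connectivity they describe.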
The Laplacian matrix is defined from
D and A as follows:
L = D - A
For the example above, L is
as follows:
L =
 4 -1 -1 -1 -1
-1  1  0  0  0
-1  0  1  0  0
-1  0  0  1  0
-1  0  0  0  1
The Laplacian matrix is a square matrix.
Each row and column in the above matrix corresponds to one vertex in the
tree graph.
A value of -1 in the matrix element
i,j indicates that vertices i and j are connected. For example, by looking
across row 1, it is apparent that vertex 1 is connected to vertices 2,
3, 4, and 5. By symmetry, the same information is provided by looking down
column 1.
Zeros indicate no connectivity between
corresponding vertices. For example, vertex 2 is not connected to vertex
4.
The diagonal elements of the Laplacian matrix
are positive integers for a connected graph with more than one vertex. They
represent the number of connections
that the particular vertex makes. For example, vertex 1 is connected to
4 other vertices.
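The definition L = D - A and the properties listed above can be checked with a short script. This is a sketch for the five-vertex example only; the matrices are typed in by hand and the variable names are illustrative.

```python
# Diagonal and adjacency matrices of the example tree graph.
D = [
    [4, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
n = len(D)

# Element-wise subtraction: L = D - A.
L = [[D[i][j] - A[i][j] for j in range(n)] for i in range(n)]

# The matrix is symmetric: row i and column i carry the same information.
is_symmetric = all(L[i][j] == L[j][i] for i in range(n) for j in range(n))

# Each diagonal element equals the number of -1 entries in its row,
# so every row sums to zero.
row_sums = [sum(row) for row in L]

# Vertices connected to vertex 1 (the -1 entries of row 1).
neighbors_of_1 = [j + 1 for j in range(n) if L[0][j] == -1]

for row in L:
    print(row)
print(is_symmetric, row_sums, neighbors_of_1)
```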
Laplacian Eigenvalues:
The eigenvalues of the Laplacian matrix are calculated next.
The total number of eigenvalues equals the total number of vertices in the
graph of an RNA secondary structure. With the eigenvalues sorted in increasing
order, the smallest is always zero, and the eigenvalue that helps to describe
the RNA topology is the second one. The second eigenvalue describes the compactness
of a graph. The range of possible values for the second eigenvalue begins
at zero and increases from there. The more compact a graph is, the higher
its corresponding second eigenvalue.
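As an illustration of how the second eigenvalue tracks compactness, the sketch below compares the five-vertex tree graph from the example with a hypothetical five-vertex linear chain, which is not taken from this page. The eigenvalues are computed with a plain cyclic Jacobi rotation method so that the example needs only the standard library; this is an assumption of the sketch, not necessarily the method used by the program, and any symmetric eigenvalue routine would serve.

```python
import math


def laplacian(n, edges):
    """Laplacian matrix (L = D - A) of an n-vertex graph; labels start at 1."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i - 1][j - 1] -= 1.0
        L[j - 1][i - 1] -= 1.0
        L[i - 1][i - 1] += 1.0
        L[j - 1][j - 1] += 1.0
    return L


def jacobi_eigenvalues(matrix, tol=1e-20, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix, sorted in increasing order."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part has (numerically) vanished.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes element (p, q).
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


# Tree graph from the example: vertex 1 connected to vertices 2 through 5.
star_eigs = jacobi_eigenvalues(laplacian(5, [(1, 2), (1, 3), (1, 4), (1, 5)]))
# Hypothetical comparison graph: a linear chain 1-2-3-4-5.
path_eigs = jacobi_eigenvalues(laplacian(5, [(1, 2), (2, 3), (3, 4), (4, 5)]))

print("tree graph second eigenvalue:  ", round(star_eigs[1], 6))
print("linear chain second eigenvalue:", round(path_eigs[1], 6))
```

Both graphs have five eigenvalues, the smallest of which is zero. The more compact tree graph has a second eigenvalue of 1, while the extended chain has a smaller one (about 0.38).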