conkit.core.ContactMap module

Storage space for a contact map

class ContactMap(id)[source]

Bases: conkit.core.Entity.Entity

A contact map object representing a single prediction

Examples

>>> from conkit.core import Contact, ContactMap
>>> contact_map = ContactMap("example")
>>> contact_map.add(Contact(1, 10, 0.333))
>>> contact_map.add(Contact(5, 30, 0.667))
>>> print(contact_map)
ContactMap(id="example" ncontacts=2)

Attributes

coverage The sequence coverage score
id The ID of the selected entity
ncontacts The number of conkit.core.Contact instances in the conkit.core.ContactMap
precision The precision (Positive Predictive Value) score
repr_sequence The representative conkit.core.Sequence associated with the conkit.core.ContactMap
repr_sequence_altloc The representative altloc conkit.core.Sequence associated with the conkit.core.ContactMap
sequence The conkit.core.Sequence associated with the conkit.core.ContactMap
top_contact The first conkit.core.Contact entry in conkit.core.ContactMap

Methods

add(entity) Add a child to the Entity
assign_sequence_register([altloc]) Assign the amino acids from conkit.core.Sequence to all conkit.core.Contact instances
calculate_jaccard_index(other) Calculate the Jaccard index between two conkit.core.ContactMap instances
calculate_scalar_score() Calculate a scaled score for the conkit.core.ContactMap
copy() Create a shallow copy of conkit.core.Entity
deepcopy() Create a deep copy of conkit.core.Entity
find(indexes[, altloc]) Find all contacts associated with index
match(other[, remove_unmatched, renumber, ...]) Modify both hierarchies so residue numbers match one another.
plot_map([other, reference, altloc, ...]) Produce a 2D contact map plot
remove(id) Remove a child
remove_neighbors([min_distance, inplace]) Remove contacts between neighboring residues
rescale([inplace]) Rescale the raw scores in conkit.core.ContactMap
sort(kword[, reverse, inplace]) Sort the conkit.core.ContactMap

assign_sequence_register(altloc=False)[source]

Assign the amino acids from conkit.core.Sequence to all conkit.core.Contact instances

Parameters:

altloc : bool

Use the res_altloc positions [default: False]

calculate_jaccard_index(other)[source]

Calculate the Jaccard index between two conkit.core.ContactMap instances

This score measures the overlap between the predicted contacts of two maps:

\[J_{x,y}=\frac{\left|x \cap y\right|}{\left|x \cup y\right|}\]

where \(x\) and \(y\) are the sets of predicted contacts from two different predictors, \(\left|x \cap y\right|\) is the number of elements in the intersection of \(x\) and \(y\), and \(\left|x \cup y\right|\) is the number of elements in the union of \(x\) and \(y\).

The J-score has values in the range \([0, 1]\), with a value of \(1\) corresponding to identical contact maps and \(0\) to maps that share no contacts.
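The formula above can be sketched in plain Python. This is an illustration only, not ConKit's implementation: the helper name jaccard_index is hypothetical, contacts are represented as bare (res1, res2) tuples rather than conkit.core.Contact instances, and the handling of two empty maps is an assumption.

```python
def jaccard_index(x, y):
    """Jaccard index of two collections of (res1, res2) contact pairs."""
    # Sort each pair so that (i, j) and (j, i) count as the same contact.
    set_x = {tuple(sorted(pair)) for pair in x}
    set_y = {tuple(sorted(pair)) for pair in y}
    union = set_x | set_y
    if not union:
        # Assumption of this sketch: two empty maps are treated as identical.
        return 1.0
    return len(set_x & set_y) / len(union)


map_a = [(1, 10), (5, 30), (7, 42)]
map_b = [(10, 1), (5, 30), (8, 50)]

index = jaccard_index(map_a, map_b)  # 2 shared contacts out of 4 distinct ones
distance = 1.0 - index               # the corresponding Jaccard distance
print(index, distance)
```

Sets are used so that duplicate contacts within one map do not inflate the union or the intersection.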

Parameters:

other : conkit.core.ContactMap

Returns:

float

The Jaccard index

Warning

The Jaccard index lies in the range \([0, 1]\), where \(1\) means the maps contain identical contact pairs.

See also

match, precision

Notes

The Jaccard index is different from the Jaccard distance mentioned in [R5]. The Jaccard distance corresponds to \(1-Jaccard_{index}\).

[R5] Q. Wuyun, W. Zheng, Z. Peng, J. Yang (2016). A large-scale comparative assessment of methods for residue-residue contact prediction. Briefings in Bioinformatics, doi: 10.1093/bib/bbw106.

calculate_scalar_score()[source]

Calculate a scaled score for the conkit.core.ContactMap

This score is a scaled score for all raw scores in a contact map. It is defined by the formula

\[{x}'=\frac{x}{\overline{d}}\]

where \(x\) corresponds to the raw score of each predicted contact and \(\overline{d}\) to the mean of all raw scores.
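As a minimal sketch of this formula, assuming the raw scores are available as a plain list of floats (the helper name scalar_scores is hypothetical and is not part of ConKit):

```python
def scalar_scores(raw_scores):
    """Divide every raw score by the mean of all raw scores."""
    if not raw_scores:
        raise ValueError("at least one raw score is required")
    mean = sum(raw_scores) / len(raw_scores)
    if mean == 0:
        raise ValueError("the mean raw score is zero, so the ratio is undefined")
    return [x / mean for x in raw_scores]


raw = [0.25, 0.5, 0.75]      # mean is 0.5
scaled = scalar_scores(raw)
print(scaled)
```

A scaled value above 1 marks a contact that scores higher than the average of the map, and a value below 1 marks one that scores lower.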

The score is saved in a separate conkit.core.Contact attribute called scalar_score

This score is described in more detail in [R7].

[R7] S. Ovchinnikov, L. Kinch, H. Park, Y. Liao, J. Pei, D.E. Kim, H. Kamisetty, N.V. Grishin, D. Baker (2015). Large-scale determination of previously unsolved protein structures using evolutionary information. Elife 4, e09248.

coverage

The sequence coverage score

The coverage score is calculated by analysing the number of residues covered by the predicted contact pairs.

\[Coverage=\frac{x_{cov}}{L}\]

Here \(x_{cov}\) is the number of residues covered by at least one predicted contact pair, and \(L\) is the number of residues in the sequence.
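The calculation can be illustrated with a short sketch. It assumes that \(x_{cov}\) counts the distinct residues that take part in at least one contact, as the description of covered residues suggests; the helper name coverage_score and the tuple representation of contacts are hypothetical.

```python
def coverage_score(contacts, sequence_length):
    """Fraction of residues that appear in at least one contact pair."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    # Collect every residue number that occurs in any contact pair.
    covered = {residue for pair in contacts for residue in pair}
    return len(covered) / sequence_length


contacts = [(1, 5), (2, 7), (5, 7)]   # residues 1, 2, 5 and 7 are covered
cov = coverage_score(contacts, 8)     # 4 covered residues in a sequence of 8
print(cov)
```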

Returns:

cov : float

The calculated coverage score

See also

precision

find(indexes, altloc=False)[source]

Find all contacts associated with index

Parameters:

indexes : list, tuple

A list of residue indexes to find

altloc : bool

Use the res_altloc positions [default: False]

Returns:

conkit.core.ContactMap

A modified version of the contact map containing the found contacts
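The lookup can be pictured with the sketch below. It assumes that a contact is selected when either of its two residues is among the requested indexes; this matching rule, the helper name find_contacts, and the tuple representation are assumptions of the sketch and are not confirmed by this page.

```python
def find_contacts(contacts, indexes):
    """Return the contacts in which at least one residue is in ``indexes``."""
    wanted = set(indexes)
    # Assumption: one matching residue is enough to select the contact.
    return [(res1, res2) for res1, res2 in contacts
            if res1 in wanted or res2 in wanted]


contacts = [(1, 10), (5, 30), (7, 10)]
found = find_contacts(contacts, [10])
print(found)
```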

match(other, remove_unmatched=False, renumber=False, inplace=False)[source]

Modify both hierarchies so residue numbers match one another.

This function is key when plotting contact maps or visualising them in 3-dimensional space, particularly when residue numbers in the structure do not start at count 0 or when peptide chain breaks are present.

Parameters:

other : conkit.core.ContactMap

remove_unmatched : bool, optional

Remove all unmatched contacts [default: False]

renumber : bool, optional

Renumber the res_seq entries [default: False]

If True, res1_seq and res2_seq change but id remains the same

inplace : bool, optional

Modify this contact map in place rather than a copy of it [default: False]

Returns:

hierarchy_mod

conkit.core.ContactMap instance, regardless of inplace

ncontacts

The number of conkit.core.Contact instances in the conkit.core.ContactMap

Returns:

ncontacts : int

The number of contacts in the conkit.core.ContactMap

plot_map(other=None, reference=None, altloc=False, file_format='png', file_name='contactmap.png')[source]

Produce a 2D contact map plot

Parameters:

other : conkit.core.ContactMap, optional

reference : conkit.core.ContactMap, optional

A ConKit conkit.core.ContactMap [this map refers to the reference contacts]

altloc : bool

Use the res_altloc positions [default: False]

file_format : str, optional

Plot figure format. See matplotlib.pyplot.savefig() for options [default: png]

file_name : str, optional

File name to which the contact map will be printed [default: contactmap.png]

Raises:

RuntimeError

Matplotlib not installed

Warning

If the file_name argument is not changed, the existing file is overwritten on every call.

precision

The precision (Positive Predictive Value) score

The precision value is calculated by analysing the true and false positive contacts.

\[Precision=\frac{TruePositives}{TruePositives + FalsePositives}\]

The status of each contact, i.e. whether it is a true or a false positive, can be determined by running the match() function with a reference structure.
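A minimal sketch of the calculation, using the standard definition of the positive predictive value, TruePositives / (TruePositives + FalsePositives). The helper name precision_score and the boolean flags standing in for the contact status are hypothetical; ConKit stores the status on each conkit.core.Contact after match() has been run.

```python
def precision_score(is_true_positive):
    """Positive predictive value from per-contact true/false positive flags."""
    flags = list(is_true_positive)
    if not flags:
        raise ValueError("at least one matched contact is required")
    true_positives = sum(1 for flag in flags if flag)
    false_positives = len(flags) - true_positives
    return true_positives / (true_positives + false_positives)


# True marks a contact confirmed by the reference, False one that is not.
statuses = [True, True, True, False]
ppv = precision_score(statuses)
print(ppv)
```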

Returns:

ppv : float

The calculated precision score

See also

coverage

remove_neighbors(min_distance=5, inplace=False)[source]

Remove contacts between neighboring residues

Parameters:

min_distance : int, optional

The minimum sequence separation between the two residues of a contact [default: 5]

inplace : bool, optional

Modify this contact map in place rather than a copy of it [default: False]

Returns:

contact_map : conkit.core.ContactMap

The reference to the conkit.core.ContactMap, regardless of inplace
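The filtering step can be sketched as follows. The sketch assumes that a contact is dropped when the sequence separation of its two residues is smaller than min_distance; whether the boundary is inclusive is not stated on this page, so treat that detail as an assumption. The helper name drop_neighbors and the tuple representation are hypothetical.

```python
def drop_neighbors(contacts, min_distance=5):
    """Keep only contacts whose residues are at least ``min_distance`` apart."""
    # Assumption: a separation exactly equal to min_distance is kept.
    return [(res1, res2) for res1, res2 in contacts
            if abs(res1 - res2) >= min_distance]


contacts = [(1, 3), (1, 6), (10, 20)]
kept = drop_neighbors(contacts, min_distance=5)
print(kept)
```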

repr_sequence

The representative conkit.core.Sequence associated with the conkit.core.ContactMap

The peptide sequence constructed from the available contacts using the normal res_seq positions

Returns:

sequence : conkit.core.Sequence

Raises:

TypeError

Sequence undefined

repr_sequence_altloc

The representative altloc conkit.core.Sequence associated with the conkit.core.ContactMap

The peptide sequence constructed from the available contacts using the altloc res_seq positions

Returns:

sequence : conkit.core.Sequence

Raises:

ValueError

Sequence undefined

rescale(inplace=False)[source]

Rescale the raw scores in conkit.core.ContactMap

Rescaling of the data is done to normalize the raw scores to be in the range [0, 1]. The formula to rescale the data is:

\[{x}'=\frac{x-min(d)}{max(d)-min(d)}\]

where \(x\) is the original value and \(d\) is the set of all values to be rescaled.
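The min-max formula can be sketched in plain Python on a list of raw scores. The helper name rescale_scores is hypothetical, and raising an error when all scores are equal is a choice made for this sketch, since the formula is undefined in that case.

```python
def rescale_scores(raw_scores):
    """Min-max rescale raw scores into the range [0, 1]."""
    if not raw_scores:
        raise ValueError("at least one raw score is required")
    low, high = min(raw_scores), max(raw_scores)
    if high == low:
        # Choice made for this sketch: the denominator would be zero.
        raise ValueError("all scores are equal, so rescaling is undefined")
    return [(x - low) / (high - low) for x in raw_scores]


raw = [2.0, 4.0, 6.0]
rescaled = rescale_scores(raw)
print(rescaled)
```

The smallest raw score always maps to 0 and the largest to 1, so rescaled maps from different predictors share a common scale.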

Parameters:

inplace : bool, optional

Modify this contact map in place rather than a copy of it [default: False]

Returns:

contact_map : conkit.core.ContactMap

The reference to the conkit.core.ContactMap, regardless of inplace

sequence

The conkit.core.Sequence associated with the conkit.core.ContactMap

Returns:

conkit.core.Sequence

sort(kword, reverse=False, inplace=False)[source]

Sort the conkit.core.ContactMap

Parameters:

kword : str

The dictionary key to sort contacts by

reverse : bool, optional

Sort the contact pairs in descending order [default: False]

inplace : bool, optional

Replace the saved order of contacts [default: False]

Returns:

contact_map : conkit.core.ContactMap

The reference to the conkit.core.ContactMap, regardless of inplace

Raises:

ValueError

top_contact

The first conkit.core.Contact entry in conkit.core.ContactMap

Returns:

top_contact : conkit.core.Contact, None