SequenceFile container used throughout ConKit
SequenceAlignmentState
Bases: enum.Enum
Alignment states
aligned = 2
unaligned = 1
unknown = 0
SequenceFile(id)
Bases: conkit.core._entity._Entity
A sequence file object representing a single sequence file
The SequenceFile class represents a data structure to hold Sequence instances in a single sequence file. It contains functions to store and analyze sequences.
Examples
>>> from conkit.core import Sequence, SequenceFile
>>> sequence_file = SequenceFile("example")
>>> sequence_file.add(Sequence("foo", "ABCDEF"))
>>> sequence_file.add(Sequence("bar", "ZYXWVU"))
>>> print(sequence_file)
SequenceFile(id="example" nseq=2)
ascii_matrix
The alignment encoded in a 2D ASCII matrix
calculate_freq()
Calculate the gap frequency in each alignment column
This function calculates the frequency of gaps at each position in the Multiple Sequence Alignment.
Returns:  list


Raises:  MemoryError
RuntimeError
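As an illustration of what a per-column gap frequency means, here is a minimal standalone sketch. It does not use ConKit; the function name `gap_frequencies`, the `-` gap symbol, and the choice to raise `RuntimeError` for unequal sequence lengths are assumptions made for this example.

```python
def gap_frequencies(alignment, gap="-"):
    """Return the fraction of gap characters in each alignment column."""
    if not alignment:
        return []
    length = len(alignment[0])
    # Assumption: an alignment requires all sequences to have equal length.
    if any(len(seq) != length for seq in alignment):
        raise RuntimeError("sequences differ in length; not an alignment")
    nseq = len(alignment)
    # For every column, count the gaps and divide by the number of sequences.
    return [
        sum(seq[col] == gap for seq in alignment) / nseq
        for col in range(length)
    ]


msa = ["AC-E", "A--E", "ACDE", "-CDE"]
print(gap_frequencies(msa))  # [0.25, 0.25, 0.5, 0.0]
```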

calculate_neff_with_identity(identity)
Calculate the number of effective sequences with the specified sequence identity
See also
calculate_weights(identity=0.8)
Calculate the sequence weights
This function calculates the sequence weights in the Multiple Sequence Alignment.
The mathematical function used to calculate Meff is
Parameters:  identity : float, optional


Returns:  list

Raises:  MemoryError
RuntimeError
ValueError
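The sketch below shows one common way to weight sequences by identity. It is an assumption, not a formula taken from this page: each sequence receives the weight 1 / n, where n is the number of sequences (itself included) that are at least `identity` identical to it, and Meff is the sum of all weights. The helper names `sequence_identity` and `sequence_weights` are hypothetical, and the code does not call ConKit.

```python
def sequence_identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def sequence_weights(alignment, identity=0.8):
    """Weight each sequence by the inverse of its neighbour count."""
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must lie between 0 and 1")
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("sequences differ in length; not an alignment")
    weights = []
    for a in alignment:
        # Every sequence counts itself, so the neighbour count is at least 1.
        neighbours = sum(sequence_identity(a, b) >= identity for b in alignment)
        weights.append(1.0 / neighbours)
    return weights


msa = ["ABCDE", "ABCDF", "VWXYZ"]
weights = sequence_weights(msa, identity=0.8)
print(weights)       # [0.5, 0.5, 1.0]
print(sum(weights))  # 2.0, the effective number of sequences under this definition
```

The first two sequences share four of five positions, so they count as neighbours of each other and split their weight, while the third sequence keeps a weight of 1.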

empty
Whether the SequenceFile is empty
filter(min_id=0.3, max_id=0.9, inplace=False)
Filter an alignment
Parameters:  min_id : float, optional
max_id : float, optional
inplace : bool, optional


Returns:  obj

Raises:  MemoryError
RuntimeError
ValueError

is_alignment
A boolean status for the alignment
Returns:  bool


neff
The number of effective sequences
nseq
The number of sequences
remark
The SequenceFile-specific remarks
sort(kword, reverse=False, inplace=False)
Sort the SequenceFile
Parameters:  kword : str
reverse : bool, optional
inplace : bool, optional


Returns:  obj

Raises:  ValueError
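To illustrate how the `kword`, `reverse` and `inplace` parameters can interact, here is a minimal standalone sketch. The `Record` class and `sort_records` function are hypothetical stand-ins rather than ConKit code, and raising `ValueError` for an unknown attribute name is an assumption about when such an error might occur.

```python
class Record:
    """Hypothetical stand-in for an entry with sortable attributes."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq


def sort_records(records, kword, reverse=False, inplace=False):
    """Sort records by the attribute named in kword."""
    if not all(hasattr(record, kword) for record in records):
        raise ValueError("cannot sort by unknown attribute: " + kword)
    # Work on the caller's list when inplace is set, otherwise on a shallow copy.
    target = records if inplace else list(records)
    target.sort(key=lambda record: getattr(record, kword), reverse=reverse)
    return target


records = [Record("foo", "ABCDEF"), Record("bar", "ZYXWVU")]
by_id = sort_records(records, "id")
print([r.id for r in by_id])    # ['bar', 'foo']
print([r.id for r in records])  # ['foo', 'bar'], the original order is kept
```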

status
An indication of the residue status, i.e., true positive, false positive, or unknown
top_sequence
The first Sequence entry in the SequenceFile
Returns:  obj


trim(start, end, inplace=False)
Trim the SequenceFile
Parameters:  start : int
end : int
inplace : bool, optional


Returns:  obj
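A minimal sketch of trimming every sequence to a residue window follows. It assumes that `start` and `end` are 1-based and inclusive, which this page does not state; the function name `trim_alignment` is hypothetical and the code works on plain strings rather than ConKit objects.

```python
def trim_alignment(alignment, start, end, inplace=False):
    """Cut every sequence down to the columns start..end (1-based, inclusive)."""
    if not 1 <= start <= end:
        raise ValueError("expected 1 <= start <= end")
    # Convert the 1-based inclusive window to a Python slice.
    trimmed = [seq[start - 1:end] for seq in alignment]
    if inplace:
        alignment[:] = trimmed  # overwrite the caller's list contents
        return alignment
    return trimmed


msa = ["ABCDEF", "ZYXWVU"]
print(trim_alignment(msa, 2, 4))  # ['BCD', 'YXW']
print(msa)                        # ['ABCDEF', 'ZYXWVU'], unchanged without inplace
```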
