Struct gapi_grpc::google::genomics::v1::Read
A read alignment describes a linear alignment of a string of DNA to a reference sequence (google.genomics.v1.Reference), in addition to metadata about the fragment (the molecule of DNA sequenced) and the read (the bases which were read by the sequencer). A read is equivalent to a line in a SAM file. A read belongs to exactly one read group and exactly one read group set (google.genomics.v1.ReadGroupSet).
For more genomics resource definitions, see Fundamentals of Google Genomics.
Reverse-stranded reads
Mapped reads (reads having a non-null alignment) can be aligned to either the forward or the reverse strand of their associated reference. Strandedness of a mapped read is encoded by alignment.position.reverseStrand.
If we consider the reference to be a forward-stranded coordinate space of [0, reference.length) with 0 as the left-most position and reference.length as the right-most position, reads are always aligned left to right. That is, alignment.position.position always refers to the left-most reference coordinate and alignment.cigar describes the alignment of this read to the reference from left to right. All per-base fields such as alignedSequence and alignedQuality share this same left-to-right orientation; this is true of reads which are aligned to either strand. For reverse-stranded reads, this means that alignedSequence is the reverse complement of the bases that were originally reported by the sequencing machine.
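As an illustration of the orientation rule, the following minimal Rust sketch recovers the bases in the order the sequencer reported them. The function names `reverse_complement` and `original_bases` are hypothetical helpers, not part of this crate, and the `reverse_strand` parameter stands in for the value of `alignment.position.reverse_strand`.

```rust
/// Returns the reverse complement of a DNA string.
/// Symbols other than A, C, G, T (for example `N`) are passed through unchanged.
fn reverse_complement(bases: &str) -> String {
    bases
        .chars()
        .rev()
        .map(|b| match b {
            'A' => 'T',
            'T' => 'A',
            'C' => 'G',
            'G' => 'C',
            'a' => 't',
            't' => 'a',
            'c' => 'g',
            'g' => 'c',
            other => other,
        })
        .collect()
}

/// Recovers the bases as originally reported by the sequencing machine.
/// `reverse_strand` is assumed to come from `alignment.position.reverse_strand`.
fn original_bases(aligned_sequence: &str, reverse_strand: bool) -> String {
    if reverse_strand {
        reverse_complement(aligned_sequence)
    } else {
        aligned_sequence.to_string()
    }
}
```

Because the quality values share the same left-to-right orientation, recovering their original order for a reverse-stranded read would presumably only require reversing the vector, with no complement step.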
Generating a reference-aligned sequence string
When interacting with mapped reads, it’s often useful to produce a string representing the local alignment of the read to the reference. The following pseudocode demonstrates one way of doing this:
    out = ""
    offset = 0
    for c in read.alignment.cigar {
      switch c.operation {
      case "ALIGNMENT_MATCH", "SEQUENCE_MATCH", "SEQUENCE_MISMATCH":
        out += read.alignedSequence[offset:offset+c.operationLength]
        offset += c.operationLength
        break
      case "CLIP_SOFT", "INSERT":
        offset += c.operationLength
        break
      case "PAD":
        out += repeat("*", c.operationLength)
        break
      case "DELETE":
        out += repeat("-", c.operationLength)
        break
      case "SKIP":
        out += repeat(" ", c.operationLength)
        break
      case "CLIP_HARD":
        break
      }
    }
    return out
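The pseudocode can be sketched in Rust roughly as follows. To keep the example self-contained, `Op` and `CigarUnit` are simplified stand-ins defined here and are not the generated types of this crate; the function name `reference_aligned_string` is likewise hypothetical. The sketch assumes the aligned sequence is ASCII, so byte offsets and base offsets coincide.

```rust
/// Simplified stand-in for the CIGAR operation enum (illustration only).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    AlignmentMatch,
    Insert,
    Delete,
    Skip,
    ClipSoft,
    ClipHard,
    Pad,
    SequenceMatch,
    SequenceMismatch,
}

/// Simplified stand-in for one CIGAR unit: an operation and its length.
struct CigarUnit {
    operation: Op,
    operation_length: usize,
}

/// Builds the reference-aligned string for a read.
/// Returns `None` if the CIGAR consumes more bases than the sequence holds.
fn reference_aligned_string(aligned_sequence: &str, cigar: &[CigarUnit]) -> Option<String> {
    let mut out = String::new();
    let mut offset = 0usize;
    for c in cigar {
        let len = c.operation_length;
        match c.operation {
            // Operations that consume both read and reference: copy the bases.
            Op::AlignmentMatch | Op::SequenceMatch | Op::SequenceMismatch => {
                // `get` yields None instead of panicking on an out-of-range slice.
                out.push_str(aligned_sequence.get(offset..offset + len)?);
                offset += len;
            }
            // Operations that consume only read bases: skip over them.
            Op::ClipSoft | Op::Insert => offset += len,
            // Operations that consume only reference positions: emit fillers.
            Op::Pad => out.push_str(&"*".repeat(len)),
            Op::Delete => out.push_str(&"-".repeat(len)),
            Op::Skip => out.push_str(&" ".repeat(len)),
            // Hard-clipped bases are absent from the aligned sequence.
            Op::ClipHard => {}
        }
    }
    Some(out)
}
```

Returning `Option` rather than panicking is a design choice of this sketch: a malformed record whose CIGAR overruns the sequence yields `None`.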
Converting to SAM’s CIGAR string
The following pseudocode generates a SAM CIGAR string from the cigar field. Note that this is a lossy conversion (cigar.referenceSequence is lost).
    cigarMap = {
      "ALIGNMENT_MATCH": "M",
      "INSERT": "I",
      "DELETE": "D",
      "SKIP": "N",
      "CLIP_SOFT": "S",
      "CLIP_HARD": "H",
      "PAD": "P",
      "SEQUENCE_MATCH": "=",
      "SEQUENCE_MISMATCH": "X",
    }
    cigarStr = ""
    for c in read.alignment.cigar {
      cigarStr += c.operationLength + cigarMap[c.operation]
    }
    return cigarStr
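A minimal Rust sketch of the same conversion is shown below. It mirrors the pseudocode by keying on the operation names as strings and takes the CIGAR as a slice of (operation name, length) pairs. Both choices are assumptions made to keep the example self-contained, and `sam_op_code` and `sam_cigar_string` are hypothetical helper names rather than functions of this crate.

```rust
/// Maps an operation name, as used in the pseudocode, to its SAM CIGAR code.
/// Returns `None` for names that have no SAM equivalent in the table.
fn sam_op_code(operation: &str) -> Option<char> {
    Some(match operation {
        "ALIGNMENT_MATCH" => 'M',
        "INSERT" => 'I',
        "DELETE" => 'D',
        "SKIP" => 'N',
        "CLIP_SOFT" => 'S',
        "CLIP_HARD" => 'H',
        "PAD" => 'P',
        "SEQUENCE_MATCH" => '=',
        "SEQUENCE_MISMATCH" => 'X',
        _ => return None,
    })
}

/// Concatenates `<length><code>` for every CIGAR unit.
/// The reference sequence attached to a unit is not represented, so the
/// conversion is lossy, as noted in the text.
fn sam_cigar_string(cigar: &[(&str, i64)]) -> Option<String> {
    let mut cigar_str = String::new();
    for &(operation, length) in cigar {
        cigar_str.push_str(&length.to_string());
        cigar_str.push(sam_op_code(operation)?);
    }
    Some(cigar_str)
}
```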
Fields
id: String
The server-generated read ID, unique across all reads. This is different from the fragmentName.
read_group_id: String
The ID of the read group this read belongs to. A read belongs to exactly one read group. This is a server-generated ID which is distinct from SAM’s RG tag (for that value, see google.genomics.v1.ReadGroup.name).
read_group_set_id: String
The ID of the read group set this read belongs to. A read belongs to exactly one read group set.
fragment_name: String
The fragment name. Equivalent to QNAME (query template name) in SAM.
proper_placement: bool
The orientation and the distance between reads from the fragment are consistent with the sequencing protocol (SAM flag 0x2).
duplicate_fragment: bool
The fragment is a PCR or optical duplicate (SAM flag 0x400).
fragment_length: i32
The observed length of the fragment, equivalent to TLEN in SAM.
read_number: i32
The read number in sequencing. 0-based and less than numberReads. This field replaces SAM flag 0x40 and 0x80.
number_reads: i32
The number of reads in the fragment (extension to SAM flag 0x1).
failed_vendor_quality_checks: bool
Whether this read did not pass filters, such as platform or vendor quality controls (SAM flag 0x200).
alignment: Option<LinearAlignment>
The linear alignment for this alignment record. This field is null for unmapped reads.
secondary_alignment: bool
Whether this alignment is secondary. Equivalent to SAM flag 0x100.
A secondary alignment represents an alternative to the primary alignment for this read. Aligners may return secondary alignments if a read can map ambiguously to multiple coordinates in the genome. By convention, each read has one and only one alignment where both secondaryAlignment and supplementaryAlignment are false.
supplementary_alignment: bool
Whether this alignment is supplementary. Equivalent to SAM flag 0x800.
Supplementary alignments are used in the representation of a chimeric alignment. In a chimeric alignment, a read is split into multiple linear alignments that map to different reference contigs. The first linear alignment in the read will be designated as the representative alignment; the remaining linear alignments will be designated as supplementary alignments. These alignments may have different mapping quality scores. In each linear alignment in a chimeric alignment, the read will be hard clipped. The alignedSequence and alignedQuality fields in the alignment record will only represent the bases for its respective linear alignment.
aligned_sequence: String
The bases of the read sequence contained in this alignment record, without CIGAR operations applied (equivalent to SEQ in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence.
aligned_quality: Vec<i32>
The quality of the read sequence contained in this alignment record (equivalent to QUAL in SAM). alignedSequence and alignedQuality may be shorter than the full read sequence and quality. This will occur if the alignment is part of a chimeric alignment, or if the read was trimmed. When this occurs, the CIGAR for this read will begin/end with a hard clip operator that will indicate the length of the excised sequence.
next_mate_position: Option<Position>
The mapping of the primary alignment of the (readNumber+1)%numberReads read in the fragment. It replaces mate position and mate strand in SAM.
info: HashMap<String, ListValue>
A map of additional read alignment information. This must be of the form map<string, string[]> (string key mapping to a list of string values).
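Several field descriptions above name the SAM flag bit they correspond to. The sketch below shows one way those bits might be combined into a partial SAM FLAG value. `ReadFlags` is a hypothetical stand-in holding only the relevant fields, not the generated `Read` struct, and `sam_flag_bits` is a hypothetical helper. The handling of 0x1, 0x40 and 0x80 (paired when `number_reads > 1`, first read sets 0x40, last read sets 0x80) is an assumption about how `read_number` and `number_reads` map back to SAM. Flags the field descriptions do not mention are left out.

```rust
/// Hypothetical stand-in with only the flag-related fields of `Read`.
#[derive(Default)]
struct ReadFlags {
    proper_placement: bool,
    duplicate_fragment: bool,
    read_number: i32,
    number_reads: i32,
    failed_vendor_quality_checks: bool,
    secondary_alignment: bool,
    supplementary_alignment: bool,
}

/// Combines the SAM flag bits named in the field descriptions.
fn sam_flag_bits(r: &ReadFlags) -> u16 {
    let mut flag = 0u16;
    // Assumption: a fragment with more than one read counts as "paired" (0x1).
    let paired = r.number_reads > 1;
    if paired {
        flag |= 0x1;
    }
    if r.proper_placement {
        flag |= 0x2;
    }
    // Assumption: first read sets 0x40 and last read sets 0x80,
    // and only for fragments with more than one read.
    if paired && r.read_number == 0 {
        flag |= 0x40;
    }
    if paired && r.read_number == r.number_reads - 1 {
        flag |= 0x80;
    }
    if r.secondary_alignment {
        flag |= 0x100;
    }
    if r.failed_vendor_quality_checks {
        flag |= 0x200;
    }
    if r.duplicate_fragment {
        flag |= 0x400;
    }
    if r.supplementary_alignment {
        flag |= 0x800;
    }
    flag
}
```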
Trait Implementations
impl Clone for Read
impl Debug for Read
impl Default for Read
impl Message for Read
fn encode_raw<B>(&self, buf: &mut B) where
    B: BufMut,
fn merge_field<B>(
    &mut self,
    tag: u32,
    wire_type: WireType,
    buf: &mut B,
    ctx: DecodeContext
) -> Result<(), DecodeError> where
    B: Buf,
fn encoded_len(&self) -> usize
fn clear(&mut self)
pub fn encode<B>(&self, buf: &mut B) -> Result<(), EncodeError> where
    B: BufMut,
pub fn encode_length_delimited<B>(&self, buf: &mut B) -> Result<(), EncodeError> where
    B: BufMut,
pub fn decode<B>(buf: B) -> Result<Self, DecodeError> where
    Self: Default,
    B: Buf,
pub fn decode_length_delimited<B>(buf: B) -> Result<Self, DecodeError> where
    Self: Default,
    B: Buf,
pub fn merge<B>(&mut self, buf: B) -> Result<(), DecodeError> where
    B: Buf,
pub fn merge_length_delimited<B>(&mut self, buf: B) -> Result<(), DecodeError> where
    B: Buf,
impl PartialEq<Read> for Read
impl StructuralPartialEq for Read
Auto Trait Implementations
impl RefUnwindSafe for Read
impl Send for Read
impl Sync for Read
impl Unpin for Read
impl UnwindSafe for Read
Blanket Implementations
impl<T> Any for T where
    T: 'static + ?Sized,
impl<T> Borrow<T> for T where
    T: ?Sized,
impl<T> BorrowMut<T> for T where
    T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
impl<T> From<T> for T
impl<T> Instrument for T
pub fn instrument(self, span: Span) -> Instrumented<Self>
pub fn in_current_span(self) -> Instrumented<Self>
impl<T> Instrument for T
pub fn instrument(self, span: Span) -> Instrumented<Self>
pub fn in_current_span(self) -> Instrumented<Self>
impl<T, U> Into<U> for T where
    U: From<T>,
impl<T> IntoRequest<T> for T
pub fn into_request(self) -> Request<T>
impl<T> ToOwned for T where
    T: Clone,
type Owned = T
The resulting type after obtaining ownership.
pub fn to_owned(&self) -> T
pub fn clone_into(&self, target: &mut T)
impl<T, U> TryFrom<U> for T where
    U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<V, T> VZip<V> for T where
    V: MultiLane<T>,
impl<T> WithSubscriber for T
pub fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self> where
    S: Into<Dispatch>,