Update to Marko's API #2

Merged
merged 35 commits into from
Sep 7, 2022
Changes from 34 commits
35 commits
41cdab5
Add bed serialization
GabrielSimonetto Jul 3, 2022
17a7479
Add bed serialization test
GabrielSimonetto Jul 3, 2022
6b7c622
Add bed from_reader deserialization
GabrielSimonetto Jul 3, 2022
8500ffb
Add basic deserializer (directly from serde docs)
GabrielSimonetto Jul 6, 2022
a83fcbd
Add Deserialization for bed::Record<3>
GabrielSimonetto Jul 7, 2022
d9a5ddc
Add de tests and vec tests
GabrielSimonetto Jul 8, 2022
14960da
Fail from_reader if any record is rejected
GabrielSimonetto Jul 12, 2022
d84c9e5
Introduce test_bed_deserialization_with_whitespaces (currently fails)
GabrielSimonetto Jul 12, 2022
fb52d40
Change expect's to unwrap's
GabrielSimonetto Jul 12, 2022
2ea2769
Fix comments
GabrielSimonetto Jul 13, 2022
114abb1
Remove unwraps and map to custom error
GabrielSimonetto Jul 18, 2022
c900009
Unnest impl blocks
GabrielSimonetto Jul 18, 2022
afb9130
ERASEME: showcasing derive for Deserialize
GabrielSimonetto Jul 19, 2022
07bd3d0
Revert "ERASEME: showcasing derive for Deserialize"
GabrielSimonetto Jul 19, 2022
b5e959b
Example of bed serialization and deserialization.
mmalenic Jul 25, 2022
f5a0dbd
Update example to use noodles Record.
mmalenic Jul 25, 2022
9b465d3
Update comments
mmalenic Jul 25, 2022
e0a892e
Update comments
mmalenic Jul 25, 2022
0484112
Implement verbose ser.rs
GabrielSimonetto Aug 14, 2022
bf0ddad
Add serde_with and wrapper to use Display as serialization
GabrielSimonetto Aug 16, 2022
afcaf12
wip: architecture suggestion
GabrielSimonetto Aug 16, 2022
3877beb
Using a custom Enum to allow all bed record types on AuxiliarBedRecor…
GabrielSimonetto Aug 17, 2022
de846c3
Revert "Using a custom Enum to allow all bed record types on Auxiliar…
GabrielSimonetto Aug 18, 2022
17fe14d
wip: failed attempt to force Display for Record<N>
GabrielSimonetto Aug 18, 2022
46b4400
Revert "wip: failed attempt to force Display for Record<N>"
GabrielSimonetto Aug 18, 2022
658ebed
Generalize for Record<N> using trait stacking
GabrielSimonetto Aug 18, 2022
5afcb5f
wip: loop on deserializer test
GabrielSimonetto Aug 25, 2022
d8edca5
Change wrapper to be newtype_struct, deserialization works
GabrielSimonetto Aug 29, 2022
eeb9f52
Adapt deserializer for sequences
GabrielSimonetto Aug 30, 2022
d3cc40b
Add test Record<4> deserialization
GabrielSimonetto Aug 30, 2022
530aec3
Cleanup
GabrielSimonetto Aug 30, 2022
80f6b38
Add all serde Record<N> tests
GabrielSimonetto Aug 30, 2022
167c8ab
Clean code:
GabrielSimonetto Sep 1, 2022
9975017
Adapt ser.rs to better showcase the inner workings of the code
GabrielSimonetto Sep 1, 2022
ce6fb3c
Apply change requests
GabrielSimonetto Sep 5, 2022
4 changes: 3 additions & 1 deletion noodles-bed/Cargo.toml
@@ -12,4 +12,6 @@ documentation = "https://docs.rs/noodles-bed"

[dependencies]
noodles-core = { path = "../noodles-core", version = "0.7.0" }
serde = { version = "1" }
serde_json = "1.0"
serde_with = "2.0.0"
296 changes: 296 additions & 0 deletions noodles-bed/src/de.rs
@@ -0,0 +1,296 @@
use crate::error;
use crate::record::{BedN, SerdeRecordWrapper};
use error::{Error, Result};
use serde::de::{DeserializeSeed, SeqAccess, Visitor};
use serde::{de, forward_to_deserialize_any, Deserialize};

pub struct RecordDeserializer<'de> {
    input: &'de str,
}

fn from_str<'a, T>(s: &'a str) -> Result<T>
where
    T: Deserialize<'a>,
{
    let mut deserializer = RecordDeserializer::from_str(s);
    let t = T::deserialize(&mut deserializer)?;
    if deserializer.input.is_empty() {
        Ok(t)
    } else {
        // Report leftover input as an error instead of panicking.
        Err(<Error as de::Error>::custom(
            "trailing characters after record",
        ))
    }
}

pub fn record_from_str<T>(s: &str) -> Result<T>
where
    T: BedN<3> + std::str::FromStr + std::fmt::Display,
    <T as std::str::FromStr>::Err: std::fmt::Display,
{
    let srw: SerdeRecordWrapper<T> = from_str(s)?;
    Ok(srw.0)
}

pub fn vec_record_from_str<T>(s: &str) -> Result<Vec<T>>
where
    T: BedN<3> + std::str::FromStr + std::fmt::Display,
    <T as std::str::FromStr>::Err: std::fmt::Display,
{
    let srw_vec: Vec<SerdeRecordWrapper<T>> = from_str(s)?;
    Ok(srw_vec.into_iter().map(|wrap| wrap.0).collect())
}

pub fn from_bytes<'a, T>(_records: &'a [u8]) -> Result<T>
where
    T: Deserialize<'a>,
{
    todo!()
}

impl<'de> RecordDeserializer<'de> {
    pub fn from_str(input: &'de str) -> Self {
        RecordDeserializer { input }
    }

    // Returns the next line without its trailing '\n' and advances `input`
    // past it. If no newline remains, the rest of the input is returned.
    fn parse_string(&mut self) -> &'de str {
        match self.input.find('\n') {
            Some(len) => {
                let s = &self.input[..len];
                self.input = &self.input[len + 1..];
                s
            }
            None => {
                let s = self.input;
                self.input = "";
                s
            }
        }
    }
}

impl<'de, 'a> de::Deserializer<'de> for &'a mut RecordDeserializer<'de> {
    type Error = Error;

    fn deserialize_any<V>(self, _visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        unreachable!()
    }

    fn deserialize_seq<V>(self, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        visitor.visit_seq(self)
    }

    fn deserialize_str<V>(self, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        visitor.visit_borrowed_str(self.parse_string())
    }

    fn deserialize_newtype_struct<V>(self, _name: &'static str, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        visitor.visit_newtype_struct(self)
    }

    forward_to_deserialize_any! {
        bool i8 i16 i32 i64 i128 u8 u16 u32 u64 u128 f32 f64 char string
        bytes byte_buf option unit unit_struct tuple struct
        tuple_struct map enum identifier ignored_any
    }
}

impl<'de> SeqAccess<'de> for RecordDeserializer<'de> {
    type Error = Error;

    fn next_element_seed<T>(&mut self, seed: T) -> Result<Option<T::Value>>
    where
        T: DeserializeSeed<'de>,
    {
        if self.input.is_empty() {
            Ok(None)
        } else {
            seed.deserialize(&mut *self).map(Some)
        }
    }
}

#[cfg(test)]
mod serde_tests {
    use crate::{
        record::{Color, Name, Score, Strand},
        Record,
    };

    use super::*;

    #[test]
    fn test_from_string_single_auxiliar_bed_record_wrapper() {
        let input = "sq0\t7\t13\n";
        let result: Record<3> = record_from_str(input).unwrap();

        let expected = Record::<3>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(noodles_core::Position::try_from(8).unwrap())
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_from_string_multiple_auxiliar_bed_record_wrapper() {
        let input = "sq0\t7\t13\nsq1\t13\t18\n";
        let result: Vec<Record<3>> = vec_record_from_str(input).unwrap();

        let record1 = Record::<3>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(noodles_core::Position::try_from(8).unwrap())
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .build()
            .unwrap();
        let record2 = Record::<3>::builder()
            .set_reference_sequence_name("sq1")
            .set_start_position(noodles_core::Position::try_from(14).unwrap())
            .set_end_position(noodles_core::Position::try_from(18).unwrap())
            .build()
            .unwrap();
        let expected = vec![record1, record2];

        assert_eq!(result, expected);
    }

    #[test]
    fn test_from_string_single_auxiliar_bed_record_4_wrapper() {
        let input = "sq0\t7\t13\tndls1";
        let result: Record<4> = record_from_str(input).unwrap();

        let expected = Record::<4>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(noodles_core::Position::try_from(8).unwrap())
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .set_name("ndls1".parse::<Name>().unwrap())
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_5_wrapper() {
        let input = "sq0\t7\t13\t.\t21";
        let result: Record<5> = record_from_str(input).unwrap();

        let expected = Record::<5>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(noodles_core::Position::try_from(8).unwrap())
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .set_score(Score::try_from(21).unwrap())
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_6_wrapper() {
        let input = "sq0\t7\t13\t.\t0\t+";
        let result: Record<6> = record_from_str(input).unwrap();

        let expected = Record::<6>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(noodles_core::Position::try_from(8).unwrap())
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .set_strand(Strand::Forward)
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_7_wrapper() {
        let input = "sq0\t7\t13\t.\t0\t.\t7";
        let result: Record<7> = record_from_str(input).unwrap();

        let start = noodles_core::Position::try_from(8).unwrap();

        let expected = Record::<7>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(start)
            .set_end_position(noodles_core::Position::try_from(13).unwrap())
            .set_thick_start(start)
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_8_wrapper() {
        let input = "sq0\t7\t13\t.\t0\t.\t7\t13";
        let result: Record<8> = record_from_str(input).unwrap();

        let start = noodles_core::Position::try_from(8).unwrap();
        let end = noodles_core::Position::try_from(13).unwrap();

        let expected = Record::<8>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(start)
            .set_end_position(end)
            .set_thick_start(start)
            .set_thick_end(end)
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_9_wrapper() {
        let input = "sq0\t7\t13\t.\t0\t.\t7\t13\t255,0,0";
        let result: Record<9> = record_from_str(input).unwrap();

        let start = noodles_core::Position::try_from(8).unwrap();
        let end = noodles_core::Position::try_from(13).unwrap();

        let expected = Record::<9>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(start)
            .set_end_position(end)
            .set_thick_start(start)
            .set_thick_end(end)
            .set_color(Color::RED)
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }

    #[test]
    fn test_to_string_single_auxiliar_bed_record_12_wrapper() {
        let input = "sq0\t7\t13\t.\t0\t.\t7\t13\t0\t1\t2\t0";
        let result: Record<12> = record_from_str(input).unwrap();

        let start = noodles_core::Position::try_from(8).unwrap();
        let end = noodles_core::Position::try_from(13).unwrap();

        let expected = Record::<12>::builder()
            .set_reference_sequence_name("sq0")
            .set_start_position(start)
            .set_end_position(end)
            .set_thick_start(start)
            .set_thick_end(end)
            .set_blocks(vec![(0, 2)])
            .build()
            .unwrap();

        assert_eq!(result, expected);
    }
}
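Note on de.rs: `parse_string` hands out one line per call and `next_element_seed` stops once the input is empty, so a multi-record string becomes a sequence of line-sized records. The tests also show that a BED start of `7` is expected to become position `8` (0-based start, 1-based position), while the end stays `13`. The sketch below illustrates both points using only the standard library. It is not the PR's code: `LineCursor`, `parse_bed3`, and `parse_all` are hypothetical names, serde is left out on purpose, and plain tuples stand in for `Record<3>`.

```rust
/// Hypothetical stand-in for `RecordDeserializer`: a cursor over the unread input.
struct LineCursor<'a> {
    input: &'a str,
}

impl<'a> LineCursor<'a> {
    fn new(input: &'a str) -> Self {
        Self { input }
    }

    /// Returns the next line without its trailing '\n' and advances the cursor.
    /// Returns `None` once the input is exhausted.
    fn next_line(&mut self) -> Option<&'a str> {
        if self.input.is_empty() {
            return None;
        }
        match self.input.find('\n') {
            Some(len) => {
                let line = &self.input[..len];
                self.input = &self.input[len + 1..];
                Some(line)
            }
            None => {
                let line = self.input;
                self.input = "";
                Some(line)
            }
        }
    }
}

/// Parses the first three tab-separated fields of one line.
/// The 0-based BED start is shifted to a 1-based position; the end is kept as is.
fn parse_bed3(line: &str) -> Result<(String, usize, usize), String> {
    let mut fields = line.split('\t');
    let name = fields
        .next()
        .filter(|s| !s.is_empty())
        .ok_or_else(|| "missing reference sequence name".to_string())?;
    let start = fields
        .next()
        .ok_or_else(|| "missing start position".to_string())?
        .parse::<usize>()
        .map_err(|e| format!("invalid start position: {}", e))?;
    let end = fields
        .next()
        .ok_or_else(|| "missing end position".to_string())?
        .parse::<usize>()
        .map_err(|e| format!("invalid end position: {}", e))?;
    Ok((name.to_string(), start + 1, end))
}

/// Parses every line and fails as soon as one record is rejected.
fn parse_all(input: &str) -> Result<Vec<(String, usize, usize)>, String> {
    let mut cursor = LineCursor::new(input);
    let mut records = Vec::new();
    while let Some(line) = cursor.next_line() {
        records.push(parse_bed3(line)?);
    }
    Ok(records)
}
```

Under these assumptions, `parse_all("sq0\t7\t13\nsq1\t13\t18\n")` yields two records, and a line with a missing field makes the whole call fail rather than being skipped.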
44 changes: 7 additions & 37 deletions noodles-bed/src/error.rs
@@ -1,64 +1,34 @@
use std;
use std::fmt::{self, Display};
use std::io;
use std::io::ErrorKind;

use serde::{de, ser};

pub type Result<T> = std::result::Result<T, Error>;

// This is a bare-bones implementation. A real library would provide additional
// information in its error type, for example the line and column at which the
// error occurred, the byte offset into the input, or the current key being
// processed.
#[derive(Debug)]
pub enum Error {
    // One or more variants that can be created by data structures through the
    // `ser::Error` and `de::Error` traits. For example the Serialize impl for
    // Mutex<T> might return an error because the mutex is poisoned, or the
    // Deserialize impl for a struct may return an error because a required
    // field is missing.
    Message(String),

    // Zero or more variants that can be created directly by the Serializer and
    // Deserializer without going through `ser::Error` and `de::Error`. These
    // are specific to the format, in this case JSON.
    Eof,
    Syntax,
    ExpectedBoolean,
    ExpectedInteger,
    ExpectedString,
    ExpectedNull,
    ExpectedArray,
    ExpectedArrayComma,
    ExpectedArrayEnd,
    ExpectedMap,
    ExpectedMapColon,
    ExpectedMapComma,
    ExpectedMapEnd,
    ExpectedEnum,
    TrailingCharacters,
    Error(io::Error),
}

impl ser::Error for Error {
    fn custom<T: Display>(msg: T) -> Self {
        Error::Message(msg.to_string())
        Error::Error(io::Error::new(ErrorKind::Other, msg.to_string()))
    }
}

impl de::Error for Error {
    fn custom<T: Display>(msg: T) -> Self {
        Error::Message(msg.to_string())
        Error::Error(io::Error::new(ErrorKind::Other, msg.to_string()))
    }
}

impl Display for Error {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Error::Message(msg) => formatter.write_str(msg),
            Error::Eof => formatter.write_str("unexpected end of input"),
            /* and so forth */
            _ => todo!()
            Error::Error(err) => formatter.write_str(&err.to_string()),
        }
    }
}

impl std::error::Error for Error {}
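Note on error.rs: the comment in this file says a real library would attach more context to its errors, such as the line at which the failure occurred. The following sketch shows one possible shape for that, using only the standard library. It is not part of this PR: `SketchError` and its `Parse` variant are hypothetical, and wiring it into serde's `ser::Error` and `de::Error` traits is left out.

```rust
use std::error::Error as StdError;
use std::fmt;
use std::io;

/// Hypothetical error type: wraps I/O failures and records the line of a parse failure.
#[derive(Debug)]
pub enum SketchError {
    Io(io::Error),
    Parse { line: usize, message: String },
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Io(err) => write!(f, "I/O error: {}", err),
            SketchError::Parse { line, message } => write!(f, "line {}: {}", line, message),
        }
    }
}

impl StdError for SketchError {
    /// Exposes the wrapped I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            SketchError::Io(err) => Some(err),
            SketchError::Parse { .. } => None,
        }
    }
}

impl From<io::Error> for SketchError {
    /// Lets `?` convert `io::Error` values into `SketchError::Io`.
    fn from(err: io::Error) -> Self {
        SketchError::Io(err)
    }
}
```

Keeping parse failures in their own variant, instead of folding every message into an `io::Error` of kind `Other`, would let callers distinguish malformed records from genuine I/O problems.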
6 changes: 3 additions & 3 deletions noodles-bed/src/lib.rs
@@ -6,13 +6,13 @@ mod reader;
pub mod record;
mod writer;

pub use self::{reader::Reader, record::Record, writer::Writer};

// SerDe
mod de;
mod error;
mod ser;

//pub use de::{from_str, Deserializer};
pub use de::{from_bytes, record_from_str, vec_record_from_str, RecordDeserializer};
pub use error::{Error, Result};
pub use ser::{to_string, Serializer};
pub use ser::{record_to_string, to_bytes, vec_record_to_string, RecordSerializer};
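Note on the exported helpers: `record_from_str` deserializes into a `SerdeRecordWrapper<T>` and returns its inner value, with `T` bounded by `FromStr` and `Display`. A rough, serde-free sketch of that wrapper idea follows. Everything in it is an assumption made for illustration: `Bed3`, `Wrapper`, `record_from_line`, and `record_to_line` are hypothetical names and not the noodles types or functions, and the start coordinate is stored exactly as written rather than converted to a position.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical three-column record; not the noodles `Record<3>`.
#[derive(Debug, PartialEq)]
struct Bed3 {
    name: String,
    start: u64,
    end: u64,
}

impl fmt::Display for Bed3 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}\t{}\t{}", self.name, self.start, self.end)
    }
}

impl FromStr for Bed3 {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let fields: Vec<&str> = s.trim_end_matches('\n').split('\t').collect();
        if fields.len() != 3 {
            return Err(format!("expected 3 fields, found {}", fields.len()));
        }
        let start = fields[1]
            .parse::<u64>()
            .map_err(|e| format!("invalid start: {}", e))?;
        let end = fields[2]
            .parse::<u64>()
            .map_err(|e| format!("invalid end: {}", e))?;
        Ok(Bed3 { name: fields[0].to_string(), start, end })
    }
}

/// Hypothetical newtype wrapper: decoding is delegated to the inner type's `FromStr`.
struct Wrapper<T>(T);

impl<T> Wrapper<T>
where
    T: FromStr,
    T::Err: fmt::Display,
{
    fn decode(s: &str) -> Result<Self, String> {
        s.parse::<T>().map(Wrapper).map_err(|e| e.to_string())
    }
}

/// Decodes through the wrapper, then unwraps the inner record.
fn record_from_line<T>(s: &str) -> Result<T, String>
where
    T: FromStr,
    T::Err: fmt::Display,
{
    Wrapper::<T>::decode(s).map(|wrap| wrap.0)
}

/// Encodes a record through its `Display` implementation.
fn record_to_line<T: fmt::Display>(record: &T) -> String {
    record.to_string()
}
```

With these definitions, a line parsed by `record_from_line::<Bed3>` and written back by `record_to_line` should reproduce the original text, which is the round-trip property the exported `*_from_str` and `*_to_string` pairs suggest.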