Possible faster version of chunk_on #22

Open
CodyKochmann opened this issue Mar 28, 2020 · 1 comment

CodyKochmann commented Mar 28, 2020

Test whether this version of chunk_on is faster than the previous version, which uses deques.

# -*- coding: utf-8 -*-
# @Author: Cody Kochmann

from itertools import groupby, chain

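def chunk_on(pipeline, new_chunk_signal, output_type=tuple):
    ''' split the stream into separate chunks based on a new chunk signal '''
    assert callable(new_chunk_signal), 'chunk_on needs new_chunk_signal to be callable'
    assert callable(output_type), 'chunk_on needs output_type to be callable'

    # holder carries the signal item that opens the next chunk (None when empty)
    holder = None
    # groupby compares keys for equality, so new_chunk_signal should return a real bool
    for signal, group in groupby(pipeline, new_chunk_signal):
        if signal:
            # the first signal item in this run opens a new chunk
            for i in group:
                holder = [i]
                break
            # every further consecutive signal item closes the previous one-item chunk
            for i in group:
                yield output_type(holder)
                holder = [i]
        else:
            if holder is None:
                # items that arrive before the first signal form their own chunk
                yield output_type(group)
            else:
                # the held signal item followed by the non-signal items after it
                yield output_type(chain(holder, group))
                holder = None
    # the stream ended on a signal item, so flush it as the last chunk
    if holder:
        yield output_type(holder)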
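
One thing worth checking before benchmarking: `groupby` groups on equal keys, not on truthiness. As far as I can tell, that means a `new_chunk_signal` that returns assorted truthy values (rather than `True`/`False`) would split a run of signal items into several groups, and the `holder` from one group would be overwritten by the next without being yielded. The sketch below only shows the grouping difference with plain `groupby`. `as_bool` and `raw_signal` are hypothetical names made up for illustration, not part of the library.

```python
from itertools import groupby


def as_bool(signal):
    """Hypothetical helper: wrap a signal so it always returns True or False."""
    return lambda item: bool(signal(item))


def raw_signal(x):
    """Example signal that returns the item itself: truthy, but not a bool."""
    return x


data = [1, 2, 3, 0, 0]

# Raw keys: 1, 2 and 3 are all truthy, yet each one lands in its own group.
raw_groups = [(key, list(group)) for key, group in groupby(data, raw_signal)]

# Normalized keys: adjacent truthy items share a single group.
bool_groups = [(key, list(group)) for key, group in groupby(data, as_bool(raw_signal))]

print(raw_groups)   # [(1, [1]), (2, [2]), (3, [3]), (0, [0, 0])]
print(bool_groups)  # [(True, [1, 2, 3]), (False, [0, 0])]
```

If non-bool signals need to be supported, wrapping the signal this way before handing it to `groupby` is one option, at the cost of an extra call per item.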

CodyKochmann self-assigned this Mar 28, 2020

CodyKochmann (Owner, Author) commented

This one might be faster as well...

# -*- coding: utf-8 -*-
# @Author: Cody Kochmann

from itertools import groupby, chain

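def chunk_on(pipeline, new_chunk_signal, output_type=tuple):
    ''' split the stream into separate chunks based on a new chunk signal '''
    assert callable(new_chunk_signal), 'chunk_on needs new_chunk_signal to be callable'
    assert callable(output_type), 'chunk_on needs output_type to be callable'

    # holder carries the signal item that opens the next chunk; an empty list
    # replaces the None check because chaining an empty list adds nothing
    holder = []

    # groupby compares keys for equality, so new_chunk_signal should return a real bool
    for signal, group in groupby(pipeline, new_chunk_signal):
        if signal:
            # the first signal item in this run opens a new chunk
            for i in group:
                holder = [i]
                break
            # every further consecutive signal item closes the previous one-item chunk
            for i in group:
                yield output_type(holder)
                holder = [i]
        else:
            # the held signal item (if any) followed by the non-signal items after it
            yield output_type(chain(holder, group))
            holder = []
    # the stream ended on a signal item, so flush it as the last chunk
    if holder:
        yield output_type(holder)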
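
A rough way to test the "might be faster" claim is to first confirm the candidates agree on their output and only then time them. The harness below is a minimal sketch under a few assumptions: `chunk_on_groupby` is a copy of the version above (asserts left out), `chunk_on_naive` is a hypothetical plain-loop baseline that stands in for the deque version (which is not shown in this issue), and `is_boundary`, `compare`, the data size and the repeat count are all made up for illustration.

```python
from itertools import chain, groupby
from timeit import timeit


def chunk_on_groupby(pipeline, new_chunk_signal, output_type=tuple):
    """The groupby-based version from this issue, copied so the sketch runs alone."""
    holder = []
    for signal, group in groupby(pipeline, new_chunk_signal):
        if signal:
            for i in group:
                holder = [i]
                break
            for i in group:
                yield output_type(holder)
                holder = [i]
        else:
            yield output_type(chain(holder, group))
            holder = []
    if holder:
        yield output_type(holder)


def chunk_on_naive(pipeline, new_chunk_signal, output_type=tuple):
    """Hypothetical plain-loop baseline; NOT the deque version from the repo."""
    chunk = []
    for item in pipeline:
        # a signal item closes the chunk collected so far, then opens a new one
        if new_chunk_signal(item) and chunk:
            yield output_type(chunk)
            chunk = []
        chunk.append(item)
    if chunk:
        yield output_type(chunk)


def is_boundary(x):
    """Example signal (an assumption): every multiple of 10 starts a new chunk."""
    return x % 10 == 0


def compare(candidates, data, signal, repeat=20):
    """Check that every candidate yields the same chunks, then time each one."""
    results = {name: list(fn(data, signal)) for name, fn in candidates.items()}
    first = next(iter(results.values()))
    assert all(r == first for r in results.values()), 'candidates disagree'
    return {
        name: timeit(lambda fn=fn: list(fn(data, signal)), number=repeat)
        for name, fn in candidates.items()
    }


timings = compare(
    {'groupby': chunk_on_groupby, 'naive': chunk_on_naive},
    data=range(10_000),
    signal=is_boundary,
)
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f}s')
```

The numbers will vary by machine and by how often the signal fires, so it is probably worth running this with a few different chunk sizes, and swapping in the real deque version, before drawing conclusions.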
