
How to parse/tokenize only INDENT and DEDENT tokens? #2

Open
leodevbro opened this issue Feb 9, 2021 · 2 comments


@leodevbro

Hello, I'm working on my VS Code extension, blockman. It renders blocks around nested code blocks to make the code easier to perceive visually.

video:
https://youtu.be/2Ajh8WQJvHs

The "dt-python-parser" package works really well, I use "getAllTokens" function to get all the tokens from python text file and then filter only type 93 (INDENT) and type 94 (DEDENT).

I need only the INDENT and DEDENT locations, nothing more, but "getAllTokens" tokenizes everything and therefore loses a lot of time. If a file has more than 1000 lines, say 5000 or so, "getAllTokens" takes many seconds to return, so rendering the blocks takes longer and the wait is not very comfortable for the user.

So, as an optimization, can I use the parser/tokenizer in such a way that it produces only the INDENT/DEDENT locations and nothing more?

@ProfBramble
Contributor

Hi leodevbro,

Thanks for contacting dt-python-parser support!

Your feedback on the efficiency of the package matches one of the priorities on our roadmap, and we will consider adding filtering support to "getAllTokens" in a future release.
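In the meantime, one possible workaround is to skip the lexer entirely and compute INDENT/DEDENT positions with a hand-rolled line scanner that mimics Python's indentation-stack algorithm. A minimal sketch (it counts tabs as one column and ignores line continuations, brackets, and multi-line strings, which a complete implementation would need to handle):

```ts
type IndentEvent = { kind: 'INDENT' | 'DEDENT'; line: number };

// Single pass over the lines, maintaining an indentation stack the way
// Python's own tokenizer does: push and emit INDENT when a line is
// indented deeper than the stack top, pop and emit DEDENT while it is
// shallower.
function scanIndents(source: string): IndentEvent[] {
  const events: IndentEvent[] = [];
  const stack: number[] = [0];
  source.split(/\r?\n/).forEach((text, i) => {
    // Blank and comment-only lines never change the indentation level.
    if (/^\s*(#.*)?$/.test(text)) return;
    const width = text.length - text.trimStart().length;
    if (width > stack[stack.length - 1]) {
      stack.push(width);
      events.push({ kind: 'INDENT', line: i + 1 });
    } else {
      while (width < stack[stack.length - 1]) {
        stack.pop();
        events.push({ kind: 'DEDENT', line: i + 1 });
      }
    }
  });
  return events;
}
```

Since this is one linear pass with no lexer state, it should stay fast even on files with tens of thousands of lines.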

@leodevbro
Author

Hello again. It's been 8 months since I posted this feature request, so I would like to ask: are there now any plans to implement a filter option for the "getAllTokens" function? It would be a huge help. Currently it takes about 10 seconds to tokenize a 10,000-line file, which is too long; I need only the INDENT and DEDENT tokens, but it tokenizes everything at the cost of speed.
