improve efficiency of read.csv and write.csv #41

Open
hturner opened this issue Aug 22, 2023 Discussed in #7 · 0 comments
Labels
I/O Issues related to input/output

Comments

@hturner
Member

hturner commented Aug 22, 2023

Discussed in #7

Originally posted by tdhock July 1, 2023
Hi! I will not be attending the sprint, but I had a couple of ideas related to improving efficiency of read.csv and write.csv.

Probably the more important issue to address would be read.csv, which had time complexity quadratic in the number of columns; see this issue for some empirical analysis:
tdhock/atime#8
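
For anyone who wants to check this locally, here is a minimal base-R sketch of one way to estimate the scaling empirically. It is not the method used in the linked issue: the helper names `time_read_csv` and `loglog_slope`, the column counts, and the two-row test data are all assumptions made for illustration. It times `read.csv` on files with a doubling number of columns and fits a line on the log-log scale, where a slope near 1 would suggest linear time and a slope near 2 quadratic time.

```r
# Time a single read.csv call on a temporary file with n_cols numeric columns.
# (Hypothetical helper, not part of R or atime.)
time_read_csv <- function(n_cols, n_rows = 2L) {
  df <- as.data.frame(matrix(1, nrow = n_rows, ncol = n_cols))
  path <- tempfile(fileext = ".csv")
  on.exit(unlink(path))
  write.csv(df, path, row.names = FALSE)
  system.time(read.csv(path))[["elapsed"]]
}

# Slope of log(seconds) against log(size).
# Timings of zero are dropped because their logarithm is not finite.
loglog_slope <- function(size, seconds) {
  keep <- seconds > 0
  if (sum(keep) < 2L) return(NA_real_)
  fit <- lm(log(seconds[keep]) ~ log(size[keep]))
  unname(coef(fit)[2L])
}

# Assumed sizes; increase the upper bound for a clearer trend.
n_cols <- 2^(6:11)
seconds <- vapply(n_cols, time_read_csv, numeric(1))
print(data.frame(n_cols, seconds))
print(loglog_slope(n_cols, seconds))
```

Single timings are noisy, so the slope from one run is only a rough indication; repeating each measurement and taking the median would be more reliable.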

Another issue is that write.csv uses linear memory, whereas other CSV writers use only constant memory (this is not that big of an issue, though, because you need linear memory to store the data in R before writing to CSV anyway)
tdhock/atime#10
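
To illustrate what bounded extra memory could look like from user code, here is a hedged sketch that writes a data frame in fixed-size row chunks through an open connection. It is not how write.csv is implemented, nor a proposed patch: `write_csv_chunked` and `chunk_rows` are hypothetical names, and the sketch assumes that the working memory `write.table` needs scales with the rows it is handed in each call.

```r
# Write df to path as CSV, formatting at most chunk_rows rows per call.
# (Hypothetical helper built on base R's write.table.)
write_csv_chunked <- function(df, path, chunk_rows = 1000L) {
  con <- file(path, open = "wt")
  on.exit(close(con))
  n <- nrow(df)
  # Start index of each chunk; a zero-row data frame still gets a header line.
  starts <- if (n > 0L) seq(1L, n, by = chunk_rows) else 1L
  for (i in seq_along(starts)) {
    rows <- seq_len(min(chunk_rows, n - starts[i] + 1L)) + starts[i] - 1L
    # sep and qmethod match what write.csv uses; header only on the first chunk.
    write.table(df[rows, , drop = FALSE], con, sep = ",", qmethod = "double",
                row.names = FALSE, col.names = (i == 1L))
  }
  invisible(path)
}

# Usage: 2500 rows written in chunks of 1000, then read back.
df <- data.frame(id = 1:2500, label = rep(c("a", "b,c"), length.out = 2500))
path <- write_csv_chunked(df, tempfile(fileext = ".csv"), chunk_rows = 1000L)
print(nrow(read.csv(path)))
```

The data frame itself still occupies linear memory, as noted above; only the temporary memory used while formatting is limited by the chunk size.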

@gmbecker @bastistician may be able to help mentor? They worked on fixing a similar efficiency issue: tdhock/atime#9

@hturner hturner added the I/O Issues related to input/output label Aug 22, 2023