Fails on binary field data #1
MySQL allows storing "binary" data types. mysqldump outputs these as raw byte strings, which results in invalid UTF-8 sequences; those cause a stack trace when the dump file is piped through hasten.

I imagine it would be rather difficult to identify these binary strings and treat them separately in order to avoid this. I guess the other alternative would be to load the file as ASCII 8-bit, but then I'm not sure what would happen with valid UTF-8 characters...

Bit of a pickle; I'm not sure if there is a good solution, but I wanted to at least get the issue out there. I was trying it out on one of our dumps to see if it would speed things up on restore, which we sorely need, but this is unfortunately a blocker for using hasten.
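For what it's worth, a minimal sketch of the "load as ASCII 8-bit" alternative mentioned above (the file name is a placeholder, and hasten's actual internals may differ):

```sh
# In Ruby, running string operations (e.g. regex matches) on a UTF-8 line
# that contains raw binary raises "invalid byte sequence in UTF-8".
# Reading with ASCII-8BIT treats every byte as opaque, so ASCII-only
# matching still works, but multi-byte UTF-8 characters lose their
# meaning. "dump.sql" is a placeholder name.
ruby -e 'File.foreach("dump.sql", encoding: "ASCII-8BIT") { |line| line =~ /^INSERT/ }'
```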
Thanks for the report! If you could make available an example dump containing such a binary string, I could tackle it. Or if you want to open a PR, I would gladly review it.
Test file available at https://gist.github.com/HerbCSO/2b4213cdc1e81df121e8 (uuencoded). I'd love to do a PR, but as I mentioned I'm not really sure how to approach it. To reproduce, once you have the file uudecoded, run:
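A plausible shape for that command, assuming hypothetical file names and a stdin-based hasten invocation (both are assumptions, not taken from the thread):

```sh
# Hypothetical reproduction: decode the uuencoded gist, then pipe the
# resulting dump through hasten. File names are assumptions.
uudecode test_dump.uu        # writes the dump file named in its header
hasten < test_dump.sql > /dev/null
```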
+1
+1
+1
If anyone is interested, I've modified it here: #2
Adding the mysqldump option --hex-blob seems to be a workaround for this issue at the moment.
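A sketch of that workaround (database and file names are placeholders):

```sh
# --hex-blob makes mysqldump emit binary columns as hex literals
# (e.g. 0x616263 instead of raw bytes), so binary data comes out as
# plain ASCII and hasten never sees an invalid UTF-8 sequence from it.
mysqldump --hex-blob mydb > dump.sql
```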
I've made a bash script here: https://github.com/meme-lord/hasten_py/blob/master/hasten.sh
Interesting script, although that isn't quite what hasten does. Hasten removes the key definitions on new tables and then adds them back in at the end via ALTERs. I think the autocommit-0 angle is promising, but for very large databases I'm not sure one commit at the very end is a great idea. I am probably going to write my own script that disables autocommit and then adds a COMMIT after a configurable number of INSERT lines, as a happy medium between committing every statement and committing only once at the very end. I think that should speed up INSERTs into large InnoDB tables considerably, without hasten's more complex approach.
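A minimal sketch of that batching idea (not hasten's actual approach; the batch size and file names are assumptions):

```sh
# Disable autocommit up front, then emit a COMMIT after every n INSERT
# statements instead of a single commit at the very end.
awk -v n=1000 '
  BEGIN { print "SET autocommit=0;" }
  { print }
  /^INSERT/ { if (++i % n == 0) print "COMMIT;" }
  END { print "COMMIT;" }
' dump.sql > dump_batched.sql
```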
That's an excellent idea. I can try to implement it the next time I need to do a large import.
It definitely does, but I think that is best handled by mysqldump's --extended-insert flag:
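For illustration (the database name is a placeholder): --extended-insert is on by default in mysqldump, and it is what batches many rows into a single statement.

```sh
# --extended-insert (the default) batches rows:
#   INSERT INTO t VALUES (1,'a'),(2,'b'),(3,'c');
# --skip-extended-insert emits one row per statement:
#   INSERT INTO t VALUES (1,'a');
mysqldump --extended-insert mydb > dump.sql
```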