ParallelMySQLDump
Adding parallelism to mysqldump
This is part of the Summer of Code 2008 project proposals.
Synopsis
The idea of this project is to parallelize the mysqldump program. The MySQL server is fully multi-threaded and can handle large databases, but mysqldump itself runs single-threaded, so dumping a very large database can be painfully slow. Parallelizing the mysqldump tool should yield significant performance improvements for both power users and small users of MySQL.
Parallelization
Incremental steps to achieve parallelization:
- For each table, fork a process that dumps it into its own file (assuming a single machine runs the program).
- Limit the number of processes created on a single machine.
- Create a limited number of concurrent threads inside each process.
- Limit the number of concurrent threads dumping.
- Add support for multi-computer parallelism.
- [OPTIONAL] Enable across-machine dumps (dumps generated by one machine are moved to another).
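The first two steps above (one dump process per table, with a cap on how many run at once) can be sketched as follows. This is only an illustration in Python, not the proposed change to mysqldump itself. The names `build_dump_command`, `dump_tables`, and `runner` are hypothetical, and a thread pool supervising child processes stands in for an explicit fork loop. The only mysqldump features assumed are the `db_name tbl_name` argument form and the `--result-file` option.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_dump_command(database, table, out_dir="."):
    """Return the mysqldump argv that writes one table to its own file."""
    return [
        "mysqldump",
        f"--result-file={out_dir}/{database}.{table}.sql",
        database,
        table,
    ]


def dump_tables(database, tables, max_workers=4, out_dir=".",
                runner=subprocess.run):
    """Dump each table in its own process, at most max_workers at a time.

    `runner` (hypothetical hook) is called with the argv list; it defaults
    to subprocess.run and can be replaced to try the scheduling without a
    MySQL server. Returns a dict mapping table name to the runner's result.
    """
    def dump_one(table):
        return table, runner(build_dump_command(database, table, out_dir))

    # The pool size is the limit on concurrently running dump processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(dump_one, tables))
```

A call such as `dump_tables("shop", ["orders", "customers"], max_workers=2)` would start at most two mysqldump processes at once. Note that tables dumped by separate processes are not guaranteed to form one consistent snapshot, which a real implementation would need to address.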
Timeline
- Examine mysqldump code - until May 23
- Parallel version of mysqldump (step S4 completed) working and tested - until mid-June*
- Parallel version of mysqldump (step S7 completed) working and tested - until mid-August
* To be delivered also to the Multi-core Systems course
Deliverables
- A parallelized version of mysqldump
- Updated version of the mysqldump page in the MySQL Reference Manual