
5.21 System Restart Handling in Phase 4

This includes the following steps:

  1. The master sets the latest GCI as the restart GCI, and then synchronizes its system file to all other nodes involved in the system restart (a sketch of this step appears after this list).

  2. The next step is to synchronize the schema of all the nodes in the system restart. This is performed in 15 passes. The problem being solved here occurs when a schema object was created while the starting node was up but was dropped while that node was down, and possibly a new object was even created with the same schema ID while the node was unavailable. To handle this situation, it is necessary first to re-create all objects that are supposed to exist from the viewpoint of the starting node. After this, any objects that were dropped by other nodes in the cluster while this node was dead are dropped; this also applies to any tables that were dropped during the outage. Finally, any tables that have been created by other nodes while the starting node was unavailable are re-created on the starting node. All these operations are local to the starting node. As part of this process, it is also necessary to ensure that all tables that need to be re-created have been created locally and that the proper data structures have been set up for them in all kernel blocks (see the schema-reconciliation sketch after this list).

    After performing the procedure described previously for the master node, the new schema file is sent to all other participants in the system restart, and they perform the same synchronization.

  3. All fragments involved in the restart must have proper parameters as derived from DBDIH. This causes a number of START_FRAGREQ signals to be sent from DBDIH to DBLQH, and also starts the restoration of the fragments. Fragments are restored one by one and one record at a time; the restore data is read from disk and applied into main memory in parallel with the reading. This restores only the main memory parts of the tables (see the fragment-restore sketch after this list).

  4. Once all fragments have been restored, a START_RECREQ message is sent to all nodes in the starting cluster, and then all undo logs for any Disk Data parts of the tables are applied.

  5. After applying the undo logs in LGMAN, it is necessary to perform some restore work in TSMAN that requires scanning the extent headers of the tablespaces (steps 4 and 5 are sketched together after this list).

  6. Next, it is necessary to prepare for execution of the redo log, which can be performed in up to four phases. For each fragment, execution of redo logs from several different nodes may be required. This is handled by executing the redo logs in different phases for a specific fragment, as decided in DBDIH when sending the START_FRAGREQ signal. An EXEC_FRAGREQ signal is sent for each phase and fragment that requires execution in this phase. After these signals are sent, an EXEC_SRREQ signal is sent to all nodes to tell them that they can start executing the redo log (see the phase-scheduling sketch after this list).

    Note

    Before starting execution of the first redo log, it is necessary to make sure that the setup which was started earlier (in Phase 4) by DBLQH has finished, or to wait until it does before continuing.

  7. Prior to executing the redo log, it is necessary to calculate where to start reading and where the end of the redo log lies. The end of the redo log has been reached when the last GCI to restore is found in the log (see the redo-window sketch after this list).

  8. After completing the execution of the redo logs, all redo log pages that were written beyond the last GCI to be restored are invalidated. Given the cyclic nature of the redo logs, this invalidation can carry over into redo log files beyond the last one executed (see the invalidation sketch after this list).

  9. After the completion of the previous step, DBLQH reports this back to DBDIH using a START_RECCONF message.

  10. When the master has received this message back from all starting nodes, it sends an NDB_STARTCONF signal back to NDBCNTR (see the confirmation sketch after this list).

  11. The NDB_STARTCONF message signals the end of STTOR phase 4 to NDBCNTR, which is the only block involved to any significant degree in this phase.
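
The sketches below illustrate several of the steps above in simplified C++; all of them are illustrative models rather than actual NDB kernel code. This first one corresponds to step 1: the master fixes the restart GCI and pushes its system file to every participant. SysFile, syncSystemFile, and the printed "send" are invented stand-ins; the real kernel performs a per-node signal exchange.

    // Illustrative sketch of step 1; all names are invented.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct SysFile {                 // simplified stand-in for the system file
      uint64_t latestGCI;            // newest completed global checkpoint
      uint64_t restartGCI;           // GCI the system restart recovers to
    };

    // Master step: fix the restart GCI, then push the system file to every
    // participant. In the kernel this is a per-node signal exchange rather
    // than a direct call; printing stands in for that transport.
    void syncSystemFile(SysFile &sf, const std::vector<uint32_t> &participants) {
      sf.restartGCI = sf.latestGCI;
      for (uint32_t nodeId : participants)
        std::printf("send SysFile(restartGCI=%llu) to node %u\n",
                    (unsigned long long)sf.restartGCI, nodeId);
    }

    int main() {
      SysFile sf{4711, 0};
      syncSystemFile(sf, {2, 3, 4});
    }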
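A minimal sketch of the schema reconciliation in step 2, assuming a simplified schema file keyed by schema ID with a version that changes when an ID is reused; the 15 passes and the per-kernel-block bookkeeping are not modeled.

    // Illustrative sketch of step 2; the schema-file model is invented.
    #include <cstdint>
    #include <cstdio>
    #include <map>

    struct SchemaObject { uint32_t version; };           // version changes when
                                                         // a schema ID is reused
    using SchemaFile = std::map<uint32_t, SchemaObject>; // schema ID -> object

    // Reconcile the starting node's old schema view against the cluster's
    // current one, in the order described in step 2.
    void synchronizeSchema(SchemaFile &local, const SchemaFile &cluster) {
      // 1. Re-create everything the starting node believes should exist, so
      //    the following drop/create steps act on well-defined local state.
      for (const auto &e : local)
        std::printf("re-create local object %u\n", e.first);

      // 2. Drop objects dropped (or replaced under the same schema ID) by
      //    other nodes while this node was dead.
      for (auto it = local.begin(); it != local.end();) {
        auto c = cluster.find(it->first);
        if (c == cluster.end() || c->second.version != it->second.version) {
          std::printf("drop stale object %u\n", it->first);
          it = local.erase(it);
        } else {
          ++it;
        }
      }

      // 3. Re-create objects added by other nodes during the outage.
      for (const auto &e : cluster)
        if (local.emplace(e.first, e.second).second)
          std::printf("create new object %u\n", e.first);
    }

    int main() {
      SchemaFile local   = {{1, {1}}, {2, {1}}};           // view at shutdown
      SchemaFile cluster = {{1, {1}}, {2, {2}}, {3, {1}}}; // ID 2 reused, 3 new
      synchronizeSchema(local, cluster);
    }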
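A minimal sketch of the fragment restoration in step 3. Record, Fragment, and the data source are invented, and the parallel disk-read/apply pipeline in DBLQH is reduced to a sequential loop.

    // Illustrative sketch of step 3; types and data source are invented.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct Record   { uint64_t key; uint64_t value; };
    struct Fragment { uint32_t id; std::vector<Record> lcpData; };

    // Fragments are restored one by one, one record at a time, rebuilding
    // only the main-memory parts of the table; Disk Data parts are handled
    // later by the undo and redo log passes.
    void restoreFragment(const Fragment &frag) {
      for (const Record &r : frag.lcpData)
        std::printf("fragment %u: restore key=%llu\n", frag.id,
                    (unsigned long long)r.key);
    }

    int main() {
      std::vector<Fragment> frags = {{0, {{1, 10}, {2, 20}}}, {1, {{3, 30}}}};
      for (const Fragment &f : frags)   // one START_FRAGREQ per fragment
        restoreFragment(f);
    }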
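A combined sketch of steps 4 and 5, with invented undo-record and extent types; the real on-disk formats used by LGMAN and TSMAN are far richer.

    // Illustrative sketch of steps 4 and 5; record layouts are invented.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct UndoRecord { uint64_t lsn; uint32_t pageId; };  // one undo entry
    struct Extent     { uint32_t id; bool inUse; };        // one extent header

    // LGMAN-style pass: undo records are applied newest-first, rolling each
    // Disk Data page back to its state at the restart GCI.
    void applyUndoLog(const std::vector<UndoRecord> &log) {
      for (auto it = log.rbegin(); it != log.rend(); ++it)
        std::printf("undo lsn=%llu on page %u\n",
                    (unsigned long long)it->lsn, it->pageId);
    }

    // TSMAN-style pass: rebuild allocation state by scanning every extent
    // header in the tablespace.
    void scanExtentHeaders(const std::vector<Extent> &extents) {
      for (const Extent &e : extents)
        std::printf("extent %u: %s\n", e.id, e.inUse ? "in use" : "free");
    }

    int main() {
      applyUndoLog({{100, 7}, {101, 3}, {102, 7}});
      scanExtentHeaders({{0, true}, {1, false}, {2, true}});
    }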
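A sketch of the phase scheduling in step 6. The per-fragment plan that DBDIH ships with START_FRAGREQ is modeled here as a plain struct, and signal sending is reduced to printing.

    // Illustrative sketch of step 6; the plan structure is invented.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct RedoSource { uint32_t phase; uint32_t nodeId; }; // whose log, when
    struct FragPlan   { uint32_t fragId; std::vector<RedoSource> sources; };

    void scheduleRedoExecution(const std::vector<FragPlan> &plans) {
      for (uint32_t phase = 0; phase < 4; ++phase) {        // up to 4 phases
        for (const FragPlan &fp : plans)
          for (const RedoSource &src : fp.sources)
            if (src.phase == phase)
              std::printf("EXEC_FRAGREQ frag=%u node=%u phase=%u\n",
                          fp.fragId, src.nodeId, phase);
        // After the requests for this phase are out, every node is told it
        // may start executing the redo log.
        std::printf("EXEC_SRREQ to all nodes (phase %u)\n", phase);
      }
    }

    int main() {
      // Fragment 0 needs node 2's log in phase 0 and node 3's in phase 1;
      // fragment 1 needs only node 3's log in phase 0.
      scheduleRedoExecution({{0, {{0, 2}, {1, 3}}}, {1, {{0, 3}}}});
    }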
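A sketch of the calculation in step 7, assuming the redo log is visible as a flat, GCI-ordered sequence of entries; the real log is organized in files and parts rather than an in-memory vector.

    // Illustrative sketch of step 7; the flat log model is an assumption.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct RedoEntry { uint64_t gci; };

    // Compute the half-open window of redo entries to execute: start after
    // what the restored checkpoint already contains, and stop once an entry
    // past the last GCI to restore is seen.
    void findRedoWindow(const std::vector<RedoEntry> &log,
                        uint64_t lcpGci, uint64_t lastGci) {
      size_t start = log.size(), stop = log.size();
      for (size_t i = 0; i < log.size(); ++i) {
        if (start == log.size() && log[i].gci > lcpGci) start = i;
        if (log[i].gci > lastGci) { stop = i; break; }
      }
      std::printf("execute redo entries [%zu, %zu)\n", start, stop);
    }

    int main() {
      findRedoWindow({{10}, {11}, {12}, {13}, {14}}, 11, 13); // -> [2, 4)
    }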
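A sketch of the invalidation pass in step 8, assuming a small circular log of four files; as the step describes, the walk past the last executed position may wrap into the following file(s).

    // Illustrative sketch of step 8; the log geometry is invented.
    #include <cstdint>
    #include <cstdio>
    #include <functional>

    const uint32_t NUM_FILES = 4, PAGES_PER_FILE = 8;

    // Walk forward from the first page past the last executed position,
    // invalidating every page written after the last restored GCI; because
    // the log is cyclic, the walk may cross into the following file(s).
    void invalidateTail(uint32_t file, uint32_t page,
                        const std::function<bool(uint32_t, uint32_t)> &written) {
      while (written(file, page)) {
        std::printf("invalidate file %u page %u\n", file, page);
        if (++page == PAGES_PER_FILE) {     // wrap into the next redo file
          page = 0;
          file = (file + 1) % NUM_FILES;
        }
      }
    }

    int main() {
      // Pretend pages 6..7 of file 2 and page 0 of file 3 hold stale entries.
      invalidateTail(2, 6, [](uint32_t f, uint32_t p) {
        return (f == 2 && p >= 6) || (f == 3 && p == 0);
      });
    }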
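Finally, a sketch of the confirmation flow in steps 9 through 11, with the master's bookkeeping reduced to a set of node IDs still owing a START_RECCONF.

    // Illustrative sketch of steps 9-11; the bookkeeping is invented.
    #include <cstdint>
    #include <cstdio>
    #include <set>

    struct MasterState {
      std::set<uint32_t> waitingFor;   // starting nodes not yet confirmed
    };

    // Called once per START_RECCONF; the last confirmation completes STTOR
    // phase 4, which the master reports to NDBCNTR with NDB_STARTCONF.
    void onStartRecConf(MasterState &m, uint32_t nodeId) {
      m.waitingFor.erase(nodeId);
      if (m.waitingFor.empty())
        std::printf("NDB_STARTCONF to NDBCNTR: phase 4 complete\n");
    }

    int main() {
      MasterState m{{2, 3, 4}};
      onStartRecConf(m, 3);
      onStartRecConf(m, 2);
      onStartRecConf(m, 4);            // last one triggers NDB_STARTCONF
    }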