README
------------
$Id$
Christopher Browne
cbbrowne@gmail.com
2006-09-29
------------

The script configure-replication.sh is intended to allow the gentle
user to readily configure replication of the LedgerSMB database schema
using the Slony-I replication system for PostgreSQL.

For more general details about Slony-I, see <http://slony.info/>.

This script uses a number of environment variables to determine the
shape of the configuration.  In many cases, the defaults should be at
least nearly OK...

Global:

  CLUSTER    - Name of the Slony-I cluster
  NUMNODES   - Number of nodes to set up
  PGUSER     - Name of the PostgreSQL superuser controlling replication
  PGPORT     - Default port number
  PGDATABASE - Default database name

For each node, there are also four parameters; for node 1:

  DB1   - Database to connect to
  USER1 - Superuser to connect as
  PORT1 - Port
  HOST1 - Host

It is quite likely that DB*, USER*, and PORT* should be drawn from the
default PGDATABASE, PGUSER, and PGPORT values above; that sort of
uniformity is usually a good thing.

In contrast, HOST* values should be set explicitly for HOST1, HOST2,
..., as you don't get much benefit from the redundancy replication
provides if all your databases are on the same server!

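For instance, a minimal two-node configuration might be set up along
these lines; every value below is purely illustrative, so substitute
your own cluster name, databases, hosts, and superuser, and adjust the
way the generator script is invoked to suit your installation:

  # Illustrative values only -- substitute your own.
  export CLUSTER=ledgersmb_cluster
  export NUMNODES=2
  export PGUSER=postgres
  export PGPORT=5432
  export PGDATABASE=ledgersmb

  # Node 1 (origin) and node 2 (subscriber); only the hosts differ.
  export DB1=$PGDATABASE USER1=$PGUSER PORT1=$PGPORT HOST1=db1.example.com
  export DB2=$PGDATABASE USER2=$PGUSER PORT2=$PGPORT HOST2=db2.example.com

  # Run the generator (adjust the path to where the script lives).
  sh configure-replication.sh
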
slonik config files are generated in a temp directory under /tmp.  The
usage is thus:

1.  preamble.slonik is a "preamble" containing connection info used by
    the other scripts.

    Verify the info in this one closely; you may want to keep it
    permanently for use with other maintenance work on the cluster.

2.  create_nodes.slonik

    This is the first script to run; it sets up the requested nodes as
    being Slony-I nodes, adding in some Slony-I-specific config tables
    and such.

    You can/should start slon processes any time after this step has
    run (see the consolidated example after this list).

3.  store_paths.slonik

    This is the second script to run; it indicates how the slons
    should intercommunicate.  It assumes that all slons can talk to
    all nodes, which may not be a valid assumption in a
    complexly-firewalled environment.

4.  create_set.slonik

    This sets up the replication set consisting of the whole bunch of
    tables and sequences that make up the LedgerSMB database schema.

    When you run this script, all that happens is that triggers are
    added on the origin node (node #1) that start collecting updates;
    replication won't start until #5...

    There are two assumptions in this script that could be invalidated
    by circumstances:

    1.  That all of the LedgerSMB tables and sequences have been
        included.

        This becomes invalid if new tables get added to LedgerSMB and
        don't get added to the TABLES list in the generator script.

    2.  That all tables have been defined with primary keys.

        This *should* be the case soon if not already.

5.  subscribe_set_2.slonik

    (And subscribe_set_3, subscribe_set_4, subscribe_set_5, and so on,
    if you set the number of nodes higher...)

    This is the step that "fires up" replication.

    The assumption that the script generator makes is that all the
    subscriber nodes will want to subscribe directly to the origin
    node.  If you plan to have "sub-clusters", perhaps where there is
    something of a "master" location at each data centre, you may
    need to revise that.

    The slon processes really ought to be running by the time you
    attempt running this step.  To do otherwise would be rather
    foolish.

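Putting the steps together, a run-through for the illustrative
two-node setup above might look roughly like the sketch below.  The
temp directory name is made up (use the one the generator script
reports), and whether the preamble needs to be prepended explicitly
depends on how the generated files reference it, so check their
contents first:

  cd /tmp/slony-config.XXXXX       # placeholder; use the real directory

  # Step 2: register the nodes.
  cat preamble.slonik create_nodes.slonik | slonik

  # Start a slon daemon per node -- any time after the nodes exist.
  slon $CLUSTER "dbname=$DB1 host=$HOST1 port=$PORT1 user=$USER1" &
  slon $CLUSTER "dbname=$DB2 host=$HOST2 port=$PORT2 user=$USER2" &

  # Step 3: tell the slons how to reach one another.
  cat preamble.slonik store_paths.slonik | slonik

  # Step 4: define the replication set on the origin (node 1).
  cat preamble.slonik create_set.slonik | slonik

  # Step 5: subscribe node 2 (and 3, 4, ... if present) to the origin.
  cat preamble.slonik subscribe_set_2.slonik | slonik
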
Once all of these slonik scripts have been run, replication may be
expected to continue to run as long as slons stay running.

If you have an outage, where a database server or a server hosting
slon processes falls over, and it's not so serious that a database
gets mangled, then no big deal: just restart the postmaster and
restart slon processes, and replication should pick up.

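For instance, after a clean reboot of the host carrying node 2 in the
illustrative setup above, recovery might be as simple as the
following (the data directory path is made up):

  # Bring PostgreSQL back up (adjust the data directory to your install).
  pg_ctl -D /var/lib/postgresql/data start

  # Restart the slon daemon for the node that lives on this host.
  slon $CLUSTER "dbname=$DB2 host=$HOST2 port=$PORT2 user=$USER2" &
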
If something does get mangled, then actions get more complicated:

1 - If the failure was of the "origin" database, then you probably want
    to use FAIL OVER to shift the "master" role to another system.

2 - If a subscriber failed, and other nodes were drawing data from it,
    then you could submit SUBSCRIBE SET requests to point those other
    nodes to some node that is "less mangled."  That is not a real big
    deal; note that this does NOT require that they get re-subscribed
    from scratch; they can pick up (hopefully) whatever data they
    missed and simply catch up by using a different data source.

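Sketched in slonik terms (Slony-I 1.x-era syntax; the node, set, and
provider numbers here are illustrative, and the preamble is assumed to
carry the admin conninfo lines), those two cases might look like:

  # Case 1: origin (node 1) is lost; promote node 2 to be the new origin.
  (cat preamble.slonik
   echo "failover (id = 1, backup node = 2);") | slonik

  # Case 2: node 2 is mangled, but node 3 was feeding from it; repoint
  # node 3 at node 1 -- it catches up rather than re-copying everything.
  (cat preamble.slonik
   echo "subscribe set (id = 1, provider = 1, receiver = 3, forward = yes);") | slonik
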
Once you have reacted to the loss by reconfiguring the surviving nodes
to satisfy your needs, you may want to recreate the mangled node.  See
the Slony-I Administrative Guide for more details on how to do that.
It is not overly profound; you need to drop out the mangled node, and
recreate it anew, which is not all that different from setting up
another subscriber.
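
Dropping out the mangled node (say node 2, from the illustrative setup
above) might be sketched as follows; rebuilding it afterwards is then
much the same as the original node-2 setup and subscription:

  # Remove the failed node (node 2) from the cluster's configuration.
  (cat preamble.slonik
   echo "drop node (id = 2, event node = 1);") | slonik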