
*** $Id: offlinegc6x.txt,v 1.1 2007-09-14 22:41:16 normg Exp $ ***

Offline Garbage Collection User Action Libraries For GemStone 6.x

September 11, 2007

For customers running large repositories on GemStone 6.x, global 
garbage collection can be a daunting and tedious task.  The main
problem is that the markForCollection (MFC) or
findDisconnectedObjects (FDC) operation can, in some cases, take more
than 12 days to complete.

In GemStone/64, there exists a fast offline GC process.  In short,
an optimized FDC is run on a copy of the production database on 
a dedicated system.  The fast FDC consumes most of the hardware
resources in order to finish the operation as quickly as possible.
Using this technique, the fast FDC operation has completed in under
24 hours on repositories approaching 2 billion objects in size.

Small changes were made to the offline GC and fast FDC code to 
allow GemStone 6.x production customers to take advantage of the
fast FDC capability of GemStone/64.  This is done by performing an
abbreviated conversion of a copy of the 6.x production database
to GemStone/64 format.  The fast FDC can then be run on the
converted copy, producing a binary file of dead objects written
in 6.x format.  That file can then be used to run a markGcCandidates
(MGC) GC sweep on the production 6.x database.  The MGC runs
relatively quickly even on very large repositories and typically takes
no more than 6 hours to complete.

Starting in GemStone/64 version 2.2.3, scripts and tools are provided
to perform the above operations.  User action code written in C++
is provided in $GEMSTONE/examples/offlinegc6x/offlinegc6x.cc .
These user actions are intended to be run on the 6.x production
system, not the GemStone/64 system.  They must be compiled using
a C++ compiler with $GEMSTONE set to a 6.x GemStone product tree.
Example 'make' files are provided for all supported platforms except
for MS Windows.  GemStone Technical Support can also provide
pre-compiled user action shared libraries upon request.
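
The build requirement above (compile with $GEMSTONE pointing at a 6.x
tree) can be sketched as a small shell helper.  This is only an
illustration: the function name build_ualib and its three arguments are
hypothetical, and the makefile name must be whichever example 'make'
file matches your platform.  The build runs in a subshell so that the
GEMSTONE override does not leak into the calling shell.

```shell
# Hypothetical helper: build_ualib <6.x product tree> <source dir> <makefile>
# Runs make with GEMSTONE temporarily set to the 6.x product tree.
build_ualib() {
    gs6x_tree="$1"
    src_dir="$2"
    makefile="$3"
    if [ ! -d "$gs6x_tree" ] || [ ! -d "$src_dir" ]; then
        echo "usage: build_ualib <6.x tree> <source dir> <makefile>" >&2
        return 1
    fi
    (
        # Subshell: the caller's GEMSTONE value is left unchanged.
        GEMSTONE="$gs6x_tree"
        export GEMSTONE
        cd "$src_dir" && make -f "$makefile"
    )
}
```

The helper returns the exit status of make, so a failed compile can be
detected by the caller.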


Here are the steps for running a fast offline GC on a 6.x database
using the GemStone/64 fast FDC technique described above:

  1) Set the environment variable GEMSTONE_22 to reference the
     GemStone/64 2.2.3 or later product tree.

 2a) File in $GEMSTONE_22/upgrade/preConvFor61Gc.topaz into the 6.x 
     production database.  This must be done BEFORE making the copy for
     the conversion and offline GC.  This step needs to be performed only
     once on the 6.x production system.

 2b) Disable epoch garbage collection on the production database.

  3) Shut down the 6.x production database and make a copy of the extents.

  4) Start the copy of 6.x production using startstone.

  5) Run $GEMSTONE_22/upgrade/convprepForGc6x on the copy.  This will
     shut down the database when finished.

  6) Start a new 2.x stone with a large shared page cache and pregrow 
     the extents.

  7) Run $GEMSTONE_22/bin/conv61To2xForFastGc to perform the low-level 
     conversion.  The new 2.x database will be shut down automatically 
     when this step is completed.

  8) Restart the new 2.x stone using startstone.

  9) Run the $GEMSTONE_22/bin/startotcachewarmers script.  This will 
     start object table cache warmer gems to load the object table into 
     the shared page cache.

 10) Run the $GEMSTONE_22/bin/startfastfdcto6xfile script to start 
     the fast offline FDC.  This will produce a binary file containing
     a list of dead objects found by the FDC.  The file will be in 
     GemStone 6.x oop format.
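
Before starting the steps above, it may help to confirm that the
scripts they name are actually present in the GemStone/64 product tree.
The following is a minimal sketch, not a supplied tool: the function
name check_offline_gc_tools and its messages are hypothetical, and the
relative paths are simply the ones named in steps 2a, 5, 7, 9 and 10.

```shell
# Hypothetical preflight check: check_offline_gc_tools <GemStone/64 tree>
# Prints one "missing:" line per absent file.
# Returns 0 if every file is present, 1 otherwise.
check_offline_gc_tools() {
    tree="$1"
    if [ -z "$tree" ]; then
        echo "GEMSTONE_22 is not set" >&2
        return 1
    fi
    missing=0
    # Paths taken from the steps above, relative to the product tree.
    for rel in \
        upgrade/preConvFor61Gc.topaz \
        upgrade/convprepForGc6x \
        bin/conv61To2xForFastGc \
        bin/startotcachewarmers \
        bin/startfastfdcto6xfile
    do
        if [ ! -f "$tree/$rel" ]; then
            echo "missing: $tree/$rel"
            missing=1
        fi
    done
    return $missing
}
```

A typical call would be: check_offline_gc_tools "$GEMSTONE_22" || exit 1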

Running the MGC on production:

  1) Build the offline GC user action library in
     $GEMSTONE_22/examples/offlinegc6x.  Only this version of the 
     offline GC user action will work correctly with the file produced
     by the fast FDC.  Remember to set $GEMSTONE to reference your 6.x 
     GemStone tree when compiling and linking.  If you cannot build
     the user action library, contact GemStone support to obtain it.

  2) Copy the user action library built in step 1 into $GEMSTONE/ualib
     where $GEMSTONE references your 6.x GemStone product tree.

  3) Run the script $GEMSTONE_22/examples/offlinegc6x/run6x-mgc.
     This script requires three arguments, in this order:

         run6x-mgc <stoneName> <SystemUser password> <FDC filename>
 
     The 1st argument is the production database name.
     The 2nd argument is the SystemUser password.
     The 3rd argument is the file generated by the fast FDC process.
     Also, the GEMSTONE environment variable must be set to reference
     your 6.x product tree.
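
The argument and environment requirements for run6x-mgc can be checked
up front with a thin wrapper.  This is a sketch under stated
assumptions: the function name run_mgc_checked and its return codes are
hypothetical, and it does nothing beyond validating its inputs and then
calling the script with the three positional arguments described above.

```shell
# Hypothetical wrapper: run_mgc_checked <stoneName> <password> <FDC file>
# Validates inputs, then runs run6x-mgc from the GemStone/64 tree.
run_mgc_checked() {
    if [ "$#" -ne 3 ]; then
        echo "usage: run_mgc_checked stoneName password fdcFile" >&2
        return 2
    fi
    # GEMSTONE must reference the 6.x product tree (existence check only).
    if [ -z "$GEMSTONE" ] || [ ! -d "$GEMSTONE" ]; then
        echo "GEMSTONE must reference the 6.x product tree" >&2
        return 3
    fi
    # The file produced by the fast FDC must exist and be readable.
    if [ ! -f "$3" ] || [ ! -r "$3" ]; then
        echo "FDC file not found or not readable: $3" >&2
        return 4
    fi
    "$GEMSTONE_22/examples/offlinegc6x/run6x-mgc" "$1" "$2" "$3"
}
```

Note that the password is passed on the command line, so it may be
visible in process listings on the host while the script runs.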
