"The ability to quote is a serviceable substitute for wit." Somerset Maugham
Susan Bantin, Infobright, 2009-07-05
Infobright, Inc. 47 Colborne Street, Suite 403 Toronto, Ontario M5E 1P8 Canada www.infobright.com www.infobright.org
How to Backup and Restore
Table of Contents
Synopsis
Introduction
Methodology
Example
Summary
The Intelligent Database for Business Intelligence
Synopsis
A regularly scheduled backup procedure is essential to ensuring system reliability. This document describes how Infobright stores database and Knowledge Grid files and how to back up and restore your data warehouse successfully.

Infobright uses the native file system to store all data files, so any backup tool can be used to back up the data. Data files are stored in a compressed format, so backup times are considerably faster than with other data warehouse solutions.

During regular data warehouse access (read and write), the tables are locked, so take care to ensure that no loads or queries are running during the backup window.
Introduction
Backing up Infobright is straightforward: make a copy of the compressed data warehouse files and the associated Knowledge Grid files. Infobright uses the native file system and therefore does not need a dedicated agent to perform backups. For a full backup, simply back up the directory containing the database.

For an incremental backup, it is sufficient to back up the newly created files and any modified files; new data is added in 2 GB files. Ensure that the Knowledge Grid files are always backed up, because the metadata can change during query operations even when no new data is added. This procedure is also supported by any backup tool.

When restoring your data warehouse, we recommend a full restore of all database files, including the Knowledge Grid files.
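The selection rule for an incremental backup described above (new or modified data files, plus every Knowledge Grid file regardless of age) can be sketched in Python. This is only an illustration under stated assumptions: the function name, its parameters, and the use of file modification times as the change indicator are hypothetical and are not part of Infobright.

```python
from pathlib import Path


def files_to_back_up(data_dir, kn_dir, last_backup_time):
    """Return the sorted list of files an incremental backup should copy.

    All three parameters are hypothetical inputs: the Infobright data
    directory, the Knowledge Grid folder, and the time of the previous
    backup in seconds since the epoch.
    """
    data_dir, kn_dir = Path(data_dir), Path(kn_dir)
    selected = set()
    # New or modified data files: anything changed since the last backup.
    for path in data_dir.rglob("*"):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            selected.add(path)
    # Knowledge Grid files are always included, changed or not.
    for path in kn_dir.rglob("*"):
        if path.is_file():
            selected.add(path)
    return sorted(selected)
```

The returned list can be handed to whichever backup tool is in use; the sketch does not copy anything itself.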
Methodology
Backup Procedure

To back up the Infobright databases, copy the entire directory containing the Infobright databases, including the Knowledge Grid. This is usually the data subdirectory in your Infobright installation directory.

The safest methods of ensuring a complete backup of the database are:
1. Shut down the database before making a copy, or
2. Lock the tables and take a snapshot.

You can take advantage of incremental backups, since only some of the database files are updated when new data is imported. Be sure to do a full backup occasionally.

Important: Regarding the Knowledge Grid, some files in the KNFolder are updated when queries (using JOIN) are run, so be sure to back up the KNFolder on a regular basis, even when making incremental backups.

Restore Procedure

To restore the Infobright databases from a backup copy, do the following:
1. Replace the entire data directory with the backup copy. This is usually the data subdirectory in your Infobright installation directory.
2. Replace the KNFolder with the backup copy (if the KNFolder is not inside the data directory).

Important: Do not manually modify database files or move them from one active database to another. This may lead to data corruption and unpredictable results.

Archived Instances

If you want to set up a fully archived instance of your Infobright data warehouse, you must install Infobright in another location as a separate instance that uses a different port and different file directories. The original data must be transferred by exporting it using SELECT INTO OUTFILE in binary or text format and then loading the data into the second instance using LOAD DATA INFILE. Currently, there is no method of transferring the data without a decompression and load.

Can I restore a single database table?

When restoring tables, it is important to ensure that the Knowledge Grid is up to date; a full restore of the entire instance is therefore recommended, rather than just a single table. For this reason, when doing either full or incremental backups, the Knowledge Grid backup should always be updated as well.

Can I rename the data file folder?

Infobright tables are globally numbered in order to identify Knowledge Node files. Therefore, while you can rename the entire database by renaming the folder on disk, you should not copy a database folder from one active instance to another, or within the same active instance (e.g. in an effort to make a backup). Copying database folders within one instance may result in different tables with the same globally assigned number, which may lead to errors in query results or an unstable environment. A backup of the whole database folder, including the Knowledge Grid, is recommended.

Note: the brighthouse.seq file stores the largest table number used, and is modified when CREATE TABLE is used within the Infobright storage engine. Editing it may allow a database to be copied safely from one active instance to another, but this is not recommended.

Do the Knowledge Grid files need to be backed up separately?

Even if there are no data changes, the pack-to-pack nodes within the Knowledge Grid are constantly updated to reflect relationships found during query operations. A Knowledge Grid backup would provide the ability to restore pack-to-pack nodes and is recommended.

Within one instance of Infobright, all Knowledge Grid files are stored together for all databases. It is not currently possible to distinguish specific Knowledge Grid files associated with a specific database. All Knowledge Grid files within the KNFolder should be backed up every time.
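As a rough illustration of the full backup described in this section, the following Python sketch copies the data directory and, only when the KNFolder lives outside it, the KNFolder as well. The function name, the three path arguments, and the `data` and `knfolder` destination names are hypothetical. The sketch assumes the caller has already shut down the database or locked the tables and taken a snapshot.

```python
import shutil
from pathlib import Path


def full_backup(data_dir, kn_dir, backup_root):
    """Copy the data directory, plus the KNFolder if it lives elsewhere.

    All arguments are hypothetical paths. The database is assumed to be
    shut down, or the tables locked and a snapshot taken, beforehand.
    """
    data_dir = Path(data_dir).resolve()
    kn_dir = Path(kn_dir).resolve()
    backup_root = Path(backup_root)
    copied = [shutil.copytree(data_dir, backup_root / "data")]
    # A KNFolder inside the data directory has already been copied above.
    if data_dir not in kn_dir.parents:
        copied.append(shutil.copytree(kn_dir, backup_root / "knfolder"))
    return [Path(p) for p in copied]
```

`shutil.copytree` refuses to overwrite an existing destination, so each run needs a fresh `backup_root`; that is a deliberate choice here, to avoid mixing files from two different backups.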
How does Infobright manage database locks?
Our locking model follows the standard MySQL model for managing transactions. MySQL has commands to explicitly lock and unlock tables. It also locks a table during an update and automatically unlocks it after a commit (the commit may be automatic, depending on the value of the autocommit variable). If there is a lock against a table, the next operation queues up according to the following priorities:
1. When a WRITE lock is issued: if there are no locks currently on the table, the WRITE lock is granted without queuing. Otherwise, the lock request is put into the WRITE lock queue.
2. When a READ lock is issued: if the table has no WRITE locks on it, the READ lock is granted without queuing. Otherwise, the lock request is put into the READ lock queue.

Whenever a lock is released, threads in the WRITE lock queue are given priority over those in the READ lock queue. Therefore, a thread requesting a WRITE lock will get the lock with minimal delay.

In the event of frequent table access, for example loads scheduled every 5 minutes, the time needed for a backup may exceed the time available. In this event it is critical that a snapshot of the file system be taken instead of simply copying the data files.

Note: Not every type of file system supports snapshots. A few that do are Sun ZFS and OES Linux (which uses SUSE Linux Enterprise Server). ZFS is available for Linux as well. See also Zmanda: "(it can use Snapshots for instant full backups if LVM, ZFS, NetApp or VxFS is being used)."
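The queueing rules listed in this section can be made concrete with a small toy model in Python. It is a sketch of the rules as stated here, not MySQL's or Infobright's actual implementation; the class and method names are hypothetical.

```python
from collections import deque


class TableLock:
    """Toy model of the lock queueing rules (not MySQL's real code)."""

    def __init__(self):
        self.readers = set()        # threads holding a READ lock
        self.writer = None          # thread holding the WRITE lock
        self.write_queue = deque()  # waiting WRITE requests
        self.read_queue = deque()   # waiting READ requests

    def request(self, thread, mode):
        """Return True if granted at once, False if the request is queued."""
        if mode == "WRITE":
            # Rule 1: a WRITE lock needs a table with no locks at all.
            if self.writer is None and not self.readers:
                self.writer = thread
                return True
            self.write_queue.append(thread)
            return False
        # Rule 2: a READ lock only needs the table to have no WRITE lock.
        if self.writer is None:
            self.readers.add(thread)
            return True
        self.read_queue.append(thread)
        return False

    def release(self, thread):
        """Release a lock, then serve the WRITE queue before the READ queue."""
        if self.writer == thread:
            self.writer = None
        self.readers.discard(thread)
        if self.writer is not None or self.readers:
            return  # table is still locked; nothing can be granted yet
        if self.write_queue:
            self.writer = self.write_queue.popleft()
        else:
            while self.read_queue:
                self.readers.add(self.read_queue.popleft())
```

For example, if a writer holds the lock while one reader and a second writer are waiting, releasing the lock hands it to the second writer first; the reader is served only after that writer releases.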
Example
The fastest way to back up the database is to back up the data directory using typical tools available within your file system. The location of the KNFolder is available from the brighthouse.ini file. By default it is a directory named BH_RSI_Repository in your datadir. If it has not been changed specifically, the setting will read:
KNFolder = BH_RSI_Repository
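A backup script can look up this setting instead of hard-coding the path. The Python sketch below is a minimal illustration that assumes brighthouse.ini consists of simple `key = value` lines with `#` comments, which the setting shown above suggests but this document does not confirm. The function name and its arguments are hypothetical.

```python
from pathlib import Path


def kn_folder(ini_text, datadir):
    """Resolve the KNFolder setting from the text of brighthouse.ini.

    Assumes plain `key = value` lines and '#' comments (an assumption
    about the file format). A relative value is resolved against datadir;
    if the key is absent, the default BH_RSI_Repository is used.
    """
    value = "BH_RSI_Repository"
    for line in ini_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "KNFolder":
            value = val.strip()
    path = Path(value)
    return path if path.is_absolute() else Path(datadir) / path
```

Resolving a relative value against the datadir matches the default described above, where BH_RSI_Repository is a directory inside the datadir.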
To archive the database to another instance of Infobright, it is necessary to export and load the data as follows:

To export a table, use the select into outfile command. For a quicker recovery, use the binary format:
set @bh_dataformat = 'binary';
select * from mytable into outfile '/tmp/mytable.bu' fields terminated by '\t';
Unfortunately, the binary data format is not available in ICE (Infobright Community Edition), so you will need to export and import via a text file.
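The export shown above covers only half of the round trip; the data must then be loaded into the second instance with LOAD DATA INFILE. The Python sketch below builds both statements as strings. The helper names are hypothetical, the table name and file path are assumed to come from a trusted source (no quoting or escaping is done), and the load statement covers the text-file route only, since this document does not show the binary load syntax.

```python
def export_statements(table, outfile, binary=False):
    """Build the SQL that exports one table to a file.

    Mirrors the export example above; table and outfile are trusted,
    hypothetical inputs and are not escaped.
    """
    stmts = []
    if binary:
        # Binary export, as in the example above (not available in ICE).
        stmts.append("SET @bh_dataformat = 'binary';")
    stmts.append(
        f"SELECT * FROM {table} INTO OUTFILE '{outfile}' "
        "FIELDS TERMINATED BY '\\t';"
    )
    return stmts


def load_statement(table, infile):
    """Build the SQL that loads a tab-separated text export into a table."""
    return (
        f"LOAD DATA INFILE '{infile}' INTO TABLE {table} "
        "FIELDS TERMINATED BY '\\t';"
    )
```

The export statements are run against the original instance and the load statement against the second instance, using the port configured for each.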
Summary
Backing up and restoring an instance of Infobright is as straightforward as taking a snapshot or copy of the data files, including all Knowledge Grid files, and then restoring all files to the original instance. To create an archived instance, it is necessary to export and reload the data into a second installation of Infobright.

In all cases, whether doing a full or incremental backup, it is necessary to back up the KNFolder to ensure consistency between the Knowledge Grid and the database files at all times.

Because Infobright uses the native file system to store all data files, any backup tool can be used to back up the data.