Hot File

PostgreSQL Synchronization Tool

View: 664    Dowload: 0   Comment: 0   Post by: hanhga  
Author: none   Category: PostgreSQL   Fields: Education - Schools

0 point/1 review File has been tested

Download   postgresql-synchronization.zip (35.72 KB)

You need to Sign In to download the file postgresql-synchronization.zip
If you do not have an account then Sign up for free here

Download error   Report copyright violation

This article introduces a method of synchronizing two PostgreSQL databases. Although, this seems to be an easy task, no product (slony, londiste, ...) really satisfied the needs within the maps.bremen.de project. Either they have special prerequisites that didn't apply for our problem, or they didn't support synchronizing of large objects. Large objects are used to store tiles of a street/aerial map within PostgreSQL. My GIS-server queries the database and gets the tiles out. By using this construction, we are getting a flexible infrastructure for updating and maintaining different versions of the maps. Everything was working fine until the service needs to be spread over three servers. How can we easily synchronize the databases? I really found no working solution that was clean and easy to use.

Introduction

psync

As I'm using rsync on a regular basis, I thought why isn't there a tool for databases that simply does nothing more than synchronizing two databases on demand - no sophisticated triggers or additional changes of the database. psync is a simple application that synchronizes two PostgreSQL databases. The concept can be adapted or even encapsulated so that other databases can also be used.

The idea of psync is to use two database connections and update the tables and their content like this:

  1. fetch all tables of the source and destination databases
  2. for each table in the source, do:
    1. create table if not exist on destination
    2. sync contents of table
  3. for each table in destination, do:
    1. delete table if not exists on the source

Although the algorithm looks simple, there are many problems to overcome.

  1. Duplicating tables
  2. Handling of special columns, e.g., oid for large objects
  3. Constraints on tables
  4. Special structures of extensions

Duplicating Tables

In order to duplicate a table, we need to find out which columns exist and which primary keys we have and so on. With this information, we can assemble a CREATE TABLE command that creates the table. In PostgreSQL, the information is stored in a table named information_schema.columns: column_name, is_nullable, and data_type. Primary keys are a little bit more complicated as we need two tables information_schema.table_constraints and information_schema.key_column_usage. See PSync.PrimaryKeys for the whole SQL statement.

Large Object Identifiers

The current implementation of psync focuses on the oid data type for handling large objects. Large objects store binary data and are identified using an integer. Tables with an oid column store only references to a large object. Most replication solutions can't handle large objects.

Certainly, we can't just copy the identifiers from the source to the destination, we need to compare the data itself. Always copying the whole object is not a solution as they may be big and this will waste network bandwidth and time when they are equal. The solution used by psync is hash values (currently the MD5 algorithm is used). In order to keep the traffic low, these hashes are calculated by the database using a custom function without transferring any data. psync compares these values and copies the object on demand.

The PostgreSQL function for calculating an MD5 hash value is (original source):

CREATE OR REPLACE FUNCTION md5(id oid)
RETURNS text
as $$
DECLARE
 fd        integer;
 size      integer;
 hashval   text;
 INV_READ  constant integer := 262144; -- 0x40000 from libpq-fs.h
 SEEK_SET  constant integer := 0;
 SEEK_END  constant integer := 2;
BEGIN
 IF id is null THEN
   RETURN NULL;
 END IF;
 fd   := lo_open(id, INV_READ);
 size := lo_lseek(fd, 0, 2);
 PERFORM lo_lseek(fd, 0, 0);
 hashval := md5(loread(fd, size));
 PERFORM lo_close(fd);
 RETURN hashval;
END;
$$
language plpgsql stable strict;
comment on FUNCTION md5(id oid) is 'Calculates the md5 sum of a large object.';

The process gets even more complex as there maybe more than one oid column in a table. The following code segment shows the process of constructing the SQL query for retrieving the hash values:

bool first;
string _select = "select";
string _from = " from";
string _where = " where";

// Compose the select block
first = true;
foreach(int oid in oids) {
  if (!first)
    _select += ",";
  // Get the md5 hash value for the oid-column
  _select += string.Format(" md5({0})", result.ColumnName(oid));
  first = false;
}

// Data from the table we are currently working on
_from += table;

// The row identified by the primary keys of the table
first = true;
foreach(string key in primaryKeys) {
  if (!first)
    _where += " and";
  _where += string.Format(" {0}='{1}'", key, result[row, result.ColumnIndex(key)]);
  first = false;
}
command = _select + _from + _where;

After building the SQL command, the results are compared:

Result r1, r2;
try {
  r1 = dst.Exec(command);
  if (r1.Rows > 0) {
    r2 = src.Exec(command);

    // Compare large objects
    for(int c=0; c<r1.Columns; c++) {
      if (r1[0, c] != r2[0, c]) {
        // md5 hashes are different
        int id = CopyOid(result.GetInt(row, oids[c]));
        command = string.Format("update {0} set {1}='{2}'{3}", 
                                table, result.ColumnName(oids[c]), id, _where);
        dst.Exec(command).Dispose();
      }
    }

    r1.Dispose();
    r2.Dispose();
  }
  else {
    // Row missing on destination
    r1.Dispose();
    command = string.Format("insert into {0} values(", table);
    first = true;
    for(int col=0; col<result.Columns; col++) {
      if (!first)
        command += ",";
      if (oids.Contains(col)) {
        int id = CopyOid(result.GetInt(row, oid));
        command += string.Format(" '{0}'", id);
      }
      else 
        command += string.Format(" '{0}'", result[row, col]);
      first = false;
    }
    command += ")";
    dst.Exec(command).Dispose();
  }
} catch(PostgreSQLException e) {
  // Large object id exist on destination but does not reference an object
  foreach(int oid in oids) {
    int id = CopyOid(result.GetInt(row, oid));
    command = string.Format("update {0} set {1}='{2}'{3}", 
                            table, result.ColumnName(oid), id, _where);
    dst.Exec(command).Dispose();
  }
}

Using the Code

The code is implemented in two parts. The npq namespace implements a low-level wrapper around libpq (class PG) together with a high-level interface for easier use.

Example use of the library:

PostgreSQL db = new PostgreSQL("hostaddr='127.0.0.1' port='5432' requiressl='1' " + 
                               "user='XXX' password='XXX' dbname='XXX'");
Console.WriteLine("Server {0} Protocol {1}", db.Version, db.Protocol);

Result result = db.Exec("select * from some_table");
Console.WriteLine("Result {0}", result.Valid);
Console.Write("{0,5} ", "Nr");
for(int f=0; f<result.Columns; f++)
  Console.Write("| {0,15}:{1,8}", result.ColumnName(f), result.ColumnType(f));
Console.WriteLine();
for(int n=0; n<result.Rows; n++) {
  Console.Write("{0,5} ", n);
  for(int f=0; f<result.Columns; f++)
    Console.Write("| {0,24}", result[n, f]);
}
// You'll need to explicitly call Dispose to free associated ressources
result.Dispose();
db.Dispose();

Handling of large objects is done like this:

LargeObject lo = new LargeObject(db);
lo.Open(007);
byte[] tmp = new byte[1024];
int s = 0;
FileStream fs = new FileStream(result[n, 0]+".jpg", FileMode.Create);
while ((s = lo.Read(tmp, 1024)) > 0)
  fs.Write(tmp, 0, s);
fs.Close();
lo.Close();

The code was developed under Linux using Mono and MonoDevelop. It should aslo work under Windows but was not tested. Linux users must add the following configuration (npq.dll.config) to correctly map the library.

<configuration>
  <dllmap dll="libpq.dll" target="libpq.so" os="!windows" />
</configuration>

PostgreSQL Synchronization Tool

PostgreSQL Synchronization Tool Posted on 03-04-2014  This article introduces a method of synchronizing two PostgreSQL databases. Although, this seems to be an easy task, no product (slony, londiste, ...) really satisfied the needs within the maps.bremen.de project. Either they have special prerequisites that didn't apply for our problem, or they didn't support synchronizing of large objects. Large objects are used to store tiles of a street/aerial map within PostgreSQL. My GIS-server queries the database and gets the tiles out. By using this construction, we are getting a flexible infrastructure for updating and maintaining different versions of the maps. Everything was working fine until the service needs to be spread over three servers. How can we easily synchronize the databases? I really found no working solution that was clean and easy to use. 5/10 664

Comment:

To comment you must be logged in members.

Files with category

  • Accessing: MySQL Database using MySQL C API

    View: 2447    Download: 2   Comment: 0   Author: none  

    Accessing: MySQL Database using MySQL C API

    Category: PostgreSQL
    Fields: Other

    0.75/6 review
    This article was written for beginners, developers that do not know how to access MySql database using MySQL C API and would like to have a small introduction.

  • Full Database Abstraction Layer Generator

    View: 2533    Download: 0   Comment: 0   Author: none  

    Full Database Abstraction Layer Generator

    Category: PostgreSQL
    Fields: Other

    0/23 review
    This article presents yet another Data Abstraction Layer generator (popular topic). The DAL structure is directly inspired from the DotNetNuke, DAL document and the SQL generation is done using SQLpp. The generator takes care of pretty much...

  • Code .NET 2.0; Build Java; Run Linux

    View: 2043    Download: 0   Comment: 0   Author: none  

    Code .NET 2.0; Build Java; Run Linux

    Category: PostgreSQL
    Fields: Other

    1.5/3 review
    This article is in the Product Showcase section for our sponsors at CodeProject. These reviews are intended to provide you with information on products and services that we consider useful and of value to developers. This is a showcase review for...

  • Postgres Database Backup/Restore From C#

    View: 2951    Download: 2   Comment: 0   Author: none  

    Postgres Database Backup/Restore From C#

    Category: PostgreSQL
    Fields: Other

    0/10 review
    This article provides Postgres database backup/restore functionality. By using this application we can check whether a Postgres database is installed or not in a system. If a Postgres database is installed then the Postgres databases list is...

  • Installing Redmine on Windows in production

    View: 5443    Download: 0   Comment: 0   Author: none  

    Installing Redmine on Windows in production

    Category: PostgreSQL
    Fields: Other

    0/10 review
    This is a very simple step by step guide to install Redmine on Windows server in production. This guide is applicable to IIS 7+ based servers, which include Windows Server 2008, 2008 R2, Windows Vista and Windows 7. Latter two should not be used for...

  • A SQL Management Console for MSSQL 2000 & 2005, MySQL 5.0,...

    View: 3273    Download: 2   Comment: 0   Author: none  

    A SQL Management Console for MSSQL 2000 & 2005, MySQL 5.0,...

    Category: PostgreSQL
    Fields: Other

    0.5625/8 review
    Since MSSQL is not holding the lion share of the market yet, integration/interaction with other RDBMS like PL/Oracle, MySQL, FireBird, DB2, is inevitable. With the class System.Data.Common in ADO .NET 2.0, different data providers can write the...

  • Creating an Interactive Map in ASP.NET 2.0 Using SharpMap

    View: 4377    Download: 5   Comment: 0   Author: none  

    Creating an Interactive Map in ASP.NET 2.0 Using SharpMap

    Category: PostgreSQL
    Fields: Other

    0/2 review
    Showing maps on the web is becoming more and more common. The most well-known ones are Google Maps and Microsoft's Virtual Earth. Creating your own websites with interactive maps require you to get hold on the map-data and convert this data to...

  • DevShot - Database Snapshots for Developers

    View: 956    Download: 0   Comment: 0   Author: none  

    DevShot - Database Snapshots for Developers

    Category: PostgreSQL
    Fields: Other

    0/2 review
    A few weeks back, I got an idea of writing a simple tool for switching database states between coding cycles. I thought this would be a good tool for anyone who is working on database related projects and have to switch between database states since...

 
Newsletter Email

File suggestion for you

File top downloads

logo codetitle
Codetitle.com - library source code to share, download the file to the community
Copyright © 2015. All rights reserved. codetitle.com Develope by Vinagon .Ltd