I'm glad to announce pgModeler 0.9.3-beta!
Partial diffs, improved reverse engineering and much more!

Hi there! Despite the worrying moment we're living in the middle of the COVID pandemic I always got thrilled with any new pgModeler release I'm about the launch! Also, it makes me really happy all the positive feedback I receive from users from all around the world from time to time. My project is being praised and recognized as a powerful tool that came to cover a gap in the PostgreSQL ecosystem. This is the kind of news any developer loves to receive and that is the fuel that moves almost all open source project in the world. This is no different with pgModeler because I take seriously any suggestions and, mainly, criticisms and I make my best to improve the tool in order to fit the users needs. So it's because this kind of engagement that I'm really admirer of the open source community!

This time I'm specially glad to announce that pgModeler counts with a long requested and awaited functionality: the partial model/database comparison or, simply, partial diff. Of course this is the first preview of the feature so it still has some bugs but the results I got with my test routines were pretty satisfying. I know that there's always room for improvements and I hope I can keep receiving the same positive feedback and reports so we can make pgModeler even better together. There was other interesting enhancements in 0.9.3-beta and I'll detail some of them in the next sections.

Database model changelog

Before I start to detail the partial diff I need to explain this new addition to the database model. In order to make the partial diff as user-friendly as possible I needed to create a minimalist changelog in the database model. As the name suggests, it registers any create, update or delete operation performed over any object in the model. Basically, the internal changelog of a database model will look like this:

As can be seen the changelog is pretty simple, it contains a set of entries each one composed by the date of the action (in local date/time), the object's signature, the type of the object and the action taken. This simple list is the basis of the partial diff which I'll explain in the further. For now, it is enough to know that the model's changelog has, by default, a limited lifetime. It lasts while the database model isn't closed. But the user is capable of force the writing of the changelog in the database model file which can be useful to store all the modification history and use it in the partial diff anytime.

Despite the usefulness storing the changelog in the file there's an important side effect: the quick bloating of the XML code that represents the database model. So it's up to the user to decide if it's worth to store the changelog in the model file knowing that it can grow quickly or if using a not persisted changelog is okay for the model being worked. What I can suggest is: if you are working on the model for the first time and the database it represents does not exist or you don't intend to develop a database incrementally (add objects to the model > generate diffs > repeat) you don't need to save the changelog to the model file otherwise if you constantly work by adding objects to a database model and generating diffs it's useful to keep track of the model's changes by persisting them in the model file.

In order to have some control over the model's changelog a simple widget was created and can be used to tweak the behavior of the changelog and check some useful information like the last change made to the model and how many entries the changelog has.

Partial diff

This feature consists in taking an input database or model, apply user-provided filters and diminish the amount of objects to be compared. This brings us a significant gain of speed in the whole comparison process because only a subset of the database/model will be handled. The partial diff has two ways of filtering objects: 1) through the user-provided filters like the ones introduced in 0.9.3-alpha1; 2) by using the database model changelog to filter automatically the modified objects.

In order to use the partial diff through manual filtering there's the need to specify a set of object filters composed by the object type and a search pattern (wildcard string or regular expression). Once all the filters are set it's time to retrieve the objects matching them which is done by clicking Apply in the filter widget. All matching objects will be listed in the Input grid at the right of the Partial diff tab. Now, just click Generate to start the partial diff process.

Now, in order to use the filtering based on the model's changelog one need to check the option By modification date and the filter widget will display the fields as shown in the image below. After that, the user just need to specify a date/time interval which will be matched against the changelog and retrieve the modified objects by clicking Apply. Internally, pgModeler generates, based on the objects found, filters like the ones in the above image using objects signatures as the filters patterns. These filters are then used in the whole partial diff process so the correct result can be produced.

Despite the huge improvement on the diff operation with the introduction of the partial comparison the process still has its deficiencies. For instance, the diff isn't able to detect objects moved between schemas, let's say, schema_a.table to schema_b.table, in this case the diff will always understand that schema_a.table was dropped and schema_b.table was created. I'm still studying the best way to implement this lacking feature in the diff, maybe the model's changelog can be a solution but is still too soon to say but it's just a matter of time so I came up with a solution. Lastly, the video below shows the partial diff based on the model's changelog in action.



Command line interface improvements

The pgModeler's command line interface received some improvements in this release. The first one is the ability to check if the user is mixing options from different operations aborting the execution in positive cases. For instance, in previous versions it was possible to specify the operation --diff and use the option --zoom which is related to exporting operation. In that case only the options related to the specified operation were used and the unrelated ones were ignored. The only problem in this approach is that mixing options from different operations could introduce noise causing some confusion. Now, the CLI will abort the execution in any attempt of mixing incompatible options. This makes the CLI more solid and reliable.

Another improvement to the CLI was the addition of the partial diff support. The options --partial, --force, --start-date and --end-date are the ones to be used when the user wants to perform a partial comparison between a model and a database or between two databases. For instance, in order to perform a partial diff by using objects modified in the model since June, 1st the following can be used:

pgmodeler-cli --diff --partial --input my_model.dbm --compare-to my_database --conn-alias local-db --start-date 2020-06-01 --apply

The command above is a basic example but there are a plenty of options that you can mix to it in order to get more refined results depending on your needs. Another example of partial diff is if you want to perform it between two databases, in that case, a basic command would be:

pgmodeler-cli --diff --partial --input-db my_dev_db --compare-to my_prod_db --conn-alias local-db --apply --filter-objects "table:schema_a.*:wildcard" --force-children all

The example above will filter all the objects in the schema_a in the databases my_dev_db and my_prod_db generating the proper diff code. Again, you can play around with the other diff options to create improved results.

Finally, the CLI's menu was rearranged in order to give it a better semantics by separating what is a operation switcher and what is the options related to each operation.

pgModeler 0.9.3-beta command line interface.
PostgreSQL Database Modeler Project - pgmodeler.io
Copyright 2006-2020 Raphael Araújo e Silva 

Usage: pgmodeler-cli [OPTIONS]

This CLI tool provides several operations over models and databases without the need to perform them
in pgModeler's graphical interface. All available options are described below.

Operation mode options: 
  -ef, --export-to-file             Export the input model to a sql script file.
  -ep, --export-to-png              Export the input model to a png image.
  -es, --export-to-svg              Export the input model to a svg file.
  -ec, --export-to-dict             Export the input model to a data directory in HTML format.
  -ed, --export-to-dbms             Export the input model directly to a PostgreSQL server.
  -lc, --list-conns                 List available connections in file connections.conf.
  -im, --import-db                  Import a database to an output file.
  -df, --diff                       Compares a model and a database or two databases generating the SQL script to sync the latter in relation to the first.
  -fm, --fix-model                  Try to fix the structure of the input model file in order to make it loadable again.
  -cc, --create-configs             Create the pgModeler's configuration folder and files in the user's local storage.
  -mt, --dbm-mime-type [ACTION]     Handles the file association to .dbm files. The ACTION can be [install | uninstall].
  -h, --help                        Show this help menu.

General options: 
  -if, --input [FILE]               Input model file (.dbm). This is mandatory for export and fix operations.
  -id, --input-db [DBNAME]          Input database name. This is mandatory for import operation.
  -of, --output [FILE]              Output file. This is mandatory for fixing model or exporting to file, png or svg.
  -v, --pgsql-ver                   Force the PostgreSQL version syntax when generating SQL code.
  -s, --silent                      Silent execution. Only critical messages and errors are shown during process.

PNG and SVG export options: 
  -sg, --show-grid                  Draws the grid in the exported image.
  -sl, --show-delimiters            Draws the page delimiters in the exported image.
  -pp, --page-by-page               Each page will be exported in a separated png image. (Only for PNG images)
  -zf, --zoom [FACTOR]              Applies a zoom (in percent) before export to png image. Accepted zoom interval: 5-500 (Only for PNG images)

Data dictionary export options: 
  -sp, --splitted                   The data dictionaries are generated in separated files inside the selected output directory.
  -ni, --no-index                   Avoids the generation of the index that is used to help navigating through the data dictionary.

Connection options: 
  -ca, --conn-alias [ALIAS]         Connection configuration alias to be used.
  -H, --host [HOST]                 PostgreSQL host in which a task will operate.
  -p, --port [PORT]                 PostgreSQL host listening port.
  -u, --user [USER]                 PostgreSQL username.
  -w, --passwd [PASSWORD]           PostgreSQL user password.
  -D, --initial-db [DBNAME]         Connection's initial database.

DBMS export options: 
  -ir, --ignore-duplicates          Ignores errors related to duplicated objects that eventually exist in the server.
  -ic, --ignore-error-codes [CODES] Ignores additional errors by their codes. A comma-separated list of alphanumeric codes should be provided.
  -dd, --drop-database              Drop the database before execute a export process.
  -do, --drop-objects               Runs the DROP commands attached to SQL-enabled objects.
  -sm, --simulate                   Simulates an export process by executing all steps but undoing any modification in the end.
  -tn, --use-tmp-names              Generates temporary names for database, roles and tablespaces when in simulation mode.

Database import options: 
  -ie, --ignore-errors              Ignore all errors and try to create as many as possible objects.
  -is, --import-sys-objs            Import system built-in objects. This option causes the model bloating due to the importing of unneeded objects.
  -ix, --import-ext-objs            Import extension objects. This option causes the model bloating due to the importing of unneeded objects.
  -fo, --filter-objects [FILTER]    Causes the import process to import only those objects matching the filter(s). The FILTER should be in the form type:pattern:mode.
  -om, --only-matching              Causes only objects matching the provided filter(s) to be imported. Those not matching filter(s) are discarded.
  -mn, --match-by-name              Causes the objects matching to be performed over their names instead of their signature ([schema].[name]).
  -fc, --force-children [OBJECTS]   Forces the importing of children objects related to tables/views/foreign tables matched by the filter(s). The OBJECTS is a comma separated list types.
  -d, --debug-mode                  Run import in debug mode printing all queries executed in the server.

Diff options: 
  -ct, --compare-to [DBNAME]        The database used in the comparison. All the SQL code generated is applied to it.
  -pd, --partial                    Toggles the partial diff operation. A set of objects filters should be provided using the import option --filter-objects.
  -ff, --force                      Forces a full diff if the provided filters were not able to retrieve objects for a partial diff operation.
  -st, --start-date                 Matches all database model objects in which modification date starts in the specified date. (Only for partial diff)
  -et, --end-date                   Matches all database model objects in which modification date ends in the specified date. (Only for partial diff)
  -sd, --save                       Save the generated diff code to output file.
  -ad, --apply                      Apply the generated diff code on the database server.
  -np, --no-preview                 Don't preview the generated diff code when applying it to the server.
  -dc, --drop-cluster-objs          Drop cluster level objects like roles and tablespaces.
  -rv, --revoke-perms               Revoke permissions already set on the database. New permissions configured in the input model are still applied.
  -dm, --drop-missing               Drop missing objects. Generates DROP commands for objects that are present in the input model but not in the compared database.
  -fd, --force-drop-cols            Force the drop of missing columns and constraints. Causes only columns and constraints to be dropped, other missing objects aren't removed.
  -rn, --rename-db                  Rename the destination database when the names of the involved databases are different.
  -nd, --no-cascade                 Don't drop or truncate objects in cascade mode.
  -tt, --trunc-type-change          Truncate tables prior to alter columns. Avoids errors related to type casting when the new type of a column isn't compatible to the old one.
  -ns, --no-sequence-reuse          Don't reuse sequences on serial columns. Drop the old sequence assigned to a serial column and creates a new one.
  -nf, --force-recreate-objs        Force the recreating of objects. Instead of an ALTER command a DROP and CREATE commands are used to create a new version of the objects.
  -nu, --only-unmodifiable          Recreate only the unmodifiable objects. These objects are the ones which can't be changed via ALTER command.

Model fix options: 
  -ft, --fix-tries [NUMBER]         Model fix tries. When reaching the maximum count the invalid objects will be discarded.

File association options: 
  -sw, --system-wide                The file association to .dbm files will be applied in a system wide level instead of to the current user.


** The FILTER value in --filter-objects option has the form type:pattern:mode. 
   * The `type' is the type of object to be filtered and accepts the following values (invalid types ignored): 
     > schema, table, view, aggregate, cast, collation, constraint, conversion, domain
     > eventtrigger, extension, foreigndatawrapper, foreignserver, foreigntable, function, index, language, opclass
     > operator, opfamily, policy, role, rule, sequence, tablespace, trigger, usermapping

   * The `pattern' is the text pattern which is matched against the objects names.

   * The `mode' is the way the pattern is matched. This one accepts two values: 
     > `wildcard' causes the pattern to be used as a wildcard string while matching objects names.
     > `regexp' causes the pattern to be treated as a POSIX regular expression while matching objects names.

   * The option --force-children has effect only when used with --only-matching and will avoid discarding children of matched tables.
     Other tables eventually imported which are dependencies of the matched objects will have their children discarded.
     The comma separated list of table children objects accepts the values:
     > constraint, index, policy, rule, trigger
     > Use the special keyword `all' to force all children objects.

   * NOTES: all comparisons during filtering process are case insensitive.
     Using the filtering options may cause the importing of additional objects due to the automatic dependency resolution.

** The diff process allows the usage all options related to the import operation.
   It also accepts the following export operation options: `--ignore-duplicates', `--ignore-error-codes'

** The partial diff operation will always force the options --only-matching and --force-children = all for more reliable results.
   * The options --start-date and --end-date accepts the ISO8601 date/time format: yyyy-MM-dd hh:mm:ss

** When running the diff using two databases (--input-db and --compare-to) there's the option to specify two separated connections/aliases.
   If only one connection is set then it will be used to import the input database as well to retrieve the database used in the comparison.
   A second connection can be specified by appending a 1 to any connection configuration parameter listed above.
   This causes the connection to be associated to --compare-to exclusively.


Miscellaneous

The objects filtering widget introduced in 0.9.3-alpha1 was slightly redesigned by moving the options in form of check boxes to a popup menu in order to make the widget more compact and easier to use. Due to the partial diff introduction, the reverse engineering needed to be patched in such way to make it able to create sequences assigned to columns through their default values when the automatic dependencies resolution is enabled.

As always, some bugs were fixed being the majority related to crashes. But there was some problems affecting the database design like the one that was rendering incorrectly the relationships when the combination of collapsed tables and the "FK to PK" connection mode was in use. Another fix is related to layers renaming in design view which will not cause crashes in the application anymore.

If you like to build pgModeler from source know that now the code is properly ported to Qt 5.15.x and GCC9. Talking about building from the source, do you know that the project now counts with a CI/CD to test builds on all Qt versions from 5.9 to 5.15 on the three main platforms? Thanks to GitHub actions you can see if pgModeler is building on the Qt version you're currently using.

Finally, you can check the complete list of changes of this release in the file CHANGELOG.md. Please, don't forget to leave your feedback in the comments section or at GitHub. Your suggestions/criticisms are pretty important to keep moving this project!

Until next time and take care! ;)

Comments (0) Add a comment

Add new comment

  • 0/1024