java.lang.Object org.afcs.warts.db.DataComparison
A DataComparison instance analyses and stores the differences between two datasets. Datasets can only be compared if the tables from which they were loaded are "equivalent", which means that they have the same columns, even if they are in different schemas.
The first step in comparing datasets is to build a list containing the union of the primary keys in the first and second data sets. We then iterate through these keys. If a row exists in the first dataset with the key but not in the second, then that row is marked as a required insertion. If a row in the second dataset matches but not the first, then that row is marked as a required deletion.
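The key-union pass described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's actual implementation: KeyUnionSketch, RowKind and classify are hypothetical names, primary keys are modelled as plain comparable values, and rows present in both datasets are only labelled IN_BOTH because the column-by-column check is a separate step.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class KeyUnionSketch {
    /** Hypothetical labels; the real class exposes int constants instead. */
    public enum RowKind { INSERTION, DELETION, IN_BOTH }

    /** Classifies every key in the union of the two key sets. */
    public static <K extends Comparable<K>> Map<K, RowKind> classify(
            Set<K> firstKeys, Set<K> secondKeys) {
        // Build the union of the primary keys, sorted in key order.
        Set<K> union = new TreeSet<>(firstKeys);
        union.addAll(secondKeys);

        Map<K, RowKind> result = new LinkedHashMap<>();
        for (K key : union) {
            boolean inFirst = firstKeys.contains(key);
            boolean inSecond = secondKeys.contains(key);
            if (inFirst && !inSecond) {
                result.put(key, RowKind.INSERTION); // only in the first dataset
            } else if (!inFirst && inSecond) {
                result.put(key, RowKind.DELETION);  // only in the second dataset
            } else {
                result.put(key, RowKind.IN_BOTH);   // columns are compared next
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Requires Java 9+ for Set.of.
        System.out.println(classify(Set.of(1, 2, 3), Set.of(2, 3, 4)));
    }
}
```

Running main prints {1=INSERTION, 2=IN_BOTH, 3=IN_BOTH, 4=DELETION}.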
If the row exists in both datasets, then we iterate through all of the columns looking for differences. If one of the values is null and one isn't, then that's an obvious difference. If both values are non-null and are instances of DataHighBitAnalysis, then we compare the strings contained in the analysis. If the 'compare encoding' flag was set to true at initialisation, then we also compare the 'data class' calculated by the analysis. For any other data type, we simply use Object.equals(java.lang.Object) to compare them.
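The per-value rules above might look roughly like the sketch below. The accessors of DataHighBitAnalysis are not shown on this page, so TextAnalysis is a hypothetical stand-in that holds the analysed string and a 'data class' label. The label values used in any example are invented, and differs is an illustrative name.

```java
import java.util.Objects;

public class ValueCompareSketch {
    /** Hypothetical stand-in for DataHighBitAnalysis. */
    public static final class TextAnalysis {
        final String text;      // the string contained in the analysis
        final String dataClass; // assumed label for the calculated 'data class'

        public TextAnalysis(String text, String dataClass) {
            this.text = text;
            this.dataClass = dataClass;
        }
    }

    /** Returns true if the two cell values count as different. */
    public static boolean differs(Object a, Object b, boolean compareEncoding) {
        if (a == null || b == null) {
            // Different when exactly one side is null; equal when both are.
            return a != b;
        }
        if (a instanceof TextAnalysis && b instanceof TextAnalysis) {
            TextAnalysis x = (TextAnalysis) a;
            TextAnalysis y = (TextAnalysis) b;
            if (!x.text.equals(y.text)) {
                return true; // the strings themselves differ
            }
            // Same string: only the data class can differ, and only if asked.
            return compareEncoding && !Objects.equals(x.dataClass, y.dataClass);
        }
        // Any other data type falls back to Object.equals.
        return !a.equals(b);
    }
}
```

With these stand-ins, two analyses holding the same string but different data class labels are reported as different only when compareEncoding is true.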
LICENSE: This code is released to the public domain and may be used for any purpose whatsoever without permission or acknowledgment.
Field Summary | |
static int |
ROW_DELETION
This may be returned from getRowClassification(int) to indicate
that the specified row requires deletion (i.e., was in the second dataset
but not in the first). |
static int |
ROW_INSERTION
This may be returned from getRowClassification(int) to indicate
that the specified row requires insertion (i.e., was in the first dataset
but not in the second). |
static int |
ROW_UNCHANGED
This may be returned from getRowClassification(int) to indicate
that all columns in the specified row have identical values in both
datasets. |
static int |
ROW_UPDATED
This may be returned from getRowClassification(int) to indicate
that the specified row was in both datasets but does not have identical
values in all non-primary key columns (although some differences may just
be encoding differences). |
Constructor Summary | |
DataComparison(DataSet setOne,
DataSet setTwo)
Initialises a comparison of the two data sets. |
|
DataComparison(DataSet setOne,
DataSet setTwo,
boolean compareEncoding)
Initialises a comparison of the two data sets. |
|
DataComparison(DataSet setOne,
DataSet setTwo,
boolean[] compareColumns,
boolean compareEncoding)
Initialises a comparison of the two data sets, optionally excluding comparison of certain columns, and optionally ignoring encoding differences. |
Method Summary | |
boolean |
foundDifferences()
Returns true if differences were found between the datasets provided at initialisation. |
java.util.List |
getAllColumnValues(int rowIndex)
Returns a list containing the values of both the primary key columns and the non-primary key columns for the specified row. |
TableDescription |
getDestTableDescription()
Returns the table description for the second dataset, which also provides access to account information. |
java.util.List |
getNonPrimaryKeyColumnValues(int rowIndex)
Returns an unmodifiable list containing the non-primary key column values for the specified row. |
int |
getNumRows()
Returns the total number of rows in the comparison, which is defined by the union of the primary keys from the first and second datasets. |
int |
getNumRowsDeleted()
Returns the number of rows that require deletion (i.e., were in dataset two and not in dataset one), based on the presence of primary keys. |
int |
getNumRowsInserted()
Returns the number of rows that require insertion (i.e., were in dataset one and not in dataset two), based on the presence of primary keys. |
int |
getNumRowsUnchanged()
Returns the number of rows that were in both datasets (based on the presence of identical primary keys) with unchanged non-primary key column values (considering only the columns that were compared). |
int |
getNumRowsUpdated()
Returns the number of rows that were in both datasets (based on the presence of identical primary keys) with changed non-primary key column values (considering only the columns that were compared), where these changes might be just encoding differences. |
int |
getNumValuesInColumnUpdated(int columnIndex,
boolean justEncodingDifferences)
Returns the number of values in the columnIndex'th non-primary key column that differed between the source and destination data sets. |
java.util.List |
getPrimaryKeyColumnValues(int rowIndex)
Returns an unmodifiable list containing the primary key column values for the specified row. |
int |
getRowClassification(int rowIndex)
Returns the row classification (which will be equal to one of the ROW constants defined in this class) for the specified row. |
TableDescription |
getSourceTableDescription()
Returns the table description for the first dataset, which also provides access to account information. |
TableDescription |
getTableDescription()
Returns the table description for the first dataset, which also provides access to account information. |
java.lang.Object |
getValueAt(int rowIndex,
int columnIndex)
Returns the object stored during the comparison of the data at the specified row and column indices. |
boolean |
rowHasEncodingDifferences(int rowIndex)
Returns true if the specified row contains differences between the two datasets that are just encoding differences. |
void |
rowSynchronized(int rowIndex)
This method is called by DataSynchronizationAction after
synchronization has been performed on the specified row; it
marks the row as unchanged, replacing any CellDifference instances
in the row with the new value. |
java.lang.String |
toString()
Returns a text description of the current instance that can be used for debugging purposes. |
boolean |
wasColumnCompared(int columnIndex)
Returns true if the values in the columnIndex'th non-primary key column of the source and destination data sets were compared during analysis. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
public static final int ROW_INSERTION
This may be returned from getRowClassification(int) to indicate
that the specified row requires insertion (i.e., was in the first dataset
but not in the second).
public static final int ROW_DELETION
This may be returned from getRowClassification(int) to indicate
that the specified row requires deletion (i.e., was in the second dataset
but not in the first).
public static final int ROW_UPDATED
This may be returned from getRowClassification(int) to indicate
that the specified row was in both datasets but does not have identical
values in all non-primary key columns (although some differences may just
be encoding differences).
public static final int ROW_UNCHANGED
This may be returned from getRowClassification(int) to indicate
that all columns in the specified row have identical values in both
datasets.
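As an illustration of how the four constants are meant to be consumed, the sketch below tallies rows by classification. It is self-contained and therefore relies on assumptions: the constant values are placeholders (this page does not document the real ones), ClassifiedRows is a hypothetical minimal view of getNumRows() and getRowClassification(int), and row indices are assumed to start at 0.

```java
public class ClassificationTally {
    // Placeholder values; real code would use the DataComparison.ROW_* constants.
    public static final int ROW_INSERTION = 0;
    public static final int ROW_DELETION = 1;
    public static final int ROW_UPDATED = 2;
    public static final int ROW_UNCHANGED = 3;

    /** Hypothetical minimal view of the two documented methods used here. */
    public interface ClassifiedRows {
        int getNumRows();
        int getRowClassification(int rowIndex);
    }

    /** Returns counts in the order: insertions, deletions, updates, unchanged. */
    public static int[] tally(ClassifiedRows rows) {
        int[] counts = new int[4];
        for (int row = 0; row < rows.getNumRows(); row++) {
            switch (rows.getRowClassification(row)) {
                case ROW_INSERTION: counts[0]++; break;
                case ROW_DELETION:  counts[1]++; break;
                case ROW_UPDATED:   counts[2]++; break;
                case ROW_UNCHANGED: counts[3]++; break;
                default:
                    throw new IllegalStateException("Unknown row classification");
            }
        }
        return counts;
    }
}
```

The four counts always sum to getNumRows(), since every row in the key union receives exactly one classification.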
Constructor Detail |
public DataComparison(DataSet setOne, DataSet setTwo)
Parameters:
setOne - The first data set to look at.
setTwo - The second data set to look at.
Throws:
java.lang.IllegalArgumentException - If the two datasets cannot be compared because the table descriptions associated with each dataset are not equivalent.
java.lang.NullPointerException - If either argument is null.
public DataComparison(DataSet setOne, DataSet setTwo, boolean compareEncoding)
Parameters:
setOne - The first data set to look at.
setTwo - The second data set to look at.
compareEncoding - Whether to compare the encoding of textual data.
Throws:
java.lang.IllegalArgumentException - If the two datasets cannot be compared because the table descriptions associated with each dataset are not equivalent.
java.lang.NullPointerException - If either argument is null.
public DataComparison(DataSet setOne, DataSet setTwo, boolean[] compareColumns, boolean compareEncoding)
Parameters:
setOne - The first data set to look at.
setTwo - The second data set to look at.
compareColumns - An array with a flag for each non-primary key column (the array must be that length) indicating whether comparisons should be run for that column. If null, all columns will be compared.
compareEncoding - Whether to compare the encoding of textual data.
Throws:
java.lang.IllegalArgumentException - If the two datasets cannot be compared because the table descriptions associated with each dataset are not equivalent.
java.lang.NullPointerException - If either data set is null.
Method Detail |
public java.lang.Object getValueAt(int rowIndex, int columnIndex)
Returns the object stored during the comparison of the data at the specified row and column indices. The row index references into a row list that is the union of all primary keys in both data sets, sorted in primary key order. The columns are stored in the same order as the table description.
If the specified row is an insertion, then the value returned will be from the first dataset. If the specified row is a deletion, then the value returned will be from the second dataset. A CellDifference instance will be returned if the indices reference a non-primary key column value that differed between the two datasets received at initialisation.
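Because differing cells are reported as CellDifference instances, a caller can find them with an instanceof check while walking the grid. The sketch below shows that pattern with stand-ins: the nested CellDifference class and its fields are hypothetical (the real class's API is not shown on this page), Grid is a hypothetical minimal view of getNumRows() and getValueAt(int, int), and the column count is passed in explicitly because no column-count accessor is documented here. Indices are assumed to start at 0.

```java
public class DifferenceScan {
    /** Hypothetical stand-in for the library's CellDifference class. */
    public static final class CellDifference {
        public final Object first;  // assumed: value from the first dataset
        public final Object second; // assumed: value from the second dataset

        public CellDifference(Object first, Object second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Hypothetical minimal view of the two documented methods used here. */
    public interface Grid {
        int getNumRows();
        Object getValueAt(int rowIndex, int columnIndex);
    }

    /** Counts the cells whose stored object marks a difference. */
    public static int countDifferences(Grid grid, int numColumns) {
        int count = 0;
        for (int row = 0; row < grid.getNumRows(); row++) {
            for (int col = 0; col < numColumns; col++) {
                if (grid.getValueAt(row, col) instanceof CellDifference) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Null cells are skipped naturally, because instanceof evaluates to false for null.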
Specified by:
getValueAt in interface TabularData
Parameters:
rowIndex - The row to return data from.
columnIndex - The column to return data from.
Throws:
java.lang.IllegalArgumentException - If either index is out of bounds.
public java.util.List getAllColumnValues(int rowIndex)
Parameters:
rowIndex - The row to return the list of all column values for.
public java.util.List getPrimaryKeyColumnValues(int rowIndex)
Parameters:
rowIndex - The row to return the list of primary key columns for.
public java.util.List getNonPrimaryKeyColumnValues(int rowIndex)
Parameters:
rowIndex - The row to return the list of non-primary key columns for.
public boolean foundDifferences()
public int getNumRows()
Specified by:
getNumRows in interface TabularData
public int getNumRowsDeleted()
public int getNumRowsInserted()
public int getNumRowsUnchanged()
public int getNumRowsUpdated()
public int getRowClassification(int rowIndex)
Parameters:
rowIndex - The row to examine.
Throws:
java.lang.IllegalArgumentException - If rowIndex is out of range.
public boolean rowHasEncodingDifferences(int rowIndex)
Parameters:
rowIndex - The row to examine.
Throws:
java.lang.IllegalArgumentException - If rowIndex is out of range.
public boolean wasColumnCompared(int columnIndex)
Parameters:
columnIndex - The index of the column to check (primary key columns are never compared, so indexing covers the non-primary key columns only).
Throws:
java.lang.IllegalArgumentException - If columnIndex is out of range.
public int getNumValuesInColumnUpdated(int columnIndex, boolean justEncodingDifferences)
Parameters:
columnIndex - The index of the column to return results for (primary key columns are never compared, so indexing covers the non-primary key columns only).
justEncodingDifferences - If true, only the number of encoding differences is returned.
Throws:
java.lang.IllegalArgumentException - If columnIndex is out of range.
public TableDescription getTableDescription()
Specified by:
getTableDescription in interface TabularData
public TableDescription getSourceTableDescription()
public TableDescription getDestTableDescription()
public void rowSynchronized(int rowIndex)
This method is called by DataSynchronizationAction after synchronization has been performed on the specified row; it marks the row as unchanged, replacing any CellDifference instances in the row with the new value. It is assumed that all text values were inserted into the second database as UTF-8, so where the source was Latin-1, this may be marked as an encoding difference.
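To see why a Latin-1 source and a UTF-8 destination can hold the same text and still differ in encoding, the standalone sketch below encodes one string both ways using only the standard library. It does not reproduce how this class detects encoding differences; EncodingDifferenceSketch, sameBytes and misread are illustrative names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDifferenceSketch {
    /** True if the text has identical bytes in Latin-1 and UTF-8. */
    public static boolean sameBytes(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(latin1, utf8);
    }

    /** Decodes UTF-8 bytes as if they were Latin-1, as a mismatched reader would. */
    public static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String accented = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(sameBytes("abc"));          // pure ASCII: true
        System.out.println(sameBytes(accented));       // high-bit character: false
        System.out.println(misread(accented).length()); // 5 characters instead of 4
    }
}
```

Pure ASCII text is byte-identical in both encodings, so only values containing high-bit characters can show this kind of difference.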
Parameters:
rowIndex - The row to mark as synchronized.
Throws:
java.lang.IllegalArgumentException - If rowIndex is out of range.
public java.lang.String toString()